A New Kind of Science: The NKS Forum (http://forum.wolframscience.com/index.php)
- Applied NKS (http://forum.wolframscience.com/forumdisplay.php?forumid=4)

Posted by Stedman on 12-20-2005 06:21 AM:

Non-deterministic state machines: Do they "make sense"?

After reading "A New Kind of Science" it appears that the CA and other systems (Turing machines, register machines, etc.) discussed are deterministic. I am also aware that there is another class of so-called "non-deterministic" finite state machines, which for instance solve so-called NP problems in P time.

My question, and it may perhaps be the case that NKS has the answer, is this: Basically, I don't really "buy" non-deterministic machines. Non-determinism in general just really bothers me. After a whole year of quantum mechanics classes, I still don't really "buy" QM either. I mean, sure, I can work the problems within the formalism of operators on Hilbert space, but in terms of the semi-philosophical assertion that the values of "observables" measured behave non-deterministically--I'm unsatisfied with it. And here's why.

I'm not really trying to get into a physics discussion here by the way (any physicists in the house?)--I'm simply trying to use QM as an analogy for my real question, which is what exactly is a non-deterministic machine and can it be emulated by a deterministic universal machine like Rule 110?

But going with the analogy for a second...
The idea is that before a measurement, the values of some observable (say momentum) are known only as a probability distribution. But then after the measurement (after the wavefunction collapses) the value is totally known. That is, the momentum is, say, 5 g·m/s. Now what this sounds like to me is that we are basically getting "something" for "nothing". More specifically, what I mean is that any system where probability distributions are true at a base level seems like it is basically just arbitrarily "choosing" what value to make appear. But this makes it seem like the system is grabbing information out of thin air. From nothing but a probability distribution, it in effect tosses you an actual solid value.

In the same way, non-deterministic machines seem to me like they are "cheating". Say you have an NP problem where a binary string of length n must be chosen with the correct 1's and 0's in the correct places. Then as my professor once said, a non-deterministic machine would be like an oracle that at each step would "know" what value to choose, and thus after simply passing over the string once (i.e., after n steps) the problem would be solved. But I guess my question is, how does it know? I feel like the information to know this has to come from somewhere.

Now, I'd be willing to accept a breakdown of conservation of momentum for instance, since this would just entail a loss of a symmetry constraint on the equation of motion. But this "non-conservation of information" just described seems to me like a deeper problem. Is it?

What is the real problem here? Am I just not thinking generally enough? Are non-deterministic machines a well-defined mathematical construct that one can speak of in terms of a finite set and some functions? If so, great, but could they ever apply to "real life"? A more formal question along the same line would be: can a non-deterministic machine be emulated by a universal deterministic machine like Rule 110?

Thanks so much for any information/enlightenment. Also, any suggestions for good texts on the subject of non-deterministic machines, NP-complete problems, etc.?

Best,
--Stedman

Posted by estrabd on 12-20-2005 04:35 PM:

Re: Non-deterministic state machines: Do they "make sense"?

Originally posted by Stedman
After reading "A New Kind of Science" it appears that the CA and other systems (Turing machines, register machines, etc.) discussed are deterministic. I am also aware that there is another class of so-called "non-deterministic" finite state machines, which for instance solve so-called NP problems in P time.

FSAs describe regular languages; Turing machines (think of your desktop PC) perform general computations.

Firstly, DFA == NFA == RE. DFAs are "regular language" acceptors/generators, and rely on states that are aware only of themselves and where to go next on a given input. Whereas a DFA transitions to a single state on a given input, an NFA may transition to a set of states. There is also a class of NFAs that provide transitions on epsilon (i.e., the null symbol). An ε-transition is like a "ladder" in Chutes and Ladders, where you may continue on to the next state without consuming any input.

In an NFA, a string may follow multiple paths concurrently, but ultimately all invalid paths "die on the vine". If any of these paths puts the NFA into a final (accepting) state, then the string being tested is considered "valid" for the regular language described by the NFA. An NFA may be converted to a DFA using the well-known "subset construction" algorithm.
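To make this concrete, here is a minimal sketch (my own toy example, not from the book): an NFA that accepts binary strings whose second-to-last symbol is '1', simulated by tracking the *set* of reachable states, plus the subset construction that turns it into an equivalent DFA whose states are sets of NFA states.

```python
from itertools import chain

# Toy NFA: accepts strings over {0,1} whose second-to-last symbol is '1'.
# delta maps (state, symbol) -> set of possible next states.
nfa_delta = {
    ('q0', '0'): {'q0'},
    ('q0', '1'): {'q0', 'q1'},   # nondeterministic choice on '1'
    ('q1', '0'): {'q2'},
    ('q1', '1'): {'q2'},
}
nfa_start, nfa_accept = 'q0', {'q2'}

def nfa_accepts(s):
    """Deterministically simulate the NFA by tracking the set of
    states reachable after each symbol; valid paths survive."""
    current = {nfa_start}
    for ch in s:
        current = set(chain.from_iterable(
            nfa_delta.get((q, ch), set()) for q in current))
    return bool(current & nfa_accept)

def subset_construction():
    """Build an equivalent DFA whose states are frozensets of NFA states."""
    start = frozenset({nfa_start})
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        state = todo.pop()
        for ch in '01':
            nxt = frozenset(chain.from_iterable(
                nfa_delta.get((q, ch), set()) for q in state))
            dfa_delta[(state, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    accepting = {s for s in seen if s & nfa_accept}
    return start, dfa_delta, accepting
```

Note that `nfa_accepts` is itself a deterministic procedure: the "multiple concurrent paths" are just a set of states carried along step by step, which is the whole point of the subset construction.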

Non-deterministic Turing machines solve NP-complete problems in polynomial time. It is as if the machine computes all possible solutions concurrently, and at the end of the computation only the correct answer survives.

Modern computers are deterministic, so the execution of any non-deterministic type of machine (FA or Turing) must be simulated.

I don't recall anything explicitly stated in the book, but I can't imagine why you couldn't model a non-deterministic machine using the methods described in NKS. If the examples in the book were explicitly deterministic, then this was probably done for clarity.

Non-determinism as applied to all of these abstract models simply means that multiple paths may be followed concurrently - think parallel processing.

Originally posted by Stedman
My question, and it may perhaps be the case that NKS has the answer, is this: Basically, I don't really "buy" non-deterministic machines. Non-determinism in general just really bothers me. After a whole year of quantum mechanics classes, I still don't really "buy" QM either.

Not sure about QM, but non-determinism is conceptually sound and is typically used as a way to describe a deterministic model in a much more compact way. For example, an NFA of N nodes may represent a DFA of up to 2^N nodes.

Originally posted by Stedman
I'm not really trying to get into a physics discussion here by the way (any physicists in the house?)--I'm simply trying to use QM as an analogy for my real question, which is what exactly is a non-deterministic machine and can it be emulated by a deterministic universal machine like Rule 110?

A non-deterministic machine computes all solutions at once. I would assume that using rule 110, you are limited to computing only one solution at a time. However, I look at it like this - if you want to simulate a non-deterministic Turing machine using rule 110, then for each line of computation you'd need an additional dimension (or "slice") representing a TM (built using 110) ready to perform a concurrent line of execution. This is akin to simulating an N-TM using multiple processors.
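The "one slice per line of computation" idea can be sketched deterministically in a few lines (a toy of my own, not the thread's construction): keep a frontier of all current configurations, advance every branch one step at a time, and collect whichever branches end up accepting.

```python
# Deterministic simulation of a nondeterministic computation by
# breadth-first expansion of a frontier of configurations.
# Toy "machine": each step nondeterministically appends a bit.

def step(config):
    """One nondeterministic step: each configuration branches in two."""
    return [config + '0', config + '1']

def run(n_steps, accept):
    """Advance every branch in lockstep for n_steps, then return the
    branches landing in an accepting configuration."""
    frontier = ['']                    # start with one empty configuration
    for _ in range(n_steps):
        frontier = [c2 for c in frontier for c2 in step(c)]
    return [c for c in frontier if accept(c)]

# e.g. accept configurations with exactly two 1s after 3 steps
winners = run(3, lambda c: c.count('1') == 2)
```

The cost of the simulation is visible in the code: the frontier doubles every step, which is exactly why a deterministic machine pays an exponential price for what the non-deterministic machine does "for free".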

Originally posted by Stedman
In the same way, non-deterministic machines seem to me like they are "cheating". Say you have an NP problem where a binary string of length n must be chosen with the correct 1's and 0's in the correct places. Then as my professor once said, a non-deterministic machine would be like an oracle that at each step would "know" what value to choose, and thus after simply passing over the string once (i.e., after n steps) the problem would be solved. But I guess my question is, how does it know? I feel like the information to know this has to come from somewhere.

An oracle is not the same thing as an N-TM. In the problem you give above, non-deterministically guessing the string would be like guessing and checking all strings of length n at once. One and only one of the guesses will be correct, but in your non-deterministic world you are able to test all of them concurrently. To get the correct string, simply select the one that tested correctly.
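Here is that "guess and check" pattern on a concrete NP problem (subset-sum; the instance and names are my own illustration): each n-bit string is one guess, the verifier runs in polynomial time per guess, and a deterministic machine simply checks all 2^n guesses where the non-deterministic machine would "guess" a correct string in n steps.

```python
from itertools import product

# A tiny subset-sum instance: pick a subset of `numbers` summing to `target`.
numbers, target = [3, 9, 8, 4], 12

def check(bits):
    """Polynomial-time verification of one guessed n-bit string:
    bit i decides whether numbers[i] is in the subset."""
    return sum(x for b, x in zip(bits, numbers) if b) == target

# Deterministic search: enumerate and verify all 2^n guesses.
solutions = [bits for bits in product([0, 1], repeat=len(numbers))
             if check(bits)]
```

The information "comes from" the verifier plus exhaustive enumeration; non-determinism only collapses the enumeration into one conceptual step, it does not conjure the answer from nothing.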

Originally posted by Stedman
Now, I'd be willing to accept a breakdown of conservation of momentum for instance, since this would just entail a loss of a symmetry constraint on the equation of motion. But this "non-conservation of information" just described seems to me like a deeper problem. Is it?

It's like "lossy" data compression. JPEGs compress the image data in such a way that information not required for humans to interpret the image is thrown out. When creating a JPEG, you determine how much loss is acceptable. Once you have a JPEG, it is impossible to get all of the original data back.

Originally posted by Stedman
What is the real problem here? Am I just not thinking generally enough? Are non-deterministic machines a well-defined mathematical construct that one can speak of in terms of a finite set and some functions? If so, great, but could they ever apply to "real life"? A more formal question along the same line would be: can a non-deterministic machine be emulated by a universal deterministic machine like Rule 110?

The problem is that you are not fully grasping what is meant by non-determinism. A non-deterministic machine may be modeled quite easily using many different types of notation, but at the end of the day any practical implementation of an n-d machine must be done on deterministic machines. You can simulate non-determinism by running it on multiple CPUs, but it is still not truly non-deterministic. It is my understanding that this is what quantum computing promises.

Originally posted by Stedman
Thanks so much for any information/enlightenment. Also, any suggestions for good texts on the subject of non-deterministic machines, NP-complete problems, etc.?

J.E. Hopcroft, R. Motwani, J.D. Ullman; Introduction to Automata Theory, Languages, and Computation;

A.V. Aho, R. Sethi, J.D. Ullman; Compilers: Principles, Techniques, and Tools;


Posted by Jason Cawley on 12-20-2005 05:08 PM:

Several specific idealizations need to be combined to reach a set that is not computable by a deterministic machine. First I'll chop off several simpler cases that often confuse people about the real issue, showing that the essential feature is not determinism vs. multiple possible outputs, nor a closed system vs. one in continuous contact with an external environment (another formulation), but a cardinality issue that is much more difficult to reach in a realistic example.

Suppose I make a multi-way CA that chooses at each step whether to act like 0 or like 1. It now has two initial conditions - the ordinary one, n bits long, and another with one bit per step, which "calls the coin flips" as you run through the steps. (In Mathematica terms, a FoldList reads in one bit per step from the second initial.) If I wanted to, I could then pair up my two initial conditions and enumerate all of them (with a pairing function, e.g.). This is formally equivalent to a non-deterministic system that generates the new bit ex nihilo from step to step. But it is deterministic - it just has a more elaborate initial condition and feeds that initial condition into the evolution in a slightly different way. Since they are in one-to-one correspondence, the computations they reach are the same.
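A rough sketch of this construction (my own translation into code; the specific rule pair and cyclic boundaries are illustrative assumptions): the per-step choice between two elementary CA rules is pre-recorded as a second initial condition, a tape of coin flips, so the whole "multi-way" evolution is an ordinary deterministic fold over that tape.

```python
def eca_step(cells, rule):
    """One step of an elementary CA with cyclic boundary conditions."""
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4
                      + cells[i] * 2
                      + cells[(i + 1) % n])) & 1
            for i in range(n)]

def run(cells, flips, rule_a=110, rule_b=30):
    """Deterministic evolution: the coin-flip tape picks the rule at
    each step (flip 0 -> rule_a, flip 1 -> rule_b), like a FoldList
    reading one bit per step from the second initial condition."""
    history = [cells]
    for f in flips:
        cells = eca_step(cells, rule_b if f else rule_a)
        history.append(cells)
    return history

# same ordinary initial condition, two different coin-flip tapes
init = [0, 0, 0, 1, 0, 0, 0]
h1 = run(init, [0, 0, 0])    # tape of all 0s: pure rule 110
h2 = run(init, [1, 1, 1])    # tape of all 1s: pure rule 30
```

Every sequence of "choices" corresponds to exactly one flip tape, so enumerating (ordinary initial, tape) pairs enumerates all branches of the non-deterministic system, which is the one-to-one correspondence the argument relies on.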

OK, but suppose I allow faster branching, a thousand-fold at each step. Then I need 10 bits per step to call the coin flips (2^10 = 1024), but the same scheme works. Same computations reached. To reach a larger space, I need an unbounded per-step input. Contact with an external environment is not enough; the information channel needs to be limitless. Any size as large as you please that is guaranteed not to be exceeded at any step means the space reached is the same. Only the possibility of an infinite amount of information entering on a single step can change the cardinality reached, and break the one-to-one correspondence. If the amount of branching is variable at each step, the same logic applies. If that amount is bounded, you reach the deterministically computables only, nothing added.

Suppose I allow an infinite input at each step, but I know the computation eventually halts. Then again I reach only the computables. To see this, imagine the computation printed out like a CA evolution running down the page, but in the mathematical idealization we have passed to the infinite width limit, and we are flipping a coin for each cell. Well, just turn the paper sideways. Countable, same issue as before, infinite on one axis but not on the other. I encode all the random flips for location i into one mega-digit of my initial (then pair them all up with the ordinary initial). I can do so because the number of steps is finite.

So multiway is not enough. External interaction is not enough. To break the countable cardinality, I need a potentially infinite information channel (countably infinite, not finite) at each of a potentially infinite number of steps (countably infinite, not finite). A countable-by-countable input is a real-cardinality input. If my system evolves according to a real-number-cardinality initial condition, using an amount of information that is beyond merely infinite - continuum infinite - then and only then can I (though I need not) reach a space of possible outcomes larger than the computables. This needed information amount can always be viewed as an initial condition of a sort. Non-deterministic countably infinite initials with halting problems (thus potentially infinite time as well) map to the reals rather than the integers.

If you start at the reals, the computation part adds nothing. You aren't computable because you aren't even name-able, as Chaitin likes to put it. You just have infinitely many unspecified "leaves" of possibility, in no necessary relation to any computational or causative "trunk". Things just happen to be so, no (followable) explanation possible.

Strictly, this only follows if you have not only reals, but the distinction between one real and another matters for outcomes. If all the reals in some ball of positive measure are indistinguishable on the outcome side (within a bounded region you can't tell them apart), you can find a countable cover by only asking where some representative from within that ball lands. And it doesn't matter how small that ball is, so long as one scale exists below which outcomes are indistinguishable. So in addition to reals, you need infinite sensitivity to the details hidden down, potentially infinitely far, in the real in question. (Otherwise, you've got a cutoff below which it doesn't matter, you get a finite length again, turn the paper...)

People frequently speak of continuity as the culprit giving rise to infinite cardinality issues, but this is somewhat misleading. In the native context of analysis, continuity is precisely what tames real-number infinities, by giving us that "below epsilon, I don't care, because I will remain within delta" cutoff. If a process is programmably complex, however - i.e., if it depends on the details of its exact digit sequence the way a program depends on every bit, not just on the first n bits with the remainder irrelevant detail - and in addition its specification requires a continuum-infinite amount of information, then wackiness ensues.

Philosophically, I think continuum infinities are precisely about "distinguishability breakdown". They arise naturally from trying to consider all possible relations between infinite sets. Individual relations over infinite sets work fine, all possible relations over finite sets also work fine, and even all possible relations of less than infinite sensitivity (some form of continuity analog), over infinite sets, would work fine.

But trying to distinguish all possible infinitely sensitive relations between infinite classes of objects is trying to distinguish what by hypothesis you've made indistinguishable. You are left with the bare idea "things one has no way of telling apart might in principle still be different". As an abstract mental fact, that can be entertained. Spinning whole mathematics out of it is more than a bit like the proverbial angels on the head of a pin. It is in fact just the modernized form of that historical discussion.

The Church-Turing thesis tries to deal with these issues by postulating that only computable processes are realizable in our universe. Of any given outcome, it is in general formally undecidable whether some deterministic machine-initial pair produces that outcome. We know from enumeration arguments that most possible strings are not computable. But we don't have any (OK, much - see below) reason to think those "possibles" are physically possible as opposed to possible as mathematical abstractions.

The one place people wonder about possible real-infinity processes in our universe is in quantum mechanics or QFT. There we take averages over a space of possible paths, which is continuum infinite. We observe multiway behavior at the level at which we can distinguish states - and we impute those states point-wise, a state at each point, incidentally. One speculative possibility is that we might be looking at halves of real possibles with positive measure, real "spannedness", with the extra imputed freedom restricted in reality to the other "endpoint" of an extended event.

There is some confusion over whether the space of reachable computations for QC setups is any larger than the computables, as a result. The conventional understanding is that it reaches just the computables but may do so faster, by exploiting multiway evolution - almost arbitrarily faster for some specific computations; see the previous post on parallel processing. If you can do a countable infinity of ordinary computations in parallel, in finite time you can reach the same computables a deterministic machine can reach in infinite time. This corresponds to one of the "turn the paper 90 degrees" transpositions above.

You don't get full continuum infinities, for the countable-cover reason - the projections into measurables are continuous maps. At times some have nevertheless claimed real cardinalities might be reached by QC processes, but that is largely a misunderstanding of the issues. If a QC system actually accessed a countably infinite information channel at each step, then in principle leaving it in "unitary evolution" (no measurement projecting onto some chosen countable cover) would "visit" a continuum-infinite state space. But perform any measurement and the computation is finite in the time dimension, and you are back in computable-land. You just may have gotten there in finite time rather than infinite time - if you really had access to a countably infinite parallel process (rather than just a bounded multiway process) for finite time.

Since QFT is our most successful fundamental theory, and it is both non-deterministic and uses continuum infinities to specify states and processes, there are reasonable doubts about the thesis that everything physically realizable is deterministically computable. But the requirements for getting beyond the computables are more stringent than many suppose. I hope this helps.