[complexity <=> irreducibility] - A New Kind of Science: The NKS ForumA New Kind of Science: The NKS Forum
Pages:1
complexity <=> irreducibility
(Click here to view the original thread with full colors/images)
Posted by: Mike Lin
The Principle of Computational Equivalence implies that observed complexity is associated with computational irreducibility.
Chapter 10 is a pretty convincing circumstantial argument that lack of repetition and nesting basically equates to observed complexity, and the highly speculative argument on pp.632-635 extends this to all possible means of observation. On pp.745-748 Wolfram tries to link observed complexity with irreducibility. I think his argument here is confusing and problematic.
In fact Wolfram ends up giving a specific example of an anomaly to his claim (I will use the Kuhnian term "anomaly" instead of "counterexample" since we aren't talking about mathematical certainties here). It turns out that the base-10 digit sequences of the powers of 2 (1, 2, 4, 8, ...) look complex when you map them out as a picture, so you might expect that it is irreducible to compute them (i.e. you could only find the n'th line of the picture by knowing most of the previous lines). But actually there is an ancient algorithm that lets you find it quickly, the square-and-multiply method for fast exponentiation. So this apparently complicated system is actually reducible.
I am pretty appalled by the way Wolfram waves this off: "I strongly suspect systems like this are very rare", then change the subject.
The example cited, fast exponentiation, is not some kind of obscure, useless contraption. Everybody in computer science learns about it, usually as a freshman. I am sure that Mathematica uses one of its more advanced cousins. It is an important system and any problems it raises must be addressed.
If in fact it is at all common for apparently complex phenomena to be reducible, then this makes it more difficult to accept the PCE. Of course Wolfram could be right and this could be "very rare", but the basic and ancient nature of this system makes me question why I should believe that this is the case.
In any case, I am glad that Wolfram at least mentioned the fast exponentiation example. Had he just excised it from the book, it might have taken someone a long time to notice that it raises some doubts.
Posted by: Adam B.
Before I post, let me please try to summarize your points (listed as points 1 - 5) so that I am on the right footing:
(1) According to Wolfram, lack of visual repetition and nesting equates to observed complexity, which normally implies some sort of computational irreducibility.
(2) The base-10 digit sequenqes of the powers of 2 appear complex.
(3) The well-known fast exponentiation algorithm can predict the output of every "line" of output of the base-10 digit sequences of powers of 2.
(4) Because the fast exponentiation algorithm is less expensive, computationally speaking, than the recursive method at finding every line of output, it serves as a potential counterexample to Wolfram's belief that visual complexity is the result of computational irreducibility, or something close to irreducibility.
(5) Because the potential counterexample is old (and also because it is widespread, as compsci students know), you question Wolfram's claim that such systems are rare.
Mike, first off, wonderful post. This is one of the more challenging aspects of NKS, and I am still not quite sure what the answers to the questions these issues raise are. But here are some things to consider to test the validity of the points:
(A) Regarding (2), what if the digits are represented in some other base, like base-2? The base-2 sequence does not appear complex. Does the fact that we can find a visually non-complex representation of a system say anything about whether something is actually complex? You may want to closely follow the section on processes of perception and analysis.
(B) Regarding (3) and (4), the fast exponentiation algorithm is a consequence of well-defined (even in ancient times) mathematical operations, so we should expect that it will be able to predict the outcome at each step. There are many, many algorithms that do such things for number systems that under one representation appear visually complex but in fact are tightly linked to predictable algebraic rules. When we define the rules for the growth of a potentially non-universal Euclidean system, we can usually reduce those rules to be less computationally expensive.
(C) Regarding (5), while it is tempting to use the "well, it has been around for thousands of years" reasoning to assert that visually complex, yet computationally reducible systems are non-rare, one should carefully consider that such logic is potentially a converse accident. In other words, even if that one counterexample is strong, the evidence of one case does not imply a preponderance of evidence of other such cases, even if a lot of time has passed since the creation of such an algorithm.
To sum up, there are a lot of essentially Euclidean systems for which determining the size, and thus the sequence, of a line of random-looking output generated in a given base is quite simple, so it is clear that such systems exist. But do these types of systems contradict the PCE? What are the implications of something being universal or non-universal, yet visually complex (under a given representation)?
Well, that's about all I can think of for today on this post. Perhaps somebody with more experience can help out here.
Posted by: Mike Lin
Hi Adam, thanks for your thoughts.
First, a general clarification. I want to stress that I am avoiding use of the term "counterexample". Because Wolfram uses qualifiers like "essentially all", "almost always", etc., there is no such thing as a counterexample to his claims. Actually, the qualifiers nonwithstanding, it is philosophically arguable that there is no such thing as a counterexample to any empirical belief whatsoever. So, I am not criticizing Wolfram for using his qualifiers; in fact, I think science would be more honest to use them more. I would venture to guess that NKS critics who complain about "falsifiability" may be uninformed on the matter.
(A) The base-2 sequence does not appear complex. Does the fact that we can find a visually non-complex representation of a system say anything about whether something is actually complex?
The point you make is exactly correct, and emphasizes the problem. Looking at the image alone, without any captions or labels, it certainly looks complex. But actually, it isn't; and just from the picture, I would not think to try converting it to base 2 digits. The problem is one of observation: looking at a system, how do I know whether it is actually complex?
Wolfram's claim is that if it looks complex, then usually it is somehow associated with a universal computation. The empirical value of the PCE is that apparent complexity is associated with universality -- it gives us a framework under which to understand all complex phenomena. But -- the fast exponentiation example in mind -- what complexity is "apparent"? To put it more harshly - what good is the PCE if I don't actually know if anything is complex?
As a side node, the base-2 representation looks simple because the specific example shows fast exponentiation for 2^b. Fast exponentiation works for a^b in general, in which case the base-2 representation may look complex. In all cases, the underlying computation is simple and repetitive.
An obvious idea, though: maybe the complexity in the picture actually arises from the computation involved in converting base 2 digits to base 10. Is that computation complex? Is it universal?
(C) Regarding (5), while it is tempting to use the "well, it has been around for thousands of years" reasoning to assert that visually complex, yet computationally reducible systems are non-rare, one should carefully consider that such logic is potentially a converse accident. In other words, even if that one counterexample is strong, the evidence of one case does not imply a preponderance of evidence of other such cases, even if a lot of time has passed since the creation of such an algorithm.
Agnostics aside, you believe that visually complex, reducible systems are either rare or common. What evidence, besides Wolfram's strong suspicions, is there that they are rare? How does the strength of that evidence compare to the existence of an ancient example to the contrary? If they are rare, is there any better explanation for the existence of the anomaly than "there just happen to a few such systems"? My complaint is that Wolfram cites an example that raises these questions -- and then waves them off. Again, I'm very glad that Wolfram included the example in the book, and I'm not asking anyone to "prove" anything -- just to convince me.
Your point is correct, and it's a good reason to view with skepticism an anomaly to an otherwise well-supported belief. I'm just not yet convinced that this belief is otherwise well-supported. And I read your other point, (B), as saying that the fast exponentiation example may belong to a much larger set of apparently complex, reducible systems.
Posted by: Jason Cawley
It is a good objection to advance, something others have noticed before. It shows real engagement with the issues raised by NKS. Someone who sees it is noticing that Wolfram is actually saying something surprising or that has real content, not something obvious that just has to be so.
Some expressions look simple in one representation system and complicated in most others. One fine example is the golden ratio, which is just a series of 1s in a continued fraction expansion, but more complicated looking in digit expansions.
One can extend this from single expressions to whole evolutions, e.g. continued multiplication by 3 looks involved in a base 2 representation but is a simple shift in base 3.
http://www.wolframscience.com/nksonline/page-119
This is generalized in the perception and analysis chapter, e.g. on page 612 and following (second paragraph from the bottom, starting at "But it turns out") -
http://www.wolframscience.com/nksonline/page-612
In addition to these relatively trivial cases (different bases, number theoretic functions like "largest common factor", etc), one can find more involved ones, where a formula to completely describe the evolution (or e.g. to give the value of a center cell on step n, or something similar) can be found, which is not immediately obvious. Within the Wolfram Science Group, we call such a solution "a crack", like cracking a safe or a breaking a code. One can deliberately engineer some algorithms to obey such a formula while showing non-trivial looking behavior, using a generating relationship from number theory for example. More impressive, one can find a crack sometimes in complexity candidates found by unrelated search methods, that have survived first glance.
If anyone has the idea from reading the NKS book that a rule generates some pattern that isn't obvious, and we just gaze in wonder and say, "must be complicated then" and give up on it, I can disabuse them of the notion. Some mess of a combinator expression (a line of "e"s in a blizzard of nested parentheses) suggests nesting to Wolfram, and a week later Matthew Szudzik has untangled a subtle form of nesting, requiring multiple substitutions - compiler-like - to become obvious. Or a recursive function is generating weird spikes, and Ed Pegg manages to find a submerged prime number relationship causing it. A candidate for complexity is poked and proded, it is a standing challenge, it is a Math Olympiad puzzle, screaming "crack this". A crack found is anything from a footnote to a paper, and on we go. The ones Wolfram calls complicated have survived a lot more than the eyes.
But Wolfram's point is that the eyes are usually right in this process. The crackable ones usually look crackable. The uncrackable looking ones usually are uncrackable, or at any rate uncracked to date. If someone found an easy crack for rule 30 it would be something of an embarassment and a strong argument against PCE. If someone found a very difficult crack for rule 30 it would be news, excellent work, and at least evidence against PCE. Wolfram has attacked it with all the standard methods and others Mathematica made possible, a skeptical Richard Feynman tried cryptanalysis techniques (he decoded Mayan hieroglyphs in his spare time, he was not a slouch at this), dozens of claimed regularities in it have been put forth by others over the years without proving in the end to provide a crack. If you want a first glance at why, look at the way the minimum boolean expressions for rule 30 blow up after just a few steps, as shown on page 618.
http://www.wolframscience.com/nksonline/page-618
Now, is it possible that for every complicated looking system there is a crack, but we are just too dim to have found it yet? Philosophically, considering all the possibilities, yes, sure. That is one of the ways PCE might not be true. There might be unknown general methods. Or it might be little islands of hyper-involved work for each particular case, so that no individual method of attack gets very many, but something can be done on a lot of others taken one at a time.
If that is what is really happening then the basic conjecture Wolfram is advancing in PCE would not be the real story. He is claiming that the complexity is real, not merely apparent, that it is the normal case not exceptional, that we find isolated cracks in a few borderline foothills, beyond which there is a combinatorial Himalayas' worth of outright uncrackability. It is a strong claim, it is saying something that need not be true but that he thinks is true. It is a standing challenge to the puzzlers of the world to produce a few thousand beautiful cracks and show it isn't so. Wolfram isn't holding his breath.
Understand, these candidates arise in the course of searches in which overall complexity is being "dialed upward" systematically. The goal is to find the simplest case of critter x that does y. So if something appears and looks promising, it must be tested. If it falls to combined attack, it is not an x that does y and the search must continue. We are deliberately in the foothills, in other words, because we are scanning for a boundary, for complexity "onset". And the eyes are still usually right. Go farther and finding complex looking cases becomes far easier and finding cracks - even starting them - becomes far more difficult.
When he claims that most of the systems that look complicated in the end will prove to actually be complicated, it is based on an intuition formed by seeing thousands of examples. It is heroic induction. Wolfram is a physicist by background and he approaches these things as would an experimental scientist. To say something about the world, the idea is to stick one's neck out, to generalize from what has been seen. He does not claim PCE is true because it follows from some definition, but as a contingent fact about the computational world, that it is. (Contingent with respect to our prior knowledge, anyway). For mathematicians, call it a conjecture if you like.
As a general methodological matter, heroic inductions can be wrong. So PCE could be wrong. It will have been useful as an hypothesis even if it is, if we can manage to actually show that. We'd know a lot more about complexity if we could, than we know today. If we can't show that, because it isn't wrong, it will be more useful still.
Before leaving the subject, I want to mention another way PCE might be wrong, besides "everything has a crack, some are just a little - or a lot - harder to find". This other way concerns class 3 behaviors. The claim that PCE applies to class 4s is probably on considerably safer ground than the full extension of it to class 3s.
The alternative is that maybe a kind of messy randomness occurs in class 3s without actually possessing the computational complexity seen in class 4s. Maybe uncontrolled information flow and the nearly all to all interaction it sets up, not only makes it harder for us to track or to engineer the things, but destroys real computational potential. Perhaps PCE properly applies to the 4s, while the 3s are fundamentally "statistical", forced to exhibit that kind of regularity and incapable of some more involved behaviors. Call this the "no class 3s" way that PCE might be wrong. (If this were so, PCE would still have a domain of applicability. But many things that appeared complex would be less complicated than some others of similar appearence. PCE would then not hold as originally stated).
It is listed as one of Wolfram's open problems in the booklet of those he came up with for the NKS 2003 conference, to show that any class 3 system is universal. I can explain a little about why Wolfram thinks, in the end, they will prove to be - although it may be far harder to establish it for any of them than for class 4s. Partially it is just the intuition that they have enough going on, that they can be viewed as performing some sort of calculation that is very involved, that we just don't see the point of and aren't able to easily track. But his prediction in the matter, and his inclusion of class 3s under the PCE claim, rests on more than that.
Class 3s can be made to perform in simpler ways than they usually do by using specific kinds of initial conditions. See the section starting on page 266.
http://www.wolframscience.com/nksonline/page-266
Potentially this can be used to create bands and/or upper areas free of the usual uncontrolled spread of information or acting as barriers to it. The boundaries of these areas might be tuned by adjusting spacing between parts of the initial conditions so coded. Will the remaining resources of the CA suffice to support universality? It has not been done. But Wolfram thinks it will prove do-able. A sketch of these ideas is presented starting on page 698 and continuing through page 705.
http://www.wolframscience.com/nksonline/page-698
So, the idea is use this sort of thing to show some class 3 can emulate some other system already known to be universal. If this scheme works (and there are difficulties, e.g. whether different information speeds are needed or not, and if they are whether class 3s can support them), then class 3s belong with 4s in Wolfram's overall PCE claim. That is what he thinks. Anybody who doesn't is welcome to show otherwise, if they can. Anybody who thinks it is a promising line of attack is invited to follow it out and solve the standing open problem, "show any class 3 system is universal", and thereby advance NKS another step. Either would be significant NKS work.
I hope this helps.
Posted by: Tony Smith
Thanks for that, Jason. It is certainly a useful summary of where the thinking has got to in the Wolfram Science team.
However I must remain skeptical about the importance of the PCE. To me, NKS is important regardless of the fate of the PCE claim, and I sometimes feel some concern that too much focus on the PCE might be putting at risk the broader goal.
On the other hand, I too expect that universality will at some stage be demonstrated near an emergent boundary in what had been previously presumed to be a Class 3. But it will be hard and it will not be obvious how to apply it more widely.
What I think is being lost in all this is that we need to get universality through a legitimate natural history, not through amazing feats of engineering based on unlikely candidates, before any of this will actually matter in the grand scheme of things.
I am confident that can be achieved and so will continue to focus my efforts in what look like the most productive corners of Class 4.
It would not be surprising if there have been more (generations x cells) of Conway's Life computed in the history of CAs than of all other CAs combined. That says something to me that is very inequivalent. And it fits well with "the eyes are usually right".
Posted by: Mike Lin
I have to disagree with Tony Smith on this one. The PCE defines the entire worldview for NKS. It is the bridge between the computer programs that we study as mathematicians, and the natural world that we want to understand as scientists. It is the reason to think that simple computer programs can tell us anything useful about the real world, and that it is therefore useful and worthwhile to study pure NKS in order to understand natural phenomena.
Without the PCE, NKS is just a few cute and interesting ways to think about some mathematical abstractions. Not that that is not worthwhile, but it's not going to save any lives.
And to Jason: As I wrote once before, I wish there were an account of NKS written with your clarity and candor. Actually, I think it would be incredibly helpful to onlookers if you were to "translate" the entire thing, if you will. But I can imagine that would step on some toes. Maybe you could at least make an indexed web page of the more important posts you've made in this forum.
That stated, I understand why Wolfram did things the way he did. I for one certainly would not know about it, had he done it any other way.
Posted by: Tony Smith
(The PCE) is the bridge between the computer programs that we study as mathematicians, and the natural world that we want to understand as scientists.
Mike, I see my disagreement as being at the margins and anything but denying "that simple computer programs can tell us (many things) useful about the real world".
It just seems obvious to me that burdening true Class 3 programs with an expectation that they might be engineerable to achieve universality is basically a distraction. The fact that simple programs often produce randomness is already very important in helping account for the resilience of the world we find ourselves in. Without such an account of resilience it would be that much more difficult to make the case that relatively fragile and less common Class 4 programs really can account for the natural organisation we see all around us.
Yes, we do live in a world produced by simple programs. No, we do not need all those programs to be universal to make that case, just enough of them, including some which we can eventually show could have arisen without engineered preconditions.
It is my choice to focus my efforts on the oft elusive edge of chaos-border of order where I am seeing pockets on low cost* organisation, but which most seem to have given up on way prematurely.
*simple starting conditions
Posted by: Jason Cawley
The simple program approach to modeling natural systems would have a domain of application even if neither spontaneous randomness generation as seen in class 3s, or universal computation as confirmed in some class 4s and suspected in others, did not arise.
It would still describe some systems exhibiting various kinds of periodic behavior or evolution to eventual fixed points. If a formal behavior can arise from simple conditions and relational rules, those simple conditions and relations are likely to occur in nature, here or there. An isomorphism to the behavior of some formal system with the same relations will then provide a model. And a useful one, because its elements can all be surveyed clearly and specified in reproducible ways.
But the simple program approach would not specifically be the proper method for handling natural systems exhibiting the specific feature "uniform randomness", or more varied behaviors, if that were the case, and all simple programs did was hit some nearby limit cycle or region of similar cycles. It is not the applicability as models, at all, that is brought into play by the observation that in fact simple programs do much more than this. It is how many places they may apply as models, and from it a fundamental intuitional shift in how natural systems themselves can be expected to behave.
Tony is right that instrinsic randomness generation, alone, would create an additional domain of applicability of simple program models, even if class 3s wound up not being universal. Isomorphisms would still arise between some non-universal natural systems exhibiting random-looking behavior, and various class 3 rules.
Simple programs are not the only way of understanding or modeling natural randomness. But it is probably overlooked just how common their use is, to understand randomness in the natural world. Essentially every computer model employing random number generation is at bottom tracing the randomness in the model to the behavior of a simple program. The Mathematica function "Random" uses Rule 30, for example.
The way in which this randomness is injected into the model can, however, vary. It can all be done in initial conditions, it can be done repeatedly in uncorrelated ways, or a simple program that instrinsically generates randomness can be put into the model itself, rather than "called" by it initially or repeatedly. In the past, the randomness has generally been idealized as coming from beyond the model itself, and the fact that a simple program was used to generate a version of it was viewed as an approximation of natural unknowns. Rather than expecting that a broad class of natural systems will create effective randomness in much the same way our simple programs do - which suggests the possibility of including model systems that produce randomness in a more "isomorphic" way.
And Tony is right that there is a domain of applicability of simple program models to natural systems that exhibit the most varied behaviors, from the observation that class 4 systems can and typically do produce such varied behaviors. The point of universality proofs about class 4s is to establish that bounds on the possible behaviors of programs exhibiting that class of behavior are very hard to state and for broad enough classes of them may be impossible in principle. It is a way of showing that intuitions such as "this system may do x, but it will never be able to do y", are unsound. It also establishes a threshold - to get y, one need not in principle go beyond a system construction with so-and-so many elements and these or those allowed interrelations. That motivates simplicity in modeling, by showing that excess "hair" is often unnecessary to recreate the behavior seen.
What is at stake in the question whether class 3s are also universal is again, not the whole simple programs approach. But instead, whether results or theorems can be unified. Is there one kind of unknowability limit involved, arising for one basic reason? Or are there two, one universality related, the other perhaps statistically characterized but containing unknowability only "in micro"?
The thing Tony should understand is that the interest in showing class 3s are universal (if it can be done) is not an engineering interest. It is not a matter of wanting to get class 3s to do things. It is an attempt to extend various intuitions, some full theorems, practical expectations and the like, to that broad class. If it is correct, the universality threshold is quite low and all the theorems that may imply become applicable early and often.
We want to find laws that are as general as possible, if we can. Is the attempt to find such generality a distraction? If it is real and is there and we can show it, then no, I don't think it is. Is it the only point, or the only thing that might allow us to formulate general principles for either class of system? No. The behavior of 3s is important itself, and modeling isomorphisms will exist between the phenomena seen in them and some natural systems. The same is true of 4s. Even if some body of theorems or observations does not apply to both.
What I take Tony to be saying is that he wishes people would focus on class 4s and what they can do, rather than getting hung up on whether that extends to the 3s. Or to focus on the edge of chaos, rather than trying to find principles that will apply to the disordered side of that transition as well as to the edge. There is nothing wrong with some people choosing to focus on that, and finding whatever they can. While doing so, though, they should not object to others trying to extend theorems or observations about that class of systems, to a wider domain.
I can understand the tension, however, in the context of communicating results. You try to say something about the edge, and someone objects that such and so doesn't apply to his other system way over in a different domain. Then you get dragged into questions about whether the domains are so different. Less may be known about the attempted union class, than one side of the attempted union. And the people one is talking to may confuse the subset you are talking about for the other, or for the union set. Even if eventually more is shown to be common between them, future results are not available now, and might turn out to be different than one expects.
So it is understandable. The attempt to enlarge the domain to which the same principles apply is distinct from the effort to deepen the set of known applicable principles. Ideally they should both be done, because they compliment one another in the long run, revealing more about the world. The trick is to remember which one is doing at any given time, and try not to get in the way of the other task, without letting it interfer with one's own present task.
Forum Sponsored by Wolfram Research
© 2004-2008 Wolfram Research, Inc. | Powered by vBulletin 2.3.0 © 2000-2002 Jelsoft Enterprises, Ltd. |
Disclaimer
vB Easy Archive Final - Created by Xenon and modified/released by SkuZZy from the Job Openings