[Collatz 3n+1 Conjecture and NKS] - A New Kind of Science: The NKS ForumA New Kind of Science: The NKS Forum
Pages:1
Collatz 3n+1 Conjecture and NKS
(Click here to view the original thread with full colors/images)
Posted by: Craig Feinstein
This weekend, I found NKS in my local library, checked it out, and was pleasantly surprised - I haven't read it all yet but I saw that Wolfram introduces the notion of "computationally irreducible" and predicts that many unsolved problems in math are so because they are computationally irreducible.
I find this prediction very plausible, as I recently wrote a paper http://arxiv.org/abs/math.GM/0312309 which proves that the 3n+1 problem (Collatz conjecture) is unprovable, using this idea.
Any comments on my paper and how this relates to NKS are welcome. I look forward to reading more of NKS.
Posted by: AnthonyMartin
Craig,
Your approach is interesting, but I have a few questions.
Regarding notation in your paper:
By x in {0,1}^m do you mean an x in R^m, comprising as elements only ones and zeros?
In your equations involving modular quantities (e.g. { n, (T^1)[n],
, (T^L)[n] } = x~Mod~2), should it be assumed that the equation is modular? i.e. { n, (T^1)[n],
, (T^L)[n]} = x mod(2) is equivalent to ({ n, (T^1)[n],
, (T^L)[n]} - x)~Mod~2 = 0) ; if this is not the case and my interpretation of x from above is correct then x = x~Mod~2 != { n, (T^1)[n],
, (T^L)[n] }.
If I have interpreted {0,1}^m correctly, then for any x in {0,1}^m (random or otherwise) why should more than m bits be required to describe x?
Also:
Im not familiar with the proof of Theorem 1 by Lagarias, but assuming it is correct, then there does exist an n in N such that { n, (T^1)[n],
, (T^L)[n]}~Mod~2 = x, where x in {0,1}^(L+1) is the random vector you defined, but where in Theorem 1 does it suggest 0=( (T^(L+1))[n] - (T^L)[n] )~Mod~2 ?
Posted by: Craig Feinstein
--Anthony, thank you for reading my paper. If you or others have any other questions, please ask.
Regarding notation in your paper:
By x in {0,1}^m do you mean an x in R^m, comprising as elements only ones and zeros?
--I meant that x is a string of m bits.
In your equations involving modular quantities (e.g. { n, (T^1)[n],
, (T^L)[n] } = x~Mod~2), should it be assumed that the equation is modular? i.e. { n, (T^1)[n],
, (T^L)[n]} = x mod(2) is equivalent to ({ n, (T^1)[n],
, (T^L)[n]} - x)~Mod~2 = 0) ; if this is not the case and my interpretation of x from above is correct then x = x~Mod~2 != { n, (T^1)[n],
, (T^L)[n] }.
--Your interpretation is correct that those equations are modular - they are equivalences, not really equations.
If I have interpreted {0,1}^m correctly, then for any x in {0,1}^m (random or otherwise) why should more than m bits be required to describe x?
--When one describes a string of m bits on a computer text-file by listing all of the m bits, usually more than m bits are necessary, since computer text files are usually written in ASCII, having more characters besides zeroes and ones. That is why I specified in that definition of "random" that "at least" m bits are necessary instead of what one might initially assume.
Also:
Im not familiar with the proof of Theorem 1 by Lagarias, but assuming it is correct, then there does exist an n in N such that { n, (T^1)[n],
, (T^L)[n]}~Mod~2 = x, where x in {0,1}^(L+1) is the random vector you defined, but where in Theorem 1 does it suggest 0=( (T^(L+1))[n] - (T^L)[n] )~Mod~2 ?
--Theorem 1 does not apply directly to vector x, but if one creates a new vector x' of size L+2 in which the first L+1 entries are the entries of x and the (L+2) th entry is identical to the (L+1) th entry of x, then Theorem 1 applied to x' would suggest that there exists an n such that 0=( (T^(L+1))[n] - (T^L)[n] ) (Mod 2) as well as (n, (T^1)[n],
, (T^L)[n]) (Mod 2) = x.
Craig
Posted by: Joshua Rosenthall
I too enjoy the prospect of finding the Collatz Conjecture and the Riemann Hypothesis unprovable (and possibly using such a quality as grounds for confirmation of said conjectures). However, as Antony did, I have a few questions concerning your paper. Forgive me if they seem trivial.
In your Definition 2 you specify that "vector X in {0, 1}^m is random if at least m bits are necesary to describe X". As specified in the above post, your notation of X in {0, 1)^m is meant to signify that X is a string of M bits.
Consider: Define vector Y to be Y = [01] = "Y in {0, 1}^2", that is, Y is a vector of 2 bits consisting of only the digits 0 and 1. Is there a description of vector Y consisting of 2 or less bits; ie, is Y non-random?
Initial instinct would lead to the simple desciption of Y as "01", consisting of only 2 bits. However, your example under Defintion 2 states that "the vector of outcomes of one million coin-tosses has a good chance of fitting out definition of 'random' since much of the time the best way to describe such a vector is to list the results of each coin-toss, in which at least one million bits are necessary."
Consider the listing of each flip that came up "heads" as the digit 0 and each flip that came up "tails" as the digit 1. Then the outcome of 1 coin toss would consist of one 1, the outcomes of 2 tosses would consist of 2 bits... the outcome of 1 million tosses would consist of exactly 1 million bits and, in general, the outcome of N coin tosses would consist of N bits.
However, you state above (reasonably) that such a description is "usually written is ASCII with more charectars besides ones and zeros." While this may be true, the description I descibe in the above paragraph would use only ones and zeros, and thus would not be random. Thus, I call your definition of random into question.
However, you may refute that I did not define a vector, only a string of digits, and that the addition of such a defining portion would push the definition to over 1 million digits. Eg: the document must state "Flip a coin 1 million times. Defining each flip that came up heads as the digit 0 and each flip that came up 1 as tails. The vector Q defined as such = 101010110111101111" etc etc.
However, if you were to require this defining portion, why not require it for all such descriptions? Look back to my above vector Y = [01]. A natural description of such a vector would be "01", therefore Y is non-random. However, if we are requiring the description of Y to explain Y, then the description would have to be "Y = [01]", a definition of Y requiring more than 2 bits. In other words, by this definition, Y is random.
Let me summarize:
I see two possible cases for what you define as describe:
1) If vectors are allowed to be "described" as simply their digits, I cannot see how any vector could be random.
2) If vectors must be defined or declared within their "descriptions", it seems some trivial vectors (eg [01]) would be random.
---------------------------------------------------
Forgive me if any of the above is unclear. I will try to clarify if neccesary.
Posted by: Craig Feinstein
--Joshua, thank you for reading my paper and giving feedback. My comments have two dashes in front of them.
I too enjoy the prospect of finding the Collatz Conjecture and the Riemann Hypothesis unprovable (and possibly using such a quality as grounds for confirmation of said conjectures). However, as Antony did, I have a few questions concerning your paper. Forgive me if they seem trivial.
--I should say that in my paper, I am not using the notion of "unprovable" as a means for confirmation of the Collatz or RH conjectures. I'm claiming that the Collatz problem (and probably RH too) can never be settled unless one finds a counterexample to them. If they are true, then we'll never know this for certain.
In your Definition 2 you specify that "vector X in {0, 1}^m is random if at least m bits are necesary to describe X". As specified in the above post, your notation of X in {0, 1)^m is meant to signify that X is a string of M bits.
Consider: Define vector Y to be Y = [01] = "Y in {0, 1}^2", that is, Y is a vector of 2 bits consisting of only the digits 0 and 1. Is there a description of vector Y consisting of 2 or less bits; ie, is Y non-random?
--You are misunderstanding the definition of "random" in the paper. A vector that cannot be described in less bits than its dimension is random, by definition. So in this case, since y=[01] cannot be described in less than 2 bits, y=[01] is random.
Initial instinct would lead to the simple desciption of Y as "01", consisting of only 2 bits. However, your example under Defintion 2 states that "the vector of outcomes of one million coin-tosses has a good chance of fitting out definition of 'random' since much of the time the best way to describe such a vector is to list the results of each coin-toss, in which at least one million bits are necessary."
Consider the listing of each flip that came up "heads" as the digit 0 and each flip that came up "tails" as the digit 1. Then the outcome of 1 coin toss would consist of one 1, the outcomes of 2 tosses would consist of 2 bits... the outcome of 1 million tosses would consist of exactly 1 million bits and, in general, the outcome of N coin tosses would consist of N bits.
However, you state above (reasonably) that such a description is "usually written is ASCII with more charectars besides ones and zeros." While this may be true, the description I descibe in the above paragraph would use only ones and zeros, and thus would not be random. Thus, I call your definition of random into question.
--The definition of "random" that I gave is really Chaitin's definition of "random". I would recommend visiting Chaitin's website to gain a better understanding of it, here: http://www.cs.auckland.ac.nz/CDMTCS/chaitin/
--You will find that it's not as complicated as you think. I'll give an example: The vector of a string of one million concatenations of the vector [01] is not random because I can describe it is less than two million bits. (I just did.) However, the following vector [10001011100100110000111110010010100101100001] which I "randomly" typed from my keyboard is probably random because it probably cannot be "compressed" into a decription which involves less bits than the size of the vector itself.
However, you may refute that I did not define a vector, only a string of digits, and that the addition of such a defining portion would push the definition to over 1 million digits. Eg: the document must state "Flip a coin 1 million times. Defining each flip that came up heads as the digit 0 and each flip that came up 1 as tails. The vector Q defined as such = 101010110111101111" etc etc.
However, if you were to require this defining portion, why not require it for all such descriptions? Look back to my above vector Y = [01]. A natural description of such a vector would be "01", therefore Y is non-random. However, if we are requiring the description of Y to explain Y, then the description would have to be "Y = [01]", a definition of Y requiring more than 2 bits. In other words, by this definition, Y is random.
Let me summarize:
I see two possible cases for what you define as describe:
1) If vectors are allowed to be "described" as simply their digits, I cannot see how any vector could be random.
2) If vectors must be defined or declared within their "descriptions", it seems some trivial vectors (eg [01]) would be random.
---------------------------------------------------
Forgive me if any of the above is unclear. I will try to clarify if neccesary.
Posted by: Sune Kristian Jakobsen
Craig,
I dont think your approach work:
Consider the function f(n):
If n is even: delete the last digits, and inset the half of this digits to the left (e.g. f(368)=436, f(1234)=2123, f(40)=4, f(0)=0),
If n is odd: Let d be the last digit. move the last digit to the first position, and delete the dth digit from the right (mod length of n). (e.g. f(13573)=3157, f(1357)=735, f(1)=0, f(12345)=1234).
Now for every n there is a k, such that f^(k)(n)=0:
Proof:
Let g(n) = number of digits in n + the digit sum of n.
Now g(n) only take integer values, and for n>0, g(f(n))<g(n). Therefore f^(g(n))(n)=0.
But if the argument you use to prove that the 3n+1 conjecture is unprovable works, I can't see why it shouldn't work on this problem:
To calculate f(n) you need to know the last digit in n, and for every vector x in {0,1,2,3,4,5,6,7,8,9}^k, it is possible to find a n such that x=(n,f(n),f(f(n))
,f^(k-1)(n)) (mod 10). Since for arbitrary large k this vector contains arbitrary much information, a proof that for every n there is a k such that f^(k)(n)=0 should also contain arbitrary much information. But the above proof is only finitely long.
I think that your paper only proves that a brute force proof of the 3n+1 conjecture has to be infinitely long.
I hope it is possible to understand the above, but please ask if some of it is unclear.
Sune
Posted by: Craig Feinstein
Sune,
Thank you for reading my paper and giving your feedback.
First of all, it is not obvious to me why your statement for every vector x in {0,1,2,3,4,5,6,7,8,9}^k, it is possible to find a n such that x=(n,f(n),f(f(n))
,f^(k-1)(n)) (mod 10) is true, but I have no reason to doubt it. Of course, this is the correct analogue of Theorem 1 in my paper, so if it were true, then Theorem 1 in my paper would certainly apply to your function f.
Nevertheless, regardless of whether Theorem 1 applies to your function f, Theorem 2 in my paper is still false for your function f. The reason for this is because the formula for the function f^(k)(n) is always 0 for any n and k=g(n), so it is not true that there is a one-to-one correspondence between all of the possible formulas for the function f^(k)(n) and all of the possible values for (n,f(n),f(f(n))
,f^(k-1)(n)) (mod 10) when k=g(n), which is the case for the Collatz function T^(k)(n) and is used in the proof of Theorem 2.
In the language of NKS, applying T iteratively is a computationally irreducible process, as the formula for T^(k)(n) cannot be reduced to anything simpler than its definition. However, applying f iteratively is not a computationally irreducible process, since we know that it will always be 0 when k=g(n). (The process associated with function f may still be computationally irreducible before it reaches 0 though. It is not always easy to tell whether a process is computationally irreducible or not.)
You say: I think that your paper only proves that a brute force proof of the 3n+1 conjecture has to be infinitely long.
It is more accurate to say that any proof of the 3n+1 conjecture must be brute force, since applying the Collatz 3n+1 function T is a computationally irreducible process. Therefore, any proof of the 3n+1 conjecture must be infinitely long.
If the Collatz conjecture is true, then it's "true by accident", as Gregory Chaitin would say. There is no reason why it is true. His theory about the number that he calls Omega predicts that mathematics is filled with statements with the characteristic, true but unprovable. Of course, we can never know for certain that any particular conjecture, for instance the Collatz conjecture, is true but unprovable. We can only say that a conjecture is true, false, unprovable, or irrefutable.
Posted by: Sune Kristian Jakobsen
Craig,
Theorem 1 applies to the function f: You can simply construct the number from the right. E.g. if the vector is (1,9,3) you start from the right. Last digits must be 1. Now the last digit in f(n) is the 3rd last in n, so n end by 9?1. Now the last digit in f(f(n)) is the 4th last digit in n, so n end by 39?1. If you enough digits to the right of this number (e.g. n=99999999993991), you will get a number n such that: (1,9,3)=(n,f(n),f(f(n))).
it is not true that there is a one-to-one correspondence between all of the possible formulas for the function f^(k)(n) and all of the possible values for (n,f(n),f(f(n))
,f^(k-1)(n)) (mod 10) when k=g(n), which is the case for the Collatz function T^(k)(n) and is used in the proof of Theorem 2.
I do not understand that. You need to know n (mod 10) to calculate f(n), like you need to know n (mod 2) to calculate T(n). The only difference is, that we know a function g(n), such that g(n)=0 => n=0 and g(f(n))<g(n), and only take positive integer values. If we could find a simple function h(n) such that:
h(n)=0 => n=1 or 2
h only assumes positive integer values
h(T(n))<h(n) if n>2
we would know that the 3x+1-conjecture is true. I do not think that you have proven that it is impossible to find such function and prove that the function have these properties.
Sune
Posted by: Craig Feinstein
Sune,
You said:
"If we could find a simple function h(n) such that:
h(n)=0 => n=1 or 2
h only assumes positive integer values
h(T(n))<h(n) if n>2
we would know that the 3x+1-conjecture is true. I do not think that you have proven that it is impossible to find such function and prove that the function have these properties."
My response is that yes, if you could find such a function, then we would know that the 3x+1 conjecture is true. But it is clearly impossible to find such a function; theorem 2 would be false if you could find such a function. And I have explained why Theorem 2 does not apply to your function f:
it is not true that there is a one-to-one correspondence between all of the possible formulas for the function f^(k)(n) and all of the possible values for (n,f(n),f(f(n))
,f^(k-1)(n)) (mod 10) when k=g(n)
There is such a one-to-one correspondence for the Collatz function T for any k. See formulas 2.2-2.6 of Lagarias' paper, http://www.cecm.sfu.ca/organics/pap...000000000000000
Posted by: Sune Kristian Jakobsen
Craig,
I can't see why the one-to-one correspondence proves that we can't find a function like h:
You only have to prove that h(f(x))<h(x) to prove that h(f^(k)(x))<h(x) for every k. So if we want to prove the conjecture by finding such a function h, we do not have to prove that the function h works for every sting of 0's and 1's, but only the strings (0) and (1).
Sune
Posted by: Craig Feinstein
Sune said:
"I can't see why the one-to-one correspondence proves that we can't find a function like h:
You only have to prove that h(f(x))<h(x) to prove that h(f^(k)(x))<h(x) for every k. So if we want to prove the conjecture by finding such a function h, we do not have to prove that the function h works for every sting of 0's and 1's, but only the strings (0) and (1)."
My response: I have proven Theorem 2 and you haven't found any logical flaw in the proof. All you are doing is asking a "what if" question - 'what if we could find such a function h?' But by Theorem 2, it is impossible to find such a function h.
Posted by: Sune Kristian Jakobsen
Craig,
There is a logical flaw in your proof or at least you use a statement that you have not proved:
You say (at the end of the proof of theorem 2) that "in order to prove that T ^(k)(n) = 1, it is necessary to specify the formula for T ^(k)(n) in the proof."
This is not the case for my function f, and it is not at all obvious that that is the case for T.
Sune
Posted by: Craig Feinstein
Sune said:
"There is a logical flaw in your proof or at least you use a statement that you have not proved:
You say (at the end of the proof of theorem 2) that 'in order to prove that T ^(k)(n) = 1, it is necessary to specify the formula for T ^(k)(n) in the proof.'
This is not the case for my function f, and it is not at all obvious that that is the case for T."
But this *is* the case for your function f as well as for T. In fact, this is the case for any generic function g - in order to prove that g(n)=1, one must specify the formula for g(n) in the proof. Otherwise, how can one know that g(n)=1?
Posted by: Sune Kristian Jakobsen
"this *is* the case for your function f as well as for T. In fact, this is the case for any generic function g - in order to prove that g(n)=1, one must specify the formula for g(n) in the proof. Otherwise, how can one know that g(n)=1?"
I have already explained how you can prove that f^(k)(n)=1, if k=g(n). This proof doesn't specify the formula for f^(k)(n).
Sune
Posted by: Craig Feinstein
Sune said:
"I have already explained how you can prove that f^(k)(n)=1, if k=g(n). This proof doesn't specify the formula for f^(k)(n)."
His proof is:
"Now for every n there is a k, such that f^(k)(n)=0:
Proof:
Let g(n) = number of digits in n + the digit sum of n.
Now g(n) only take integer values, and for n>0, g(f(n))<g(n). Therefore f^(g(n))(n)=0."
My response to Sune: Your proof does in fact specify the formula for f^(k)(n) when k=g(n). The formula for f^(k)(n) is a constant 0 when k=g(n).
In the case of T^k(n), formulas 2.2-2.6 of Lagarias' paper describe the formula for T^k(n):
http://www.cecm.sfu.ca/organics/pap...000000000000000
Notice that this formula is not a constant and it varies with the parity vector of n. And this is the difference between your function f and the Collatz function T.
Posted by: Sune Kristian Jakobsen
I could also make a formula for f, that depended on the value of f^(k)(n) (mod 10), but lets take a easier example:
F(n)=n/2 if n is even, (n-1)/2 if n is odd.
For every positive integer n there is a k such that F^(k)(n)=0.
But I could also make a formula for F^(k)(n) that depends on the parity vector of n: F^(k)(n)=n/2^k-p, where p is the number you get by reading 0. p1, p2, p3 i base 2 and (p1, p2... ) is the parity vector of n.
Sune
Posted by: Craig Feinstein
Sune said:
"I could also make a formula for f, that depended on the value of f^(k)(n) (mod 10),"
My response: You can't do this when k=g(n). In this case, f^k(n) does not depend on n, since f^k(n) is a constant.
Sune said:
"but lets take a easier example:
F(n)=n/2 if n is even, (n-1)/2 if n is odd.
For every positive integer n there is a k such that F^(k)(n)=0.
But I could also make a formula for F^(k)(n) that depends on the parity vector of n: F^(k)(n)=n/2^k-p, where p is the number you get by reading 0. p1, p2, p3 i base 2 and (p1, p2... ) is the parity vector of n."
My response: But when k is sufficiently large, this formula reduces to a constant 0. However, for the 3n+1 function T, the formula for T^k(n) is dependent on the parity vector of n. It doesn't reduce to a constant for any k, even when k is dependent on n. And this is why Theorem 2 holds true for the Collatz function but does not hold true for both of your functions f and F.
Posted by: Sune Kristian Jakobsen
I don' t think I have understood theorem 2 correctly. Could you please try to explain precisely how you prove it, and for which functions the theorem holds?
Sune
Posted by: Craig Feinstein
Sune said: "I don' t think I have understood theorem 2 correctly. Could you please try to explain precisely how you prove it, and for which functions the theorem holds?"
I'll try my best to explain it: The underlying concept of Theorem 2 is actually one of the key concepts in the book NKS, namely computational irreducibility. In the case of the Collatz function, this means that in order to determine what happens when the Collatz procedure is applied to some number n after k steps, it is necessary to carry out k steps.
Why is this? Notice that the formula for T^k(n) cannot be reduced to anything simpler than the formula in the Lagarias paper, namely
T^k(n) = lambda_k(n)*n + rho_k(n),
where lambda_k(n) and rho_k(n) are functions of the parity vector of n, defined in the Lagarias paper. Therefore, T^k(n) is computationally irreducible in that it is impossible to determine the value of T^k(n) without knowing the value of the parity vector of n. And this is why Theorem 2 is true.
Note that theorem 2 is true also for T(n)=(5n+1)/2 if n is odd and n/2 if n is even. It is also true for T(n)=(17n+1)/2 if n is odd and n/2 is even. However, it is not clear that the Collatz conjecture would be true in these cases. There are probably counter-examples, since on average these sequences will increase every step instead of decrease on average as is the case with the 3x+1 function.
The 3x+1 function is unique in that it is likely that the 3x+1 conjecture is both true and unprovable. There are not many conjectures known to have this property in which it is probably true but impossible to prove true. However, in my paper "Complexity Science for Simpletons", I show that this is the case for the Riemann Hypothesis as well - it is probably true but it is nevertheless unprovable.
In the case of the Collatz conjecture and the Riemann Hypothesis, we have a process which is so complex that it is impossible to know how the process will end without carrying out the process, yet it appears that how the process ends is always the same and is simple. Probability is the best we can do to explain why the Collatz process always seems to halt at 1 and why the nontrivial roots of the Riemann Zeta function are always on the critical line.
You might like the following movie "Dangerous Knowledge" which relates to all of this: http://video.google.com/videoplay?d...earch&plindex=0
Forum Sponsored by Wolfram Research
© 2004-2008 Wolfram Research, Inc. | Powered by vBulletin 2.3.0 © 2000-2002 Jelsoft Enterprises, Ltd. |
Disclaimer
vB Easy Archive Final - Created by Xenon and modified/released by SkuZZy from the Job Openings