[Benford Distribution et al] - A New Kind of Science: The NKS Forum



Benford Distribution et al



Posted by: John Reith

In 1970 I contacted Ralph Raimi about his 1969 Scientific American article on the distribution of first digits. I suggested that this distribution results from how we measure, as opposed to how we count. Historically, a unit of measure (meter, pound, etc.) is randomly defined. When we encounter something in the real world that we wish to measure (a random magnitude), we divide the unit of measure into the unknown magnitude and observe the quotient: essentially the division of two random numbers. I tried unsuccessfully to prove that this results in the Benford distribution.

Raimi suggested I check my premise by multiplying two random numbers on a computer, which should be equivalent. I did this, as well as dividing two random numbers, and got quite different distributions of first digits in the two cases:

division: .33, .15, .10, .08, .07, .069, .065, .063, .062
multiplication: .24, .18, .15, .12, .10, .076, .059, .047, .034

Of interest was that the multiplication distribution matched the Stigler distribution (which Raimi related to the Benford) almost exactly. This was a new finding.

Of even greater interest, if one ordered the two random numbers prior to division, one got entirely different distributions depending on whether the smaller number was placed in the numerator or in the denominator:

smaller in the numerator: .063, .105, .114, .117, .118, .121, .121, .121
smaller in the denominator: .602, .194, .092, .050, .029, .017, .010, .005, .001

As can be seen, small/large produces something close to a uniform distribution. The second digits for small/large were almost perfectly uniform: .095, .096, .099, .099, .095, .104, .105, .102, .104, .100. This would be expected if the first digits were indeed uniform.
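The experiment described above can be sketched in a few lines of Wolfram Language. This is only an illustration, not the original program: it assumes both numbers are drawn independently and uniformly from (0, 1), which the post does not state, it uses RandomReal (Mathematica 6 and later), and the names leadingDigit, digitProportions and firstDigitExperiment are hypothetical.

```mathematica
(* Leading (first significant) decimal digit of a positive real number *)
leadingDigit[x_?Positive] := First[First[RealDigits[x]]]

(* Proportions of leading digits 1..9 in a list of positive reals *)
digitProportions[list_List] :=
  With[{digits = leadingDigit /@ list},
    N[Table[Count[digits, d], {d, 9}]/Length[list]]]

(* Assumption: both numbers are independent and uniform on (0, 1) *)
firstDigitExperiment[n_Integer?Positive] :=
  Module[{x, y, small, large},
    {x, y} = RandomReal[1, {2, n}];
    small = MapThread[Min, {x, y}];
    large = MapThread[Max, {x, y}];
    {"division" -> digitProportions[x/y],
     "multiplication" -> digitProportions[x*y],
     "small/large" -> digitProportions[small/large],
     "large/small" -> digitProportions[large/small]}]

firstDigitExperiment[100000]
```

Each entry of the result is a list of nine proportions, for digits 1 through 9, that can be set beside the figures quoted above. The values change from run to run because the inputs are random.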
A co-worker, Robert Kennard, who was a statistician, derived the theoretical distributions of first digits for the above four operations and showed that the division small/large should give exactly a uniform distribution (1/9 for each digit). This result was published in Commun. Statist.-Simula., B10(1), 97-98 (1981). There have been no reader remarks on this publication in the 22 years since, though I would think the uniform distribution would be of interest as a theoretical random number generator, and I believe it may have some applications in NKS research. Ironically, about eight years after my correspondence with Raimi, a Ralph Allen wrote to Raimi with reasoning almost identical to mine, but he derived the corresponding four distributions by geometrical analysis, measuring the relative sizes of the areas occupied when plotting the resulting first digits for division and multiplication.
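Kennard's derivation is not reproduced in this thread. As a sketch only, here is one standard argument for the small/large case, under the assumption (not stated in the post) that the two numbers X and Y are independent and uniform on (0, 1). By symmetry the ratio R is itself uniform on (0, 1), and a uniform number is equally likely to start with any of the nine digits:

```latex
\begin{align*}
R &= \frac{\min(X,Y)}{\max(X,Y)}, \qquad X, Y \sim U(0,1) \text{ independent},\\
P(R \le t) &= 2\,P(X \le tY) = 2\int_0^1 t\,y\,dy = t, \qquad 0 \le t \le 1,\\
P(\text{first digit of } R = d)
  &= \sum_{k=1}^{\infty} P\!\left(d \cdot 10^{-k} \le R < (d+1) \cdot 10^{-k}\right)
   = \sum_{k=1}^{\infty} 10^{-k} = \frac{1}{9}, \qquad d = 1, \dots, 9.
\end{align*}
```

The same reasoning gives 1/10 for each second digit, 0 through 9, which agrees with the nearly uniform second-digit figures reported for small/large.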

I will break this posting at this point but I have much more detail if anyone is interested.

John Reith



Posted by: Ed Pegg Jr

NKS discusses Benford distributions on page 914. Here's a quote:

Even though in individual numbers generated by simple mathematical procedures all possible digits often appear to occur with equal frequency, leading digits in sequences of numbers typically do not. Instead it is common for a leading digit s in base b to occur with frequency Log[b, (s+1)/s] (so that in base 10 1’s occur 30% of the time and 9’s 4.5%). This will happen whenever FractionalPart[Log[b, a[n]]] is uniformly distributed, which, as discussed on page 905, is known to be true for sequences such as r^n (with Log[b, r] irrational), n^n, n!, Fibonacci[n], but not r n, Prime[n] or Log[n]. A logarithmic law for leading digits is also found in many practical numerical tables, as noted by Simon Newcomb in 1881 and Frank Benford in 1938.
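The formula in the quote can be tried on one of the sequences it mentions. The following is a minimal sketch, assuming r = 2 and the first 1000 powers (both arbitrary choices); benfordProportions and powerDigitProportions are hypothetical names, not functions from NKS.

```mathematica
(* Benford proportions Log[b, (s + 1)/s] for leading digits s = 1..b-1 *)
benfordProportions[b_Integer : 10] := N[Table[Log[b, (s + 1)/s], {s, b - 1}]]

(* Observed leading-digit proportions of r^n for n = 1..nmax, using exact integers *)
powerDigitProportions[r_Integer, nmax_Integer, b_Integer : 10] :=
  With[{digits = Table[First[IntegerDigits[r^n, b]], {n, nmax}]},
    N[Table[Count[digits, s], {s, b - 1}]/nmax]]

(* Compare powers of 2 in base 10 with the logarithmic law *)
{powerDigitProportions[2, 1000], benfordProportions[]}
```

Because the powers are kept as exact integers, IntegerDigits reads off the true leading digit with no floating-point rounding.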



Posted by: Fiona Maclachlan

I'm attaching a Mathematica notebook in which I replicate Reith's results from dividing and multiplying random numbers to get Benford-like distributions.

I take one more step and average the leading-digit proportions from the division and multiplication cases, arriving at something that rivals Benford in matching real-world incidences of the leading-digit phenomenon.
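The notebook itself is not part of this archive. As a rough illustration of the averaging step, the sketch below uses the rounded division and multiplication figures reported in the first post instead of fresh simulation output; the variable names are introduced here for illustration only.

```mathematica
(* Proportions reported in the first post, rounded as given there *)
division = {.33, .15, .10, .08, .07, .069, .065, .063, .062};
multiplication = {.24, .18, .15, .12, .10, .076, .059, .047, .034};

(* Digit-by-digit average of the two cases *)
average = (division + multiplication)/2;

(* Benford's Law for comparison *)
benfordRef = N[Table[Log[10, (s + 1)/s], {s, 9}]];

(* Columns: digit, averaged proportion, Benford proportion, difference *)
TableForm[Transpose[{Range[9], average, benfordRef, average - benfordRef}],
  TableHeadings -> {None, {"digit", "average", "Benford", "difference"}}]
```

With these rounded inputs the average for digit 1 is 0.285, against 0.301 from Benford's Law.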

Reith's intuition about the effect of the combination of units of measurement with the quantity being measured may well be correct. But, in any case, it provides a good opportunity for the NKS method. In the model in my Mathematica notebook, outputs from Rule 30 (i.e., from the RNG) are being transformed with simple operations to yield fairly good predictions of an empirical regularity.

I hope this is of interest.



Posted by: Fiona Maclachlan

I tried a slightly different approach to test Reith's conjecture that Benford's Law emerges in those cases in which one random number is divided by another.

Instead of sampling from the unit interval with Random[], I tried sampling from a random interval using
Random[Real, {0, Random[]}].

It works quite well. Here are the leading digit proportions from a million ratios followed by those from Benford's Law:
{{0.305277, 0.173325, 0.121712, 0.094917, 0.078952, 0.067319, 0.05919, 0.052369, 0.046939},
 {0.30103, 0.176091, 0.124939, 0.09691, 0.0791812, 0.0669468, 0.0579919, 0.0511525, 0.0457575}}

I used this program where length refers to the number of ratios that are constructed:

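run[length_] := Module[{LDP},
  (* Benford's Law proportions for leading digits 1 through 9 *)
  benfordslaw = N[Table[Log[10, (s + 1)/s], {s, 9}]];
  (* Fraction of entries with leading digit s, counted over every decade y spanned by the data *)
  leadingdigitproportions[list_List] :=
    Table[
      N[Total[
        Table[Count[list, x_ /; s*10^y <= x < (s + 1)*10^y],
          {y, Floor[Log[10, Min[list]]], Ceiling[Log[10, Max[list]]]}]]/Length[list]],
      {s, 9}];
  (* Ratios of two numbers, each drawn uniformly from its own random subinterval {0, Random[]} *)
  LDP = leadingdigitproportions[
    Table[Random[Real, {0, Random[]}]/Random[Real, {0, Random[]}], {length}]];
  {LDP, benfordslaw}]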
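One possible way to call the program above and summarize how far the simulated proportions sit from Benford's Law. This is only a usage sketch: the sample size of 100000 is an arbitrary choice, and the names ldp and expected are introduced here for illustration.

```mathematica
(* Run the simulation with 100000 ratios (arbitrary sample size) *)
{ldp, expected} = run[100000];

(* Largest absolute gap between simulated and Benford proportions over digits 1..9 *)
Max[Abs[ldp - expected]]
```

Because the ratios are random, the gap differs from run to run.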

Also of interest is the fact that Simon Newcomb derived Benford's Law using the idea of dividing random numbers as a starting point. Writing in 1881, he states:
As natural numbers occur in nature, they are to be considered as the ratios of quantities. Therefore, instead of selecting a number at random, we must select two numbers, and inquire what is the probability that the first significant digit of their ratio is the digit n.

It is surprising that Newcomb's explanation is not more widely discussed in the literature on Benford's Law. His proof is brief, but is it on the right track?

Newcomb's two page note can be found here:
http://home.manhattan.edu/~fiona.maclachlan/newcomb.pdf




