[Hashing] - A New Kind of Science: The NKS Forum

A New Kind of Science: The NKS Forum

Pages:1



Hashing

(Click here to view the original thread with full colors/images)



Posted by: janos

On page 622 the Book says that hashing requires a rough uniform distribution of values to be able to retrieve the data.

Here is a small program with Mathematica 5.1 which does NOT show this uniform distribution. Any good explanation ?

In[75]:=
MapIndexed[{First[#2],
Hash[#1]} & ,
ToExpression /@
(StringJoin["aaaa", "$",
ToString[#1]] & ) /@
Sort[Table[Random[
Integer, {9, 10}],
{i, 1, 20}]]]
Out[75]=
{{1, 1584071780},
{2, 1584071780},
{3, 1584071780},
{4, 1584071780},
{5, 1584071780},
{6, 1584071780},
{7, 1584071780},
{8, 1584071780},
{9, 1584071780},
{10, 1149946229},
{11, 1149946229},
{12, 1149946229},
{13, 1149946229},
{14, 1149946229},
{15, 1149946229},
{16, 1149946229},
{17, 1149946229},
{18, 1149946229},
{19, 1149946229},
{20, 1149946229}}



Posted by: Aaron Turner

Perhaps I'm missing something here, but you've merely hashed two strings repeatedly - you expect a uniform distribution?



Posted by: janos

Yes, I expect two values "close" to each other. For example if I has aaaa8 and aaaa9 the the distribution of hashing values are differ just in 1.

In[2]:=
MapIndexed[{First[#2],
Hash[#1]} & ,
ToExpression /@
(StringJoin["aaaa", "$",
ToString[#1]] & ) /@
Sort[Table[Random[
Integer, {8, 9}],
{i, 1, 20}]]]
Out[2]=
{{1, 1584071779},
{2, 1584071779},
{3, 1584071779},
{4, 1584071779},
{5, 1584071779},
{6, 1584071779},
{7, 1584071779},
{8, 1584071779},
{9, 1584071779},
{10, 1584071779},
{11, 1584071779},
{12, 1584071780},
{13, 1584071780},
{14, 1584071780},
{15, 1584071780},
{16, 1584071780},
{17, 1584071780},
{18, 1584071780},
{19, 1584071780},
{20, 1584071780}}

So the question is why there is the jump for aaaa9 and aaaa10 ? The big jump puts the hash value in a very far region of the memory and to retrieve it might require paging of memory - inbetween - in and out, making Hashing slow, ie defeating its original purpose.





Forum Sponsored by Wolfram Research

© 2004-2009 Wolfram Research, Inc. | Powered by vBulletin 2.3.0 © 2000-2002 Jelsoft Enterprises, Ltd. | Disclaimer
vB Easy Archive Final - Created by Xenon and modified/released by SkuZZy from the Job Openings