wolframscience.com

A New Kind of Science: The NKS Forum : Powered by vBulletin version 2.3.0 A New Kind of Science: The NKS Forum > Pure NKS > Hashing
  Last Thread   Next Thread
Author
Thread Post New Thread    Post A Reply
janos

CT

Registered: Nov 2004
Posts: 23

Hashing

On page 622 the Book says that hashing requires a rough uniform distribution of values to be able to retrieve the data.

Here is a small program with Mathematica 5.1 which does NOT show this uniform distribution. Any good explanation ?

In[75]:=
MapIndexed[{First[#2],
Hash[#1]} & ,
ToExpression /@
(StringJoin["aaaa", "$",
ToString[#1]] & ) /@
Sort[Table[Random[
Integer, {9, 10}],
{i, 1, 20}]]]
Out[75]=
{{1, 1584071780},
{2, 1584071780},
{3, 1584071780},
{4, 1584071780},
{5, 1584071780},
{6, 1584071780},
{7, 1584071780},
{8, 1584071780},
{9, 1584071780},
{10, 1149946229},
{11, 1149946229},
{12, 1149946229},
{13, 1149946229},
{14, 1149946229},
{15, 1149946229},
{16, 1149946229},
{17, 1149946229},
{18, 1149946229},
{19, 1149946229},
{20, 1149946229}}

__________________
--------------------------------------
"..because Annushka has already bought
sunflower oil, and not only bought it, but
spilled it too."
Bulgakov: Master and Margarita

Report this post to a moderator | IP: Logged

Old Post 01-05-2005 03:18 PM
janos is offline Click Here to See the Profile for janos Click here to Send janos a Private Message Edit/Delete Message Reply w/Quote
Aaron Turner


Registered: Jan 2005
Posts: 1

Perhaps I'm missing something here, but you've merely hashed two strings repeatedly - you expect a uniform distribution?

Report this post to a moderator | IP: Logged

Old Post 01-09-2005 01:23 AM
Aaron Turner is offline Click Here to See the Profile for Aaron Turner Click here to Send Aaron Turner a Private Message Click Here to Email Aaron Turner Edit/Delete Message Reply w/Quote
janos

CT

Registered: Nov 2004
Posts: 23

Yes, I expect two values "close" to each other. For example if I has aaaa8 and aaaa9 the the distribution of hashing values are differ just in 1.

In[2]:=
MapIndexed[{First[#2],
Hash[#1]} & ,
ToExpression /@
(StringJoin["aaaa", "$",
ToString[#1]] & ) /@
Sort[Table[Random[
Integer, {8, 9}],
{i, 1, 20}]]]
Out[2]=
{{1, 1584071779},
{2, 1584071779},
{3, 1584071779},
{4, 1584071779},
{5, 1584071779},
{6, 1584071779},
{7, 1584071779},
{8, 1584071779},
{9, 1584071779},
{10, 1584071779},
{11, 1584071779},
{12, 1584071780},
{13, 1584071780},
{14, 1584071780},
{15, 1584071780},
{16, 1584071780},
{17, 1584071780},
{18, 1584071780},
{19, 1584071780},
{20, 1584071780}}

So the question is why there is the jump for aaaa9 and aaaa10 ? The big jump puts the hash value in a very far region of the memory and to retrieve it might require paging of memory - inbetween - in and out, making Hashing slow, ie defeating its original purpose.

__________________
--------------------------------------
"..because Annushka has already bought
sunflower oil, and not only bought it, but
spilled it too."
Bulgakov: Master and Margarita

Report this post to a moderator | IP: Logged

Old Post 01-14-2005 04:04 PM
janos is offline Click Here to See the Profile for janos Click here to Send janos a Private Message Edit/Delete Message Reply w/Quote
Post New Thread    Post A Reply
  Last Thread   Next Thread
Show Printable Version | Email this Page | Subscribe to this Thread


 

wolframscience.com  |  wolfram atlas  |  NKS online  |  web resources  |  contact us

Forum Sponsored by Wolfram Research

© 2004-13 Wolfram Research, Inc. | Powered by vBulletin 2.3.0 © 2000-2002 Jelsoft Enterprises, Ltd. | Disclaimer | Archives