Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I am using fnv-1a to hash Objective-C method selectors, which are generally just identifier characters and zero or multiple ':'. At the time of my research, fnv-1a had the least collisions over my set of "real life" selectors. I think, it could be worthwhile some time, to try out other constants for maybe even less collisions. Is your list of good primes available ? (And maybe also those that are not quite perfect)


Everything is in the source code. I highly doubt any of the good hash functions listed in smhasher3 (ie all tests passed) would collide over identifiers.

So they should all have zero collisions, meaning there’s no ‘least’ among the good quality ones - they’re all equally collissionless (they differ in other tests).

Sounds like an interesting project. What’s its purpose?


Cool. I forgot to mention, that I am truncating the hash down to 32 bits to hopefully generate tighter CPU instructions. At these few bits, collisions are still rare enough, but they are a concern.

Now my understanding of the choice of prime is that, you are "weighing" the input bits and the computed bits, that will form the hash. So in case of identifiers its very likely that bit 7 of the input is always 0 and say maybe bit 4 is statistically more likely to be 1 by some margin. The other input bits would have some entropy as well. I would expect that certain (imperfect) primes would then aid me to get a better use of the 32 bit space and therefore less risk of a collision for my Objective-C runtime.

You can check out the project here: https://github.com/mulle-objc.


Ah, that's interesting. 32-bits yes you would get some collisions even from good hashes just statistically.

Also now I understand your constraints. Very interesting, so you are designing a custom hash function to use in this specific domain of keys with specific probabilistic properties, and you are thinking that there would be some way you could multiply by a certain prime that would ideally fan out these keys to be evenly distributed over the space?

mulle-objc looks fascinating: fast, portable Objective-C runtime written 100% in C11. I encourage you to post a Show HN I'm sure people here would like it.


Actually I already did, a few years ago: https://news.ycombinator.com/item?id=13030568


Hahaha! :) Good on you. Nothing to stop you posting again :)

Truly I have much experience with Posting Show HN's. There's very little quality difference between something you post that gets 3 points and something you post that gets 300 points. A lot of it depends on time, current dynamic audience, and other posts at the time.

Repeat post to get a better idea of the true interest in your work. I encourage you to do it again!!! :)


Spooky also has some good results on common identifiers.

But fnv-1a is in a completely different league. It's recommended for hash tables with other security measures than hash function security. This hash is a typical hybrid, but not universal. umash would be the perfect hybrid: insecure, pretty fast, passes all tests, universal


Thanks for the umash tip.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: