I recently discovered (thanks to Paul Wise) the [Meow hash][]. The TL;DR
is that it's a fast non-cryptographic hash which might be useful for
git-annex. Here's their intro, quoted from the website:

[Meow hash]: https://mollyrocket.com/meowhash

> The Meow hash is a high-speed hash function named after the character
> Meow in [Meow the Infinite][]. We developed the hash function at
> [Molly Rocket][] for use in the asset pipeline of [1935][].
>
> Because we have to process hundreds of gigabytes of art assets to build
> game packages, we wanted a fast, non-cryptographic hash for use in
> change detection and deduplication. We had been using a cryptographic
> hash ([SHA-1][]), but it was unnecessarily slowing things down.
>
> To our surprise, we found a lack of published, well-optimized,
> large-data hash functions. Most hash work seems to focus on small input
> sizes (for things like dictionary lookup) or on cryptographic quality.
> We wanted the fastest possible hash that would be collision-free in
> practice (like SHA-1 was), and we didn't need any cryptographic security.
>
> We ended up creating Meow to fill this niche.

[1935]: https://molly1935.com/
[Molly Rocket]: https://mollyrocket.com/
[Meow the Infinite]: https://meowtheinfinite.com/
[SHA-1]: https://en.m.wikipedia.org/wiki/SHA-1

I don't have an immediate use case for this, but I think it could be
useful to speed up checks on larger files. The license is a *little*
weird but seems close enough to a BSD license to be acceptable.

I know it might sound like a conflict of interest, but I *swear* I am
not bringing this up only as an oblique feline reference. ;)

-- [[anarcat]]