Hashing algorithm for uniqueness and speed

When digesting data for uniqueness rather than security, SHA is sometimes overkill. In that case I’m only after speed.

Source: https://stackoverflow.com/a/59030375/155351
Homepage: https://cyan4973.github.io/xxHash/

Rubygem

xxhash:

  • uses original C code
  • it’s not clear from the README, but it integrates with the Ruby standard library’s Digest
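
A minimal sketch of calling the gem. It assumes the module-level one-shot functions XXhash.xxh32 and XXhash.xxh64 taking (input, seed), as I understand them from the gem's README; treat those signatures as an assumption and check them against the installed version. The require is guarded so the snippet still runs when the gem is missing.

```ruby
# Guarded require: the xxhash gem may not be installed on this machine.
begin
  require "xxhash"
  gem_loaded = true
rescue LoadError
  gem_loaded = false
end

text = "some payload to fingerprint"
seed = 0 # arbitrary seed chosen for this example

if gem_loaded
  # Assumed API: one-shot functions that return the hash as an Integer.
  puts XXhash.xxh32(text, seed)
  puts XXhash.xxh64(text, seed)
else
  puts "xxhash gem not installed (try: gem install xxhash)"
end
```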

Problems with the xxhash rubygem:

  • It supports XXH32 and XXH64, but not the newer, faster XXH3 variants (XXH3_64 and XXH3_128).
  • It doesn’t match Ruby’s Digest behavior because it calls .to_s on all input, whereas Digest requires the input to support implicit conversion to String (i.e. to have a .to_str method) and raises TypeError otherwise. (Here’s the bad PR: https://github.com/nashby/xxhash/pull/14)
  • It doesn’t match Ruby’s Digest behavior because it doesn’t have class methods such as .hexdigest and .base64digest.
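
For comparison, here is a small sketch of the standard library behavior that the last two points refer to. It uses only Digest::SHA256 from Ruby's stdlib. The Payload class is a hypothetical helper invented for this example, to show that implicit conversion goes through .to_str rather than .to_s.

```ruby
require "digest"

# Hypothetical wrapper, for illustration only: defining #to_str opts the
# object in to implicit String conversion.
class Payload
  def initialize(body)
    @body = body
  end

  def to_str
    @body
  end
end

# Class-level convenience methods provided by stdlib Digest classes.
hex = Digest::SHA256.hexdigest("hello")
b64 = Digest::SHA256.base64digest("hello")

# An object with #to_str is accepted and hashes like the String it wraps.
same = Digest::SHA256.hexdigest(Payload.new("hello")) == hex

# An Integer has #to_s but no #to_str, so Digest refuses it.
error = begin
  Digest::SHA256.hexdigest(123)
  nil
rescue TypeError => e
  e
end

puts hex
puts b64
puts same
puts error.class
```

A drop-in replacement for a stdlib digest would be expected to behave the same way in all four cases, including the TypeError.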


Date
October 30, 2022