I made some improvements to the Hash Analysis Tool. Feel free to check it out; it's in the public Store repository. The package comment is below for your perusal. Also, let me know if you have any requests or comments about the current implementations of hash and identityHash in Cincom Smalltalk.
Hash Analysis Tool.
This tool analyzes hash quality for given object data sets. It can automatically determine the following.
- Number of collisions.
- Collision rate (objects per hash value).
- Hash quality (where 100% means zero collisions and 50% means one collision for every two objects).
- Normalized chi square for hash values (optimum is zero, higher is worse).
- Normalized chi square modulo a variety of primes (optimum is zero, higher is worse). This matters because a hash with few collisions overall should still produce few collisions when its values are taken modulo the size of the hash table.
- Assorted timings of hash calculations.
- A score based on the number of collisions and the time it takes to calculate the hash.
- Whether the data set contains objects such that x = y but x hash ~= y hash.
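
To make a few of these measurements concrete, here is a rough sketch in Python. It is not the tool's code, and the tool itself is written in Smalltalk. The function names are hypothetical, and the formulas are assumptions chosen to match the descriptions above: collisions are counted as objects minus distinct hash values, quality as distinct values divided by objects, and the chi square is normalized by dividing by the number of objects. The tool's actual normalization may differ.

```python
from collections import Counter
from itertools import combinations


def collision_stats(hashes):
    """Return (collisions, collision_rate, quality) for a non-empty list of hash values."""
    n = len(hashes)
    distinct = len(set(hashes))
    collisions = n - distinct      # objects that landed on an already used hash value
    collision_rate = n / distinct  # objects per hash value; 1.0 is ideal
    quality = distinct / n         # 1.0 means zero collisions, 0.5 one collision per two objects
    return collisions, collision_rate, quality


def normalized_chi_square(hashes, buckets):
    """Chi square of the hash values modulo `buckets`, divided by the number of objects.

    Assumed normalization: 0.0 for a perfectly even spread, larger is worse.
    """
    n = len(hashes)
    expected = n / buckets
    counts = Counter(h % buckets for h in hashes)
    chi2 = sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(buckets))
    return chi2 / n


def equal_but_different_hash(objects, hash_fn):
    """Return pairs (x, y) with x == y but hash_fn(x) != hash_fn(y)."""
    return [
        (x, y)
        for x, y in combinations(objects, 2)
        if x == y and hash_fn(x) != hash_fn(y)
    ]


if __name__ == "__main__":
    print(collision_stats([7, 7]))  # → (1, 2.0, 0.5)
    # Multiples of 3 look fine as raw values but all fall into one bucket modulo 3.
    print(normalized_chi_square([0, 3, 6], 3))
    # repr is used as a deliberately bad hash: 1 == 1.0, but '1' != '1.0'.
    print(equal_but_different_hash([1, 1.0, 2], repr))
```

The second call shows why the modulo check is useful: the values 0, 3 and 6 are all distinct, so they have zero collisions overall, yet in a table of size 3 every one of them lands in the same bucket.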