Checksums and Hashes

I learned to appreciate the value of the Cyclic Redundancy Check (CRC) algorithm in my 8-bit, 300-baud file transfer days. If the CRC of the local file matched the CRC stored in the file (or on the server), I had a valid download. I also learned a little about the pigeonhole principle when I downloaded a file with a matching CRC that was corrupt! An 8-bit CRC has only 256 possible values, after all.
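To make the pigeonhole point concrete, here is a minimal Python sketch. It assumes a plain CRC-8 with polynomial 0x07 and an initial value of 0, which may not be the variant the old transfer tools used; the `crc8` and `find_collision` names and the `file-N` inputs are made up for illustration. Feed it 257 distinct inputs and at least two must share a checksum.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8 (assumed polynomial x^8 + x^2 + x + 1, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift left one bit; fold in the polynomial when the top bit falls out.
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def find_collision(count: int = 257):
    """Checksum `count` distinct inputs and return the first pair that collides.

    With more than 256 distinct inputs and only 256 possible CRC values,
    a collision is guaranteed.
    """
    seen = {}
    for i in range(count):
        data = f"file-{i}".encode("ascii")  # hypothetical stand-in for file contents
        value = crc8(data)
        if value in seen:
            return seen[value], data, value
        seen[value] = data
    return None


if __name__ == "__main__":
    first, second, value = find_collision()
    print(f"{first!r} and {second!r} both have CRC-8 0x{value:02X}")
```

In practice the collision usually shows up long before input 257, which is why a matching 8-bit checksum is only weak evidence of a good download.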


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2005/04/checksums-and-hashes.html

Anecdotally, System.String.GetHashCode seemed to produce a different result in .NET 2.0 Beta 1 than in 1.1.

Yeah, the BCL guys said up front that they reserve the right to break compatibility on hash code values between versions of the runtime. They’re improving the hashing algorithms for better distribution, among other things…

Anecdotally, System.String.GetHashCode seemed to produce a different result in .NET 2.0 Beta 1 than in 1.1.

I built an RSS reader that relied on string combination and hashes to identify items it had already seen, and changing framework versions seemingly changed every hash code in use, so I had a lot of unread items post-upgrade…
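One way to sidestep this kind of breakage is to persist a digest whose output is fixed by a published specification instead of a runtime's built-in hash code. The sketch below is in Python rather than .NET, and the `item_key` function and its `feed_url` and `item_guid` parameters are hypothetical names, not anything from the RSS reader described above. It uses SHA-256 from the standard `hashlib` module, so the same input yields the same key on any version or machine.

```python
import hashlib


def item_key(feed_url: str, item_guid: str) -> str:
    """Return a persistent identifier for a feed item (hypothetical helper).

    SHA-256 output is defined by its specification, so the key survives
    runtime upgrades, unlike a built-in hash code that may change.
    """
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from producing the same payload.
    payload = feed_url.encode("utf-8") + b"\x00" + item_guid.encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    print(item_key("http://example.com/rss", "post-42"))
```

The trade-off is a longer key (64 hex characters) and a slower hash, which rarely matters for a list of feed items.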

Some posts about what GetHashCode does internally, from MS folks.

http://blogs.msdn.com/brada/archive/2003/10/06/50434.aspx

http://blogs.msdn.com/bclteam/archive/2003/10/31/49719.aspx

Bear in mind that System.String.GetHashCode produces a “true” hash computed from the string’s contents, whereas Object.GetHashCode provides a “stable” random number, which clearly isn’t a hash…

http://blogs.gotdotnet.com/BradA/commentview.aspx/b688ad81-1642-4a4b-bff8-a9fdb985fbbc
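The content-versus-identity distinction is easy to see by analogy in Python, where strings hash by content and plain objects fall back to an identity-based hash. This is only an illustration of the idea, not a description of how the .NET runtime computes either value, and the `Feed` class is a made-up example.

```python
class Feed:
    """No __eq__ or __hash__ defined, so hashing falls back to object identity."""

    def __init__(self, url: str):
        self.url = url


a = Feed("http://example.com/rss")
b = Feed("http://example.com/rss")

# String hashes depend on the characters, so equal strings hash equally
# (within a single process).
same_text = hash(a.url) == hash("http://example.com/" + "rss")

# The default object hash is stable for the lifetime of the object...
stable = hash(a) == hash(a)

# ...but it says nothing about contents: a and b hold the same URL, yet they
# compare unequal, so nothing requires their hashes to match.
equal_objects = a == b

print(same_text, stable, equal_objects)
```

So the identity-based value works fine as a dictionary key for that one object, but it is useless for recognizing "the same data" later.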