Thank you Michael Rabin for your excellent work. Rest in Peace.
Rabin Fingerprinting is one of my favorites of his contributions. It's a "rolling hash" that allows you to quickly compute a 32-bit (or larger) hash at *every* byte offset of a file. It is used most notably to do file block matching/deduplication when those matching blocks can be at any offset. It's tragically underappreciated.
I've been meaning to write up a tutorial as part of my Galois Field series. Someday..
I recently found his fingerprint algorithm and wrote a utility that uses it to find duplicate MIPS code for decompilation[0] and build unique identifiers that can be used to find duplicates without sharing any potentially copyrighted data[1].
This replaced some O(n²) searches through ASCII text, reducing search time from dozens of seconds to fractions of a second.
Important to note that FastCDC is about an order of magnitude for block deduplication and is generally considered the state of art for such an approach (speed of computing the hash is more important than absolutely optimal distribution of hashes).
I'm working on a data annotation system based around Rabin fingerprints. They're a really neat idea.
I especially like how if you end up with hash characteristics that you don't like, your can just select a different irreducible Galois polynomial and now you've got a whole new hash algorithm. It's like tuning to a different frequency.
For me it means I don't have to worry about cases where there aren't enough nearby fingerprints for the annotation to adhere to, I can just add or remove polynomials until I get a good density.
I wrote a Rabin—Karp implementation in ~2006 as part of the spam and threat scanning stack for the MX Logic mail service. It was incredibly performant, letting us test {n} bytes against an essentially unlimited number of string signatures in O(n) time.
I took his Introduction to Cryptography class when he was a visiting professor at Columbia. Absolute master of an old-school chalkboard lecturer. They don't make them like that any more.
I had the incredible good fortune to take one of his classes in college, and I loved it so much I took another just to learn from him again. A tremendous intellect AND an incredibly engaging and talented instructor. It would be an exaggeration to say that I knew him, but nevertheless he had a great impact on my education and my life. He will be missed.
Amazing man, with many important contributions over a very long career. The Rabin Cryptosystem (like RSA, but with public exponent 2) is notable for two reasons. First, unlike RSA, it is provably as hard as "factorization" (as he would call it), and second, unlike RSA, it wasn't protected by patent.
I took a course from him as a graduate student. I was not (and am still not) a theoretician. But I enjoyed the class and Professor Rabin's lectures.
A friend of mine was one of his graduate students and a teaching assistant for the class. He pointed out to me once that Professor Rabin would state many of his points during lecture twice. Once I started listening more carefully, I found this to be true. It was both subtle and pedagogically effective.
English was not his first language, but he enjoyed his struggles with it. I remember him stumbling over the pronunciation of a word during class. Giving up with a smile, he said, "This is a word I know only from books."
Michael O. Rabin had important contributions in many domains, but from a practical point of view the most important are his contributions to cryptography.
After Ralph Merkle, Whitfield Diffie and Martin Hellman, Michael O. Rabin is the most important of the creators of public-key cryptography.
The RSA team (Ron Rivest, Adi Shamir and Leonard Adleman) is better known than Michael O. Rabin, but that is entirely due to marketing and advertising, because they founded a successful business.
In reality the RSA algorithm is superfluous and suboptimal. If the RSA team had never discovered this algorithm, that would have had a null impact on the practice of cryptography. Public-key cryptography would have been developed equally well, because the algorithms discovered by Merkle, Diffie, Hellman and Rabin are necessary and sufficient.
On the other hand, while without the publications of RSA, cryptography would have evolved pretty much in the same way, without the publications of Michael O. Rabin from the late seventies the development of public-key cryptography would have been delayed by some years, until someone else would have made the same discoveries.
Together with Ralph Merkle, Michael O. Rabin was the one who discovered the need for secure cryptographic hash functions, i.e. one-way hash functions, which are now critical for many applications, including digital signatures. Thus Rabin is the one who has shown how the previously proposed methods of digital signing must be used in practice. For example, the original signing algorithm proposed by RSA could trivially be broken and it became secure only in the modified form described by Rabin, i.e. with the use of a one-way hash function.
Originally, Merkle defined 2 conditions for one-way hash functions, of resistance to first preimage attacks and second preimage attacks, while Rabin defined 1 condition, of resistance to collision attacks. Soon after that it was realized that all 3 conditions are mandatory, so the 2 definitions, of Merkle and of Rabin, have been merged into the modern definition of such hash functions.
Unfortunately, both Merkle and Rabin have overlooked a 4th condition, of resistance to length extension attacks. This should have always been included in the definition of secure hash functions.
Because this 4th condition was omitted, the US Secure Hash Algorithm Standards defined algorithms that lack this property, which has forced many applications to use workarounds, like the HMAC algorithm, which for many years have wasted time and energy wherever encrypted communications were used, until more efficient authentication methods have been standardized, which do not use one-way hash functions, for instance GCM, which is today the most frequently used authentication algorithm on the Internet.
"As a young boy, he was very interested in mathematics and his father sent him to the best high school in Haifa, where he studied under mathematician Elisha Netanyahu, who was then a high school teacher."
Interesting. Some people are lucky enough to find their vocation quite early in life.
This collection of "Rabinisms" [1] (thanks to the Internet Archive for keeping it from being lost) can give you some of the flavor of the delightful experience that it was to take one of his classes. RIP.
[If P = NP,] then all of modern cryptography collapses. On this happy thought... (1998-11-24)
I'm one of those people who is just never wrong. I mean, not one of those people. I'm like everybody else. Nobody is ever wrong. (1998-12-08)
After all I said, put here the word "obvious". (1998-12-15)
I am going to show that in one round the probability of not reaching agreement is less or equal to 2. ... Yeah, we're establishing new ground in probability theory. (1998-12-17)
It's more than 10 years old. It's either classical or incorrect. (Fall 1998)
Born in Breslau, nowadays Wroclaw. Had enough of a luck that his parents escaped Germany before the war. Many other people weren't as prescient...
Also, as a teen he was taught mathematics by a certain Elisha Netanyahu, who was an uncle of the current Israeli Prime Minister. What an unexpected connection, at least for me.
I am 90% finished writing an article about Miller-Rabin primality testing. A few weeks ago I was looking around and found out that Rabin was still alive, which I hadn't expected... and was wondering if I should try to contact him to ask a few questions regarding his motivation to explore stochastic algorithms. Too late. :-(
Before AI and the swell of papers for money(tenure), not necessarily in that order, science mattered. As a result, the science mattered more in the past. RIP Rabin.
90 comments
Rabin Fingerprinting is one of my favorites of his contributions. It's a "rolling hash" that allows you to quickly compute a 32-bit (or larger) hash at *every* byte offset of a file. It is used most notably to do file block matching/deduplication when those matching blocks can be at any offset. It's tragically underappreciated.
I've been meaning to write up a tutorial as part of my Galois Field series. Someday..
Thank you again!
This replaced some O(n²) searches through ASCII text, reducing search time from dozens of seconds to fractions of a second.
0 - https://github.com/ttkb-oss/mipsmatch 1 - https://github.com/ttkb-oss/mipsmatch/wiki/Identifiers
I especially like how if you end up with hash characteristics that you don't like, your can just select a different irreducible Galois polynomial and now you've got a whole new hash algorithm. It's like tuning to a different frequency.
For me it means I don't have to worry about cases where there aren't enough nearby fingerprints for the annotation to adhere to, I can just add or remove polynomials until I get a good density.
I wrote a Rabin—Karp implementation in ~2006 as part of the spam and threat scanning stack for the MX Logic mail service. It was incredibly performant, letting us test {n} bytes against an essentially unlimited number of string signatures in O(n) time.
A friend of mine was one of his graduate students and a teaching assistant for the class. He pointed out to me once that Professor Rabin would state many of his points during lecture twice. Once I started listening more carefully, I found this to be true. It was both subtle and pedagogically effective.
English was not his first language, but he enjoyed his struggles with it. I remember him stumbling over the pronunciation of a word during class. Giving up with a smile, he said, "This is a word I know only from books."
https://youtu.be/L3FZzGU3n14
After Ralph Merkle, Whitfield Diffie and Martin Hellman, Michael O. Rabin is the most important of the creators of public-key cryptography.
The RSA team (Ron Rivest, Adi Shamir and Leonard Adleman) is better known than Michael O. Rabin, but that is entirely due to marketing and advertising, because they founded a successful business.
In reality the RSA algorithm is superfluous and suboptimal. If the RSA team had never discovered this algorithm, that would have had a null impact on the practice of cryptography. Public-key cryptography would have been developed equally well, because the algorithms discovered by Merkle, Diffie, Hellman and Rabin are necessary and sufficient.
On the other hand, while without the publications of RSA, cryptography would have evolved pretty much in the same way, without the publications of Michael O. Rabin from the late seventies the development of public-key cryptography would have been delayed by some years, until someone else would have made the same discoveries.
Together with Ralph Merkle, Michael O. Rabin was the one who discovered the need for secure cryptographic hash functions, i.e. one-way hash functions, which are now critical for many applications, including digital signatures. Thus Rabin is the one who has shown how the previously proposed methods of digital signing must be used in practice. For example, the original signing algorithm proposed by RSA could trivially be broken and it became secure only in the modified form described by Rabin, i.e. with the use of a one-way hash function.
Originally, Merkle defined 2 conditions for one-way hash functions, of resistance to first preimage attacks and second preimage attacks, while Rabin defined 1 condition, of resistance to collision attacks. Soon after that it was realized that all 3 conditions are mandatory, so the 2 definitions, of Merkle and of Rabin, have been merged into the modern definition of such hash functions.
Unfortunately, both Merkle and Rabin have overlooked a 4th condition, of resistance to length extension attacks. This should have always been included in the definition of secure hash functions.
Because this 4th condition was omitted, the US Secure Hash Algorithm Standards defined algorithms that lack this property, which has forced many applications to use workarounds, like the HMAC algorithm, which for many years have wasted time and energy wherever encrypted communications were used, until more efficient authentication methods have been standardized, which do not use one-way hash functions, for instance GCM, which is today the most frequently used authentication algorithm on the Internet.
Interesting. Some people are lucky enough to find their vocation quite early in life.
[If P = NP,] then all of modern cryptography collapses. On this happy thought... (1998-11-24)
I'm one of those people who is just never wrong. I mean, not one of those people. I'm like everybody else. Nobody is ever wrong. (1998-12-08)
After all I said, put here the word "obvious". (1998-12-15)
I am going to show that in one round the probability of not reaching agreement is less or equal to 2. ... Yeah, we're establishing new ground in probability theory. (1998-12-17)
It's more than 10 years old. It's either classical or incorrect. (Fall 1998)
[1] https://web.archive.org/web/20210509160248/http://www.eecs.h...
Also, as a teen he was taught mathematics by a certain Elisha Netanyahu, who was an uncle of the current Israeli Prime Minister. What an unexpected connection, at least for me.
We are all in his debt.