Use the wrong hash, lose all your cash!

SHA-2 is a family of hash functions used by much of today’s Internet and by many cryptographic systems (including TLS, SSL, SSH, and Bitcoin). SHA-2 was published in 2001, and in 2007 NIST announced a competition to select a new hash function, SHA-3, to complement it. The Keccak team’s submission was announced as the winner in 2012, and since then developers (including Ethereum’s) have implemented “sha3” from that submission. However, in 2014, NIST made slight changes to the Keccak submission and published FIPS 202, which became the official SHA-3 standard in August 2015. So there’s a lot of old code that calls itself sha3 but has not been updated to the standard. This article explains the current widespread inconsistency around “sha3” and encourages developers to be more precise about SHA-3 and older code.

SHA2, Keccak, SHA3 timeline

You should be aware that old code will NOT produce the same hashes as SHA-3. When you’re using a “sha3” library, do you know if it is old code or the standard SHA-3?

A simple test is to hash an empty input with SHA3-256.

The correct output per the standard is:

a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a

A lot of old code is actually Keccak-256, which produces this output instead:

c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470

Code that produces this output should not describe itself as sha3.
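
The empty-input test can be turned into a small check. What follows is only a sketch: classifySha3 and nodeSha3 are hypothetical names made up for this example, and it assumes a Node.js build whose OpenSSL exposes a hash named sha3-256 (the code skips the call if it does not). The two expected digests are the ones quoted in this article.

```javascript
const crypto = require('crypto');

// Digests of the empty input, as quoted in this article.
const SHA3_256_EMPTY = 'a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a';
const KECCAK_256_EMPTY = 'c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470';

// Hypothetical helper: hashFn is any function that takes a byte buffer
// and returns a hex digest. We feed it the empty input and compare.
function classifySha3(hashFn) {
  const digest = String(hashFn(Buffer.alloc(0))).toLowerCase();
  if (digest === SHA3_256_EMPTY) return 'SHA3-256 (FIPS 202)';
  if (digest === KECCAK_256_EMPTY) return 'Keccak-256 (pre-standard)';
  return 'unknown';
}

// Wrapper around Node's built-in hash (assumes 'sha3-256' is available).
function nodeSha3(bytes) {
  return crypto.createHash('sha3-256').update(bytes).digest('hex');
}

// Only call the built-in hash if this Node build actually provides it.
if (crypto.getHashes().includes('sha3-256')) {
  console.log('node crypto sha3-256:', classifySha3(nodeSha3));
}
```

To check a third-party “sha3” library, wrap its 256-bit hash in a function of the same shape and pass that to classifySha3 instead.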

Let’s try to avoid using the term sha3 when referring to the old behavior. If you need to describe old behavior, use a term like Keccak, which is what other projects are using to decrease confusion. Let’s use the term sha3 only for behavior that is consistent with the standard.

We want to use the term Keccak because we don’t want developers who use standard SHA-3 libraries to be left wondering why their SHA-3 hashes do not match “sha3” hashes that were produced by old behavior.

Dance from Bali named Kecak
The Keccak team’s proposal won the NIST competition for SHA-3. The name Keccak comes from Kecak, a Balinese dance. [1, 2]

Similarly, please don’t call something sha3 unless you have checked that it really is SHA-3. If you just assume the “standard” sha3 library in your programming language is SHA-3, you may be mistaken. For example, in JavaScript, the npm package named sha3 is not yet conformant.

Another library that has not been updated is the popular CryptoJS. Unfortunately, the readme on GitHub makes no mention of this, and the user must look at the old site to find the “NOTE: I made a mistake when I named this implementation SHA-3. It should be named Keccak.” So any dependent of CryptoJS that uses sha3 is using old code that would be better described by a term like Keccak. Also, beware of online sha3 calculators, because some of them have not been updated to SHA-3.

For a correct example, see js-sha3. You can see the difference between sha3 and old behavior, and its updated test.

+  var KECCAK_PADDING = [1, 256, 65536, 16777216];
+  var PADDING = [6, 1536, 393216, 100663296];
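
The two arrays in that diff look opaque, but they appear to be nothing more than one suffix byte shifted into each of the four byte positions of a 32-bit word: 0x01 for the original Keccak padding and 0x06 for SHA-3 as standardized in FIPS 202. Here is a short sketch of that reading; paddingWords is a hypothetical helper written for this article, not part of js-sha3.

```javascript
// Shift a one-byte padding suffix into each byte position of a 32-bit word.
function paddingWords(suffix) {
  return [0, 8, 16, 24].map((shift) => suffix << shift);
}

console.log(paddingWords(0x01)); // values: 1, 256, 65536, 16777216
console.log(paddingWords(0x06)); // values: 6, 1536, 393216, 100663296
```

So the whole difference between the old behavior and the standard comes down to which byte is mixed in right after the message: 0x01 or 0x06.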

NIST says “SHA-3 testing is in development”, but fortunately they have test examples and do confirm that the SHA3-256 of an empty input is:

a7ffc6f8bf1ed76651c14756a061d662f580ff4de43b49fa82d80a4b80f8434a

(Disappointingly, NIST hasn’t updated their SSL certificate for almost 2 years.)

The above is a call to all “sha3” library projects to make it clear whether the code is SHA-3 or not, and what the plans are for updating to the standard. Dependent projects should also consider clarifying their codebase, or updating to libraries that are SHA-3.

To the Ethereum community

Ethereum is using the same underlying algorithm that is used in SHA-3, and so benefits from its security, but its protocol is using a version of the algorithm which is slightly different from FIPS 202. The Ethereum Yellow Paper specifically notes “Keccak-256 hash function (as per the winning entry to the SHA-3 contest).”
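
To make “same underlying algorithm, slightly different version” concrete, here is a minimal and unoptimized sketch of the 256-bit sponge. It follows the general structure of the Keccak team’s compact reference code, but the names keccakF1600 and keccakSponge256 are this example’s own, and it is meant for illustration only, not production use. The permutation is shared; the only thing that changes between the two variants is the suffix byte, 0x01 for original Keccak-256 and 0x06 for FIPS 202 SHA3-256.

```javascript
const MASK64 = (1n << 64n) - 1n;
// Rotate a 64-bit lane left by n bits.
const rol64 = (v, n) => ((v << n) | (v >> (64n - n))) & MASK64;

// Keccak-f[1600] permutation on 25 lanes (BigInt), lane index = x + 5*y.
function keccakF1600(lanes) {
  let lfsr = 1; // generates the round constants
  for (let round = 0; round < 24; round++) {
    // theta: mix column parities into every lane
    const C = [];
    for (let x = 0; x < 5; x++) {
      C.push(lanes[x] ^ lanes[x + 5] ^ lanes[x + 10] ^ lanes[x + 15] ^ lanes[x + 20]);
    }
    for (let x = 0; x < 5; x++) {
      const D = C[(x + 4) % 5] ^ rol64(C[(x + 1) % 5], 1n);
      for (let y = 0; y < 5; y++) lanes[x + 5 * y] ^= D;
    }
    // rho and pi: rotate each lane and move it to a new position
    let px = 1, py = 0, current = lanes[1];
    for (let t = 0; t < 24; t++) {
      [px, py] = [py, (2 * px + 3 * py) % 5];
      const next = lanes[px + 5 * py];
      lanes[px + 5 * py] = rol64(current, BigInt((((t + 1) * (t + 2)) / 2) % 64));
      current = next;
    }
    // chi: non-linear step along each row
    for (let y = 0; y < 5; y++) {
      const T = [0, 1, 2, 3, 4].map((x) => lanes[x + 5 * y]);
      for (let x = 0; x < 5; x++) {
        lanes[x + 5 * y] = T[x] ^ (~T[(x + 1) % 5] & T[(x + 2) % 5]);
      }
    }
    // iota: XOR the round constant into lane (0, 0)
    for (let j = 0; j < 7; j++) {
      lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) & 0xff;
      if (lfsr & 2) lanes[0] ^= 1n << BigInt((1 << j) - 1);
    }
  }
}

// Sponge with a 136-byte rate and 32-byte output. `suffix` is the padding
// byte: 0x01 for original Keccak-256, 0x06 for SHA3-256.
function keccakSponge256(input, suffix) {
  const rate = 136;
  const state = new Uint8Array(200);
  const permute = () => {
    const lanes = [];
    for (let i = 0; i < 25; i++) { // load lanes, little-endian
      let v = 0n;
      for (let b = 7; b >= 0; b--) v = (v << 8n) | BigInt(state[8 * i + b]);
      lanes.push(v);
    }
    keccakF1600(lanes);
    for (let i = 0; i < 25; i++) { // store lanes back into the byte state
      for (let b = 0; b < 8; b++) {
        state[8 * i + b] = Number((lanes[i] >> BigInt(8 * b)) & 0xffn);
      }
    }
  };
  let pos = 0;
  for (const byte of input) { // absorb the message
    state[pos++] ^= byte;
    if (pos === rate) { permute(); pos = 0; }
  }
  state[pos] ^= suffix;    // the only place the two variants differ
  state[rate - 1] ^= 0x80; // final padding bit
  permute();
  return Array.from(state.subarray(0, 32), (b) => b.toString(16).padStart(2, '0')).join('');
}

const empty = new Uint8Array(0);
console.log('suffix 0x06:', keccakSponge256(empty, 0x06));
console.log('suffix 0x01:', keccakSponge256(empty, 0x01));
```

Run on the empty input, the two calls should reproduce the two digests quoted near the top of this article: the 0x06 suffix gives the standard SHA3-256 value and the 0x01 suffix gives the Keccak-256 value that Ethereum uses.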

In our interactions, let’s describe what we use more accurately. We don’t want developers who use standard SHA-3 libraries to be left wondering why their SHA-3 hashes do not match Ethereum’s hashes. When someone tells us they are using “sha3”, we don’t want to have to question whether they really mean SHA-3 or Keccak-256, so let’s likewise be more accurate with what we tell others. In Solidity, web3.js, the EVM, and documentation, we should consider renaming or aliasing away from “sha3”, perhaps using a term like “ksha3” if “keccak” is unwieldy.

To all

If your code is not consistent with the SHA-3 standard, please make it clear that it is not SHA-3. As cryptography becomes more widespread, developers now and in the future will thank you for minimizing the confusion.


Original title was “Are you really using SHA-3 or old code?”

“Use the wrong hash, lose all your cash!” Original vintage WWII Poster by John Parrot

References

[1] Photo of kecak dance original by Chang’r

[2] From “SHA3 is over. Long live SHA3!” http://blog.cryptographyengineering.com/2012/10/long-live-sha3.html