Some notes on MD5-crypt ======================= Notation: "..." represents an ASCII encoding of a string. || means concatenation x >>> k means x shifted right by k bits (i.e. floor(x/2^k)) x << k means x shifted left by k bits (i.e. x*2^k) x & y means the bitwise-AND of x and y. The output is encoded using the characters "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" (the same set as used by the Traditional crypt(3) algorithm). The salt is variable length, between 0 and 8 characters from the above set, with each character representing 6 bits. Let password be the password (as a byte string in some unspecified ASCII-compatible character encoding). salt be the ASCII-encoded form of the salt. The definition of the algorithm itself is just plain wierd: foo := MD5(password || salt || password); // extend(str, len) is str repeated to fill len bytes. bar := password || "$1$" || salt || extend(foo, length(password)); // yes, this makes absolutely no sense: i := length(password); while (i > 0) { if (i mod 2 != 0) { bar := bar || 0; } else { bar := bar || password[0]; } i := i >>> 1; } baz := MD5(bar); // [*] // ifnz(i, s) = "", if i == 0 // = s, otherwise i := 0; while (i < 1000) { baz := MD5(baz || ifnz(i mod 3, salt) || ifnz(i mod 7, password) || password); i++; baz := MD5(password || ifnz(i mod 3, salt) || ifnz(i mod 7, password) || baz); i++; } // encode(x) = ("./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ" || // "abcdefghijklmnopqrstuvwxyz")[x] // baz is encoded with an obscure byte ordering. output := "$1$" || salt || "$" || encode( baz[12] & 0x3F ) || encode((baz[12] >>> 6) | ((baz[ 6] << 2) & 0x3F)) || encode((baz[ 6] >>> 4) | ((baz[ 0] << 4) & 0x3F)) || encode( baz[ 0] >>> 2 ) || encode( baz[13] & 0x3F ) || encode((baz[13] >>> 6) | ((baz[ 7] << 2) & 0x3F)) || encode((baz[ 7] >>> 4) | ((baz[ 1] << 4) & 0x3F)) || encode( baz[ 1] >>> 2 ) || encode( baz[14] & 0x3F ) || encode((baz[14] >>> 6) | ((baz[ 8] << 2) & 0x3F)) || encode((baz[ 8] >>> 4) | ((baz[ 2] << 4) & 0x3F)) || encode( baz[ 2] >>> 2 ) || encode( baz[15] & 0x3F ) || encode((baz[15] >>> 6) | ((baz[ 9] << 2) & 0x3F)) || encode((baz[ 9] >>> 4) | ((baz[ 3] << 4) & 0x3F)) || encode( baz[ 3] >>> 2 ) || encode( baz[ 5] & 0x3F ) || encode((baz[ 5] >>> 6) | ((baz[10] << 2) & 0x3F)) || encode((baz[10] >>> 4) | ((baz[ 4] << 4) & 0x3F)) || encode( baz[ 4] >>> 2 ) || encode( baz[11] & 0x3F ) || encode( baz[11] >>> 6 ); The wierdness doesn't hurt too much, since bar at point [*] is of the form bar = password || "$1$" || salt || dontcare where the value of 'dontcare' makes no difference to security. The pattern of inputs to the hash in the 1000-iteration loop is no more or less secure (apart from differences in speed) than: while (i < 1000) { baz := MD5(baz || password); } (the fact that the salt is sometimes included makes no significant difference). The main problem is that 1000 iterations is not enough to slow down a dictionary attack as much as would be desirable.