KECB mode

Note: this is a draft posted for public comment. Please wait until it has been reviewed more thoroughly before relying on it. If you have any comments please mail them to me at <david.hopwood@lmh.ox.ac.uk>, or post them (with a reference to this page) to the newsgroup sci.crypt.

[Bold type] marks places where additional information will be added.

KECB mode (Keystream/electronic codebook mode)

This is an encryption mode which combines a block cipher with a keystream generator. I'm not entirely happy with the security of the conventional block cipher modes (CBC, CFB, OFB, ECB), in particular the extent to which they allow a man-in-the-middle attack to influence the decrypted plaintext. Even if an attacker can only cause particular blocks to decrypt to gibberish without detection, that is potentially a problem. It's difficult to predict how important this kind of attack will be without knowing the format of the higher-level protocol, or data format that is being encrypted, and that's why it should be prevented in encryption methods that claim to be general-purpose.

KECB mode was designed for connections over packet-based networks, but the same ideas can be applied in other situations. It has an overhead of a few bytes per record, plus any padding needed to extend the data to an exact number of blocks. A "record" in this case just refers to a piece of data whose integrity must be checked independently. It could be an IP packet, a block of data sent using some higher-level protocol, a disk block, or whatever. Records encrypted using a given session key are ordered, and the intention is to prevent an attacker from reordering them, as well as changing their contents.

The basic idea is as follows:

Split the original plaintext into records (each an exact number of blocks long), with each record including a check value.
XOR each record with bits from a keystream generator.
Encrypt each block of the result using a block cipher.

The rationale behind this is as follows:

to ensure that unless the keystream generator has been broken, a man-in-the-middle attack cannot be used to tamper with the decrypted output without being detected.
to make it difficult to attack the keystream generator, since a cryptanalyst does not know the output of the XOR unless he/she has (at least partially) broken the block cipher.
to make known plaintext attacks against the block cipher more difficult, since the input to it will look like random data.
to allow the keystream to be calculated in advance, or buffered just before it is needed, in order to speed up part of the calculation. The block algorithm still needs to be run at least once per record, of course.

Whenever a record with an incorrect check value is received, the receiver can look ahead up to several block lengths in the keystream, to see whether the resulting decryptions make sense (i.e. have a correct check value). If yes, the blocks up to that point have been lost. Otherwise, the current record was inserted. A changed record will look like one inserted and one lost record.

Some keystream generators (for example, Blum Blum Shub and SEAL) have the property that the ith bit of the stream can be calculated from the seed efficiently, without having to calculate any of the previous bits. If a keystream generator has this random-access property, so does KECB mode when used with that generator.

The mode requires two pieces of secret random data: S1, which will be used as the key for the block cipher, and S2, which will be used as part of the seed for the keystream generator. Note that the difficulty of a brute-force attack on the generator depends on the minimum of the number of bits in its internal state, and the number of bits in S2. S2 should be long enough for this not to be a problem.

For bidirectional communication, it's essential that there are two independent keystream generators, one for each direction (if the generator is good enough, this probably just requires adding another bit to the seed to indicate the direction). It's also possible, but not absolutely necessary, to have different block cipher keys for each direction, and/or change the key in mid-session. I'll just describe the simplest option; using more than one key probably doesn't add much security unless sessions last for a long time.

Suppose P= the ith block of the original plaintext, after
             prepending the LSB and check, and padding with zeros.
        
X= the ith block of the keystream output.
        
C= the ith block of the ciphertext.
        
S1 = the session key


Then    C= E[S1](P XOR X)

It's as simple as that.

The combination of a keystream generator and a block cipher is much more secure (IMHO) than either alone. It means that the input to the block cipher is randomly distributed, and unknown to an attacker (so many kinds of known and chosen plaintext attacks are infeasible, including the straightforward method of encrypting a likely plaintext block with every possible key - but see the meet-in-the-middle attack below).

It also avoids a serious problem with keystream generators (and block ciphers used in a keystream-like mode, such as CFB or OFB): if a bit of the original plaintext is known, the corresponding bit of the decrypted text can be influenced by an attacker. For example, if Mallory is a man-in-the middle on a connection in which he thinks that the message

        "Alice owes Mallory Ł100.00"

is being sent, he can XOR the ciphertext with

        "                       .  "
    XOR "                       0  "

before resending it. This is not good.

In other words, a keystream generator alone may provide security of the message, but has no protection against it being tampered with. KECB mode provides both, as well as making brute-force attacks on the block cipher very difficult, even with known or chosen plaintext.

Attacks against KECB

[Some of these could be explained a little better.]

There is a meet-in-the-middle attack against KECB, similar to those possible in other cases in which two algorithms are combined, or double encryption is used. It is not particularly serious, provided S1 and S2 are long enough to force the attack to require an infeasible amount of memory.

Create a table mapping each possible block-sized keystream output, to the internal generator state(s) that produce that output. Given a known plaintext/ciphertext pair, decrypt the ciphertext with each possible key, XOR with the plaintext, and look up the result in the table. This will give zero or more possible states; for each of them, run the generator on to another known plaintext/ciphertext pair to see whether the state is correct. The table can be re-used across sessions.

Let n = the symmetric key length
    m = the size in bits of the generator's internal state.

The complexity of this attack is 2^max(n, n+blocksize-m) + 2

operations in total [I think; this needs checking more carefully], and the memory for a table containing 2^blocksize * m bits (there are also other possible time/space tradeoffs). Suppose the generator state is 64 bits and the block size is also 64 bits; this requires memory on the order of 134 million terabytes (2 ^67

bytes).

You can also do a similar attack (also using a known plaintext/ciphertext pair) 'the other way around': create a table which maps all possible decryptions of the ciphertext block to the corresponding key(s). Then XOR the plaintext block with the keystream output for each possible generator state, and look up the result in the table. That will give zero or more candidate (block cipher key, generator state) pairs that can be checked to see whether they are correct. For this attack the table cannot be re-used across sessions. The complexity is 2^max(m, m+blocksize-n) + 2 operations, and memory for a table containing 2 blocks. For DES this is around 524,000 terabytes, or 2 ^59 bytes; since that isn't totally inconceivable, use a cipher with a larger key than DES.

Note that if you happen to have prepared a table which maps ciphertext to the key for common plaintext blocks (e.g. all-zeros and all-spaces), you almost certainly can't use it against KECB mode; those plaintext blocks are no more likely than any other block after being XOR'd with the keystream.

Another possible avenue of attack makes use of repetition in the keystream. Because of the birthday paradox, we can expect to find two equal blocks in the keystream after something on the order of 2 ^(blocksize/2) blocks. This doesn't necessarily lead to the corresponding ciphertext blocks being repeated, unless the plaintext for those blocks is also equal. It's difficult to see how this could be exploited effectively.

A more serious problem is that the internal generator state could repeat. This is likely to happen after about 2 ^(m/2) steps (note: this is the number of internal steps of the generator, not necessarily the number of blocks). In that case, if the period is P, blocks (x and x+P), (x+1 and x+1+P) etc. will be XOR'd with the same values. The cryptanalyst still does not know what these values are, but it's too much information to be reassuring.

For a generator with 64 bits internal state (which is too small), this is 2 ^32 blocks, or (assuming a 64-bit block size) just over 3 days on a 1 Mbit/s link. If sessions are going to last for long periods, change the session key often enough to make this kind of attack infeasible.

[Also need to consider differential and linear cryptanalysis, etc.]

David Hopwood
<david.hopwood@lmh.ox.ac.uk>

Suppose	P=	the ith block of the original plaintext, after prepending the LSB and check, and padding with zeros.
	X=	the ith block of the keystream output.
	C=	the ith block of the ciphertext.
	S1 =	the session key
Then	C=	E[S1](P XOR X)

Let	n =	the symmetric key length
	m =	the size in bits of the generator's internal state.