RC4 is a variable key-size stream cipher developed by Ron Rivest. It has two phases: the key set-up and the keystream generation. Both phases must be performed for every new key. During the key set-up, the encryption key is used to build the cipher's internal state from two 256-byte arrays, the State S-box and the Key K-box. The S-box is filled linearly, so that S0 = 0, S1 = 1, S2 = 2, ..., S255 = 255. The K-box holds the key, repeated as many times as needed to fill the array. RC4 uses two counters, i and j, which are initialized to zero. In the key set-up phase, the S-box is modified according to the following pseudo-code:
Key set-up phase:
for i = 0 to 255
    j = (j + Si + Ki) mod 256
    swap Si and Sj
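The key set-up pseudo-code above can be sketched in software as follows. This is a minimal Python illustration, not the hardware implementation; the function name `key_setup` and the explicit 256-byte K-box list are choices made here for clarity.

```python
def key_setup(key: bytes) -> list[int]:
    """Run the RC4 key set-up phase and return the permuted S-box."""
    if not 1 <= len(key) <= 256:
        raise ValueError("key must be 1 to 256 bytes long")
    s = list(range(256))                         # S-box filled linearly: S[i] = i
    k = [key[i % len(key)] for i in range(256)]  # K-box: key repeated to 256 bytes
    j = 0                                        # counter j starts at zero
    for i in range(256):
        j = (j + s[i] + k[i]) % 256
        s[i], s[j] = s[j], s[i]                  # swap Si and Sj
    return s


sbox = key_setup(b"Key")
```

Because every step is a swap, the result is always a permutation of 0..255. Repeating the key inside the K-box also means that, for example, the keys `b"ab"` and `b"abab"` produce the same S-box.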
Once the key set-up phase is completed, the counters i and j are reset to zero and the second phase encrypts or decrypts a message. The keystream generation phase produces one keystream byte K per pass through the following pseudo-code:
Keystream generation phase:
i = (i + 1) mod 256
j = (j + Si) mod 256
swap Si and Sj
t = (Si + Sj) mod 256
K = St
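A software sketch of the keystream generation pseudo-code above is given here as a Python generator. The helper `key_setup` repeats the set-up phase so that the block is self-contained; both function names are illustrative and do not come from the hardware design.

```python
from itertools import islice


def key_setup(key: bytes) -> list[int]:
    """Key set-up phase: permute a linearly filled S-box under the key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def keystream(sbox: list[int]):
    """Yield keystream bytes K = St, one per pass through the loop."""
    s = list(sbox)   # work on a copy so the caller's S-box is untouched
    i = j = 0        # both counters start from zero in this phase
    while True:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]      # swap Si and Sj
        t = (s[i] + s[j]) % 256
        yield s[t]


first_bytes = bytes(islice(keystream(key_setup(b"Key")), 9))
```

The generator form mirrors the hardware behaviour of producing keystream bytes for as long as there is data to process.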
To produce the ciphertext, each keystream byte is XORed with the corresponding plaintext byte. Decryption applies the same operation to the ciphertext, which recovers the plaintext.
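Since encryption and decryption are the same XOR operation, a single routine can serve both directions. The following Python sketch combines the two phases; the function name `rc4` and the sample key and message are illustrative assumptions, not part of the described core.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    # Key set-up phase
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Keystream generation phase, XORed byte by byte with the input
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


ciphertext = rc4(b"Key", b"Plaintext")
recovered = rc4(b"Key", ciphertext)   # same call, same key, decrypts
```

Running the routine twice with the same key returns the original message, which is a convenient first check before comparing against published test vectors.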
The cipher was implemented in the Altera Quartus 5.1 environment and supports variable key lengths. The K-box size was reduced to 32 bytes, so for the modulo operations to remain correct the key must be 2^n bits long, where n = 1, ..., 8. The architecture of the RC4 stream cipher consists of a control unit and a storage unit. The storage unit is responsible for the key set-up and keystream generation phases. Its operation is synchronized by the control unit, which generates the appropriate clock and control signals.
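The key-length restriction can be illustrated with a small check. The sketch below works at byte granularity and assumes that the 32-byte K-box is addressed with the counter i taken modulo 32; that addressing scheme and all names in the code are assumptions made for illustration. It tests which key lengths make the reduced K-box deliver the same byte as a key repeated over the full range of i.

```python
KBOX_SIZE = 32  # K-box depth in bytes, as stated in the text


def fill_kbox(key: bytes) -> list[int]:
    """Repeat the key until the 32-byte K-box is full."""
    return [key[n % len(key)] for n in range(KBOX_SIZE)]


def kbox_matches_reference(key_len: int) -> bool:
    """True if K-box[i mod 32] equals key[i mod key_len] for every i in 0..255."""
    key = bytes(range(1, key_len + 1))  # distinct bytes make any mismatch visible
    kbox = fill_kbox(key)
    return all(kbox[i % KBOX_SIZE] == key[i % key_len] for i in range(256))


valid_lengths = [n for n in range(1, KBOX_SIZE + 1) if kbox_matches_reference(n)]
```

Under these assumptions only key lengths that divide 32 bytes pass the check, that is 1, 2, 4, 8, 16 and 32 bytes (8 to 256 bits). For any other length the key pattern is cut off when the address wraps around at 32.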
The storage unit contains memory elements for the S-box and K-box, along with 8-bit registers, adders and one multiplexer.
The block diagram of the S-box is shown in the picture below. It consists of a 256-byte altsyncram block and multiplexers.
Key enter and initialize:
After reset, the first 32 clock cycles are reserved for filling the K-box with the data appearing on the key bus. At the same time, the S-box is linearly initialized. Because the RAM block used is dual-port, the S-box initialization can be done in 128 clock cycles.
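The cycle counts above can be reproduced with a simple behavioural model. The sketch assumes that the key bus delivers one key byte per clock cycle and that the two RAM ports write neighbouring addresses in the same cycle; neither detail is stated in the description, and the function name `initialize` is hypothetical.

```python
def initialize(key_bytes: bytes):
    """Model the key enter and S-box initialization after reset."""
    if len(key_bytes) != 32:
        raise ValueError("this model expects a K-box image of 32 bytes")
    sbox = [None] * 256
    kbox = [None] * 32
    cycles = 0
    for cycle in range(128):
        sbox[2 * cycle] = 2 * cycle            # first RAM port (assumed even address)
        sbox[2 * cycle + 1] = 2 * cycle + 1    # second RAM port (assumed odd address)
        if cycle < 32:
            kbox[cycle] = key_bytes[cycle]     # one key byte per cycle (assumption)
        cycles += 1
    return sbox, kbox, cycles


sbox, kbox, cycles = initialize(bytes(range(32)))
```

In this model the K-box fill is hidden entirely inside the 128 cycles needed for the S-box, since 256 entries written two at a time take 256 / 2 = 128 cycles.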
Key set-up phase:
In this phase the S-box is shuffled in a key-dependent way. In the first clock cycle, the value of counter i is used as the address on the first RAM port, and the value of Si that is read is used to compute the new value of j. Two adders perform this computation, taking Ki and Si as inputs. In the second clock cycle, the new value of j is used as the address on the second port of the RAM block. In the third cycle, Si and Sj are written back to addresses j and i, respectively, which completes the swap. Each iteration therefore needs three clock cycles, and the total time required for the key set-up phase is 256 * 3 = 768 clock cycles.
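The three-cycle schedule described above can be mimicked in a behavioural Python model that counts one cycle per step. It is a sketch only: the function name `key_setup_cycles` is hypothetical, and indexing the 32-byte K-box with i modulo 32 is an assumption about the hardware addressing.

```python
def key_setup_cycles(kbox: list[int]):
    """Run the key set-up as three single-cycle steps per iteration."""
    if len(kbox) != 32:
        raise ValueError("the K-box holds 32 bytes")
    s = list(range(256))
    j = 0
    cycles = 0
    for i in range(256):
        # Cycle 1: read Si through the first port and compute the new j
        si = s[i]
        j = (j + si + kbox[i % 32]) % 256
        cycles += 1
        # Cycle 2: read Sj through the second port
        sj = s[j]
        cycles += 1
        # Cycle 3: write Si to address j and Sj to address i (the swap)
        s[j] = si
        s[i] = sj
        cycles += 1
    return s, cycles


example_kbox = [(3 * n + 1) % 256 for n in range(32)]  # arbitrary test pattern
sbox, cycles = key_setup_cycles(example_kbox)
```

The model finishes after 768 counted cycles and yields the same S-box as the plain software loop, which shows that splitting the swap into separate read and write steps does not change the result.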
Keystream generation phase:
This phase is quite similar to the previous one, so the same hardware is re-used. The difference is that the values of the K-box are not used. After the completion of the first phase, the multiplexer selects the zero-value input and the j_register is initialized to zero. After these two actions, the keystream generation can begin. In the first step, the value of i is used as the address on the first RAM port, and the new value of j is computed. In the second step, the new value of j is used as the address on the second RAM port. In this step, the values of Si and Sj are added and the result appears on the t input of the S-box. In the third step, the Si and Sj values are written to addresses j and i, respectively, and the i counter is incremented. In the fourth step, the value of t is used as the address on the second RAM port, so the value of St is produced and stored in the t_register. At the same time, the value of i is used as the address on the first RAM port. The last three steps are repeated as long as there are data bytes to be encrypted or decrypted. The time needed for the keystream generation phase is 1 + 3n cycles, where n is the number of bytes of the plaintext or ciphertext.
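The cycle counts quoted for the individual phases can be combined into a rough estimate of the total processing time for one message. The sketch below assumes that the three phases run back to back with no extra cycles in between, which the description does not state; the function and constant names are illustrative.

```python
INIT_CYCLES = 128           # key enter and S-box initialization
KEY_SETUP_CYCLES = 256 * 3  # three cycles per key set-up iteration


def keystream_cycles(n_bytes: int) -> int:
    """Cycles for the keystream generation phase: 1 + 3n."""
    if n_bytes < 0:
        raise ValueError("n_bytes must not be negative")
    return 1 + 3 * n_bytes


def total_cycles(n_bytes: int) -> int:
    """Estimated cycles from reset to the last keystream byte (assumed back to back)."""
    return INIT_CYCLES + KEY_SETUP_CYCLES + keystream_cycles(n_bytes)


short_message = total_cycles(16)     # 128 + 768 + 1 + 48
long_message = total_cycles(1024)    # 128 + 768 + 1 + 3072
```

For a 16-byte message the estimate is 945 cycles, almost all of it key handling, while for 1024 bytes it is 3969 cycles and the cost approaches three cycles per byte.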
- A Stream Cipher Encryption Algorithm "Arcfour", K. Kaukonen, R. Thayer, July 1997, http://www.tools.ietf.org/html/draft-kaukonen-cipher-arcfour-01
- Michalis Galanis, Paris Kitsos, Giorgos Kostopoulos, Nicolas Sklavos, and Costas Goutis, Comparison of the Hardware Implementation of Stream Ciphers, Electrical and Computer Engineering Department, University of Patras, Greece
- A Massively Parallel RC4 Key Search Engine, K.H. Tsoi, K.H. Lee and P.H.W. Leong, Department of Computer Science and Engineering, The Chinese University of Hong Kong
- Hardware Implementation of The RC4 Stream Cipher, P. Kitsos, G. Kostopoulos, N. Sklavos, and O. Koufopavlou, VLSI Design Laboratory, Electrical and Computer Engineering Department, University of Patras, Patras, Greece
- Test vectors, http://www.pbcrypto.com/testvectors.php