RC4
General
Designers	Ron Rivest (RSA Security)
First published	Leaked in 1994 (designed in 1987)
Cipher detail
Key sizes	40–2048 bits
State size	2064 bits (1684 effective)
Rounds	1
Speed	7 cycles per byte on original Pentium^[1] Modified Alleged RC4 on Intel Core 2: 13.9 cycles per byte^[2]

In cryptography, RC4 (Rivest Cipher 4, also known as ARC4 or ARCFOUR, meaning Alleged RC4, see below) is a stream cipher. While it is remarkable for its simplicity and speed in software, multiple vulnerabilities have been discovered in RC4, rendering it insecure.^[3]^[4] It is especially vulnerable when the beginning of the output keystream is not discarded, or when nonrandom or related keys are used. Particularly problematic uses of RC4 have led to very insecure protocols such as WEP.^[5]

As of 2015^[update], there is speculation that some state cryptologic agencies may possess the capability to break RC4 when used in the TLS protocol.^[6] IETF has published RFC 7465 to prohibit the use of RC4 in TLS;^[3] Mozilla and Microsoft have issued similar recommendations.^[7]^[8]

A number of attempts have been made to strengthen RC4, notably Spritz, RC4A, VMPC, and RC4⁺.

History

RC4 was designed by Ron Rivest of RSA Security in 1987. While it is officially termed "Rivest Cipher 4", the RC acronym is alternatively understood to stand for "Ron's Code"^[9] (see also RC2, RC5 and RC6).

RC4 was initially a trade secret, but in September 1994, a description of it was anonymously posted to the Cypherpunks mailing list.^[10] It was soon posted on the sci.crypt newsgroup, where it was broken within days by Bob Jenkins.^[11] From there, it spread to many sites on the Internet. The leaked code was confirmed to be genuine, as its output was found to match that of proprietary software using licensed RC4. Because the algorithm is known, it is no longer a trade secret. The name RC4 is trademarked, so RC4 is often referred to as ARCFOUR or ARC4 (meaning alleged RC4)^[12] to avoid trademark problems. RSA Security has never officially released the algorithm; Rivest has, however, linked to the English Wikipedia article on RC4 in his own course notes in 2008^[13] and confirmed the history of RC4 and its code in a 2014 paper by him.^[14]

RC4 became part of some commonly used encryption protocols and standards, such as WEP in 1997 and WPA in 2003/2004 for wireless cards; and SSL in 1995 and its successor TLS in 1999, until it was prohibited for all versions of TLS by RFC 7465 in 2015, due to the RC4 attacks weakening or breaking RC4 used in SSL/TLS. The main factors in RC4's success over such a wide range of applications have been its speed and simplicity: efficient implementations in both software and hardware were very easy to develop.

Description

RC4 generates a pseudorandom stream of bits (a keystream). As with any stream cipher, these can be used for encryption by combining it with the plaintext using bitwise exclusive or; decryption is performed the same way (since exclusive or with given data is an involution). This is similar to the one-time pad, except that generated pseudorandom bits, rather than a prepared stream, are used.

To generate the keystream, the cipher makes use of a secret internal state which consists of two parts:

A permutation of all 256 possible bytes (denoted "S" below).
Two 8-bit index-pointers (denoted "i" and "j").

The permutation is initialized with a variable-length key, typically between 40 and 2048 bits, using the key-scheduling algorithm (KSA). Once this has been completed, the stream of bits is generated using the pseudo-random generation algorithm (PRGA).

Key-scheduling algorithm (KSA)

The key-scheduling algorithm is used to initialize the permutation in the array "S". "keylength" is defined as the number of bytes in the key and can be in the range 1 ≤ keylength ≤ 256, typically between 5 and 16, corresponding to a key length of 40–128 bits. First, the array "S" is initialized to the identity permutation. S is then processed for 256 iterations in a similar way to the main PRGA, but also mixes in bytes of the key at the same time.

for i from 0 to 255
    S[i] := i
endfor
j := 0
for i from 0 to 255
    j := (j + S[i] + key[i mod keylength]) mod 256
    swap values of S[i] and S[j]
endfor

Pseudo-random generation algorithm (PRGA)

The lookup stage of RC4. The output byte is selected by looking up the values of S[i] and S[j], adding them together modulo 256, and then using the sum as an index into S; S(S[i] + S[j]) is used as a byte of the key stream K.

For as many iterations as are needed, the PRGA modifies the state and outputs a byte of the keystream. In each iteration, the PRGA:

increments i;
looks up the ith element of S, S[i], and adds that to j;
exchanges the values of S[i] and S[j], then uses the sum S[i] + S[j] (modulo 256) as an index to fetch a third element of S (the keystream value K below);
then bitwise exclusive ORed (XORed) with the next byte of the message to produce the next byte of either ciphertext or plaintext.

Each element of S is swapped with another element at least once every 256 iterations.

i := 0
j := 0
while GeneratingOutput:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap values of S[i] and S[j]
    t := (S[i] + S[j]) mod 256
    K := S[t]
    output K
endwhile

Thus, this produces a stream of K[0], K[1], ... which are XORed with the plaintext to obtain the ciphertext. So ciphertext[l] = plaintext[l] ⊕ K[l].

RC4-based random number generators

Several operating systems include arc4random, an API originating in OpenBSD providing access to a random number generator originally based on RC4. The API allows no seeding, as the function initializes itself using /dev/random. The use of RC4 has been phased out in most systems implementing this API. Man pages for the new arc4random include the backronym "A Replacement Call for Random" for ARC4 as a mnemonic, as it provides better random data than rand() does.^[15]

In OpenBSD 5.5, released in May 2014, arc4random was modified to use ChaCha20.^[16]^[17] The implementations of arc4random in FreeBSD, NetBSD^[18]^[19] also use ChaCha20.
- Linux typically uses glibc, which did not offer arc4random until 2022. Instead, a separate library, libbsd, offers the function; it was updated to use ChaCha20 in 2016.^[20] In 2022, glibc added its own version of arc4random, also based on ChaCha20.^[21]
According to manual pages shipped with the operating system, in the 2017 release of macOS and iOS operating systems, Apple replaced RC4 with AES in its implementation of arc4random.

Proposed new random number generators are often compared to the RC4 random number generator.^[22]^[23]

Several attacks on RC4 are able to distinguish its output from a random sequence.^[24]

Implementation

Many stream ciphers are based on linear-feedback shift registers (LFSRs), which, while efficient in hardware, are less so in software. The design of RC4 avoids the use of LFSRs and is ideal for software implementation, as it requires only byte manipulations. It uses 256 bytes of memory for the state array, S[0] through S[255], k bytes of memory for the key, key[0] through key[k−1], and integer variables, i, j, and K. Performing a modular reduction of some value modulo 256 can be done with a bitwise AND with 255 (which is equivalent to taking the low-order byte of the value in question).

Test vectors

These test vectors are not official, but convenient for anyone testing their own RC4 program. The keys and plaintext are ASCII, the keystream and ciphertext are in hexadecimal.

Key	Keystream	Plaintext	Ciphertext
Key	EB9F7781B734CA72A719…	Plaintext	BBF316E8D940AF0AD3
Wiki	6044DB6D41B7…	pedia	1021BF0420
Secret	04D46B053CA87B59…	Attack at dawn	45A01F645FC35B383552544B9BF5

Security

Unlike a modern stream cipher (such as those in eSTREAM), RC4 does not take a separate nonce alongside the key. This means that if a single long-term key is to be used to securely encrypt multiple streams, the protocol must specify how to combine the nonce and the long-term key to generate the stream key for RC4. One approach to addressing this is to generate a "fresh" RC4 key by hashing a long-term key with a nonce. However, many applications that use RC4 simply concatenate key and nonce; RC4's weak key schedule then gives rise to related-key attacks, like the Fluhrer, Mantin and Shamir attack (which is famous for breaking the WEP standard).^[25]

Because RC4 is a stream cipher, it is more malleable than common block ciphers. If not used together with a strong message authentication code (MAC), then encryption is vulnerable to a bit-flipping attack. The cipher is also vulnerable to a stream cipher attack if not implemented correctly.^[26]

It is noteworthy, however, that RC4, being a stream cipher, was for a period of time the only common cipher that was immune^[27] to the 2011 BEAST attack on TLS 1.0. The attack exploits a known weakness in the way cipher-block chaining mode is used with all of the other ciphers supported by TLS 1.0, which are all block ciphers.

In March 2013, there were new attack scenarios proposed by Isobe, Ohigashi, Watanabe and Morii,^[28] as well as AlFardan, Bernstein, Paterson, Poettering and Schuldt that use new statistical biases in RC4 key table^[29] to recover plaintext with large number of TLS encryptions.^[30]^[31]

The use of RC4 in TLS is prohibited by RFC 7465 published in February 2015.

Roos' biases and key reconstruction from permutation

In 1995, Andrew Roos experimentally observed that the first byte of the keystream is correlated with the first three bytes of the key, and the first few bytes of the permutation after the KSA are correlated with some linear combination of the key bytes.^[32] These biases remained unexplained until 2007, when Goutam Paul, Siddheshwar Rathi and Subhamoy Maitra^[33] proved the keystream–key correlation and, in another work, Goutam Paul and Subhamoy Maitra^[34] proved the permutation–key correlations. The latter work also used the permutation–key correlations to design the first algorithm for complete key reconstruction from the final permutation after the KSA, without any assumption on the key or initialization vector. This algorithm has a constant probability of success in a time, which is the square root of the exhaustive key search complexity. Subsequently, many other works have been performed on key reconstruction from RC4 internal states.^[35]^[36]^[37] Subhamoy Maitra and Goutam Paul^[38] also showed that the Roos-type biases still persist even when one considers nested permutation indices, like S[S[i]] or S[S[S[i]]]. These types of biases are used in some of the later key reconstruction methods for increasing the success probability.

Biased outputs of the RC4

The keystream generated by the RC4 is biased to varying degrees towards certain sequences, making it vulnerable to distinguishing attacks. The best such attack is due to Itsik Mantin and Adi Shamir, who showed that the second output byte of the cipher was biased toward zero with probability 1/128 (instead of 1/256). This is due to the fact that if the third byte of the original state is zero, and the second byte is not equal to 2, then the second output byte is always zero. Such bias can be detected by observing only 256 bytes.^[24]

Souradyuti Paul and Bart Preneel of COSIC showed that the first and the second bytes of the RC4 were also biased. The number of required samples to detect this bias is 2²⁵ bytes.^[39]

Scott Fluhrer and David McGrew also showed attacks that distinguished the keystream of the RC4 from a random stream given a gigabyte of output.^[40]

The complete characterization of a single step of RC4 PRGA was performed by Riddhipratim Basu, Shirshendu Ganguly, Subhamoy Maitra, and Goutam Paul.^[41] Considering all the permutations, they proved that the distribution of the output is not uniform given i and j, and as a consequence, information about j is always leaked into the output.

Fluhrer, Mantin and Shamir attack

Main article: Fluhrer, Mantin and Shamir attack

In 2001, a new and surprising discovery was made by Fluhrer, Mantin and Shamir: over all the possible RC4 keys, the statistics for the first few bytes of output keystream are strongly non-random, leaking information about the key. If the nonce and long-term key are simply concatenated to generate the RC4 key, this long-term key can be discovered by analysing a large number of messages encrypted with this key.^[42] This and related effects were then used to break the WEP ("wired equivalent privacy") encryption used with 802.11 wireless networks. This caused a scramble for a standards-based replacement for WEP in the 802.11 market and led to the IEEE 802.11i effort and WPA.^[43]

Protocols can defend against this attack by discarding the initial portion of the keystream. Such a modified algorithm is traditionally called "RC4-drop[ $n$ ]", where $n$ is the number of initial keystream bytes that are dropped. The SCAN default is $n$ = 768 bytes, but a conservative value would be $n$ = 3072 bytes.^[44]

The Fluhrer, Mantin and Shamir attack does not apply to RC4-based SSL, since SSL generates the encryption keys it uses for RC4 by hashing, meaning that different SSL sessions have unrelated keys.^[45]

Klein's attack

In 2005, Andreas Klein presented an analysis of the RC4 stream cipher, showing more correlations between the RC4 keystream and the key.^[46] Erik Tews, Ralf-Philipp Weinmann, and Andrei Pychkine used this analysis to create aircrack-ptw, a tool that cracks 104-bit RC4 used in 128-bit WEP in under a minute.^[47] Whereas the Fluhrer, Mantin, and Shamir attack used around 10 million messages, aircrack-ptw can break 104-bit keys in 40,000 frames with 50% probability, or in 85,000 frames with 95% probability.

Combinatorial problem

A combinatorial problem related to the number of inputs and outputs of the RC4 cipher was first posed by Itsik Mantin and Adi Shamir in 2001, whereby, of the total 256 elements in the typical state of RC4, if x number of elements (x ≤ 256) are only known (all other elements can be assumed empty), then the maximum number of elements that can be produced deterministically is also $x$ in the next 256 rounds. This conjecture was put to rest in 2004 with a formal proof given by Souradyuti Paul and Bart Preneel.^[48]

Royal Holloway attack

In 2013, a group of security researchers at the Information Security Group at Royal Holloway, University of London reported an attack that can become effective using only 2³⁴ encrypted messages.^[49]^[50]^[51] While yet not a practical attack for most purposes, this result is sufficiently close to one that it has led to speculation that it is plausible that some state cryptologic agencies may already have better attacks that render RC4 insecure.^[6] Given that, as of 2013^[update], a large amount of TLS traffic uses RC4 to avoid attacks on block ciphers that use cipher block chaining, if these hypothetical better attacks exist, then this would make the TLS-with-RC4 combination insecure against such attackers in a large number of practical scenarios.^[6]

In March 2015, researcher to Royal Holloway announced improvements to their attack, providing a 2²⁶ attack against passwords encrypted with RC4, as used in TLS.^[52]

Bar mitzvah attack

Main article: Bar mitzvah attack

At the Black Hat Asia 2015 Conference, Itsik Mantin presented another attack against SSL using RC4 cipher.^[53]^[54]

NOMORE attack

In 2015, security researchers from KU Leuven presented new attacks against RC4 in both TLS and WPA-TKIP.^[55] Dubbed the Numerous Occurrence MOnitoring & Recovery Exploit (NOMORE) attack, it is the first attack of its kind that was demonstrated in practice. Their attack against TLS can decrypt a secure HTTP cookie within 75 hours. The attack against WPA-TKIP can be completed within an hour and allows an attacker to decrypt and inject arbitrary packets.

RC4 variants

As mentioned above, the most important weakness of RC4 comes from the insufficient key schedule; the first bytes of output reveal information about the key. This can be corrected by simply discarding some initial portion of the output stream.^[56] This is known as RC4-dropN, where N is typically a multiple of 256, such as 768 or 1024.

A number of attempts have been made to strengthen RC4, notably Spritz, RC4A, VMPC, and RC4⁺.

RC4A

Souradyuti Paul and Bart Preneel have proposed an RC4 variant, which they call RC4A.^[57]

RC4A uses two state arrays S1 and S2, and two indexes j1 and j2. Each time i is incremented, two bytes are generated:

First, the basic RC4 algorithm is performed using S1 and j1, but in the last step, S1[i]+S1[j1] is looked up in S2.
Second, the operation is repeated (without incrementing i again) on S2 and j2, and S1[S2[i]+S2[j2]] is output.

Thus, the algorithm is:

All arithmetic is performed modulo 256
i := 0
j1 := 0
j2 := 0
while GeneratingOutput:
    i := i + 1
    j1 := j1 + S1[i]
    swap values of S1[i] and S1[j1]
    output S2[S1[i] + S1[j1]]
    j2 := j2 + S2[i]
    swap values of S2[i] and S2[j2]
    output S1[S2[i] + S2[j2]]
endwhile

Although the algorithm required the same number of operations per output byte, there is greater parallelism than RC4, providing a possible speed improvement.

Although stronger than RC4, this algorithm has also been attacked, with Alexander Maximov^[58] and a team from NEC^[59] developing ways to distinguish its output from a truly random sequence.

VMPC

Main article: Variably Modified Permutation Composition

Variably Modified Permutation Composition (VMPC) is another RC4 variant.^[60] It uses similar key schedule as RC4, with j := S[(j + S[i] + key[i mod keylength]) mod 256] iterating 3 × 256 = 768 times rather than 256, and with an optional additional 768 iterations to incorporate an initial vector. The output generation function operates as follows:

All arithmetic is performed modulo 256.
i := 0
while GeneratingOutput:
    a := S[i]
    j := S[j + a]
    
    output S[S[S[j] + 1]]
    Swap S[i] and S[j]          (b := S[j]; S[i] := b; S[j] := a))
    
    i := i + 1
endwhile

This was attacked in the same papers as RC4A, and can be distinguished within 2³⁸ output bytes.^[61]^[59]

RC4⁺

RC4⁺ is a modified version of RC4 with a more complex three-phase key schedule (taking about three times as long as RC4, or the same as RC4-drop512), and a more complex output function which performs four additional lookups in the S array for each byte output, taking approximately 1.7 times as long as basic RC4.^[62]

All arithmetic modulo 256.  << and >> are left and right shift, ⊕ is exclusive OR
while GeneratingOutput:
    i := i + 1
    a := S[i]
    j := j + a
    
    Swap S[i] and S[j]               (b := S[j]; S[j] := S[i]; S[i] := b;)
    
    c := S[i<<5 ⊕ j>>3] + S[j<<5 ⊕ i>>3]
    output (S[a+b] + S[c⊕0xAA]) ⊕ S[j+b]
endwhile

This algorithm has not been analyzed significantly.

Spritz

In 2014, Ronald Rivest gave a talk and co-wrote a paper^[14] on an updated redesign called Spritz. A hardware accelerator of Spritz was published in Secrypt, 2016^[63] and shows that due to multiple nested calls required to produce output bytes, Spritz performs rather slowly compared to other hash functions such as SHA-3 and the best known hardware implementation of RC4.

Like other sponge functions, Spritz can be used to build a cryptographic hash function, a deterministic random bit generator (DRBG), an encryption algorithm that supports authenticated encryption with associated data (AEAD), etc.^[14]

In 2016, Banik and Isobe proposed an attack that can distinguish Spritz from random noise.^[64] In 2017, Banik, Isobe, and Morii proprosed a simple fix that removes the distinguisher in the first two keystream bytes, requiring only one additional memory access without diminishing software performance substantially.^[65]

RC4-based protocols

WEP
TKIP (default algorithm for WPA, but can be configured to use AES-CCMP instead of RC4)
BitTorrent protocol encryption
Microsoft Office XP (insecure implementation since nonce remains unchanged when documents get modified^[66])
Microsoft Point-to-Point Encryption
Transport Layer Security / Secure Sockets Layer (was optional and then the use of RC4 was prohibited in RFC 7465)
Secure Shell (optionally)
Remote Desktop Protocol (optionally)
Kerberos (optionally)
SASL Mechanism Digest-MD5 (optionally, historic, obsoleted in RFC 6331)
Gpcode.AK, an early June 2008 computer virus for Microsoft Windows, which takes documents hostage for ransom by obscuring them with RC4 and RSA-1024 encryption
PDF
Skype (in modified form)^[67]

Where a protocol is marked with "(optionally)", RC4 is one of multiple ciphers the system can be configured to use.

References

External links

Stream ciphers

Widely used ciphers

eSTREAM Portfolio

Software	HC-128 Rabbit Salsa20 SOSEMANUK
Hardware	Grain MICKEY Trivium

Other ciphers

Generators

Theory

Attacks

v t e Cryptography
General	History of cryptography Outline of cryptography Cryptographic protocol Authentication protocol Cryptographic primitive Cryptanalysis Cryptocurrency Cryptosystem Cryptographic nonce Cryptovirology Hash function Cryptographic hash function Key derivation function Digital signature Kleptography Key (cryptography) Key exchange Key generator Key schedule Key stretching Keygen Cryptojacking malware Ransomware Random number generation Cryptographically secure pseudorandom number generator (CSPRNG) Pseudorandom noise (PRN) Secure channel Insecure channel Subliminal channel Encryption Decryption End-to-end encryption Harvest now, decrypt later Information-theoretic security Plaintext Codetext Ciphertext Shared secret Trapdoor function Trusted timestamping Key-based routing Onion routing Garlic routing Kademlia Mix network
Mathematics	Cryptographic hash function Block cipher Stream cipher Symmetric-key algorithm Authenticated encryption Public-key cryptography Quantum key distribution Quantum cryptography Post-quantum cryptography Message authentication code Random numbers Steganography
Category

History

Description

Key-scheduling algorithm (KSA)

Pseudo-random generation algorithm (PRGA)

RC4-based random number generators

Implementation

Test vectors

Security

Roos' biases and key reconstruction from permutation

Biased outputs of the RC4

Fluhrer, Mantin and Shamir attack

Klein's attack

Combinatorial problem

Royal Holloway attack

Bar mitzvah attack

NOMORE attack

RC4 variants

RC4A

VMPC

RC4⁺

Spritz

RC4-based protocols

See also

References

Further reading

External links

History

Description

Key-scheduling algorithm (KSA)

Pseudo-random generation algorithm (PRGA)

RC4-based random number generators

Implementation

Test vectors

Security

Roos' biases and key reconstruction from permutation

Biased outputs of the RC4

Fluhrer, Mantin and Shamir attack

Klein's attack

Combinatorial problem

Royal Holloway attack

Bar mitzvah attack

NOMORE attack

RC4 variants

RC4A

VMPC

RC4+

Spritz

RC4-based protocols

See also

References

Further reading

External links

RC4⁺