scrypt
General
Designers	Colin Percival
First published	2009
Cipher detail
Digest sizes	variable
Block sizes	variable
Rounds	variable

In cryptography, scrypt (pronounced "ess crypt"^[1]) is a password-based key derivation function created by Colin Percival in March 2009, originally for the Tarsnap online backup service.^[2]^[3] The algorithm was specifically designed to make large-scale custom hardware attacks costly by requiring large amounts of memory. In 2016, the scrypt algorithm was published by the IETF as RFC 7914.^[4] A simplified version of scrypt is used as a proof-of-work scheme by a number of cryptocurrencies, first implemented by an anonymous programmer called ArtForz in Tenebrix and followed by Fairbrix and Litecoin soon after.^[5]

Introduction

A password-based key derivation function (password-based KDF) is generally designed to be computationally intensive, so that it takes a relatively long time to compute (say on the order of several hundred milliseconds). Legitimate users only need to perform the function once per operation (e.g., authentication), and so the time required is negligible. However, a brute-force attack would likely need to perform the operation billions of times, at which point the time requirements become significant and, ideally, prohibitive.

Previous password-based KDFs (such as the popular PBKDF2 from RSA Laboratories) have relatively low resource demands, meaning they do not require elaborate hardware or very much memory to perform. They are therefore easily and cheaply implemented in hardware (for instance on an ASIC or even an FPGA). This allows an attacker with sufficient resources to launch a large-scale parallel attack by building hundreds or even thousands of implementations of the algorithm in hardware and having each search a different subset of the key space. This divides the amount of time needed to complete a brute-force attack by the number of implementations available, very possibly bringing it down to a reasonable time frame.
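The effect of parallel hardware on attack time can be illustrated with simple arithmetic. The sketch below is only an illustration: the candidate count, the per-guess time and the number of hardware units are made-up figures, and `brute_force_seconds` is a hypothetical helper name.

```python
def brute_force_seconds(candidates: int, seconds_per_guess: float, units: int = 1) -> float:
    """Worst-case time to try every candidate when the key space is split across `units` devices."""
    return candidates * seconds_per_guess / units

SECONDS_PER_YEAR = 365 * 24 * 3600

# Assumed figures: 10^9 candidate passwords, 100 ms per KDF evaluation.
single = brute_force_seconds(10**9, 0.1)
farm = brute_force_seconds(10**9, 0.1, units=1000)  # 1000 hardware implementations

print(f"one device:   {single / SECONDS_PER_YEAR:.2f} years")
print(f"1000 devices: {farm / 86400:.2f} days")
```

Under these assumptions the attack time shrinks linearly with the number of units, which is why the cost of each unit matters.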

The scrypt function is designed to hinder such attempts by raising the resource demands of the algorithm. Specifically, the algorithm is designed to use a large amount of memory compared to other password-based KDFs,^[6] making a hardware implementation much larger and more expensive, and therefore limiting the amount of parallelism an attacker can use for a given amount of financial resources.

Overview

The large memory requirements of scrypt come from a large vector of pseudorandom bit strings that are generated as part of the algorithm. Once the vector is generated, the elements of it are accessed in a pseudo-random order and combined to produce the derived key. A straightforward implementation would need to keep the entire vector in RAM so that it can be accessed as needed.
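As a rough sizing sketch, the vector holds CostFactor (N) elements of 128 × BlockSizeFactor (r) bytes each, using the parameter names from the Algorithm section. The helper name `scrypt_vector_bytes` is hypothetical, and the figure covers only the vector for a single mixing pass, not other working buffers.

```python
def scrypt_vector_bytes(n: int, r: int) -> int:
    """Bytes needed to hold the whole vector for one mixing pass: n elements of 128*r bytes."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of 2 greater than 1")
    if r < 1:
        raise ValueError("r must be a positive integer")
    return n * 128 * r

print(scrypt_vector_bytes(1024, 8))     # 1048576 bytes = 1 MiB
print(scrypt_vector_bytes(2**14, 8))    # 16777216 bytes = 16 MiB
```

If the p blocks are mixed in parallel rather than one after another, each concurrent pass needs its own vector, so the total grows by that factor.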

Because the elements of the vector are generated algorithmically, each element could be generated on the fly as needed, only storing one element in memory at a time and therefore cutting the memory requirements significantly. However, the generation of each element is intended to be computationally expensive, and the elements are expected to be accessed many times throughout the execution of the function. Thus there is a significant trade-off in speed to get rid of the large memory requirements.

This sort of time–memory trade-off often exists in computer algorithms: speed can be increased at the cost of using more memory, or memory requirements decreased at the cost of performing more operations and taking longer. The idea behind scrypt is to deliberately make this trade-off costly in either direction. Thus an attacker could use an implementation that doesn't require many resources (and can therefore be massively parallelized with limited expense) but runs very slowly, or use an implementation that runs more quickly but has very large memory requirements and is therefore more expensive to parallelize.
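The trade-off can be made concrete with a toy model. The sketch below is not scrypt itself: it assumes SHA-256 as a stand-in for the real mixing function and uses hypothetical function names. It compares a version that stores the whole vector with one that recomputes each element on demand, counting calls to the mixing function in each case.

```python
import hashlib

def mix(x: bytes) -> bytes:
    # Stand-in for the real mixing function (an assumption for this sketch).
    return hashlib.sha256(x).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(i ^ j for i, j in zip(a, b))

def romix_stored(block: bytes, n: int):
    """Keep all n vector elements in memory; returns (result, mix calls)."""
    calls, v, x = 0, [], block
    for _ in range(n):
        v.append(x)
        x = mix(x); calls += 1
    for _ in range(n):
        j = int.from_bytes(x, "little") % n
        x = mix(xor(x, v[j])); calls += 1
    return x, calls

def romix_recompute(block: bytes, n: int):
    """Store nothing; rebuild element j from the start each time it is needed."""
    calls, x = 0, block
    for _ in range(n):
        x = mix(x); calls += 1
    for _ in range(n):
        j = int.from_bytes(x, "little") % n
        vj = block
        for _ in range(j):          # j extra mix calls to regenerate element j
            vj = mix(vj); calls += 1
        x = mix(xor(x, vj)); calls += 1
    return x, calls

seed = hashlib.sha256(b"toy input").digest()
stored_out, stored_calls = romix_stored(seed, 64)
lowmem_out, lowmem_calls = romix_recompute(seed, 64)
print(stored_out == lowmem_out, stored_calls, lowmem_calls)
```

Both versions produce the same output, but the low-memory one performs on the order of n²/2 additional mixing calls, which is the cost the design relies on.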

Algorithm

Function scrypt
   Inputs: This algorithm includes the following parameters:
      Passphrase:                Bytes    string of characters to be hashed
      Salt:                      Bytes    string of random characters that modifies the hash to protect against Rainbow table attacks
      CostFactor (N):            Integer  CPU/memory cost parameter – Must be a power of 2 (e.g. 1024)
      BlockSizeFactor (r):       Integer  blocksize parameter, which fine-tunes sequential memory read size and performance. (8 is commonly used)
      ParallelizationFactor (p): Integer  Parallelization parameter. (1 .. (2³²−1) * hLen / MFlen)
      DesiredKeyLen (dkLen):     Integer  Desired key length in bytes (Intended output length in octets of the derived key; a positive integer satisfying dkLen ≤ (2³²− 1) * hLen.)
      hLen:                      Integer  The length in octets of the hash function (32 for SHA256).
      MFlen:                     Integer  The length in octets of the output of the mixing function (SMix below). Defined as r * 128 in RFC7914.
   Output:
      DerivedKey:                Bytes    array of bytes, DesiredKeyLen long

   Step 1. Generate expensive salt
   blockSize ← 128*BlockSizeFactor  // Length (in bytes) of the SMix mixing function output (e.g. 128*8 = 1024 bytes)

   Use PBKDF2 to generate initial 128*BlockSizeFactor*p bytes of data (e.g. 128*8*3 = 3072 bytes)
   Treat the result as an array of p elements, each entry being blockSize bytes (e.g. 3 elements, each 1024 bytes)
   [B₀...B_p−1] ← PBKDF2_HMAC-SHA256(Passphrase, Salt, 1, blockSize*ParallelizationFactor)

   Mix each block in B CostFactor times using the ROMix function (each block can be mixed in parallel)
   for i ← 0 to p-1 do
      B_i ← ROMix(B_i, CostFactor)

   The concatenated elements of B are our new "expensive" salt
   expensiveSalt ← B₀∥B₁∥B₂∥ ... ∥B_p-1  // where ∥ is concatenation
 
   Step 2. Use PBKDF2 to generate the desired number of bytes, but using the expensive salt we just generated
   return PBKDF2_HMAC-SHA256(Passphrase, expensiveSalt, 1, DesiredKeyLen);

Here the notation PBKDF2(P, S, c, dkLen) is as defined in RFC 2898, where c is the iteration count.

RFC 7914 uses this notation to specify PBKDF2 with c = 1.
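Step 1's initial PBKDF2 call and the split into p blocks can be sketched with Python's standard `hashlib.pbkdf2_hmac`. The passphrase, salt and parameter values here are illustrative only, and the ROMix step is deliberately left out.

```python
import hashlib

passphrase = b"correct horse battery staple"   # illustrative input
salt = b"example salt"                          # illustrative input
r, p = 8, 3                                     # BlockSizeFactor, ParallelizationFactor

block_size = 128 * r                            # 1024 bytes per block
# A single PBKDF2 iteration (c = 1) stretches the inputs to p * block_size bytes.
initial = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 1, dklen=block_size * p)

# Treat the result as an array of p blocks, each block_size bytes long.
blocks = [initial[i:i + block_size] for i in range(0, len(initial), block_size)]
print(len(blocks), [len(b) for b in blocks])
```

Each entry of `blocks` would then be passed through ROMix before the final PBKDF2 call in Step 2.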

Function ROMix(Block, Iterations)

   Create Iterations copies of X
   X ← Block
   for i ← 0 to Iterations−1 do
      V_i ← X
      X ← BlockMix(X)

   for i ← 0 to Iterations−1 do
      j ← Integerify(X) mod Iterations 
      X ← BlockMix(X xor V_j)

   return X

Where RFC 7914 defines Integerify(X) as the result of interpreting the last 64 bytes of X as a little-endian integer A₁.

Since Iterations is a power of 2, say 2ⁿ, only the first Ceiling(n / 8) bytes among the last 64 bytes of X, interpreted as a little-endian integer A₂, are actually needed to compute Integerify(X) mod Iterations = A₁ mod Iterations = A₂ mod Iterations.
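The shortcut follows from the little-endian layout: the first bytes of the last 64-byte chunk hold the least significant bits, and reducing modulo a power of two only looks at those bits. A minimal sketch, with hypothetical helper names, checks that both routes agree:

```python
def integerify(x: bytes) -> int:
    """Interpret the last 64 bytes of x as a little-endian integer."""
    return int.from_bytes(x[-64:], "little")

def integerify_mod(x: bytes, iterations: int) -> int:
    """Same value modulo iterations, reading only the low-order bytes that matter."""
    if iterations < 1 or iterations & (iterations - 1):
        raise ValueError("iterations must be a power of 2")
    exponent = iterations.bit_length() - 1      # log2(iterations)
    needed = (exponent + 7) // 8                # Ceiling(exponent / 8) bytes
    low = int.from_bytes(x[-64:][:needed], "little")
    return low % iterations

x = bytes(range(128))                           # illustrative 128-byte block
print(integerify(x) % 1024, integerify_mod(x, 1024))
```

For this sample block the last chunk starts with bytes 64 and 65, so both expressions give (64 + 65·256) mod 1024 = 320.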

Function BlockMix(B):

    The block B is r 128-byte chunks (which is equivalent to 2r 64-byte chunks)
    r ← Length(B) / 128;

    Treat B as an array of 2r 64-byte chunks
    [B₀...B_2r-1] ← B

    X ← B_2r−1
    for i ← 0 to 2r−1 do
        X ← Salsa20/8(X xor B_i)  // Salsa20/8 hashes from 64-bytes to 64-bytes
        Y_i ← X

    return Y₀∥Y₂∥...∥Y_2r−2 ∥ Y₁∥Y₃∥...∥Y_2r−1

Where Salsa20/8 is the 8-round version of Salsa20.
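The pseudocode above can be transcribed into a short pure-Python sketch. It is meant for reading rather than production use: it is slow, makes no attempt at constant-time behaviour, and the function names are this sketch's own. The Salsa20/8 core follows the word-level description in RFC 7914.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
# Index pattern of one Salsa20 double round: four column steps, then four row steps.
QUARTER_ROUNDS = (
    (0, 4, 8, 12), (5, 9, 13, 1), (10, 14, 2, 6), (15, 3, 7, 11),
    (0, 1, 2, 3), (5, 6, 7, 4), (10, 11, 8, 9), (15, 12, 13, 14),
)

def rotl(v: int, n: int) -> int:
    return ((v << n) & MASK) | (v >> (32 - n))

def salsa20_8(block: bytes) -> bytes:
    """Salsa20/8 core: maps 64 bytes to 64 bytes."""
    words = struct.unpack("<16I", block)
    x = list(words)
    for _ in range(4):                      # 4 double rounds = 8 rounds
        for a, b, c, d in QUARTER_ROUNDS:
            x[b] ^= rotl((x[a] + x[d]) & MASK, 7)
            x[c] ^= rotl((x[b] + x[a]) & MASK, 9)
            x[d] ^= rotl((x[c] + x[b]) & MASK, 13)
            x[a] ^= rotl((x[d] + x[c]) & MASK, 18)
    return struct.pack("<16I", *((xi + wi) & MASK for xi, wi in zip(x, words)))

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(i ^ j for i, j in zip(a, b))

def block_mix(b: bytes) -> bytes:
    chunks = [b[i:i + 64] for i in range(0, len(b), 64)]   # 2r chunks of 64 bytes
    x, y = chunks[-1], []
    for chunk in chunks:
        x = salsa20_8(xor(x, chunk))
        y.append(x)
    return b"".join(y[0::2] + y[1::2])      # even-indexed outputs, then odd-indexed

def ro_mix(block: bytes, iterations: int) -> bytes:
    v, x = [], block
    for _ in range(iterations):             # fill the vector
        v.append(x)
        x = block_mix(x)
    for _ in range(iterations):             # read it back in data-dependent order
        j = int.from_bytes(x[-64:], "little") % iterations
        x = block_mix(xor(x, v[j]))
    return x

def scrypt(passphrase: bytes, salt: bytes, n: int, r: int, p: int, dk_len: int) -> bytes:
    block_size = 128 * r
    b = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 1, dklen=block_size * p)
    mixed = [ro_mix(b[i:i + block_size], n) for i in range(0, len(b), block_size)]
    return hashlib.pbkdf2_hmac("sha256", passphrase, b"".join(mixed), 1, dklen=dk_len)

key = scrypt(b"password", b"NaCl", n=16, r=2, p=2, dk_len=32)
print(key.hex())
```

On Python builds where `hashlib.scrypt` is available, the result can be cross-checked against `hashlib.scrypt(b"password", salt=b"NaCl", n=16, r=2, p=2, dklen=32)`; real applications would normally call that library function instead of a hand-written version.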

Cryptocurrency uses

Scrypt is used in many cryptocurrencies as a proof-of-work algorithm (more precisely, as the hash function in the Hashcash proof-of-work algorithm). It was first implemented for Tenebrix (released in September 2011) and served as the basis for Litecoin and Dogecoin, which also adopted its scrypt algorithm.^[7]^[8] Mining of cryptocurrencies that use scrypt is often performed on graphics processing units (GPUs) since GPUs tend to have significantly more processing power (for some algorithms) compared to the CPU.^[9] This led to shortages of high end GPUs due to the rising price of these currencies in the months of November and December 2013.^[10]
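A Hashcash-style search with scrypt as the hash function can be sketched as follows. This is a toy: the header bytes, nonce encoding and very easy target are invented for illustration. The parameters N = 1024, r = 1, p = 1 with the header used as both passphrase and salt are those commonly cited for Litecoin-style mining, and are treated here as an assumption rather than taken from the text above. The sketch relies on `hashlib.scrypt`, which requires a Python build linked against a suitable OpenSSL.

```python
import hashlib

def scrypt_pow_hash(header: bytes) -> int:
    """Hash a candidate header with scrypt and read the 32-byte digest as an integer."""
    digest = hashlib.scrypt(header, salt=header, n=1024, r=1, p=1, dklen=32)
    return int.from_bytes(digest, "little")

def mine(prefix: bytes, target: int, max_nonce: int = 100_000):
    """Try successive nonces until the hash falls below the target."""
    for nonce in range(max_nonce):
        header = prefix + nonce.to_bytes(4, "little")
        if scrypt_pow_hash(header) < target:
            return nonce
    return None

# Toy difficulty: roughly one in 64 hashes is expected to succeed.
target = 1 << (256 - 6)
nonce = mine(b"toy block header", target)
print("found nonce:", nonce)
```

Verifying a claimed nonce takes a single scrypt evaluation, while finding one takes many, which is the asymmetry a proof-of-work scheme depends on.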

Utility

scrypt encryption utility
Developer(s)	Colin Percival

Stable release	1.3.2^[11] / 2 October 2023

Repository	github.com/Tarsnap/scrypt
Website	www.tarsnap.com/scrypt.html

The scrypt utility was written in May 2009 by Colin Percival as a demonstration of the scrypt key derivation function.^[2]^[3] It is available in most Linux and BSD distributions.
