Sample images from the MNIST test dataset

The MNIST database (Modified National Institute of Standards and Technology database^[1]) is a large database of handwritten digits that is commonly used for training various image processing systems.^[2]^[3] The database is also widely used for training and testing in the field of machine learning.^[4]^[5] It was created by "re-mixing" the samples from NIST's original datasets.^[6] The creators felt that NIST's data was not well suited to machine learning experiments, because its training dataset was taken from United States Census Bureau employees while its testing dataset was taken from American high school students.^[7] Furthermore, the black-and-white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.^[7]

The MNIST database contains 60,000 training images and 10,000 testing images.^[8] Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of the training set and the other half of the test set were taken from NIST's testing dataset.^[9] The original creators of the database keep a list of some of the methods tested on it.^[7] In their original paper, they used a support-vector machine to achieve an error rate of 0.8 percent.^[10]
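The image and label counts above can be checked directly against the data files. The following is a minimal sketch, assuming the IDX binary layout in which MNIST has commonly been distributed (a big-endian header followed by unsigned bytes); that layout, the magic numbers, and the function names `parse_idx_images` and `parse_idx_labels` are not described in this article and are illustrative assumptions.

```python
import struct

IMAGE_MAGIC = 2051  # assumed IDX magic number for a 3-D unsigned-byte array
LABEL_MAGIC = 2049  # assumed IDX magic number for a 1-D unsigned-byte array


def parse_idx_images(raw: bytes):
    """Parse IDX image bytes into a list of images (rows of ints in 0..255)."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    pixels = raw[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("payload length does not match the header")
    size = rows * cols
    images = []
    for i in range(count):
        flat = pixels[i * size:(i + 1) * size]
        # Split the flat pixel run into `rows` rows of `cols` values each.
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(raw: bytes):
    """Parse IDX label bytes into a list of integer class labels."""
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    labels = list(raw[8:])
    if len(labels) != count:
        raise ValueError("payload length does not match the header")
    return labels
```

Under these assumptions, decompressing the training image file with `gzip.open(path, "rb").read()` and passing the bytes to `parse_idx_images` should yield 60,000 images of 28 rows by 28 columns, and the test file 10,000.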

Extended MNIST (EMNIST) is a newer dataset developed and released by NIST to be the (final) successor to MNIST.^[11]^[12] MNIST included images only of handwritten digits. EMNIST includes all the images from NIST Special Database 19, which is a large database of handwritten uppercase and lowercase letters as well as digits.^[13]^[14] The images in EMNIST were converted into the same 28x28 pixel format, by the same process, as the MNIST images. Accordingly, tools that work with the older, smaller MNIST dataset will likely work unmodified with EMNIST.

History

The set of images in the MNIST database was created in 1994 as a combination of two of NIST's databases: Special Database 1 and Special Database 3.^[15]

Special Database 1 and Special Database 3 consist of digits written by high school students and employees of the United States Census Bureau, respectively.^[7]

The original dataset was a set of 128x128 binary images, processed into 28x28 grayscale images. The training set and the testing set both originally had 60,000 samples, but 50,000 of the testing set samples were discarded.^[16]
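To illustrate why shrinking a binary image with anti-aliasing introduces grayscale levels, here is a minimal sketch that averages each block of binary pixels into one output pixel. Block averaging with an integer factor is an assumption chosen for simplicity; it does not reproduce the actual MNIST preprocessing (128 is not an integer multiple of 28), and the function name `downsample_binary` is hypothetical.

```python
def downsample_binary(image, factor):
    """Shrink a binary image by averaging each factor x factor block.

    `image` is a list of rows of 0/1 values whose height and width are
    multiples of `factor`. Returns rows of integers in 0..255.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be multiples of factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Count the "ink" pixels inside this block.
            ink = sum(image[y][x]
                      for y in range(top, top + factor)
                      for x in range(left, left + factor))
            # The fraction of ink in the block becomes a gray level.
            row.append(round(255 * ink / (factor * factor)))
        out.append(row)
    return out


# A 4x4 binary patch containing a diagonal edge, reduced to 2x2.
patch = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
small = downsample_binary(patch, 2)  # [[255, 64], [64, 0]]
```

Blocks that straddle the edge of a stroke are only partly covered by ink, so they come out as intermediate values (64 here) rather than pure black or white.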

Performance

Some researchers have achieved "near-human performance" on the MNIST database, using a committee of neural networks; in the same paper, the authors achieve performance double that of humans on other recognition tasks.^[17] The highest error rate listed^[7] on the original website of the database is 12 percent, which is achieved using a simple linear classifier with no preprocessing.^[10]
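The committee idea mentioned above can be sketched as follows. This is a minimal illustration that assumes each member network outputs one score per digit class and that the committee simply averages those scores; the cited paper may combine its members differently, and the function name `committee_predict` is hypothetical.

```python
def committee_predict(member_scores):
    """Average per-class scores from several classifiers and pick the best class.

    `member_scores` holds one entry per committee member; each entry is a
    list of scores (one per digit class) for the same input image.
    """
    n_members = len(member_scores)
    n_classes = len(member_scores[0])
    averaged = [sum(scores[c] for scores in member_scores) / n_members
                for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: averaged[c])
    return best, averaged


# Three hypothetical members that disagree about whether an image is a 4 or a 9.
m1 = [0.0] * 10
m1[4], m1[9] = 0.7, 0.3
m2 = [0.0] * 10
m2[4], m2[9] = 0.4, 0.6
m3 = [0.0] * 10
m3[4], m3[9] = 0.8, 0.2

digit, averaged = committee_predict([m1, m2, m3])  # digit is 4
```

Averaging lets confident members outweigh an uncertain one, which is one reason a committee can make fewer errors than any single member.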

In 2004, a best-case error rate of 0.42 percent was achieved on the database by researchers using a new classifier called LIRA, a neural classifier with three neuron layers based on Rosenblatt's perceptron principles.^[18]

Some researchers have tested artificial intelligence systems on versions of the database subjected to random distortions. The systems in these cases are usually neural networks, and the distortions used tend to be either affine distortions or elastic distortions.^[7] Sometimes these systems can be very successful; one such system achieved an error rate on the database of 0.39 percent.^[19]
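As an illustration of an affine distortion, the sketch below rotates and scales an image about its centre and then translates it, using inverse mapping with nearest-neighbour sampling. The function name `affine_distort`, its parameters, and the random ranges in the usage lines are assumptions for illustration, not the settings of any cited system. Elastic distortions are not covered here.

```python
import math
import random


def affine_distort(image, angle_deg=0.0, shift_x=0.0, shift_y=0.0, scale=1.0):
    """Rotate and scale about the image centre, then translate.

    Each output pixel is traced back to its source location (inverse
    mapping) and takes the value of the nearest source pixel. Pixels that
    trace back outside the image are set to 0 (background).
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Undo the translation, then the rotation and the scaling.
            dx, dy = x - cx - shift_x, y - cy - shift_y
            src_x = (cos_t * dx + sin_t * dy) / scale + cx
            src_y = (-sin_t * dx + cos_t * dy) / scale + cy
            sx, sy = round(src_x), round(src_y)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out


# Usage: one random distortion of a blank 28x28 image with a single lit pixel.
rng = random.Random(0)  # fixed seed so the example is repeatable
digit = [[0] * 28 for _ in range(28)]
digit[14][14] = 255
distorted = affine_distort(
    digit,
    angle_deg=rng.uniform(-10, 10),  # assumed range
    shift_x=rng.uniform(-2, 2),      # assumed range
    shift_y=rng.uniform(-2, 2),      # assumed range
)
```

Applying a fresh random distortion each time an image is presented gives a classifier many slightly different versions of every training digit.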

In 2011, an error rate of 0.27 percent, improving on the previous best result, was reported by researchers using a similar system of neural networks.^[20] In 2013, an approach based on regularization of neural networks using DropConnect was claimed to achieve a 0.21 percent error rate.^[21] In 2016, the best performance of a single convolutional neural network was a 0.25 percent error rate.^[22] As of August 2018, that figure was still the best reported for a single convolutional neural network trained on the MNIST training data without data augmentation.^[22]^[23] In addition, the Parallel Computing Center (Khmelnytskyi, Ukraine) obtained an ensemble of only 5 convolutional neural networks that performs on MNIST at a 0.21 percent error rate.^[24]^[25]
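Because the test set contains 10,000 images, each error rate quoted in this section corresponds to a whole number of misclassified images. The sketch below shows the arithmetic; the helper names `error_rate_percent` and `misclassified_count` are hypothetical.

```python
TEST_SET_SIZE = 10_000  # number of MNIST test images


def error_rate_percent(predictions, labels):
    """Percentage of predictions that differ from the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    wrong = sum(p != t for p, t in zip(predictions, labels))
    return 100.0 * wrong / len(labels)


def misclassified_count(error_percent, n_images=TEST_SET_SIZE):
    """Number of wrongly classified images implied by an error rate."""
    return round(error_percent / 100.0 * n_images)


# Error rates quoted in this section, converted to image counts.
for rate in (12.0, 0.42, 0.27, 0.25, 0.21):
    print(f"{rate}% error: {misclassified_count(rate)} of {TEST_SET_SIZE} images")
```

A 0.21 percent error rate, for example, means 21 of the 10,000 test images were misclassified, so the difference between 0.25 and 0.21 percent is only four images.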

Classifiers

This is a table of some of the machine learning methods used on the dataset and their error rates, by type of classifier:

Type	Classifier	Distortion	Preprocessing	Error rate (%)
Neural network	Gradient Descent Tunneling	None	None	0^[26]
Linear classifier	Pairwise linear classifier	None	Deskewing	7.6^[10]
K-Nearest Neighbors	K-NN with rigid transformations	None	None	0.96^[27]
K-Nearest Neighbors	K-NN with non-linear deformation (P2DHMDM)	None	Shiftable edges	0.52^[28]
Boosted Stumps	Product of stumps on Haar features	None	Haar features	0.87^[29]
Non-linear classifier	40 PCA + quadratic classifier	None	None	3.3^[10]
Random Forest	Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)^[30]	None	Simple statistical pixel importance	2.8^[31]
Support-vector machine (SVM)	Virtual SVM, deg-9 poly, 2-pixel jittered	None	Deskewing	0.56^[32]
Neural network	2-layer 784-800-10	None	None	1.6^[33]
Neural network	2-layer 784-800-10	Elastic distortions	None	0.7^[33]
Deep neural network (DNN)	6-layer 784-2500-2000-1500-1000-500-10	Elastic distortions	None	0.35^[34]
Convolutional neural network (CNN)	6-layer 784-40-80-500-1000-2000-10	None	Expansion of the training data	0.31^[35]
Convolutional neural network	6-layer 784-50-100-500-1000-10-10	None	Expansion of the training data	0.27^[36]
Convolutional neural network (CNN)	13-layer 64-128(5x)-256(3x)-512-2048-256-256-10	None	None	0.25^[22]
Convolutional neural network	Committee of 35 CNNs, 1-20-P-40-P-150-10	Elastic distortions	Width normalizations	0.23^[17]
Convolutional neural network	Committee of 5 CNNs, 6-layer 784-50-100-500-1000-10-10	None	Expansion of the training data	0.21^[24]^[25]
Convolutional neural network	Committee of 20 CNNs with Squeeze-and-Excitation Networks^[37]	None	Data augmentation	0.17^[38]
Convolutional neural network	Ensemble of 3 CNNs with varying kernel sizes	None	Data augmentation consisting of rotation and translation	0.09^[39]
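The layer notation in the table, such as 784-800-10, lists the width of each layer from input to output: 784 is the number of pixels in a 28x28 image and 10 is the number of digit classes. As a rough sketch of what the notation implies for model size, the code below counts parameters under the assumption that every layer is fully connected with one bias per unit. That assumption is reasonable for the plain neural network rows but does not apply to the convolutional rows; the helper names are hypothetical.

```python
def parse_layer_spec(spec):
    """Turn a string such as '784-800-10' into [784, 800, 10]."""
    return [int(part) for part in spec.split("-")]


def dense_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with these widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


shallow = dense_parameter_count(parse_layer_spec("784-800-10"))
deep = dense_parameter_count(parse_layer_spec("784-2500-2000-1500-1000-500-10"))
# shallow is 636,010 and deep is 11,972,510 under the stated assumptions.
```

Under these assumptions the 6-layer network has roughly 19 times as many parameters as the 2-layer one.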
