The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image of 28x28 pixels.

It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

The data set consists of 4 files contained within MNIST.zip:

  • train-images-idx3-ubyte.gz:  training set images (9912422 bytes)
  • train-labels-idx1-ubyte.gz:  training set labels (28881 bytes)
  • t10k-images-idx3-ubyte.gz:   test set images (1648877 bytes)
  • t10k-labels-idx1-ubyte.gz:   test set labels (4542 bytes)

For more details on each file, see: http://yann.lecun.com/exdb/mnist/
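
The four files use the IDX binary format described at the link above: a 4-byte header (two zero bytes, a type code, and the number of dimensions), followed by one 4-byte big-endian size per dimension, followed by the raw unsigned bytes. The following is a minimal sketch of a reader using only the Python standard library. The function names parse_idx and load_idx are hypothetical, and the sketch assumes the .gz files have already been extracted from MNIST.zip.

```python
import gzip
import struct


def parse_idx(raw: bytes):
    """Parse an uncompressed IDX buffer into (dims, data).

    Hypothetical helper, not part of any MNIST distribution.
    Only the unsigned-byte type (code 0x08) is handled.
    """
    # Header: two zero bytes, a type code, and the number of dimensions.
    zero, dtype, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")

    # Each dimension size is a 4-byte big-endian unsigned integer.
    header_end = 4 + 4 * ndim
    dims = struct.unpack(">" + "I" * ndim, raw[4:header_end])

    # The payload is one byte per element, stored in row-major order.
    data = raw[header_end:]
    expected = 1
    for d in dims:
        expected *= d
    if len(data) != expected:
        raise ValueError("payload size does not match header")
    return dims, data


def load_idx(path: str):
    """Read a gzip-compressed IDX file and return (dims, data)."""
    with gzip.open(path, "rb") as f:
        return parse_idx(f.read())
```

Under these assumptions, load_idx("train-images-idx3-ubyte.gz") should return dimensions for 60,000 images of 28 rows by 28 columns, and load_idx("train-labels-idx1-ubyte.gz") a single dimension of 60,000 with one label byte per image. Image i then occupies bytes i*784 through (i+1)*784 - 1 of the returned data.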