Dataset

The train set and the test set do not overlap.

Train set

We use the CASIA WebFace Database (CASIA-WebFace) as our train set. It is one of the most popular datasets for face recognition. It was collected from the Internet and contains 10,575 subjects and 494,414 images.

Test set

Labeled Faces in the Wild (LFW), a database of face photographs designed for studying the problem of unconstrained face recognition, is used as our test set. The data set contains more than 13,000 images of faces collected from the web. Each face has been labeled with the name of the person pictured. 1,680 of the people pictured have two or more distinct photos in the data set.

Preprocessing

I provide the preprocessed datasets on Baidu Pan: CASIA-WebFace-112X96 and lfw-112X96. Download both archives and unzip them into the dataset directory.
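
The unzip step can be scripted. The following is a minimal sketch, assuming the two downloads are ordinary zip files saved in the current directory as CASIA-WebFace-112X96.zip and lfw-112X96.zip. Those file names and the helper `extract_archives` are hypothetical; adjust them to match what you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_archives(archive_paths, target_dir="dataset"):
    """Extract each zip archive into target_dir.

    Returns the file names of the archives that were extracted.
    Archives that do not exist are skipped with a message.
    """
    target = Path(target_dir)
    # Create the target directory if it is not there yet.
    target.mkdir(parents=True, exist_ok=True)

    extracted = []
    for archive in archive_paths:
        archive = Path(archive)
        if not archive.is_file():
            print(f"skipping missing archive: {archive}")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # Assumed archive names; rename to match your downloads.
    extract_archives(["CASIA-WebFace-112X96.zip", "lfw-112X96.zip"])
```

If the archives turn out to use a different format, unpack them with the matching tool instead; the only requirement stated here is that the extracted folders end up under the dataset directory.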

If you want to preprocess the datasets yourself, you can refer to sphereface.