1.3. Data


1.3.1. Introduction

This document describes how to prepare the ImageNet1k and flowers102 datasets.

1.3.2. Dataset

Dataset      Train dataset size   Valid dataset size   Categories
flowers102   1k                   6k                   102
ImageNet1k   1.2M                 50k                  1000
  • Data format

Organize the data as described below. Both train_list.txt and val_list.txt are required. Each line of a list file holds an image path and its integer class label, separated by a space:

# delimiter: "space"

ILSVRC2012_val_00000001.JPEG 65
...
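
A list file in this format can be read with a few lines of Python. The following is a minimal sketch for illustration only: the function name read_list_file is hypothetical and not part of PaddleClas. It splits each line on the last space, so the label is always taken from the final field.

```python
from pathlib import Path


def read_list_file(list_path):
    """Return (image_path, label) pairs from a space-delimited list file."""
    samples = []
    for line in Path(list_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space: the label is the final field.
        image_path, label = line.rsplit(" ", 1)
        samples.append((image_path, int(label)))
    return samples
```

Calling read_list_file("val_list.txt") on a file containing the sample line above would yield the pair ("ILSVRC2012_val_00000001.JPEG", 65).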

1.3.2.1. ImageNet1k

After downloading the data, organize the data directory as follows:

PaddleClas/dataset/imagenet/
|_ train/
|  |_ n01440764
|  |  |_ n01440764_10026.JPEG
|  |  |_ ...
|  |_ ...
|  |
|  |_ n15075141
|     |_ ...
|     |_ n15075141_9993.JPEG
|_ val/
|  |_ ILSVRC2012_val_00000001.JPEG
|  |_ ...
|  |_ ILSVRC2012_val_00050000.JPEG
|_ train_list.txt
|_ val_list.txt
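
If train_list.txt is not already available, its entries could be derived from the class-per-folder layout of train/. The sketch below is one possible approach, not the project's own tooling. It rests on two assumptions that should be checked against the labels your model expects: that the label is the position of the class folder in sorted order, and that paths are written relative to the dataset root. The function name build_train_list is hypothetical. The val/ directory is flat, so its labels cannot be derived this way and must come from the ground-truth file.

```python
from pathlib import Path


def build_train_list(dataset_root):
    """Return '<relative path> <label>' lines for a class-per-folder train/ dir."""
    root = Path(dataset_root)
    # Assumption: label index = position of the class folder in sorted order.
    class_dirs = sorted(p for p in (root / "train").iterdir() if p.is_dir())
    lines = []
    for label, class_dir in enumerate(class_dirs):
        for image in sorted(class_dir.glob("*.JPEG")):
            # Assumption: paths are relative to the dataset root.
            rel = image.relative_to(root).as_posix()
            lines.append(f"{rel} {label}")
    return lines
```

The returned lines can be written to train_list.txt with "\n".join(lines).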

1.3.2.2. Flowers102 Dataset

Download the data and decompress it to obtain the following:

jpg/
setid.mat
imagelabels.mat

Place all of these files under PaddleClas/dataset/flowers102.

Use generate_flowers102_list.py to generate train_list.txt and val_list.txt:

python generate_flowers102_list.py jpg train > train_list.txt
python generate_flowers102_list.py jpg valid > val_list.txt

Organize the data directory as follows:

PaddleClas/dataset/flowers102/
|_ jpg/
|  |_ image_03601.jpg
|  |_ ...
|  |_ image_02355.jpg
|_ train_list.txt
|_ val_list.txt
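
Before training, it can be useful to confirm that every image named in a list file actually exists on disk. The following is a minimal sketch of such a check, assuming that paths in the list files are relative to the dataset directory (for example jpg/image_03601.jpg). The function name find_missing_images is hypothetical and not part of PaddleClas.

```python
from pathlib import Path


def find_missing_images(dataset_root, list_name):
    """Return paths named in a list file that do not exist under dataset_root."""
    root = Path(dataset_root)
    missing = []
    for line in (root / list_name).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The label is the final space-separated field; the rest is the path.
        image_path, _label = line.rsplit(" ", 1)
        # Assumption: image paths are relative to the dataset root.
        if not (root / image_path).is_file():
            missing.append(image_path)
    return missing
```

For a complete layout, find_missing_images("PaddleClas/dataset/flowers102", "train_list.txt") should return an empty list.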