CIFAR10 and CIFAR100
CIFAR10 and CIFAR100 are datasets of tiny color pictures of size 32×32 [50]. Because each pixel is encoded with three color channels, a picture in one of these datasets can be represented as a three-dimensional tensor with a total of 32 × 32 × 3 = 3072 entries. CIFAR10 contains 10 classes, each one made of 5000 images for training and 1000 images for testing. CIFAR100 contains 100 classes, each one made of 500 images for training and 100 images for testing. These datasets are widely accepted as an interesting compromise between a toy dataset, since the images are small and architectures can therefore be trained quickly, and a competitive one, since the best performance reported in the state of the art is 97.6% accuracy for CIFAR10 [122] and only 85.42% for CIFAR100 [63].
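The figures above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the names `CifarSplit`, `flat_dimension`, and `IMAGE_SHAPE` are hypothetical helpers introduced here, not part of any CIFAR loading library, and the per-class counts are simply those quoted in the text.

```python
from dataclasses import dataclass

# Height, width, color channels, as described in the text.
IMAGE_SHAPE = (32, 32, 3)


@dataclass(frozen=True)
class CifarSplit:
    """Per-class image counts as quoted in the text (hypothetical helper)."""
    num_classes: int
    train_per_class: int
    test_per_class: int

    @property
    def train_size(self) -> int:
        return self.num_classes * self.train_per_class

    @property
    def test_size(self) -> int:
        return self.num_classes * self.test_per_class


def flat_dimension(shape) -> int:
    """Number of scalar entries in a tensor of the given shape."""
    total = 1
    for axis in shape:
        total *= axis
    return total


CIFAR10 = CifarSplit(num_classes=10, train_per_class=5000, test_per_class=1000)
CIFAR100 = CifarSplit(num_classes=100, train_per_class=500, test_per_class=100)

if __name__ == "__main__":
    print(flat_dimension(IMAGE_SHAPE))
    print(CIFAR10.train_size, CIFAR10.test_size)
    print(CIFAR100.train_size, CIFAR100.test_size)
```

Under these counts both datasets have the same overall size; only the number of examples available per class differs.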
ImageNet (ILSVRC 2012)
ImageNet is a large visual dataset used in visual object recognition research. It is made of more than 14 million images and 20,000 classes. ILSVRC2012 [85] is a subset of ImageNet that contains 1,000 classes, more than 1,200,000 images for training and 50,000 images for testing. Contrary to CIFAR10 and CIFAR100, the images have various sizes, typically of the order of 1,000 pixels in both width and height. It is common to resize the input images to squares of 200 to 300 pixels per side before they are processed by the classifier. Despite being a few years old, ILSVRC remains a highly competitive benchmark on which training takes of the order of days to weeks. As such, it is considered by most as a reference among vision benchmarks.
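The text does not say how the resizing is done. One common recipe, assumed here purely for illustration, is to scale the shorter side to the target size while keeping the aspect ratio, then crop a centered square. The sketch below only computes the geometry of that recipe; the function name `resize_and_center_crop_box` and the default target of 224 pixels are assumptions, not values taken from the source.

```python
def resize_and_center_crop_box(width: int, height: int, target: int = 224):
    """Return ((new_w, new_h), (left, top, right, bottom)).

    Assumed recipe: scale the shorter side to `target` pixels while keeping
    the aspect ratio, then crop a centered target x target square.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("all sizes must be positive")
    scale = target / min(width, height)
    # max() guards against rounding leaving a side one pixel short of target.
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)


if __name__ == "__main__":
    # A landscape image of the order of 1,000 pixels wide.
    size, box = resize_and_center_crop_box(1000, 750)
    print(size, box)
```

The returned box uses the (left, top, right, bottom) convention, so it could be handed to an image library's crop routine after resizing to the returned size.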
ImageNet1, ImageNet2 and ImageNet50
In this document we introduce two other datasets extracted from ImageNet. We call them ImageNet1 and ImageNet2. Both contain 10 classes, distinct from each other and from those in the ILSVRC dataset. Each class contains about 900 images for training and 100 for testing. In some cases, we also make use of ImageNet50, built using the same idea, but containing a total of 50 classes.
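The text does not describe how the classes were selected. As a minimal sketch of one way to obtain class sets that are disjoint from each other and from ILSVRC, one could sample from the remaining ImageNet classes. Everything below is hypothetical: the function `draw_disjoint_class_sets`, the placeholder class identifiers, and the random selection itself are assumptions made for illustration.

```python
import random


def draw_disjoint_class_sets(all_classes, excluded, sizes, seed=0):
    """Draw class subsets sharing no class with each other or with `excluded`."""
    pool = sorted(set(all_classes) - set(excluded))
    if sum(sizes) > len(pool):
        raise ValueError("not enough candidate classes for the requested subsets")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    rng.shuffle(pool)
    subsets, start = [], 0
    for size in sizes:
        # Consecutive slices of the shuffled pool cannot overlap.
        subsets.append(pool[start:start + size])
        start += size
    return subsets


# Placeholder identifiers standing in for real ImageNet class names.
all_classes = [f"class_{i:05d}" for i in range(20000)]
ilsvrc_classes = all_classes[:1000]

imagenet1, imagenet2 = draw_disjoint_class_sets(
    all_classes, ilsvrc_classes, sizes=[10, 10]
)

if __name__ == "__main__":
    print(imagenet1)
    print(imagenet2)
```

Slicing a single shuffled pool guarantees disjointness by construction, which is simpler than drawing each subset independently and rejecting overlaps.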
AudioSet
AudioSet is a large dataset made of 10-second sound clips extracted from YouTube videos [18]. It contains more than 2 million samples, which correspond to 5.8 thousand hours of audio split into 527 classes. AudioSet is sometimes presented as the equivalent of ImageNet for sound recognition.
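The clip count and total duration quoted above can be cross-checked with simple arithmetic. The sketch below assumes, for simplicity, that every clip lasts exactly 10 seconds; the helper names are hypothetical.

```python
CLIP_SECONDS = 10
SECONDS_PER_HOUR = 3600


def total_hours(num_clips, clip_seconds=CLIP_SECONDS):
    """Total audio duration in hours for a given number of clips."""
    return num_clips * clip_seconds / SECONDS_PER_HOUR


def clips_for_hours(hours, clip_seconds=CLIP_SECONDS):
    """Number of clips needed to reach a given duration in hours."""
    return hours * SECONDS_PER_HOUR / clip_seconds


if __name__ == "__main__":
    print(total_hours(2_000_000))   # hours covered by exactly 2 million clips
    print(clips_for_hours(5800))    # clips implied by 5.8 thousand hours
```

Under this assumption, 5.8 thousand hours corresponds to slightly more than 2 million clips, which is consistent with the figures in the text.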
Let us point out that these datasets are but a small fraction of the plethora that can be found freely online. For comparisons to be fair, it is crucial that different methodologies are evaluated on the same benchmarks. This is why all the results presented in this manuscript use these few selected datasets.
Table of contents:
Résumé
1 Introduction
2 Basics in Deep Learning
2.1 Datasets
2.1.1 Training, Validation and Test Sets
2.1.2 CIFAR10 and CIFAR100
2.1.3 ImageNet (ILSVRC 2012)
2.1.4 ImageNet1, ImageNet2 and ImageNet50
2.1.5 AudioSet
2.2 Main Elements
2.2.1 Activation Functions
2.2.2 Loss Functions
2.2.3 Layers
2.3 Deep learning
2.3.1 Deep Neural Networks
2.3.2 Learning Process
2.3.3 Classification Inherent Difficulties
3 Neural Networks and Low Resources Systems
3.1 Context
3.2 Quantization
3.3 Pruning
3.4 Light Architectures
3.5 Convolution Alternatives
3.6 Other Methods
3.7 Comparison and Combination of Different Compression Methods
3.8 Hardware Implementation
3.8.1 Hardware Architecture
3.8.2 Hardware Results
3.9 Energy Gains with Faulty Memories
3.10 Summary of the Chapter
4 Incremental Learning on Chip
4.1 Context
4.2 Main Methods in the Literature
4.3 Transfer Learning
4.4 Segmentation
4.5 Budget Restricted Incremental Learning
4.6 Transfer Incremental Learning using Data Augmentation
4.6.1 Feature Vector Extraction
4.6.2 Vector Segmentation
4.6.3 Aggregation of Subspaces Weak Classifiers
4.6.4 Data Augmentation
4.7 Experimental Results
4.7.1 Benchmark Protocol
4.7.2 Results
4.8 Hardware Implementation
4.8.1 Data Quantization
4.8.2 Hardware Architecture
4.8.3 Results
4.9 Summary of the Chapter
5 Conclusion
5.1 Conclusion and Perspectives
5.1.1 Summary of the Thesis
5.1.2 Summary of Contributions
5.1.3 Perspectives