Using deep learning to read street signs

This is a convolutional neural network trained with heavy data augmentation. It reaches 95.3% accuracy on a holdout validation set of 4,410 images and 81.2% on the test set of 12,630 images.

Architecture

INPUT => CONV => ELU => POOL => CONV => ELU => CONV => ELU => POOL => FC => ELU => FC

I am using 3 convolution layers with filter sizes 7x7, 3x3, and 4x4. An ELU activation follows every convolution layer, with max pooling applied as shown in the diagram above. After the convolution layers I am using 2 hidden fully connected layers with dropout, followed by a 43-class output layer.
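For reference, the ELU activation mentioned above can be written in a few lines. This is a minimal sketch of the standard definition from the ELU paper, not code taken from this project; the `alpha=1.0` default is an assumption, since the value used here is not stated.

```python
import math

def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise.

    alpha=1.0 is an assumed default; the value used by this network is not stated.
    """
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

# Negative inputs saturate smoothly towards -alpha instead of being clipped to 0.
activations = [elu(v) for v in (-2.0, -0.5, 0.0, 1.5)]
print(activations)
```

Unlike ReLU, the negative branch keeps a non-zero gradient, which is the property the ELU paper argues speeds up learning.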

Layer Type	Channels	Size
Input	1	32x32
Convolution	16	7x7
Max pool	32	2x2
Convolution	32	3x3
Convolution	32	1x1
Convolution	64	4x4
Max pool	128	2x2
Fully connected	1024 to 688	-
Dropout	-	0.4
Fully connected	688 to 172	-
Dropout	-	0.4
Fully connected	172 to 43	-
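
The table does not state padding or strides. As a sanity check, the sketch below traces the spatial size through the convolution and pooling rows, assuming unpadded ("valid") convolutions with stride 1 and 2x2 pooling with stride 2. Those settings are assumptions, chosen because they reproduce the 1024 inputs of the first fully connected layer. Channel counts are taken from the convolution rows only, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Spatial size after an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1

# (layer name, kernel size, stride, output channels or None for pooling)
# Strides and the absence of padding are assumptions, not stated in the table.
LAYERS = [
    ("conv 7x7", 7, 1, 16),
    ("max pool 2x2", 2, 2, None),
    ("conv 3x3", 3, 1, 32),
    ("conv 1x1", 1, 1, 32),
    ("conv 4x4", 4, 1, 64),
    ("max pool 2x2", 2, 2, None),
]

def trace_shapes(size=32, channels=1):
    """Return (name, channels, spatial size) after each layer for a square input."""
    shapes = []
    for name, kernel, stride, out_channels in LAYERS:
        size = conv_out(size, kernel, stride)
        if out_channels is not None:
            channels = out_channels  # pooling keeps the channel count unchanged
        shapes.append((name, channels, size))
    return shapes

shapes = trace_shapes()
for name, channels, size in shapes:
    print(f"{name:<14} {channels:>3} x {size} x {size}")

_, final_channels, final_size = shapes[-1]
flattened = final_channels * final_size * final_size
print("flattened features:", flattened)
```

Under these assumptions the last pooling layer outputs 64 feature maps of 4x4, which flattens to the 1024 values fed into the first fully connected layer.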

Data augmentation

Data augmentation is fundamentally important for improving the performance of the network. It allows the network to learn the features that are invariant for each object class, rather than artifacts of the training images. I explored several ways of augmenting the images to artificially increase the size of the dataset:

Random rotations between -20 and 20 degrees
Zoom by a factor of 1.3
Random brightness factor
Shear
Translation

Because traffic signs can change dramatically in brightness with the time of day or the weather, brightness jitter was used to randomize brightness tones.
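
The augmentations listed above could be combined roughly as follows. This is a dependency-free sketch rather than the code used for this project: it composes rotation, zoom, and shear into one affine matrix, adds a translation, resamples with nearest-neighbour lookup, and scales brightness. Only the rotation range and the 1.3 zoom factor come from the list. Treating 1.3 as an upper bound is an assumption, the shear, translation, and brightness ranges are made up for illustration, and `sample_params` and `augment` are hypothetical names.

```python
import math
import random

def sample_params(rng):
    """Draw one random set of augmentation parameters."""
    return {
        "angle": rng.uniform(-20.0, 20.0),    # degrees, range from the list above
        "zoom": rng.uniform(1.0, 1.3),        # assumption: 1.3 is the upper bound
        "shear": rng.uniform(-0.1, 0.1),      # assumed range
        "tx": rng.uniform(-2.0, 2.0),         # assumed range, in pixels
        "ty": rng.uniform(-2.0, 2.0),         # assumed range, in pixels
        "brightness": rng.uniform(0.5, 1.5),  # assumed range
    }

def augment(image, p):
    """Warp a square grayscale image (list of rows, values in [0, 1])."""
    n = len(image)
    c = (n - 1) / 2.0  # transform about the image centre
    a = math.radians(p["angle"])
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Forward matrix M = zoom * Rotation(angle) @ Shear(shear)
    m00 = p["zoom"] * cos_a
    m01 = p["zoom"] * (cos_a * p["shear"] - sin_a)
    m10 = p["zoom"] * sin_a
    m11 = p["zoom"] * (sin_a * p["shear"] + cos_a)
    det = m00 * m11 - m01 * m10  # equals zoom ** 2, so never zero here
    out = [[0.0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Inverse mapping: find the source pixel for each output pixel.
            dx = x - c - p["tx"]
            dy = y - c - p["ty"]
            sx = (m11 * dx - m01 * dy) / det + c
            sy = (-m10 * dx + m00 * dy) / det + c
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < n and 0 <= iy < n:
                # Brightness jitter, clipped to the valid intensity range.
                out[y][x] = min(1.0, image[iy][ix] * p["brightness"])
    return out

rng = random.Random(0)
image = [[(x + y) / 62.0 for x in range(32)] for y in range(32)]
augmented = augment(image, sample_params(rng))
```

Calling `augment` with a fresh `sample_params(rng)` for every training image on every epoch gives the network a slightly different view of each sign each time. Pixels that map outside the source image are left at 0.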


Written on Dec 20, 2016
