Contextually Constrained Deep Networks for Scene Labeling
In Proceedings of the British Machine Vision Conference 2014
Abstract
Learning with deep architectures is a difficult problem: the complexity of the network and the gradient descent method used to update its weights can both lead to overfitting and bad local optima. To overcome these problems in the context of full scene labeling, we propose to constrain parts of the network using semantic context in order to 1) control its capacity while still allowing complex functions to be learned, and 2) obtain more meaningful layers, which helps avoid bad local optima. We first learn a weak convolutional network that provides rough label maps over the neighborhood of a pixel. We then incorporate this weak learner into a bigger network, previously trained using label information on the neighborhood of a pixel. This iterative augmentation process aims at increasing interpretability by constraining some feature maps to learn precise contextual information. We show that this contextual knowledge yields higher accuracy than state-of-the-art architectures on the Stanford and SIFT Flow scene labeling datasets. The approach is generic and can be applied to similar networks where contextual cues are available at training time.
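The two-stage idea above can be sketched in a toy form. This is a hypothetical illustration, not the paper's implementation: all names, sizes, and the use of a 1x1 (per-pixel linear) classifier as the "weak" network are assumptions made for brevity; the actual model uses trained convolutional networks.

```python
# Illustrative sketch (assumed details, not the paper's code): a weak
# per-pixel classifier first produces rough label maps, which are then
# appended as extra feature maps to the input of a larger labeling
# network, constraining part of its features to carry contextual labels.
import numpy as np

rng = np.random.default_rng(0)
H, W, C, K = 16, 16, 3, 8  # image height/width, input channels, label classes

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Stage 1: "weak" network -- here just a 1x1 convolution, i.e. a
# per-pixel linear classifier, producing a rough K-class label map.
W_weak = rng.standard_normal((C, K)) * 0.1
image = rng.standard_normal((H, W, C))
rough_labels = softmax(image @ W_weak)          # shape (H, W, K)

# Stage 2: augment the input with the rough label maps before feeding
# the bigger network, so some of its inputs are contextual label cues.
augmented = np.concatenate([image, rough_labels], axis=-1)  # (H, W, C+K)
W_big = rng.standard_normal((C + K, K)) * 0.1
final_labels = softmax(augmented @ W_big)       # per-pixel class distribution

print(final_labels.shape)
```

In the paper the weak learner is itself a convolutional network trained on neighborhood label information, and the augmentation is applied iteratively; the sketch only shows how coarse label maps can be injected as additional feature maps.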
Files
Extended Abstract (PDF, 1 page, 1.0M)
Paper (PDF, 11 pages, 1.2M)