Accepted at the Thirty-second Conference on Neural Information Processing Systems (NeurIPS), 2018
Abstract: Previous works on sequential learning address the problem of forgetting in discriminative models. In this paper we consider the case of generative models. In particular, we investigate generative adversarial networks (GANs) in the task of learning new categories in a sequential fashion. We first show that sequential fine-tuning renders the network unable to properly generate images from previous categories (i.e. forgetting). To address this problem, we propose Memory Replay GANs (MeRGANs), a conditional GAN framework that integrates a memory replay generator. We study two methods to prevent forgetting by leveraging these replays, namely joint training with replay and replay alignment. Qualitative and quantitative experimental results on the MNIST, SVHN and LSUN datasets show that our memory replay approach can generate competitive images while significantly mitigating the forgetting of previous categories.
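Of the two replay-based methods, replay alignment can be sketched as a pixel-wise consistency loss between the current generator and a frozen replay copy, evaluated on the same latent code and previous-category label. This is a minimal sketch under stated assumptions: the generator callables and argument layout here are hypothetical, not the paper's actual interface.

```python
import numpy as np

def replay_alignment_loss(g_current, g_replay, z_batch, labels):
    """Sketch of a replay-alignment penalty: for the same latent code z
    and previous-task label y, penalize the mean squared pixel distance
    between the current generator's output and the frozen replay
    generator's output. g_current / g_replay are hypothetical callables
    mapping (z, label) to image arrays."""
    diffs = [np.mean((g_current(z, y) - g_replay(z, y)) ** 2)
             for z, y in zip(z_batch, labels)]
    return float(np.mean(diffs))
```

When the two generators agree on replayed categories the loss is zero, so minimizing it alongside the adversarial loss for the new category discourages drift on old ones.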
Accepted at International Conference on Pattern Recognition (ICPR), 2018
Abstract: In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and Stanford-40 datasets demonstrate that we significantly improve on standard Elastic Weight Consolidation, and that we obtain competitive results compared to other state-of-the-art approaches to lifelong learning without forgetting.
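The diagonal-Fisher assumption in Elastic Weight Consolidation reduces the regularizer to an independent quadratic penalty per parameter. A minimal sketch of that penalty follows; the dict-of-arrays layout and parameter names are illustrative, and in the paper the factorized rotation would be applied before this diagonal approximation is made.

```python
import numpy as np

def ewc_penalty(params, old_params, fisher_diag, lam=1.0):
    """Sketch of the EWC quadratic penalty under a diagonal Fisher
    approximation. Each argument is a dict of numpy arrays keyed by
    (hypothetical) parameter names: current parameters, parameters
    after the previous task, and the diagonal Fisher estimates."""
    total = 0.0
    for name, p in params.items():
        # Each parameter is penalized for moving away from its old
        # value, weighted by how important it was for the old task.
        total += np.sum(fisher_diag[name] * (p - old_params[name]) ** 2)
    return 0.5 * lam * total
```

Parameters with large Fisher values are anchored strongly to their old values, while unimportant parameters remain free to adapt to the new task.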
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
Abstract: We propose a novel crowd counting approach that leverages abundantly available unlabeled crowd imagery in a learning-to-rank framework. To induce a ranking of cropped images, we use the observation that any sub-image of a crowded scene image is guaranteed to contain the same number of persons as, or fewer persons than, the super-image. This allows us to address the problem of the limited size of existing datasets for crowd counting. We collect two crowd scene datasets from Google using keyword searches and query-by-example image retrieval, respectively. We demonstrate how to efficiently learn from these unlabeled datasets by incorporating learning-to-rank in a multi-task network which simultaneously ranks images and estimates crowd density maps. Experiments on two of the most challenging crowd counting datasets show that our approach obtains state-of-the-art results.
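The sub-image observation can be encoded as a simple hinge ranking loss on predicted counts, penalizing the network only when it predicts more people in a crop than in the enclosing image. This is a sketch under stated assumptions: in the paper the counts would come from integrating the estimated density maps, and the function name here is illustrative.

```python
def crop_ranking_loss(count_sub, count_super, margin=0.0):
    """Sketch of a hinge ranking loss encoding the constraint that a
    sub-image cannot contain more people than its super-image. The
    loss is zero when the ordering is respected and grows linearly
    with the violation."""
    return max(0.0, count_sub - count_super + margin)
```

Because the constraint holds for any crop of any crowd image, this loss can be computed on unlabeled data and trained jointly with the supervised density-estimation loss.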
IEEE International Conference on Computer Vision (ICCV), 2017
Abstract: We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve on the state-of-the-art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques, and that we even outperform state-of-the-art full-reference IQA (FR-IQA) methods without resorting to the high-quality reference images they require.
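The batch-efficiency idea (one forward pass per image, with the loss summed over every valid pair in the batch so each image's gradient aggregates contributions from all pairs it appears in) can be sketched as a pairwise hinge ranking loss. This is a minimal sketch; the function and argument names are illustrative, not the paper's implementation.

```python
def batch_pairwise_hinge(scores, quality_rank, margin=1.0):
    """Sketch of a pairwise hinge ranking loss over all ordered pairs
    in a batch. scores: predicted quality for each image (obtained
    from a single forward pass of the whole batch); quality_rank:
    known relative ranking, lower meaning better quality (e.g. from
    synthetic distortion levels)."""
    n = len(scores)
    loss, pairs = 0.0, 0
    for i in range(n):
        for j in range(n):
            if quality_rank[i] < quality_rank[j]:  # image i is better
                # Penalize when the better image does not score higher
                # than the worse one by at least the margin.
                loss += max(0.0, margin - (scores[i] - scores[j]))
                pairs += 1
    return loss / max(pairs, 1)
```

Compared with feeding pairs through two network copies, this formulation extracts O(n^2) pairwise constraints from a batch of n images at the cost of n forward passes.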
Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Advisors: Joost van de Weijer and Andrew D. Bagdanov
Abstract: In this thesis we present a no-reference image quality assessment (NR-IQA) approach based on deep Siamese networks. One of the major challenges in applying deep learning techniques to image quality assessment is the absence of large datasets. To address this problem, we train our Siamese network to rank images in terms of image quality by using ranked image sets for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented by the trained Siamese network to a traditional CNN that estimates absolute image quality from single images. To address the difficulty of pair selection in Siamese network training, we demonstrate how our approach can be made significantly more efficient than traditional Siamese networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. We evaluate our approach on the LIVE dataset and demonstrate that it is superior to existing NR-IQA techniques. Furthermore, ours is the first NR-IQA method to surpass state-of-the-art full-reference IQA (FR-IQA) methods. Experiments on the TID2008 and Places2 datasets show the generalization ability of our approach.
I collaborate on various research projects within the Learning and Machine Perception (LAMP) group.
I received my B.Sc. and M.Sc. degrees in Information Engineering and Control Engineering from Northwestern Polytechnical University (NWPU), China, in 2013 and 2016, respectively. I received my second M.Sc. degree, in Computer Vision, from the Universitat Autònoma de Barcelona (UAB), Spain, in 2016. Since 2016, I have been pursuing a Ph.D. degree under the supervision of Dr. Joost van de Weijer and Dr. Andrew D. Bagdanov. My main research interests include deep neural networks, object detection, image quality assessment, crowd counting, GANs and lifelong learning.