October 17, 2019

3441 words · 17 min read

Paper Group ANR 855


Survey on Deep Learning Techniques for Person Re-Identification Task

Title Survey on Deep Learning Techniques for Person Re-Identification Task
Authors Bahram Lavi, Mehdi Fatan Serj, Ihsan Ullah
Abstract Intelligent video-surveillance is currently an active research field in computer vision and machine learning. It provides useful tools for surveillance operators and forensic video investigators. Person re-identification (PReID) is one of these tools. It consists of recognizing whether or not an individual has already been observed by another camera in a network. PReID can be employed in various applications, such as off-line retrieval of all video sequences showing an individual of interest whose image is given as a query, and on-line pedestrian tracking over multiple camera views. To this aim, many techniques have been proposed to improve PReID performance. Among them, many researchers utilize deep neural networks (DNNs) because of their strong performance and fast execution at test time. Our objective is to provide future researchers with an overview of the work done on PReID to date. We therefore summarize the state-of-the-art DNN models used for this task, giving a brief description of each model along with its evaluation on a set of benchmark datasets. Finally, a detailed comparison is provided among these models, followed by some limitations that can serve as guidelines for future research.
Tasks Person Re-Identification
Published 2018-07-13
URL http://arxiv.org/abs/1807.05284v3
PDF http://arxiv.org/pdf/1807.05284v3.pdf
PWC https://paperswithcode.com/paper/survey-on-deep-learning-techniques-for-person
Repo
Framework

Outperforming Good-Turing: Preliminary Report

Title Outperforming Good-Turing: Preliminary Report
Authors Amichai Painsky, Meir Feder
Abstract Estimating a large-alphabet probability distribution from a limited number of samples is a fundamental problem in machine learning and statistics. A variety of estimation schemes have been proposed over the years, mostly inspired by the early work of Laplace and the seminal contribution of Good and Turing. One basic assumption shared by most commonly used estimators is a unique correspondence between a symbol’s sample frequency and its estimated probability. In this work we challenge this paradigmatic assumption: we claim that symbols with “similar” frequencies should be assigned the same estimated probability value. This regulates the number of parameters and improves generalization. In this preliminary report we show that an ensemble of such regulated estimators yields a dramatic improvement in estimation accuracy (typically up to 50%) over currently known methods. An implementation of our suggested method is publicly available at the first author’s web page. (A minimal code sketch of the frequency-binning idea follows this entry.)
Tasks
Published 2018-07-06
URL http://arxiv.org/abs/1807.02287v2
PDF http://arxiv.org/pdf/1807.02287v2.pdf
PWC https://paperswithcode.com/paper/outperforming-good-turing-preliminary-report
Repo
Framework
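
A minimal sketch of the frequency-binning idea described in the abstract above, not the authors’ exact method: symbols whose sample counts fall in the same (log-spaced) bin share one estimated probability, which regulates the number of free parameters. The bin count and the Zipf toy data are illustrative assumptions.

```python
import numpy as np

def binned_estimator(counts, n_bins=10):
    """Assign one shared probability value per frequency bin."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    # Log-spaced bin edges: symbols with "similar" counts land in one bin.
    edges = np.unique(np.geomspace(1, counts.max() + 1, n_bins).astype(int))
    probs = np.empty_like(counts)
    for lo, hi in zip(np.r_[0, edges], np.r_[edges, counts.max() + 1]):
        mask = (counts >= lo) & (counts < hi)
        if mask.any():
            # Every symbol in the bin gets the bin's average empirical probability.
            probs[mask] = counts[mask].mean() / n
    return probs / probs.sum()  # renormalize to a valid distribution

rng = np.random.default_rng(0)
sample = rng.zipf(2.0, size=1000)                 # toy heavy-tailed alphabet
vals, counts = np.unique(sample, return_counts=True)
print(binned_estimator(counts)[:5])
```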

Global and Local Consistent Age Generative Adversarial Networks

Title Global and Local Consistent Age Generative Adversarial Networks
Authors Peipei Li, Yibo Hu, Qi Li, Ran He, Zhenan Sun
Abstract Age progression/regression is a challenging task due to the complicated and non-linear transformations of the human aging process. Many studies have shown that both global and local facial features are essential for face representation, but previous GAN-based methods have mainly focused on global features in age synthesis. To utilize both global and local facial information, we propose a Global and Local Consistent Age Generative Adversarial Network (GLCA-GAN). In our generator, a global network learns the whole facial structure and simulates the aging trend of the whole face, while three crucial facial patches are progressed or regressed by three local networks that imitate subtle changes in crucial facial subregions. To preserve most of the detail in age-attribute-irrelevant areas, our generator learns a residual face. Moreover, we employ an identity-preserving loss to better preserve identity information, as well as an age-preserving loss to enhance the accuracy of age synthesis. A pixel loss is also adopted to preserve detailed facial information of the input face. Our proposed method is evaluated on three face aging datasets: CACD, Morph, and FG-NET. Experimental results show appealing performance of the proposed method compared with the state-of-the-art. (A minimal code sketch of the global/local generator layout follows this entry.)
Tasks
Published 2018-01-25
URL http://arxiv.org/abs/1801.08390v1
PDF http://arxiv.org/pdf/1801.08390v1.pdf
PWC https://paperswithcode.com/paper/global-and-local-consistent-age-generative
Repo
Framework
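
A minimal sketch of the global-plus-local generator layout described in the abstract above. The conv stacks, input resolution, and the hypothetical patch boxes (standing in for eyes/nose/mouth regions) are illustrative assumptions, not the paper’s exact architecture.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class GLCAGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.global_net = conv_block(3, 3)            # whole-face aging trend
        self.local_nets = nn.ModuleList(conv_block(3, 3) for _ in range(3))
        # Hypothetical patch boxes (y0, y1, x0, x1) for three facial subregions.
        self.patches = [(20, 60, 20, 108), (50, 90, 40, 88), (80, 110, 40, 88)]

    def forward(self, face):                          # face: (B, 3, 128, 128)
        residual = self.global_net(face).clone()      # clone so patch writes are safe
        for net, (y0, y1, x0, x1) in zip(self.local_nets, self.patches):
            # Each local net refines one crucial facial subregion.
            residual[:, :, y0:y1, x0:x1] = net(face[:, :, y0:y1, x0:x1])
        # The generator predicts a residual, preserving age-irrelevant detail.
        return face + residual

print(GLCAGenerator()(torch.randn(1, 3, 128, 128)).shape)
```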

Wider Channel Attention Network for Remote Sensing Image Super-resolution

Title Wider Channel Attention Network for Remote Sensing Image Super-resolution
Authors Jun Gu, Guangluan Xu, Yue Zhang, Xian Sun, Ran Wen, Lei Wang
Abstract Recently, deep convolutional neural networks (CNNs) have obtained promising results in image processing tasks, including super-resolution (SR). However, most CNN-based SR methods treat low-resolution (LR) inputs and features equally across channels, rarely account for the loss of information flow caused by the activation function, and fail to leverage the representation ability of CNNs. In this letter, we propose a novel single-image super-resolution (SISR) algorithm named Wider Channel Attention Network (WCAN) for remote sensing images. First, a channel attention mechanism adaptively recalibrates the importance of each channel in the middle of the wider attention block (WAB). Second, we propose the Local Memory Connection (LMC) to enhance information flow. Finally, the features within each WAB are fused to take advantage of the network’s representation capability and further improve information and gradient flow. Experiments on a public remote sensing data set (UC Merced) show that our WCAN achieves better accuracy and visual quality than most state-of-the-art methods. (A minimal code sketch of channel attention follows this entry.)
Tasks Image Super-Resolution, Super-Resolution
Published 2018-12-13
URL http://arxiv.org/abs/1812.05329v2
PDF http://arxiv.org/pdf/1812.05329v2.pdf
PWC https://paperswithcode.com/paper/wider-channel-attention-network-for-remote
Repo
Framework
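
A minimal sketch of the channel-attention recalibration the abstract above describes, in the common squeeze-and-excitation style; the reduction ratio and surrounding structure are assumptions, not the paper’s exact WAB.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze: global channel statistics
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(self.pool(x))                     # per-channel weights in (0, 1)
        return x * w                                  # recalibrate channel importance

x = torch.randn(2, 64, 32, 32)
print(ChannelAttention(64)(x).shape)
```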

Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control

Title Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control
Authors Glen Berseth, Cheng Xie, Paul Cernek, Michiel Van de Panne
Abstract Deep reinforcement learning has demonstrated increasing capabilities for continuous control problems, including agents that can move with skill and agility through their environment. An open problem in this setting is developing good strategies for integrating or merging policies for multiple skills, where each individual policy is a specialist in one skill and its associated state distribution. We extend policy distillation methods to the continuous action setting and leverage this technique to combine expert policies, evaluated in the domain of simulated bipedal locomotion across different classes of terrain. We also introduce an input injection method for augmenting an existing policy network to exploit new input features. Lastly, our method uses transfer learning to assist in the efficient acquisition of new skills. The combination of these methods allows a policy to be incrementally augmented with new skills. We compare our progressive learning and integration via distillation (PLAID) method against three alternative baselines. (A minimal code sketch of continuous-action policy distillation follows this entry.)
Tasks Continuous Control, Transfer Learning
Published 2018-02-13
URL http://arxiv.org/abs/1802.04765v1
PDF http://arxiv.org/pdf/1802.04765v1.pdf
PWC https://paperswithcode.com/paper/progressive-reinforcement-learning-with
Repo
Framework
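
A minimal sketch of distilling several expert policies into one student in a continuous action space, as the abstract above outlines: the student regresses onto each expert’s actions on that expert’s own states. The network sizes, random stand-in states, and the squared-error objective are assumptions.

```python
import torch
import torch.nn as nn

def make_policy(obs_dim=10, act_dim=4):
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))

experts = [make_policy() for _ in range(3)]           # one specialist per terrain class
student = make_policy()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(100):
    loss = torch.tensor(0.0)
    for expert in experts:
        states = torch.randn(32, 10)                  # stand-in for that skill's states
        with torch.no_grad():
            target = expert(states)                   # expert's continuous actions
        loss = loss + ((student(states) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```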

Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students

Title Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students
Authors Chenglin Yang, Lingxi Xie, Siyuan Qiao, Alan Yuille
Abstract We focus on the problem of training a deep neural network in generations. The scheme is as follows: to optimize the target network (the student), another network (the teacher) with the same architecture is first trained and then used to provide part of the supervision signal in the next stage. While this strategy leads to higher accuracy, many aspects (e.g., why teacher-student optimization helps) still need further exploration. This paper studies the problem from the perspective of controlling the strictness with which the teacher network is trained. Existing approaches mostly use a hard distribution (e.g., one-hot vectors) in training, leading to a strict teacher which itself has high accuracy; we argue instead that the teacher should be more tolerant, even though this often implies lower accuracy. The implementation is very simple: merely an extra loss term is added to the teacher network, allowing a few secondary classes to emerge and complement the primary class. Consequently, the teacher provides a milder supervision signal (a less peaked distribution), making it possible for the student to learn from inter-class similarity and potentially lowering the risk of over-fitting. Experiments are performed on standard image classification tasks (CIFAR100 and ILSVRC2012). Although the teacher network is less powerful, the students show persistent ability growth and eventually achieve higher classification accuracy than other competitors. Model ensembling and transferred feature extraction also verify the effectiveness of our approach. (A minimal code sketch of a tolerant teacher loss follows this entry.)
Tasks Image Classification
Published 2018-05-15
URL http://arxiv.org/abs/1805.05551v2
PDF http://arxiv.org/pdf/1805.05551v2.pdf
PWC https://paperswithcode.com/paper/knowledge-distillation-in-generations-more
Repo
Framework
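
A minimal sketch of the idea above: train the teacher with an extra loss term that lets a few secondary classes keep probability mass, yielding a softer supervision signal. The particular form used here, shrinking the gap between the top-1 probability and the next K probabilities, is an illustrative assumption, not the paper’s exact term.

```python
import torch
import torch.nn.functional as F

def tolerant_teacher_loss(logits, labels, k=5, lam=0.1):
    ce = F.cross_entropy(logits, labels)
    probs = F.softmax(logits, dim=1)
    topk = probs.topk(k + 1, dim=1).values           # primary + K secondary classes
    # Penalize a peaked distribution: the primary prob should not dwarf secondaries.
    gap = (topk[:, 0] - topk[:, 1:].mean(dim=1)).mean()
    return ce + lam * gap

logits = torch.randn(8, 100, requires_grad=True)      # toy CIFAR100-sized output
labels = torch.randint(0, 100, (8,))
loss = tolerant_teacher_loss(logits, labels)
loss.backward()
print(float(loss))
```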

Gaussian Process Regression for Binned Data

Title Gaussian Process Regression for Binned Data
Authors Michael Thomas Smith, Mauricio A Alvarez, Neil D Lawrence
Abstract Many datasets come as tables of binned data. Performing regression on such data usually involves either reading off bin heights, ignoring data from neighbouring bins, or interpolating between bins, thus over- or under-estimating the true bin integrals. In this paper we propose an elegant method for performing Gaussian process (GP) regression given such binned data, allowing one to make probabilistic predictions of the latent function that produced the binned data. We look at several applications: first, differentially private regression; second, predictions over other integrals; and third, input regions that are irregularly shaped collections of polytopes. In summary, our method provides an effective way of analysing binned data that uses more of the information in the histogram representation and thus reconstructs a more useful and precise density for making predictions. (A minimal code sketch of GP regression with bin-integral observations follows this entry.)
Tasks
Published 2018-09-06
URL https://arxiv.org/abs/1809.02010v2
PDF https://arxiv.org/pdf/1809.02010v2.pdf
PWC https://paperswithcode.com/paper/gaussian-process-regression-for-binned-data
Repo
Framework
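
A minimal sketch in the spirit of the abstract above: each observation is the integral of the latent function over a bin, so the kernel between bins and prediction points is integrated accordingly. Here the integrals are approximated by simple quadrature; the paper derives exact expressions. The RBF kernel, bins, and observed values are illustrative assumptions.

```python
import numpy as np

def rbf(a, b, ls=0.3):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

bins = np.array([[0.0, 0.5], [0.5, 1.0], [1.0, 1.5]])
y = np.array([0.8, 1.6, 0.9])                         # observed bin integrals
xs = np.linspace(0, 1.5, 50)                          # prediction points

# Quadrature nodes inside each bin approximate the integrated kernel.
nodes = np.linspace(0, 1, 20)
quad = bins[:, 0, None] + (bins[:, 1] - bins[:, 0])[:, None] * nodes[None, :]
w = (bins[:, 1] - bins[:, 0]) / nodes.size            # quadrature weight per bin

Q = quad.reshape(-1)                                  # all nodes, flattened
nb, nn = quad.shape
# Bin-to-bin covariance: double integral of k over the two bins.
Kbb = w[:, None] * w[None, :] * rbf(Q, Q).reshape(nb, nn, nb, nn).sum(axis=(1, 3))
# Bin-to-point covariance: single integral of k over each bin.
Kbx = w[:, None] * rbf(Q, xs).reshape(nb, nn, -1).sum(axis=1)

mean = Kbx.T @ np.linalg.solve(Kbb + 1e-6 * np.eye(nb), y)
print(mean[:5])                                       # posterior mean of latent f
```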

Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework

Title Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework
Authors Hao Wang, Ruibin Feng, Chi-Sing Leung
Abstract The aim of sparse approximation is to estimate a sparse signal from a measurement matrix and an observation vector. It is widely used in data analytics, image processing, and communication, among other fields. A great deal of research has been done in this area, and many off-the-shelf algorithms have been proposed; however, most of them cannot offer a real-time solution, which limits their application prospects. To address this issue, we devise a novel sparse approximation algorithm based on the Lagrange programming neural network (LPNN), the locally competitive algorithm (LCA), and the projection theorem. LPNN and LCA are both analog neural networks, which helps us obtain a real-time solution. The non-differentiable objective function is handled through the LCA concept. Utilizing the projection theorem, we further modify the dynamics and propose a new system with global asymptotic stability. Simulation results show that the proposed sparse approximation method yields real-time solutions with satisfactory MSEs. (A minimal code sketch of LCA-style dynamics follows this entry.)
Tasks
Published 2018-05-30
URL http://arxiv.org/abs/1805.11949v1
PDF http://arxiv.org/pdf/1805.11949v1.pdf
PWC https://paperswithcode.com/paper/fast-l1-minimization-algorithm-for-sparse
Repo
Framework
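
A minimal sketch of LCA-style dynamics for L1 minimization, simulated in discrete time. The paper analyzes a continuous analog system with an LPNN formulation and a projection step; this plain Euler simulation of standard LCA dynamics is an illustrative stand-in, and the problem sizes and threshold are assumptions.

```python
import numpy as np

def soft_threshold(u, lam):
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

rng = np.random.default_rng(1)
n, m, k = 50, 20, 3
A = rng.standard_normal((m, n)) / np.sqrt(m)          # measurement matrix
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
b = A @ x_true                                        # observation vector

u = np.zeros(n)
lam, dt = 0.05, 0.1
for _ in range(2000):
    a = soft_threshold(u, lam)                        # neuron activations
    # du/dt = -u + a - A^T (A a - b): leaky integration driven by the residual.
    u += dt * (-u + a - A.T @ (A @ a - b))

x_hat = soft_threshold(u, lam)
print(np.linalg.norm(x_hat - x_true))                 # recovery error
```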

Active Learning and CSI Acquisition for mmWave Initial Alignment

Title Active Learning and CSI Acquisition for mmWave Initial Alignment
Authors Sung-En Chiu, Nancy Ronquillo, Tara Javidi
Abstract Millimeter wave (mmWave) communication with large antenna arrays is a promising technique for enabling extremely high data rates, owing to the large available bandwidth in mmWave frequency bands. In addition, given knowledge of an optimal directional beamforming vector, large antenna arrays have been shown to overcome both the severe signal attenuation of mmWave and the interference problem. However, fundamental limits on the achievable learning rate of an optimal beamforming vector remain. This paper considers the problem of adaptive and sequential optimization of the beamforming vectors during the initial access phase of communication. With a single-path channel model, the problem reduces to actively learning the Angle-of-Arrival (AoA) of the signal sent from the user to the Base Station (BS). Drawing on recent results in the design of hierarchical beamforming codebooks [1], sequential measurement-dependent noisy search strategies [2], and active learning from an imperfect labeler [3], an adaptive and sequential alignment algorithm is proposed. An upper bound on the expected search time of the proposed algorithm is derived via the Extrinsic Jensen-Shannon Divergence, which shows that the search time of the proposed algorithm asymptotically matches the performance of noiseless bisection search up to a constant factor. Furthermore, the upper bound shows that the acquired AoA error probability decays exponentially fast with the search time, with an exponent that is a decreasing function of the acquisition rate. Numerically, the proposed algorithm is compared with prior work, and a significant improvement in system communication rate is observed. Most notably, in the relevant regime of low (-10dB to 5dB) raw SNR, this establishes the first practically viable solution for initial access and, hence, the first demonstration of stand-alone mmWave communication. (A minimal code sketch of a noisy bisection search follows this entry.)
Tasks Active Learning
Published 2018-12-19
URL https://arxiv.org/abs/1812.07722v4
PDF https://arxiv.org/pdf/1812.07722v4.pdf
PWC https://paperswithcode.com/paper/active-learning-and-csi-acquisition-for
Repo
Framework
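
A minimal sketch of actively localizing an AoA with a noisy bisection search over a discretized angle grid, in the spirit of the sequential search the abstract above describes. The Bayesian posterior update with a fixed crossover probability is an illustrative simplification of the paper’s measurement model.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-90, 90, 256)                      # candidate AoAs (degrees)
true_aoa = 17.3
post = np.full(grid.size, 1 / grid.size)              # uniform prior over the grid
p_err = 0.1                                           # measurement flip probability

for t in range(40):
    # Query the posterior median: "is the AoA to the left of this angle?"
    median = grid[np.searchsorted(post.cumsum(), 0.5)]
    answer = (true_aoa < median) ^ (rng.random() < p_err)   # noisy response
    left = grid < median
    like = np.where(left == answer, 1 - p_err, p_err)       # likelihood of the answer
    post = post * like
    post /= post.sum()

print(grid[post.argmax()])                            # AoA estimate
```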

MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection

Title MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection
Authors Wenchi Ma, Yuanwei Wu, Zongbo Wang, Guanghui Wang
Abstract Object detection in challenging situations such as scale variation, occlusion, and truncation depends not only on feature details but also on contextual information. Most previous networks place too much emphasis on detailed feature extraction through deeper and wider networks, which may enhance detection accuracy to a certain extent; however, feature details are easily changed or washed out after passing through complicated filtering structures. To better handle these challenges, this paper proposes a novel framework, the multi-scale, deep inception convolutional neural network (MDCN), which focuses on wider and broader object regions by activating feature maps produced in the deep part of the network. Instead of incepting inner layers in the shallow part of the network, multi-scale inceptions are introduced in the deep layers. The proposed framework integrates contextual information into the learning process through a single-shot network structure. It is computationally efficient and avoids the hard training problem of previous macro-feature extraction networks designed for shallow layers. Extensive experiments demonstrate the effectiveness and superior performance of MDCN over state-of-the-art models. (A minimal code sketch of a deep multi-scale inception module follows this entry.)
Tasks Object Detection
Published 2018-09-06
URL http://arxiv.org/abs/1809.01791v1
PDF http://arxiv.org/pdf/1809.01791v1.pdf
PWC https://paperswithcode.com/paper/mdcn-multi-scale-deep-inception-convolutional
Repo
Framework
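
A minimal sketch of a multi-scale inception module placed on a deep feature map, as the abstract above describes; the branch widths and kernel sizes are illustrative assumptions, not the paper’s exact MDCN design.

```python
import torch
import torch.nn as nn

class DeepInception(nn.Module):
    def __init__(self, cin, cout):
        super().__init__()
        width = cout // 4
        # Parallel branches see increasingly wide context around each object.
        self.branches = nn.ModuleList([
            nn.Conv2d(cin, width, 1),
            nn.Conv2d(cin, width, 3, padding=1),
            nn.Conv2d(cin, width, 5, padding=2),
            nn.Conv2d(cin, width, 7, padding=3),
        ])

    def forward(self, x):
        # Concatenate multi-scale responses along the channel dimension.
        return torch.cat([b(x) for b in self.branches], dim=1)

feat = torch.randn(1, 256, 19, 19)                    # a deep SSD-style feature map
print(DeepInception(256, 256)(feat).shape)
```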

Accelerating Deep Learning with Memcomputing

Title Accelerating Deep Learning with Memcomputing
Authors Haik Manukian, Fabio L. Traversa, Massimiliano Di Ventra
Abstract Restricted Boltzmann machines (RBMs) and their extensions, called ‘deep-belief networks’, are powerful neural networks that have found applications in machine learning and artificial intelligence. The standard way to train these models is an iterative unsupervised procedure based on Gibbs sampling, called ‘contrastive divergence’ (CD), plus additional supervised tuning via back-propagation. However, this procedure has been shown not to follow any gradient and can lead to suboptimal solutions. In this paper, we show an efficient alternative to CD by means of simulations of digital memcomputing machines (DMMs). We test our approach on pattern recognition using a modified version of the MNIST data set. DMMs sample effectively the vast phase space defined by the model distribution of the RBM and provide a very good approximation close to the optimum. This efficient search significantly reduces the number of pretraining iterations needed to achieve a given level of accuracy, as well as giving a total performance gain over CD. In fact, the acceleration of pretraining achieved by simulating DMMs is comparable, in number of iterations, to the recently reported hardware application of quantum annealing on the same network and data set; notably, however, DMMs perform far better than the reported quantum annealing results in terms of training quality. We also compare our method to advances in supervised training, such as batch normalization and rectifiers, that reduce the advantage of pretraining. We find that the memcomputing method still maintains a quality advantage ($>1\%$ in accuracy, and a $20\%$ reduction in error rate) over these approaches. Furthermore, our method is agnostic to the connectivity of the network, and can therefore be extended to train full Boltzmann machines, and even deep networks, at once. (A minimal code sketch of the CD-1 baseline follows this entry.)
Tasks
Published 2018-01-01
URL http://arxiv.org/abs/1801.00512v3
PDF http://arxiv.org/pdf/1801.00512v3.pdf
PWC https://paperswithcode.com/paper/accelerating-deep-learning-with-memcomputing
Repo
Framework
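
A minimal sketch of the contrastive-divergence (CD-1) baseline that the paper’s memcomputing approach replaces in pretraining; the DMM sampler itself requires a dedicated simulator and is not reproduced here. Biases are omitted for brevity, and the random binary batch is a stand-in for MNIST.

```python
import numpy as np

rng = np.random.default_rng(0)
nv, nh, lr = 784, 64, 0.01
W = 0.01 * rng.standard_normal((nv, nh))              # visible-to-hidden weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0):
    h0 = sigmoid(v0 @ W)                              # hidden probabilities
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T)                      # one Gibbs reconstruction
    h1 = sigmoid(v1 @ W)
    # CD gradient: data statistics minus one-step model statistics.
    return (v0.T @ h0 - v1.T @ h1) / v0.shape[0]

batch = (rng.random((32, nv)) < 0.1).astype(float)    # stand-in for an MNIST batch
for _ in range(10):
    W += lr * cd1_step(batch)
print(np.abs(W).mean())
```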

Frank-Wolfe with Subsampling Oracle

Title Frank-Wolfe with Subsampling Oracle
Authors Thomas Kerdreux, Fabian Pedregosa, Alexandre d’Aspremont
Abstract We analyze two novel randomized variants of the Frank-Wolfe (FW), or conditional gradient, algorithm. While classical FW algorithms require solving a linear minimization problem over the entire domain at each iteration, the proposed methods only require solving a linear minimization problem over a small \emph{subset} of the original domain. The first algorithm we propose is a randomized variant of the original FW algorithm and achieves an $\mathcal{O}(1/t)$ sublinear convergence rate, as in its deterministic counterpart. The second is a randomized variant of the Away-step FW algorithm and, again like its deterministic counterpart, reaches a linear (i.e., exponential) convergence rate, making it the first provably convergent randomized variant of Away-step FW. In both cases, while subsampling reduces the convergence rate by a constant factor, the linear minimization step can cost a fraction of its deterministic counterpart, especially when the data is streamed. We illustrate the computational gains of the algorithms on regression problems involving both $\ell_1$ and latent group lasso penalties. (A minimal code sketch of the subsampled oracle follows this entry.)
Tasks
Published 2018-03-20
URL http://arxiv.org/abs/1803.07348v1
PDF http://arxiv.org/pdf/1803.07348v1.pdf
PWC https://paperswithcode.com/paper/frank-wolfe-with-subsampling-oracle
Repo
Framework
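
A minimal sketch of the subsampled oracle idea above: at each iteration the linear minimization oracle searches only a random subset of the atoms (here, the $\pm e_i$ vertices of an $\ell_1$ ball) rather than the full domain. The least-squares objective and subset size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 40, 200
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
radius, subset = 5.0, 20
x = np.zeros(n)

for t in range(200):
    grad = A.T @ (A @ x - b)
    # Subsampled LMO: consider only a few random +/- e_i vertices of the L1 ball.
    idx = rng.choice(n, subset, replace=False)
    i = idx[np.abs(grad[idx]).argmax()]
    s = np.zeros(n)
    s[i] = -radius * np.sign(grad[i])                 # best vertex in the subset
    gamma = 2.0 / (t + 2.0)                           # classical FW step size
    x = (1 - gamma) * x + gamma * s

print(0.5 * np.linalg.norm(A @ x - b) ** 2)           # final objective value
```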

Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization

Title Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization
Authors Ozsel Kilinc, Ismail Uysal
Abstract In this paper, we propose a novel unsupervised clustering approach that exploits hidden information introduced indirectly through a pseudo classification objective. Specifically, we randomly assign a pseudo parent-class label to each observation, which is then modified by applying the domain-specific transformation associated with that label. The generated pseudo observation-label pairs are subsequently used to train a neural network with an Auto-clustering Output Layer (ACOL), which introduces multiple softmax nodes for each pseudo parent class. Due to the unsupervised objective based on Graph-based Activity Regularization (GAR) terms, the softmax duplicates of each parent class specialize as the hidden information captured through the domain-specific transformations is propagated during training. Ultimately we obtain a k-means-friendly latent representation. Furthermore, we demonstrate how the chosen transformation type impacts performance and helps propagate the latent information that is useful in revealing unknown clusters. Our results show state-of-the-art performance for unsupervised clustering on the MNIST, SVHN, and USPS datasets, with the highest accuracies reported to date in the literature. (A minimal code sketch of the pseudo-labeling step follows this entry.)
Tasks Unsupervised Image Classification
Published 2018-02-08
URL http://arxiv.org/abs/1802.03063v1
PDF http://arxiv.org/pdf/1802.03063v1.pdf
PWC https://paperswithcode.com/paper/learning-latent-representations-in-neural
Repo
Framework
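
A minimal sketch of the pseudo-supervision step described above: each observation gets a random pseudo parent-class label and is transformed by that label’s domain-specific transformation. Using image rotations as the transformation, and random arrays as stand-in digits, are illustrative choices; the ACOL output layer and GAR terms are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((100, 28, 28))                    # stand-in for MNIST digits

def transform(img, label):
    # One domain-specific transformation per pseudo parent class: a rotation.
    return np.rot90(img, k=label)

pseudo_labels = rng.integers(0, 4, size=len(images))  # 4 pseudo parent classes
pairs = [(transform(img, lab), lab) for img, lab in zip(images, pseudo_labels)]

# `pairs` would train a classifier whose softmax duplicates per parent class
# specialize into the unknown true clusters during training.
print(len(pairs), pairs[0][0].shape)
```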

This Looks Like That: Deep Learning for Interpretable Image Recognition

Title This Looks Like That: Deep Learning for Interpretable Image Recognition
Authors Chaofan Chen, Oscar Li, Chaofan Tao, Alina Jade Barnett, Jonathan Su, Cynthia Rudin
Abstract When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image and pointing out prototypical aspects of one class or another. The mounting evidence for each class helps us make our final decision. In this work, we introduce a deep network architecture, the prototypical part network (ProtoPNet), that reasons in a similar way: the network dissects the image by finding prototypical parts and combines evidence from the prototypes to make a final classification. The model thus reasons in a way that is qualitatively similar to how ornithologists, physicians, and others would explain to people how to solve challenging image classification tasks. The network uses only image-level labels for training, without any annotations for parts of images. We demonstrate our method on the CUB-200-2011 dataset and the Stanford Cars dataset. Our experiments show that ProtoPNet can achieve accuracy comparable to its analogous non-interpretable counterpart, and that when several ProtoPNets are combined into a larger network, the result is on par with some of the best-performing deep models. Moreover, ProtoPNet provides a level of interpretability that is absent in other interpretable deep models. (A minimal code sketch of a prototype layer follows this entry.)
Tasks Image Classification
Published 2018-06-27
URL https://arxiv.org/abs/1806.10574v5
PDF https://arxiv.org/pdf/1806.10574v5.pdf
PWC https://paperswithcode.com/paper/this-looks-like-that-deep-learning-for
Repo
Framework
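
A minimal sketch of the prototype layer at the heart of the idea above: each learned prototype’s similarity to its closest patch of the convolutional feature map drives the classification. The dimensions and log-activation follow the paper’s description loosely, but this is a simplified stand-in, not the released implementation.

```python
import torch
import torch.nn as nn

class PrototypeLayer(nn.Module):
    def __init__(self, n_protos=10, dim=128, n_classes=5):
        super().__init__()
        self.protos = nn.Parameter(torch.randn(n_protos, dim))
        self.classify = nn.Linear(n_protos, n_classes, bias=False)

    def forward(self, feat):                          # feat: (B, dim, H, W)
        B = feat.shape[0]
        patches = feat.flatten(2).transpose(1, 2)     # (B, H*W, dim) spatial patches
        # Squared distance from every prototype to every spatial patch.
        d = torch.cdist(patches, self.protos[None].expand(B, -1, -1)) ** 2
        dmin = d.min(dim=1).values                    # closest patch per prototype
        sim = torch.log((dmin + 1) / (dmin + 1e-4))   # "this looks like that" score
        return self.classify(sim)

print(PrototypeLayer()(torch.randn(2, 128, 7, 7)).shape)
```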

Understanding Human Behaviors in Crowds by Imitating the Decision-Making Process

Title Understanding Human Behaviors in Crowds by Imitating the Decision-Making Process
Authors Haosheng Zou, Hang Su, Shihong Song, Jun Zhu
Abstract Crowd behavior understanding is crucial yet challenging across a wide range of applications, since crowd behavior is inherently determined by a sequential decision-making process based on various factors, such as pedestrians’ own destinations, interactions with nearby pedestrians, and anticipation of upcoming events. In this paper, we propose a novel framework, Social-Aware Generative Adversarial Imitation Learning (SA-GAIL), to mimic the underlying decision-making process of pedestrians in crowds. Specifically, we infer the latent factors of the human decision-making process in an unsupervised manner by extending the Generative Adversarial Imitation Learning framework to anticipate future paths of pedestrians. Different factors of human decision making are disentangled via mutual-information maximization, with the process modeled by a collision-avoidance regularizer and Social-Aware LSTMs. Experimental results demonstrate the potential of our framework for disentangling the latent decision-making factors of pedestrians and its stronger ability to predict future trajectories. (A minimal code sketch of the adversarial imitation core follows this entry.)
Tasks Decision Making, Imitation Learning
Published 2018-01-25
URL http://arxiv.org/abs/1801.08391v1
PDF http://arxiv.org/pdf/1801.08391v1.pdf
PWC https://paperswithcode.com/paper/understanding-human-behaviors-in-crowds-by
Repo
Framework
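
A minimal sketch of the adversarial imitation core that SA-GAIL builds on: a discriminator learns to separate expert state-action pairs from the policy’s, and its output supplies the policy’s reward. The dimensions and random stand-in batches are assumptions; the Social-Aware LSTM, collision-avoidance regularizer, and mutual-information terms from the paper are omitted.

```python
import torch
import torch.nn as nn

obs_dim, act_dim = 4, 2                               # e.g., pedestrian position + velocity
disc = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(),
                     nn.Linear(64, 1), nn.Sigmoid())
opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCELoss()

expert_sa = torch.randn(64, obs_dim + act_dim)        # stand-in for pedestrian data
policy_sa = torch.randn(64, obs_dim + act_dim)        # stand-in for policy rollouts

for _ in range(100):
    # Discriminator: expert pairs -> 1, policy pairs -> 0.
    loss = bce(disc(expert_sa), torch.ones(64, 1)) + \
           bce(disc(policy_sa), torch.zeros(64, 1))
    opt.zero_grad(); loss.backward(); opt.step()

# The policy's imitation reward is derived from the discriminator output.
reward = -torch.log(1 - disc(policy_sa) + 1e-8)
print(reward.mean().item())
```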