Paper Group ANR 855
Survey on Deep Learning Techniques for Person Re-Identification Task. Outperforming Good-Turing: Preliminary Report. Global and Local Consistent Age Generative Adversarial Networks. Wider Channel Attention Network for Remote Sensing Image Super-resolution. Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control. Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students. Gaussian Process Regression for Binned Data. Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework. Active Learning and CSI Acquisition for mmWave Initial Alignment. MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection. Accelerating Deep Learning with Memcomputing. Frank-Wolfe with Subsampling Oracle. Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization. This Looks Like That: Deep Learning for Interpretable Image Recognition. Understanding Human Behaviors in Crowds by Imitating the Decision-Making Process.
Survey on Deep Learning Techniques for Person Re-Identification Task
Title | Survey on Deep Learning Techniques for Person Re-Identification Task |
Authors | Bahram Lavi, Mehdi Fatan Serj, Ihsan Ullah |
Abstract | Intelligent video-surveillance is currently an active research field in computer vision and machine learning. It provides useful tools for surveillance operators and forensic video investigators. Person re-identification (PReID) is one of these tools. It consists of recognizing whether or not an individual has already been observed by a camera in the network. This tool can be employed in various applications, such as off-line retrieval of all the video sequences showing an individual of interest whose image is given as a query, and online pedestrian tracking over multiple camera views. To this aim, many techniques have been proposed to increase the performance of PReID. Among these, many researchers have utilized deep neural networks (DNNs) because of their better performance and fast execution at test time. Our objective is to provide future researchers with an overview of the work done on PReID to date. Therefore, we summarize the state-of-the-art DNN models being used for this task. A brief description of each model along with its evaluation on a set of benchmark datasets is given. Finally, a detailed comparison is provided among these models, followed by some limitations that can serve as guidelines for future research. |
Tasks | Person Re-Identification |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.05284v3 |
http://arxiv.org/pdf/1807.05284v3.pdf | |
PWC | https://paperswithcode.com/paper/survey-on-deep-learning-techniques-for-person |
Repo | |
Framework | |
Outperforming Good-Turing: Preliminary Report
Title | Outperforming Good-Turing: Preliminary Report |
Authors | Amichai Painsky, Meir Feder |
Abstract | Estimating a large alphabet probability distribution from a limited number of samples is a fundamental problem in machine learning and statistics. A variety of estimation schemes have been proposed over the years, mostly inspired by the early work of Laplace and the seminal contribution of Good and Turing. One of the basic assumptions shared by most commonly-used estimators is the unique correspondence between a symbol’s sample frequency and its estimated probability. In this work we tackle this paradigmatic assumption; we claim that symbols with “similar” frequencies shall be assigned the same estimated probability value. This way we regulate the number of parameters and improve generalization. In this preliminary report we show that by applying an ensemble of such regulated estimators, we achieve a dramatic improvement in estimation accuracy (typically up to 50%) compared to currently known methods. An implementation of our suggested method is publicly available at the first author’s web-page. (A code sketch follows this entry.) |
Tasks | |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02287v2 |
http://arxiv.org/pdf/1807.02287v2.pdf | |
PWC | https://paperswithcode.com/paper/outperforming-good-turing-preliminary-report |
Repo | |
Framework | |
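The paper's core idea is that symbols with similar sample frequencies should share one estimated probability, reducing the number of free parameters. A minimal sketch below contrasts the classical (unsmoothed) Good-Turing estimate with a toy pooled estimator; the equal-width frequency binning is an illustrative assumption, not the authors' ensemble scheme, whose implementation is linked in the abstract.

```python
import numpy as np
from collections import Counter

def good_turing(samples):
    """Classical Good-Turing estimates (unsmoothed sketch): symbols seen
    r times share total mass (r+1) * N_{r+1} / n, where N_r counts the
    distinct symbols observed exactly r times."""
    counts = Counter(samples)
    n = sum(counts.values())
    freq_of_freq = Counter(counts.values())          # N_r
    probs = {}
    for symbol, r in counts.items():
        n_r1 = freq_of_freq.get(r + 1, 0)
        # Real implementations smooth N_r first; fall back to the MLE
        # when N_{r+1} = 0.
        probs[symbol] = (r + 1) * n_r1 / (n * freq_of_freq[r]) if n_r1 else r / n
    return probs

def pooled_estimator(samples, bin_width=2):
    """Toy version of the paper's idea: symbols whose frequencies fall in
    the same bin share one estimated probability. The equal-width binning
    rule is an illustrative assumption, not the authors' scheme."""
    counts = Counter(samples)
    n = sum(counts.values())
    bins = {}
    for symbol, r in counts.items():
        bins.setdefault(r // bin_width, []).append(symbol)
    probs = {}
    for symbols in bins.values():
        mass = sum(counts[s] for s in symbols) / n   # empirical mass of the bin
        for s in symbols:
            probs[s] = mass / len(symbols)           # shared equally
    return probs
```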
Global and Local Consistent Age Generative Adversarial Networks
Title | Global and Local Consistent Age Generative Adversarial Networks |
Authors | Peipei Li, Yibo Hu, Qi Li, Ran He, Zhenan Sun |
Abstract | Age progression/regression is a challenging task due to the complicated and non-linear transformations in the human aging process. Many studies have shown that both global and local facial features are essential for face representation, but previous GAN-based methods have mainly focused on the global feature in age synthesis. To utilize both global and local facial information, we propose a Global and Local Consistent Age Generative Adversarial Network (GLCA-GAN). In our generator, a global network learns the whole facial structure and simulates the aging trend of the whole face, while three crucial facial patches are progressed or regressed by three local networks aiming at imitating subtle changes of crucial facial subregions. To preserve most of the details in age-attribute-irrelevant areas, our generator learns the residual face. Moreover, we employ an identity-preserving loss to better preserve the identity information, as well as an age-preserving loss to enhance the accuracy of age synthesis. A pixel loss is also adopted to preserve detailed facial information of the input face. Our proposed method is evaluated on three face aging datasets, i.e., the CACD, Morph, and FG-NET datasets. Experimental results show the appealing performance of the proposed method compared with the state of the art. (A sketch of the loss combination follows this entry.) |
Tasks | |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1801.08390v1 |
http://arxiv.org/pdf/1801.08390v1.pdf | |
PWC | https://paperswithcode.com/paper/global-and-local-consistent-age-generative |
Repo | |
Framework | |
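As a rough illustration of how the abstract's loss terms could combine, here is a hedged PyTorch sketch. The weights, the identity-feature extractor `id_feat`, and the age classifier `age_clf` are all assumptions; the paper defines its own networks and coefficients.

```python
import torch
import torch.nn.functional as F

def glca_gan_generator_loss(x, x_aged, d_fake_logits, id_feat, age_clf,
                            target_age, w_adv=1.0, w_id=1.0, w_age=1.0,
                            w_pix=10.0):
    """Hedged sketch of the generator objective named in the abstract:
    adversarial + identity-preserving + age-preserving + pixel losses.
    id_feat and age_clf stand in for pretrained feature/classifier nets;
    the w_* weights are illustrative, not the paper's values."""
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))   # fool the discriminator
    identity = F.l1_loss(id_feat(x_aged), id_feat(x))    # keep the same person
    age = F.cross_entropy(age_clf(x_aged), target_age)   # hit the target age group
    pixel = F.l1_loss(x_aged, x)                         # preserve input detail
    return w_adv * adv + w_id * identity + w_age * age + w_pix * pixel
```

Note the residual design from the abstract: the generator outputs a residual r(x), and the synthesized face is x_aged = x + r(x), so age-irrelevant regions pass through unchanged.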
Wider Channel Attention Network for Remote Sensing Image Super-resolution
Title | Wider Channel Attention Network for Remote Sensing Image Super-resolution |
Authors | Jun Gu, Guangluan Xu, Yue Zhang, Xian Sun, Ran Wen, Lei Wang |
Abstract | Recently, deep convolutional neural networks (CNNs) have obtained promising results in image processing tasks, including super-resolution (SR). However, most CNN-based SR methods treat low-resolution (LR) inputs and features equally across channels, rarely account for the loss of information flow caused by the activation function, and fail to fully leverage the representation ability of CNNs. In this letter, we propose a novel single-image super-resolution (SISR) algorithm named Wider Channel Attention Network (WCAN) for remote sensing images. Firstly, the channel attention mechanism is used to adaptively recalibrate the importance of each channel at the middle of the wider attention block (WAB). Secondly, we propose the Local Memory Connection (LMC) to enhance the information flow. Finally, the features within each WAB are fused to take advantage of the network’s representation capability and further improve information and gradient flow. Experiments on a public remote sensing data set (UC Merced) show that our WCAN achieves better accuracy and visual quality than most state-of-the-art methods. (A channel-attention sketch follows this entry.) |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-12-13 |
URL | http://arxiv.org/abs/1812.05329v2 |
http://arxiv.org/pdf/1812.05329v2.pdf | |
PWC | https://paperswithcode.com/paper/wider-channel-attention-network-for-remote |
Repo | |
Framework | |
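The channel attention mechanism described above follows the familiar squeeze-and-excitation pattern; a minimal PyTorch sketch is below, with the reduction ratio as an assumed hyperparameter. The paper's WAB wraps this in wider convolutional branches and local memory connections, which are not shown.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style recalibration, as used in the middle
    of each wider attention block (WAB); reduction=16 is an assumption."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # global spatial statistics
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                            # per-channel weights in (0, 1)
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))             # rescale each channel
```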
Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control
Title | Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control |
Authors | Glen Berseth, Cheng Xie, Paul Cernek, Michiel Van de Panne |
Abstract | Deep reinforcement learning has demonstrated increasing capabilities for continuous control problems, including agents that can move with skill and agility through their environment. An open problem in this setting is that of developing good strategies for integrating or merging policies for multiple skills, where each individual policy is a specialist in a specific skill and its associated state distribution. We extend policy distillation methods to the continuous action setting and leverage this technique to combine expert policies, as evaluated in the domain of simulated bipedal locomotion across different classes of terrain. We also introduce an input injection method for augmenting an existing policy network to exploit new input features. Lastly, our method uses transfer learning to assist in the efficient acquisition of new skills. The combination of these methods allows a policy to be incrementally augmented with new skills. We compare our progressive learning and integration via distillation (PLAID) method against three alternative baselines. (A distillation-step sketch follows this entry.) |
Tasks | Continuous Control, Transfer Learning |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04765v1 |
http://arxiv.org/pdf/1802.04765v1.pdf | |
PWC | https://paperswithcode.com/paper/progressive-reinforcement-learning-with |
Repo | |
Framework | |
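The distillation step that PLAID extends to continuous actions can be sketched as supervised regression of the student's actions onto the expert's over states drawn from the expert's state distribution. L2 regression on mean actions is one common choice; the paper's exact objective and data-collection scheme may differ.

```python
import torch
import torch.nn.functional as F

def distill_step(student, teacher, states, optimizer):
    """One supervised distillation step in the continuous-action setting:
    the student regresses onto the expert's mean actions. `student` and
    `teacher` are policy networks mapping states to actions."""
    with torch.no_grad():
        target_actions = teacher(states)       # expert actions, no gradient
    loss = F.mse_loss(student(states), target_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```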
Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students
Title | Knowledge Distillation in Generations: More Tolerant Teachers Educate Better Students |
Authors | Chenglin Yang, Lingxi Xie, Siyuan Qiao, Alan Yuille |
Abstract | We focus on the problem of training a deep neural network in generations. The flowchart is that, in order to optimize the target network (student), another network (teacher) with the same architecture is first trained and used to provide part of the supervision signal in the next stage. While this strategy leads to higher accuracy, many aspects (e.g., why teacher-student optimization helps) still need further exploration. This paper studies the problem from the perspective of controlling the strictness in training the teacher network. Existing approaches mostly used a hard distribution (e.g., one-hot vectors) in training, leading to a strict teacher which itself has high accuracy, but we argue that the teacher needs to be more tolerant, although this often implies lower accuracy. The implementation is very easy, with merely an extra loss term added to the teacher network, facilitating a few secondary classes to emerge and complement the primary class. Consequently, the teacher provides a milder supervision signal (a less peaked distribution) and makes it possible for the student to learn from inter-class similarity, potentially lowering the risk of over-fitting. Experiments are performed on standard image classification tasks (CIFAR100 and ILSVRC2012). Although the teacher network is less powerful, the students show persistent growth in ability and eventually achieve higher classification accuracies than other competitors. Model ensembles and transferred feature extraction also verify the effectiveness of our approach. (A sketch of one possible extra loss term follows this entry.) |
Tasks | Image Classification |
Published | 2018-05-15 |
URL | http://arxiv.org/abs/1805.05551v2 |
http://arxiv.org/pdf/1805.05551v2.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-distillation-in-generations-more |
Repo | |
Framework | |
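A hedged sketch of what an extra loss term that makes the teacher "more tolerant" could look like: alongside cross-entropy, penalize the gap between the primary class probability and the K next-largest ("secondary") classes, so a less peaked distribution emerges. This instantiation and the weight `lam` are assumptions, not necessarily the paper's exact term.

```python
import torch
import torch.nn.functional as F

def tolerant_teacher_loss(logits, labels, k=5, lam=0.5):
    """Cross-entropy plus a softening term: shrink the gap between the
    top-1 probability and the K next-largest classes. One plausible
    instantiation of the abstract's extra loss term, not the authors'
    exact formulation."""
    ce = F.cross_entropy(logits, labels)
    probs = F.softmax(logits, dim=1)
    topk = probs.topk(k + 1, dim=1).values        # primary + K secondary classes
    gap = (topk[:, :1] - topk[:, 1:]).mean()      # primary minus secondaries, >= 0
    return ce + lam * gap
```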
Gaussian Process Regression for Binned Data
Title | Gaussian Process Regression for Binned Data |
Authors | Michael Thomas Smith, Mauricio A Alvarez, Neil D Lawrence |
Abstract | Many datasets are in the form of tables of binned data. Performing regression on these data usually involves either reading off bin heights, ignoring data from neighbouring bins, or interpolating between bins, thus over- or underestimating the true bin integrals. In this paper we propose an elegant method for performing Gaussian Process (GP) regression given such binned data, allowing one to make probabilistic predictions of the latent function which produced the binned data. We look at several applications: first, differentially private regression; second, predictions over other integrals; and third, input regions that are irregularly shaped collections of polytopes. In summary, our method provides an effective way of analysing binned data such that one can use more of the information in the histogram representation, and thus reconstruct a more useful and precise density for making predictions. (A 1-D sketch follows this entry.) |
Tasks | |
Published | 2018-09-06 |
URL | https://arxiv.org/abs/1809.02010v2 |
https://arxiv.org/pdf/1809.02010v2.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-process-regression-for-binned-data |
Repo | |
Framework | |
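A minimal 1-D numpy sketch of the idea: treat each observation as the average of the latent function over its bin, so the required covariances are bin-averaged kernels. The paper handles such integrals properly (including polytope regions); simple quadrature over a grid is our simplification.

```python
import numpy as np

def rbf(a, b, ell=1.0, var=1.0):
    """Squared-exponential kernel between two 1-D point sets."""
    return var * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def gp_predict_from_bins(bin_edges, bin_means, x_star, noise=1e-2, n_quad=50):
    """GP posterior mean given bin *averages* of the latent function.
    Covariances involving bins are kernel averages over the bins,
    approximated here by a uniform quadrature grid."""
    B = len(bin_means)
    quads = [np.linspace(bin_edges[i], bin_edges[i + 1], n_quad)
             for i in range(B)]
    # Cov between two bin averages: double average of the kernel.
    K = np.array([[rbf(qi, qj).mean() for qj in quads] for qi in quads])
    # Cov between a test point and a bin average: single average.
    Ks = np.array([[rbf(np.atleast_1d(xs), q).mean() for q in quads]
                   for xs in x_star])
    alpha = np.linalg.solve(K + noise * np.eye(B), bin_means)
    return Ks @ alpha                                # posterior mean at x_star

# Example: three unit-width bins of an (approximately) linear function.
mean = gp_predict_from_bins([0, 1, 2, 3], np.array([0.5, 1.5, 2.5]),
                            np.array([0.5, 1.5, 2.5]))
```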
Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework
Title | Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework |
Authors | Hao Wang, Ruibin Feng, Chi-Sing Leung |
Abstract | The aim of sparse approximation is to estimate a sparse signal from a measurement matrix and an observation vector. It is widely used in data analytics, image processing, and communication. A great deal of research has been done in this area and many off-the-shelf algorithms have been proposed, but most of them cannot offer a real-time solution, which limits their application prospects. To address this issue, we devise a novel sparse approximation algorithm based on the Lagrange programming neural network (LPNN), the locally competitive algorithm (LCA), and the projection theorem. LPNN and LCA are both analog neural networks, which makes real-time solutions possible. The non-differentiable objective function is handled via the LCA concept. Utilizing the projection theorem, we further modify the dynamics and propose a new system with global asymptotic stability. Simulation results show that the proposed sparse approximation method produces real-time solutions with satisfactory MSEs. (An LCA dynamics sketch follows this entry.) |
Tasks | |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.11949v1 |
http://arxiv.org/pdf/1805.11949v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-l1-minimization-algorithm-for-sparse |
Repo | |
Framework | |
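The continuous-time dynamics at the heart of the LPNN-LCA approach can be illustrated with a plain Euler simulation of the standard LCA for an l1-regularized least-squares problem. This sketch omits the paper's projection-based modification and stability analysis.

```python
import numpy as np

def lca_l1(Phi, y, lam=0.1, dt=0.01, steps=5000):
    """Euler simulation of locally competitive algorithm (LCA) dynamics
    for min 0.5 * ||y - Phi a||^2 + lam * ||a||_1. The analog circuit in
    the paper evolves in continuous time; this discretization is only an
    illustrative stand-in."""
    n = Phi.shape[1]
    u = np.zeros(n)                                  # internal neuron states
    G = Phi.T @ Phi - np.eye(n)                      # lateral inhibition
    b = Phi.T @ y                                    # feed-forward drive
    for _ in range(steps):
        a = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)   # soft threshold
        u += dt * (b - u - G @ a)                    # LCA state update
    return a

# Example: recover a 2-sparse vector from 20 random measurements.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((20, 50)) / np.sqrt(20)
x_true = np.zeros(50); x_true[[3, 27]] = [1.0, -0.5]
a_hat = lca_l1(Phi, Phi @ x_true, lam=0.05)
```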
Active Learning and CSI Acquisition for mmWave Initial Alignment
Title | Active Learning and CSI Acquisition for mmWave Initial Alignment |
Authors | Sung-En Chiu, Nancy Ronquillo, Tara Javidi |
Abstract | Millimeter wave (mmWave) communication with large antenna arrays is a promising technique to enable extremely high data rates due to the large available bandwidth in mmWave frequency bands. In addition, given the knowledge of an optimal directional beamforming vector, large antenna arrays have been shown to overcome both the severe signal attenuation in mmWave and the interference problem. However, fundamental limits on the achievable learning rate of an optimal beamforming vector remain. This paper considers the problem of adaptive and sequential optimization of the beamforming vectors during the initial access phase of communication. With a single-path channel model, the problem is reduced to actively learning the Angle-of-Arrival (AoA) of the signal sent from the user to the Base Station (BS). Drawing on recent results in the design of a hierarchical beamforming codebook [1], sequential measurement-dependent noisy search strategies [2], and active learning from an imperfect labeler [3], an adaptive and sequential alignment algorithm is proposed. An upper bound on the expected search time of the proposed algorithm is derived via the Extrinsic Jensen-Shannon Divergence, which demonstrates that the search time of the proposed algorithm asymptotically matches the performance of noiseless bisection search up to a constant factor. Furthermore, the upper bound shows that the acquired AoA error probability decays exponentially fast with the search time, with an exponent that is a decreasing function of the acquisition rate. Numerically, the proposed algorithm is compared with prior work, and a significant improvement of the system communication rate is observed. Most notably, in the relevant regime of low (-10dB to 5dB) raw SNR, this establishes the first practically viable solution for initial access and, hence, the first demonstration of stand-alone mmWave communication. (A toy noisy-bisection sketch follows this entry.) |
Tasks | Active Learning |
Published | 2018-12-19 |
URL | https://arxiv.org/abs/1812.07722v4 |
https://arxiv.org/pdf/1812.07722v4.pdf | |
PWC | https://paperswithcode.com/paper/active-learning-and-csi-acquisition-for |
Repo | |
Framework | |
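A toy version of the sequential search idea, assuming a discretized angle grid and a known measurement error rate: maintain a posterior over the AoA, probe the half-mass region (standing in for a hierarchical beamforming codebook), and do a Bayes update on each noisy response. This is a caricature of the paper's algorithm, which optimizes the probing region and handles measurement-dependent noise.

```python
import numpy as np

def noisy_bisection_aoa(respond, n_angles=256, steps=30, p_err=0.1):
    """Posterior-matching-style noisy search over a discretized AoA grid.
    respond(region) returns a noisy indicator of whether the true AoA
    index lies in `region`; responses flip with probability p_err."""
    post = np.full(n_angles, 1.0 / n_angles)
    idx = np.arange(n_angles)
    for _ in range(steps):
        order = np.argsort(-post)                      # most probable first
        cut = np.searchsorted(np.cumsum(post[order]), 0.5)
        region = order[: max(1, cut)]                  # ~half the posterior mass
        z = respond(region)
        in_region = np.isin(idx, region)
        like = np.where(in_region == bool(z), 1 - p_err, p_err)
        post = post * like
        post /= post.sum()
    return int(post.argmax())                          # MAP AoA index

# Simulated user at angle index 97 with 10% measurement error.
rng = np.random.default_rng(1)
true_aoa = 97
est = noisy_bisection_aoa(
    lambda region: (true_aoa in region) != (rng.random() < 0.1))
```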
MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection
Title | MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection |
Authors | Wenchi Ma, Yuanwei Wu, Zongbo Wang, Guanghui Wang |
Abstract | Object detection in challenging situations such as scale variation, occlusion, and truncation depends not only on feature details but also on contextual information. Most previous networks emphasize detailed feature extraction through deeper and wider networks, which may enhance the accuracy of object detection to a certain extent. However, feature details are easily changed or washed out after passing through complicated filtering structures. To better handle these challenges, this paper proposes a novel framework, the multi-scale, deep inception convolutional neural network (MDCN), which focuses on wider and broader object regions by activating feature maps produced in the deep part of the network. Instead of incepting inner layers in the shallow part of the network, multi-scale inceptions are introduced in the deep layers. The proposed framework integrates contextual information into the learning process through a single-shot network structure. It is computationally efficient and avoids the hard training problem of previous macro-feature-extraction networks designed for shallow layers. Extensive experiments demonstrate the effectiveness and superior performance of MDCN over state-of-the-art models. (A multi-scale inception sketch follows this entry.) |
Tasks | Object Detection |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.01791v1 |
http://arxiv.org/pdf/1809.01791v1.pdf | |
PWC | https://paperswithcode.com/paper/mdcn-multi-scale-deep-inception-convolutional |
Repo | |
Framework | |
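The "multi-scale inception in deep layers" idea reduces to parallel convolutions with different receptive fields whose outputs are fused; a minimal PyTorch sketch follows, with branch widths and kernel sizes as illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiScaleInception(nn.Module):
    """Sketch of a multi-scale inception block placed in the *deep*
    layers, as MDCN proposes: parallel convolutions with different
    receptive fields capture wider context, then their maps are
    concatenated. Branch widths and kernel sizes are assumptions."""
    def __init__(self, in_ch, branch_ch=64):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, k, padding=k // 2)
            for k in (1, 3, 5)                     # multiple receptive fields
        ])
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(torch.cat([b(x) for b in self.branches], dim=1))
```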
Accelerating Deep Learning with Memcomputing
Title | Accelerating Deep Learning with Memcomputing |
Authors | Haik Manukian, Fabio L. Traversa, Massimiliano Di Ventra |
Abstract | Restricted Boltzmann machines (RBMs) and their extensions, called ‘deep-belief networks’, are powerful neural networks that have found applications in the fields of machine learning and artificial intelligence. The standard way to train these models resorts to an iterative unsupervised procedure based on Gibbs sampling, called ‘contrastive divergence’ (CD), and additional supervised tuning via back-propagation. However, this procedure has been shown not to follow any gradient and can lead to suboptimal solutions. In this paper, we show an efficient alternative to CD by means of simulations of digital memcomputing machines (DMMs). We test our approach on pattern recognition using a modified version of the MNIST data set. DMMs sample effectively the vast phase space defined by the model distribution of the RBM, and provide a very good approximation close to the optimum. This efficient search significantly reduces the number of pretraining iterations necessary to achieve a given level of accuracy, as well as giving a total performance gain over CD. In fact, the acceleration of pretraining achieved by simulating DMMs is comparable, in number of iterations, to the recently reported hardware application of the quantum annealing method on the same network and data set. Notably, however, DMMs perform far better than the reported quantum annealing results in terms of quality of the training. We also compare our method to advances in supervised training, like batch normalization and rectifiers, that work to reduce the advantage of pretraining. We find that the memcomputing method still maintains a quality advantage ($>1\%$ in accuracy, and a $20\%$ reduction in error rate) over these approaches. Furthermore, our method is agnostic about the connectivity of the network. Therefore, it can be extended to train full Boltzmann machines, and even deep networks, at once. (A CD-1 baseline sketch follows this entry.) |
Tasks | |
Published | 2018-01-01 |
URL | http://arxiv.org/abs/1801.00512v3 |
http://arxiv.org/pdf/1801.00512v3.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-deep-learning-with-memcomputing |
Repo | |
Framework | |
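For reference, the CD-1 Gibbs-sampling update that the memcomputing simulation replaces looks like the following numpy sketch (biases omitted for brevity); the DMM-based sampler itself involves simulating analog dynamical systems and is beyond a few lines.

```python
import numpy as np

def cd1_update(W, v0, lr=0.1, rng=np.random.default_rng(0)):
    """One contrastive-divergence (CD-1) weight update for an RBM, i.e.
    the Gibbs-sampling baseline the memcomputing approach is compared
    against. W has shape (n_visible, n_hidden); v0 is a batch of visible
    vectors with shape (batch, n_visible)."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    h0_prob = sigmoid(v0 @ W)                        # positive phase
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1_prob = sigmoid(h0 @ W.T)                      # one-step reconstruction
    h1_prob = sigmoid(v1_prob @ W)                   # negative phase
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / v0.shape[0]
    return W
```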
Frank-Wolfe with Subsampling Oracle
Title | Frank-Wolfe with Subsampling Oracle |
Authors | Thomas Kerdreux, Fabian Pedregosa, Alexandre d’Aspremont |
Abstract | We analyze two novel randomized variants of the Frank-Wolfe (FW) or conditional gradient algorithm. While classical FW algorithms require solving a linear minimization problem over the domain at each iteration, the proposed method only requires solving a linear minimization problem over a small \emph{subset} of the original domain. The first algorithm we propose is a randomized variant of the original FW algorithm and achieves an $\mathcal{O}(1/t)$ sublinear convergence rate, as in the deterministic counterpart. The second algorithm is a randomized variant of the Away-step FW algorithm and, again like its deterministic counterpart, reaches a linear (i.e., exponential) convergence rate, making it the first provably convergent randomized variant of Away-step FW. In both cases, while subsampling reduces the convergence rate by a constant factor, the linear minimization step can be a fraction of the cost of that of the deterministic versions, especially when the data is streamed. We illustrate computational gains of the algorithms on regression problems involving both $\ell_1$ and latent group lasso penalties. (A subsampled-FW sketch follows this entry.) |
Tasks | |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07348v1 |
http://arxiv.org/pdf/1803.07348v1.pdf | |
PWC | https://paperswithcode.com/paper/frank-wolfe-with-subsampling-oracle |
Repo | |
Framework | |
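A minimal numpy sketch of the first variant's mechanic over the l1 ball, whose vertices are signed coordinate directions: at each iteration the linear minimization oracle only looks at a random subset of vertices. The l1-ball domain and the 2/(t+2) step size are standard illustrative choices, not tied to the paper's experiments.

```python
import numpy as np

def subsampled_fw_l1(grad_f, x0, radius=1.0, n_sub=10, steps=500,
                     rng=np.random.default_rng(0)):
    """Randomized Frank-Wolfe over the l1 ball: each iteration solves the
    linear minimization only over a random subset of signed coordinate
    directions (the ball's vertices)."""
    x = x0.copy()
    d = x.size
    for t in range(steps):
        g = grad_f(x)
        idx = rng.choice(d, size=n_sub, replace=False)   # subsampled oracle
        i = idx[np.argmax(np.abs(g[idx]))]               # best sampled vertex
        s = np.zeros(d)
        s[i] = -radius * np.sign(g[i])                   # LMO over the subset
        x += 2.0 / (t + 2.0) * (s - x)                   # FW step
    return x

# Example: least squares min ||Ax - b||^2 over the l1 ball.
A = np.random.default_rng(1).standard_normal((30, 20))
b = A @ (np.eye(20)[0] * 0.5)                            # sparse ground truth
x_hat = subsampled_fw_l1(lambda x: 2 * A.T @ (A @ x - b), np.zeros(20))
```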
Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization
Title | Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization |
Authors | Ozsel Kilinc, Ismail Uysal |
Abstract | In this paper, we propose a novel unsupervised clustering approach exploiting the hidden information that is indirectly introduced through a pseudo classification objective. Specifically, we randomly assign a pseudo parent-class label to each observation, which is then modified by applying the domain-specific transformation associated with the assigned label. Generated pseudo observation-label pairs are subsequently used to train a neural network with an Auto-clustering Output Layer (ACOL) that introduces multiple softmax nodes for each pseudo parent-class. Due to the unsupervised objective based on Graph-based Activity Regularization (GAR) terms, the softmax duplicates of each parent-class become specialized, as the hidden information captured with the help of domain-specific transformations is propagated during training. Ultimately, we obtain a k-means-friendly latent representation. Furthermore, we demonstrate how the chosen transformation type impacts performance and helps propagate the latent information that is useful in revealing unknown clusters. Our results show state-of-the-art performance for unsupervised clustering tasks on the MNIST, SVHN and USPS datasets, with the highest accuracies reported to date in the literature. (A pseudo-supervision sketch follows this entry.) |
Tasks | Unsupervised Image Classification |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.03063v1 |
http://arxiv.org/pdf/1802.03063v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-latent-representations-in-neural |
Repo | |
Framework | |
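The pseudo-supervision step can be sketched in a few lines: draw a random pseudo parent-class per image and apply the transformation tied to it (90-degree rotations are one natural choice for image data; the paper studies how this choice matters). The ACOL output layer and GAR regularizers that consume these pairs are not shown.

```python
import numpy as np

def make_pseudo_pairs(images, rng=np.random.default_rng(0)):
    """Sketch of the pseudo-supervision step: each image gets a random
    pseudo parent-class label, and the transformation associated with
    that label (here, label * 90-degree rotation, an illustrative
    assumption) is applied. The resulting pairs train a network with
    multiple softmax nodes per parent-class (ACOL)."""
    labels = rng.integers(0, 4, size=len(images))        # 4 pseudo parents
    transformed = np.stack([np.rot90(img, k=int(lab), axes=(0, 1))
                            for img, lab in zip(images, labels)])
    return transformed, labels

# Example with dummy 28x28 "images".
imgs = np.zeros((8, 28, 28), dtype=np.float32)
x_pseudo, y_pseudo = make_pseudo_pairs(imgs)
```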
This Looks Like That: Deep Learning for Interpretable Image Recognition
Title | This Looks Like That: Deep Learning for Interpretable Image Recognition |
Authors | Chaofan Chen, Oscar Li, Chaofan Tao, Alina Jade Barnett, Jonathan Su, Cynthia Rudin |
Abstract | When we are faced with challenging image classification tasks, we often explain our reasoning by dissecting the image and pointing out prototypical aspects of one class or another. The mounting evidence for each of the classes helps us make our final decision. In this work, we introduce a deep network architecture, the prototypical part network (ProtoPNet), that reasons in a similar way: the network dissects the image by finding prototypical parts and combines evidence from the prototypes to make a final classification. The model thus reasons in a way that is qualitatively similar to the way ornithologists, physicians, and others would explain how to solve challenging image classification tasks. The network uses only image-level labels for training, without any annotations for parts of images. We demonstrate our method on the CUB-200-2011 dataset and the Stanford Cars dataset. Our experiments show that ProtoPNet can achieve accuracy comparable to its analogous non-interpretable counterpart, and when several ProtoPNets are combined into a larger network, it can achieve accuracy on par with some of the best-performing deep models. Moreover, ProtoPNet provides a level of interpretability that is absent in other interpretable deep models. (A prototype-layer sketch follows this entry.) |
Tasks | Image Classification |
Published | 2018-06-27 |
URL | https://arxiv.org/abs/1806.10574v5 |
https://arxiv.org/pdf/1806.10574v5.pdf | |
PWC | https://paperswithcode.com/paper/this-looks-like-that-deep-learning-for |
Repo | |
Framework | |
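The core "this looks like that" computation is a prototype layer: squared distances between each spatial patch of the convolutional feature map and each learned prototype, with the closest patch's distance mapped to a similarity score via a log-based activation, as the paper describes. Prototype count and channel dimension below are assumptions, and real ProtoPNet prototypes span small spatial patches rather than the 1x1 locations used here.

```python
import torch
import torch.nn as nn

class PrototypeLayer(nn.Module):
    """Sketch of ProtoPNet's prototype layer: for each prototype, find
    the closest feature-map location and convert its squared distance
    into a similarity score."""
    def __init__(self, n_prototypes=200, channels=512, eps=1e-4):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, channels))
        self.eps = eps

    def forward(self, feat):                       # feat: (B, C, H, W)
        B = feat.shape[0]
        patches = feat.flatten(2).transpose(1, 2)  # (B, H*W, C) 1x1 patches
        protos = self.prototypes.unsqueeze(0).expand(B, -1, -1)
        d = torch.cdist(patches, protos) ** 2      # (B, H*W, P) squared dists
        min_d = d.min(dim=1).values                # closest patch per prototype
        return torch.log((min_d + 1) / (min_d + self.eps))   # similarity scores
```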
Understanding Human Behaviors in Crowds by Imitating the Decision-Making Process
Title | Understanding Human Behaviors in Crowds by Imitating the Decision-Making Process |
Authors | Haosheng Zou, Hang Su, Shihong Song, Jun Zhu |
Abstract | Crowd behavior understanding is crucial yet challenging across a wide range of applications, since crowd behavior is inherently determined by a sequential decision-making process based on various factors, such as the pedestrians’ own destinations, interaction with nearby pedestrians, and anticipation of upcoming events. In this paper, we propose a novel framework of Social-Aware Generative Adversarial Imitation Learning (SA-GAIL) to mimic the underlying decision-making process of pedestrians in crowds. Specifically, we infer the latent factors of the human decision-making process in an unsupervised manner by extending the Generative Adversarial Imitation Learning framework to anticipate future paths of pedestrians. Different factors of human decision making are disentangled with mutual information maximization, with the process modeled by collision-avoidance regularization and Social-Aware LSTMs. Experimental results demonstrate the potential of our framework in disentangling the latent decision-making factors of pedestrians and its stronger ability to predict future trajectories. (A GAIL-reward sketch follows this entry.) |
Tasks | Decision Making, Imitation Learning |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1801.08391v1 |
http://arxiv.org/pdf/1801.08391v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-human-behaviors-in-crowds-by |
Repo | |
Framework | |
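At the base of SA-GAIL sits the standard GAIL signal: a discriminator trained to tell expert from generated state-action pairs, whose output becomes a surrogate reward for the policy. The sketch below shows only that base reward, with `discriminator` assumed to be a logit-producing network; the social-aware LSTM encoder, collision-avoidance regularizer, and mutual-information term are omitted.

```python
import torch

def gail_reward(discriminator, states, actions):
    """Standard GAIL surrogate reward: the policy is rewarded for making
    its state-action pairs look like the expert's to the discriminator.
    `discriminator(states, actions)` is assumed to return raw logits."""
    with torch.no_grad():
        d = torch.sigmoid(discriminator(states, actions))  # P(expert | s, a)
    return -torch.log(1.0 - d + 1e-8)                      # higher when D is fooled
```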