May 7, 2019

Paper Group AWR 72

Deep Convolutional Neural Networks with Merge-and-Run Mappings. Deep Learning For Smile Recognition. Parallelized Tensor Train Learning of Polynomial Classifiers. Generating Images with Perceptual Similarity Metrics based on Deep Networks. Object Recognition with and without Objects. Neural Semantic Role Labeling with Dependency Path Embeddings. Va …

Deep Convolutional Neural Networks with Merge-and-Run Mappings


Title	Deep Convolutional Neural Networks with Merge-and-Run Mappings
Authors	Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wenjun Zeng
Abstract	A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow. To further reduce the training difficulty, we present a simple network architecture, deep merge-and-run neural networks. The novelty lies in a modularized building block, merge-and-run block, which assembles residual branches in parallel through a merge-and-run mapping: Average the inputs of these residual branches (Merge), and add the average to the output of each residual branch as the input of the subsequent residual branch (Run), respectively. We show that the merge-and-run mapping is a linear idempotent function in which the transformation matrix is idempotent, and thus improves information flow, making training easy. In comparison to residual networks, our networks enjoy compelling advantages: they contain much shorter paths, and the width, i.e., the number of channels, is increased. We evaluate the performance on the standard recognition tasks. Our approach demonstrates consistent improvements over ResNets with the comparable setup, and achieves competitive results (e.g., 3.57% testing error on CIFAR-10, 19.00% on CIFAR-100, 1.51% on SVHN).
Tasks
Published	2016-11-23
URL	http://arxiv.org/abs/1611.07718v2
PDF	http://arxiv.org/pdf/1611.07718v2.pdf
PWC	https://paperswithcode.com/paper/deep-convolutional-neural-networks-with-merge
Repo	https://github.com/homles11/IGCV3
Framework	tf
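
The merge-and-run mapping described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes two parallel branches, scalar inputs in place of feature maps, and hypothetical names (merge_and_run, matmul). It shows one Merge/Run step and checks that the 2x2 averaging matrix is idempotent (M times M equals M).

```python
def merge_and_run(inputs, branches):
    """One merge-and-run step on scalar stand-ins for feature maps."""
    avg = sum(inputs) / len(inputs)                          # Merge: average the branch inputs
    return [f(x) + avg for f, x in zip(branches, inputs)]   # Run: add the average to each branch output


def matmul(a, b):
    """Plain-Python matrix product for small lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Averaging matrix for two branches (each identity block reduced to a scalar).
M = [[0.5, 0.5],
     [0.5, 0.5]]
MM = matmul(M, M)

# Two toy "residual branches" (hypothetical functions chosen for illustration).
outputs = merge_and_run([1.0, 3.0], [lambda x: 2 * x, lambda x: -x])
print(MM == M, outputs)
```

Because applying the averaging matrix twice gives the same result as applying it once, stacking several such blocks does not repeatedly shrink the skipped signal, which is the property the abstract links to improved information flow.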

Deep Learning For Smile Recognition


Title	Deep Learning For Smile Recognition
Authors	Patrick O. Glauner
Abstract	Inspired by recent successes of deep learning in computer vision, we propose a novel application of deep convolutional neural networks to facial expression recognition, in particular smile recognition. A smile recognition test accuracy of 99.45% is achieved for the Denver Intensity of Spontaneous Facial Action (DISFA) database, significantly outperforming existing approaches based on hand-crafted features with accuracies ranging from 65.55% to 79.67%. The novelty of this approach includes a comprehensive model selection of the architecture parameters, which makes it possible to find an appropriate architecture for each expression, such as smile. This is feasible because all experiments were run on a Tesla K40c GPU, giving a speedup of a factor of 10 over traditional computations on a CPU.
Tasks	Facial Expression Recognition, Model Selection, Smile Recognition
Published	2016-01-30
URL	http://arxiv.org/abs/1602.00172v2
PDF	http://arxiv.org/pdf/1602.00172v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-smile-recognition
Repo	https://github.com/cem8301/EmotionDetector
Framework	none

Parallelized Tensor Train Learning of Polynomial Classifiers


Title	Parallelized Tensor Train Learning of Polynomial Classifiers
Authors	Zhongming Chen, Kim Batselier, Johan A. K. Suykens, Ngai Wong
Abstract	In pattern classification, polynomial classifiers are well-studied methods as they are capable of generating complex decision surfaces. Unfortunately, the use of multivariate polynomials is limited to kernels as in support vector machines, because polynomials quickly become impractical for high-dimensional problems. In this paper, we effectively overcome the curse of dimensionality by employing the tensor train format to represent a polynomial classifier. Based on the structure of tensor trains, two learning algorithms are proposed which involve solving different optimization problems of low computational complexity. Furthermore, we show how both regularization to prevent overfitting and parallelization, which enables the use of large training sets, are incorporated into these methods. Both the efficiency and efficacy of our tensor-based polynomial classifier are then demonstrated on the two popular datasets USPS and MNIST.
Tasks
Published	2016-12-20
URL	http://arxiv.org/abs/1612.06505v4
PDF	http://arxiv.org/pdf/1612.06505v4.pdf
PWC	https://paperswithcode.com/paper/parallelized-tensor-train-learning-of
Repo	https://github.com/kbatseli/TTClassifier
Framework	none

Generating Images with Perceptual Similarity Metrics based on Deep Networks


Title	Generating Images with Perceptual Similarity Metrics based on Deep Networks
Authors	Alexey Dosovitskiy, Thomas Brox
Abstract	Image-generating machine learning models are typically trained with loss functions based on distance in the image space. This often leads to over-smoothed results. We propose a class of loss functions, which we call deep perceptual similarity metrics (DeePSiM), that mitigate this problem. Instead of computing distances in the image space, we compute distances between image features extracted by deep neural networks. This metric better reflects perceptual similarity of images and thus leads to better results. We show three applications: autoencoder training, a modification of a variational autoencoder, and inversion of deep convolutional networks. In all cases, the generated images look sharp and resemble natural images.
Tasks	Image Generation
Published	2016-02-08
URL	http://arxiv.org/abs/1602.02644v2
PDF	http://arxiv.org/pdf/1602.02644v2.pdf
PWC	https://paperswithcode.com/paper/generating-images-with-perceptual-similarity
Repo	https://github.com/Evolving-AI-Lab/synthesizing
Framework	caffe2
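
The contrast between image-space and feature-space distances can be sketched on toy data. Everything here is an assumption for illustration: toy_features is a hypothetical stand-in for a deep network (it only takes differences between neighbouring pixels), images are short lists of numbers, and only the feature-distance idea from the abstract is shown, not the full DeePSiM loss.

```python
def toy_features(image):
    # Hypothetical stand-in for a deep feature extractor:
    # differences between neighbouring pixels (edge-like responses).
    return [b - a for a, b in zip(image, image[1:])]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def image_space_loss(generated, target):
    # Distance measured directly on pixel values.
    return squared_distance(generated, target)


def feature_space_loss(generated, target, features=toy_features):
    # Distance measured between extracted features instead of pixels.
    return squared_distance(features(generated), features(target))


target = [0.0, 1.0, 0.0, 1.0]    # textured 1-D "image"
blurry = [0.5, 0.5, 0.5, 0.5]    # over-smoothed candidate
shifted = [1.0, 2.0, 1.0, 2.0]   # same texture, different brightness

print(image_space_loss(blurry, target), image_space_loss(shifted, target))
print(feature_space_loss(blurry, target), feature_space_loss(shifted, target))
```

In this toy setting the pixel loss prefers the flat, over-smoothed candidate, while the feature loss prefers the candidate that keeps the texture, which mirrors the motivation given in the abstract.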

Object Recognition with and without Objects


Title	Object Recognition with and without Objects
Authors	Zhuotun Zhu, Lingxi Xie, Alan L. Yuille
Abstract	While recent deep neural networks have achieved a promising performance on object recognition, they rely implicitly on the visual contents of the whole image. In this paper, we train deep neural networks on the foreground (object) and background (context) regions of images respectively. Considering human recognition in the same situations, networks trained on the pure background without objects achieve highly reasonable recognition performance that beats humans by a large margin if only given context. However, humans still outperform networks with pure object available, which indicates networks and human beings have different mechanisms in understanding an image. Furthermore, we straightforwardly combine multiple trained networks to explore different visual cues learned by different networks. Experiments show that useful visual hints can be explicitly learned separately and then combined to achieve higher performance, which verifies the advantages of the proposed framework.
Tasks	Object Recognition
Published	2016-11-20
URL	http://arxiv.org/abs/1611.06596v3
PDF	http://arxiv.org/pdf/1611.06596v3.pdf
PWC	https://paperswithcode.com/paper/object-recognition-with-and-without-objects
Repo	https://github.com/sunformoon/ObjectRecognitionWithWithoutObjects
Framework	none

Neural Semantic Role Labeling with Dependency Path Embeddings


Title	Neural Semantic Role Labeling with Dependency Path Embeddings
Authors	Michael Roth, Mirella Lapata
Abstract	This paper introduces a novel model for semantic role labeling that makes use of neural sequence modeling techniques. Our approach is motivated by the observation that complex syntactic structures and related phenomena, such as nested subordinations and nominal predicates, are not handled well by existing models. Our model treats such instances as sub-sequences of lexicalized dependency paths and learns suitable embedding representations. We experimentally demonstrate that such embeddings can improve results over previous state-of-the-art semantic role labelers, and showcase qualitative improvements obtained by our method.
Tasks	Semantic Role Labeling
Published	2016-05-24
URL	http://arxiv.org/abs/1605.07515v2
PDF	http://arxiv.org/pdf/1605.07515v2.pdf
PWC	https://paperswithcode.com/paper/neural-semantic-role-labeling-with-dependency
Repo	https://github.com/microth/PathLSTM
Framework	none

Variations of the Similarity Function of TextRank for Automated Summarization


Title	Variations of the Similarity Function of TextRank for Automated Summarization
Authors	Federico Barrios, Federico López, Luis Argerich, Rosa Wachenchauzer
Abstract	This article presents new alternatives to the similarity function for the TextRank algorithm for automatic summarization of texts. We describe the generalities of the algorithm and the different functions we propose. Some of these variants achieve a significant improvement using the same metrics and dataset as the original publication.
Tasks
Published	2016-02-11
URL	http://arxiv.org/abs/1602.03606v1
PDF	http://arxiv.org/pdf/1602.03606v1.pdf
PWC	https://paperswithcode.com/paper/variations-of-the-similarity-function-of
Repo	https://github.com/jaumeCloquellCapo/text-summaritzation-textRank
Framework	none
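
For context, the baseline that such variants replace is the sentence similarity from the original TextRank publication: the number of shared words divided by the sum of the logarithms of the two sentence lengths. The sketch below shows only that baseline, not the alternatives proposed in this paper. The whitespace tokenizer, the lowercasing, and the function name textrank_similarity are assumptions made for illustration.

```python
import math


def textrank_similarity(sentence_a, sentence_b):
    """Word-overlap similarity normalized by log sentence lengths."""
    words_a = sentence_a.lower().split()   # assumed tokenizer: whitespace split
    words_b = sentence_b.lower().split()
    if not words_a or not words_b:
        return 0.0
    denominator = math.log(len(words_a)) + math.log(len(words_b))
    if denominator == 0:                   # both sentences contain a single word
        return 0.0
    overlap = len(set(words_a) & set(words_b))
    return overlap / denominator


score = textrank_similarity("the cat sat", "the cat ran")
print(score)
```

In a summarizer these scores become edge weights in a sentence graph, so swapping in a different similarity function changes the graph while the ranking step stays the same.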

Variational Mixture Models with Gamma or inverse-Gamma components


Title	Variational Mixture Models with Gamma or inverse-Gamma components
Authors	A. Llera, D. Vidaurre, R. H. R. Pruim, C. F. Beckmann
Abstract	Mixture models with Gamma and/or inverse-Gamma distributed mixture components are useful for medical image tissue segmentation or as post-hoc models for regression coefficients obtained from linear regression within a Generalised Linear Modeling framework (GLM), used in this case to separate stochastic (Gaussian) noise from some kind of positive or negative “activation” (modeled as Gamma or inverse-Gamma distributed). To date, the most common choice in this context is Gaussian/Gamma mixture models learned through a maximum likelihood (ML) approach; we recently extended this algorithm to mixture models with inverse-Gamma components. Here, we introduce a fully analytical Variational Bayes (VB) learning framework for both Gamma and/or inverse-Gamma components. We use synthetic and resting state fMRI data to compare the performance of the ML and VB algorithms in terms of area under the curve and computational cost. We observed that the ML Gaussian/Gamma model is very expensive, especially when considering high resolution images; furthermore, these solutions are highly variable and they occasionally can overestimate the activations severely. The Bayesian Gauss-Gamma is in general the fastest algorithm but provides solutions that are too dense. The maximum likelihood Gaussian/inverse-Gamma is also very fast but provides in general very sparse solutions. The variational Gaussian/inverse-Gamma mixture model is the most robust and its cost is acceptable even for high resolution images. Further, the presented methodology represents an essential building block that can be directly used in more complex inference tasks, specially designed to analyse MRI-fMRI data; such models include for example analytical variational mixture models with adaptive spatial regularization or better source models for new spatial blind source separation approaches.
Tasks
Published	2016-07-26
URL	http://arxiv.org/abs/1607.07573v1
PDF	http://arxiv.org/pdf/1607.07573v1.pdf
PWC	https://paperswithcode.com/paper/variational-mixture-models-with-gamma-or
Repo	https://github.com/allera/One_Dim_Mixture_Models
Framework	none

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving


Title	SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
Authors	Bichen Wu, Alvin Wan, Forrest Iandola, Peter H. Jin, Kurt Keutzer
Abstract	Object detection is a crucial task for autonomous driving. In addition to requiring high accuracy to ensure safety, object detection for autonomous driving also requires real-time inference speed to guarantee prompt vehicle control, as well as small model size and energy efficiency to enable embedded system deployment. In this work, we propose SqueezeDet, a fully convolutional neural network for object detection that aims to simultaneously satisfy all of the above constraints. In our network, we use convolutional layers not only to extract feature maps but also as the output layer to compute bounding boxes and class probabilities. The detection pipeline of our model only contains a single forward pass of a neural network, thus it is extremely fast. Our model is fully-convolutional, which leads to a small model size and better energy efficiency. While achieving the same accuracy as previous baselines, our model is 30.4x smaller, 19.7x faster, and consumes 35.2x lower energy. The code is open-sourced at https://github.com/BichenWuUCB/squeezeDet.
Tasks	Autonomous Driving, Object Detection, Real-Time Object Detection
Published	2016-12-04
URL	https://arxiv.org/abs/1612.01051v4
PDF	https://arxiv.org/pdf/1612.01051v4.pdf
PWC	https://paperswithcode.com/paper/squeezedet-unified-small-low-power-fully
Repo	https://github.com/fregu856/2D_detection
Framework	tf

Multilingual Twitter Sentiment Classification: The Role of Human Annotators


Title	Multilingual Twitter Sentiment Classification: The Role of Human Annotators
Authors	Igor Mozetic, Miha Grcar, Jasmina Smailovic
Abstract	What are the limits of automated Twitter sentiment classification? We analyze a large set of manually labeled tweets in different languages, use them as training data, and construct automated classification models. It turns out that the quality of classification models depends much more on the quality and size of training data than on the type of the model trained. Experimental results indicate that there is no statistically significant difference between the performance of the top classification models. We quantify the quality of training data by applying various annotator agreement measures, and identify the weakest points of different datasets. We show that the model performance approaches the inter-annotator agreement when the size of the training set is sufficiently large. However, it is crucial to regularly monitor the self- and inter-annotator agreements since this improves the training datasets and consequently the model performance. Finally, we show that there is strong evidence that humans perceive the sentiment classes (negative, neutral, and positive) as ordered.
Tasks	Sentiment Analysis
Published	2016-02-24
URL	http://arxiv.org/abs/1602.07563v2
PDF	http://arxiv.org/pdf/1602.07563v2.pdf
PWC	https://paperswithcode.com/paper/multilingual-twitter-sentiment-classification
Repo	https://github.com/joemzhao/tweets-retriever
Framework	none

Fitted Learning: Models with Awareness of their Limits


Title	Fitted Learning: Models with Awareness of their Limits
Authors	Navid Kardan, Kenneth O. Stanley
Abstract	Though deep learning has pushed the boundaries of classification forward, in recent years hints of the limits of standard classification have begun to emerge. Problems such as fooling, adding new classes over time, and the need to retrain learning models only for small changes to the original problem all point to a potential shortcoming in the classic classification regime, where a comprehensive a priori knowledge of the possible classes or concepts is critical. Without such knowledge, classifiers misjudge the limits of their knowledge and overgeneralization therefore becomes a serious obstacle to consistent performance. In response to these challenges, this paper extends the classic regime by reframing classification instead with the assumption that concepts present in the training set are only a sample of the hypothetical final set of concepts. To bring learning models into this new paradigm, a novel elaboration of standard architectures called the competitive overcomplete output layer (COOL) neural network is introduced. Experiments demonstrate the effectiveness of COOL by applying it to fooling, separable concept learning, one-class neural networks, and standard classification benchmarks. The results suggest that, unlike conventional classifiers, the amount of generalization in COOL networks can be tuned to match the problem.
Tasks
Published	2016-09-07
URL	http://arxiv.org/abs/1609.02226v4
PDF	http://arxiv.org/pdf/1609.02226v4.pdf
PWC	https://paperswithcode.com/paper/fitted-learning-models-with-awareness-of
Repo	https://github.com/ndkn/fitted-learning
Framework	torch

On the Use of Sparse Filtering for Covariate Shift Adaptation


Title	On the Use of Sparse Filtering for Covariate Shift Adaptation
Authors	Fabio Massimo Zennaro, Ke Chen
Abstract	In this paper we formally analyse the use of sparse filtering algorithms to perform covariate shift adaptation. We provide a theoretical analysis of sparse filtering by evaluating the conditions required to perform covariate shift adaptation. We prove that sparse filtering can perform adaptation only if the conditional distribution of the labels has a structure explained by a cosine metric. To overcome this limitation, we propose a new algorithm, named periodic sparse filtering, and carry out the same theoretical analysis regarding covariate shift adaptation. We show that periodic sparse filtering can perform adaptation under the looser and more realistic requirement that the conditional distribution of the labels has a periodic structure, which may be satisfied, for instance, by user-dependent data sets. We experimentally validate our theoretical results on synthetic data. Moreover, we apply periodic sparse filtering to real-world data sets to demonstrate that this simple and computationally efficient algorithm is able to achieve competitive performances.
Tasks
Published	2016-07-22
URL	http://arxiv.org/abs/1607.06781v2
PDF	http://arxiv.org/pdf/1607.06781v2.pdf
PWC	https://paperswithcode.com/paper/on-the-use-of-sparse-filtering-for-covariate
Repo	https://github.com/FMZennaro/PSF
Framework	none

Pruning Convolutional Neural Networks for Resource Efficient Inference


Title	Pruning Convolutional Neural Networks for Resource Efficient Inference
Authors	Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz
Abstract	We propose a new formulation for pruning convolutional kernels in neural networks to enable efficient inference. We interleave greedy criteria-based pruning with fine-tuning by backpropagation - a computationally efficient procedure that maintains good generalization in the pruned network. We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters. We focus on transfer learning, where large pretrained networks are adapted to specialized tasks. The proposed criterion demonstrates superior performance compared to other criteria, e.g. the norm of kernel weights or feature map activation, for pruning large CNNs after adaptation to fine-grained classification tasks (Birds-200 and Flowers-102) relying only on the first order gradient information. We also show that pruning can lead to more than 10x theoretical (5x practical) reduction in adapted 3D-convolutional filters with a small drop in accuracy in a recurrent gesture classifier. Finally, we show results for the large-scale ImageNet dataset to emphasize the flexibility of our approach.
Tasks	Transfer Learning
Published	2016-11-19
URL	http://arxiv.org/abs/1611.06440v2
PDF	http://arxiv.org/pdf/1611.06440v2.pdf
PWC	https://paperswithcode.com/paper/pruning-convolutional-neural-networks-for
Repo	https://github.com/dongkwan-kim/Adaptive-Forgetting
Framework	tf
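
A first-order Taylor expansion of the cost around a zeroed feature map gives an estimated cost change of roughly |gradient x activation|, which needs only first-order gradients. The sketch below ranks feature maps by that quantity. It is a simplified illustration, not the authors' code: activations and gradients are given as plain lists, the per-map average over entries is an assumption of this sketch, and the name taylor_scores is hypothetical.

```python
def taylor_scores(activations, gradients):
    """Estimate |change in cost| from removing each feature map.

    activations[i] and gradients[i] hold the entries of feature map i and
    the gradient of the cost with respect to those entries.
    """
    scores = []
    for acts, grads in zip(activations, gradients):
        mean_product = sum(a * g for a, g in zip(acts, grads)) / len(acts)
        scores.append(abs(mean_product))
    return scores


# Toy values for three feature maps with two entries each.
acts = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]
grads = [[0.1, -0.1], [0.0, 0.02], [0.5, 0.5]]

scores = taylor_scores(acts, grads)
# Greedy step: the map with the smallest estimated cost change is pruned first.
prune_index = min(range(len(scores)), key=scores.__getitem__)
print(scores, prune_index)
```

In the procedure the abstract describes, a greedy step like this alternates with fine-tuning by backpropagation before the next map is chosen.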

Biconvex Relaxation for Semidefinite Programming in Computer Vision


Title	Biconvex Relaxation for Semidefinite Programming in Computer Vision
Authors	Sohil Shah, Abhay Kumar, Carlos Castillo, David Jacobs, Christoph Studer, Tom Goldstein
Abstract	Semidefinite programming is an indispensable tool in computer vision, but general-purpose solvers for semidefinite programs are often too slow and memory intensive for large-scale problems. We propose a general framework to approximately solve large-scale semidefinite problems (SDPs) at low complexity. Our approach, referred to as biconvex relaxation (BCR), transforms a general SDP into a specific biconvex optimization problem, which can then be solved in the original, low-dimensional variable space at low complexity. The resulting biconvex problem is solved using an efficient alternating minimization (AM) procedure. Since AM has the potential to get stuck in local minima, we propose a general initialization scheme that enables BCR to start close to a global optimum - this is key for our algorithm to quickly converge to optimal or near-optimal solutions. We showcase the efficacy of our approach on three applications in computer vision, namely segmentation, co-segmentation, and manifold metric learning. BCR achieves solution quality comparable to state-of-the-art SDP methods with speedups between 4X and 35X. At the same time, BCR handles a more general set of SDPs than previous approaches, which are more specialized.
Tasks	Metric Learning
Published	2016-05-31
URL	http://arxiv.org/abs/1605.09527v2
PDF	http://arxiv.org/pdf/1605.09527v2.pdf
PWC	https://paperswithcode.com/paper/biconvex-relaxation-for-semidefinite
Repo	https://github.com/Axeldnahcram/biconvex_relaxation
Framework	none

Approximation Vector Machines for Large-scale Online Learning


Title	Approximation Vector Machines for Large-scale Online Learning
Authors	Trung Le, Tu Dinh Nguyen, Vu Nguyen, Dinh Phung
Abstract	One of the most challenging problems in kernel online learning is to bound the model size and to promote the model sparsity. Sparse models not only improve computation and memory usage, but also enhance the generalization capacity, a principle that concurs with the law of parsimony. However, inappropriate sparsity modeling may also significantly degrade the performance. In this paper, we propose Approximation Vector Machine (AVM), a model that can simultaneously encourage the sparsity and safeguard its risk in compromising the performance. When an incoming instance arrives, we approximate this instance by one of its neighbors whose distance to it is less than a predefined threshold. Our key intuition is that since the newly seen instance is expressed by its nearby neighbor the optimal performance can be analytically formulated and maintained. We develop theoretical foundations to support this intuition and further establish an analysis to characterize the gap between the approximation and optimal solutions. This gap crucially depends on the frequency of approximation and the predefined threshold. We perform the convergence analysis for a wide spectrum of loss functions including Hinge, smooth Hinge, and Logistic for classification task, and $l_1$, $l_2$, and $\epsilon$-insensitive for regression task. We conducted extensive experiments for classification task in batch and online modes, and regression task in online mode over several benchmark datasets. The results show that our proposed AVM achieved a comparable predictive performance with current state-of-the-art methods while simultaneously achieving significant computational speed-up due to the ability of the proposed AVM in maintaining the model size.
Tasks
Published	2016-04-22
URL	http://arxiv.org/abs/1604.06518v4
PDF	http://arxiv.org/pdf/1604.06518v4.pdf
PWC	https://paperswithcode.com/paper/approximation-vector-machines-for-large-scale
Repo	https://github.com/tund/avm
Framework	none
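
The approximation step from the abstract, replacing an incoming instance with a stored neighbor that lies within a predefined threshold, can be sketched as follows. This covers only that step, not the AVM update rule or its analysis. Choosing the nearest qualifying neighbor and storing the instance when none qualifies are assumptions of this sketch, and the names approximate, core_points and delta are hypothetical.

```python
import math


def approximate(x, core_points, delta):
    """Return a stored point closer than delta to x, or store x and return it."""
    best, best_distance = None, delta
    for c in core_points:
        d = math.dist(x, c)            # Euclidean distance (Python 3.8+)
        if d < best_distance:
            best, best_distance = c, d
    if best is None:
        core_points.append(x)          # assumption: unmatched instances become new core points
        return x
    return best


core = []
stream = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.95, 1.0)]
approximated = [approximate(x, core, delta=0.2) for x in stream]
print(approximated, len(core))
```

Only two of the four instances end up stored, which illustrates how a larger threshold bounds the model size at the price of coarser approximation, the trade-off the abstract ties to the threshold and the frequency of approximation.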