Paper Group AWR 72
Deep Convolutional Neural Networks with Merge-and-Run Mappings. Deep Learning For Smile Recognition. Parallelized Tensor Train Learning of Polynomial Classifiers. Generating Images with Perceptual Similarity Metrics based on Deep Networks. Object Recognition with and without Objects. Neural Semantic Role Labeling with Dependency Path Embeddings. Va …
Deep Convolutional Neural Networks with Merge-and-Run Mappings
Title | Deep Convolutional Neural Networks with Merge-and-Run Mappings |
Authors | Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wenjun Zeng |
Abstract | A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow. To further reduce the training difficulty, we present a simple network architecture, deep merge-and-run neural networks. The novelty lies in a modularized building block, merge-and-run block, which assembles residual branches in parallel through a merge-and-run mapping: Average the inputs of these residual branches (Merge), and add the average to the output of each residual branch as the input of the subsequent residual branch (Run), respectively. We show that the merge-and-run mapping is a linear idempotent function in which the transformation matrix is idempotent, and thus improves information flow, making training easy. In comparison to residual networks, our networks enjoy compelling advantages: they contain much shorter paths, and the width, i.e., the number of channels, is increased. We evaluate the performance on the standard recognition tasks. Our approach demonstrates consistent improvements over ResNets with the comparable setup, and achieves competitive results (e.g., $3.57\%$ testing error on CIFAR-$10$, $19.00\%$ on CIFAR-$100$, $1.51\%$ on SVHN). |
Tasks | |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.07718v2 |
http://arxiv.org/pdf/1611.07718v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-neural-networks-with-merge |
Repo | https://github.com/homles11/IGCV3 |
Framework | tf |
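As a rough illustration of the idempotence claim in the abstract above, the sketch below builds the merge-and-run transformation matrix for scalar features and checks that applying it twice equals applying it once. Using scalars instead of identity blocks is an assumption made for brevity, and the helper names (`merge_and_run_matrix`, `matmul`, `is_idempotent`) are hypothetical, not taken from the paper or its repository.

```python
def merge_and_run_matrix(num_branches):
    """Each branch receives the average of all branch inputs (scalar case)."""
    weight = 1.0 / num_branches
    return [[weight] * num_branches for _ in range(num_branches)]


def matmul(a, b):
    """Plain-Python matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_idempotent(m, tol=1e-12):
    """True when m @ m equals m up to a small tolerance."""
    mm = matmul(m, m)
    n = len(m)
    return all(abs(mm[i][j] - m[i][j]) <= tol
               for i in range(n) for j in range(n))


# Two parallel residual branches, as in the basic merge-and-run block.
M = merge_and_run_matrix(2)
```

Averaging an already averaged set of inputs changes nothing, which is the intuition behind the idempotence property the abstract refers to.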
Deep Learning For Smile Recognition
Title | Deep Learning For Smile Recognition |
Authors | Patrick O. Glauner |
Abstract | Inspired by recent successes of deep learning in computer vision, we propose a novel application of deep convolutional neural networks to facial expression recognition, in particular smile recognition. A smile recognition test accuracy of 99.45% is achieved for the Denver Intensity of Spontaneous Facial Action (DISFA) database, significantly outperforming existing approaches based on hand-crafted features with accuracies ranging from 65.55% to 79.67%. The novelty of this approach includes a comprehensive model selection of the architecture parameters, making it possible to find an appropriate architecture for each expression such as smile. This is feasible because all experiments were run on a Tesla K40c GPU, allowing a speedup of factor 10 over traditional computations on a CPU. |
Tasks | Facial Expression Recognition, Model Selection, Smile Recognition |
Published | 2016-01-30 |
URL | http://arxiv.org/abs/1602.00172v2 |
http://arxiv.org/pdf/1602.00172v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-smile-recognition |
Repo | https://github.com/cem8301/EmotionDetector |
Framework | none |
Parallelized Tensor Train Learning of Polynomial Classifiers
Title | Parallelized Tensor Train Learning of Polynomial Classifiers |
Authors | Zhongming Chen, Kim Batselier, Johan A. K. Suykens, Ngai Wong |
Abstract | In pattern classification, polynomial classifiers are well-studied methods as they are capable of generating complex decision surfaces. Unfortunately, the use of multivariate polynomials is limited to kernels as in support vector machines, because polynomials quickly become impractical for high-dimensional problems. In this paper, we effectively overcome the curse of dimensionality by employing the tensor train format to represent a polynomial classifier. Based on the structure of tensor trains, two learning algorithms are proposed which involve solving different optimization problems of low computational complexity. Furthermore, we show how both regularization to prevent overfitting and parallelization, which enables the use of large training sets, are incorporated into these methods. Both the efficiency and efficacy of our tensor-based polynomial classifier are then demonstrated on the two popular datasets USPS and MNIST. |
Tasks | |
Published | 2016-12-20 |
URL | http://arxiv.org/abs/1612.06505v4 |
http://arxiv.org/pdf/1612.06505v4.pdf | |
PWC | https://paperswithcode.com/paper/parallelized-tensor-train-learning-of |
Repo | https://github.com/kbatseli/TTClassifier |
Framework | none |
Generating Images with Perceptual Similarity Metrics based on Deep Networks
Title | Generating Images with Perceptual Similarity Metrics based on Deep Networks |
Authors | Alexey Dosovitskiy, Thomas Brox |
Abstract | Image-generating machine learning models are typically trained with loss functions based on distance in the image space. This often leads to over-smoothed results. We propose a class of loss functions, which we call deep perceptual similarity metrics (DeePSiM), that mitigate this problem. Instead of computing distances in the image space, we compute distances between image features extracted by deep neural networks. This metric better reflects perceptual similarity of images and thus leads to better results. We show three applications: autoencoder training, a modification of a variational autoencoder, and inversion of deep convolutional networks. In all cases, the generated images look sharp and resemble natural images. |
Tasks | Image Generation |
Published | 2016-02-08 |
URL | http://arxiv.org/abs/1602.02644v2 |
http://arxiv.org/pdf/1602.02644v2.pdf | |
PWC | https://paperswithcode.com/paper/generating-images-with-perceptual-similarity |
Repo | https://github.com/Evolving-AI-Lab/synthesizing |
Framework | caffe2 |
Object Recognition with and without Objects
Title | Object Recognition with and without Objects |
Authors | Zhuotun Zhu, Lingxi Xie, Alan L. Yuille |
Abstract | While recent deep neural networks have achieved a promising performance on object recognition, they rely implicitly on the visual contents of the whole image. In this paper, we train deep neural networks on the foreground (object) and background (context) regions of images respectively. Considering human recognition in the same situations, networks trained on the pure background without objects achieve highly reasonable recognition performance that beats humans by a large margin if only given context. However, humans still outperform networks with pure object available, which indicates networks and human beings have different mechanisms in understanding an image. Furthermore, we straightforwardly combine multiple trained networks to explore different visual cues learned by different networks. Experiments show that useful visual hints can be explicitly learned separately and then combined to achieve higher performance, which verifies the advantages of the proposed framework. |
Tasks | Object Recognition |
Published | 2016-11-20 |
URL | http://arxiv.org/abs/1611.06596v3 |
http://arxiv.org/pdf/1611.06596v3.pdf | |
PWC | https://paperswithcode.com/paper/object-recognition-with-and-without-objects |
Repo | https://github.com/sunformoon/ObjectRecognitionWithWithoutObjects |
Framework | none |
Neural Semantic Role Labeling with Dependency Path Embeddings
Title | Neural Semantic Role Labeling with Dependency Path Embeddings |
Authors | Michael Roth, Mirella Lapata |
Abstract | This paper introduces a novel model for semantic role labeling that makes use of neural sequence modeling techniques. Our approach is motivated by the observation that complex syntactic structures and related phenomena, such as nested subordinations and nominal predicates, are not handled well by existing models. Our model treats such instances as sub-sequences of lexicalized dependency paths and learns suitable embedding representations. We experimentally demonstrate that such embeddings can improve results over previous state-of-the-art semantic role labelers, and showcase qualitative improvements obtained by our method. |
Tasks | Semantic Role Labeling |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07515v2 |
http://arxiv.org/pdf/1605.07515v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-semantic-role-labeling-with-dependency |
Repo | https://github.com/microth/PathLSTM |
Framework | none |
Variations of the Similarity Function of TextRank for Automated Summarization
Title | Variations of the Similarity Function of TextRank for Automated Summarization |
Authors | Federico Barrios, Federico López, Luis Argerich, Rosa Wachenchauzer |
Abstract | This article presents new alternatives to the similarity function for the TextRank algorithm for automatic summarization of texts. We describe the generalities of the algorithm and the different functions we propose. Some of these variants achieve a significant improvement using the same metrics and dataset as the original publication. |
Tasks | |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03606v1 |
http://arxiv.org/pdf/1602.03606v1.pdf | |
PWC | https://paperswithcode.com/paper/variations-of-the-similarity-function-of |
Repo | https://github.com/jaumeCloquellCapo/text-summaritzation-textRank |
Framework | none |
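For context on what the abstract above varies, here is a minimal sketch of the baseline sentence similarity from the original TextRank formulation: the number of shared words divided by the sum of the logarithms of the two sentence lengths. The whitespace tokenizer, the lowercasing, and the function name `textrank_similarity` are assumptions for illustration; the variants proposed in the paper are not reproduced here.

```python
import math


def textrank_similarity(sentence_a, sentence_b):
    """Shared-word count normalized by log sentence lengths."""
    # Assumption: naive lowercase whitespace tokenization.
    words_a = sentence_a.lower().split()
    words_b = sentence_b.lower().split()
    if not words_a or not words_b:
        return 0.0
    denominator = math.log(len(words_a)) + math.log(len(words_b))
    # Two one-word sentences give log(1) + log(1) = 0; treat as no similarity.
    if denominator == 0.0:
        return 0.0
    shared = set(words_a) & set(words_b)
    return len(shared) / denominator
```

In a TextRank-style summarizer this score would weight the edges of the sentence graph, so swapping in a different function changes the ranking without touching the rest of the pipeline.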
Variational Mixture Models with Gamma or inverse-Gamma components
Title | Variational Mixture Models with Gamma or inverse-Gamma components |
Authors | A. Llera, D. Vidaurre, R. H. R. Pruim, C. F. Beckmann |
Abstract | Mixture models with Gamma and/or inverse-Gamma distributed mixture components are useful for medical image tissue segmentation or as post-hoc models for regression coefficients obtained from linear regression within a Generalised Linear Modeling framework (GLM), used in this case to separate stochastic (Gaussian) noise from some kind of positive or negative “activation” (modeled as Gamma or inverse-Gamma distributed). To date, the most common choice in this context is Gaussian/Gamma mixture models learned through a maximum likelihood (ML) approach; we recently extended this algorithm to mixture models with inverse-Gamma components. Here, we introduce a fully analytical Variational Bayes (VB) learning framework for both Gamma and/or inverse-Gamma components. We use synthetic and resting state fMRI data to compare the performance of the ML and VB algorithms in terms of area under the curve and computational cost. We observed that the ML Gaussian/Gamma model is very expensive, especially when considering high resolution images; furthermore, these solutions are highly variable and can occasionally overestimate the activations severely. The Bayesian Gauss-Gamma is in general the fastest algorithm but provides too dense solutions. The maximum likelihood Gaussian/inverse-Gamma is also very fast but provides in general very sparse solutions. The variational Gaussian/inverse-Gamma mixture model is the most robust and its cost is acceptable even for high resolution images. Further, the presented methodology represents an essential building block that can be directly used in more complex inference tasks, specially designed to analyse MRI-fMRI data; such models include for example analytical variational mixture models with adaptive spatial regularization or better source models for new spatial blind source separation approaches. |
Tasks | |
Published | 2016-07-26 |
URL | http://arxiv.org/abs/1607.07573v1 |
http://arxiv.org/pdf/1607.07573v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-mixture-models-with-gamma-or |
Repo | https://github.com/allera/One_Dim_Mixture_Models |
Framework | none |
SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
Title | SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving |
Authors | Bichen Wu, Alvin Wan, Forrest Iandola, Peter H. Jin, Kurt Keutzer |
Abstract | Object detection is a crucial task for autonomous driving. In addition to requiring high accuracy to ensure safety, object detection for autonomous driving also requires real-time inference speed to guarantee prompt vehicle control, as well as small model size and energy efficiency to enable embedded system deployment. In this work, we propose SqueezeDet, a fully convolutional neural network for object detection that aims to simultaneously satisfy all of the above constraints. In our network, we use convolutional layers not only to extract feature maps but also as the output layer to compute bounding boxes and class probabilities. The detection pipeline of our model only contains a single forward pass of a neural network, thus it is extremely fast. Our model is fully-convolutional, which leads to a small model size and better energy efficiency. While achieving the same accuracy as previous baselines, our model is 30.4x smaller, 19.7x faster, and consumes 35.2x lower energy. The code is open-sourced at https://github.com/BichenWuUCB/squeezeDet. |
Tasks | Autonomous Driving, Object Detection, Real-Time Object Detection |
Published | 2016-12-04 |
URL | https://arxiv.org/abs/1612.01051v4 |
https://arxiv.org/pdf/1612.01051v4.pdf | |
PWC | https://paperswithcode.com/paper/squeezedet-unified-small-low-power-fully |
Repo | https://github.com/fregu856/2D_detection |
Framework | tf |
Multilingual Twitter Sentiment Classification: The Role of Human Annotators
Title | Multilingual Twitter Sentiment Classification: The Role of Human Annotators |
Authors | Igor Mozetic, Miha Grcar, Jasmina Smailovic |
Abstract | What are the limits of automated Twitter sentiment classification? We analyze a large set of manually labeled tweets in different languages, use them as training data, and construct automated classification models. It turns out that the quality of classification models depends much more on the quality and size of training data than on the type of the model trained. Experimental results indicate that there is no statistically significant difference between the performance of the top classification models. We quantify the quality of training data by applying various annotator agreement measures, and identify the weakest points of different datasets. We show that the model performance approaches the inter-annotator agreement when the size of the training set is sufficiently large. However, it is crucial to regularly monitor the self- and inter-annotator agreements since this improves the training datasets and consequently the model performance. Finally, we show that there is strong evidence that humans perceive the sentiment classes (negative, neutral, and positive) as ordered. |
Tasks | Sentiment Analysis |
Published | 2016-02-24 |
URL | http://arxiv.org/abs/1602.07563v2 |
http://arxiv.org/pdf/1602.07563v2.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-twitter-sentiment-classification |
Repo | https://github.com/joemzhao/tweets-retriever |
Framework | none |
Fitted Learning: Models with Awareness of their Limits
Title | Fitted Learning: Models with Awareness of their Limits |
Authors | Navid Kardan, Kenneth O. Stanley |
Abstract | Though deep learning has pushed the boundaries of classification forward, in recent years hints of the limits of standard classification have begun to emerge. Problems such as fooling, adding new classes over time, and the need to retrain learning models only for small changes to the original problem all point to a potential shortcoming in the classic classification regime, where a comprehensive a priori knowledge of the possible classes or concepts is critical. Without such knowledge, classifiers misjudge the limits of their knowledge and overgeneralization therefore becomes a serious obstacle to consistent performance. In response to these challenges, this paper extends the classic regime by reframing classification instead with the assumption that concepts present in the training set are only a sample of the hypothetical final set of concepts. To bring learning models into this new paradigm, a novel elaboration of standard architectures called the competitive overcomplete output layer (COOL) neural network is introduced. Experiments demonstrate the effectiveness of COOL by applying it to fooling, separable concept learning, one-class neural networks, and standard classification benchmarks. The results suggest that, unlike conventional classifiers, the amount of generalization in COOL networks can be tuned to match the problem. |
Tasks | |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.02226v4 |
http://arxiv.org/pdf/1609.02226v4.pdf | |
PWC | https://paperswithcode.com/paper/fitted-learning-models-with-awareness-of |
Repo | https://github.com/ndkn/fitted-learning |
Framework | torch |
On the Use of Sparse Filtering for Covariate Shift Adaptation
Title | On the Use of Sparse Filtering for Covariate Shift Adaptation |
Authors | Fabio Massimo Zennaro, Ke Chen |
Abstract | In this paper we formally analyse the use of sparse filtering algorithms to perform covariate shift adaptation. We provide a theoretical analysis of sparse filtering by evaluating the conditions required to perform covariate shift adaptation. We prove that sparse filtering can perform adaptation only if the conditional distribution of the labels has a structure explained by a cosine metric. To overcome this limitation, we propose a new algorithm, named periodic sparse filtering, and carry out the same theoretical analysis regarding covariate shift adaptation. We show that periodic sparse filtering can perform adaptation under the looser and more realistic requirement that the conditional distribution of the labels has a periodic structure, which may be satisfied, for instance, by user-dependent data sets. We experimentally validate our theoretical results on synthetic data. Moreover, we apply periodic sparse filtering to real-world data sets to demonstrate that this simple and computationally efficient algorithm is able to achieve competitive performances. |
Tasks | |
Published | 2016-07-22 |
URL | http://arxiv.org/abs/1607.06781v2 |
http://arxiv.org/pdf/1607.06781v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-use-of-sparse-filtering-for-covariate |
Repo | https://github.com/FMZennaro/PSF |
Framework | none |
Pruning Convolutional Neural Networks for Resource Efficient Inference
Title | Pruning Convolutional Neural Networks for Resource Efficient Inference |
Authors | Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, Jan Kautz |
Abstract | We propose a new formulation for pruning convolutional kernels in neural networks to enable efficient inference. We interleave greedy criteria-based pruning with fine-tuning by backpropagation - a computationally efficient procedure that maintains good generalization in the pruned network. We propose a new criterion based on Taylor expansion that approximates the change in the cost function induced by pruning network parameters. We focus on transfer learning, where large pretrained networks are adapted to specialized tasks. The proposed criterion demonstrates superior performance compared to other criteria, e.g. the norm of kernel weights or feature map activation, for pruning large CNNs after adaptation to fine-grained classification tasks (Birds-200 and Flowers-102) relying only on the first order gradient information. We also show that pruning can lead to more than 10x theoretical (5x practical) reduction in adapted 3D-convolutional filters with a small drop in accuracy in a recurrent gesture classifier. Finally, we show results for the large-scale ImageNet dataset to emphasize the flexibility of our approach. |
Tasks | Transfer Learning |
Published | 2016-11-19 |
URL | http://arxiv.org/abs/1611.06440v2 |
http://arxiv.org/pdf/1611.06440v2.pdf | |
PWC | https://paperswithcode.com/paper/pruning-convolutional-neural-networks-for |
Repo | https://github.com/dongkwan-kim/Adaptive-Forgetting |
Framework | tf |
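The Taylor-expansion criterion mentioned in the abstract above can be sketched roughly as follows, under the assumption that a feature map's importance is the absolute value of the mean of activation times gradient over its entries (a first-order estimate of the cost change if that map were zeroed). The flat list representation, the toy numbers, and the names `taylor_scores` and `least_important` are hypothetical; the paper's exact normalization and pruning schedule are not reproduced.

```python
def taylor_scores(activations, gradients):
    """One score per feature map: |mean(activation * gradient)|.

    activations, gradients: lists of equally sized flat lists, one per map.
    """
    scores = []
    for act_map, grad_map in zip(activations, gradients):
        products = [a * g for a, g in zip(act_map, grad_map)]
        scores.append(abs(sum(products) / len(products)))
    return scores


def least_important(scores):
    """Index of the map a greedy pruning step would remove first."""
    return min(range(len(scores)), key=scores.__getitem__)


# Toy values (assumed for illustration): two feature maps with two entries each.
activations = [[1.0, 2.0], [0.5, 0.5]]
gradients = [[0.1, -0.1], [0.2, 0.2]]
scores = taylor_scores(activations, gradients)
prune_index = least_important(scores)
```

Only activations and first-order gradients are needed, both of which are already available during backpropagation, which is consistent with the abstract's point about relying on first-order information.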
Biconvex Relaxation for Semidefinite Programming in Computer Vision
Title | Biconvex Relaxation for Semidefinite Programming in Computer Vision |
Authors | Sohil Shah, Abhay Kumar, Carlos Castillo, David Jacobs, Christoph Studer, Tom Goldstein |
Abstract | Semidefinite programming is an indispensable tool in computer vision, but general-purpose solvers for semidefinite programs are often too slow and memory intensive for large-scale problems. We propose a general framework to approximately solve large-scale semidefinite problems (SDPs) at low complexity. Our approach, referred to as biconvex relaxation (BCR), transforms a general SDP into a specific biconvex optimization problem, which can then be solved in the original, low-dimensional variable space at low complexity. The resulting biconvex problem is solved using an efficient alternating minimization (AM) procedure. Since AM has the potential to get stuck in local minima, we propose a general initialization scheme that enables BCR to start close to a global optimum - this is key for our algorithm to quickly converge to optimal or near-optimal solutions. We showcase the efficacy of our approach on three applications in computer vision, namely segmentation, co-segmentation, and manifold metric learning. BCR achieves solution quality comparable to state-of-the-art SDP methods with speedups between 4X and 35X. At the same time, BCR handles a more general set of SDPs than previous approaches, which are more specialized. |
Tasks | Metric Learning |
Published | 2016-05-31 |
URL | http://arxiv.org/abs/1605.09527v2 |
http://arxiv.org/pdf/1605.09527v2.pdf | |
PWC | https://paperswithcode.com/paper/biconvex-relaxation-for-semidefinite |
Repo | https://github.com/Axeldnahcram/biconvex_relaxation |
Framework | none |
Approximation Vector Machines for Large-scale Online Learning
Title | Approximation Vector Machines for Large-scale Online Learning |
Authors | Trung Le, Tu Dinh Nguyen, Vu Nguyen, Dinh Phung |
Abstract | One of the most challenging problems in kernel online learning is to bound the model size and to promote the model sparsity. Sparse models not only improve computation and memory usage, but also enhance the generalization capacity, a principle that concurs with the law of parsimony. However, inappropriate sparsity modeling may also significantly degrade the performance. In this paper, we propose Approximation Vector Machine (AVM), a model that can simultaneously encourage the sparsity and safeguard its risk in compromising the performance. When an incoming instance arrives, we approximate this instance by one of its neighbors whose distance to it is less than a predefined threshold. Our key intuition is that since the newly seen instance is expressed by its nearby neighbor, the optimal performance can be analytically formulated and maintained. We develop theoretical foundations to support this intuition and further establish an analysis to characterize the gap between the approximation and optimal solutions. This gap crucially depends on the frequency of approximation and the predefined threshold. We perform the convergence analysis for a wide spectrum of loss functions including Hinge, smooth Hinge, and Logistic for the classification task, and $l_1$, $l_2$, and $\epsilon$-insensitive for the regression task. We conducted extensive experiments for the classification task in batch and online modes, and the regression task in online mode over several benchmark datasets. The results show that our proposed AVM achieved a comparable predictive performance with current state-of-the-art methods while simultaneously achieving significant computational speed-up owing to its ability to maintain the model size. |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06518v4 |
http://arxiv.org/pdf/1604.06518v4.pdf | |
PWC | https://paperswithcode.com/paper/approximation-vector-machines-for-large-scale |
Repo | https://github.com/tund/avm |
Framework | none |
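The instance-approximation step described in the abstract above might look like the sketch below: an incoming point is replaced by an already stored neighbor when one lies within the threshold, and is stored as a new point otherwise. The Euclidean distance, the linear scan, the first-match rule, and the name `approximate_instance` are all assumptions; the model update and the theoretical guarantees of AVM are not covered.

```python
import math


def approximate_instance(core_set, x, threshold):
    """Return a stored neighbor of x within `threshold`, else store x.

    core_set: list of points (tuples of floats), modified in place.
    """
    for core_point in core_set:
        # Assumption: Euclidean distance, first match wins.
        if math.dist(core_point, x) < threshold:
            return core_point
    core_set.append(x)
    return x


core = []
first = approximate_instance(core, (0.0, 0.0), 0.5)   # stored as new point
second = approximate_instance(core, (0.1, 0.1), 0.5)  # mapped to (0.0, 0.0)
third = approximate_instance(core, (2.0, 2.0), 0.5)   # too far, stored
```

Because nearby instances collapse onto one stored point, the stored set grows only when a genuinely new region of the input space is visited, which is how a larger threshold would trade accuracy for a smaller model.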