Paper Group ANR 49
Stochastic Gradient Monomial Gamma Sampler. S-Isomap++: Multi Manifold Learning from Streaming Data. Few-shot Learning by Exploiting Visual Concepts within CNNs. DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding. A Weakly Supervised Approach to Train Temporal Relation Classifiers and Acquire Regular Event Pairs …
Stochastic Gradient Monomial Gamma Sampler
Title | Stochastic Gradient Monomial Gamma Sampler |
Authors | Yizhe Zhang, Changyou Chen, Zhe Gan, Ricardo Henao, Lawrence Carin |
Abstract | Recent advances in stochastic gradient techniques have made it possible to estimate posterior distributions from large datasets via Markov Chain Monte Carlo (MCMC). However, when the target posterior is multimodal, mixing performance is often poor. This results in inadequate exploration of the posterior distribution. A framework is proposed to improve the sampling efficiency of stochastic gradient MCMC, based on Hamiltonian Monte Carlo. A generalized kinetic function is leveraged, delivering superior stationary mixing, especially for multimodal distributions. Techniques are also discussed to overcome the practical issues introduced by this generalization. It is shown that the proposed approach is better at exploring complex multimodal posterior distributions, as demonstrated on multiple applications and in comparison with other stochastic gradient MCMC methods. |
Tasks | |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01498v2 |
http://arxiv.org/pdf/1706.01498v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-gradient-monomial-gamma-sampler |
Repo | |
Framework | |
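For orientation, a schematic of the generalization described in the abstract above: HMC-style dynamics with a non-Gaussian kinetic energy. The monomial form and the exponent $a$ below are illustrative only; the paper defines the exact monomial gamma kinetics and its stochastic-gradient discretization.

```latex
% HMC with a generalized kinetic energy; a = 2 recovers the standard Gaussian
% momentum. The monomial form below is illustrative of the generalization.
H(\theta, p) = U(\theta) + K(p), \qquad K(p) \propto \sum_i |p_i|^{a}
% Continuous-time dynamics (leapfrog-discretized in practice):
\dot{\theta} = \nabla_p K(p), \qquad \dot{p} = -\nabla_\theta U(\theta)
% The stochastic-gradient variant replaces \nabla_\theta U with a minibatch
% estimate and adds friction plus Gaussian noise, as in SGHMC.
```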
S-Isomap++: Multi Manifold Learning from Streaming Data
Title | S-Isomap++: Multi Manifold Learning from Streaming Data |
Authors | Suchismit Mahapatra, Varun Chandola |
Abstract | Manifold learning based methods have been widely used for non-linear dimensionality reduction (NLDR). However, in many practical settings, the need to process streaming data is a challenge for such methods, owing to the high computational complexity involved. Moreover, most methods operate under the assumption that the input data is sampled from a single manifold, embedded in a high dimensional space. We propose a method for streaming NLDR when the observed data is either sampled from multiple manifolds or irregularly sampled from a single manifold. We show that existing NLDR methods, such as Isomap, fail in such situations, primarily because they rely on smoothness and continuity of the underlying manifold, which is violated in the scenarios explored in this paper. However, the proposed algorithm is able to learn effectively in the presence of multiple, and potentially intersecting, manifolds, while allowing for the input data to arrive as a massive stream. |
Tasks | Dimensionality Reduction |
Published | 2017-10-17 |
URL | http://arxiv.org/abs/1710.06462v3 |
http://arxiv.org/pdf/1710.06462v3.pdf | |
PWC | https://paperswithcode.com/paper/s-isomap-multi-manifold-learning-from |
Repo | |
Framework | |
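A minimal sketch of the batch, single-manifold Isomap pipeline that the paper extends (kNN graph, geodesic distances, classical MDS), using scikit-learn and SciPy. The streaming and multi-manifold machinery is not shown; note that with multiple disconnected manifolds the shortest-path step yields infinite distances, which is precisely the failure mode the paper addresses.

```python
# Minimal batch Isomap baseline (the method the paper extends to streams and
# multiple manifolds): kNN graph -> geodesic distances -> classical MDS.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=10, n_components=2):
    # 1. Neighborhood graph with Euclidean edge weights.
    G = kneighbors_graph(X, n_neighbors, mode="distance")
    # 2. Geodesic distances approximated by shortest paths on the graph.
    D = shortest_path(G, method="D", directed=False)
    # 3. Classical MDS on the squared geodesic distance matrix.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:n_components]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# Usage: Y = isomap(np.random.rand(500, 10))
```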
Few-shot Learning by Exploiting Visual Concepts within CNNs
Title | Few-shot Learning by Exploiting Visual Concepts within CNNs |
Authors | Boyang Deng, Qing Liu, Siyuan Qiao, Alan Yuille |
Abstract | Convolutional neural networks (CNNs) are one of the driving forces for the advancement of computer vision. Despite their promising performance on many tasks, CNNs still face major obstacles on the road to achieving ideal machine intelligence. One is that CNNs are complex and hard to interpret. Another is that standard CNNs require large amounts of annotated data, which is sometimes hard to obtain, and it is desirable to learn to recognize objects from few examples. In this work, we address these limitations of CNNs by developing novel, flexible, and interpretable models for few-shot learning. Our models are based on the idea of encoding objects in terms of visual concepts (VCs), which are interpretable visual cues represented by the feature vectors within CNNs. We first adapt the learning of VCs to the few-shot setting, and then uncover two key properties of feature encoding using VCs, which we call category sensitivity and spatial pattern. Motivated by these properties, we present two intuitive models for the problem of few-shot learning. Experiments show that our models achieve competitive performance, while being more flexible and interpretable than alternative state-of-the-art few-shot learning methods. We conclude that using VCs helps expose the natural capability of CNNs for few-shot learning. |
Tasks | Few-Shot Learning |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1711.08277v3 |
http://arxiv.org/pdf/1711.08277v3.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-learning-by-exploiting-visual |
Repo | |
Framework | |
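An illustrative sketch of deriving a visual-concept-like dictionary by clustering intermediate CNN feature vectors, as the abstract describes. The VGG16 backbone, the layer cut-off, and the use of k-means are stand-in assumptions, not the paper's exact VC-learning procedure.

```python
# Illustrative sketch: build a "visual concept"-like dictionary by clustering
# intermediate CNN feature vectors. The backbone, layer, and k-means clustering
# are assumptions standing in for the paper's VC-learning procedure.
import torch
import torchvision.models as models
from sklearn.cluster import KMeans

backbone = models.vgg16(weights="IMAGENET1K_V1").features[:24].eval()  # mid-level conv features

def pooled_feature_vectors(images):                 # images: (N, 3, H, W) tensor
    with torch.no_grad():
        fmap = backbone(images)                     # (N, C, h, w)
    # Each spatial position contributes one C-dimensional feature vector.
    return fmap.permute(0, 2, 3, 1).reshape(-1, fmap.shape[1]).numpy()

def learn_visual_concepts(images, n_concepts=64):
    feats = pooled_feature_vectors(images)
    return KMeans(n_clusters=n_concepts, n_init=10).fit(feats)  # cluster centers ~ VCs

# An image region can then be encoded by which concept centers its feature
# vectors fall closest to, giving a sparse, interpretable representation.
```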
DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding
Title | DeepCoder: Semi-parametric Variational Autoencoders for Automatic Facial Action Coding |
Authors | Dieu Linh Tran, Robert Walecki, Ognjen Rudovic, Stefanos Eleftheriadis, Bjørn Schuller, Maja Pantic |
Abstract | The human face exhibits an inherent hierarchy in its representations (i.e., holistic facial expressions can be encoded via a set of facial action units (AUs) and their intensity). Variational (deep) auto-encoders (VAE) have shown great results in unsupervised extraction of hierarchical latent representations from large amounts of image data, while being robust to noise and other undesired artifacts. Potentially, this makes VAEs a suitable approach for learning facial features for AU intensity estimation. Yet, most existing VAE-based methods apply classifiers learned separately from the encoded features. By contrast, the non-parametric (probabilistic) approaches, such as Gaussian Processes (GPs), typically outperform their parametric counterparts, but cannot deal easily with large amounts of data. To this end, we propose a novel VAE semi-parametric modeling framework, named DeepCoder, which combines the modeling power of parametric (convolutional) and nonparametric (ordinal GPs) VAEs, for joint learning of (1) latent representations at multiple levels in a task hierarchy, and (2) classification of multiple ordinal outputs. We show on benchmark datasets for AU intensity estimation that the proposed DeepCoder outperforms the state-of-the-art approaches, and related VAEs and deep learning models. |
Tasks | Gaussian Processes |
Published | 2017-04-07 |
URL | http://arxiv.org/abs/1704.02206v2 |
http://arxiv.org/pdf/1704.02206v2.pdf | |
PWC | https://paperswithcode.com/paper/deepcoder-semi-parametric-variational |
Repo | |
Framework | |
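For reference, the standard variational autoencoder objective (ELBO) on which the parametric part of DeepCoder builds. The paper's full objective additionally couples a second, non-parametric (ordinal GP) layer to the latent codes, which is not reproduced here.

```latex
% Standard VAE evidence lower bound (ELBO), maximized over the encoder q_\phi
% and decoder p_\theta; DeepCoder's full objective further attaches ordinal
% GP outputs to the latent codes, which is not shown.
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)
```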
A Weakly Supervised Approach to Train Temporal Relation Classifiers and Acquire Regular Event Pairs Simultaneously
Title | A Weakly Supervised Approach to Train Temporal Relation Classifiers and Acquire Regular Event Pairs Simultaneously |
Authors | Wenlin Yao, Saipravallika Nettyam, Ruihong Huang |
Abstract | The capability to detect temporal relations between two events can benefit many applications. Most existing temporal relation classifiers are trained in a supervised manner. Instead, we explore the observation that regular event pairs show a consistent temporal relation despite their various contexts, and that these rich contexts can be used to train a contextual temporal relation classifier, which can further recognize new temporal relation contexts and identify new regular event pairs. We focus on detecting after and before temporal relations and design a weakly supervised learning approach that extracts thousands of regular event pairs and learns a contextual temporal relation classifier simultaneously. Evaluation shows that the acquired regular event pairs are of high quality and contain rich commonsense knowledge and domain-specific knowledge. In addition, the weakly supervised temporal relation classifier achieves performance comparable to state-of-the-art supervised systems. |
Tasks | |
Published | 2017-07-28 |
URL | http://arxiv.org/abs/1707.09410v1 |
http://arxiv.org/pdf/1707.09410v1.pdf | |
PWC | https://paperswithcode.com/paper/a-weakly-supervised-approach-to-train |
Repo | |
Framework | |
Towards String-to-Tree Neural Machine Translation
Title | Towards String-to-Tree Neural Machine Translation |
Authors | Roee Aharoni, Yoav Goldberg |
Abstract | We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees. An experiment on the WMT16 German-English news translation task resulted in an improved BLEU score when compared to a syntax-agnostic NMT baseline trained on the same dataset. An analysis of the translations from the syntax-aware system shows that it performs more reordering during translation in comparison to the baseline. A small-scale human evaluation also showed an advantage to the syntax-aware system. |
Tasks | Machine Translation |
Published | 2017-04-16 |
URL | http://arxiv.org/abs/1704.04743v3 |
http://arxiv.org/pdf/1704.04743v3.pdf | |
PWC | https://paperswithcode.com/paper/towards-string-to-tree-neural-machine |
Repo | |
Framework | |
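A minimal sketch of the core idea: the target side of the NMT system is a linearized, lexicalized constituency tree, so a standard sequence decoder can emit syntax. The bracket-token scheme below is illustrative; the paper defines its own linearization.

```python
# Minimal sketch: linearize a lexicalized constituency tree into a flat token
# sequence that a standard seq2seq decoder can emit. The typed-bracket token
# scheme here is illustrative, not necessarily the paper's exact format.
from nltk import Tree

def linearize(tree):
    if isinstance(tree, str):            # leaf: a target-language word
        return [tree]
    tokens = ["(" + tree.label()]
    for child in tree:
        tokens += linearize(child)
    tokens.append(")" + tree.label())    # typed closing bracket
    return tokens

tgt_tree = Tree.fromstring("(S (NP (DT the) (NN dog)) (VP (VBZ barks)))")
print(linearize(tgt_tree))
# ['(S', '(NP', '(DT', 'the', ')DT', '(NN', 'dog', ')NN', ')NP',
#  '(VP', '(VBZ', 'barks', ')VBZ', ')VP', ')S']
```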
Phase-error estimation and image reconstruction from digital-holography data using a Bayesian framework
Title | Phase-error estimation and image reconstruction from digital-holography data using a Bayesian framework |
Authors | Casey J. Pellizzari, Mark F. Spencer, Charles A. Bouman |
Abstract | The estimation of phase errors from digital-holography data is critical for applications such as imaging or wave-front sensing. Conventional techniques require multiple i.i.d. data realizations and perform poorly in the presence of high noise or large phase errors. In this paper we propose a method to estimate isoplanatic phase errors from a single data realization. We develop a model-based iterative reconstruction algorithm which computes the maximum a posteriori estimate of the phase and the speckle-free object reflectance. Using simulated data, we show that the algorithm is robust against high noise and strong phase errors. |
Tasks | Image Reconstruction |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1708.01142v1 |
http://arxiv.org/pdf/1708.01142v1.pdf | |
PWC | https://paperswithcode.com/paper/phase-error-estimation-and-image |
Repo | |
Framework | |
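Schematically, the model-based iterative reconstruction described above computes a joint maximum a posteriori estimate of the speckle-free reflectance $r$ and the isoplanatic phase error $\phi$ from a single data realization $y$; the specific forward model and priors are those defined in the paper.

```latex
% Generic MAP formulation for joint reflectance / phase-error estimation from a
% single data realization y; the likelihood p(y | r, phi) and the priors on r
% and phi are as specified in the paper.
(\hat{r}, \hat{\phi}) \;=\; \arg\max_{r,\,\phi}\;
\log p(y \mid r, \phi) \;+\; \log p(r) \;+\; \log p(\phi)
```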
Separation-Free Super-Resolution from Compressed Measurements is Possible: an Orthonormal Atomic Norm Minimization Approach
Title | Separation-Free Super-Resolution from Compressed Measurements is Possible: an Orthonormal Atomic Norm Minimization Approach |
Authors | Weiyu Xu, Jirong Yi, Soura Dasgupta, Jian-Feng Cai, Mathews Jacob, Myung Cho |
Abstract | We consider the problem of recovering the superposition of $R$ distinct complex exponential functions from compressed non-uniform time-domain samples. Total Variation (TV) minimization or atomic norm minimization was proposed in the literature to recover the $R$ frequencies or the missing data. However, it is known that in order for TV minimization and atomic norm minimization to recover the missing data or the frequencies, the underlying $R$ frequencies are required to be well-separated, even when the measurements are noiseless. This paper shows that the Hankel matrix recovery approach can super-resolve the $R$ complex exponentials and their frequencies from compressed non-uniform measurements, regardless of how close their frequencies are to each other. We propose a new concept of orthonormal atomic norm minimization (OANM), and demonstrate that the success of Hankel matrix recovery in separation-free super-resolution comes from the fact that the nuclear norm of a Hankel matrix is an orthonormal atomic norm. More specifically, we show that, in traditional atomic norm minimization, the underlying parameter values $\textbf{must}$ be well separated to achieve successful signal recovery, if the atoms are changing continuously with respect to the continuously-valued parameter. In contrast, OANM can succeed even when the original atoms are arbitrarily close. As a byproduct of this research, we provide a matrix-theoretic inequality for the nuclear norm and prove it using the theory of compressed sensing. |
Tasks | Super-Resolution |
Published | 2017-11-04 |
URL | http://arxiv.org/abs/1711.01396v1 |
http://arxiv.org/pdf/1711.01396v1.pdf | |
PWC | https://paperswithcode.com/paper/separation-free-super-resolution-from |
Repo | |
Framework | |
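A small, hedged sketch of Hankel-matrix recovery via nuclear-norm minimization from partially observed samples, using cvxpy. It is real-valued and tiny for simplicity; the paper treats complex exponentials and non-uniform compressed measurements, and develops the OANM theory explaining why this approach needs no separation condition.

```python
# Minimal sketch of Hankel matrix recovery via nuclear-norm minimization from
# partially observed samples. Real-valued and small-scale for simplicity; the
# paper treats complex exponentials and compressed non-uniform measurements.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
N, R = 64, 3
t = np.arange(N)
x_true = sum(np.cos(2 * np.pi * f * t) for f in rng.uniform(0.05, 0.45, R))
obs = rng.choice(N, size=N // 2, replace=False)          # observed sample indices

x = cp.Variable(N)
L = N // 2 + 1                                           # number of Hankel rows
H = cp.vstack([x[i:i + N - L + 1] for i in range(L)])    # Hankel matrix of x
prob = cp.Problem(cp.Minimize(cp.norm(H, "nuc")),        # low-rank surrogate
                  [x[obs] == x_true[obs]])
prob.solve()
print("relative recovery error:",
      np.linalg.norm(x.value - x_true) / np.linalg.norm(x_true))
```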
Automatic Response Category Combination in Multinomial Logistic Regression
Title | Automatic Response Category Combination in Multinomial Logistic Regression |
Authors | Bradley S. Price, Charles J. Geyer, Adam J. Rothman |
Abstract | We propose a penalized likelihood method that simultaneously fits the multinomial logistic regression model and combines subsets of the response categories. The penalty is non-differentiable when pairs of columns in the optimization variable are equal. This encourages pairwise equality of these columns in the estimator, which corresponds to response category combination. We use an alternating direction method of multipliers algorithm to compute the estimator and we discuss the algorithm's convergence. Prediction and model selection are also addressed. |
Tasks | Model Selection |
Published | 2017-05-10 |
URL | http://arxiv.org/abs/1705.03594v1 |
http://arxiv.org/pdf/1705.03594v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-response-category-combination-in |
Repo | |
Framework | |
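The penalized likelihood has, schematically, the form below: a multinomial log-likelihood plus a group-fusion penalty that is non-differentiable exactly when two coefficient columns coincide, which is what induces category combination. Pair weights and the precise parameterization are as in the paper.

```latex
% Schematic penalized multinomial log-likelihood: the group-fusion penalty is
% non-differentiable at B_{.j} = B_{.k}, encouraging exact equality of column
% pairs in the estimate, i.e., combination of response categories j and k.
\hat{B} \;=\; \arg\min_{B}\;
-\,\ell(B) \;+\; \lambda \sum_{j < k} \left\lVert B_{\cdot j} - B_{\cdot k} \right\rVert_2
```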
Towards a Quantum World Wide Web
Title | Towards a Quantum World Wide Web |
Authors | Diederik Aerts, Jonito Aerts Arguelles, Lester Beltran, Lyneth Beltran, Isaac Distrito, Massimiliano Sassoli de Bianchi, Sandro Sozzo, Tomas Veloz |
Abstract | We elaborate a quantum model for the meaning associated with corpora of written documents, like the pages forming the World Wide Web. To that end, we are guided by how physicists constructed quantum theory for microscopic entities, which unlike classical objects cannot be fully represented in our spatial theater. We suggest that a similar construction needs to be carried out by linguists and computational scientists, to capture the full meaning carried by collections of documental entities. More precisely, we show how to associate a quantum-like ‘entity of meaning’ to a ‘language entity formed by printed documents’, considering the latter as the collection of traces that are left by the former, in specific results of search actions that we describe as measurements. In other words, we offer a perspective where a collection of documents, like the Web, is described as the space of manifestation of a more complex entity - the QWeb - which is the object of our modeling, drawing its inspiration from previous studies on operational-realistic approaches to quantum physics and quantum modeling of human cognition and decision-making. We emphasize that a consistent QWeb model needs to account for the observed correlations between words appearing in printed documents, e.g., co-occurrences, as the latter would depend on the ‘meaning connections’ existing between the concepts that are associated with these words. In that respect, we show that both ‘context and interference (quantum) effects’ are required to explain the probabilities calculated by counting the relative number of documents containing certain words and co-occurrences of words. |
Tasks | Decision Making |
Published | 2017-03-20 |
URL | http://arxiv.org/abs/1703.06642v2 |
http://arxiv.org/pdf/1703.06642v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-a-quantum-world-wide-web |
Repo | |
Framework | |
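For concreteness, the generic two-state interference formula that such quantum models invoke: the probability of documents manifesting the combination of two concepts deviates from the classical average by a context-dependent interference term. This is the textbook form, not necessarily the exact expression fitted in the paper.

```latex
% Generic quantum interference formula: the probability of manifesting the
% concept "A or B" deviates from the classical average of p_A and p_B by an
% interference term whose phase theta depends on the meaning context.
p_{A\,\mathrm{or}\,B} \;=\; \tfrac{1}{2}\,p_A \;+\; \tfrac{1}{2}\,p_B
\;+\; \sqrt{p_A\,p_B}\,\cos\theta
```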
NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps
Title | NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps |
Authors | Alessandro Aimar, Hesham Mostafa, Enrico Calabrese, Antonio Rios-Navarro, Ricardo Tapiador-Morales, Iulia-Alexandra Lungu, Moritz B. Milde, Federico Corradi, Alejandro Linares-Barranco, Shih-Chii Liu, Tobi Delbruck |
Abstract | Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving many state-of-the-art (SOA) visual processing tasks. Even though Graphics Processing Units (GPUs) are most often used in training and deploying CNNs, their power efficiency is less than 10 GOp/s/W for single-frame runtime inference. We propose a flexible and efficient CNN accelerator architecture called NullHop that implements SOA CNNs useful for low-power and low-latency application scenarios. NullHop exploits the sparsity of neuron activations in CNNs to accelerate the computation and reduce memory requirements. The flexible architecture allows high utilization of available computing resources across kernel sizes ranging from 1x1 to 7x7. NullHop can process up to 128 input and 128 output feature maps per layer in a single pass. We implemented the proposed architecture on a Xilinx Zynq FPGA platform and present results showing how our implementation reduces external memory transfers and compute time in five different CNNs ranging from small ones up to the widely known large VGG16 and VGG19 CNNs. Post-synthesis simulations using Mentor Modelsim in a 28nm process with a clock frequency of 500 MHz show that the VGG19 network achieves over 450 GOp/s. By exploiting sparsity, NullHop achieves an efficiency of 368%, maintains over 98% utilization of the MAC units, and achieves a power efficiency of over 3 TOp/s/W in a core area of 6.3 mm$^2$. As further proof of NullHop's usability, we interfaced its FPGA implementation with a neuromorphic event camera for real-time interactive demonstrations. |
Tasks | |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01406v2 |
http://arxiv.org/pdf/1706.01406v2.pdf | |
PWC | https://paperswithcode.com/paper/nullhop-a-flexible-convolutional-neural |
Repo | |
Framework | |
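A software illustration of the sparsity idea the accelerator exploits: storing a ReLU feature map as a binary sparsity map plus the list of non-zero values. The actual hardware encoding (word widths, packing, zero-run handling) is specified in the paper and differs from this sketch.

```python
# Software illustration of compressing a sparse feature map as a binary
# sparsity map plus a list of non-zero values; the general idea NullHop
# exploits. The real hardware encoding (word widths, packing) differs.
import numpy as np

def compress(fmap):
    mask = fmap != 0                     # 1 bit per element: the sparsity map
    return mask, fmap[mask]              # non-zero values in raster order

def decompress(mask, values, dtype=np.float32):
    fmap = np.zeros(mask.shape, dtype=dtype)
    fmap[mask] = values
    return fmap

fmap = np.maximum(np.random.randn(128, 28, 28).astype(np.float32), 0)  # ReLU output
mask, values = compress(fmap)
ratio = fmap.nbytes / (mask.size / 8 + values.nbytes)
print(f"~{ratio:.1f}x smaller; reconstruction exact:",
      np.array_equal(decompress(mask, values), fmap))
```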
Emergence of Invariance and Disentanglement in Deep Representations
Title | Emergence of Invariance and Disentanglement in Deep Representations |
Authors | Alessandro Achille, Stefano Soatto |
Abstract | Using established principles from Statistics and Information Theory, we show that invariance to nuisance factors in a deep neural network is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations. We then decompose the cross-entropy loss used during training and highlight the presence of an inherent overfitting term. We propose regularizing the loss by bounding such a term in two equivalent ways: One with a Kullback-Leibler term, which relates to a PAC-Bayes perspective; the other using the information in the weights as a measure of complexity of a learned model, yielding a novel Information Bottleneck for the weights. Finally, we show that invariance and independence of the components of the representation learned by the network are bounded above and below by the information in the weights, and therefore are implicitly optimized during training. The theory enables us to quantify and predict sharp phase transitions between underfitting and overfitting of random labels when using our regularized loss, which we verify in experiments, and sheds light on the relation between the geometry of the loss function, invariance properties of the learned representation, and generalization error. |
Tasks | |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01350v3 |
http://arxiv.org/pdf/1706.01350v3.pdf | |
PWC | https://paperswithcode.com/paper/emergence-of-invariance-and-disentanglement |
Repo | |
Framework | |
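Schematically, the regularized loss discussed above is an empirical cross-entropy plus a term bounding the information the weights carry about the training set (an Information Bottleneck on the weights), with $\beta$ trading off the two terms. The notation is schematic rather than the paper's exact decomposition.

```latex
% Schematic form of the regularized objective: empirical cross-entropy plus a
% term bounding the information the weights w store about the dataset D, i.e.,
% an Information Bottleneck on the weights; beta controls the trade-off.
\mathcal{L}(w) \;=\;
\mathbb{E}_{(x,y)\sim\mathcal{D}}\!\left[-\log p_w(y \mid x)\right]
\;+\; \beta\, I(w;\mathcal{D})
```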
Detection of curved lines with B-COSFIRE filters: A case study on crack delineation
Title | Detection of curved lines with B-COSFIRE filters: A case study on crack delineation |
Authors | Nicola Strisciuglio, George Azzopardi, Nicolai Petkov |
Abstract | The detection of curvilinear structures is an important step for various computer vision applications, ranging from medical image analysis for segmentation of blood vessels, to remote sensing for the identification of roads and rivers, and to biometrics and robotics, among others. This is a nontrivial task, especially for the detection of thin or incomplete curvilinear structures surrounded by noise. We propose a general purpose curvilinear structure detector that uses the brain-inspired trainable B-COSFIRE filters. It consists of four main steps, namely nonlinear filtering with B-COSFIRE, thinning with non-maximum suppression, hysteresis thresholding and morphological closing. We demonstrate its effectiveness on a data set of noisy images with cracked pavements, where we achieve state-of-the-art results (F-measure=0.865). The proposed method can be employed in any computer vision methodology that requires the delineation of curvilinear and elongated structures. |
Tasks | |
Published | 2017-07-24 |
URL | http://arxiv.org/abs/1707.07747v1 |
http://arxiv.org/pdf/1707.07747v1.pdf | |
PWC | https://paperswithcode.com/paper/detection-of-curved-lines-with-b-cosfire |
Repo | |
Framework | |
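A hedged scikit-image sketch of the four-step pipeline. A generic ridge filter (Sato) stands in for the trainable B-COSFIRE filters, skeletonization stands in for non-maximum-suppression thinning, and thinning is applied after binarization here (skeletonize needs a binary input), so the step order is adapted from the paper's.

```python
# Sketch of the delineation pipeline with scikit-image. The Sato ridge filter
# is a stand-in for B-COSFIRE, and skeletonization after hysteresis stands in
# for non-maximum-suppression thinning of the raw filter response.
from skimage import io, filters, morphology

def delineate(path, low=0.05, high=0.15):
    img = io.imread(path, as_gray=True)
    # 1. Line filtering: stand-in for the B-COSFIRE response (cracks are dark).
    response = filters.sato(img, sigmas=range(1, 4), black_ridges=True)
    # 2. Hysteresis thresholding of the filter response.
    binary = filters.apply_hysteresis_threshold(response, low, high)
    # 3. Thinning (stand-in for non-maximum suppression).
    thin = morphology.skeletonize(binary)
    # 4. Morphological closing to bridge small gaps along the curves.
    return morphology.closing(thin, morphology.disk(2))

# Usage: crack_map = delineate("pavement.png")
```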
Bias-Variance Tradeoff of Graph Laplacian Regularizer
Title | Bias-Variance Tradeoff of Graph Laplacian Regularizer |
Authors | Pin-Yu Chen, Sijia Liu |
Abstract | This paper presents a bias-variance tradeoff of graph Laplacian regularizer, which is widely used in graph signal processing and semi-supervised learning tasks. The scaling law of the optimal regularization parameter is specified in terms of the spectral graph properties and a novel signal-to-noise ratio parameter, which suggests selecting a mediocre regularization parameter is often suboptimal. The analysis is applied to three applications, including random, band-limited, and multiple-sampled graph signals. Experiments on synthetic and real-world graphs demonstrate near-optimal performance of the established analysis. |
Tasks | |
Published | 2017-06-02 |
URL | http://arxiv.org/abs/1706.00544v1 |
http://arxiv.org/pdf/1706.00544v1.pdf | |
PWC | https://paperswithcode.com/paper/bias-variance-tradeoff-of-graph-laplacian |
Repo | |
Framework | |
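The estimator whose regularization weight the analysis concerns is the standard graph-Laplacian-regularized least squares, which has a closed form; $L$ denotes the graph Laplacian, $y$ the noisy graph signal, and $\mu$ the parameter whose scaling law the paper derives.

```latex
% Standard graph-Laplacian-regularized estimator: the quadratic objective
% yields a closed-form solution, and the choice of mu drives the
% bias-variance tradeoff analyzed in the paper.
\hat{x} \;=\; \arg\min_{x}\; \lVert y - x \rVert_2^2 \;+\; \mu\, x^{\top} L x
\;=\; (I + \mu L)^{-1} y
```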
Fast k-means based on KNN Graph
Title | Fast k-means based on KNN Graph |
Authors | Cheng-Hao Deng, Wan-Lei Zhao |
Abstract | In the era of big data, k-means clustering has been widely adopted as a basic processing tool in various contexts. However, its computational cost can become prohibitively high when the data size and the cluster number are large. It is well known that the processing bottleneck of k-means lies in seeking the closest centroid in each iteration. In this paper, a novel solution to the scalability issue of k-means is presented. In the proposal, k-means is supported by an approximate k-nearest-neighbor graph. In each k-means iteration, a data sample is compared only to the clusters in which its nearest neighbors reside. Since the number of nearest neighbors considered is much smaller than k, the processing cost of this step becomes minor and independent of k. The processing bottleneck is therefore overcome. Notably, the k-nearest-neighbor graph is itself constructed by iteratively calling the fast k-means. Compared with existing fast k-means variants, the proposed algorithm achieves a speed-up of hundreds to thousands of times while maintaining high clustering quality. When tested on 10 million 512-dimensional data points, it takes only 5.2 hours to produce 1 million clusters; in contrast, traditional k-means would take 3 years to produce a clustering at the same scale. |
Tasks | |
Published | 2017-05-04 |
URL | http://arxiv.org/abs/1705.01813v1 |
http://arxiv.org/pdf/1705.01813v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-k-means-based-on-knn-graph |
Repo | |
Framework | |
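A minimal sketch of the restricted assignment step described above: each sample is compared only to the centroids of clusters that contain one of its (approximate) nearest neighbors, rather than to all k centroids. Brute-force kNN is used here purely for illustration; in the paper the graph is built by recursively calling the fast k-means itself, which is not shown.

```python
# Minimal sketch of k-means with the assignment step restricted to clusters
# that contain one of the sample's nearest neighbors. Brute-force kNN here is
# for illustration only; the paper builds the graph with fast k-means itself.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_kmeans(X, k=100, n_neighbors=10, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    knn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    _, nbrs = knn.kneighbors(X)                      # neighbor lists (incl. self)
    assign = rng.integers(k, size=n)                 # random initial assignment
    for _ in range(n_iter):
        centroids = np.stack([X[assign == c].mean(0) if np.any(assign == c)
                              else X[rng.integers(n)] for c in range(k)])
        for i in range(n):
            cand = np.unique(assign[nbrs[i]])        # clusters of my neighbors
            d = np.linalg.norm(X[i] - centroids[cand], axis=1)
            assign[i] = cand[np.argmin(d)]           # compare to few centroids only
    return assign, centroids

# Usage: labels, centers = knn_kmeans(np.random.rand(5000, 32).astype(np.float32))
```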