May 5, 2019

2936 words 14 mins read

Paper Group ANR 569

Generative Knowledge Transfer for Neural Language Models. Inferring Sparsity: Compressed Sensing using Generalized Restricted Boltzmann Machines. Fast Face-swap Using Convolutional Neural Networks. A Factorization Approach to Inertial Affine Structure from Motion. Efficient L1-Norm Principal-Component Analysis via Bit Flipping. Structured Sparse Convolutional Autoencoder …

Generative Knowledge Transfer for Neural Language Models

Title Generative Knowledge Transfer for Neural Language Models
Authors Sungho Shin, Kyuyeon Hwang, Wonyong Sung
Abstract In this paper, we propose a generative knowledge transfer technique that trains an RNN-based language model (student network) using text and output probabilities generated from a previously trained RNN (teacher network). The text generation can be conducted by either the teacher or the student network. We can also improve the performance by taking the ensemble of soft labels obtained from multiple teacher networks. This method can be used for privacy-conscious language model adaptation because no user data is directly used for training. In particular, when the soft labels of multiple devices are aggregated via a trusted third party, we can expect very strong privacy protection.
Tasks Language Modelling, Text Generation, Transfer Learning
Published 2016-08-14
URL http://arxiv.org/abs/1608.04077v3
PDF http://arxiv.org/pdf/1608.04077v3.pdf
PWC https://paperswithcode.com/paper/generative-knowledge-transfer-for-neural
Repo
Framework
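
The core of the transfer is ordinary knowledge distillation: the student is trained against the teacher's soft next-token distribution rather than hard labels. A minimal sketch of that loss (names and shapes are illustrative, not the authors' code):

```python
import numpy as np

def kl_soft_label_loss(student_logits, teacher_probs):
    """Distillation loss: cross-entropy of student predictions against
    the teacher's soft next-token distribution.

    student_logits: (batch, vocab) raw scores from the student RNN.
    teacher_probs:  (batch, vocab) soft labels from the teacher RNN.
    Minimizing this also minimizes KL(teacher || student), since the
    teacher-entropy term is constant w.r.t. the student.
    """
    z = student_logits - student_logits.max(axis=1, keepdims=True)  # stability
    log_student = z - np.log(np.exp(z).sum(axis=1, keepdims=True))  # log-softmax
    return -(teacher_probs * log_student).sum(axis=1).mean()
```

When several teachers contribute, `teacher_probs` would simply be the average of their output distributions, which is the aggregation step behind the abstract's privacy argument.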

Inferring Sparsity: Compressed Sensing using Generalized Restricted Boltzmann Machines

Title Inferring Sparsity: Compressed Sensing using Generalized Restricted Boltzmann Machines
Authors Eric W. Tramel, Andre Manoel, Francesco Caltagirone, Marylou Gabrié, Florent Krzakala
Abstract In this work, we consider compressed sensing reconstruction from $M$ measurements of $K$-sparse structured signals which do not possess a writable correlation model. Assuming that a generative statistical model, such as a Boltzmann machine, can be trained in an unsupervised manner on example signals, we demonstrate how this signal model can be used within a Bayesian framework of signal reconstruction. By deriving a message-passing inference for general distribution restricted Boltzmann machines, we are able to integrate these inferred signal models into approximate message passing for compressed sensing reconstruction. Finally, we show for the MNIST dataset that this approach can be very effective, even for $M < K$.
Tasks
Published 2016-06-13
URL http://arxiv.org/abs/1606.03956v1
PDF http://arxiv.org/pdf/1606.03956v1.pdf
PWC https://paperswithcode.com/paper/inferring-sparsity-compressed-sensing-using
Repo
Framework
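
The paper's solver replaces the separable sparsity prior of standard approximate message passing with an RBM prior inferred by message passing. Reproducing that derivation is beyond a short snippet; for orientation, here is the generic $\ell_1$ baseline it generalizes (plain ISTA, explicitly not the paper's RBM-prior AMP):

```python
import numpy as np

def ista(y, A, lam=0.1, step=None, iters=200):
    """Iterative soft-thresholding for min 0.5*||y - Ax||^2 + lam*||x||_1."""
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L with L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x - step * A.T @ (A @ x - y)           # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # L1 prox
    return x
```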

Fast Face-swap Using Convolutional Neural Networks

Title Fast Face-swap Using Convolutional Neural Networks
Authors Iryna Korshunova, Wenzhe Shi, Joni Dambre, Lucas Theis
Abstract We consider the problem of face swapping in images, where an input identity is transformed into a target identity while preserving pose, facial expression, and lighting. To perform this mapping, we use convolutional neural networks trained to capture the appearance of the target identity from an unstructured collection of his/her photographs. This approach is enabled by framing the face swapping problem in terms of style transfer, where the goal is to render an image in the style of another one. Building on recent advances in this area, we devise a new loss function that enables the network to produce highly photorealistic results. By combining neural networks with simple pre- and post-processing steps, we aim at making face swap work in real-time with no input from the user.
Tasks Face Swapping, Style Transfer
Published 2016-11-29
URL http://arxiv.org/abs/1611.09577v2
PDF http://arxiv.org/pdf/1611.09577v2.pdf
PWC https://paperswithcode.com/paper/fast-face-swap-using-convolutional-neural
Repo
Framework
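
The style-transfer framing means the generator is trained against a feature-space loss rather than pixels. A rough sketch of the content-plus-style objective this line of work builds on (single layer, Gram-matrix style term; the paper's actual multi-layer loss and extra terms, e.g. for lighting, are not reproduced here):

```python
import numpy as np

def gram(features):
    """Gram matrix of a (channels, H*W) feature map."""
    c, hw = features.shape
    return features @ features.T / (c * hw)

def face_swap_style_loss(gen_feats, target_feats, content_feats,
                         alpha=1.0, beta=1e3):
    """Content + style loss in the spirit of neural style transfer.

    gen_feats/target_feats/content_feats: (channels, H*W) activations of
    the generated image, the target identity, and the input face at one
    network layer. Weights alpha/beta are illustrative.
    """
    content = np.mean((gen_feats - content_feats) ** 2)
    style = np.mean((gram(gen_feats) - gram(target_feats)) ** 2)
    return alpha * content + beta * style
```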

A Factorization Approach to Inertial Affine Structure from Motion

Title A Factorization Approach to Inertial Affine Structure from Motion
Authors Roberto Tron
Abstract We consider the problem of reconstructing a 3-D scene from a moving camera with high frame rate using the affine projection model. This problem is traditionally known as Affine Structure from Motion (Affine SfM), and can be solved using an elegant low-rank factorization formulation. In this paper, we assume that an accelerometer and gyro are rigidly mounted with the camera, so that synchronized linear acceleration and angular velocity measurements are available together with the image measurements. We extend the standard Affine SfM algorithm to integrate these measurements through the use of image derivatives.
Tasks
Published 2016-08-09
URL http://arxiv.org/abs/1608.02680v1
PDF http://arxiv.org/pdf/1608.02680v1.pdf
PWC https://paperswithcode.com/paper/a-factorization-approach-to-inertial-affine
Repo
Framework
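
For readers unfamiliar with the baseline being extended: classic affine SfM stacks all tracked image points into a measurement matrix and factorizes it at rank 3. A sketch of that standard Tomasi-Kanade-style step (the paper's inertial extension via image derivatives is not shown):

```python
import numpy as np

def affine_sfm_factorization(W):
    """Rank-3 factorization for affine structure from motion.

    W: (2F, N) matrix stacking x- and y-image coordinates of N points
    tracked over F frames. Returns motion M (2F, 3) and shape S (3, N)
    up to an affine ambiguity, plus the per-row translation.
    """
    t = W.mean(axis=1, keepdims=True)     # translation = centroid per row
    U, s, Vt = np.linalg.svd(W - t, full_matrices=False)
    r = np.sqrt(s[:3])                    # split singular values evenly
    M = U[:, :3] * r                      # camera (motion) factor
    S = (Vt[:3, :].T * r).T               # 3-D shape factor
    return M, S, t                        # M @ S + t approximates W
```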

Efficient L1-Norm Principal-Component Analysis via Bit Flipping

Title Efficient L1-Norm Principal-Component Analysis via Bit Flipping
Authors Panos P. Markopoulos, Sandipan Kundu, Shubham Chamadia, Dimitris A. Pados
Abstract It was shown recently that the $K$ L1-norm principal components (L1-PCs) of a real-valued data matrix $\mathbf X \in \mathbb R^{D \times N}$ ($N$ data samples of $D$ dimensions) can be exactly calculated with cost $\mathcal{O}(2^{NK})$ or, when advantageous, $\mathcal{O}(N^{dK - K + 1})$ where $d=\mathrm{rank}(\mathbf X)$, $K<d$ [1],[2]. In applications where $\mathbf X$ is large (e.g., “big” data of large $N$ and/or “heavy” data of large $d$), these costs are prohibitive. In this work, we present a novel suboptimal algorithm for the calculation of the $K < d$ L1-PCs of $\mathbf X$ of cost $\mathcal{O}(ND\min\{N,D\} + N^2(K^4 + dK^2) + dNK^3)$, which is comparable to that of standard (L2-norm) PC analysis. Our theoretical and experimental studies show that the proposed algorithm calculates the exact optimal L1-PCs with high frequency and achieves higher value in the L1-PC optimization metric than any known alternative algorithm of comparable computational cost. The superiority of the calculated L1-PCs over standard L2-PCs (singular vectors) in characterizing potentially faulty data/measurements is demonstrated with experiments on data dimensionality reduction and disease diagnosis from genomic data.
Tasks Dimensionality Reduction
Published 2016-10-06
URL http://arxiv.org/abs/1610.01959v1
PDF http://arxiv.org/pdf/1610.01959v1.pdf
PWC https://paperswithcode.com/paper/efficient-l1-norm-principal-component
Repo
Framework
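
For $K=1$ the bit-flipping idea is easy to state: the L1-PC problem $\max_{\|q\|_2=1} \|\mathbf X^\top q\|_1$ is equivalent to $\max_{b \in \{\pm 1\}^N} \|\mathbf X b\|_2$, so one can greedily flip whichever bit most increases $\|\mathbf X b\|_2$. A naive single-component sketch (the paper evaluates flips far more efficiently and handles general $K$ jointly):

```python
import numpy as np

def l1_pca_bitflip(X, iters=100, seed=0):
    """Greedy bit-flipping for the first L1 principal component (K = 1).

    X: (D, N) data matrix. Starts from a random sign vector b and flips
    one bit at a time while ||X b||_2 improves; the L1-PC is then
    q = X b / ||X b||_2.
    """
    rng = np.random.default_rng(seed)
    N = X.shape[1]
    b = rng.choice([-1.0, 1.0], size=N)
    v = X @ b
    for _ in range(iters):
        # Flipping bit n changes X b to X b - 2 b_n x_n; score each flip.
        gains = np.array([np.sum((v - 2 * b[n] * X[:, n]) ** 2)
                          for n in range(N)])
        n = int(np.argmax(gains))
        if gains[n] <= np.sum(v ** 2):
            break                          # no single flip improves the metric
        b[n] = -b[n]
        v = X @ b
    q = v / np.linalg.norm(v)
    return q, b
```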

Structured Sparse Convolutional Autoencoder

Title Structured Sparse Convolutional Autoencoder
Authors Ehsan Hosseini-Asl
Abstract This paper aims to improve the feature learning in Convolutional Networks (Convnet) by capturing the structure of objects. A new sparsity function is imposed on the extracted feature maps to capture the structure and shape of the learned object, extracting interpretable features to improve prediction performance. The proposed algorithm is based on organizing the activations within and across feature maps by constraining the node activities through $\ell_{2}$ and $\ell_{1}$ normalization in a structured form.
Tasks
Published 2016-04-17
URL http://arxiv.org/abs/1604.04812v3
PDF http://arxiv.org/pdf/1604.04812v3.pdf
PWC https://paperswithcode.com/paper/structured-sparse-convolutional-autoencoder
Repo
Framework
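
A generic version of the structured penalty described above is the $\ell_{2,1}$ pattern: an $\ell_2$ pool within each feature map and an $\ell_1$ sum across maps, so entire maps switch off while the active maps keep their spatial structure. A sketch of that idea (not the paper's exact normalization):

```python
import numpy as np

def group_sparsity_penalty(fmaps):
    """l2,1-style structured sparsity over convolutional feature maps.

    fmaps: (num_maps, H, W) activations. Pool each map's activity with
    an l2 norm, then encourage sparsity across maps with an l1 sum.
    """
    per_map_l2 = np.sqrt((fmaps ** 2).sum(axis=(1, 2)))  # l2 within each map
    return per_map_l2.sum()                              # l1 across maps
```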

Surveillance Video Parsing with Single Frame Supervision

Title Surveillance Video Parsing with Single Frame Supervision
Authors Si Liu, Changhu Wang, Ruihe Qian, Han Yu, Renda Bao
Abstract Surveillance video parsing, which segments the video frames into several labels, e.g., face, pants, left-leg, has wide applications. However, pixel-wise annotation of all frames is tedious and inefficient. In this paper, we develop a Single frame Video Parsing (SVP) method which requires only one labeled frame per video in the training stage. To parse one particular frame, the video segment preceding the frame is jointly considered. SVP (1) roughly parses the frames within the video segment, (2) estimates the optical flow between frames and (3) fuses the rough parsing results warped by optical flow to produce the refined parsing result. The three components of SVP, namely frame parsing, optical flow estimation and temporal fusion, are integrated in an end-to-end manner. Experimental results on two surveillance video datasets show the superiority of SVP over state-of-the-art methods.
Tasks Optical Flow Estimation
Published 2016-11-29
URL http://arxiv.org/abs/1611.09587v1
PDF http://arxiv.org/pdf/1611.09587v1.pdf
PWC https://paperswithcode.com/paper/surveillance-video-parsing-with-single-frame
Repo
Framework
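
Step (3), temporal fusion, is the easiest to picture: warp each rough parsing onto the target frame using the estimated flow, then combine. SVP learns this fusion end-to-end; the sketch below uses plain averaging and nearest-neighbor warping for brevity (all names are illustrative):

```python
import numpy as np

def warp_labels(prob_map, flow):
    """Warp a per-pixel label probability map by a backward optical flow.

    prob_map: (H, W, L) class probabilities for a preceding frame.
    flow: (H, W, 2) displacements mapping target pixels to source pixels.
    Nearest-neighbor lookup for brevity; bilinear would be smoother.
    """
    H, W, _ = prob_map.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    return prob_map[src_y, src_x]

def fuse_parsings(prob_maps, flows):
    """Average the warped rough parsings, then take the argmax label."""
    fused = sum(warp_labels(p, f) for p, f in zip(prob_maps, flows))
    return np.argmax(fused / len(prob_maps), axis=-1)
```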

Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

Title Recursive Recurrent Nets with Attention Modeling for OCR in the Wild
Authors Chen-Yu Lee, Simon Osindero
Abstract We present recursive recurrent neural networks with attention modeling (R$^2$AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient and effective image feature extraction; (2) an implicitly learned character-level language model, embodied in a recurrent neural network which avoids the need to use N-grams; and (3) the use of a soft-attention mechanism, allowing the model to selectively exploit image features in a coordinated way, and allowing for end-to-end training within a standard backpropagation framework. We validate our method with state-of-the-art performance on challenging benchmark datasets: Street View Text, IIIT5k, ICDAR and Synth90k.
Tasks Language Modelling, Optical Character Recognition
Published 2016-03-09
URL http://arxiv.org/abs/1603.03101v1
PDF http://arxiv.org/pdf/1603.03101v1.pdf
PWC https://paperswithcode.com/paper/recursive-recurrent-nets-with-attention
Repo
Framework
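
Component (3) is standard additive soft attention: at each decoding step the RNN state scores every image-feature location, and a softmax over the scores yields a context vector for the character decoder. A minimal sketch with illustrative parameter names (not the paper's exact parameterization):

```python
import numpy as np

def soft_attention(hidden, image_feats, W_h, W_f, v):
    """One step of additive soft attention over image features.

    hidden: (d_h,) current RNN state; image_feats: (T, d_f) CNN feature
    vectors. W_h: (d_a, d_h), W_f: (d_a, d_f), v: (d_a,) are learned
    attention parameters. Returns attention weights and the context.
    """
    scores = np.tanh(image_feats @ W_f.T + W_h @ hidden) @ v   # (T,)
    scores = scores - scores.max()                             # stability
    weights = np.exp(scores) / np.exp(scores).sum()            # softmax
    context = weights @ image_feats                            # (d_f,)
    return weights, context
```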

The Opacity of Backbones

Title The Opacity of Backbones
Authors Lane A. Hemaspaandra, David E. Narváez
Abstract This paper approaches, using structural complexity theory, the question of whether there is a chasm between knowing an object exists and getting one’s hands on the object or its properties. In particular, we study the nontransparency of so-called backbones. A backbone of a boolean formula $F$ is a collection $S$ of its variables for which there is a unique partial assignment $a_S$ such that $F[a_S]$ is satisfiable [MZK+99,WGS03]. We show that, under the widely believed assumption that integer factoring is hard, there exist sets of boolean formulas that have obvious, nontrivial backbones yet finding the values, $a_S$, of those backbones is intractable. We also show that, under the same assumption, there exist sets of boolean formulas that obviously have large backbones yet producing such a backbone $S$ is intractable. Furthermore, we show that if integer factoring is not merely worst-case hard but is frequently hard, as is widely believed, then the frequency of hardness in our two results is not too much less than that frequency. These results hold more generally, namely, in the settings where, respectively, one’s assumption is that P $\neq$ NP $\cap$ coNP or that some problem in NP $\cap$ coNP is frequently hard.
Tasks
Published 2016-06-11
URL http://arxiv.org/abs/1606.03634v5
PDF http://arxiv.org/pdf/1606.03634v5.pdf
PWC https://paperswithcode.com/paper/the-opacity-of-backbones
Repo
Framework
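
To make the object of study concrete: a backbone variable is one that takes the same value in every satisfying assignment. A brute-force backbone finder for small CNF formulas (exponential by design; the paper's point is that, under standard assumptions, no general efficient procedure exists):

```python
from itertools import product

def backbone(clauses, n_vars):
    """Brute-force backbone of a CNF formula over variables 1..n_vars.

    clauses: list of clauses, each a list of nonzero ints (DIMACS style,
    -3 means 'not x3'). Returns the variables fixed to the same value in
    every satisfying assignment, or None if the formula is unsatisfiable.
    """
    models = []
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            models.append(bits)
    if not models:
        return None
    return {i + 1: models[0][i] for i in range(n_vars)
            if all(m[i] == models[0][i] for m in models)}

# (x1) and (x2 or x3): x1 is in the backbone, x2/x3 are not.
print(backbone([[1], [2, 3]], 3))   # {1: True}
```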

An Initial Seed Selection Algorithm for K-means Clustering of Georeferenced Data to Improve Replicability of Cluster Assignments for Mapping Application

Title An Initial Seed Selection Algorithm for K-means Clustering of Georeferenced Data to Improve Replicability of Cluster Assignments for Mapping Application
Authors Fouad Khan
Abstract K-means is one of the most widely used clustering algorithms in various disciplines, especially for large datasets. However the method is known to be highly sensitive to initial seed selection of cluster centers. K-means++ has been proposed to overcome this problem and has been shown to have better accuracy and computational efficiency than k-means. In many clustering problems, though (such as when classifying georeferenced data for mapping applications), standardization of the clustering methodology, specifically, the ability to arrive at the same cluster assignment for every run of the method, i.e., replicability of the methodology, may be of greater significance than any perceived measure of accuracy, especially when the solution is known to be non-unique, as in the case of k-means clustering. Here we propose a simple initial seed selection algorithm for k-means clustering along one attribute that draws initial cluster boundaries along the ‘deepest valleys’, or greatest gaps, in the dataset. Thus, it incorporates a measure to maximize distance between consecutive cluster centers which augments the conventional k-means optimization for minimum distance between cluster center and cluster members. Unlike existing initialization methods, no additional parameters or degrees of freedom are introduced to the clustering algorithm. This improves the replicability of cluster assignments by as much as 100% over k-means and k-means++, virtually reducing the variance over different runs to zero, without introducing any additional parameters to the clustering process. Further, the proposed method is more computationally efficient than k-means++ and in some cases, more accurate.
Tasks
Published 2016-04-17
URL http://arxiv.org/abs/1604.04893v1
PDF http://arxiv.org/pdf/1604.04893v1.pdf
PWC https://paperswithcode.com/paper/an-initial-seed-selection-algorithm-for-k
Repo
Framework
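
The construction is simple enough to state in a few lines: sort the attribute, cut it at the $k-1$ widest gaps between consecutive values, and seed each segment with its mean. A sketch of that idea as read from the abstract (not the author's code):

```python
import numpy as np

def deepest_valley_seeds(values, k):
    """Deterministic 1-D k-means seeding via the k-1 largest gaps.

    Being deterministic, every run yields the same seeds, hence the
    same final cluster assignment.
    """
    x = np.sort(np.asarray(values, dtype=float))
    if k < 2:
        return np.array([x.mean()])
    gaps = np.diff(x)                                  # gap after x[i]
    cuts = np.sort(np.argsort(gaps)[-(k - 1):]) + 1    # segment starts
    segments = np.split(x, cuts)
    return np.array([seg.mean() for seg in segments])

print(deepest_valley_seeds([1, 2, 3, 10, 11, 12, 30, 31], k=3))
# -> [ 2.  11.  30.5]
```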

Video Depth-From-Defocus

Title Video Depth-From-Defocus
Authors Hyeongwoo Kim, Christian Richardt, Christian Theobalt
Abstract Many compelling video post-processing effects, in particular aesthetic focus editing and refocusing effects, are feasible if per-frame depth information is available. Existing computational methods to capture RGB and depth either purposefully modify the optics (coded aperture, light-field imaging), or employ active RGB-D cameras. Since these methods are less practical for users with normal cameras, we present an algorithm to capture all-in-focus RGB-D video of dynamic scenes with an unmodified commodity video camera. Our algorithm turns the often unwanted defocus blur into a valuable signal. The input to our method is a video in which the focus plane is continuously moving back and forth during capture, and thus defocus blur is provoked and strongly visible. This can be achieved by manually turning the focus ring of the lens during recording. The core algorithmic ingredient is a new video-based depth-from-defocus algorithm that computes space-time-coherent depth maps, deblurred all-in-focus video, and the focus distance for each frame. We extensively evaluate our approach, and show that it enables compelling video post-processing effects, such as different types of refocusing.
Tasks
Published 2016-10-12
URL http://arxiv.org/abs/1610.03782v1
PDF http://arxiv.org/pdf/1610.03782v1.pdf
PWC https://paperswithcode.com/paper/video-depth-from-defocus
Repo
Framework
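
For intuition about why a focal sweep carries depth, here is a much cruder depth-from-focus baseline, explicitly not the paper's space-time-coherent algorithm: per pixel, pick the focus distance of the frame where the pixel looks sharpest.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def depth_from_focal_sweep(frames, focus_dists, win=9):
    """Naive depth-from-focus over a focal sweep.

    frames: (F, H, W) grayscale video captured while racking focus;
    focus_dists: (F,) focus distance of each frame. Sharpness is a
    locally averaged squared Laplacian; each pixel gets the focus
    distance of its sharpest frame. The paper instead solves jointly
    for coherent depth, an all-in-focus video, and per-frame focus.
    """
    sharpness = np.stack([uniform_filter(laplace(f) ** 2, size=win)
                          for f in frames])
    return np.asarray(focus_dists)[np.argmax(sharpness, axis=0)]
```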

Scalable image coding based on epitomes

Title Scalable image coding based on epitomes
Authors Martin Alain, Christine Guillemot, Dominique Thoreau, Philippe Guillotel
Abstract In this paper, we propose a novel scheme for scalable image coding based on the concept of epitome. An epitome can be seen as a factorized representation of an image. Focusing on spatial scalability, the enhancement layer of the proposed scheme contains only the epitome of the input image. The pixels of the enhancement layer not contained in the epitome are then restored using two approaches inspired from local learning-based super-resolution methods. In the first method, a locally linear embedding model is learned on base layer patches and then applied to the corresponding epitome patches to reconstruct the enhancement layer. The second approach learns linear mappings between pairs of co-located base layer and epitome patches. Experiments have shown that significant improvement of the rate-distortion performances can be achieved compared to an SHVC reference.
Tasks Super-Resolution
Published 2016-06-28
URL http://arxiv.org/abs/1606.08694v1
PDF http://arxiv.org/pdf/1606.08694v1.pdf
PWC https://paperswithcode.com/paper/scalable-image-coding-based-on-epitomes
Repo
Framework
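
The second approach reduces to a regularized least-squares fit between co-located patch pairs. A sketch under that reading of the abstract (the names and the ridge regularizer are assumptions):

```python
import numpy as np

def learn_patch_mapping(base_patches, enh_patches, reg=1e-3):
    """Least-squares linear mapping from base-layer to enhancement patches.

    base_patches, enh_patches: (d, n) matrices of n co-located vectorized
    patch pairs. Returns M minimizing ||E - M B||_F^2 + reg*||M||_F^2.
    """
    B, E = base_patches, enh_patches
    d = B.shape[0]
    return E @ B.T @ np.linalg.inv(B @ B.T + reg * np.eye(d))

# At decoding time, enhancement-layer pixels not covered by the epitome
# would be predicted as M @ base_patch.
```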

Estimation of low rank density matrices by Pauli measurements

Title Estimation of low rank density matrices by Pauli measurements
Authors Dong Xia
Abstract Density matrices are positive semi-definite Hermitian matrices with unit trace that describe the states of quantum systems. Many quantum systems of physical interest can be represented as high-dimensional low rank density matrices. A popular problem in {\it quantum state tomography} (QST) is to estimate the unknown low rank density matrix of a quantum system by conducting Pauli measurements. Our main contribution is twofold. First, we establish the minimax lower bounds in Schatten $p$-norms with $1\leq p\leq +\infty$ for low rank density matrices estimation by Pauli measurements. In our previous paper, these minimax lower bounds are proved under the trace regression model with Gaussian noise and the noise is assumed to have common variance. In this paper, we prove these bounds under the Binomial observation model which meets the actual model in QST. Second, we study the Dantzig estimator (DE) for estimating the unknown low rank density matrix under the Binomial observation model by using Pauli measurements. In our previous papers, we studied the least squares estimator and the projection estimator, where we proved the optimal convergence rates for the least squares estimator in Schatten $p$-norms with $1\leq p\leq 2$ and, under a stronger condition, the optimal convergence rates for the projection estimator in Schatten $p$-norms with $1\leq p\leq +\infty$. In this paper, we show that the results of these two distinct estimators can be simultaneously obtained by the Dantzig estimator. Moreover, better convergence rates in Schatten norm distances can be proved for Dantzig estimator under conditions weaker than those needed in previous papers. When the objective function of DE is replaced by the negative von Neumann entropy, we obtain sharp convergence rate in Kullback-Leibler divergence.
Tasks Quantum State Tomography
Published 2016-10-16
URL http://arxiv.org/abs/1610.04811v2
PDF http://arxiv.org/pdf/1610.04811v2.pdf
PWC https://paperswithcode.com/paper/estimation-of-low-rank-density-matrices-by
Repo
Framework
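
For orientation, matrix Dantzig-type estimators in trace regression generally take the following schematic form (the paper's precise definition under the Binomial Pauli-measurement model may differ):

$$\hat\rho \in \operatorname*{arg\,min}_{S = S^\dagger} \ \|S\|_1 \quad \text{subject to} \quad \Big\| \frac{1}{n} \sum_{i=1}^{n} \big( Y_i - \mathrm{tr}(X_i S) \big) X_i \Big\| \le \lambda,$$

where $\|\cdot\|_1$ is the nuclear (Schatten-1) norm, $\|\cdot\|$ the operator norm, $X_i$ the Pauli measurement matrices, $Y_i$ the measurement outcomes, and $\lambda$ a tuning parameter calibrated to the noise level.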

Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation

Title Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
Authors Yacine Jernite, Anna Choromanska, David Sontag
Abstract We consider multi-class classification where the predictor has a hierarchical structure that allows for a very large number of labels both at train and test time. The predictive power of such models can heavily depend on the structure of the tree, and although past work showed how to learn the tree structure, it expected that the feature vectors remained static. We provide a novel algorithm to simultaneously perform representation learning for the input data and learning of the hierarchical predictor. Our approach optimizes an objective function which favors balanced and easily-separable multi-way node partitions. We theoretically analyze this objective, showing that it gives rise to a boosting-style property and a bound on classification error. We next show how to extend the algorithm to conditional density estimation. We empirically validate both variants of the algorithm on text classification and language modeling, respectively, and show that they compare favorably to common baselines in terms of accuracy and running time.
Tasks Density Estimation, Representation Learning, Text Classification
Published 2016-10-14
URL http://arxiv.org/abs/1610.04658v2
PDF http://arxiv.org/pdf/1610.04658v2.pdf
PWC https://paperswithcode.com/paper/simultaneous-learning-of-trees-and
Repo
Framework

Asynchronous Stochastic Proximal Optimization Algorithms with Variance Reduction

Title Asynchronous Stochastic Proximal Optimization Algorithms with Variance Reduction
Authors Qi Meng, Wei Chen, Jingcheng Yu, Taifeng Wang, Zhi-Ming Ma, Tie-Yan Liu
Abstract Regularized empirical risk minimization (R-ERM) is an important branch of machine learning, since it constrains the capacity of the hypothesis space and guarantees the generalization ability of the learning algorithm. Two classic proximal optimization algorithms, i.e., proximal stochastic gradient descent (ProxSGD) and proximal stochastic coordinate descent (ProxSCD), have been widely used to solve the R-ERM problem. Recently, the variance reduction technique was proposed to improve ProxSGD and ProxSCD, and the corresponding ProxSVRG and ProxSVRCD have better convergence rates. These proximal algorithms with variance reduction have also achieved great success in applications at small and moderate scales. However, in order to solve large-scale R-ERM problems and make more practical impacts, parallel versions of these algorithms are sorely needed. In this paper, we propose asynchronous ProxSVRG (Async-ProxSVRG) and asynchronous ProxSVRCD (Async-ProxSVRCD) algorithms, and prove that Async-ProxSVRG can achieve near linear speedup when the training data is sparse, while Async-ProxSVRCD can achieve near linear speedup regardless of the sparsity condition, as long as the number of block partitions is appropriately set. We have conducted experiments on a regularized logistic regression task. The results verified our theoretical findings and demonstrated the practical efficiency of the asynchronous stochastic proximal algorithms with variance reduction.
Tasks
Published 2016-09-27
URL http://arxiv.org/abs/1609.08435v1
PDF http://arxiv.org/pdf/1609.08435v1.pdf
PWC https://paperswithcode.com/paper/asynchronous-stochastic-proximal-optimization
Repo
Framework
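
The serial algorithm being parallelized is standard ProxSVRG: an outer loop computes a full gradient at a reference point, and an inner loop takes variance-reduced stochastic steps followed by the regularizer's proximal operator. A sketch for the $\ell_1$-regularized case (the asynchronous variants run the inner loop across workers with delayed reads of $w$):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_svrg(grad_i, n, w0, eta=0.1, lam=0.01, epochs=20, m=None, seed=0):
    """Serial ProxSVRG for min (1/n) sum_i f_i(w) + lam * ||w||_1.

    grad_i(i, w): gradient of the i-th loss at w. This is the sequential
    algorithm; the paper's contribution is its asynchronous analysis.
    """
    rng = np.random.default_rng(seed)
    m = m or 2 * n                        # inner-loop length
    w_ref = w0.copy()
    for _ in range(epochs):
        mu = np.mean([grad_i(i, w_ref) for i in range(n)], axis=0)  # full grad
        w = w_ref.copy()
        for _ in range(m):
            i = rng.integers(n)
            v = grad_i(i, w) - grad_i(i, w_ref) + mu   # variance-reduced grad
            w = soft_threshold(w - eta * v, eta * lam)  # proximal step
        w_ref = w
    return w_ref
```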