Paper Group ANR 72
A Boosting Method to Face Image Super-resolution
Title | A Boosting Method to Face Image Super-resolution |
Authors | Shanjun Mao, Da Zhou, Yiping Zhang, Zhihong Zhang, Jingjing Cao |
Abstract | Recently, sparse representation has gained great success in face image super-resolution. The conventional sparsity-based methods enforce sparse coding on face image patches, and the representation fidelity is measured by the $\ell_{2}$-norm. Such a sparse coding model regularizes all facial patches equally, which ignores the distinct natures of different facial patches for image reconstruction. In this paper, we propose a new weighted-patch super-resolution method based on AdaBoost. Specifically, in each iteration of the AdaBoost operation, each facial patch is weighted automatically according to the performance of the model on it, so as to highlight those patches that are more critical for improving the reconstruction power in the next step. In this way, through the AdaBoost training procedure, we can focus more on the patches (face regions) with richer information. Various experimental results on a standard face database show that our proposed method outperforms state-of-the-art methods in terms of both objective metrics and visual quality. |
Tasks | Image Reconstruction, Image Super-Resolution, Super-Resolution |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.01805v3 |
http://arxiv.org/pdf/1609.01805v3.pdf | |
PWC | https://paperswithcode.com/paper/a-boosting-method-to-face-image-super |
Repo | |
Framework | |
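The abstract above describes reweighting each facial patch according to how well the current model handles it, but it does not give the update rule. The following is only a minimal sketch of one boosting-style reweighting round. It assumes, as in classical AdaBoost, that patches with a larger reconstruction error receive a larger weight; the function name `reweight_patches`, the exponential update, and the step size `eta` are hypothetical and are not taken from the paper.

```python
import math


def reweight_patches(weights, errors, eta=1.0):
    """One boosting-style round: raise the weight of poorly reconstructed patches.

    weights: current non-negative patch weights (assumed to sum to 1).
    errors:  per-patch reconstruction errors from the current model.
    eta:     hypothetical step size controlling how fast the weights shift.
    """
    max_err = max(errors)
    if max_err == 0:
        return list(weights)  # perfect reconstruction: nothing to emphasize
    # Scale errors to [0, 1] so the update does not depend on the error units.
    scaled = [e / max_err for e in errors]
    raised = [w * math.exp(eta * s) for w, s in zip(weights, scaled)]
    total = sum(raised)
    return [r / total for r in raised]  # renormalize to a distribution


# Four patches with uniform starting weights and made-up errors.
weights = [0.25, 0.25, 0.25, 0.25]
errors = [0.1, 0.4, 0.2, 0.4]
new_weights = reweight_patches(weights, errors)
```

After one round the two patches with the largest error carry the most weight, so a subsequent training step would concentrate on them.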
An Empirical Study of Continuous Connectivity Degree Sequence Equivalents
Title | An Empirical Study of Continuous Connectivity Degree Sequence Equivalents |
Authors | Daniel Moyer, Boris A. Gutman, Joshua Faskowitz, Neda Jahanshad, Paul M. Thompson |
Abstract | In the present work we demonstrate the use of a parcellation-free connectivity model based on Poisson point processes. For each subject, this model produces a continuous bivariate intensity function that represents, for every possible pair of points, the relative rate at which we observe tracts terminating at those points. We fit this model to explore degree sequence equivalents for spatial continuum graphs, and to investigate the local differences between estimated intensity functions for two different tractography methods. This is a companion paper to Moyer et al. (2016), where the model was originally defined. |
Tasks | Point Processes |
Published | 2016-11-18 |
URL | http://arxiv.org/abs/1611.06197v1 |
http://arxiv.org/pdf/1611.06197v1.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-continuous-connectivity |
Repo | |
Framework | |
Deep Quantization: Encoding Convolutional Activations with Deep Generative Model
Title | Deep Quantization: Encoding Convolutional Activations with Deep Generative Model |
Authors | Zhaofan Qiu, Ting Yao, Tao Mei |
Abstract | Deep convolutional neural networks (CNNs) have proven highly effective for visual recognition, where learning a universal representation from the activations of a convolutional layer is a fundamental problem. In this paper, we present Fisher Vector encoding with Variational Auto-Encoder (FV-VAE), a novel deep architecture that quantizes the local activations of a convolutional layer in a deep generative model, by training them in an end-to-end manner. To incorporate the FV encoding strategy into deep generative models, we introduce a Variational Auto-Encoder model, which steers variational inference and learning in a neural network that can be straightforwardly optimized using the standard stochastic gradient method. Different from the FV characterized by conventional generative models (e.g., Gaussian Mixture Model), which parsimoniously fit a discrete mixture model to the data distribution, the proposed FV-VAE is more flexible in representing the natural property of data for better generalization. Extensive experiments are conducted on three public datasets, i.e., UCF101, ActivityNet, and CUB-200-2011, in the context of video action recognition and fine-grained image classification, respectively. Superior results are reported when compared to state-of-the-art representations. Most remarkably, our proposed FV-VAE achieves the best published accuracy to date of 94.2% on UCF101. |
Tasks | Fine-Grained Image Classification, Image Classification, Quantization, Temporal Action Localization |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09502v1 |
http://arxiv.org/pdf/1611.09502v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-quantization-encoding-convolutional |
Repo | |
Framework | |
Gaze Embeddings for Zero-Shot Image Classification
Title | Gaze Embeddings for Zero-Shot Image Classification |
Authors | Nour Karessli, Zeynep Akata, Bernt Schiele, Andreas Bulling |
Abstract | Zero-shot image classification using auxiliary information, such as attributes describing discriminative object properties, requires time-consuming annotation by domain experts. We instead propose a method that relies on human gaze as auxiliary information, exploiting that even non-expert users have a natural ability to judge class membership. We present a data collection paradigm that involves a discrimination task to increase the information content obtained from gaze data. Our method extracts discriminative descriptors from the data and learns a compatibility function between image and gaze using three novel gaze embeddings: Gaze Histograms (GH), Gaze Features with Grid (GFG) and Gaze Features with Sequence (GFS). We introduce two new gaze-annotated datasets for fine-grained image classification and show that human gaze data is indeed class discriminative, provides a competitive alternative to expert-annotated attributes, and outperforms other baselines for zero-shot image classification. |
Tasks | Fine-Grained Image Classification, Image Classification |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.09309v2 |
http://arxiv.org/pdf/1611.09309v2.pdf | |
PWC | https://paperswithcode.com/paper/gaze-embeddings-for-zero-shot-image |
Repo | |
Framework | |
Generalized BackPropagation, Étude De Cas: Orthogonality
Title | Generalized BackPropagation, Étude De Cas: Orthogonality |
Authors | Mehrtash Harandi, Basura Fernando |
Abstract | This paper introduces an extension of the backpropagation algorithm that enables us to have layers with constrained weights in a deep network. In particular, we make use of the Riemannian geometry and optimization techniques on matrix manifolds to step outside of normal practice in training deep networks, equipping the network with structures such as orthogonality or positive definiteness. Based on our development, we make another contribution by introducing the Stiefel layer, a layer with orthogonal weights. Among various applications, Stiefel layers can be used to design orthogonal filter banks, perform dimensionality reduction and feature extraction. We demonstrate the benefits of having orthogonality in deep networks through a broad set of experiments, ranging from unsupervised feature learning to fine-grained image classification. |
Tasks | Dimensionality Reduction, Fine-Grained Image Classification, Image Classification |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05927v1 |
http://arxiv.org/pdf/1611.05927v1.pdf | |
PWC | https://paperswithcode.com/paper/generalized-backpropagation-etude-de-cas |
Repo | |
Framework | |
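The abstract above mentions layers with orthogonal weights trained via optimization on matrix manifolds, without stating the update itself. As a rough illustration of the general idea, the sketch below takes one Riemannian gradient step on the Stiefel manifold (matrices with orthonormal columns) using a tangent-space projection followed by a QR retraction. This is a common textbook choice, assumed here for illustration; it is not confirmed to be the authors' update, and `stiefel_step` and `lr` are hypothetical names.

```python
import numpy as np


def stiefel_step(w, euclid_grad, lr=0.1):
    """One gradient step that keeps the columns of w orthonormal.

    w:           (n, p) matrix with w.T @ w == I.
    euclid_grad: ordinary (Euclidean) gradient of the loss with respect to w.
    lr:          hypothetical learning rate.
    """
    # Project the Euclidean gradient onto the tangent space at w.
    sym = (w.T @ euclid_grad + euclid_grad.T @ w) / 2.0
    riem_grad = euclid_grad - w @ sym
    # Retract back onto the manifold with a reduced QR decomposition.
    q, r = np.linalg.qr(w - lr * riem_grad)
    # Fix column signs so the retraction is uniquely defined.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs


rng = np.random.default_rng(0)
w0, _ = np.linalg.qr(rng.standard_normal((5, 3)))  # orthonormal starting point
grad = rng.standard_normal((5, 3))                 # stand-in for a backprop gradient
w1 = stiefel_step(w0, grad)
```

The updated matrix `w1` still satisfies `w1.T @ w1 ≈ I`, which is the constraint a layer with orthogonal weights has to maintain after every update.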
Distributed Information-Theoretic Clustering
Title | Distributed Information-Theoretic Clustering |
Authors | Georg Pichler, Pablo Piantanida, Gerald Matz |
Abstract | We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sequences $X^n$ and $Y^n$, respectively. The goal is to find rate-limited encodings $f(x^n)$ and $g(y^n)$ that maximize the mutual information $I(f(X^n); g(Y^n))/n$. We discuss connections of this problem with hypothesis testing against independence, pattern recognition, and the information bottleneck method. Improving previous cardinality bounds for the inner and outer bounds allows us to thoroughly study the special case of a binary symmetric source and to quantify the gap between the inner and the outer bound in this special case. Furthermore, we investigate a multiple description (MD) extension of the CEO problem with mutual information constraint. Surprisingly, this MD-CEO problem permits a tight single-letter characterization of the achievable region. |
Tasks | |
Published | 2016-02-15 |
URL | https://arxiv.org/abs/1602.04605v5 |
https://arxiv.org/pdf/1602.04605v5.pdf | |
PWC | https://paperswithcode.com/paper/distributed-information-theoretic-clustering |
Repo | |
Framework | |
Top-down Neural Attention by Excitation Backprop
Title | Top-down Neural Attention by Excitation Backprop |
Authors | Jianming Zhang, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Stan Sclaroff |
Abstract | We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier for generating task-specific attention maps. Inspired by a top-down human visual attention model, we propose a new backpropagation scheme, called Excitation Backprop, to pass along top-down signals downwards in the network hierarchy via a probabilistic Winner-Take-All process. Furthermore, we introduce the concept of contrastive attention to make the top-down attention maps more discriminative. In experiments, we demonstrate the accuracy and generalizability of our method in weakly supervised localization tasks on the MS COCO, PASCAL VOC07 and ImageNet datasets. The usefulness of our method is further validated in the text-to-region association task. On the Flickr30k Entities dataset, we achieve promising performance in phrase localization by leveraging the top-down attention of a CNN model that has been trained on weakly labeled web images. |
Tasks | |
Published | 2016-08-01 |
URL | http://arxiv.org/abs/1608.00507v1 |
http://arxiv.org/pdf/1608.00507v1.pdf | |
PWC | https://paperswithcode.com/paper/top-down-neural-attention-by-excitation |
Repo | |
Framework | |
Verifier Theory and Unverifiability
Title | Verifier Theory and Unverifiability |
Authors | Roman V. Yampolskiy |
Abstract | Despite significant developments in Proof Theory, surprisingly little attention has been devoted to the concept of proof verifier. In particular, the mathematical community may be interested in studying different types of proof verifiers (people, programs, oracles, communities, superintelligences) as mathematical objects. Such an effort could reveal their properties, their powers and limitations (particularly in human mathematicians), minimum and maximum complexity, as well as self-verification and self-reference issues. We propose an initial classification system for verifiers and provide some rudimentary analysis of solved and open problems in this important domain. Our main contribution is a formal introduction of the notion of unverifiability, for which the paper could serve as a general citation in domains of theorem proving, as well as software and AI verification. |
Tasks | Automated Theorem Proving |
Published | 2016-09-01 |
URL | http://arxiv.org/abs/1609.00331v3 |
http://arxiv.org/pdf/1609.00331v3.pdf | |
PWC | https://paperswithcode.com/paper/verifier-theory-and-unverifiability |
Repo | |
Framework | |
Blind Image Denoising via Dependent Dirichlet Process Tree
Title | Blind Image Denoising via Dependent Dirichlet Process Tree |
Authors | Fengyuan Zhu, Guangyong Chen, Jianye Hao, Pheng-Ann Heng |
Abstract | Most existing image denoising approaches assume the noise to be homogeneous white Gaussian distributed with known intensity. However, in real noisy images, the noise models are usually unknown beforehand and can be much more complex. This paper addresses this problem and proposes a novel blind image denoising algorithm to recover the clean image from a noisy one with an unknown noise model. To model the empirical noise of an image, our method introduces a mixture of Gaussian distributions, which is flexible enough to approximate different continuous distributions. The problem of blind image denoising is reformulated as a learning problem. The procedure is to first build a two-layer structural model for noisy patches and consider the clean ones as latent variables. To control the complexity of the noisy patch model, this work proposes a novel Bayesian nonparametric prior called “Dependent Dirichlet Process Tree” to build the model. Then, this study derives a variational inference algorithm to estimate model parameters and recover clean patches. We apply our method to synthetic and real noisy images with different noise models. Compared with previous approaches, ours achieves better performance. The experimental results indicate the efficiency of the proposed algorithm in coping with practical image denoising tasks. |
Tasks | Denoising, Image Denoising |
Published | 2016-01-13 |
URL | http://arxiv.org/abs/1601.03117v1 |
http://arxiv.org/pdf/1601.03117v1.pdf | |
PWC | https://paperswithcode.com/paper/blind-image-denoising-via-dependent-dirichlet |
Repo | |
Framework | |
Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection
Title | Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection |
Authors | Xiang Wang, Huimin Ma, Xiaozhi Chen, Shaodi You |
Abstract | In this paper, we propose a novel edge preserving and multi-scale contextual neural network for salient object detection. The proposed framework aims to address two limitations of the existing CNN-based methods. First, region-based CNN methods lack sufficient context to accurately locate salient objects, since they deal with each region independently. Second, pixel-based CNN methods suffer from blurry boundaries due to the presence of convolutional and pooling layers. Motivated by these, we first propose an end-to-end edge-preserved neural network based on the Fast R-CNN framework (named RegionNet) to efficiently generate saliency maps with sharp object boundaries. Later, to further improve it, multi-scale spatial context is attached to RegionNet to consider the relationship between regions and the global scenes. Furthermore, our method can be generally applied to RGB-D saliency detection by depth refinement. The proposed framework achieves both clear detection boundaries and multi-scale contextual robustness simultaneously for the first time, and thus achieves an optimized performance. Experiments on six RGB and two RGB-D benchmark datasets demonstrate that the proposed method achieves state-of-the-art performance. |
Tasks | Object Detection, Saliency Detection, Salient Object Detection |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.08029v2 |
http://arxiv.org/pdf/1608.08029v2.pdf | |
PWC | https://paperswithcode.com/paper/edge-preserving-and-multi-scale-contextual |
Repo | |
Framework | |
Linear dynamical neural population models through nonlinear embeddings
Title | Linear dynamical neural population models through nonlinear embeddings |
Authors | Yuanjun Gao, Evan Archer, Liam Paninski, John P. Cunningham |
Abstract | A body of recent work in modeling neural activity focuses on recovering low-dimensional latent features that capture the statistical structure of large-scale neural populations. Most such approaches have focused on linear generative models, where inference is computationally tractable. Here, we propose fLDS, a general class of nonlinear generative models that permits the firing rate of each neuron to vary as an arbitrary smooth function of a latent, linear dynamical state. This extra flexibility allows the model to capture a richer set of neural variability than a purely linear model, but retains an easily visualizable low-dimensional latent space. To fit this class of non-conjugate models we propose a variational inference scheme, along with a novel approximate posterior capable of capturing rich temporal correlations. We show that our techniques permit inference in a wide class of generative models. We also show in application to two neural datasets that, compared to state-of-the-art neural population models, fLDS captures a much larger proportion of neural variability with a small number of latent dimensions, providing superior predictive performance and interpretability. |
Tasks | |
Published | 2016-05-26 |
URL | http://arxiv.org/abs/1605.08454v2 |
http://arxiv.org/pdf/1605.08454v2.pdf | |
PWC | https://paperswithcode.com/paper/linear-dynamical-neural-population-models |
Repo | |
Framework | |
Faster Projection-free Convex Optimization over the Spectrahedron
Title | Faster Projection-free Convex Optimization over the Spectrahedron |
Authors | Dan Garber |
Abstract | Minimizing a convex function over the spectrahedron, i.e., the set of all positive semidefinite matrices with unit trace, is an important optimization task with many applications in optimization, machine learning, and signal processing. It is also notoriously difficult to solve at large scale, since standard techniques require expensive matrix decompositions. An alternative is the conditional gradient (CG) method (aka the Frank-Wolfe algorithm), which has regained much interest in recent years, mostly due to its application to this specific setting. The key benefit of the CG method is that it avoids expensive matrix decompositions altogether and simply requires a single eigenvector computation per iteration, which is much more efficient. On the downside, the CG method, in general, converges at an inferior rate. The error for minimizing a $\beta$-smooth function after $t$ iterations scales like $\beta/t$. This convergence rate does not improve even if the function is also strongly convex. In this work we present a modification of the CG method tailored for convex optimization over the spectrahedron. The per-iteration complexity of the method is essentially identical to that of the standard CG method: only a single eigenvector computation is required. For minimizing an $\alpha$-strongly convex and $\beta$-smooth function, the expected approximation error of the method after $t$ iterations is: $$O\left(\min\left\{\frac{\beta}{t},\ \left(\frac{\beta\sqrt{\textrm{rank}(\textbf{X}^*)}}{\alpha^{1/4}t}\right)^{4/3},\ \left(\frac{\beta}{\sqrt{\alpha}\lambda_{\min}(\textbf{X}^*)t}\right)^{2}\right\}\right),$$ where $\textbf{X}^*$ is the optimal solution. To the best of our knowledge, this is the first result that attains provably faster convergence rates for a CG variant for optimization over the spectrahedron. We also present encouraging preliminary empirical results. |
Tasks | |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06203v1 |
http://arxiv.org/pdf/1605.06203v1.pdf | |
PWC | https://paperswithcode.com/paper/faster-projection-free-convex-optimization |
Repo | |
Framework | |
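For orientation, the standard conditional gradient (Frank-Wolfe) baseline that the abstract above refers to can be sketched as follows. This is the textbook method, not the paper's modified variant: at each iteration the linear subproblem over the spectrahedron is solved by the eigenvector of the gradient with the smallest eigenvalue. The quadratic test objective, the name `frank_wolfe_spectrahedron`, and the step size $2/(t+2)$ are illustrative assumptions.

```python
import numpy as np


def frank_wolfe_spectrahedron(grad, x0, num_iters=200):
    """Standard Frank-Wolfe over {X : X is PSD, trace(X) = 1}.

    grad: function returning the gradient matrix of the objective at X.
    x0:   feasible starting point (PSD with unit trace).
    """
    x = x0.copy()
    for t in range(num_iters):
        g = grad(x)
        g = (g + g.T) / 2.0  # symmetrize against round-off
        # Linear minimization oracle: v v^T for the eigenvector v with the
        # smallest eigenvalue. A full eigh is used for brevity; at scale one
        # would compute only this single eigenvector with an iterative solver.
        _, vecs = np.linalg.eigh(g)
        v = vecs[:, 0]
        step = 2.0 / (t + 2.0)  # classical step-size schedule
        x = (1.0 - step) * x + step * np.outer(v, v)
    return x


# Toy objective (an assumption): f(X) = 0.5 * ||X - target||_F^2.
target = np.diag([0.7, 0.2, 0.1])


def objective(x):
    return 0.5 * np.sum((x - target) ** 2)


x0 = np.eye(3) / 3.0
x_final = frank_wolfe_spectrahedron(lambda x: x - target, x0)
```

Every iterate is a convex combination of unit-trace PSD matrices, so feasibility is preserved without any projection step, which is the property that makes the method attractive in this setting.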
Unsupervised Classification in Hyperspectral Imagery with Nonlocal Total Variation and Primal-Dual Hybrid Gradient Algorithm
Title | Unsupervised Classification in Hyperspectral Imagery with Nonlocal Total Variation and Primal-Dual Hybrid Gradient Algorithm |
Authors | Wei Zhu, Victoria Chayes, Alexandre Tiard, Stephanie Sanchez, Devin Dahlberg, Andrea L. Bertozzi, Stanley Osher, Dominique Zosso, Da Kuang |
Abstract | In this paper, a graph-based nonlocal total variation method (NLTV) is proposed for unsupervised classification of hyperspectral images (HSI). The variational problem is solved by the primal-dual hybrid gradient (PDHG) algorithm. By squaring the labeling function and using a stable simplex clustering routine, an unsupervised clustering method with random initialization can be implemented. The effectiveness of this proposed algorithm is illustrated on both synthetic and real-world HSI, and numerical results show that the proposed algorithm outperforms other standard unsupervised clustering methods such as spherical K-means, nonnegative matrix factorization (NMF), and the graph-based Merriman-Bence-Osher (MBO) scheme. |
Tasks | Classification Of Hyperspectral Images |
Published | 2016-04-27 |
URL | http://arxiv.org/abs/1604.08182v2 |
http://arxiv.org/pdf/1604.08182v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-classification-in-hyperspectral |
Repo | |
Framework | |
The LAMBADA dataset: Word prediction requiring a broad discourse context
Title | The LAMBADA dataset: Word prediction requiring a broad discourse context |
Authors | Denis Paperno, Germán Kruszewski, Angeliki Lazaridou, Quan Ngoc Pham, Raffaella Bernardi, Sandro Pezzelle, Marco Baroni, Gemma Boleda, Raquel Fernández |
Abstract | We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages sharing the characteristic that human subjects are able to guess their last word if they are exposed to the whole passage, but not if they only see the last sentence preceding the target word. To succeed on LAMBADA, computational models cannot simply rely on local context, but must be able to keep track of information in the broader discourse. We show that LAMBADA exemplifies a wide range of linguistic phenomena, and that none of several state-of-the-art language models reaches accuracy above 1% on this novel benchmark. We thus propose LAMBADA as a challenging test set, meant to encourage the development of new models capable of genuine understanding of broad context in natural language text. |
Tasks | |
Published | 2016-06-20 |
URL | http://arxiv.org/abs/1606.06031v1 |
http://arxiv.org/pdf/1606.06031v1.pdf | |
PWC | https://paperswithcode.com/paper/the-lambada-dataset-word-prediction-requiring |
Repo | |
Framework | |
A Greedy Approach for Budgeted Maximum Inner Product Search
Title | A Greedy Approach for Budgeted Maximum Inner Product Search |
Authors | Hsiang-Fu Yu, Cho-Jui Hsieh, Qi Lei, Inderjit S. Dhillon |
Abstract | Maximum Inner Product Search (MIPS) is an important task in many machine learning applications, such as the prediction phase of a low-rank matrix factorization model for a recommender system. Several recent works address how to perform MIPS in sub-linear time. However, most of them do not have the flexibility to control the trade-off between search efficiency and search quality. In this paper, we study the MIPS problem with a computational budget. By carefully studying the problem structure of MIPS, we develop a novel Greedy-MIPS algorithm, which can handle budgeted MIPS by design. While simple and intuitive, Greedy-MIPS yields surprisingly superior performance compared to state-of-the-art approaches. As a specific example, on a candidate set containing half a million vectors of dimension 200, Greedy-MIPS runs 200x faster than the naive approach while yielding search results with top-5 precision greater than 75%. |
Tasks | Recommendation Systems |
Published | 2016-10-11 |
URL | http://arxiv.org/abs/1610.03317v1 |
http://arxiv.org/pdf/1610.03317v1.pdf | |
PWC | https://paperswithcode.com/paper/a-greedy-approach-for-budgeted-maximum-inner |
Repo | |
Framework | |
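To make the comparison in the abstract above concrete, the sketch below shows the naive MIPS baseline (score every candidate, keep the top k) and a top-k precision measure of the kind the abstract quotes. It does not implement Greedy-MIPS; the function names and the toy data are assumptions for illustration only.

```python
import numpy as np


def naive_mips(candidates, query, k):
    """Exhaustive MIPS: indices of the k candidates with the largest inner product."""
    scores = candidates @ query    # one inner product per candidate row
    return np.argsort(-scores)[:k]  # best first


def precision_at_k(returned, exact):
    """Fraction of the exact top-k that an approximate method also returned."""
    return len(set(returned) & set(exact)) / len(exact)


# Toy candidate set (rows) and query; inner products are 1, 2, 4, -2.
candidates = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [-1.0, -1.0]])
query = np.array([1.0, 1.0])

exact_top2 = naive_mips(candidates, query, k=2).tolist()  # [2, 1]
# A hypothetical approximate result that found only one of the true top-2.
approx_precision = precision_at_k([2, 0], exact_top2)     # 0.5
```

The exhaustive search costs one inner product per candidate, which is the cost a budgeted method tries to undercut while keeping the precision high.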