Paper Group ANR 492
Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition
Title | Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition |
Authors | Xiao Liu, Jiang Wang, Shilei Wen, Errui Ding, Yuanqing Lin |
Abstract | A key challenge in fine-grained recognition is how to find and represent discriminative local regions. Recent attention models are capable of learning discriminative region localizers only from category labels with reinforcement learning. However, not utilizing any explicit part information, they are not able to accurately find multiple distinctive regions. In this work, we introduce an attribute-guided attention localization scheme where the local region localizers are learned under the guidance of part attribute descriptions. By designing a novel reward strategy, we are able to learn to locate regions that are spatially and semantically distinctive with a reinforcement learning algorithm. The attribute labeling requirement of the scheme is more amenable than the accurate part location annotation required by traditional part-based fine-grained recognition methods. Experimental results on the CUB-200-2011 dataset demonstrate the superiority of the proposed scheme on both fine-grained recognition and attribute recognition. |
Tasks | |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06217v2 |
http://arxiv.org/pdf/1605.06217v2.pdf | |
PWC | https://paperswithcode.com/paper/localizing-by-describing-attribute-guided |
Repo | |
Framework | |
Target-Side Context for Discriminative Models in Statistical Machine Translation
Title | Target-Side Context for Discriminative Models in Statistical Machine Translation |
Authors | Aleš Tamchyna, Alexander Fraser, Ondřej Bojar, Marcin Junczys-Dowmunt |
Abstract | Discriminative translation models utilizing source context have been shown to help statistical machine translation performance. We propose a novel extension of this work using target context information. Surprisingly, we show that this model can be efficiently integrated directly in the decoding process. Our approach scales to large training data sizes and results in consistent improvements in translation quality on four language pairs. We also provide an analysis comparing the strengths of the baseline source-context model with our extended source-context and target-context model and we show that our extension allows us to better capture morphological coherence. Our work is freely available as part of Moses. |
Tasks | Machine Translation |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01149v1 |
http://arxiv.org/pdf/1607.01149v1.pdf | |
PWC | https://paperswithcode.com/paper/target-side-context-for-discriminative-models |
Repo | |
Framework | |
Solve-Select-Scale: A Three Step Process For Sparse Signal Estimation
Title | Solve-Select-Scale: A Three Step Process For Sparse Signal Estimation |
Authors | Mithun Das Gupta |
Abstract | In the theory of compressed sensing (CS), the sparsity $\|\mathbf{x}\|_0$ of the unknown signal $\mathbf{x} \in \mathcal{R}^n$ is of prime importance, and the focus of reconstruction algorithms has mainly been either $\|\mathbf{x}\|_0$ or its convex relaxation (via $\|\mathbf{x}\|_1$). However, it is typically unknown in practice and has remained a challenge when nothing about the size of the support is known. As pointed out recently, $\|\mathbf{x}\|_0$ might not be the best metric to minimize directly, due both to its inherent complexity and to its noise performance. Recently a novel stable measure of sparsity $s(\mathbf{x}) := \|\mathbf{x}\|_1^2/\|\mathbf{x}\|_2^2$ has been investigated by Lopes \cite{Lopes2012}, which is a sharp lower bound on $\|\mathbf{x}\|_0$. The estimation procedure for this measure uses only a small number of linear measurements, does not rely on any sparsity assumptions, and requires very little computation. The usage of the quantity $s(\mathbf{x})$ in sparse signal estimation problems has not received much attention yet. We develop the idea of incorporating $s(\mathbf{x})$ into the signal estimation framework. We also provide a three step algorithm to solve problems of the form $\mathbf{Ax=b}$ with no additional assumptions on the original signal $\mathbf{x}$. |
Tasks | |
Published | 2016-05-16 |
URL | http://arxiv.org/abs/1605.04657v1 |
http://arxiv.org/pdf/1605.04657v1.pdf | |
PWC | https://paperswithcode.com/paper/solve-select-scale-a-three-step-process-for |
Repo | |
Framework | |
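The sparsity measure $s(\mathbf{x}) = \|\mathbf{x}\|_1^2/\|\mathbf{x}\|_2^2$ quoted in the Solve-Select-Scale abstract above can be illustrated with a minimal sketch. It only evaluates the measure on two hand-picked vectors and compares it with the count of nonzero entries; it is not the paper's estimation procedure or its three step algorithm. The function names `numerical_sparsity` and `l0_norm` and both example vectors are assumptions made for this illustration.

```python
def numerical_sparsity(x):
    """Return s(x) = ||x||_1^2 / ||x||_2^2 for a nonzero vector x."""
    l1 = sum(abs(v) for v in x)
    l2_squared = sum(v * v for v in x)
    if l2_squared == 0.0:
        raise ValueError("s(x) is undefined for the zero vector")
    return l1 * l1 / l2_squared


def l0_norm(x):
    """Count the nonzero entries of x (the quantity ||x||_0)."""
    return sum(1 for v in x if v != 0)


# Hypothetical example vectors, both with three nonzero entries.
x_flat = [1.0, -1.0, 1.0, 0.0, 0.0]    # equal magnitudes: s(x) equals ||x||_0
x_peaky = [10.0, 0.1, 0.1, 0.0, 0.0]   # one dominant entry: s(x) is close to 1

for name, x in (("flat", x_flat), ("peaky", x_peaky)):
    print(name, l0_norm(x), numerical_sparsity(x))
```

By the Cauchy-Schwarz inequality, $s(\mathbf{x}) \le \|\mathbf{x}\|_0$ for every nonzero vector, with equality when all nonzero entries have the same magnitude. The second vector shows why the measure is called stable: two tiny entries barely change it, while they raise $\|\mathbf{x}\|_0$ from 1 to 3.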
Learning to Learn Neural Networks
Title | Learning to Learn Neural Networks |
Authors | Tom Bosc |
Abstract | Meta-learning consists in learning learning algorithms. We use a Long Short Term Memory (LSTM) based network to learn to compute on-line updates of the parameters of another neural network. These parameters are stored in the cell state of the LSTM. Our framework allows us to compare learned algorithms to hand-made algorithms within the traditional train and test methodology. In an experiment, we learn a learning algorithm for a one-hidden layer Multi-Layer Perceptron (MLP) on non-linearly separable datasets. The learned algorithm is able to update parameters of both layers and generalise well on similar datasets. |
Tasks | Meta-Learning |
Published | 2016-10-19 |
URL | http://arxiv.org/abs/1610.06072v1 |
http://arxiv.org/pdf/1610.06072v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-learn-neural-networks |
Repo | |
Framework | |
Predicting Drug Interactions and Mutagenicity with Ensemble Classifiers on Subgraphs of Molecules
Title | Predicting Drug Interactions and Mutagenicity with Ensemble Classifiers on Subgraphs of Molecules |
Authors | Andrew Schaumberg, Angela Yu, Tatsuhiro Koshi, Xiaochan Zong, Santoshkalyan Rayadhurgam |
Abstract | In this study, we intend to solve a mutual information problem in interacting molecules of any type, such as proteins, nucleic acids, and small molecules. Using machine learning techniques, we accurately predict pairwise interactions, which can be of medical and biological importance. Graphs are useful in this problem for their generality to all types of molecules, due to the inherent association of atoms through atomic bonds. Subgraphs can represent different molecular domains. These domains can be biologically significant as most molecules only have portions that are of functional significance and can interact with other domains. Thus, we use subgraphs as features in different machine learning algorithms to predict if two drugs interact and predict potential single molecule effects. |
Tasks | |
Published | 2016-01-27 |
URL | http://arxiv.org/abs/1601.07233v1 |
http://arxiv.org/pdf/1601.07233v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-drug-interactions-and-mutagenicity |
Repo | |
Framework | |
Object-Centric Representation Learning from Unlabeled Videos
Title | Object-Centric Representation Learning from Unlabeled Videos |
Authors | Ruohan Gao, Dinesh Jayaraman, Kristen Grauman |
Abstract | Supervised (pre-)training currently yields state-of-the-art performance for representation learning for visual recognition, yet it comes at the cost of (1) intensive manual annotations and (2) an inherent restriction in the scope of data relevant for learning. In this work, we explore unsupervised feature learning from unlabeled video. We introduce a novel object-centric approach to temporal coherence that encourages similar representations to be learned for object-like regions segmented from nearby frames. Our framework relies on a Siamese-triplet network to train a deep convolutional neural network (CNN) representation. Compared to existing temporal coherence methods, our idea has the advantage of lightweight preprocessing of the unlabeled video (no tracking required) while still being able to extract object-level regions from which to learn invariances. Furthermore, as we show in results on several standard datasets, our method typically achieves substantial accuracy gains over competing unsupervised methods for image classification and retrieval tasks. |
Tasks | Image Classification, Representation Learning |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00500v1 |
http://arxiv.org/pdf/1612.00500v1.pdf | |
PWC | https://paperswithcode.com/paper/object-centric-representation-learning-from |
Repo | |
Framework | |
Wavelet Scattering on the Pitch Spiral
Title | Wavelet Scattering on the Pitch Spiral |
Authors | Vincent Lostanlen, Stéphane Mallat |
Abstract | We present a new representation of harmonic sounds that linearizes the dynamics of pitch and spectral envelope, while remaining stable to deformations in the time-frequency plane. It is an instance of the scattering transform, a generic operator which cascades wavelet convolutions and modulus nonlinearities. It is derived from the pitch spiral, in that convolutions are successively performed in time, log-frequency, and octave index. We give a closed-form approximation of spiral scattering coefficients for a nonstationary generalization of the harmonic source-filter model. |
Tasks | |
Published | 2016-01-03 |
URL | http://arxiv.org/abs/1601.00287v1 |
http://arxiv.org/pdf/1601.00287v1.pdf | |
PWC | https://paperswithcode.com/paper/wavelet-scattering-on-the-pitch-spiral |
Repo | |
Framework | |
A Variational Perspective on Accelerated Methods in Optimization
Title | A Variational Perspective on Accelerated Methods in Optimization |
Authors | Andre Wibisono, Ashia C. Wilson, Michael I. Jordan |
Abstract | Accelerated gradient methods play a central role in optimization, achieving optimal rates in many settings. While many generalizations and extensions of Nesterov’s original acceleration method have been proposed, it is not yet clear what is the natural scope of the acceleration concept. In this paper, we study accelerated methods from a continuous-time perspective. We show that there is a Lagrangian functional that we call the Bregman Lagrangian which generates a large class of accelerated methods in continuous time, including (but not limited to) accelerated gradient descent, its non-Euclidean extension, and accelerated higher-order gradient methods. We show that the continuous-time limit of all of these methods corresponds to traveling the same curve in spacetime at different speeds. From this perspective, Nesterov’s technique and many of its generalizations can be viewed as a systematic way to go from the continuous-time curves generated by the Bregman Lagrangian to a family of discrete-time accelerated algorithms. |
Tasks | |
Published | 2016-03-14 |
URL | http://arxiv.org/abs/1603.04245v1 |
http://arxiv.org/pdf/1603.04245v1.pdf | |
PWC | https://paperswithcode.com/paper/a-variational-perspective-on-accelerated |
Repo | |
Framework | |
Streaming Algorithms for News and Scientific Literature Recommendation: Submodular Maximization with a d-Knapsack Constraint
Title | Streaming Algorithms for News and Scientific Literature Recommendation: Submodular Maximization with a d-Knapsack Constraint |
Authors | Qilian Yu, Easton Li Xu, Shuguang Cui |
Abstract | Submodular maximization problems belong to the family of combinatorial optimization problems and enjoy wide applications. In this paper, we focus on the problem of maximizing a monotone submodular function subject to a $d$-knapsack constraint, for which we propose a streaming algorithm that achieves a $\left(\frac{1}{1+2d}-\epsilon\right)$-approximation of the optimal value, while it only needs one single pass through the dataset without storing all the data in the memory. In our experiments, we extensively evaluate the effectiveness of our proposed algorithm via two applications: news recommendation and scientific literature recommendation. It is observed that the proposed streaming algorithm achieves both execution speedup and memory saving by several orders of magnitude, compared with existing approaches. |
Tasks | Combinatorial Optimization |
Published | 2016-03-17 |
URL | http://arxiv.org/abs/1603.05614v3 |
http://arxiv.org/pdf/1603.05614v3.pdf | |
PWC | https://paperswithcode.com/paper/streaming-algorithms-for-news-and-scientific |
Repo | |
Framework | |
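The guarantee $\left(\frac{1}{1+2d}-\epsilon\right)$ stated in the streaming submodular maximization abstract above is easy to misread, so the following minimal sketch simply evaluates it for a few values of $d$. It does not implement the streaming algorithm itself; the function name `approximation_ratio` and the chosen values of `d` and `eps` are assumptions made for illustration.

```python
def approximation_ratio(d, eps):
    """Evaluate the stated guarantee 1 / (1 + 2d) - eps for a d-knapsack constraint."""
    if d < 1:
        raise ValueError("d must be a positive integer")
    if eps <= 0:
        raise ValueError("eps must be positive")
    return 1.0 / (1 + 2 * d) - eps


# Hypothetical settings: the guaranteed fraction of the optimum shrinks as d grows.
for d in (1, 2, 3, 5):
    print(d, approximation_ratio(d, eps=0.01))
```

With a single knapsack constraint ($d = 1$) the bound is just under one third of the optimal value, and each additional constraint lowers the worst-case guarantee.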
Two-stream convolutional neural network for accurate RGB-D fingertip detection using depth and edge information
Title | Two-stream convolutional neural network for accurate RGB-D fingertip detection using depth and edge information |
Authors | Hengkai Guo, Guijin Wang, Xinghao Chen |
Abstract | Accurate detection of fingertips in depth images is critical for human-computer interaction. In this paper, we present a novel two-stream convolutional neural network (CNN) for RGB-D fingertip detection. First, an edge image is extracted from the raw depth image using a random forest. Then the edge information is combined with depth information in our CNN structure. We study several fusion approaches and suggest a slow fusion strategy as a promising way of fingertip detection. As shown in our experiments, our real-time algorithm outperforms state-of-the-art fingertip detection methods on the public dataset HandNet with an average 3D error of 9.9mm, and shows comparable accuracy of fingertip estimation on the NYU hand dataset. |
Tasks | |
Published | 2016-12-23 |
URL | http://arxiv.org/abs/1612.07978v1 |
http://arxiv.org/pdf/1612.07978v1.pdf | |
PWC | https://paperswithcode.com/paper/two-stream-convolutional-neural-network-for |
Repo | |
Framework | |
Gamblets for opening the complexity-bottleneck of implicit schemes for hyperbolic and parabolic ODEs/PDEs with rough coefficients
Title | Gamblets for opening the complexity-bottleneck of implicit schemes for hyperbolic and parabolic ODEs/PDEs with rough coefficients |
Authors | Houman Owhadi, Lei Zhang |
Abstract | Implicit schemes are popular methods for the integration of time dependent PDEs such as hyperbolic and parabolic PDEs. However, the necessity to solve corresponding linear systems at each time step constitutes a complexity bottleneck in their application to PDEs with rough coefficients. We present a generalization of gamblets introduced in \cite{OwhadiMultigrid:2015} enabling the resolution of these implicit systems in near-linear complexity and provide rigorous a priori error bounds on the resulting numerical approximations of hyperbolic and parabolic PDEs. These generalized gamblets induce a multiresolution decomposition of the solution space that is adapted to both the underlying (hyperbolic and parabolic) PDE (and the system of ODEs resulting from space discretization) and to the time-steps of the numerical scheme. |
Tasks | |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07686v2 |
http://arxiv.org/pdf/1606.07686v2.pdf | |
PWC | https://paperswithcode.com/paper/gamblets-for-opening-the-complexity |
Repo | |
Framework | |
Inertial-Based Scale Estimation for Structure from Motion on Mobile Devices
Title | Inertial-Based Scale Estimation for Structure from Motion on Mobile Devices |
Authors | Janne Mustaniemi, Juho Kannala, Simo Särkkä, Jiri Matas, Janne Heikkilä |
Abstract | Structure from motion algorithms have an inherent limitation that the reconstruction can only be determined up to an unknown scale factor. Modern mobile devices are equipped with an inertial measurement unit (IMU), which can be used for estimating the scale of the reconstruction. We propose a method that recovers the metric scale given inertial measurements and camera poses. In the process, we also perform a temporal and spatial alignment of the camera and the IMU. Therefore, our solution can be easily combined with any existing visual reconstruction software. The method can cope with noisy camera pose estimates, typically caused by motion blur or rolling shutter artifacts, by using a Rauch-Tung-Striebel (RTS) smoother. Furthermore, the scale estimation is performed in the frequency domain, which provides more robustness to inaccurate sensor time stamps and noisy IMU samples than the previously used time domain representation. In contrast to previous methods, our approach has no parameters that need to be tuned to achieve good performance. In the experiments, we show that the algorithm outperforms the state-of-the-art in both accuracy and convergence speed of the scale estimate. The accuracy of the scale is around $1\%$ from the ground truth depending on the recording. We also demonstrate that our method can improve the scale accuracy of Project Tango’s built-in motion tracking. |
Tasks | |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09498v2 |
http://arxiv.org/pdf/1611.09498v2.pdf | |
PWC | https://paperswithcode.com/paper/inertial-based-scale-estimation-for-structure |
Repo | |
Framework | |
Deep Tensor Convolution on Multicores
Title | Deep Tensor Convolution on Multicores |
Authors | David Budden, Alexander Matveev, Shibani Santurkar, Shraman Ray Chaudhuri, Nir Shavit |
Abstract | Deep convolutional neural networks (ConvNets) of 3-dimensional kernels allow joint modeling of spatiotemporal features. These networks have improved performance of video and volumetric image analysis, but have been limited in size due to the low memory ceiling of GPU hardware. Existing CPU implementations overcome this constraint but are impractically slow. Here we extend and optimize the faster Winograd-class of convolutional algorithms to the $N$-dimensional case and specifically for CPU hardware. First, we remove the need to manually hand-craft algorithms by exploiting the relaxed constraints and cheap sparse access of CPU memory. Second, we maximize CPU utilization and multicore scalability by transforming data matrices to be cache-aware, integer multiples of AVX vector widths. Treating 2-dimensional ConvNets as a special (and the least beneficial) case of our approach, we demonstrate a 5 to 25-fold improvement in throughput compared to previous state-of-the-art. |
Tasks | |
Published | 2016-11-20 |
URL | http://arxiv.org/abs/1611.06565v3 |
http://arxiv.org/pdf/1611.06565v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-tensor-convolution-on-multicores |
Repo | |
Framework | |
Distributed Coding of Multiview Sparse Sources with Joint Recovery
Title | Distributed Coding of Multiview Sparse Sources with Joint Recovery |
Authors | Huynh Van Luong, Nikos Deligiannis, Søren Forchhammer, André Kaup |
Abstract | In support of applications involving multiview sources in distributed object recognition using lightweight cameras, we propose a new method for the distributed coding of sparse sources as visual descriptor histograms extracted from multiview images. The problem is challenging due to the computational and energy constraints at each camera as well as the limitations regarding inter-camera communication. Our approach addresses these challenges by exploiting the sparsity of the visual descriptor histograms as well as their intra- and inter-camera correlations. Our method couples distributed source coding of the sparse sources with a new joint recovery algorithm that incorporates multiple side information signals, where prior knowledge (low quality) of all the sparse sources is initially sent to exploit their correlations. Experimental evaluation using the histograms of scale-invariant feature transform (SIFT) descriptors extracted from multiview images shows that our method leads to bit-rate saving of up to 43% compared to the state-of-the-art distributed compressed sensing method with independent encoding of the sources. |
Tasks | Object Recognition |
Published | 2016-07-18 |
URL | http://arxiv.org/abs/1607.04965v1 |
http://arxiv.org/pdf/1607.04965v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-coding-of-multiview-sparse |
Repo | |
Framework | |
A Solution to Time-Varying Markov Decision Processes
Title | A Solution to Time-Varying Markov Decision Processes |
Authors | Lantao Liu, Gaurav S. Sukhatme |
Abstract | We consider a decision-making problem where the environment varies both in space and time. Such problems arise naturally when considering e.g., the navigation of an underwater robot amidst ocean currents or the navigation of an aerial vehicle in wind. To model such spatiotemporal variation, we extend the standard Markov Decision Process (MDP) to a new framework called the Time-Varying Markov Decision Process (TVMDP). The TVMDP has a time-varying state transition model and transforms the standard MDP that considers only immediate and static uncertainty descriptions of state transitions, to a framework that is able to adapt to future time-varying transition dynamics over some horizon. We show how to solve a TVMDP via a redesign of the MDP value propagation mechanisms by incorporating the introduced dynamics along the temporal dimension. We validate our framework in a marine robotics navigation setting using spatiotemporal ocean data and show that it outperforms prior efforts. |
Tasks | Decision Making |
Published | 2016-05-03 |
URL | http://arxiv.org/abs/1605.01018v3 |
http://arxiv.org/pdf/1605.01018v3.pdf | |
PWC | https://paperswithcode.com/paper/a-solution-to-time-varying-markov-decision |
Repo | |
Framework | |
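The idea of a time-varying state transition model described in the TVMDP abstract above can be made concrete with a small sketch. What follows is generic finite-horizon backward induction with a separate transition matrix per time step, not the value propagation mechanism proposed in the paper. The function name `backward_induction`, the two-state problem, and all probabilities and rewards are hypothetical.

```python
def backward_induction(P, R, gamma=1.0):
    """Finite-horizon dynamic programming with time-indexed transitions.

    P[t][a][s][s2] is the probability of moving from s to s2 under action a
    at time step t; R[s][a] is the immediate reward. Returns the value table
    V[t][s] (with V[T] = 0) and the greedy policy pi[t][s].
    """
    horizon = len(P)
    n_actions = len(P[0])
    n_states = len(R)
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    pi = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_a, best_q = 0, float("-inf")
            for a in range(n_actions):
                expected_next = sum(P[t][a][s][s2] * V[t + 1][s2]
                                    for s2 in range(n_states))
                q = R[s][a] + gamma * expected_next
                if q > best_q:
                    best_a, best_q = a, q
            V[t][s] = best_q
            pi[t][s] = best_a
    return V, pi


# Hypothetical toy problem: state 1 is the goal, action 0 stays, action 1 moves.
stay = [[1.0, 0.0], [0.0, 1.0]]
move_t0 = [[0.1, 0.9], [0.0, 1.0]]   # favorable current at t = 0
move_t1 = [[0.9, 0.1], [0.0, 1.0]]   # adverse current at t = 1
P = [[stay, move_t0], [stay, move_t1]]
R = [[0.0, 0.0], [1.0, 1.0]]         # reward 1 per step spent in the goal state

V, pi = backward_induction(P, R)
print(V[0], pi[0])
```

In this toy setting the planner moves at time 0 because the transition model at that step is favorable, which a single static transition matrix averaged over time could not express.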