May 6, 2019

Paper Group ANR 314

Domain Adaptation with Soft-margin multiple feature-kernel learning beats Deep Learning for surveillance face recognition. The CUDA LATCH Binary Descriptor: Because Sometimes Faster Means Better. Inducing Multilingual Text Analysis Tools Using Bidirectional Recurrent Neural Networks. Weakly Supervised Learning of Affordances. Machine Learned Resume …

Domain Adaptation with Soft-margin multiple feature-kernel learning beats Deep Learning for surveillance face recognition

Title Domain Adaptation with Soft-margin multiple feature-kernel learning beats Deep Learning for surveillance face recognition
Authors Samik Banerjee, Sukhendu Das
Abstract Among all biometric traits, face recognition (FR) is the preferred mode for biometric-based surveillance because it detects subjects passively. FR in surveillance scenarios does not give satisfactory performance because probes suffer from low contrast, noise and poor illumination compared to the training samples. Even a state-of-the-art technology, Deep Learning, fails to perform well in these scenarios. We propose a novel soft-margin based learning method for multiple feature-kernel combinations, followed by feature transformation using Domain Adaptation, which outperforms many recent state-of-the-art techniques when tested on three real-world surveillance face datasets.
Tasks Domain Adaptation, Face Recognition
Published 2016-10-05
URL http://arxiv.org/abs/1610.01374v2
PDF http://arxiv.org/pdf/1610.01374v2.pdf
PWC https://paperswithcode.com/paper/domain-adaptation-with-soft-margin-multiple
Repo
Framework

The CUDA LATCH Binary Descriptor: Because Sometimes Faster Means Better

Title The CUDA LATCH Binary Descriptor: Because Sometimes Faster Means Better
Authors Christopher Parker, Matthew Daiter, Kareem Omar, Gil Levi, Tal Hassner
Abstract Accuracy, descriptor size, and the time required for extraction and matching are all important factors when selecting local image descriptors. To optimize over all these requirements, this paper presents a CUDA port for the recent Learned Arrangement of Three Patches (LATCH) binary descriptors to the GPU platform. The design of LATCH makes it well suited for GPU processing. Owing to its small size and binary nature, the GPU can further be used to efficiently match LATCH features. Taken together, this leads to breakneck descriptor extraction and matching speeds. We evaluate the trade off between these speeds and the quality of results in a feature matching intensive application. To this end, we use our proposed CUDA LATCH (CLATCH) to recover structure from motion (SfM), comparing 3D reconstructions and speed using different representations. Our results show that CLATCH provides high quality 3D reconstructions at fractions of the time required by other representations, with little, if any, loss of reconstruction quality.
Tasks
Published 2016-09-13
URL http://arxiv.org/abs/1609.03986v2
PDF http://arxiv.org/pdf/1609.03986v2.pdf
PWC https://paperswithcode.com/paper/the-cuda-latch-binary-descriptor-because
Repo
Framework
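
The abstract above notes that binary descriptors can be matched efficiently. As a rough illustration only, here is a minimal CPU sketch in plain Python (not the paper's CUDA implementation) of brute-force matching under Hamming distance. Descriptors are modeled as Python integers, and the names `hamming` and `match` are hypothetical helpers introduced for this example.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match(queries, database):
    """For each query, return (index of nearest database descriptor, distance).

    Brute-force search; ties are resolved in favor of the lowest index.
    """
    results = []
    for q in queries:
        best = min(range(len(database)), key=lambda i: hamming(q, database[i]))
        results.append((best, hamming(q, database[best])))
    return results


# Toy 4-bit descriptors (illustrative values, not real LATCH output).
database = [0b0000, 0b1111, 0b1100]
queries = [0b0111, 0b0001, 0b1100]
matches = match(queries, database)
```

The XOR-and-popcount pattern is what makes binary descriptors cheap to compare; a GPU version would typically run the same comparison across many descriptor pairs in parallel.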

Inducing Multilingual Text Analysis Tools Using Bidirectional Recurrent Neural Networks

Title Inducing Multilingual Text Analysis Tools Using Bidirectional Recurrent Neural Networks
Authors Othman Zennaki, Nasredine Semmar, Laurent Besacier
Abstract This work focuses on the rapid development of linguistic annotation tools for resource-poor languages. We experiment with several cross-lingual annotation projection methods using Recurrent Neural Network (RNN) models. The distinctive feature of our approach is that our multilingual word representation requires only a parallel corpus between the source and target language. More precisely, our method has the following characteristics: (a) it does not use word alignment information, (b) it does not assume any knowledge about foreign languages, which makes it applicable to a wide range of resource-poor languages, (c) it provides truly multilingual taggers. We investigate both uni- and bi-directional RNN models and propose a method to include external information (for instance, low level information from POS) in the RNN to train higher level taggers (for instance, super sense taggers). We demonstrate the validity and genericity of our model by using parallel corpora (obtained by manual or automatic translation). Our experiments are conducted to induce cross-lingual POS and super sense taggers.
Tasks Word Alignment
Published 2016-09-29
URL http://arxiv.org/abs/1609.09382v1
PDF http://arxiv.org/pdf/1609.09382v1.pdf
PWC https://paperswithcode.com/paper/inducing-multilingual-text-analysis-tools
Repo
Framework

Weakly Supervised Learning of Affordances

Title Weakly Supervised Learning of Affordances
Authors Abhilash Srikantha, Juergen Gall
Abstract Localizing functional regions of objects or affordances is an important aspect of scene understanding. In this work, we cast the problem of affordance segmentation as that of semantic image segmentation. In order to explore various levels of supervision, we introduce a pixel-annotated affordance dataset of 3090 images containing 9916 object instances with rich contextual information in terms of human-object interactions. We use a deep convolutional neural network within an expectation maximization framework to take advantage of weakly labeled data like image level annotations or keypoint annotations. We show that a further reduction in supervision is possible with a minimal loss in performance when human pose is used as context.
Tasks Human-Object Interaction Detection, Scene Understanding, Semantic Segmentation
Published 2016-05-10
URL http://arxiv.org/abs/1605.02964v2
PDF http://arxiv.org/pdf/1605.02964v2.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-learning-of-affordances
Repo
Framework

Machine Learned Resume-Job Matching Solution

Title Machine Learned Resume-Job Matching Solution
Authors Yiou Lin, Hang Lei, Prince Clement Addo, Xiaoyu Li
Abstract Job search through online matching engines is nowadays very prominent and beneficial to both job seekers and employers. But the solutions of traditional engines, which do not understand the semantic meanings of different resumes, have not kept pace with the incredible changes in machine learning techniques and computing capability. These solutions are usually driven by manual rules and predefined keyword weights, which lead to an inefficient and frustrating search experience. To this end, we present a machine learned solution with rich features and deep learning methods. Our solution includes three configurable modules that can be plugged in with few restrictions: unsupervised feature extraction, base classifier training and ensemble method learning. In our solution, rather than using manual rules, machine learned methods are proposed to automatically detect the semantic similarity of positions. Then four competitive “shallow” estimators and “deep” estimators are selected. Finally, ensemble methods that bag these estimators and aggregate their individual predictions to form a final prediction are verified. Experimental results on over 47 thousand resumes show that our solution can significantly improve the prediction precision for current position, salary, educational background and company scale.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2016-07-26
URL http://arxiv.org/abs/1607.07657v1
PDF http://arxiv.org/pdf/1607.07657v1.pdf
PWC https://paperswithcode.com/paper/machine-learned-resume-job-matching-solution
Repo
Framework

On the Diffusion Geometry of Graph Laplacians and Applications

Title On the Diffusion Geometry of Graph Laplacians and Applications
Authors Xiuyuan Cheng, Manas Rachh, Stefan Steinerberger
Abstract We study directed, weighted graphs $G=(V,E)$ and consider the (not necessarily symmetric) averaging operator $$ (\mathcal{L}u)(i) = -\sum_{j \sim i}{p_{ij} (u(j) - u(i))},$$ where $p_{ij}$ are normalized edge weights. Given a vertex $i \in V$, we define the diffusion distance to a set $B \subset V$ as the smallest number of steps $d_{B}(i) \in \mathbb{N}$ required for half of all random walks started in $i$ and moving randomly with respect to the weights $p_{ij}$ to visit $B$ within $d_{B}(i)$ steps. Our main result is that the eigenfunctions interact nicely with this notion of distance. In particular, if $u$ satisfies $\mathcal{L}u = \lambda u$ on $V$ and $$ B = \left\{ i \in V: - \varepsilon \leq u(i) \leq \varepsilon \right\} \neq \emptyset,$$ then, for all $i \in V$, $$ d_{B}(i) \log{\left( \frac{1}{|1-\lambda|} \right) } \geq \log{\left( \frac{ |u(i)| }{\|u\|_{L^{\infty}}} \right)} - \log{\left(\frac{1}{2} + \varepsilon\right)}.$$ $d_B(i)$ is a remarkably good approximation of $u$ in the sense of having very high correlation. The result implies that the classical one-dimensional spectral embedding preserves particular aspects of geometry in the presence of clustered data. We also give a continuous variant of the result which has a connection to the hot spots conjecture.
Tasks
Published 2016-11-09
URL http://arxiv.org/abs/1611.03033v1
PDF http://arxiv.org/pdf/1611.03033v1.pdf
PWC https://paperswithcode.com/paper/on-the-diffusion-geometry-of-graph-laplacians
Repo
Framework
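
The diffusion distance defined in the abstract above can be computed exactly on a small graph by treating $B$ as an absorbing set and tracking how much walk probability has been absorbed after each step. The following is a minimal sketch under stated assumptions: `P[j][k]` is taken to be the transition probability $p_{jk}$ (rows sum to 1), a start vertex already in $B$ is assigned distance 0, and the function name and the `max_steps` cutoff are hypothetical choices for this example, not taken from the paper.

```python
def diffusion_distance(P, B, start, max_steps=10_000):
    """Smallest t such that a walk from `start` visits B within t steps
    with probability at least 1/2. Returns None if not reached in max_steps.
    """
    n = len(P)
    B = set(B)
    if start in B:
        return 0  # assumption: a vertex inside B has distance 0
    # mass[j]: probability of being at j without having visited B so far
    mass = [0.0] * n
    mass[start] = 1.0
    absorbed = 0.0  # probability of having visited B already
    for t in range(1, max_steps + 1):
        new_mass = [0.0] * n
        for j in range(n):
            if mass[j] == 0.0:
                continue
            for k in range(n):
                p = P[j][k]
                if p == 0.0:
                    continue
                if k in B:
                    absorbed += mass[j] * p
                else:
                    new_mass[k] += mass[j] * p
        mass = new_mass
        if absorbed >= 0.5:
            return t
    return None


# Simple random walk on the path graph 0 - 1 - 2, with target set B = {0}.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
distances = [diffusion_distance(P, {0}, i) for i in range(3)]
```

Propagating the full probability vector avoids Monte Carlo noise, at a cost of O(n^2) work per step with a dense matrix; for large sparse graphs an adjacency-list representation would be the natural substitute.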

End-to-End Deep Reinforcement Learning for Lane Keeping Assist

Title End-to-End Deep Reinforcement Learning for Lane Keeping Assist
Authors Ahmad El Sallab, Mohammed Abdou, Etienne Perot, Senthil Yogamani
Abstract Reinforcement learning is considered to be a strong AI paradigm which can be used to teach machines through interaction with the environment and learning from their mistakes, but it has not yet been successfully used for automotive applications. There has recently been a revival of interest in the topic, however, driven by the ability of deep learning algorithms to learn good representations of the environment. Motivated by Google DeepMind’s successful demonstrations of learning for games from Breakout to Go, we will propose different methods for autonomous driving using deep reinforcement learning. This is of particular interest because it is difficult to pose autonomous driving as a supervised learning problem, given its strong interaction with the environment, including other vehicles, pedestrians and roadworks. As this is a relatively new area of research for autonomous driving, we will formulate two main categories of algorithms: 1) Discrete actions category, and 2) Continuous actions category. For the discrete actions category, we will deal with the Deep Q-Network Algorithm (DQN), while for the continuous actions category, we will deal with the Deep Deterministic Actor Critic Algorithm (DDAC). In addition, we will evaluate the performance of these two categories on an open source car racing simulator called TORCS (The Open Racing Car Simulator). Our simulation results demonstrate learning of autonomous maneuvering in a scenario of complex road curvatures and simple interaction with other vehicles. Finally, we explain the effect of some restricted conditions, put on the car during the learning phase, on the convergence time for finishing its learning phase.
Tasks Autonomous Driving
Published 2016-12-13
URL http://arxiv.org/abs/1612.04340v1
PDF http://arxiv.org/pdf/1612.04340v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-deep-reinforcement-learning-for
Repo
Framework

Asymptotically exact inference in differentiable generative models

Title Asymptotically exact inference in differentiable generative models
Authors Matthew M. Graham, Amos J. Storkey
Abstract Many generative models can be expressed as a differentiable function of random inputs drawn from some simple probability density. This framework includes both deep generative architectures such as Variational Autoencoders and a large class of procedurally defined simulator models. We present a method for performing efficient MCMC inference in such models when conditioning on observations of the model output. For some models this offers an asymptotically exact inference method where Approximate Bayesian Computation might otherwise be employed. We use the intuition that inference corresponds to integrating a density across the manifold corresponding to the set of inputs consistent with the observed outputs. This motivates the use of a constrained variant of Hamiltonian Monte Carlo which leverages the smooth geometry of the manifold to coherently move between inputs exactly consistent with observations. We validate the method by performing inference tasks in a diverse set of models.
Tasks
Published 2016-05-25
URL http://arxiv.org/abs/1605.07826v4
PDF http://arxiv.org/pdf/1605.07826v4.pdf
PWC https://paperswithcode.com/paper/asymptotically-exact-inference-in
Repo
Framework

High-Dimensional $L_2$Boosting: Rate of Convergence

Title High-Dimensional $L_2$Boosting: Rate of Convergence
Authors Ye Luo, Martin Spindler
Abstract Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of $L_2$Boosting, which is tailored for regression, in a high-dimensional setting. Moreover, we introduce so-called “post-Boosting”. This is a post-selection estimator which applies ordinary least squares to the variables selected in the first stage by $L_2$Boosting. Another variant is “Orthogonal Boosting” where after each step an orthogonal projection is conducted. We show that both post-$L_2$Boosting and the orthogonal boosting achieve the same rate of convergence as LASSO in a sparse, high-dimensional setting. We show that the rate of convergence of the classical $L_2$Boosting depends on the design matrix described by a sparse eigenvalue constant. To show the latter results, we derive new approximation results for the pure greedy algorithm, based on analyzing the revisiting behavior of $L_2$Boosting. We also introduce feasible rules for early stopping, which can be easily implemented and used in applied work. Our results also allow a direct comparison between LASSO and boosting which has been missing from the literature. Finally, we present simulation studies and applications to illustrate the relevance of our theoretical results and to provide insights into the practical aspects of boosting. In these simulation studies, post-$L_2$Boosting clearly outperforms LASSO.
Tasks
Published 2016-02-29
URL http://arxiv.org/abs/1602.08927v2
PDF http://arxiv.org/pdf/1602.08927v2.pdf
PWC https://paperswithcode.com/paper/high-dimensional-l_2boosting-rate-of
Repo
Framework
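
To make the two-stage idea in the abstract above concrete, here is a minimal pure-Python sketch: componentwise $L_2$Boosting that greedily updates one coefficient per step, followed by ordinary least squares on the selected variables. The step size `nu`, the fixed step count used as a stopping rule, and the helper names `l2_boost` and `post_ols` are assumptions made for illustration; the paper's early stopping rules are not reproduced, and the least-squares solver assumes the selected columns are linearly independent.

```python
def l2_boost(X, y, steps=50, nu=1.0):
    """Componentwise L2Boosting; returns (coefficients, selected column indices)."""
    n, p = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    beta = [0.0] * p
    resid = list(y)
    selected = []
    for _ in range(steps):
        best_j, best_gain, best_coef = None, 0.0, 0.0
        for j, c in enumerate(cols):
            norm2 = sum(v * v for v in c)
            if norm2 == 0.0:
                continue
            dot = sum(v * r for v, r in zip(c, resid))
            gain = dot * dot / norm2  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_gain, best_coef = j, gain, dot / norm2
        if best_j is None or best_gain < 1e-12:
            break  # residual is (numerically) orthogonal to every column
        beta[best_j] += nu * best_coef
        resid = [r - nu * best_coef * v for r, v in zip(resid, cols[best_j])]
        if best_j not in selected:
            selected.append(best_j)
    return beta, selected


def post_ols(X, y, idx):
    """OLS restricted to columns `idx`, via normal equations and Gauss-Jordan."""
    k = len(idx)
    A = [[sum(row[a] * row[b] for row in X) for b in idx]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in idx]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return {j: A[i][k] / A[i][i] for i, j in enumerate(idx)}


# Toy noiseless data: y = 2 * x0 - 1 * x2, with orthogonal columns.
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [1, 1, 3, 3]
beta, selected = l2_boost(X, y, steps=3, nu=0.5)
refit = post_ols(X, y, selected)
```

With a step size below 1 and only a few steps, the boosted coefficients are shrunk toward zero, while the refit on the selected columns removes that shrinkage. This is the motivation usually given for post-selection refitting.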

Approximate search with quantized sparse representations

Title Approximate search with quantized sparse representations
Authors Himalaya Jain, Patrick Pérez, Rémi Gribonval, Joaquin Zepeda, Hervé Jégou
Abstract This paper tackles the task of storing a large collection of vectors, such as visual descriptors, and of searching in it. To this end, we propose to approximate database vectors by constrained sparse coding, where possible atom weights are restricted to belong to a finite subset. This formulation encompasses, as particular cases, previous state-of-the-art methods such as product or residual quantization. As opposed to traditional sparse coding methods, quantized sparse coding includes memory usage as a design constraint, thereby allowing us to index a large collection such as the BIGANN billion-sized benchmark. Our experiments, carried out on standard benchmarks, show that our formulation leads to competitive solutions when considering different trade-offs between learning/coding time, index size and search quality.
Tasks Quantization
Published 2016-08-10
URL http://arxiv.org/abs/1608.03308v1
PDF http://arxiv.org/pdf/1608.03308v1.pdf
PWC https://paperswithcode.com/paper/approximate-search-with-quantized-sparse
Repo
Framework
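
The constrained sparse coding described in the abstract above, where atom weights must come from a finite set, can be illustrated with a simple greedy encoder. This is only a sketch under assumptions: it uses an exhaustive greedy search over (atom, weight) pairs rather than the paper's actual encoding procedure, and the toy dictionary, weight set, and function name are hypothetical.

```python
def quantized_sparse_code(x, atoms, weights, n_terms):
    """Greedily approximate x as a sum of n_terms terms weights[w] * atoms[a].

    Returns (code, residual), where code is a list of (atom index, weight index)
    pairs. Only these small integers need to be stored per database vector.
    """
    resid = list(x)
    code = []
    for _ in range(n_terms):
        best = None
        for a_idx, atom in enumerate(atoms):
            for w_idx, w in enumerate(weights):
                err = sum((r - w * a) ** 2 for r, a in zip(resid, atom))
                if best is None or err < best[0]:
                    best = (err, a_idx, w_idx)
        _, a_idx, w_idx = best
        code.append((a_idx, w_idx))
        resid = [r - weights[w_idx] * a for r, a in zip(resid, atoms[a_idx])]
    return code, resid


def decode(code, atoms, weights, dim):
    """Rebuild the approximation from the stored index pairs."""
    out = [0.0] * dim
    for a_idx, w_idx in code:
        out = [o + weights[w_idx] * a for o, a in zip(out, atoms[a_idx])]
    return out


atoms = [[1.0, 0.0], [0.0, 1.0]]  # toy dictionary
weights = [0.5, 1.0, 2.0]         # finite set of allowed weights
code, resid = quantized_sparse_code([2.0, 1.0], atoms, weights, n_terms=2)
approx = decode(code, atoms, weights, dim=2)
```

Because each term is a pair of indices into fixed tables, the per-vector storage cost is a few bits per term, which is the memory constraint the abstract highlights.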

The IBM 2016 English Conversational Telephone Speech Recognition System

Title The IBM 2016 English Conversational Telephone Speech Recognition System
Authors George Saon, Tom Sercu, Steven Rennie, Hong-Kwang J. Kuo
Abstract We describe a collection of acoustic and language modeling techniques that lowered the word error rate of our English conversational telephone LVCSR system to a record 6.6% on the Switchboard subset of the Hub5 2000 evaluation testset. On the acoustic side, we use a score fusion of three strong models: recurrent nets with maxout activations, very deep convolutional nets with 3x3 kernels, and bidirectional long short-term memory nets which operate on FMLLR and i-vector features. On the language modeling side, we use an updated model “M” and hierarchical neural network LMs.
Tasks Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition
Published 2016-04-27
URL http://arxiv.org/abs/1604.08242v2
PDF http://arxiv.org/pdf/1604.08242v2.pdf
PWC https://paperswithcode.com/paper/the-ibm-2016-english-conversational-telephone
Repo
Framework

On interestingness measures of formal concepts

Title On interestingness measures of formal concepts
Authors Sergei O. Kuznetsov, Tatiana Makhalova
Abstract Formal concepts and closed itemsets have proved to be of great importance for knowledge discovery, both as a tool for concise representation of association rules and as a tool for clustering and constructing domain taxonomies and ontologies. Exponential explosion makes it difficult to consider the whole concept lattice arising from data, so one needs to select the most useful and interesting concepts. In this paper, interestingness measures of concepts are considered and compared with respect to various aspects, such as efficiency of computation, applicability to noisy data, and ranking correlation.
Tasks
Published 2016-11-08
URL http://arxiv.org/abs/1611.02646v2
PDF http://arxiv.org/pdf/1611.02646v2.pdf
PWC https://paperswithcode.com/paper/on-interestingness-measures-of-formal
Repo
Framework

CB2CF: A Neural Multiview Content-to-Collaborative Filtering Model for Completely Cold Item Recommendations

Title CB2CF: A Neural Multiview Content-to-Collaborative Filtering Model for Completely Cold Item Recommendations
Authors Oren Barkan, Noam Koenigstein, Eylon Yogev, Ori Katz
Abstract In Recommender Systems research, algorithms are often characterized as either Collaborative Filtering (CF) or Content Based (CB). CF algorithms are trained using a dataset of user preferences, while CB algorithms are typically based on item profiles. These approaches harness different data sources, and therefore the resulting recommended items are generally very different. This paper presents CB2CF, a deep neural multiview model that serves as a bridge from item content to CF representations. CB2CF is a real-world algorithm designed for Microsoft Store services that handle around a billion users worldwide. CB2CF is demonstrated on movie and app recommendations, where it is shown to outperform an alternative CB model on completely cold items.
Tasks Recommendation Systems
Published 2016-11-01
URL https://arxiv.org/abs/1611.00384v2
PDF https://arxiv.org/pdf/1611.00384v2.pdf
PWC https://paperswithcode.com/paper/the-deep-journey-from-content-to
Repo
Framework

Reading Comprehension using Entity-based Memory Network

Title Reading Comprehension using Entity-based Memory Network
Authors Xun Wang, Katsuhito Sudoh, Masaaki Nagata, Tomohide Shibata, Daisuke Kawahara, Sadao Kurohashi
Abstract This paper introduces a novel neural network model for question answering, the entity-based memory network. It enhances neural networks’ ability to represent and compute over information across a long period by keeping records of entities contained in text. The core component is a memory pool which comprises entities’ states. These entities’ states are continuously updated according to the input text. Questions with regard to the input text are used to search the memory pool for related entities, and answers are further predicted based on the states of retrieved entities. Compared with previous memory network models, the proposed model is capable of handling fine-grained information and more sophisticated relations based on entities. We formulated several different tasks as question answering problems and tested the proposed model. Experiments reported satisfying results.
Tasks Question Answering, Reading Comprehension
Published 2016-12-12
URL http://arxiv.org/abs/1612.03551v3
PDF http://arxiv.org/pdf/1612.03551v3.pdf
PWC https://paperswithcode.com/paper/reading-comprehension-using-entity-based
Repo
Framework

Optimal bandwidth estimation for a fast manifold learning algorithm to detect circular structure in high-dimensional data

Title Optimal bandwidth estimation for a fast manifold learning algorithm to detect circular structure in high-dimensional data
Authors Susovan Pal, Praneeth Vepakomma
Abstract We provide a way to infer the existence of topological circularity in high-dimensional data sets in $\mathbb{R}^d$ from their projection in $\mathbb{R}^2$, obtained through a fast manifold learning map as a function of the high-dimensional dataset $\mathbb{X}$ and a particular choice of a positive real $\sigma$ known as the bandwidth parameter. At the same time, we also provide a way to estimate the optimal bandwidth for fast manifold learning in this setting through minimization of these functions of bandwidth. We also provide limit theorems to characterize the behavior of our proposed functions of bandwidth.
Tasks
Published 2016-12-28
URL http://arxiv.org/abs/1612.08932v1
PDF http://arxiv.org/pdf/1612.08932v1.pdf
PWC https://paperswithcode.com/paper/optimal-bandwidth-estimation-for-a-fast
Repo
Framework