Paper Group ANR 314
Domain Adaptation with Soft-margin multiple feature-kernel learning beats Deep Learning for surveillance face recognition. The CUDA LATCH Binary Descriptor: Because Sometimes Faster Means Better. Inducing Multilingual Text Analysis Tools Using Bidirectional Recurrent Neural Networks. Weakly Supervised Learning of Affordances. Machine Learned Resume-Job Matching Solution. On the Diffusion Geometry of Graph Laplacians and Applications. End-to-End Deep Reinforcement Learning for Lane Keeping Assist. Asymptotically exact inference in differentiable generative models. High-Dimensional $L_2$Boosting: Rate of Convergence. Approximate search with quantized sparse representations. The IBM 2016 English Conversational Telephone Speech Recognition System. On interestingness measures of formal concepts. CB2CF: A Neural Multiview Content-to-Collaborative Filtering Model for Completely Cold Item Recommendations. Reading Comprehension using Entity-based Memory Network. Optimal bandwidth estimation for a fast manifold learning algorithm to detect circular structure in high-dimensional data.
Domain Adaptation with Soft-margin multiple feature-kernel learning beats Deep Learning for surveillance face recognition
Title | Domain Adaptation with Soft-margin multiple feature-kernel learning beats Deep Learning for surveillance face recognition |
Authors | Samik Banerjee, Sukhendu Das |
Abstract | Among all biometric traits, face recognition (FR) is the most preferred mode for biometric-based surveillance, owing to its passive manner of detecting subjects. FR in surveillance scenarios does not give satisfactory performance, due to the low contrast, noise and poor illumination conditions of probe images compared to the training samples. Even Deep Learning, a state-of-the-art technology, fails to perform well in these scenarios. We propose a novel soft-margin based learning method for multiple feature-kernel combinations, followed by a feature transformation using Domain Adaptation, which outperforms many recent state-of-the-art techniques when tested on three real-world surveillance face datasets. |
Tasks | Domain Adaptation, Face Recognition |
Published | 2016-10-05 |
URL | http://arxiv.org/abs/1610.01374v2 |
http://arxiv.org/pdf/1610.01374v2.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptation-with-soft-margin-multiple |
Repo | |
Framework | |
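The central ingredient above — combining several feature kernels into one kernel for a max-margin classifier — can be illustrated with a minimal sketch. This is not the authors' algorithm: the kernel weights `beta` below are fixed uniform stand-ins for the soft-margin-learned weights, and the two feature views are random placeholders.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

rng = np.random.default_rng(0)
X1 = rng.normal(size=(100, 16))   # placeholder feature view 1 (e.g., texture)
X2 = rng.normal(size=(100, 8))    # placeholder feature view 2 (e.g., shape)
y = rng.integers(0, 2, size=100)

# One base kernel per feature/kernel pair.
kernels = [rbf_kernel(X1), rbf_kernel(X2), polynomial_kernel(X1, degree=2)]

# The paper learns these weights via a soft margin; uniform weights stand in here.
beta = np.ones(len(kernels)) / len(kernels)
K = sum(b * k for b, k in zip(beta, kernels))

clf = SVC(kernel="precomputed").fit(K, y)
# At test time, the combined kernel matrix has shape (n_test, n_train).
```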
The CUDA LATCH Binary Descriptor: Because Sometimes Faster Means Better
Title | The CUDA LATCH Binary Descriptor: Because Sometimes Faster Means Better |
Authors | Christopher Parker, Matthew Daiter, Kareem Omar, Gil Levi, Tal Hassner |
Abstract | Accuracy, descriptor size, and the time required for extraction and matching are all important factors when selecting local image descriptors. To optimize over all these requirements, this paper presents a CUDA port of the recent Learned Arrangement of Three Patches (LATCH) binary descriptor to the GPU platform. The design of LATCH makes it well suited for GPU processing. Owing to its small size and binary nature, the GPU can further be used to efficiently match LATCH features. Taken together, this leads to breakneck descriptor extraction and matching speeds. We evaluate the trade-off between these speeds and the quality of results in a feature-matching-intensive application. To this end, we use our proposed CUDA LATCH (CLATCH) to recover structure from motion (SfM), comparing 3D reconstructions and speed using different representations. Our results show that CLATCH provides high-quality 3D reconstructions at fractions of the time required by other representations, with little, if any, loss of reconstruction quality. |
Tasks | |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03986v2 |
http://arxiv.org/pdf/1609.03986v2.pdf | |
PWC | https://paperswithcode.com/paper/the-cuda-latch-binary-descriptor-because |
Repo | |
Framework | |
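Why binary descriptors match so quickly, and why the GPU helps: matching reduces to XOR plus popcount, which parallelizes trivially. Below is a minimal CPU sketch of brute-force Hamming matching on LATCH-sized (512-bit) descriptors; the descriptor values are random placeholders, and a CUDA kernel would evaluate the same distance grid massively in parallel.

```python
import numpy as np

rng = np.random.default_rng(0)
# 512-bit descriptors packed into 64 bytes each, as LATCH produces.
desc_a = rng.integers(0, 256, size=(200, 64), dtype=np.uint8)
desc_b = rng.integers(0, 256, size=(300, 64), dtype=np.uint8)

# Hamming distance = popcount(XOR of the packed bytes).
xor = desc_a[:, None, :] ^ desc_b[None, :, :]
dist = np.unpackbits(xor, axis=2).sum(axis=2)

nearest = dist.argmin(axis=1)   # best match in desc_b for each descriptor in desc_a
```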
Inducing Multilingual Text Analysis Tools Using Bidirectional Recurrent Neural Networks
Title | Inducing Multilingual Text Analysis Tools Using Bidirectional Recurrent Neural Networks |
Authors | Othman Zennaki, Nasredine Semmar, Laurent Besacier |
Abstract | This work focuses on the rapid development of linguistic annotation tools for resource-poor languages. We experiment with several cross-lingual annotation projection methods using Recurrent Neural Network (RNN) models. The distinctive feature of our approach is that our multilingual word representation requires only a parallel corpus between the source and target languages. More precisely, our method has the following characteristics: (a) it does not use word alignment information, (b) it does not assume any knowledge about foreign languages, which makes it applicable to a wide range of resource-poor languages, and (c) it provides truly multilingual taggers. We investigate both uni- and bi-directional RNN models and propose a method to include external information (for instance, low-level information from POS) in the RNN to train higher-level taggers (for instance, super sense taggers). We demonstrate the validity and genericity of our model by using parallel corpora (obtained by manual or automatic translation). Our experiments are conducted to induce cross-lingual POS and super sense taggers. |
Tasks | Word Alignment |
Published | 2016-09-29 |
URL | http://arxiv.org/abs/1609.09382v1 |
http://arxiv.org/pdf/1609.09382v1.pdf | |
PWC | https://paperswithcode.com/paper/inducing-multilingual-text-analysis-tools |
Repo | |
Framework | |
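A minimal PyTorch sketch of the bidirectional RNN tagger family the abstract describes, including a slot for injecting lower-level predictions (e.g., POS) when training a higher-level tagger. The paper's key ingredient — a multilingual word representation built from a parallel corpus only — is abstracted into shared `token_ids` here; the GRU choice and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class BiRNNTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, n_tags, n_extra=0):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # n_extra lets external information (e.g., POS posteriors) be appended
        # to the word representation, as the paper does for super sense tagging.
        self.rnn = nn.GRU(emb_dim + n_extra, hidden_dim,
                          bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, n_tags)

    def forward(self, token_ids, extra=None):      # token_ids: (batch, seq)
        x = self.emb(token_ids)
        if extra is not None:                      # extra: (batch, seq, n_extra)
            x = torch.cat([x, extra], dim=-1)
        h, _ = self.rnn(x)
        return self.out(h)                         # (batch, seq, n_tags)

tagger = BiRNNTagger(vocab_size=10_000, emb_dim=64, hidden_dim=128, n_tags=17)
logits = tagger(torch.randint(0, 10_000, (2, 12)))  # two sentences of length 12
```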
Weakly Supervised Learning of Affordances
Title | Weakly Supervised Learning of Affordances |
Authors | Abhilash Srikantha, Juergen Gall |
Abstract | Localizing functional regions of objects, or affordances, is an important aspect of scene understanding. In this work, we cast the problem of affordance segmentation as that of semantic image segmentation. In order to explore various levels of supervision, we introduce a pixel-annotated affordance dataset of 3090 images containing 9916 object instances with rich contextual information in terms of human-object interactions. We use a deep convolutional neural network within an expectation-maximization framework to take advantage of weakly labeled data such as image-level or keypoint annotations. We show that a further reduction in supervision is possible with a minimal loss in performance when human pose is used as context. |
Tasks | Human-Object Interaction Detection, Scene Understanding, Semantic Segmentation |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.02964v2 |
http://arxiv.org/pdf/1605.02964v2.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-learning-of-affordances |
Repo | |
Framework | |
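A toy sketch of the expectation-maximization loop behind this kind of weak supervision: each "image" contributes a bag of pixel features plus an image-level label set, and the E-step restricts pixel pseudo-labels to that set. The paper's deep network is replaced by logistic regression on synthetic 2-D features; everything below is an illustrative assumption, not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
means = np.array([[0., 0.], [3., 0.], [0., 3.]])       # 3 latent pixel classes
bags, allowed = [], []
for _ in range(60):
    cls = rng.choice(3, size=2, replace=False)         # image-level label set
    bags.append(np.vstack([rng.normal(means[c], 0.7, (20, 2)) for c in cls]))
    allowed.append(cls)

X = np.vstack(bags)
# Initialize pixel pseudo-labels with a random allowed class per pixel.
y = np.concatenate([rng.choice(a, size=len(b)) for b, a in zip(bags, allowed)])

clf = LogisticRegression(max_iter=500)
for _ in range(5):
    clf.fit(X, y)                                      # M-step: retrain classifier
    y = []                                             # E-step: constrained relabel
    for b, a in zip(bags, allowed):
        proba = clf.predict_proba(b)
        masked = np.where(np.isin(clf.classes_, a), proba, -np.inf)
        y.append(clf.classes_[masked.argmax(axis=1)])
    y = np.concatenate(y)
```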
Machine Learned Resume-Job Matching Solution
Title | Machine Learned Resume-Job Matching Solution |
Authors | Yiou Lin, Hang Lei, Prince Clement Addo, Xiaoyu Li |
Abstract | Job search through online matching engines is nowadays very prominent and beneficial to both job seekers and employers. But the solutions of traditional engines, which do not understand the semantic meaning of different resumes, have not kept pace with the incredible changes in machine learning techniques and computing capability. These solutions are usually driven by manual rules and predefined keyword weights, which lead to an inefficient and frustrating search experience. To this end, we present a machine-learned solution with rich features and deep learning methods. Our solution comprises three configurable modules that can be plugged in with few restrictions: unsupervised feature extraction, base classifier training, and ensemble method learning. In our solution, rather than relying on manual rules, we propose machine-learned methods to automatically detect the semantic similarity of positions. We then select four competitive “shallow” estimators and “deep” estimators. Finally, we verify ensemble methods that bag these estimators and aggregate their individual predictions to form a final prediction. Experimental results on over 47 thousand resumes show that our solution can significantly improve the prediction precision of current position, salary, educational background and company scale. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2016-07-26 |
URL | http://arxiv.org/abs/1607.07657v1 |
http://arxiv.org/pdf/1607.07657v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learned-resume-job-matching-solution |
Repo | |
Framework | |
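The plug-in architecture the abstract outlines — features in, base classifiers, ensemble on top — maps naturally onto scikit-learn. A minimal sketch with stand-in random data; the actual features, estimators and prediction targets in the paper differ.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 30))       # stand-in for unsupervised resume features
y = rng.integers(0, 2, size=500)     # stand-in match / no-match labels

ensemble = VotingClassifier(
    estimators=[
        ("shallow_lr", LogisticRegression(max_iter=500)),
        ("shallow_bag", BaggingClassifier(DecisionTreeClassifier(), n_estimators=25)),
        ("deep_mlp", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300)),
    ],
    voting="soft",                   # aggregate individual predicted probabilities
).fit(X, y)

prob = ensemble.predict_proba(X[:5])  # final ensembled prediction
```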
On the Diffusion Geometry of Graph Laplacians and Applications
Title | On the Diffusion Geometry of Graph Laplacians and Applications |
Authors | Xiuyuan Cheng, Manas Rachh, Stefan Steinerberger |
Abstract | We study directed, weighted graphs $G=(V,E)$ and consider the (not necessarily symmetric) averaging operator $$ (\mathcal{L}u)(i) = -\sum_{j \sim i}{p_{ij} (u(j) - u(i))},$$ where $p_{ij}$ are normalized edge weights. Given a vertex $i \in V$, we define the diffusion distance to a set $B \subset V$ as the smallest number of steps $d_{B}(i) \in \mathbb{N}$ required for half of all random walks started in $i$ and moving randomly with respect to the weights $p_{ij}$ to visit $B$ within $d_{B}(i)$ steps. Our main result is that the eigenfunctions interact nicely with this notion of distance. In particular, if $u$ satisfies $\mathcal{L}u = \lambda u$ on $V$ and $$ B = \left\{ i \in V: - \varepsilon \leq u(i) \leq \varepsilon \right\} \neq \emptyset,$$ then, for all $i \in V$, $$ d_{B}(i) \log{\left( \frac{1}{1-\lambda} \right) } \geq \log{\left( \frac{ |u(i)| }{\|u\|_{L^{\infty}}} \right)} - \log{\left(\frac{1}{2} + \varepsilon\right)}.$$ $d_B(i)$ is a remarkably good approximation of $u$ in the sense of having very high correlation. The result implies that the classical one-dimensional spectral embedding preserves particular aspects of geometry in the presence of clustered data. We also give a continuous variant of the result which has a connection to the hot spots conjecture. |
Tasks | |
Published | 2016-11-09 |
URL | http://arxiv.org/abs/1611.03033v1 |
http://arxiv.org/pdf/1611.03033v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-diffusion-geometry-of-graph-laplacians |
Repo | |
Framework | |
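A small numpy sketch of the diffusion distance $d_B(i)$ defined in the abstract: kill transitions into $B$ and find the first step count at which the survival probability of a walk started at $i$ drops to $1/2$. The cycle-graph example at the end is an assumption for illustration.

```python
import numpy as np

def diffusion_distance(P, B, max_steps=10_000):
    """d_B(i): smallest t such that a walk started at i has visited B
    within t steps with probability >= 1/2.  P is row-stochastic."""
    n = P.shape[0]
    in_B = np.zeros(n, dtype=bool)
    in_B[list(B)] = True
    Q = P[np.ix_(~in_B, ~in_B)]       # walk restricted to V \ B (substochastic)
    s = np.ones(Q.shape[0])           # s[i] = P(walk from i has avoided B so far)
    d = np.zeros(n, dtype=int)        # d = 0 on B itself
    outside = np.flatnonzero(~in_B)
    done = np.zeros(Q.shape[0], dtype=bool)
    for t in range(1, max_steps + 1):
        s = Q @ s                     # one more step without touching B
        newly = ~done & (s <= 0.5)
        d[outside[newly]] = t
        done |= newly
        if done.all():
            break
    return d

# Example: simple random walk on a 30-cycle, B = {0}.
n = 30
P = np.zeros((n, n))
for i in range(n):
    P[i, (i - 1) % n] = P[i, (i + 1) % n] = 0.5
print(diffusion_distance(P, {0}))
```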
End-to-End Deep Reinforcement Learning for Lane Keeping Assist
Title | End-to-End Deep Reinforcement Learning for Lane Keeping Assist |
Authors | Ahmad El Sallab, Mohammed Abdou, Etienne Perot, Senthil Yogamani |
Abstract | Reinforcement learning is considered to be a strong AI paradigm which can be used to teach machines through interaction with the environment and learning from their mistakes, but it has not yet been successfully used for automotive applications. There has recently been a revival of interest in the topic, however, driven by the ability of deep learning algorithms to learn good representations of the environment. Motivated by Google DeepMind’s successful demonstrations of learning for games from Breakout to Go, we propose different methods for autonomous driving using deep reinforcement learning. This is of particular interest because it is difficult to pose autonomous driving as a supervised learning problem, owing to its strong interaction with the environment, including other vehicles, pedestrians and roadworks. As this is a relatively new area of research for autonomous driving, we formulate two main categories of algorithms: 1) a discrete actions category and 2) a continuous actions category. For the discrete actions category, we use the Deep Q-Network (DQN) algorithm, while for the continuous actions category, we use the Deep Deterministic Actor-Critic (DDAC) algorithm. In addition, we examine the performance of these two categories on TORCS (The Open Racing Car Simulator), an open-source racing car simulator. Our simulation results demonstrate learning of autonomous maneuvering in a scenario of complex road curvatures and simple interaction with other vehicles. Finally, we explain the effect of some restricted conditions, imposed on the car during the learning phase, on the convergence time of the learning phase. |
Tasks | Autonomous Driving |
Published | 2016-12-13 |
URL | http://arxiv.org/abs/1612.04340v1 |
http://arxiv.org/pdf/1612.04340v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-deep-reinforcement-learning-for |
Repo | |
Framework | |
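The discrete-action branch rests on the standard DQN update. A minimal PyTorch sketch of the Bellman-target computation on a random batch — the network sizes, state dimension and separate target network are the usual DQN conventions, not details taken from the paper.

```python
import torch
import torch.nn as nn

state_dim, n_actions, gamma = 8, 3, 0.99   # e.g., discretized steering bins

def make_q():                              # small Q-network: state -> action values
    return nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                         nn.Linear(64, n_actions))

policy_net, target_net = make_q(), make_q()
target_net.load_state_dict(policy_net.state_dict())

# A random transition batch stands in for replay-buffer samples.
s = torch.randn(32, state_dim); s2 = torch.randn(32, state_dim)
a = torch.randint(0, n_actions, (32,)); r = torch.randn(32)
done = torch.zeros(32)

with torch.no_grad():                      # y = r + gamma * max_a' Q_target(s', a')
    y = r + gamma * (1 - done) * target_net(s2).max(dim=1).values

q_sa = policy_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
loss = nn.functional.mse_loss(q_sa, y)
loss.backward()                            # one gradient step of the DQN update
```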
Asymptotically exact inference in differentiable generative models
Title | Asymptotically exact inference in differentiable generative models |
Authors | Matthew M. Graham, Amos J. Storkey |
Abstract | Many generative models can be expressed as a differentiable function of random inputs drawn from some simple probability density. This framework includes both deep generative architectures such as Variational Autoencoders and a large class of procedurally defined simulator models. We present a method for performing efficient MCMC inference in such models when conditioning on observations of the model output. For some models this offers an asymptotically exact inference method where Approximate Bayesian Computation might otherwise be employed. We use the intuition that inference corresponds to integrating a density across the manifold corresponding to the set of inputs consistent with the observed outputs. This motivates the use of a constrained variant of Hamiltonian Monte Carlo which leverages the smooth geometry of the manifold to coherently move between inputs exactly consistent with observations. We validate the method by performing inference tasks in a diverse set of models. |
Tasks | |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07826v4 |
http://arxiv.org/pdf/1605.07826v4.pdf | |
PWC | https://paperswithcode.com/paper/asymptotically-exact-inference-in |
Repo | |
Framework | |
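The geometric object the method moves along is the zero set of the constraint $c(z) = g(z) - y_{\text{obs}}$, where $g$ is the differentiable generator. Here is a sketch of the piece automatic differentiation supplies to constrained HMC — the constraint Jacobian — using a toy generator; the full sampler is not reproduced, and the generator is an invented example.

```python
import torch
from torch.autograd.functional import jacobian

def generator(z):
    """Toy differentiable generator: 4 latent inputs -> 2 observed outputs."""
    return torch.stack([z[0] + z[1] ** 2 + 0.1 * z[3],
                        torch.sin(z[1]) + z[2]])

y_obs = torch.tensor([0.5, -0.2])
constraint = lambda z: generator(z) - y_obs   # zero level set = inference manifold

z = torch.randn(4)
J = jacobian(constraint, z)   # (2, 4) Jacobian; constrained HMC uses it to keep
print(J)                      # proposed moves on the manifold g(z) = y_obs
```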
High-Dimensional $L_2$Boosting: Rate of Convergence
Title | High-Dimensional $L_2$Boosting: Rate of Convergence |
Authors | Ye Luo, Martin Spindler |
Abstract | Boosting is one of the most significant developments in machine learning. This paper studies the rate of convergence of $L_2$Boosting, which is tailored for regression, in a high-dimensional setting. Moreover, we introduce so-called “post-Boosting”. This is a post-selection estimator which applies ordinary least squares to the variables selected in the first stage by $L_2$Boosting. Another variant is “Orthogonal Boosting”, where an orthogonal projection is conducted after each step. We show that both post-$L_2$Boosting and orthogonal boosting achieve the same rate of convergence as LASSO in a sparse, high-dimensional setting. We show that the rate of convergence of classical $L_2$Boosting depends on the design matrix as described by a sparse eigenvalue constant. To show the latter results, we derive new approximation results for the pure greedy algorithm, based on analyzing the revisiting behavior of $L_2$Boosting. We also introduce feasible rules for early stopping, which can be easily implemented and used in applied work. Our results also allow a direct comparison between LASSO and boosting which has been missing from the literature. Finally, we present simulation studies and applications to illustrate the relevance of our theoretical results and to provide insights into the practical aspects of boosting. In these simulation studies, post-$L_2$Boosting clearly outperforms LASSO. |
Tasks | |
Published | 2016-02-29 |
URL | http://arxiv.org/abs/1602.08927v2 |
http://arxiv.org/pdf/1602.08927v2.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-l_2boosting-rate-of |
Repo | |
Framework | |
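The pure greedy algorithm behind $L_2$Boosting, plus the post-Boosting refit, fits in a few lines of numpy: repeatedly fit the residual on the single standardized covariate most correlated with it, then run OLS on the selected support. The shrinkage `nu`, step count and synthetic sparse design are illustrative assumptions.

```python
import numpy as np

def l2boost(X, y, n_steps=200, nu=0.1):
    """Pure greedy L2Boosting on standardized covariates."""
    n, p = X.shape
    Xs = (X - X.mean(0)) / X.std(0)
    beta, resid = np.zeros(p), y - y.mean()
    for _ in range(n_steps):
        corr = Xs.T @ resid / n        # per-covariate OLS coefficient on residual
        j = np.abs(corr).argmax()      # greedily pick the best-fitting covariate
        beta[j] += nu * corr[j]
        resid -= nu * corr[j] * Xs[:, j]
    return Xs, beta, resid

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 500))        # high-dimensional design: p >> n
y = X[:, 0] - 2 * X[:, 1] + 0.5 * rng.normal(size=100)

Xs, beta, _ = l2boost(X, y)
support = np.flatnonzero(np.abs(beta) > 1e-8)
# Post-L2Boosting: ordinary least squares on the selected variables only.
beta_post = np.linalg.lstsq(Xs[:, support], y - y.mean(), rcond=None)[0]
```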
Approximate search with quantized sparse representations
Title | Approximate search with quantized sparse representations |
Authors | Himalaya Jain, Patrick Pérez, Rémi Gribonval, Joaquin Zepeda, Hervé Jégou |
Abstract | This paper tackles the task of storing a large collection of vectors, such as visual descriptors, and of searching in it. To this end, we propose to approximate database vectors by constrained sparse coding, where possible atom weights are restricted to belong to a finite subset. This formulation encompasses, as particular cases, previous state-of-the-art methods such as product or residual quantization. As opposed to traditional sparse coding methods, quantized sparse coding includes memory usage as a design constraint, thereby allowing us to index a large collection such as the BIGANN billion-sized benchmark. Our experiments, carried out on standard benchmarks, show that our formulation leads to competitive solutions when considering different trade-offs between learning/coding time, index size and search quality. |
Tasks | Quantization |
Published | 2016-08-10 |
URL | http://arxiv.org/abs/1608.03308v1 |
http://arxiv.org/pdf/1608.03308v1.pdf | |
PWC | https://paperswithcode.com/paper/approximate-search-with-quantized-sparse |
Repo | |
Framework | |
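Residual quantization — one of the special cases the formulation encompasses — conveys the flavor of the approach: approximate each vector by a sum of codewords, one per codebook, chosen greedily on the remaining residual. The codebook sizes and random data below are assumptions; the paper's constrained sparse coding additionally restricts atom weights to a finite subset.

```python
import numpy as np

def rq_encode(X, codebooks):
    """Greedy residual quantization: X ~ sum of one codeword per codebook."""
    codes, residual = [], X.copy()
    for C in codebooks:                                    # C: (k, d) codewords
        d2 = ((residual[:, None, :] - C[None]) ** 2).sum(-1)
        idx = d2.argmin(axis=1)                            # nearest codeword
        codes.append(idx)
        residual = residual - C[idx]                       # quantize what remains
    return np.stack(codes, axis=1), residual

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))                            # database vectors
codebooks = [rng.normal(size=(256, 32)) for _ in range(4)]  # 4 bytes per vector
codes, residual = rq_encode(X, codebooks)                  # codes: (1000, 4)
```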
The IBM 2016 English Conversational Telephone Speech Recognition System
Title | The IBM 2016 English Conversational Telephone Speech Recognition System |
Authors | George Saon, Tom Sercu, Steven Rennie, Hong-Kwang J. Kuo |
Abstract | We describe a collection of acoustic and language modeling techniques that lowered the word error rate of our English conversational telephone LVCSR system to a record 6.6% on the Switchboard subset of the Hub5 2000 evaluation testset. On the acoustic side, we use a score fusion of three strong models: recurrent nets with maxout activations, very deep convolutional nets with 3x3 kernels, and bidirectional long short-term memory nets which operate on FMLLR and i-vector features. On the language modeling side, we use an updated model “M” and hierarchical neural network LMs. |
Tasks | Language Modelling, Large Vocabulary Continuous Speech Recognition, Speech Recognition |
Published | 2016-04-27 |
URL | http://arxiv.org/abs/1604.08242v2 |
http://arxiv.org/pdf/1604.08242v2.pdf | |
PWC | https://paperswithcode.com/paper/the-ibm-2016-english-conversational-telephone |
Repo | |
Framework | |
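The acoustic side hinges on frame-level score fusion of three models. A minimal sketch of log-linear fusion with fixed weights; the shapes, weights and random posteriors below are placeholders, not the system's tuned values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_states = 300, 1000            # placeholder frame count and state set
# Per-frame log posteriors from three acoustic models (maxout RNN, deep CNN, BLSTM).
logp = [np.log(rng.dirichlet(np.ones(n_states), size=n_frames)) for _ in range(3)]

weights = [0.4, 0.3, 0.3]                 # fusion weights, tuned on held-out data
fused = sum(w * lp for w, lp in zip(weights, logp))

best_state = fused.argmax(axis=1)         # fused scores then feed the decoder
```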
On interestingness measures of formal concepts
Title | On interestingness measures of formal concepts |
Authors | Sergei O. Kuznetsov, Tatiana Makhalova |
Abstract | Formal concepts and closed itemsets have proved to be of great importance for knowledge discovery, both as a tool for the concise representation of association rules and as a tool for clustering and constructing domain taxonomies and ontologies. Exponential explosion makes it difficult to consider the whole concept lattice arising from data; one needs to select the most useful and interesting concepts. In this paper, interestingness measures of concepts are considered and compared with respect to various aspects, such as efficiency of computation, applicability to noisy data, and ranking correlation. |
Tasks | |
Published | 2016-11-08 |
URL | http://arxiv.org/abs/1611.02646v2 |
http://arxiv.org/pdf/1611.02646v2.pdf | |
PWC | https://paperswithcode.com/paper/on-interestingness-measures-of-formal |
Repo | |
Framework | |
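A small, self-contained sketch of the underlying objects: the formal concepts of a binary context, computed via the closure operator, with extent support as one of the simplest interestingness measures. The toy context is an assumption; the paper compares far richer measures (e.g., stability).

```python
from itertools import combinations

# Toy formal context: objects x attributes ((g, m) in I means g has attribute m).
objects = ["o1", "o2", "o3", "o4"]
attrs = ["a", "b", "c"]
I = {("o1", "a"), ("o1", "b"), ("o2", "b"), ("o3", "b"), ("o3", "c"), ("o4", "c")}

def intent(objs):                      # attributes shared by all objects in objs
    return frozenset(m for m in attrs if all((g, m) in I for g in objs))

def extent(ms):                        # objects having all attributes in ms
    return frozenset(g for g in objects if all((g, m) in I for m in ms))

# Enumerate all formal concepts (extent, intent) by closing every object subset.
concepts = set()
for r in range(len(objects) + 1):
    for objs in combinations(objects, r):
        B = intent(objs)
        concepts.add((extent(B), B))

for A, B in sorted(concepts, key=lambda c: -len(c[0])):
    support = len(A) / len(objects)    # a basic interestingness measure
    print(sorted(A), sorted(B), f"support={support:.2f}")
```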
CB2CF: A Neural Multiview Content-to-Collaborative Filtering Model for Completely Cold Item Recommendations
Title | CB2CF: A Neural Multiview Content-to-Collaborative Filtering Model for Completely Cold Item Recommendations |
Authors | Oren Barkan, Noam Koenigstein, Eylon Yogev, Ori Katz |
Abstract | In Recommender Systems research, algorithms are often characterized as either Collaborative Filtering (CF) or Content Based (CB). CF algorithms are trained using a dataset of user preferences, while CB algorithms are typically based on item profiles. These approaches harness different data sources, and therefore the resulting recommended items are generally very different. This paper presents CB2CF, a deep neural multiview model that serves as a bridge from items’ content into their CF representations. CB2CF is a real-world algorithm designed for Microsoft Store services that handle around a billion users worldwide. CB2CF is demonstrated on movie and app recommendations, where it is shown to outperform an alternative CB model on completely cold items. |
Tasks | Recommendation Systems |
Published | 2016-11-01 |
URL | https://arxiv.org/abs/1611.00384v2 |
https://arxiv.org/pdf/1611.00384v2.pdf | |
PWC | https://paperswithcode.com/paper/the-deep-journey-from-content-to |
Repo | |
Framework | |
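The bridge the model learns — content features in, CF item vectors out — can be sketched with a stand-in regressor. Random data replaces real CF embeddings and content features, and `MLPRegressor` replaces the paper's multiview neural network.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
k = 32                                     # CF embedding dimension
V_warm = rng.normal(size=(5000, k))        # CF vectors of warm items (from MF)
X_warm = rng.normal(size=(5000, 100))      # content features of the same items
X_cold = rng.normal(size=(50, 100))        # content of brand-new (cold) items

# Learn content -> CF mapping on warm items, then project cold items into CF space.
bridge = MLPRegressor(hidden_layer_sizes=(256,), max_iter=300).fit(X_warm, V_warm)
V_cold = bridge.predict(X_cold)            # (50, k) usable CF vectors

user = rng.normal(size=k)                  # any existing CF user vector
scores = V_cold @ user                     # rank cold items for this user
```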
Reading Comprehension using Entity-based Memory Network
Title | Reading Comprehension using Entity-based Memory Network |
Authors | Xun Wang, Katsuhito Sudoh, Masaaki Nagata, Tomohide Shibata, Daisuke Kawahara, Sadao Kurohashi |
Abstract | This paper introduces a novel neural network model for question answering, the \emph{entity-based memory network}. It enhances neural networks’ ability to represent and compute information over long spans of text by keeping records of the entities contained in the text. The core component is a memory pool which comprises entities’ states. These entities’ states are continuously updated according to the input text. Questions about the input text are used to search the memory pool for related entities, and answers are then predicted based on the states of the retrieved entities. Compared with previous memory network models, the proposed model is capable of handling fine-grained information and more sophisticated relations based on entities. We formulated several different tasks as question answering problems and tested the proposed model. Experiments report satisfactory results. |
Tasks | Question Answering, Reading Comprehension |
Published | 2016-12-12 |
URL | http://arxiv.org/abs/1612.03551v3 |
http://arxiv.org/pdf/1612.03551v3.pdf | |
PWC | https://paperswithcode.com/paper/reading-comprehension-using-entity-based |
Repo | |
Framework | |
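The core data structure is a memory pool keyed by entity, with each state updated as the entity is mentioned. A minimal PyTorch sketch with a GRU cell as the update function; the encoder producing mention vectors and the answer-prediction head are omitted, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

dim = 64
update = nn.GRUCell(input_size=dim, hidden_size=dim)   # state-update function
memory = {}                                            # entity -> state vector

def read_mention(entity, mention_vec):
    """Update the entity's state with a new mention from the input text."""
    h = memory.get(entity, torch.zeros(1, dim))
    memory[entity] = update(mention_vec.unsqueeze(0), h)

# Toy stream of (entity, encoded mention) pairs from a passage.
for entity in ["Alice", "Bob", "Alice"]:
    read_mention(entity, torch.randn(dim))

# A question vector retrieves related entities by similarity to their states.
q = torch.randn(dim)
scores = {e: torch.dot(h.squeeze(0), q).item() for e, h in memory.items()}
print(max(scores, key=scores.get))
```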
Optimal bandwidth estimation for a fast manifold learning algorithm to detect circular structure in high-dimensional data
Title | Optimal bandwidth estimation for a fast manifold learning algorithm to detect circular structure in high-dimensional data |
Authors | Susovan Pal, Praneeth Vepakomma |
Abstract | We provide a way to infer the existence of topological circularity in high-dimensional data sets in $\mathbb{R}^d$ from their projection in $\mathbb{R}^2$, obtained through a fast manifold learning map that is a function of the high-dimensional dataset $\mathbb{X}$ and of a particular choice of a positive real $\sigma$ known as the bandwidth parameter. At the same time, we provide a way to estimate the optimal bandwidth for fast manifold learning in this setting through minimization of these functions of the bandwidth. We also provide limit theorems to characterize the behavior of our proposed functions of the bandwidth. |
Tasks | |
Published | 2016-12-28 |
URL | http://arxiv.org/abs/1612.08932v1 |
http://arxiv.org/pdf/1612.08932v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-bandwidth-estimation-for-a-fast |
Repo | |
Framework | |
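A sketch of the setting: a Gaussian-kernel spectral embedding of noisy circular data into $\mathbb{R}^2$, scanned over the bandwidth $\sigma$. The selection criterion here (how tightly the embedded points hug a common radius) is a stand-in for the paper's bandwidth functions, and the data are synthetic.

```python
import numpy as np

def spectral_embed(X, sigma):
    """Two-dimensional Gaussian-kernel Laplacian eigenmap embedding."""
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    deg = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(deg, deg))   # normalized Laplacian
    _, vecs = np.linalg.eigh(L)
    return vecs[:, 1:3]                                    # skip trivial eigenvector

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)                     # circle in R^20 + noise
X = np.zeros((200, 20))
X[:, 0], X[:, 1] = np.cos(theta), np.sin(theta)
X += 0.05 * rng.normal(size=X.shape)

def circularity(Y):                   # low = points lie near a common radius
    r = np.linalg.norm(Y - Y.mean(0), axis=1)
    return r.std() / r.mean()

sigmas = np.logspace(-1.5, 0.5, 20)
best = min(sigmas, key=lambda s: circularity(spectral_embed(X, s)))
print(f"selected bandwidth: {best:.3f}")
```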