May 6, 2019

Paper Group ANR 360

Cross-lingual Dataless Classification for Languages with Small Wikipedia Presence. SemanticFusion: Dense 3D Semantic Mapping with Convolutional Neural Networks. Delay and Cooperation in Nonstochastic Bandits. Interactive Elicitation of Knowledge on Feature Relevance Improves Predictions in Small Data Sets. Deep Feature-based Face Detection on Mobil …

Cross-lingual Dataless Classification for Languages with Small Wikipedia Presence


Title	Cross-lingual Dataless Classification for Languages with Small Wikipedia Presence
Authors	Yangqiu Song, Stephen Mayhew, Dan Roth
Abstract	This paper presents an approach to classify documents in any language into an English topical label space, without any text categorization training data. The approach, Cross-Lingual Dataless Document Classification (CLDDC), relies on mapping the English labels or short category descriptions into a Wikipedia-based semantic representation, and on the use of the target language Wikipedia. Consequently, performance could suffer when Wikipedia in the target language is small. In this paper, we focus on languages with small Wikipedias (Small-Wikipedia languages, SWLs). We use a word-level dictionary to convert documents in an SWL to a large-Wikipedia language (LWL), and then perform CLDDC based on the LWL’s Wikipedia. This approach can be applied to thousands of languages, in contrast to machine translation, which is a supervision-heavy approach and can be done for only about 100 languages. We also develop a ranking algorithm that makes use of language similarity metrics to automatically select a good LWL, and show that this significantly improves classification of SWLs’ documents, performing comparably to the best bridge possible.
Tasks	Document Classification, Machine Translation, Text Categorization
Published	2016-11-13
URL	http://arxiv.org/abs/1611.04122v1
PDF	http://arxiv.org/pdf/1611.04122v1.pdf
PWC	https://paperswithcode.com/paper/cross-lingual-dataless-classification-for
Repo
Framework
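
The dictionary step described in the abstract above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function name `translate_with_dictionary`, the toy dictionary entries, and the choice to drop out-of-dictionary words are all assumptions made here.

```python
def translate_with_dictionary(tokens, dictionary):
    """Map each SWL token to its LWL translations via a word-level dictionary.

    Tokens missing from the dictionary are dropped. That policy is an
    assumption for this sketch; the abstract does not say how they are handled.
    """
    translated = []
    for token in tokens:
        translated.extend(dictionary.get(token.lower(), []))
    return translated


# Hypothetical toy dictionary from a small-Wikipedia language to a bridge language.
toy_dictionary = {"mpira": ["ball", "football"], "timu": ["team"]}
bridge_tokens = translate_with_dictionary(["Timu", "ya", "mpira"], toy_dictionary)
```

The resulting bag of bridge-language words would then be passed to whatever dataless classifier is run over the LWL's Wikipedia; that stage is not shown.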

SemanticFusion: Dense 3D Semantic Mapping with Convolutional Neural Networks


Title	SemanticFusion: Dense 3D Semantic Mapping with Convolutional Neural Networks
Authors	John McCormac, Ankur Handa, Andrew Davison, Stefan Leutenegger
Abstract	Ever more robust, accurate and detailed mapping using visual sensing has proven to be an enabling factor for mobile robots across a wide variety of applications. For the next level of robot intelligence and intuitive user interaction, maps need to extend beyond geometry and appearance - they need to contain semantics. We address this challenge by combining Convolutional Neural Networks (CNNs) and a state-of-the-art dense Simultaneous Localisation and Mapping (SLAM) system, ElasticFusion, which provides long-term dense correspondence between frames of indoor RGB-D video even during loopy scanning trajectories. These correspondences allow the CNN’s semantic predictions from multiple viewpoints to be probabilistically fused into a map. This not only produces a useful semantic 3D map, but we also show on the NYUv2 dataset that fusing multiple predictions leads to an improvement even in the 2D semantic labelling over baseline single frame predictions. We also show that for a smaller reconstruction dataset with larger variation in prediction viewpoint, the improvement over single frame segmentation increases. Our system is efficient enough to allow real-time interactive use at frame-rates of approximately 25Hz.
Tasks
Published	2016-09-16
URL	http://arxiv.org/abs/1609.05130v2
PDF	http://arxiv.org/pdf/1609.05130v2.pdf
PWC	https://paperswithcode.com/paper/semanticfusion-dense-3d-semantic-mapping-with
Repo
Framework
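
One common way to fuse class predictions from several viewpoints probabilistically is a recursive Bayesian update: multiply the stored class distribution by the new prediction and renormalise. The sketch below assumes that scheme for a single map element; the abstract does not spell out the exact update, and the name `fuse_predictions` and the example numbers are hypothetical.

```python
def fuse_predictions(prior, likelihood):
    """Multiply two class distributions element-wise and renormalise."""
    unnormalised = [p * q for p, q in zip(prior, likelihood)]
    total = sum(unnormalised)
    if total == 0.0:
        # No class is supported by both views; keep the prior unchanged.
        return list(prior)
    return [u / total for u in unnormalised]


# One map element observed from two viewpoints, three hypothetical classes.
view_1 = [0.5, 0.3, 0.2]
view_2 = [0.6, 0.3, 0.1]
fused = fuse_predictions(view_1, view_2)
```

When both views favour the same class, the fused distribution is more peaked than either input, which is the intuition behind multi-view fusion improving on single-frame labels.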

Delay and Cooperation in Nonstochastic Bandits


Title	Delay and Cooperation in Nonstochastic Bandits
Authors	Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour, Alberto Minora
Abstract	We study networks of communicating learning agents that cooperate to solve a common nonstochastic bandit problem. Agents use an underlying communication network to get messages about actions selected by other agents, and drop messages that took more than $d$ hops to arrive, where $d$ is a delay parameter. We introduce \textsc{Exp3-Coop}, a cooperative version of the \textsc{Exp3} algorithm, and prove that with $K$ actions and $N$ agents the average per-agent regret after $T$ rounds is at most of order $\sqrt{\bigl(d+1 + \tfrac{K}{N}\alpha_{\le d}\bigr)(T\ln K)}$, where $\alpha_{\le d}$ is the independence number of the $d$-th power of the connected communication graph $G$. We then show that for any connected graph, for $d=\sqrt{K}$ the regret bound is $K^{1/4}\sqrt{T}$, strictly better than the minimax regret $\sqrt{KT}$ for noncooperating agents. More informed choices of $d$ lead to bounds which are arbitrarily close to the full information minimax regret $\sqrt{T\ln K}$ when $G$ is dense. When $G$ has sparse components, we show that a variant of \textsc{Exp3-Coop}, allowing agents to choose their parameters according to their centrality in $G$, strictly improves the regret. Finally, as a by-product of our analysis, we provide the first characterization of the minimax regret for bandit learning with delay.
Tasks
Published	2016-02-15
URL	http://arxiv.org/abs/1602.04741v2
PDF	http://arxiv.org/pdf/1602.04741v2.pdf
PWC	https://paperswithcode.com/paper/delay-and-cooperation-in-nonstochastic
Repo
Framework
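
To get a feel for the regret bound quoted in the abstract above, the expression can be evaluated numerically. The sketch below only computes the formula with constants omitted; it does not implement the algorithm. The function name and the example values are hypothetical, and the independence number is passed in as a plain number rather than derived from a graph.

```python
import math


def exp3_coop_bound(d, num_actions, num_agents, alpha_d, horizon):
    """Evaluate sqrt((d + 1 + (K / N) * alpha_d) * T * ln K), constants omitted."""
    k, n, t = num_actions, num_agents, horizon
    return math.sqrt((d + 1 + (k / n) * alpha_d) * t * math.log(k))


# Hypothetical setting: same delay and independence number, different agent counts.
few_agents = exp3_coop_bound(d=1, num_actions=100, num_agents=5, alpha_d=5, horizon=10_000)
many_agents = exp3_coop_bound(d=1, num_actions=100, num_agents=50, alpha_d=5, horizon=10_000)
```

With everything else fixed, a larger number of agents shrinks the K/N term, so `many_agents` comes out smaller than `few_agents`.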

Interactive Elicitation of Knowledge on Feature Relevance Improves Predictions in Small Data Sets


Title	Interactive Elicitation of Knowledge on Feature Relevance Improves Predictions in Small Data Sets
Authors	Luana Micallef, Iiris Sundin, Pekka Marttinen, Muhammad Ammad-ud-din, Tomi Peltola, Marta Soare, Giulio Jacucci, Samuel Kaski
Abstract	Providing accurate predictions is challenging for machine learning algorithms when the number of features is larger than the number of samples in the data. Prior knowledge can improve machine learning models by indicating relevant variables and parameter values. Yet, this prior knowledge is often tacit and only available from domain experts. We present a novel approach that uses interactive visualization to elicit the tacit prior knowledge and uses it to improve the accuracy of prediction models. The main component of our approach is a user model that models the domain expert’s knowledge of the relevance of different features for a prediction task. In particular, based on the expert’s earlier input, the user model guides the selection of the features on which to elicit the user’s knowledge next. The results of a controlled user study show that the user model significantly improves prior knowledge elicitation and prediction accuracy when predicting the relative citation counts of scientific documents in a specific domain.
Tasks
Published	2016-12-07
URL	http://arxiv.org/abs/1612.02487v2
PDF	http://arxiv.org/pdf/1612.02487v2.pdf
PWC	https://paperswithcode.com/paper/interactive-elicitation-of-knowledge-on
Repo
Framework

Deep Feature-based Face Detection on Mobile Devices


Title	Deep Feature-based Face Detection on Mobile Devices
Authors	Sayantan Sarkar, Vishal M. Patel, Rama Chellappa
Abstract	We propose a deep feature-based face detector for mobile devices to detect the user’s face acquired by the front-facing camera. The proposed method is able to detect faces in images containing extreme pose and illumination variations as well as partial faces. The main challenge in developing deep feature-based algorithms for mobile devices is the constrained nature of the mobile platform and the non-availability of CUDA-enabled GPUs on such devices. Our implementation takes into account the special nature of the images captured by the front-facing camera of mobile devices and exploits the GPUs present in mobile devices without CUDA-based frameworks, to meet these challenges.
Tasks	Face Detection
Published	2016-02-16
URL	http://arxiv.org/abs/1602.04868v1
PDF	http://arxiv.org/pdf/1602.04868v1.pdf
PWC	https://paperswithcode.com/paper/deep-feature-based-face-detection-on-mobile
Repo
Framework

Connecting Phrase based Statistical Machine Translation Adaptation


Title	Connecting Phrase based Statistical Machine Translation Adaptation
Authors	Rui Wang, Hai Zhao, Bao-Liang Lu, Masao Utiyama, Eiichiro Sumita
Abstract	Although additional corpora are increasingly available for Statistical Machine Translation (SMT), only those that belong to the same or similar domains as the original corpus can directly enhance SMT performance. Most of the existing adaptation methods focus on sentence selection. In comparison, the phrase is a smaller and more fine-grained unit for data selection. We therefore propose a straightforward and efficient connecting phrase based adaptation method, which is applied to both bilingual phrase pair and monolingual n-gram adaptation. The proposed method is evaluated on IWSLT/NIST data sets, and the results show that phrase based SMT performance is significantly improved (up to +1.6 in comparison with the phrase based SMT baseline system and +0.9 in comparison with existing methods).
Tasks	Machine Translation
Published	2016-07-29
URL	http://arxiv.org/abs/1607.08693v1
PDF	http://arxiv.org/pdf/1607.08693v1.pdf
PWC	https://paperswithcode.com/paper/connecting-phrase-based-statistical-machine
Repo
Framework

Superimposition of eye fundus images for longitudinal analysis from large public health databases


Title	Superimposition of eye fundus images for longitudinal analysis from large public health databases
Authors	Guillaume Noyel, Rebecca Thomas, Gavin Bhakta, Andrew Crowder, David Owens, Peter Boyle
Abstract	In this paper, a method is presented for superimposition (i.e. registration) of eye fundus images from persons with diabetes screened over many years for diabetic retinopathy. The method is fully automatic and robust to camera changes and colour variations across the images both in space and time. All the stages of the process are designed for longitudinal analysis of cohort public health databases where retinal examinations are made at approximately yearly intervals. The method relies on a model correcting two radial distortions and an affine transformation between pairs of images which is robustly fitted on salient points. Each stage involves linear estimators followed by non-linear optimisation. The model of image warping is also invertible for fast computation. The method has been validated (1) on a simulated montage and (2) on public health databases with 69 patients with high quality images (271 pairs acquired mostly with different types of camera and 268 pairs acquired mostly with the same type of camera) with success rates of 92% and 98%, and five patients (20 pairs) with low quality images with a success rate of 100%. Compared to two state-of-the-art methods, ours gives better results.
Tasks
Published	2016-07-07
URL	http://arxiv.org/abs/1607.01971v3
PDF	http://arxiv.org/pdf/1607.01971v3.pdf
PWC	https://paperswithcode.com/paper/superimposition-of-eye-fundus-images-for
Repo
Framework

Building the Signature of Set Theory Using the MathSem Program


Title	Building the Signature of Set Theory Using the MathSem Program
Authors	Andrey Luxemburg
Abstract	Knowledge representation is a popular research field in IT. As mathematical knowledge is the most formalized, its representation is important and interesting. Mathematical knowledge consists of various mathematical theories. In this paper we consider a deductive system that derives mathematical notions, axioms and theorems. All these notions, axioms and theorems can be considered as part of elementary set theory. This theory will be represented as a semantic net.
Tasks
Published	2016-03-31
URL	http://arxiv.org/abs/1603.09488v1
PDF	http://arxiv.org/pdf/1603.09488v1.pdf
PWC	https://paperswithcode.com/paper/building-the-signature-of-set-theory-using
Repo
Framework

An Iterative Transfer Learning Based Ensemble Technique for Automatic Short Answer Grading


Title	An Iterative Transfer Learning Based Ensemble Technique for Automatic Short Answer Grading
Authors	Shourya Roy, Himanshu S. Bhatt, Y. Narahari
Abstract	Automatic short answer grading (ASAG) techniques are designed to automatically assess short answers to questions in natural language, having a length of a few words to a few sentences. Supervised ASAG techniques have been demonstrated to be effective but suffer from a couple of key practical limitations. They are greatly reliant on instructor-provided model answers and need labeled training data in the form of graded student answers for every assessment task. To overcome these, in this paper, we introduce an ASAG technique with two novel features. First, we propose an iterative technique on an ensemble of (a) a text classifier of student answers and (b) a classifier using numeric features derived from various similarity measures with respect to model answers. Second, we employ canonical correlation analysis based transfer learning on a common feature representation to build the classifier ensemble for questions having no labelled data. The proposed technique handsomely beats all winning supervised entries on the SciEntsBank dataset from the Student Response Analysis task of SemEval 2013. Additionally, we demonstrate generalizability and benefits of the proposed technique through evaluation on multiple ASAG datasets from different subject topics and standards.
Tasks	Transfer Learning
Published	2016-09-16
URL	http://arxiv.org/abs/1609.04909v3
PDF	http://arxiv.org/pdf/1609.04909v3.pdf
PWC	https://paperswithcode.com/paper/an-iterative-transfer-learning-based-ensemble
Repo
Framework

Learning Multi-level Features For Sensor-based Human Action Recognition


Title	Learning Multi-level Features For Sensor-based Human Action Recognition
Authors	Yan Xu, Zhengyang Shen, Xin Zhang, Yifan Gao, Shujian Deng, Yipei Wang, Yubo Fan, Eric I-Chao Chang
Abstract	This paper proposes a multi-level feature learning framework for human action recognition using a single body-worn inertial sensor. The framework consists of three phases, respectively designed to analyze signal-based (low-level), components (mid-level) and semantic (high-level) information. Low-level features capture the time- and frequency-domain properties while mid-level representations learn the composition of the action. The Max-margin Latent Pattern Learning (MLPL) method is proposed to learn high-level semantic descriptions of latent action patterns as the output of our framework. The proposed method achieves state-of-the-art performance of 88.7%, 98.8% and 72.6% (weighted F1 score) on the Skoda, WISDM and OPP datasets, respectively.
Tasks	Temporal Action Localization
Published	2016-11-22
URL	http://arxiv.org/abs/1611.07143v2
PDF	http://arxiv.org/pdf/1611.07143v2.pdf
PWC	https://paperswithcode.com/paper/learning-multi-level-features-for-sensor
Repo
Framework
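
The results above are reported as weighted F1 scores. A minimal sketch of that metric follows, assuming the common definition in which each class's F1 is weighted by its share of the true labels; the abstract does not define the metric, and the function name and activity labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is at least `count` here.
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (count / total) * f1
    return score


# Hypothetical activity labels: three "walk" windows and one "sit" window.
truth = ["walk", "walk", "walk", "sit"]
predicted = ["walk", "walk", "sit", "sit"]
example_score = weighted_f1(truth, predicted)
```

In this toy case "walk" has F1 of 0.8 and "sit" has F1 of 2/3, so the support-weighted average is 0.75 × 0.8 + 0.25 × 2/3.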

Auxiliary gradient-based sampling algorithms


Title	Auxiliary gradient-based sampling algorithms
Authors	Michalis K. Titsias, Omiros Papaspiliopoulos
Abstract	We introduce a new family of MCMC samplers that combine auxiliary variables, Gibbs sampling and Taylor expansions of the target density. Our approach permits the marginalisation over the auxiliary variables yielding marginal samplers, or the augmentation of the auxiliary variables, yielding auxiliary samplers. The well-known Metropolis-adjusted Langevin algorithm (MALA) and preconditioned Crank-Nicolson Langevin (pCNL) algorithm are shown to be special cases. We prove that marginal samplers are superior in terms of asymptotic variance and demonstrate cases where they are slower in computing time compared to auxiliary samplers. In the context of latent Gaussian models we propose new auxiliary and marginal samplers whose implementation requires a single tuning parameter, which can be found automatically during the transient phase. Extensive experimentation shows that the increase in efficiency (measured as effective sample size per unit of computing time) relative to (optimised implementations of) pCNL, elliptical slice sampling and MALA ranges from 10-fold in binary classification problems to 25-fold in log-Gaussian Cox processes to 100-fold in Gaussian process regression, and it is on par with Riemann manifold Hamiltonian Monte Carlo in an example where the latter has the same complexity as the aforementioned algorithms. We explain this remarkable improvement in terms of the way alternative samplers try to approximate the eigenvalues of the target. We introduce a novel MCMC sampling scheme for hyperparameter learning that builds upon the auxiliary samplers. The MATLAB code for reproducing the experiments in the article is publicly available and a Supplement to this article contains additional experiments and implementation details.
Tasks
Published	2016-10-30
URL	http://arxiv.org/abs/1610.09641v3
PDF	http://arxiv.org/pdf/1610.09641v3.pdf
PWC	https://paperswithcode.com/paper/auxiliary-gradient-based-sampling-algorithms
Repo
Framework
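
For readers unfamiliar with MALA, which the abstract above names as a special case of the proposed family, here is a minimal one-dimensional sketch of the standard algorithm. It is not the authors' new sampler and is unrelated to their MATLAB code; the function name, step size, and standard normal target are illustrative assumptions.

```python
import math
import random


def mala_step(x, log_density, grad_log_density, step, rng):
    """One Metropolis-adjusted Langevin step for a one-dimensional target."""
    def proposal_mean(point):
        # Langevin drift: move half a step along the gradient of the log density.
        return point + 0.5 * step * grad_log_density(point)

    def log_q(to, frm):
        # Log density (up to a constant) of proposing `to` when standing at `frm`.
        return -((to - proposal_mean(frm)) ** 2) / (2.0 * step)

    proposal = proposal_mean(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)
    log_accept = (log_density(proposal) + log_q(x, proposal)
                  - log_density(x) - log_q(proposal, x))
    if rng.random() < math.exp(min(0.0, log_accept)):
        return proposal
    return x


# Usage: sample from a standard normal target, starting away from the mode.
rng = random.Random(0)
x, samples = 3.0, []
for _ in range(5000):
    x = mala_step(x, lambda v: -0.5 * v * v, lambda v: -v, 1.0, rng)
    samples.append(x)
```

The Metropolis correction uses both the forward and the reverse proposal densities because the gradient drift makes the proposal asymmetric.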

Automatic Action Annotation in Weakly Labeled Videos


Title	Automatic Action Annotation in Weakly Labeled Videos
Authors	Waqas Sultani, Mubarak Shah
Abstract	Manual spatio-temporal annotation of human action in videos is laborious, requires several annotators and contains human biases. In this paper, we present a weakly supervised approach to automatically obtain spatio-temporal annotations of an actor in action videos. We first obtain a large number of action proposals in each video. To capture a few most representative action proposals in each video and avoid processing thousands of them, we rank them using optical flow and saliency in a 3D-MRF based framework and select a few proposals using a MAP based proposal subset selection method. We demonstrate that this ranking preserves the high quality action proposals. Several such proposals are generated for each video of the same action. Our next challenge is to iteratively select one proposal from each video so that all proposals are globally consistent. We formulate this as a Generalized Maximum Clique Graph problem using shape, global and fine grained similarity of proposals across the videos. The output of our method is the most action representative proposals from each video. Our method can also annotate multiple instances of the same action in a video. We have validated our approach on three challenging action datasets: UCF Sports, sub-JHMDB and THUMOS’13 and have obtained promising results compared to several baseline methods. Moreover, on UCF Sports, we demonstrate that action classifiers trained on these automatically obtained spatio-temporal annotations have comparable performance to the classifiers trained on ground truth annotation.
Tasks	Optical Flow Estimation
Published	2016-05-26
URL	http://arxiv.org/abs/1605.08125v1
PDF	http://arxiv.org/pdf/1605.08125v1.pdf
PWC	https://paperswithcode.com/paper/automatic-action-annotation-in-weakly-labeled
Repo
Framework

What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots


Title	What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots
Authors	Chengxi Ye, Yezhou Yang, Cornelia Fermüller, Yiannis Aloimonos
Abstract	For robots that have the capability to interact with the physical environment through their end effectors, understanding the surrounding scenes is not merely a task of image classification or object recognition. To perform actual tasks, it is critical for the robot to have a functional understanding of the visual scene. Here, we address the problem of localizing and recognizing functional areas from an arbitrary indoor scene, formulated as a two-stage deep learning based detection pipeline. A new scene functionality test bed, which is compiled from two publicly available indoor scene datasets, is used for evaluation. Our method is evaluated quantitatively on the new dataset, demonstrating the ability to perform efficient recognition of functional areas from arbitrary indoor scenes. We also demonstrate that our detection model can be generalized onto novel indoor scenes by cross validating it with the images from two different datasets.
Tasks	Image Classification, Object Recognition, Scene Understanding
Published	2016-01-29
URL	http://arxiv.org/abs/1602.00032v2
PDF	http://arxiv.org/pdf/1602.00032v2.pdf
PWC	https://paperswithcode.com/paper/what-can-i-do-around-here-deep-functional
Repo
Framework

Generic Statistical Relational Entity Resolution in Knowledge Graphs


Title	Generic Statistical Relational Entity Resolution in Knowledge Graphs
Authors	Jay Pujara, Lise Getoor
Abstract	Entity resolution, the problem of identifying the underlying entity of references found in data, has been researched for many decades in many communities. A common theme in this research has been the importance of incorporating relational features into the resolution process. Relational entity resolution is particularly important in knowledge graphs (KGs), which have a regular structure capturing entities and their interrelationships. We identify three major problems in KG entity resolution: (1) intra-KG reference ambiguity; (2) inter-KG reference ambiguity; and (3) ambiguity when extending KGs with new facts. We implement a framework that generalizes across these three settings and exploits this regular structure of KGs. Our framework has many advantages over custom solutions widely deployed in industry, including collective inference, scalability, and interpretability. We apply our framework to two real-world KG entity resolution problems, ambiguity in NELL and merging data from Freebase and MusicBrainz, demonstrating the importance of relational features.
Tasks	Entity Resolution, Knowledge Graphs
Published	2016-07-04
URL	http://arxiv.org/abs/1607.00992v1
PDF	http://arxiv.org/pdf/1607.00992v1.pdf
PWC	https://paperswithcode.com/paper/generic-statistical-relational-entity
Repo
Framework

Multi-Task Cross-Lingual Sequence Tagging from Scratch


Title	Multi-Task Cross-Lingual Sequence Tagging from Scratch
Authors	Zhilin Yang, Ruslan Salakhutdinov, William Cohen
Abstract	We present a deep hierarchical recurrent neural network for sequence tagging. Given a sequence of words, our model employs deep gated recurrent units on both character and word levels to encode morphology and context information, and applies a conditional random field layer to predict the tags. Our model is task independent, language independent, and feature engineering free. We further extend our model to multi-task and cross-lingual joint training by sharing the architecture and parameters. Our model achieves state-of-the-art results in multiple languages on several benchmark tasks including POS tagging, chunking, and NER. We also demonstrate that multi-task and cross-lingual joint training can improve the performance in various cases.
Tasks	Chunking, Feature Engineering
Published	2016-03-20
URL	http://arxiv.org/abs/1603.06270v2
PDF	http://arxiv.org/pdf/1603.06270v2.pdf
PWC	https://paperswithcode.com/paper/multi-task-cross-lingual-sequence-tagging
Repo
Framework
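
A conditional random field layer, as mentioned in the abstract above, scores whole tag sequences, and the best sequence is typically found with Viterbi decoding. The sketch below shows that standard decoding step on hand-written scores. It assumes a linear-chain model with one emission score per position and tag plus a tag-to-tag transition matrix; the abstract does not describe the decoder, and all names and numbers here are hypothetical.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag sequence for a linear-chain model.

    emissions[t][j] is the score of tag j at position t;
    transitions[i][j] is the score of moving from tag i to tag j.
    """
    num_tags = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for step in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for ending at tag j on this position.
            best_prev = max(range(num_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_prev] + transitions[best_prev][j] + step[j])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(range(num_tags), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Two tags, three positions; switching tags is heavily penalised.
emissions = [[2.0, 0.0], [0.0, 1.0], [1.5, 0.0]]
sticky = [[0.0, -3.0], [-3.0, 0.0]]
decoded = viterbi(emissions, sticky)
```

With the switching penalty the middle position keeps tag 0 even though its emission score prefers tag 1; with an all-zero transition matrix the decoder simply picks the best tag at each position.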