October 19, 2019


Paper Group ANR 395

Event-based Gesture Recognition with Dynamic Background Suppression using Smartphone Computational Capabilities


Title	Event-based Gesture Recognition with Dynamic Background Suppression using Smartphone Computational Capabilities
Authors	Jean-Matthieu Maro, Ryad Benosman
Abstract	This paper introduces a framework for gesture recognition that operates on the output of an event-based camera using the computational resources of a mobile phone. We introduce a new development around the concept of time-surfaces, modified and adapted to run on the limited computational resources of a mobile platform. We also introduce a new method to dynamically remove backgrounds that makes full use of the high temporal resolution of event-based cameras. We assess the performance of the framework on several dynamic scenarios in uncontrolled lighting conditions, indoors and outdoors. We also introduce a new publicly available event-based dataset for gesture recognition, selected through a clinical process to allow human-machine interactions for the visually impaired and the elderly. Finally, we report comparisons with prior work on event-based gesture recognition, with comparable if not superior results when taking into account the limited computational and memory resources of the hardware used.
Tasks	Gesture Recognition
Published	2018-11-19
URL	https://arxiv.org/abs/1811.07802v2
PDF	https://arxiv.org/pdf/1811.07802v2.pdf
PWC	https://paperswithcode.com/paper/event-based-gesture-recognition-with-dynamic
Repo
Framework

Mixed Likelihood Gaussian Process Latent Variable Model


Title	Mixed Likelihood Gaussian Process Latent Variable Model
Authors	Samuel Murray, Hedvig Kjellström
Abstract	We present the Mixed Likelihood Gaussian process latent variable model (GP-LVM), capable of modeling data with attributes of different types. The standard formulation of GP-LVM assumes that each observation is drawn from a Gaussian distribution, which makes the model unsuited for data with e.g. categorical or nominal attributes. Our model, for which we use sampling-based variational inference, instead assumes a separate likelihood for each observed dimension. This formulation results in more meaningful latent representations, and gives better predictive performance for real-world data with dimensions of different types.
Tasks
Published	2018-11-19
URL	http://arxiv.org/abs/1811.07627v1
PDF	http://arxiv.org/pdf/1811.07627v1.pdf
PWC	https://paperswithcode.com/paper/mixed-likelihood-gaussian-process-latent
Repo
Framework

Tree-Based Optimization: A Meta-Algorithm for Metaheuristic Optimization


Title	Tree-Based Optimization: A Meta-Algorithm for Metaheuristic Optimization
Authors	Benyamin Ghojogh, Saeed Sharifian, Hoda Mohammadzade
Abstract	Designing search algorithms for finding global optima has recently been one of the most active research fields. These algorithms fall into two main categories: classic mathematical algorithms and metaheuristic algorithms. This article proposes a meta-algorithm, Tree-Based Optimization (TBO), which uses other heuristic optimizers as its sub-algorithms in order to improve search performance. The proposed algorithm is based on the mathematical concept of a tree and improves the performance and speed of search by iteratively removing parts of the search space that have low fitness, in order to shrink and purify the search space. Experimental results on several well-known benchmarks show that TBO outperforms the compared algorithms in finding the global solution. Experiments on high-dimensional search spaces show significantly better performance when using the TBO algorithm. The proposed algorithm improves search algorithms in both accuracy and speed, especially for high-dimensional search such as in VLSI CAD tools for Integrated Circuit (IC) design.
Tasks
Published	2018-09-25
URL	http://arxiv.org/abs/1809.09284v1
PDF	http://arxiv.org/pdf/1809.09284v1.pdf
PWC	https://paperswithcode.com/paper/tree-based-optimization-a-meta-algorithm-for
Repo
Framework

Large Field and High Resolution: Detecting Needle in Haystack


Title	Large Field and High Resolution: Detecting Needle in Haystack
Authors	Hadar Gorodissky, Daniel Harari, Shimon Ullman
Abstract	The growing use of convolutional neural networks (CNN) for a broad range of visual tasks, including tasks involving fine details, raises the problem of applying such networks to a large field of view, since the amount of computations increases significantly with the number of pixels. To deal effectively with this difficulty, we develop and compare methods of using CNNs for the task of small target localization in natural images, given a limited “budget” of samples to form an image. Inspired in part by human vision, we develop and compare variable sampling schemes, with peak resolution at the center and decreasing resolution with eccentricity, applied iteratively by re-centering the image at the previous predicted target location. The results indicate that variable resolution models significantly outperform constant resolution models. Surprisingly, variable resolution models and in particular multi-channel models, outperform the optimal, “budget-free” full-resolution model, using only 5% of the samples.
Tasks
Published	2018-04-10
URL	http://arxiv.org/abs/1804.03576v1
PDF	http://arxiv.org/pdf/1804.03576v1.pdf
PWC	https://paperswithcode.com/paper/large-field-and-high-resolution-detecting
Repo
Framework

Quantification and Analysis of Scientific Language Variation Across Research Fields


Title	Quantification and Analysis of Scientific Language Variation Across Research Fields
Authors	Pei Zhou, Muhao Chen, Kai-Wei Chang, Carlo Zaniolo
Abstract	Quantifying differences in terminologies from various academic domains has been a longstanding problem yet to be solved. We propose a computational approach for analyzing linguistic variation among scientific research fields by capturing the semantic change of terms based on a neural language model. The model is trained on a large collection of literature in five computer science research fields, for which we obtain field-specific vector representations for key terms, and global vector representations for other words. Several quantitative approaches are introduced to identify the terms whose semantics have drastically changed, or remain unchanged across different research fields. We also propose a metric to quantify the overall linguistic variation of research fields. After quantitative evaluation on human annotated data and qualitative comparison with other methods, we show that our model can improve cross-disciplinary data collaboration by identifying terms that potentially induce confusion during interdisciplinary studies.
Tasks	Language Modelling
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01250v1
PDF	http://arxiv.org/pdf/1812.01250v1.pdf
PWC	https://paperswithcode.com/paper/quantification-and-analysis-of-scientific
Repo
Framework

Physical Representation-based Predicate Optimization for a Visual Analytics Database


Title	Physical Representation-based Predicate Optimization for a Visual Analytics Database
Authors	Michael R. Anderson, Michael Cafarella, German Ros, Thomas F. Wenisch
Abstract	Querying the content of images, video, and other non-textual data sources requires expensive content extraction methods. Modern extraction techniques are based on deep convolutional neural networks (CNNs) and can classify objects within images with astounding accuracy. Unfortunately, these methods are slow: processing a single image can take about 10 milliseconds on modern GPU-based hardware. As massive video libraries become ubiquitous, running a content-based query over millions of video frames is prohibitive. One promising approach to reduce the runtime cost of queries of visual content is to use a hierarchical model, such as a cascade, where simple cases are handled by an inexpensive classifier. Prior work has sought to design cascades that optimize the computational cost of inference by, for example, using smaller CNNs. However, we observe that there are critical factors besides the inference time that dramatically impact the overall query time. Notably, by treating the physical representation of the input image as part of our query optimization—that is, by including image transforms, such as resolution scaling or color-depth reduction, within the cascade—we can optimize data handling costs and enable drastically more efficient classifier cascades. In this paper, we propose Tahoma, which generates and evaluates many potential classifier cascades that jointly optimize the CNN architecture and input data representation. Our experiments on a subset of ImageNet show that Tahoma’s input transformations speed up cascades by up to 35 times. We also find up to a 98x speedup over the ResNet50 classifier with no loss in accuracy, and a 280x speedup if some accuracy is sacrificed.
Tasks
Published	2018-06-11
URL	http://arxiv.org/abs/1806.04226v3
PDF	http://arxiv.org/pdf/1806.04226v3.pdf
PWC	https://paperswithcode.com/paper/physical-representation-based-predicate
Repo
Framework
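
The cascade idea in the abstract above, where cheap classifiers on transformed inputs handle easy cases and only uncertain cases reach an expensive model, can be sketched as follows. This is a minimal illustration and not Tahoma itself: the names `run_cascade`, `downsample`, `cheap_classifier` and `expensive_classifier` are hypothetical, the "image" is a flat list of brightness values, and the confidence rule is a toy assumption.

```python
# Toy classifier cascade: each stage applies an input transform (for example
# resolution scaling), runs a cheap classifier, and stops early when the
# classifier is confident enough. All names and rules here are illustrative.

def downsample(image, step=4):
    """Hypothetical input transform: keep every `step`-th pixel."""
    return image[::step]

def cheap_classifier(pixels):
    """Toy classifier: label by mean brightness, confidence by distance from 0.5."""
    mean = sum(pixels) / len(pixels)
    label = "bright" if mean > 0.5 else "dark"
    confidence = abs(mean - 0.5) * 2
    return label, confidence

def expensive_classifier(pixels):
    """Stand-in for a large model that always gives a final answer."""
    mean = sum(pixels) / len(pixels)
    return "bright" if mean > 0.5 else "dark"

def run_cascade(image, stages, fallback):
    """Return (label, stage_index); stage_index == len(stages) means fallback."""
    for index, (transform, classify, threshold) in enumerate(stages):
        label, confidence = classify(transform(image))
        if confidence >= threshold:
            return label, index
    return fallback(image), len(stages)

stages = [(downsample, cheap_classifier, 0.6)]
print(run_cascade([0.9] * 16, stages, expensive_classifier))   # early exit
print(run_cascade([0.55] * 16, stages, expensive_classifier))  # falls through
```

In this sketch the transform reduces how much data each cheap stage has to touch, which is the kind of data-handling cost the abstract says should be optimized together with the classifier itself.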

Exploring Gameplay With AI Agents


Title	Exploring Gameplay With AI Agents
Authors	Fernando de Mesentier Silva, Igor Borovikov, John Kolen, Navid Aghdaie, Kazi Zaman
Abstract	The process of playtesting a game is subjective, expensive and incomplete. In this paper, we present a playtesting approach that explores the game space with automated agents and collects data to answer questions posed by the designers. Rather than have agents interacting with an actual game client, this approach recreates the bare bone mechanics of the game as a separate system. Our agent is able to play in minutes what would take testers days of organic gameplay. The analysis of thousands of game simulations exposed imbalances in game actions, identified inconsequential rewards and evaluated the effectiveness of optional strategic choices. Our test case game, The Sims Mobile, was recently released and the findings shown here influenced design changes that resulted in improved player experience.
Tasks
Published	2018-11-16
URL	http://arxiv.org/abs/1811.06962v1
PDF	http://arxiv.org/pdf/1811.06962v1.pdf
PWC	https://paperswithcode.com/paper/exploring-gameplay-with-ai-agents
Repo
Framework

Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder


Title	Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder
Authors	Caio Corro, Ivan Titov
Abstract	Human annotation for syntactic parsing is expensive, and large resources are available only for a fraction of languages. A question we ask is whether one can leverage abundant unlabeled texts to improve syntactic parsers, beyond just using the texts to obtain more generalisable lexical features (i.e. beyond word embeddings). To this end, we propose a novel latent-variable generative model for semi-supervised syntactic dependency parsing. As exact inference is intractable, we introduce a differentiable relaxation to obtain approximate samples and compute gradients with respect to the parser parameters. Our method (Differentiable Perturb-and-Parse) relies on differentiable dynamic programming over stochastically perturbed edge scores. We demonstrate effectiveness of our approach with experiments on English, French and Swedish.
Tasks	Dependency Parsing, Word Embeddings
Published	2018-07-25
URL	http://arxiv.org/abs/1807.09875v2
PDF	http://arxiv.org/pdf/1807.09875v2.pdf
PWC	https://paperswithcode.com/paper/differentiable-perturb-and-parse-semi
Repo
Framework

Time Series Learning using Monotonic Logical Properties


Title	Time Series Learning using Monotonic Logical Properties
Authors	Marcell Vazquez-Chanlatte, Shromona Ghosh, Jyotirmoy V. Deshmukh, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia
Abstract	Cyber-physical systems of today are generating large volumes of time-series data. As manual inspection of such data is not tractable, the need for learning methods to help discover logical structure in the data has increased. We propose a logic-based framework that allows domain-specific knowledge to be embedded into formulas in a parametric logical specification over time-series data. The key idea is to then map a time series to a surface in the parameter space of the formula. Given this mapping, we identify the Hausdorff distance between boundaries as a natural distance metric between two time-series data under the lens of the parametric specification. This enables embedding non-trivial domain-specific knowledge into the distance metric and then using off-the-shelf machine learning tools to label the data. After labeling the data, we demonstrate how to extract a logical specification for each label. Finally, we showcase our technique on real world traffic data to learn classifiers/monitors for slow-downs and traffic jams.
Tasks	Time Series
Published	2018-02-24
URL	http://arxiv.org/abs/1802.08924v2
PDF	http://arxiv.org/pdf/1802.08924v2.pdf
PWC	https://paperswithcode.com/paper/time-series-learning-using-monotonic-logical
Repo
Framework
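
As a rough illustration of the distance used in the abstract above, the Hausdorff distance between two boundaries can be computed once each boundary is approximated by a finite set of sample points. That point-sampling step is an assumption of this sketch, as are the function names; the paper itself works with surfaces in the parameter space of a formula.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two hypothetical sampled boundaries in a 2-D parameter space.
boundary_1 = [(0.0, 0.0), (1.0, 0.0)]
boundary_2 = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
print(hausdorff(boundary_1, boundary_2))
```

A pairwise matrix of such distances could then be handed to an off-the-shelf clustering or nearest-neighbour tool, which is the role the abstract assigns to the metric.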

Local Descriptors Optimized for Average Precision


Title	Local Descriptors Optimized for Average Precision
Authors	Kun He, Yan Lu, Stan Sclaroff
Abstract	Extraction of local feature descriptors is a vital stage in the solution pipelines for numerous computer vision tasks. Learning-based approaches improve performance in certain tasks, but still cannot replace handcrafted features in general. In this paper, we improve the learning of local feature descriptors by optimizing the performance of descriptor matching, which is a common stage that follows descriptor extraction in local feature based pipelines, and can be formulated as nearest neighbor retrieval. Specifically, we directly optimize a ranking-based retrieval performance metric, Average Precision, using deep neural networks. This general-purpose solution can also be viewed as a listwise learning to rank approach, which is advantageous compared to recent local ranking approaches. On standard benchmarks, descriptors learned with our formulation achieve state-of-the-art results in patch verification, patch retrieval, and image matching.
Tasks	Learning-To-Rank
Published	2018-04-15
URL	http://arxiv.org/abs/1804.05312v2
PDF	http://arxiv.org/pdf/1804.05312v2.pdf
PWC	https://paperswithcode.com/paper/local-descriptors-optimized-for-average
Repo
Framework
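
For reference, the Average Precision metric named in the abstract above can be computed for a single query as shown below. This sketch only evaluates the plain, non-differentiable metric on a ranking induced by descriptor distances; it does not reproduce the paper's training procedure, and the function names are hypothetical.

```python
def average_precision(relevance):
    """AP for one query; `relevance` is a list of 0/1 flags in ranked order."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0

def ap_from_distances(distances, is_match):
    """Rank candidates by ascending descriptor distance, then compute AP."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return average_precision([1 if is_match[i] else 0 for i in order])

# Three candidate patches: the closest and the farthest are true matches.
print(ap_from_distances([0.1, 0.5, 0.3], [True, True, False]))
```

Because AP depends on the whole ranked list at once, it is a listwise objective, in contrast to pairwise or triplet losses that look at a few items at a time.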

Direct Learning to Rank and Rerank


Title	Direct Learning to Rank and Rerank
Authors	Cynthia Rudin, Yining Wang
Abstract	Learning-to-rank techniques have proven to be extremely useful for prioritization problems, where we rank items in order of their estimated probabilities, and dedicate our limited resources to the top-ranked items. This work exposes a serious problem with the state of learning-to-rank algorithms, which is that they are based on convex proxies that lead to poor approximations. We then discuss the possibility of “exact” reranking algorithms based on mathematical programming. We prove that a relaxed version of the “exact” problem has the same optimal solution, and provide an empirical analysis.
Tasks	Learning-To-Rank
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07400v1
PDF	http://arxiv.org/pdf/1802.07400v1.pdf
PWC	https://paperswithcode.com/paper/direct-learning-to-rank-and-rerank
Repo
Framework

Towards Head Motion Compensation Using Multi-Scale Convolutional Neural Networks


Title	Towards Head Motion Compensation Using Multi-Scale Convolutional Neural Networks
Authors	Omer Rajput, Nils Gessert, Martin Gromniak, Lars Matthäus, Alexander Schlaefer
Abstract	Head pose estimation and tracking is useful in a variety of medical applications. With the advent of RGBD cameras like Kinect, it has become feasible to do markerless tracking by estimating the head pose directly from the point clouds. One specific medical application is robot-assisted transcranial magnetic stimulation (TMS), where any patient motion is compensated with the help of a robot. For increased patient comfort, it is important to track the head without markers. In this regard, we address the head pose estimation problem using two different approaches. In the first approach, we build upon the more traditional approach of model-based head tracking, where a head model is morphed according to the particular head to be tracked and the morphed model is used to track the head in the point cloud streams. In the second approach, we propose a new multi-scale convolutional neural network architecture for more accurate pose regression. Additionally, we outline a systematic data set acquisition strategy using a head phantom mounted on the robot and ground-truth labels generated using a highly accurate tracking system.
Tasks	Head Pose Estimation, Motion Compensation, Pose Estimation
Published	2018-07-10
URL	http://arxiv.org/abs/1807.03651v1
PDF	http://arxiv.org/pdf/1807.03651v1.pdf
PWC	https://paperswithcode.com/paper/towards-head-motion-compensation-using-multi
Repo
Framework

Minimization of Gini impurity via connections with the k-means problem


Title	Minimization of Gini impurity via connections with the k-means problem
Authors	Eduardo Sany Laber, Lucas Murtinho
Abstract	The Gini impurity is one of the measures used to select attributes in Decision Tree/Random Forest construction. In this note we discuss connections between the problem of computing the partition with minimum Weighted Gini impurity and the $k$-means clustering problem. Based on these connections we show that computing the partition with minimum Weighted Gini is an NP-complete problem, and we also discuss how to obtain new algorithms with provable approximation guarantees for the Gini Minimization problem.
Tasks
Published	2018-09-28
URL	http://arxiv.org/abs/1810.00029v1
PDF	http://arxiv.org/pdf/1810.00029v1.pdf
PWC	https://paperswithcode.com/paper/minimization-of-gini-impurity-via-connections
Repo
Framework
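
To make the quantity in the abstract above concrete, here is a small sketch of Gini impurity and a size-weighted sum over the parts of a partition. The exact weighting the paper calls "Weighted Gini" is assumed here to be part size times impurity; that choice, and the function names, are assumptions of this sketch rather than the paper's definitions.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum_c p_c^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def weighted_gini(partition):
    """Assumed weighting: sum over parts of (part size) * (part impurity)."""
    return sum(len(part) * gini_impurity(part) for part in partition)

pure_split = [["a", "a"], ["b", "b"]]
mixed_split = [["a", "b"], ["a", "b"]]
print(weighted_gini(pure_split), weighted_gini(mixed_split))
```

A split that separates the classes perfectly scores zero, while one that leaves every part evenly mixed scores highest, so minimizing this quantity favours partitions with pure parts.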

Sequential Context Encoding for Duplicate Removal


Title	Sequential Context Encoding for Duplicate Removal
Authors	Lu Qi, Shu Liu, Jianping Shi, Jiaya Jia
Abstract	Duplicate removal is a critical step to obtain a reasonable number of predictions in prevalent proposal-based object detection frameworks. Albeit simple and effective, most previous algorithms utilize a greedy process without making sufficient use of the properties of the input data. In this work, we design a new two-stage framework to effectively select the appropriate proposal candidate for each object. The first stage suppresses most easy negative object proposals, while the second stage selects true positives in the reduced proposal set. These two stages share the same network structure, i.e., an encoder and a decoder formed as recurrent neural networks (RNN) with global attention and context gate. The encoder scans proposal candidates in a sequential manner to capture the global context information, which is then fed to the decoder to extract optimal proposals. In our extensive experiments, the proposed method outperforms other alternatives by a large margin.
Tasks	Object Detection
Published	2018-10-20
URL	http://arxiv.org/abs/1810.08770v1
PDF	http://arxiv.org/pdf/1810.08770v1.pdf
PWC	https://paperswithcode.com/paper/sequential-context-encoding-for-duplicate
Repo
Framework

Emotion Orientated Recommendation System for Hiroshima Tourist by Fuzzy Petri Net


Title	Emotion Orientated Recommendation System for Hiroshima Tourist by Fuzzy Petri Net
Authors	Takumi Ichimura, Issei Tachibana
Abstract	We developed an Android smartphone application for a tourist information system. In particular, the agent system recommends sightseeing spots and local hospitality corresponding to the user's current feelings. The system, acting as a concierge, can estimate the user's emotion and mood using Emotion Generating Calculations and a Mental State Transition Network. In this paper, the system decides the next candidate spots and foods by reasoning with a fuzzy Petri net, in order to make communication between human and smartphone smoother. The system was developed for Hiroshima Tourist Information, and we describe some of the hospitality offered by the concierge system.
Tasks
Published	2018-04-08
URL	http://arxiv.org/abs/1804.02657v1
PDF	http://arxiv.org/pdf/1804.02657v1.pdf
PWC	https://paperswithcode.com/paper/emotion-orientated-recommendation-system-for
Repo
Framework