July 27, 2019

Paper Group ANR 647

Interactive Video Object Segmentation in the Wild


Title	Interactive Video Object Segmentation in the Wild
Authors	Arnaud Benard, Michael Gygli
Abstract	In this paper we present our system for human-in-the-loop video object segmentation. The backbone of our system is a method for one-shot video object segmentation. While fast, this method requires an accurate pixel-level segmentation of one (or several) frames as input. As manually annotating such a segmentation is impractical, we propose a deep interactive image segmentation method that can accurately segment objects with only a handful of clicks. On the GrabCut dataset, our method obtains 90% IOU with just 3.8 clicks on average, setting the new state of the art. Furthermore, as our method iteratively refines an initial segmentation, it can effectively correct frames where the video object segmentation fails, thus allowing users to quickly obtain high quality results even on challenging sequences. Finally, we investigate usage patterns and give insights into how many steps users take to annotate frames, what kind of corrections they provide, etc., thus giving important insights for further improving interactive video segmentation.
Tasks	Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2017-12-31
URL	http://arxiv.org/abs/1801.00269v1
PDF	http://arxiv.org/pdf/1801.00269v1.pdf
PWC	https://paperswithcode.com/paper/interactive-video-object-segmentation-in-the
Repo
Framework
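
The abstract above reports segmentation quality as IOU (intersection over union). As a minimal sketch of how that metric is usually computed for binary masks, assuming masks are given as nested lists of 0/1 (the function name `iou` and the toy masks are illustrative, not taken from the paper):

```python
def iou(pred, target):
    """Intersection over union of two equally sized binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treat that case as IoU = 1 by convention.
    return inter / union if union else 1.0


# Toy 3x3 example: 3 pixels overlap, 5 pixels are covered by at least one mask.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
target = [[0, 1, 1],
          [0, 1, 0],
          [0, 1, 0]]
score = iou(pred, target)
print(score)  # 0.6
```

A threshold such as the 90% IOU mentioned in the abstract would then correspond to `score >= 0.9` for a given predicted mask and its ground truth.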

Online Multi-Armed Bandit


Title	Online Multi-Armed Bandit
Authors	Uma Roy, Ashwath Thirmulai, Joe Zurier
Abstract	We introduce a novel variant of the multi-armed bandit problem, in which bandits are streamed one at a time to the player, and at each point, the player can either choose to pull the current bandit or move on to the next bandit. Once a player has moved on from a bandit, they may never visit it again, which is a crucial difference between our problem and classic multi-armed bandit problems. In this online context, we study Bernoulli bandits (bandits with payout Ber($p_i$) for some underlying mean $p_i$) with underlying means drawn i.i.d. from various distributions, including the uniform distribution, and in general, all distributions that have a CDF satisfying certain differentiability conditions near zero. In all cases, we suggest several strategies and investigate their expected performance. Furthermore, we bound the performance of any optimal strategy and show that the strategies we have suggested are indeed optimal up to a constant factor. We also investigate the case where the distribution from which the underlying means are drawn is not known ahead of time. Again, we are able to suggest algorithms that are optimal up to a constant factor for this case, given certain mild conditions on the universe of distributions.
Tasks
Published	2017-07-17
URL	http://arxiv.org/abs/1707.04987v1
PDF	http://arxiv.org/pdf/1707.04987v1.pdf
PWC	https://paperswithcode.com/paper/online-multi-armed-bandit
Repo
Framework
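
The streamed setting described in the abstract above can be simulated in a few lines. The sketch below is only an illustration of the problem setup: the abandonment rule (leave an arm for good after `max_failures` misses) and the function name `play_streamed_bandits` are assumptions made here, not one of the strategies analysed in the paper. Arm means are drawn i.i.d. from Uniform(0, 1), one of the cases the abstract mentions.

```python
import random


def play_streamed_bandits(num_pulls, max_failures, rng):
    """Pull streamed Bernoulli arms; an abandoned arm is never revisited.

    Hypothetical rule: move on to the next arm once the current arm
    has produced `max_failures` zero payouts.
    """
    reward = 0
    arms_seen = 1
    p = rng.random()  # hidden mean of the current arm, drawn from Uniform(0, 1)
    failures = 0
    for _ in range(num_pulls):
        if rng.random() < p:
            reward += 1  # Bernoulli(p) payout of 1
        else:
            failures += 1
            if failures >= max_failures:
                # Move on: draw a fresh arm and forget the old one for good.
                p = rng.random()
                failures = 0
                arms_seen += 1
    return reward, arms_seen


rng = random.Random(0)
reward, arms_seen = play_streamed_bandits(10_000, 1, rng)
print(reward, arms_seen)
```

With `max_failures=1` every failed pull triggers a move, so the number of failures equals `arms_seen - 1`; comparing `reward / num_pulls` across different values of `max_failures` gives a rough feel for the explore/commit trade-off the paper studies formally.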

An End-to-end 3D Convolutional Neural Network for Action Detection and Segmentation in Videos


Title	An End-to-end 3D Convolutional Neural Network for Action Detection and Segmentation in Videos
Authors	Rui Hou, Chen Chen, Mubarak Shah
Abstract	In this paper, we propose an end-to-end 3D CNN for action detection and segmentation in videos. The proposed architecture is a unified deep network that is able to recognize and localize action based on 3D convolution features. A video is first divided into equal length clips and next for each clip a set of tube proposals are generated based on 3D CNN features. Finally, the tube proposals of different clips are linked together and spatio-temporal action detection is performed using these linked video proposals. This top-down action detection approach explicitly relies on a set of good tube proposals to perform well and training the bounding box regression usually requires a large number of annotated samples. To remedy this, we further extend the 3D CNN to an encoder-decoder structure and formulate the localization problem as action segmentation. The foreground regions (i.e. action regions) for each frame are segmented first then the segmented foreground maps are used to generate the bounding boxes. This bottom-up approach effectively avoids tube proposal generation by leveraging the pixel-wise annotations of segmentation. The segmentation framework also can be readily applied to a general problem of video object segmentation. Extensive experiments on several video datasets demonstrate the superior performance of our approach for action detection and video object segmentation compared to the state of the art.
Tasks	Action Detection, action segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2017-11-30
URL	http://arxiv.org/abs/1712.01111v1
PDF	http://arxiv.org/pdf/1712.01111v1.pdf
PWC	https://paperswithcode.com/paper/an-end-to-end-3d-convolutional-neural-network
Repo
Framework

Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions


Title	Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions
Authors	Eli David, H. Jaap van den Herik, Moshe Koppel, Nathan S. Netanyahu
Abstract	This paper demonstrates the use of genetic algorithms for evolving a grandmaster-level evaluation function for a chess program. This is achieved by combining supervised and unsupervised learning. In the supervised learning phase the organisms are evolved to mimic the behavior of human grandmasters, and in the unsupervised learning phase these evolved organisms are further improved upon by means of coevolution. While past attempts succeeded in creating a grandmaster-level program by mimicking the behavior of existing computer chess programs, this paper presents the first successful attempt at evolving a state-of-the-art evaluation function by learning only from databases of games played by humans. Our results demonstrate that the evolved program outperforms a two-time World Computer Chess Champion.
Tasks
Published	2017-11-18
URL	http://arxiv.org/abs/1711.06840v1
PDF	http://arxiv.org/pdf/1711.06840v1.pdf
PWC	https://paperswithcode.com/paper/simulating-human-grandmasters-evolution-and
Repo
Framework

Exact alignment recovery for correlated Erdős-Rényi graphs


Title	Exact alignment recovery for correlated Erdős-Rényi graphs
Authors	Daniel Cullina, Negar Kiyavash
Abstract	We consider the problem of perfectly recovering the vertex correspondence between two correlated Erdős-Rényi (ER) graphs on the same vertex set. The correspondence between the vertices can be obscured by randomly permuting the vertex labels of one of the graphs. We determine the information-theoretic threshold for exact recovery, i.e. the conditions under which the entire vertex correspondence can be correctly recovered given unbounded computational resources.
Tasks
Published	2017-11-18
URL	http://arxiv.org/abs/1711.06783v2
PDF	http://arxiv.org/pdf/1711.06783v2.pdf
PWC	https://paperswithcode.com/paper/exact-alignment-recovery-for-correlated-erdos
Repo
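Framework

Supporting Navigation of Outdoor Shopping Complexes for Visually-impaired Users through Multi-modal Data Fusion


Title	Supporting Navigation of Outdoor Shopping Complexes for Visually-impaired Users through Multi-modal Data Fusion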
Authors	Archana Paladugu, Parag S. Chandakkar, Peng Zhang, Baoxin Li
Abstract	Outdoor shopping complexes (OSC) are extremely difficult for people with visual impairment to navigate. Existing GPS devices are mostly designed for roadside navigation and seldom transition well into an OSC-like setting. We report our study on the challenges faced by a blind person in navigating OSC through developing a new mobile application named iExplore. We first report an exploratory study aiming at deriving specific design principles for building this system by learning the unique challenges of the problem. Then we present a methodology that can be used to derive the necessary information for the development of iExplore, followed by experimental validation of the technology by a group of visually impaired users in a local outdoor shopping center. User feedback and other experiments suggest that iExplore, while at its very initial phase, has the potential of filling a practical gap in existing assistive technologies for the visually impaired.
Tasks
Published	2017-04-05
URL	http://arxiv.org/abs/1704.01266v1
PDF	http://arxiv.org/pdf/1704.01266v1.pdf
PWC	https://paperswithcode.com/paper/supporting-navigation-of-outdoor-shopping
Repo
Framework

EmotioNet Challenge: Recognition of facial expressions of emotion in the wild


Title	EmotioNet Challenge: Recognition of facial expressions of emotion in the wild
Authors	C. Fabian Benitez-Quiroz, Ramprakash Srinivasan, Qianli Feng, Yan Wang, Aleix M. Martinez
Abstract	This paper details the methodology and results of the EmotioNet challenge. This challenge is the first to test the ability of computer vision algorithms in the automatic analysis of a large number of images of facial expressions of emotion in the wild. The challenge was divided into two tracks. The first track tested the ability of current computer vision algorithms in the automatic detection of action units (AUs). Specifically, we tested the detection of 11 AUs. The second track tested the algorithms’ ability to recognize emotion categories in images of facial expressions. Specifically, we tested the recognition of 16 basic and compound emotion categories. The results of the challenge suggest that current computer vision and machine learning algorithms are unable to reliably solve these two tasks. The limitations of current algorithms are more apparent when trying to recognize emotion. We also show that current algorithms are not affected by mild resolution changes, small occluders, gender or age, but that 3D pose is a major limiting factor on performance. We provide an in-depth discussion of the points that need special attention moving forward.
Tasks
Published	2017-03-03
URL	http://arxiv.org/abs/1703.01210v1
PDF	http://arxiv.org/pdf/1703.01210v1.pdf
PWC	https://paperswithcode.com/paper/emotionet-challenge-recognition-of-facial
Repo
Framework

A Deep Multi-View Learning Framework for City Event Extraction from Twitter Data Streams


Title	A Deep Multi-View Learning Framework for City Event Extraction from Twitter Data Streams
Authors	Nazli Farajidavar, Sefki Kolozali, Payam Barnaghi
Abstract	Cities have been a thriving place for citizens over the centuries due to their complex infrastructure. The emergence of the Cyber-Physical-Social Systems (CPSS) and context-aware technologies boosts a growing interest in analysing, extracting and eventually understanding city events which subsequently can be utilised to leverage the citizen observations of their cities. In this paper, we investigate the feasibility of using Twitter textual streams for extracting city events. We propose a hierarchical multi-view deep learning approach to contextualise citizen observations of various city systems and services. Our goal has been to build a flexible architecture that can learn representations useful for tasks, thus avoiding excessive task-specific feature engineering. We apply our approach on a real-world dataset consisting of event reports and tweets of over four months from the San Francisco Bay Area dataset and additional datasets collected from London. The results of our evaluations show that our proposed solution outperforms the existing models and can be used for extracting city-related events with an averaged accuracy of 81% over all classes. To further evaluate the impact of our Twitter event extraction model, we have used two sources of authorised reports through collecting road traffic disruptions data from Transport for London API, and parsing the Time Out London website for sociocultural events. The analysis showed that 49.5% of the Twitter traffic comments are reported approximately five hours prior to the authorities' official records. Moreover, we discovered that amongst the scheduled sociocultural event topics, tweets reporting transportation, cultural and social events are 31.75% more likely to influence the distribution of the Twitter comments than sport, weather and crime topics.
Tasks	Feature Engineering, MULTI-VIEW LEARNING
Published	2017-05-28
URL	http://arxiv.org/abs/1705.09975v1
PDF	http://arxiv.org/pdf/1705.09975v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-multi-view-learning-framework-for-city
Repo
Framework

Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning


Title	Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning
Authors	Lucas Lehnert, Stefanie Tellex, Michael L. Littman
Abstract	One question central to Reinforcement Learning is how to learn a feature representation that supports algorithm scaling and re-use of learned information from different tasks. Successor Features approach this problem by learning a feature representation that satisfies a temporal constraint. We present an implementation of an approach that decouples the feature representation from the reward function, making it suitable for transferring knowledge between domains. We then assess the advantages and limitations of using Successor Features for transfer.
Tasks
Published	2017-07-31
URL	http://arxiv.org/abs/1708.00102v1
PDF	http://arxiv.org/pdf/1708.00102v1.pdf
PWC	https://paperswithcode.com/paper/advantages-and-limitations-of-using-successor
Repo
Framework

Group Importance Sampling for Particle Filtering and MCMC


Title	Group Importance Sampling for Particle Filtering and MCMC
Authors	L. Martino, V. Elvira, G. Camps-Valls
Abstract	Bayesian methods and their implementations by means of sophisticated Monte Carlo techniques have become very popular in signal processing over the last years. Importance Sampling (IS) is a well-known Monte Carlo technique that approximates integrals involving a posterior distribution by means of weighted samples. In this work, we study the assignation of a single weighted sample which compresses the information contained in a population of weighted samples. Part of the theory that we present as Group Importance Sampling (GIS) has been employed implicitly in different works in the literature. The provided analysis yields several theoretical and practical consequences. For instance, we discuss the application of GIS into the Sequential Importance Resampling framework and show that Independent Multiple Try Metropolis schemes can be interpreted as a standard Metropolis-Hastings algorithm, following the GIS approach. We also introduce two novel Markov Chain Monte Carlo (MCMC) techniques based on GIS. The first one, named Group Metropolis Sampling method, produces a Markov chain of sets of weighted samples. All these sets are then employed for obtaining a unique global estimator. The second one is the Distributed Particle Metropolis-Hastings technique, where different parallel particle filters are jointly used to drive an MCMC algorithm. Different resampled trajectories are compared and then tested with a proper acceptance probability. The novel schemes are tested in different numerical experiments such as learning the hyperparameters of Gaussian Processes, two localization problems in a wireless sensor network (with synthetic and real data) and the tracking of vegetation parameters given satellite observations, where they are compared with several benchmark Monte Carlo techniques. Three illustrative Matlab demos are also provided.
Tasks	Gaussian Processes
Published	2017-04-10
URL	http://arxiv.org/abs/1704.02771v4
PDF	http://arxiv.org/pdf/1704.02771v4.pdf
PWC	https://paperswithcode.com/paper/group-importance-sampling-for-particle
Repo
Framework
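
For readers unfamiliar with the building block named in the abstract above, here is a minimal sketch of standard self-normalized importance sampling. It is the textbook technique, not the Group Importance Sampling scheme of the paper. The target (an unnormalized standard normal), the Gaussian proposal with scale 2, and all function names are assumptions chosen for illustration.

```python
import math
import random


def target_unnorm(x):
    """Unnormalized target density: exp(-x^2 / 2), i.e. a standard normal without its constant."""
    return math.exp(-0.5 * x * x)


def proposal_pdf(x, scale):
    """Normalized density of the zero-mean Gaussian proposal with standard deviation `scale`."""
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))


def importance_sample(n, scale, rng):
    """Draw n proposal samples and attach unnormalized importance weights target/proposal."""
    xs = [rng.gauss(0.0, scale) for _ in range(n)]
    ws = [target_unnorm(x) / proposal_pdf(x, scale) for x in xs]
    return xs, ws


def self_normalized_mean(f, xs, ws):
    """Estimate E[f(X)] under the target by a weighted average with normalized weights."""
    total = sum(ws)
    return sum(w * f(x) for x, w in zip(xs, ws)) / total


rng = random.Random(0)
xs, ws = importance_sample(20_000, 2.0, rng)
second_moment = self_normalized_mean(lambda x: x * x, xs, ws)  # true value is 1
z_hat = sum(ws) / len(ws)  # estimates the normalizing constant sqrt(2 * pi)
print(second_moment, z_hat)
```

The average of the unnormalized weights, `z_hat`, estimates the target's normalizing constant. That quantity is what makes a whole population of weighted samples comparable with another population, which is the kind of summary the abstract refers to when it speaks of compressing a population into a single weighted sample.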

Efficiency Analysis of ASP Encodings for Sequential Pattern Mining Tasks


Title	Efficiency Analysis of ASP Encodings for Sequential Pattern Mining Tasks
Authors	Thomas Guyet, Yves Moinard, René Quiniou, Torsten Schaub
Abstract	This article presents the use of Answer Set Programming (ASP) to mine sequential patterns. ASP is a high-level declarative logic programming paradigm for high-level encoding of combinatorial and optimization problems as well as knowledge representation and reasoning. Thus, ASP is a good candidate for implementing pattern mining with background knowledge, which has been a data mining issue for a long time. We propose encodings of the classical sequential pattern mining tasks within two representations of embeddings (fill-gaps vs skip-gaps) and for various kinds of patterns: frequent, constrained and condensed. We compare the computational performance of these encodings with each other to get a good insight into the efficiency of ASP encodings. The results show that the fill-gaps strategy is better on real problems due to lower memory consumption. Finally, compared to a constraint programming approach (CPSM), another declarative programming paradigm, our proposal showed comparable performance.
Tasks	Sequential Pattern Mining
Published	2017-11-14
URL	http://arxiv.org/abs/1711.05090v1
PDF	http://arxiv.org/pdf/1711.05090v1.pdf
PWC	https://paperswithcode.com/paper/efficiency-analysis-of-asp-encodings-for
Repo
Framework

Convolutional Neural Networks for Page Segmentation of Historical Document Images


Title	Convolutional Neural Networks for Page Segmentation of Historical Document Images
Authors	Kai Chen, Mathias Seuret
Abstract	This paper presents a Convolutional Neural Network (CNN) based page segmentation method for handwritten historical document images. We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as one of the predefined classes. Traditional methods in this area rely on carefully hand-crafted features or large amounts of prior knowledge. In contrast, we propose to learn features from raw image pixels using a CNN. While many researchers focus on developing deep CNN architectures to solve different problems, we train a simple CNN with only one convolution layer. We show that the simple architecture achieves competitive results against other deep architectures on different public datasets. Experiments also demonstrate the effectiveness and superiority of the proposed method compared to previous methods.
Tasks
Published	2017-04-05
URL	http://arxiv.org/abs/1704.01474v2
PDF	http://arxiv.org/pdf/1704.01474v2.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-for-page
Repo
Framework

Solar Power Forecasting Using Support Vector Regression


Title	Solar Power Forecasting Using Support Vector Regression
Authors	Mohamed Abuella, Badrul Chowdhury
Abstract	Generation and load balance is required in the economic scheduling of generating units in the smart grid. Variable energy generations, particularly from wind and solar energy resources, are witnessing a rapid boost, and, it is anticipated that with a certain level of their penetration, they can become noteworthy sources of uncertainty. As in the case of load demand, energy forecasting can also be used to mitigate some of the challenges that arise from the uncertainty in the resource. While wind energy forecasting research is considered mature, solar energy forecasting is witnessing a steadily growing attention from the research community. This paper presents a support vector regression model to produce solar power forecasts on a rolling basis for 24 hours ahead over an entire year, to mimic the practical business of energy forecasting. Twelve weather variables are considered from a high-quality benchmark dataset and new variables are extracted. The added value of the heat index and wind speed as additional variables to the model is studied across different seasons. The support vector regression model performance is compared with artificial neural networks and multiple linear regression models for energy forecasting.
Tasks
Published	2017-03-29
URL	http://arxiv.org/abs/1703.09851v1
PDF	http://arxiv.org/pdf/1703.09851v1.pdf
PWC	https://paperswithcode.com/paper/solar-power-forecasting-using-support-vector
Repo
Framework

Bootstrapping incremental dialogue systems from minimal data: the generalisation power of dialogue grammars


Title	Bootstrapping incremental dialogue systems from minimal data: the generalisation power of dialogue grammars
Authors	Arash Eshghi, Igor Shalyminov, Oliver Lemon
Abstract	We investigate an end-to-end method for automatically inducing task-based dialogue systems from small amounts of unannotated dialogue data. It combines an incremental semantic grammar - Dynamic Syntax and Type Theory with Records (DS-TTR) - with Reinforcement Learning (RL), where language generation and dialogue management are a joint decision problem. The systems thus produced are incremental: dialogues are processed word-by-word, shown previously to be essential in supporting natural, spontaneous dialogue. We hypothesised that the rich linguistic knowledge within the grammar should enable a combinatorially large number of dialogue variations to be processed, even when trained on very few dialogues. Our experiments show that our model can process 74% of the Facebook AI bAbI dataset even when trained on only 0.13% of the data (5 dialogues). It can in addition process 65% of bAbI+, a corpus we created by systematically adding incremental dialogue phenomena such as restarts and self-corrections to bAbI. We compare our model with a state-of-the-art retrieval model, MemN2N. We find that, in terms of semantic accuracy, MemN2N shows very poor robustness to the bAbI+ transformations even when trained on the full bAbI dataset.
Tasks	Dialogue Management, Text Generation
Published	2017-09-22
URL	http://arxiv.org/abs/1709.07858v1
PDF	http://arxiv.org/pdf/1709.07858v1.pdf
PWC	https://paperswithcode.com/paper/bootstrapping-incremental-dialogue-systems-1
Repo
Framework

Smart “Predict, then Optimize”


Title	Smart “Predict, then Optimize”
Authors	Adam N. Elmachtoub, Paul Grigas
Abstract	Many real-world analytics problems involve two significant challenges: prediction and optimization. Due to the typically complex nature of each challenge, the standard paradigm is predict-then-optimize. By and large, machine learning tools are intended to minimize prediction error and do not account for how the predictions will be used in the downstream optimization problem. In contrast, we propose a new and very general framework, called Smart “Predict, then Optimize” (SPO), which directly leverages the optimization problem structure, i.e., its objective and constraints, for designing better prediction models. A key component of our framework is the SPO loss function which measures the decision error induced by a prediction. Training a prediction model with respect to the SPO loss is computationally challenging, and thus we derive, using duality theory, a convex surrogate loss function which we call the SPO+ loss. Most importantly, we prove that the SPO+ loss is statistically consistent with respect to the SPO loss under mild conditions. Our SPO+ loss function can tractably handle any polyhedral, convex, or even mixed-integer optimization problem with a linear objective. Numerical experiments on shortest path and portfolio optimization problems show that the SPO framework can lead to significant improvement under the predict-then-optimize paradigm, in particular when the prediction model being trained is misspecified.
Tasks	Portfolio Optimization
Published	2017-10-22
URL	https://arxiv.org/abs/1710.08005v3
PDF	https://arxiv.org/pdf/1710.08005v3.pdf
PWC	https://paperswithcode.com/paper/smart-predict-then-optimize
Repo
Framework
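
The abstract above describes the SPO loss as the decision error induced by a prediction. A minimal sketch of that idea follows, assuming a cost-minimization problem over a small finite set of feasible decisions so that the optimizer can be brute-forced. The function names, the two-path toy instance, and the tie-breaking (first minimizer wins) are simplifying assumptions made here, not details from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def best_decision(cost, feasible):
    """Brute-force oracle: the feasible decision with the smallest linear cost."""
    return min(feasible, key=lambda w: dot(cost, w))


def spo_loss(pred_cost, true_cost, feasible):
    """Excess true cost of acting on the predicted cost vector instead of the true one."""
    w_pred = best_decision(pred_cost, feasible)
    z_star = dot(true_cost, best_decision(true_cost, feasible))
    return dot(true_cost, w_pred) - z_star


# Toy instance: choose one of two paths; each decision is an indicator vector.
feasible = [(1, 0), (0, 1)]
true_cost = (1, 2)

# Large prediction error, but the ranking of paths is right: no decision error.
loss_a = spo_loss((10, 11), true_cost, feasible)

# Small prediction error, but the ranking flips: the worse path is chosen.
loss_b = spo_loss((1.6, 1.5), true_cost, feasible)

print(loss_a, loss_b)  # 0 1
```

The two predictions illustrate the abstract's point that minimizing prediction error alone does not account for the downstream optimization: the first prediction is far from the true costs yet yields the optimal decision, while the second is close yet yields a suboptimal one.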