July 29, 2019

Paper Group ANR 47

Towards seamless multi-view scene analysis from satellite to street-level


Title	Towards seamless multi-view scene analysis from satellite to street-level
Authors	Sébastien Lefèvre, Devis Tuia, Jan Dirk Wegner, Timothée Produit, Ahmed Samy Nassar
Abstract	In this paper, we discuss and review how combined multi-view imagery from satellite to street-level can benefit scene analysis. Numerous works exist that merge information from remote sensing and images acquired from the ground for tasks like land cover mapping, object detection, or scene understanding. What makes the combination of overhead and street-level images challenging is the strongly varying viewpoint, different scale, illumination, sensor modality and time of acquisition. Direct (dense) matching of images on a per-pixel basis is thus often impossible, and one has to resort to alternative strategies that will be discussed in this paper. We review recent works that attempt to combine images taken from the ground and overhead views for purposes like scene registration, reconstruction, or classification. Three methods that represent the wide range of potential methods and applications (change detection, image orientation, and tree cataloging) are described in detail. We show that cross-fertilization between remote sensing, computer vision and machine learning is very valuable to make the best of geographic data available from Earth Observation sensors and ground imagery. Despite its challenges, we believe that integrating these complementary data sources will lead to major breakthroughs in Big GeoData.
Tasks	Object Detection, Scene Understanding
Published	2017-05-23
URL	http://arxiv.org/abs/1705.08101v1
PDF	http://arxiv.org/pdf/1705.08101v1.pdf
PWC	https://paperswithcode.com/paper/towards-seamless-multi-view-scene-analysis
Repo
Framework

End-to-End Neural Segmental Models for Speech Recognition


Title	End-to-End Neural Segmental Models for Speech Recognition
Authors	Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals
Abstract	Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time. Neural segmental models are segmental models that use neural network-based weight functions. Neural segmental models have achieved competitive results for speech recognition, and their end-to-end training has been explored in several studies. In this work, we review neural segmental models, which can be viewed as consisting of a neural network-based acoustic encoder and a finite-state transducer decoder. We study end-to-end segmental models with different weight functions, including ones based on frame-level neural classifiers and on segmental recurrent neural networks. We study how reducing the search space size impacts performance under different weight functions. We also compare several loss functions for end-to-end training. Finally, we explore training approaches, including multi-stage vs. end-to-end training and multitask training that combines segmental and frame-level losses.
Tasks	Speech Recognition
Published	2017-08-01
URL	http://arxiv.org/abs/1708.00531v2
PDF	http://arxiv.org/pdf/1708.00531v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-neural-segmental-models-for-speech
Repo
Framework
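
The abstract above describes segmental models as scoring a path by the weights of whole segments rather than single frames. A minimal sketch of that idea, assuming a hypothetical hand-written `toy_score` in place of the neural weight functions and a plain dynamic program in place of the finite-state transducer decoder the paper discusses:

```python
def best_segmentation(n_frames, labels, score, max_len):
    """Find the highest-weight segmentation of frames [0, n_frames).

    score(start, end, label) returns the weight of labeling the whole
    segment [start, end) with `label`. It is a stand-in (assumption) for
    a learned segment weight function.
    """
    # best[t] is the best total weight over all segmentations of frames [0, t).
    best = [float("-inf")] * (n_frames + 1)
    back = [None] * (n_frames + 1)
    best[0] = 0.0
    for end in range(1, n_frames + 1):
        # Limiting the segment length to max_len shrinks the search space.
        for start in range(max(0, end - max_len), end):
            for label in labels:
                candidate = best[start] + score(start, end, label)
                if candidate > best[end]:
                    best[end] = candidate
                    back[end] = (start, label)
    # Follow back-pointers from the last frame to recover the segments.
    segments, t = [], n_frames
    while t > 0:
        start, label = back[t]
        segments.append((start, t, label))
        t = start
    return best[n_frames], segments[::-1]


# Hypothetical toy input: each frame already carries its "true" label.
frames = ["a", "a", "b", "b", "b"]


def toy_score(start, end, label):
    # Reward frames that match the label; charge 0.5 per segment.
    return sum(f == label for f in frames[start:end]) - 0.5


total, segments = best_segmentation(len(frames), ["a", "b"], toy_score, max_len=5)
print(total, segments)  # 4.0 [(0, 2, 'a'), (2, 5, 'b')]
```

Lowering `max_len` is one simple way to reduce the search space, at the risk of excluding long segments.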


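Cross-Lingual Dependency Parsing for Closely Related Languages - Helsinki’s Submission to VarDial 2017


Title	Cross-Lingual Dependency Parsing for Closely Related Languages - Helsinki’s Submission to VarDial 2017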
Authors	Jörg Tiedemann
Abstract	This paper describes the submission from the University of Helsinki to the shared task on cross-lingual dependency parsing at VarDial 2017. We present work on annotation projection and treebank translation that gave good results for all three target languages in the test set. In particular, Slovak seems to work well with information coming from the Czech treebank, which is in line with related work. The attachment scores for cross-lingual models even surpass the fully supervised models trained on the target language treebank. Croatian is the most difficult language in the test set and the improvements over the baseline are rather modest. Norwegian works best with information coming from Swedish whereas Danish contributes surprisingly little.
Tasks	Dependency Parsing
Published	2017-08-18
URL	http://arxiv.org/abs/1708.05719v1
PDF	http://arxiv.org/pdf/1708.05719v1.pdf
PWC	https://paperswithcode.com/paper/cross-lingual-dependency-parsing-for-closely
Repo
Framework

Segmented and Directional Impact Detection for Parked Vehicles using Mobile Devices


Title	Segmented and Directional Impact Detection for Parked Vehicles using Mobile Devices
Authors	Andre Ebert, Sebastian Feld, Florian Dorfmeister
Abstract	Mutual usage of vehicles as well as car sharing has become increasingly attractive in recent years. Especially in urban environments with limited parking possibilities and a higher risk of traffic jams, car rentals and sharing services may save time and money. But a rented vehicle could already be damaged (e.g., scratches or bumps inflicted by a previous user) without the damage being perceived by the service provider. In order to address such problems, we present an automated, motion-based system for impact detection that uses a common smartphone as a sensor platform. The system is capable of detecting the impact segment and the point of time of an impact event on a vehicle’s surface, as well as its direction of origin. With this additional specific knowledge, it may be possible to reconstruct the circumstances of an impact event, e.g., to prove possible innocence of a service’s customer.
Tasks
Published	2017-03-16
URL	http://arxiv.org/abs/1703.05680v1
PDF	http://arxiv.org/pdf/1703.05680v1.pdf
PWC	https://paperswithcode.com/paper/segmented-and-directional-impact-detection
Repo
Framework

A Flexible Framework for Hypothesis Testing in High-dimensions


Title	A Flexible Framework for Hypothesis Testing in High-dimensions
Authors	Adel Javanmard, Jason D. Lee
Abstract	Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples ($p > n$). In order to make informative inference, we assume that the model is approximately sparse, that is, the effect of covariates on the response can be well approximated by conditioning on a relatively small number of covariates whose identities are unknown. We develop a framework for testing very general hypotheses regarding the model parameters. Our framework encompasses testing whether the parameter lies in a convex cone, testing the signal strength, and testing arbitrary functionals of the parameter. We show that the proposed procedure controls the type I error, and also analyze the power of the procedure. Our numerical experiments confirm our theoretical findings and demonstrate that we control the false positive rate (type I error) near the nominal level, and have high power. By duality between hypothesis testing and confidence intervals, the proposed framework can be used to obtain valid confidence intervals for various functionals of the model parameters. For linear functionals, the length of confidence intervals is shown to be minimax rate optimal.
Tasks
Published	2017-04-26
URL	https://arxiv.org/abs/1704.07971v4
PDF	https://arxiv.org/pdf/1704.07971v4.pdf
PWC	https://paperswithcode.com/paper/a-flexible-framework-for-hypothesis-testing
Repo
Framework

Finding Risk-Averse Shortest Path with Time-dependent Stochastic Costs


Title	Finding Risk-Averse Shortest Path with Time-dependent Stochastic Costs
Authors	Dajian Li, Paul Weng, Orkun Karabasoglu
Abstract	In this paper, we tackle the problem of risk-averse route planning in a transportation network with time-dependent and stochastic costs. To solve this problem, we propose an adaptation of the A* algorithm that accommodates any risk measure or decision criterion that is monotonic with first-order stochastic dominance. We also present a case study of our algorithm on the Manhattan, NYC, transportation network.
Tasks
Published	2017-01-03
URL	http://arxiv.org/abs/1701.00642v1
PDF	http://arxiv.org/pdf/1701.00642v1.pdf
PWC	https://paperswithcode.com/paper/finding-risk-averse-shortest-path-with-time
Repo
Framework

Deep Video Generation, Prediction and Completion of Human Action Sequences


Title	Deep Video Generation, Prediction and Completion of Human Action Sequences
Authors	Haoye Cai, Chunyan Bai, Yu-Wing Tai, Chi-Keung Tang
Abstract	Current deep learning results on video generation are limited, while there are only a few first results on video prediction and no relevant significant results on video completion. This is due to the severe ill-posedness inherent in these three problems. In this paper, we focus on human action videos, and propose a general, two-stage deep framework to generate human action videos with no constraints or an arbitrary number of constraints, which uniformly addresses the three problems: video generation given no input frames, video prediction given the first few frames, and video completion given the first and last frames. To make the problem tractable, in the first stage we train a deep generative model that generates a human pose sequence from random noise. In the second stage, a skeleton-to-image network is trained, which is used to generate a human action video given the complete human pose sequence generated in the first stage. By introducing the two-stage strategy, we sidestep the original ill-posed problems while producing for the first time high-quality video generation/prediction/completion results of much longer duration. We present quantitative and qualitative evaluation to show that our two-stage approach outperforms state-of-the-art methods in video generation, prediction and video completion. Our video result demonstration can be viewed at https://iamacewhite.github.io/supp/index.html
Tasks	Video Generation, Video Prediction
Published	2017-11-23
URL	http://arxiv.org/abs/1711.08682v3
PDF	http://arxiv.org/pdf/1711.08682v3.pdf
PWC	https://paperswithcode.com/paper/deep-video-generation-prediction-and
Repo
Framework

Multi-Generator Generative Adversarial Nets


Title	Multi-Generator Generative Adversarial Nets
Authors	Quan Hoang, Tu Dinh Nguyen, Trung Le, Dinh Phung
Abstract	We propose a new approach to train Generative Adversarial Nets (GANs) with a mixture of generators to overcome the mode collapsing problem. The main intuition is to employ multiple generators, instead of using a single one as in the original GAN. The idea is simple, yet proven to be extremely effective at covering diverse data modes, easily overcoming the mode collapse and delivering state-of-the-art results. A minimax formulation is established among a classifier, a discriminator, and a set of generators, in a similar spirit to GAN. Generators create samples that are intended to come from the same distribution as the training data, whilst the discriminator determines whether samples are true data or generated by generators, and the classifier specifies which generator a sample comes from. The distinguishing feature is that internal samples are created from multiple generators, and then one of them will be randomly selected as final output, similar to the mechanism of a probabilistic mixture model. We term our method Mixture GAN (MGAN). We develop theoretical analysis to prove that, at the equilibrium, the Jensen-Shannon divergence (JSD) between the mixture of generators’ distributions and the empirical data distribution is minimal, whilst the JSD among generators’ distributions is maximal, hence effectively avoiding the mode collapse. By utilizing parameter sharing, our proposed model adds minimal computational cost to the standard GAN, and thus can also efficiently scale to large-scale datasets. We conduct extensive experiments on synthetic 2D data and natural image databases (CIFAR-10, STL-10 and ImageNet) to demonstrate the superior performance of our MGAN in achieving state-of-the-art Inception scores over latest baselines, generating diverse and appealing recognizable objects at different resolutions, and specializing in capturing different types of objects by generators.
Tasks
Published	2017-08-08
URL	http://arxiv.org/abs/1708.02556v4
PDF	http://arxiv.org/pdf/1708.02556v4.pdf
PWC	https://paperswithcode.com/paper/multi-generator-generative-adversarial-nets
Repo
Framework


An Ensemble Model with Ranking for Social Dialogue


Title	An Ensemble Model with Ranking for Social Dialogue
Authors	Ioannis Papaioannou, Amanda Cercas Curry, Jose L. Part, Igor Shalyminov, Xinnuo Xu, Yanchao Yu, Ondřej Dušek, Verena Rieser, Oliver Lemon
Abstract	Open-domain social dialogue is one of the long-standing goals of Artificial Intelligence. This year, the Amazon Alexa Prize challenge was announced for the first time, where real customers get to rate systems developed by leading universities worldwide. The aim of the challenge is to converse “coherently and engagingly with humans on popular topics for 20 minutes”. We describe our Alexa Prize system (called ‘Alana’) consisting of an ensemble of bots, combining rule-based and machine learning systems, and using a contextual ranking mechanism to choose a system response. The ranker was trained on real user feedback received during the competition, where we address the problem of how to train on the noisy and sparse feedback obtained during the competition.
Tasks
Published	2017-12-20
URL	http://arxiv.org/abs/1712.07558v1
PDF	http://arxiv.org/pdf/1712.07558v1.pdf
PWC	https://paperswithcode.com/paper/an-ensemble-model-with-ranking-for-social
Repo
Framework

Bootstrapping a Lexicon for Emotional Arousal in Software Engineering


Title	Bootstrapping a Lexicon for Emotional Arousal in Software Engineering
Authors	Mika V. Mäntylä, Nicole Novielli, Filippo Lanubile, Maëlick Claes, Miikka Kuutila
Abstract	Emotional arousal increases activation and performance but may also lead to burnout in software development. We present the first version of a Software Engineering Arousal lexicon (SEA) that is specifically designed to address the problem of emotional arousal in the software developer ecosystem. SEA is built using a bootstrapping approach that combines a word embedding model trained on issue-tracking data with manual scoring of items in the lexicon. We show that our lexicon is able to differentiate between issue priorities, which are a source of emotional activation and thus act as a proxy for arousal. The best performance is obtained by combining SEA (428 words) with a previously created general purpose lexicon by Warriner et al. (13,915 words), and it achieves Cohen’s d effect sizes up to 0.5.
Tasks
Published	2017-03-27
URL	http://arxiv.org/abs/1703.09046v1
PDF	http://arxiv.org/pdf/1703.09046v1.pdf
PWC	https://paperswithcode.com/paper/bootstrapping-a-lexicon-for-emotional-arousal
Repo
Framework
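
The abstract above reports results as Cohen's d effect sizes. For readers unfamiliar with the measure, here is a minimal sketch of the common pooled-standard-deviation form, d = (mean_a - mean_b) / s_pooled. The scores are made up for illustration, and the abstract does not say which variant of d the paper computes.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical arousal scores for two issue-priority groups (not paper data).
high_priority = [5.0, 6.0, 7.0]
low_priority = [4.0, 5.0, 6.0]
d = cohens_d(high_priority, low_priority)
print(d)  # 1.0
```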

The Price of Diversity in Assignment Problems


Title	The Price of Diversity in Assignment Problems
Authors	Nawal Benabbou, Mithun Chakraborty, Vinh Ho Xuan, Jakub Sliwinski, Yair Zick
Abstract	We introduce and analyze an extension to the matching problem on a weighted bipartite graph: Assignment with Type Constraints. The two parts of the graph are partitioned into subsets called types and blocks; we seek a matching with the largest sum of weights under the constraint that there is a pre-specified cap on the number of vertices matched in every type-block pair. Our primary motivation stems from the public housing program of Singapore, accounting for over 70% of its residential real estate. To promote ethnic diversity within its housing projects, Singapore imposes ethnicity quotas: each new housing development comprises blocks of flats and each ethnicity-based group in the population must not own more than a certain percentage of flats in a block. Other domains using similar hard capacity constraints include matching prospective students to schools or medical residents to hospitals. Limiting agents’ choices for ensuring diversity in this manner naturally entails some welfare loss. One of our goals is to study the trade-off between diversity and social welfare in such settings. We first show that, while the classic assignment program is polynomial-time computable, adding diversity constraints makes it computationally intractable; however, we identify a $\tfrac{1}{2}$-approximation algorithm, as well as reasonable assumptions on the weights that permit poly-time algorithms. Next, we provide two upper bounds on the price of diversity – a measure of the loss in welfare incurred by imposing diversity constraints – as functions of natural problem parameters. We conclude the paper with simulations based on publicly available data from two diversity-constrained allocation problems – Singapore Public Housing and Chicago School Choice – which shed light on how the constrained maximization as well as lottery-based variants perform in practice.
Tasks
Published	2017-11-28
URL	http://arxiv.org/abs/1711.10241v7
PDF	http://arxiv.org/pdf/1711.10241v7.pdf
PWC	https://paperswithcode.com/paper/the-price-of-diversity-in-assignment-problems
Repo
Framework
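
The constrained matching problem in the abstract above can be illustrated on a tiny instance. The sketch below is an exhaustive search, not the paper's approximation algorithm. The names `weights`, `agent_type`, `flat_block` and `cap` are hypothetical, and a missing cap is assumed to mean "no limit". Because the abstract notes the constrained problem is computationally intractable in general, brute force is only sensible at toy sizes.

```python
def best_assignment(weights, agent_type, flat_block, cap):
    """Exhaustively find a maximum-weight matching of agents to flats.

    weights[i][j] : utility of agent i for flat j
    agent_type[i] : type (e.g. group) of agent i
    flat_block[j] : block that flat j belongs to
    cap[(t, b)]   : max number of type-t agents matched within block b
                    (a missing key is treated as unlimited: an assumption)
    """
    n_agents = len(weights)
    best = {"value": float("-inf"), "match": None}

    def search(i, used_flats, counts, value, match):
        if i == n_agents:
            if value > best["value"]:
                best["value"], best["match"] = value, list(match)
            return
        # Option 1: leave agent i unmatched.
        match.append(None)
        search(i + 1, used_flats, counts, value, match)
        match.pop()
        # Option 2: try every free flat whose type-block cap is not yet full.
        for j, block in enumerate(flat_block):
            key = (agent_type[i], block)
            if j in used_flats or counts.get(key, 0) >= cap.get(key, float("inf")):
                continue
            used_flats.add(j)
            counts[key] = counts.get(key, 0) + 1
            match.append(j)
            search(i + 1, used_flats, counts, value + weights[i][j], match)
            match.pop()
            counts[key] -= 1
            used_flats.remove(j)

    search(0, set(), {}, 0.0, [])
    return best["value"], best["match"]


# Toy instance: two agents of type "A"; flats 0 and 1 are in block 0, flat 2 in block 1.
weights = [[5, 4, 1], [4, 5, 1]]
agent_type = ["A", "A"]
flat_block = [0, 0, 1]

unconstrained_value, _ = best_assignment(weights, agent_type, flat_block, cap={})
constrained_value, _ = best_assignment(weights, agent_type, flat_block, cap={("A", 0): 1})
print(unconstrained_value, constrained_value)  # 10.0 6.0
```

In this toy instance the cap lowers the best achievable welfare from 10 to 6, which is the kind of loss the price of diversity is meant to capture.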

The Singularity May Be Near


Title	The Singularity May Be Near
Authors	Roman V. Yampolskiy
Abstract	Toby Walsh in ‘The Singularity May Never Be Near’ gives six arguments to support his point of view that technological singularity may happen but that it is unlikely. In this paper, we provide analysis of each one of his arguments and arrive at similar conclusions, but with more weight given to the ‘likely to happen’ probability.
Tasks
Published	2017-05-31
URL	http://arxiv.org/abs/1706.01303v1
PDF	http://arxiv.org/pdf/1706.01303v1.pdf
PWC	https://paperswithcode.com/paper/the-singularity-may-be-near
Repo
Framework

Brain Inspired Cognitive Model with Attention for Self-Driving Cars


Title	Brain Inspired Cognitive Model with Attention for Self-Driving Cars
Authors	Shitao Chen, Songyi Zhang, Jinghao Shang, Badong Chen, Nanning Zheng
Abstract	The perception-driven approach and the end-to-end system are two major vision-based frameworks for self-driving cars. However, it is difficult to introduce into these two methods attention and historical information about the autonomous driving process, which are essential factors for achieving human-like driving. In this paper, we propose a novel model for self-driving cars named brain-inspired cognitive model with attention (CMA). This model consists of three parts: a convolutional neural network for simulating the human visual cortex, a cognitive map built to describe relationships between objects in a complex traffic scene, and a recurrent neural network that combines with the real-time updated cognitive map to implement an attention mechanism and long short-term memory. The benefit of our model is that it can accurately solve three tasks simultaneously: 1) detection of the free space and boundaries of the current and adjacent lanes, 2) estimation of obstacle distance and vehicle attitude, and 3) learning of driving behavior and decision making from a human driver. More significantly, the proposed model could accept external navigating instructions during an end-to-end driving process. For evaluation, we build a large-scale road-vehicle dataset which contains more than forty thousand labeled road images captured by three cameras on our self-driving car. Moreover, human driving activities and vehicle states are recorded at the same time.
Tasks	Autonomous Driving, Decision Making, Self-Driving Cars
Published	2017-02-18
URL	http://arxiv.org/abs/1702.05596v1
PDF	http://arxiv.org/pdf/1702.05596v1.pdf
PWC	https://paperswithcode.com/paper/brain-inspired-cognitive-model-with-attention
Repo
Framework

Tensor Train Neighborhood Preserving Embedding


Title	Tensor Train Neighborhood Preserving Embedding
Authors	Wenqi Wang, Vaneet Aggarwal, Shuchin Aeron
Abstract	In this paper, we propose a Tensor Train Neighborhood Preserving Embedding (TTNPE) to embed multi-dimensional tensor data into a low-dimensional tensor subspace. Novel approaches to solve the optimization problem in TTNPE are proposed. For this embedding, we evaluate a novel trade-off gain among classification, computation, and dimensionality reduction (storage) for supervised learning. It is shown that, compared to state-of-the-art tensor embedding methods, TTNPE achieves a superior trade-off in classification, computation, and dimensionality reduction on the MNIST handwritten digits and Weizmann face datasets.
Tasks	Dimensionality Reduction
Published	2017-12-03
URL	http://arxiv.org/abs/1712.00828v2
PDF	http://arxiv.org/pdf/1712.00828v2.pdf
PWC	https://paperswithcode.com/paper/tensor-train-neighborhood-preserving
Repo
Framework

Thread Reconstruction in Conversational Data using Neural Coherence Models


Title	Thread Reconstruction in Conversational Data using Neural Coherence Models
Authors	Dat Tien Nguyen, Shafiq Joty, Basma El Amel Boussaha, Maarten de Rijke
Abstract	Discussion forums are an important source of information. They are often used to answer specific questions a user might have and to discover more about a topic of interest. Discussions in these forums may evolve in intricate ways, making it difficult for users to follow the flow of ideas. We propose a novel approach for automatically identifying the underlying thread structure of a forum discussion. Our approach is based on a neural model that computes coherence scores of possible reconstructions and then selects the highest scoring, i.e., the most coherent one. Preliminary experiments demonstrate promising results outperforming a number of strong baseline methods.
Tasks
Published	2017-07-24
URL	http://arxiv.org/abs/1707.07660v2
PDF	http://arxiv.org/pdf/1707.07660v2.pdf
PWC	https://paperswithcode.com/paper/thread-reconstruction-in-conversational-data
Repo
Framework