Paper Group ANR 47
Towards seamless multi-view scene analysis from satellite to street-level
Title | Towards seamless multi-view scene analysis from satellite to street-level |
Authors | Sébastien Lefèvre, Devis Tuia, Jan Dirk Wegner, Timothée Produit, Ahmed Samy Nassar |
Abstract | In this paper, we discuss and review how combined multi-view imagery from satellite to street-level can benefit scene analysis. Numerous works exist that merge information from remote sensing and images acquired from the ground for tasks like land cover mapping, object detection, or scene understanding. What makes the combination of overhead and street-level images challenging is the strongly varying viewpoint, scale, illumination, sensor modality, and time of acquisition. Direct (dense) matching of images on a per-pixel basis is thus often impossible, and one has to resort to alternative strategies that will be discussed in this paper. We review recent works that attempt to combine images taken from the ground and overhead views for purposes like scene registration, reconstruction, or classification. Three methods that represent the wide range of potential methods and applications (change detection, image orientation, and tree cataloging) are described in detail. We show that cross-fertilization between remote sensing, computer vision and machine learning is very valuable to make the best of geographic data available from Earth Observation sensors and ground imagery. Despite its challenges, we believe that integrating these complementary data sources will lead to major breakthroughs in Big GeoData. |
Tasks | Object Detection, Scene Understanding |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08101v1 |
http://arxiv.org/pdf/1705.08101v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-seamless-multi-view-scene-analysis |
Repo | |
Framework | |
End-to-End Neural Segmental Models for Speech Recognition
Title | End-to-End Neural Segmental Models for Speech Recognition |
Authors | Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals |
Abstract | Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time. Neural segmental models are segmental models that use neural network-based weight functions. Neural segmental models have achieved competitive results for speech recognition, and their end-to-end training has been explored in several studies. In this work, we review neural segmental models, which can be viewed as consisting of a neural network-based acoustic encoder and a finite-state transducer decoder. We study end-to-end segmental models with different weight functions, including ones based on frame-level neural classifiers and on segmental recurrent neural networks. We study how reducing the search space size impacts performance under different weight functions. We also compare several loss functions for end-to-end training. Finally, we explore training approaches, including multi-stage vs. end-to-end training and multitask training that combines segmental and frame-level losses. |
Tasks | Speech Recognition |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00531v2 |
http://arxiv.org/pdf/1708.00531v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-neural-segmental-models-for-speech |
Repo | |
Framework | |
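The abstract above describes path weights built from whole-segment scores rather than per-frame scores. A minimal sketch of the resulting search is a dynamic program over segment boundaries. The names `best_segmentation` and `make_toy_score` are hypothetical, and the toy scoring rule stands in for the neural weight functions studied in the paper; this is not the authors' decoder.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[int, int, str]  # (start frame, end frame exclusive, label)


def best_segmentation(
    num_frames: int,
    labels: Sequence[str],
    max_len: int,
    score: Callable[[int, int, str], float],
) -> Tuple[float, List[Segment]]:
    """Return the highest-scoring labeled segmentation of frames 0..num_frames."""
    if max_len < 1 or not labels:
        raise ValueError("need max_len >= 1 and at least one label")
    # best[t] is the best total score of any segmentation of the first t frames.
    best = [float("-inf")] * (num_frames + 1)
    back: List[Optional[Tuple[int, str]]] = [None] * (num_frames + 1)
    best[0] = 0.0
    for end in range(1, num_frames + 1):
        # Limiting segment length to max_len shrinks the search space.
        for start in range(max(0, end - max_len), end):
            for label in labels:
                candidate = best[start] + score(start, end, label)
                if candidate > best[end]:
                    best[end] = candidate
                    back[end] = (start, label)
    # Follow back-pointers from the last frame to recover the segments.
    segments: List[Segment] = []
    end = num_frames
    while end > 0:
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    segments.reverse()
    return best[num_frames], segments


def make_toy_score(frames: Sequence[str]) -> Callable[[int, int, str], float]:
    """Hypothetical segment score: matches minus mismatches, minus a per-segment penalty."""
    def score(start: int, end: int, label: str) -> float:
        matches = sum(1 for frame in frames[start:end] if frame == label)
        mismatches = (end - start) - matches
        return matches - mismatches - 0.5
    return score


frames = ["a", "a", "b", "b", "b"]
total, segments = best_segmentation(len(frames), ["a", "b"], 3, make_toy_score(frames))
print(total, segments)
```

The `max_len` argument is one simple way to reduce the search space, which is the kind of trade-off the abstract says the paper studies.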
Cross-Lingual Dependency Parsing for Closely Related Languages - Helsinki’s Submission to VarDial 2017
Title | Cross-Lingual Dependency Parsing for Closely Related Languages - Helsinki’s Submission to VarDial 2017 |
Authors | Jörg Tiedemann |
Abstract | This paper describes the submission from the University of Helsinki to the shared task on cross-lingual dependency parsing at VarDial 2017. We present work on annotation projection and treebank translation that gave good results for all three target languages in the test set. In particular, Slovak seems to work well with information coming from the Czech treebank, which is in line with related work. The attachment scores for cross-lingual models even surpass the fully supervised models trained on the target language treebank. Croatian is the most difficult language in the test set and the improvements over the baseline are rather modest. Norwegian works best with information coming from Swedish whereas Danish contributes surprisingly little. |
Tasks | Dependency Parsing |
Published | 2017-08-18 |
URL | http://arxiv.org/abs/1708.05719v1 |
http://arxiv.org/pdf/1708.05719v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-lingual-dependency-parsing-for-closely |
Repo | |
Framework | |
Segmented and Directional Impact Detection for Parked Vehicles using Mobile Devices
Title | Segmented and Directional Impact Detection for Parked Vehicles using Mobile Devices |
Authors | Andre Ebert, Sebastian Feld, Florian Dorfmeister |
Abstract | Shared use of vehicles, including car sharing, has become increasingly attractive in recent years. Especially in urban environments with limited parking and a higher risk of traffic jams, car rentals and sharing services may save time and money. However, a rented vehicle may already be damaged (e.g., scratches or bumps inflicted by a previous user) without the service provider being aware of the damage. To address such problems, we present an automated, motion-based system for impact detection that uses a common smartphone as a sensor platform. The system is capable of detecting the impact segment and the point of time of an impact event on a vehicle’s surface, as well as its direction of origin. With this additional specific knowledge, it may be possible to reconstruct the circumstances of an impact event, e.g., to prove possible innocence of a service’s customer. |
Tasks | |
Published | 2017-03-16 |
URL | http://arxiv.org/abs/1703.05680v1 |
http://arxiv.org/pdf/1703.05680v1.pdf | |
PWC | https://paperswithcode.com/paper/segmented-and-directional-impact-detection |
Repo | |
Framework | |
A Flexible Framework for Hypothesis Testing in High-dimensions
Title | A Flexible Framework for Hypothesis Testing in High-dimensions |
Authors | Adel Javanmard, Jason D. Lee |
Abstract | Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples ($p > n$). In order to make informative inference, we assume that the model is approximately sparse, that is, the effect of covariates on the response can be well approximated by conditioning on a relatively small number of covariates whose identities are unknown. We develop a framework for testing very general hypotheses regarding the model parameters. Our framework encompasses testing whether the parameter lies in a convex cone, testing the signal strength, and testing arbitrary functionals of the parameter. We show that the proposed procedure controls the type I error, and also analyze the power of the procedure. Our numerical experiments confirm our theoretical findings and demonstrate that we control the false positive rate (type I error) near the nominal level, and have high power. By duality between hypothesis testing and confidence intervals, the proposed framework can be used to obtain valid confidence intervals for various functionals of the model parameters. For linear functionals, the length of confidence intervals is shown to be minimax rate optimal. |
Tasks | |
Published | 2017-04-26 |
URL | https://arxiv.org/abs/1704.07971v4 |
https://arxiv.org/pdf/1704.07971v4.pdf | |
PWC | https://paperswithcode.com/paper/a-flexible-framework-for-hypothesis-testing |
Repo | |
Framework | |
Finding Risk-Averse Shortest Path with Time-dependent Stochastic Costs
Title | Finding Risk-Averse Shortest Path with Time-dependent Stochastic Costs |
Authors | Dajian Li, Paul Weng, Orkun Karabasoglu |
Abstract | In this paper, we tackle the problem of risk-averse route planning in a transportation network with time-dependent and stochastic costs. To solve this problem, we propose an adaptation of the A* algorithm that accommodates any risk measure or decision criterion that is monotonic with first-order stochastic dominance. We also present a case study of our algorithm on the Manhattan, NYC, transportation network. |
Tasks | |
Published | 2017-01-03 |
URL | http://arxiv.org/abs/1701.00642v1 |
http://arxiv.org/pdf/1701.00642v1.pdf | |
PWC | https://paperswithcode.com/paper/finding-risk-averse-shortest-path-with-time |
Repo | |
Framework | |
Deep Video Generation, Prediction and Completion of Human Action Sequences
Title | Deep Video Generation, Prediction and Completion of Human Action Sequences |
Authors | Haoye Cai, Chunyan Bai, Yu-Wing Tai, Chi-Keung Tang |
Abstract | Current deep learning results on video generation are limited, there are only a few first results on video prediction, and there are no significant results on video completion. This is due to the severe ill-posedness inherent in these three problems. In this paper, we focus on human action videos, and propose a general, two-stage deep framework to generate human action videos with no constraints or an arbitrary number of constraints, which uniformly addresses the three problems: video generation given no input frames, video prediction given the first few frames, and video completion given the first and last frames. To make the problem tractable, in the first stage we train a deep generative model that generates a human pose sequence from random noise. In the second stage, a skeleton-to-image network is trained, which is used to generate a human action video given the complete human pose sequence generated in the first stage. By introducing the two-stage strategy, we sidestep the original ill-posed problems while producing for the first time high-quality video generation/prediction/completion results of much longer duration. We present quantitative and qualitative evaluation to show that our two-stage approach outperforms state-of-the-art methods in video generation, prediction and video completion. Our video result demonstration can be viewed at https://iamacewhite.github.io/supp/index.html |
Tasks | Video Generation, Video Prediction |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08682v3 |
http://arxiv.org/pdf/1711.08682v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-video-generation-prediction-and |
Repo | |
Framework | |
Multi-Generator Generative Adversarial Nets
Title | Multi-Generator Generative Adversarial Nets |
Authors | Quan Hoang, Tu Dinh Nguyen, Trung Le, Dinh Phung |
Abstract | We propose a new approach to train Generative Adversarial Nets (GANs) with a mixture of generators to overcome the mode collapsing problem. The main intuition is to employ multiple generators, instead of using a single one as in the original GAN. The idea is simple, yet proven to be extremely effective at covering diverse data modes, easily overcoming the mode collapse and delivering state-of-the-art results. A minimax formulation can be established among a classifier, a discriminator, and a set of generators, in a spirit similar to GAN. Generators create samples that are intended to come from the same distribution as the training data, whilst the discriminator determines whether samples are true data or generated by generators, and the classifier specifies which generator a sample comes from. The distinguishing feature is that internal samples are created from multiple generators, and then one of them is randomly selected as the final output, similar to the mechanism of a probabilistic mixture model. We term our method Mixture GAN (MGAN). We develop theoretical analysis to prove that, at the equilibrium, the Jensen-Shannon divergence (JSD) between the mixture of generators’ distributions and the empirical data distribution is minimal, whilst the JSD among generators’ distributions is maximal, hence effectively avoiding the mode collapse. By utilizing parameter sharing, our proposed model adds minimal computational cost to the standard GAN, and thus can also efficiently scale to large-scale datasets. We conduct extensive experiments on synthetic 2D data and natural image databases (CIFAR-10, STL-10 and ImageNet) to demonstrate the superior performance of our MGAN in achieving state-of-the-art Inception scores over the latest baselines, generating diverse and appealing recognizable objects at different resolutions, and specializing in capturing different types of objects by generators. |
Tasks | |
Published | 2017-08-08 |
URL | http://arxiv.org/abs/1708.02556v4 |
http://arxiv.org/pdf/1708.02556v4.pdf | |
PWC | https://paperswithcode.com/paper/multi-generator-generative-adversarial-nets |
Repo | |
Framework | |
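The sampling step described in the abstract above (each generator produces an internal sample, then one is selected at random as the output) can be sketched as follows. This is a toy illustration only: the generators here are hypothetical shift functions on a 1-D noise value rather than neural networks, and `make_toy_generator` and `sample_from_mixture` are names introduced for this sketch.

```python
import random
from typing import Callable, List, Sequence, Tuple


def make_toy_generator(shift: float) -> Callable[[float], float]:
    """Hypothetical stand-in for a trained generator: maps noise z to z + shift."""
    def generator(z: float) -> float:
        return z + shift
    return generator


def sample_from_mixture(
    generators: Sequence[Callable[[float], float]],
    mixing_weights: Sequence[float],
    rng: random.Random,
) -> Tuple[int, float]:
    """Draw one output from the mixture and report which generator produced it."""
    z = rng.gauss(0.0, 1.0)  # shared noise input
    candidates: List[float] = [g(z) for g in generators]  # one internal sample each
    # Pick a generator index according to the mixing weights.
    index = rng.choices(range(len(generators)), weights=mixing_weights, k=1)[0]
    return index, candidates[index]


shifts = [-10.0, 0.0, 10.0]  # three well-separated toy "modes"
generators = [make_toy_generator(s) for s in shifts]
rng = random.Random(0)
for _ in range(5):
    print(sample_from_mixture(generators, [1.0, 1.0, 1.0], rng))
```

In the paper's setting the classifier would be trained to recover the returned index from the sample alone, which is what pushes the generators toward different modes.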
An Ensemble Model with Ranking for Social Dialogue
Title | An Ensemble Model with Ranking for Social Dialogue |
Authors | Ioannis Papaioannou, Amanda Cercas Curry, Jose L. Part, Igor Shalyminov, Xinnuo Xu, Yanchao Yu, Ondřej Dušek, Verena Rieser, Oliver Lemon |
Abstract | Open-domain social dialogue is one of the long-standing goals of Artificial Intelligence. This year, the Amazon Alexa Prize challenge was announced for the first time, where real customers get to rate systems developed by leading universities worldwide. The aim of the challenge is to converse “coherently and engagingly with humans on popular topics for 20 minutes”. We describe our Alexa Prize system (called ‘Alana’) consisting of an ensemble of bots, combining rule-based and machine learning systems, and using a contextual ranking mechanism to choose a system response. The ranker was trained on real user feedback received during the competition, where we address the problem of how to train on the noisy and sparse feedback obtained during the competition. |
Tasks | |
Published | 2017-12-20 |
URL | http://arxiv.org/abs/1712.07558v1 |
http://arxiv.org/pdf/1712.07558v1.pdf | |
PWC | https://paperswithcode.com/paper/an-ensemble-model-with-ranking-for-social |
Repo | |
Framework | |
Bootstrapping a Lexicon for Emotional Arousal in Software Engineering
Title | Bootstrapping a Lexicon for Emotional Arousal in Software Engineering |
Authors | Mika V. Mäntylä, Nicole Novielli, Filippo Lanubile, Maëlick Claes, Miikka Kuutila |
Abstract | Emotional arousal increases activation and performance but may also lead to burnout in software development. We present the first version of a Software Engineering Arousal lexicon (SEA) that is specifically designed to address the problem of emotional arousal in the software developer ecosystem. SEA is built using a bootstrapping approach that combines a word embedding model trained on issue-tracking data with manual scoring of items in the lexicon. We show that our lexicon is able to differentiate between issue priorities, which are a source of emotional activation and thus act as a proxy for arousal. The best performance is obtained by combining SEA (428 words) with a previously created general-purpose lexicon by Warriner et al. (13,915 words), which achieves Cohen’s d effect sizes of up to 0.5. |
Tasks | |
Published | 2017-03-27 |
URL | http://arxiv.org/abs/1703.09046v1 |
http://arxiv.org/pdf/1703.09046v1.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-a-lexicon-for-emotional-arousal |
Repo | |
Framework | |
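The abstract above reports results as Cohen's d effect sizes. As a reminder of what that number measures, here is a minimal sketch using the common pooled-standard-deviation form; the abstract does not say which variant the authors used, and the two score lists are made-up values, not data from the paper.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d with pooled standard deviation (sample variances, n - 1 denominator)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_variance == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical arousal scores for high-priority and low-priority issues.
high_priority = [2.0, 4.0, 6.0]
low_priority = [1.0, 3.0, 5.0]
print(cohens_d(high_priority, low_priority))
```

A value of 0.5 means the group means differ by half a pooled standard deviation, which is conventionally described as a medium effect.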
The Price of Diversity in Assignment Problems
Title | The Price of Diversity in Assignment Problems |
Authors | Nawal Benabbou, Mithun Chakraborty, Vinh Ho Xuan, Jakub Sliwinski, Yair Zick |
Abstract | We introduce and analyze an extension to the matching problem on a weighted bipartite graph: Assignment with Type Constraints. The two parts of the graph are partitioned into subsets called types and blocks; we seek a matching with the largest sum of weights under the constraint that there is a pre-specified cap on the number of vertices matched in every type-block pair. Our primary motivation stems from the public housing program of Singapore, accounting for over 70% of its residential real estate. To promote ethnic diversity within its housing projects, Singapore imposes ethnicity quotas: each new housing development comprises blocks of flats and each ethnicity-based group in the population must not own more than a certain percentage of flats in a block. Other domains using similar hard capacity constraints include matching prospective students to schools or medical residents to hospitals. Limiting agents’ choices for ensuring diversity in this manner naturally entails some welfare loss. One of our goals is to study the trade-off between diversity and social welfare in such settings. We first show that, while the classic assignment program is polynomial-time computable, adding diversity constraints makes it computationally intractable; however, we identify a $\tfrac{1}{2}$-approximation algorithm, as well as reasonable assumptions on the weights that permit poly-time algorithms. Next, we provide two upper bounds on the price of diversity – a measure of the loss in welfare incurred by imposing diversity constraints – as functions of natural problem parameters. We conclude the paper with simulations based on publicly available data from two diversity-constrained allocation problems – Singapore Public Housing and Chicago School Choice – which shed light on how the constrained maximization as well as lottery-based variants perform in practice. |
Tasks | |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10241v7 |
http://arxiv.org/pdf/1711.10241v7.pdf | |
PWC | https://paperswithcode.com/paper/the-price-of-diversity-in-assignment-problems |
Repo | |
Framework | |
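To make the problem statement in the abstract above concrete, the sketch below solves a tiny instance of assignment with type-block caps by exhaustive search and compares it with the uncapped optimum. It is exponential-time and only meant for toy inputs; it is not the paper's approximation algorithm. The function name, the instance, and the choice to report the price of diversity as the ratio of unconstrained to constrained welfare are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Pair = Tuple[str, str]  # (agent type, item block)


def best_assignment(
    weights: Sequence[Sequence[float]],
    agent_type: Sequence[str],
    item_block: Sequence[str],
    caps: Optional[Dict[Pair, int]] = None,
) -> Tuple[float, Dict[int, int]]:
    """Exhaustively find a maximum-weight matching that respects type-block caps."""
    n_agents, n_items = len(agent_type), len(item_block)
    best_value = float("-inf")
    best_matching: Dict[int, int] = {}
    used_items = set()
    counts: Dict[Pair, int] = {}
    current: Dict[int, int] = {}

    def recurse(agent: int, value: float) -> None:
        nonlocal best_value, best_matching
        if agent == n_agents:
            if value > best_value:
                best_value, best_matching = value, dict(current)
            return
        recurse(agent + 1, value)  # option 1: leave this agent unmatched
        for item in range(n_items):  # option 2: try every free item
            if item in used_items:
                continue
            pair = (agent_type[agent], item_block[item])
            # A pair missing from caps is treated as uncapped.
            if caps is not None and counts.get(pair, 0) >= caps.get(pair, n_items):
                continue
            used_items.add(item)
            counts[pair] = counts.get(pair, 0) + 1
            current[agent] = item
            recurse(agent + 1, value + weights[agent][item])
            used_items.remove(item)
            counts[pair] -= 1
            del current[agent]

    recurse(0, 0.0)
    return best_value, best_matching


# Hypothetical instance: agents 0 and 1 have type "A", agent 2 has type "B";
# items 0 and 1 are in block "X", item 2 is in block "Y".
weights: List[List[float]] = [[5, 4, 1], [4, 5, 1], [1, 1, 2]]
agent_type = ["A", "A", "B"]
item_block = ["X", "X", "Y"]

unconstrained, _ = best_assignment(weights, agent_type, item_block)
constrained, _ = best_assignment(weights, agent_type, item_block, caps={("A", "X"): 1})
print(unconstrained, constrained, unconstrained / constrained)
```

In this instance, capping type "A" at one flat in block "X" forces one of the two "A" agents away from its preferred block, so total welfare drops even though every agent can still be matched.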
The Singularity May Be Near
Title | The Singularity May Be Near |
Authors | Roman V. Yampolskiy |
Abstract | Toby Walsh in ‘The Singularity May Never Be Near’ gives six arguments to support his point of view that technological singularity may happen but that it is unlikely. In this paper, we provide analysis of each one of his arguments and arrive at similar conclusions, but with more weight given to the ‘likely to happen’ probability. |
Tasks | |
Published | 2017-05-31 |
URL | http://arxiv.org/abs/1706.01303v1 |
http://arxiv.org/pdf/1706.01303v1.pdf | |
PWC | https://paperswithcode.com/paper/the-singularity-may-be-near |
Repo | |
Framework | |
Brain Inspired Cognitive Model with Attention for Self-Driving Cars
Title | Brain Inspired Cognitive Model with Attention for Self-Driving Cars |
Authors | Shitao Chen, Songyi Zhang, Jinghao Shang, Badong Chen, Nanning Zheng |
Abstract | The perception-driven approach and the end-to-end system are two major vision-based frameworks for self-driving cars. However, it is difficult to introduce into these two methods attention and historical information about the autonomous driving process, which are essential factors for achieving human-like driving. In this paper, we propose a novel model for self-driving cars named brain-inspired cognitive model with attention (CMA). This model consists of three parts: a convolutional neural network for simulating the human visual cortex, a cognitive map built to describe relationships between objects in a complex traffic scene, and a recurrent neural network that combines with the real-time updated cognitive map to implement an attention mechanism and long short-term memory. The benefit of our model is that it can accurately solve three tasks simultaneously: 1) detection of the free space and boundaries of the current and adjacent lanes, 2) estimation of obstacle distance and vehicle attitude, and 3) learning of driving behavior and decision making from a human driver. More significantly, the proposed model can accept external navigating instructions during an end-to-end driving process. For evaluation, we build a large-scale road-vehicle dataset that contains more than forty thousand labeled road images captured by three cameras on our self-driving car. Human driving activities and vehicle states are recorded at the same time. |
Tasks | Autonomous Driving, Decision Making, Self-Driving Cars |
Published | 2017-02-18 |
URL | http://arxiv.org/abs/1702.05596v1 |
http://arxiv.org/pdf/1702.05596v1.pdf | |
PWC | https://paperswithcode.com/paper/brain-inspired-cognitive-model-with-attention |
Repo | |
Framework | |
Tensor Train Neighborhood Preserving Embedding
Title | Tensor Train Neighborhood Preserving Embedding |
Authors | Wenqi Wang, Vaneet Aggarwal, Shuchin Aeron |
Abstract | In this paper, we propose a Tensor Train Neighborhood Preserving Embedding (TTNPE) to embed multi-dimensional tensor data into a low-dimensional tensor subspace. Novel approaches to solve the optimization problem in TTNPE are proposed. For this embedding, we evaluate the trade-off among classification, computation, and dimensionality reduction (storage) for supervised learning. It is shown that, compared to state-of-the-art tensor embedding methods, TTNPE achieves a superior trade-off in classification, computation, and dimensionality reduction on the MNIST handwritten digits and Weizmann face datasets. |
Tasks | Dimensionality Reduction |
Published | 2017-12-03 |
URL | http://arxiv.org/abs/1712.00828v2 |
http://arxiv.org/pdf/1712.00828v2.pdf | |
PWC | https://paperswithcode.com/paper/tensor-train-neighborhood-preserving |
Repo | |
Framework | |
Thread Reconstruction in Conversational Data using Neural Coherence Models
Title | Thread Reconstruction in Conversational Data using Neural Coherence Models |
Authors | Dat Tien Nguyen, Shafiq Joty, Basma El Amel Boussaha, Maarten de Rijke |
Abstract | Discussion forums are an important source of information. They are often used to answer specific questions a user might have and to discover more about a topic of interest. Discussions in these forums may evolve in intricate ways, making it difficult for users to follow the flow of ideas. We propose a novel approach for automatically identifying the underlying thread structure of a forum discussion. Our approach is based on a neural model that computes coherence scores of possible reconstructions and then selects the highest scoring, i.e., the most coherent one. Preliminary experiments demonstrate promising results outperforming a number of strong baseline methods. |
Tasks | |
Published | 2017-07-24 |
URL | http://arxiv.org/abs/1707.07660v2 |
http://arxiv.org/pdf/1707.07660v2.pdf | |
PWC | https://paperswithcode.com/paper/thread-reconstruction-in-conversational-data |
Repo | |
Framework | |