January 26, 2020


Paper Group ANR 1430


Seismic data interpolation based on U-net with texture loss


Title	Seismic data interpolation based on U-net with texture loss
Authors	Wenqian Fang, Lihua Fu, Meng Zhang, Zhiming Li
Abstract	Missing traces are a common occurrence during the collection of seismic data. Deep neural networks (DNNs) have shown considerable promise in restoring incomplete seismic data. However, several DNN-based approaches ignore the specific characteristics of seismic data and focus only on reducing the difference between the recovered and the original signals. In this study, a novel Seismic U-net InterpolaTor (SUIT) is proposed to preserve the seismic texture information while reconstructing the missing traces. Aside from minimizing the reconstruction error, SUIT enhances the texture consistency between the recovered and the original complete seismic data by using a pre-trained U-Net to extract the texture information. The experiments show that our method outperforms the classic state-of-the-art methods in terms of robustness.
Tasks
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04092v1
PDF	https://arxiv.org/pdf/1911.04092v1.pdf
PWC	https://paperswithcode.com/paper/seismic-data-interpolation-based-on-u-net
Repo
Framework
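
The abstract above describes a loss with two parts: a reconstruction error plus a texture-consistency term computed on features from a fixed extractor. A minimal sketch of that combination follows. The extractor here (`trace_difference`, a difference between neighbouring traces), the weight `texture_weight`, and all function names are illustrative assumptions; the paper uses a pre-trained U-Net as the extractor, which is not reproduced here.

```python
from typing import Callable, List

Grid = List[List[float]]  # rows are time samples, columns are traces


def mse(a: Grid, b: Grid) -> float:
    """Mean squared error between two grids of the same shape."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def trace_difference(section: Grid) -> Grid:
    """Hypothetical stand-in for a texture extractor: neighbouring-trace differences."""
    return [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in section]


def combined_loss(
    recovered: Grid,
    original: Grid,
    extract: Callable[[Grid], Grid] = trace_difference,
    texture_weight: float = 0.1,  # assumed value, not taken from the paper
) -> float:
    """Reconstruction error plus a weighted feature-space (texture) error."""
    reconstruction = mse(recovered, original)
    texture = mse(extract(recovered), extract(original))
    return reconstruction + texture_weight * texture
```

Swapping `extract` for the output of a frozen network layer would move this sketch closer to the setup the abstract describes.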

Remote Estimation of Free-Flow Speeds


Title	Remote Estimation of Free-Flow Speeds
Authors	Weilian Song, Tawfiq Salem, Hunter Blanton, Nathan Jacobs
Abstract	We propose an automated method to estimate a road segment’s free-flow speed from overhead imagery and road metadata. The free-flow speed of a road segment is the average observed vehicle speed in ideal conditions, without congestion or adverse weather. Standard practice for estimating free-flow speeds depends on several road attributes, including grade, curve, and width of the right of way. Unfortunately, many of these fine-grained labels are not always readily available and are costly to manually annotate. To compensate, our model uses a small, easy to obtain subset of road features along with aerial imagery to directly estimate free-flow speed with a deep convolutional neural network (CNN). We evaluate our approach on a large dataset, and demonstrate that using imagery alone performs nearly as well as the road features and that the combination of imagery with road features leads to the highest accuracy.
Tasks
Published	2019-06-24
URL	https://arxiv.org/abs/1906.10104v1
PDF	https://arxiv.org/pdf/1906.10104v1.pdf
PWC	https://paperswithcode.com/paper/remote-estimation-of-free-flow-speeds
Repo
Framework

Google vs IBM: A Constraint Solving Challenge on the Job-Shop Scheduling Problem


Title	Google vs IBM: A Constraint Solving Challenge on the Job-Shop Scheduling Problem
Authors	Giacomo Da Col, Erich Teppan
Abstract	Job-shop scheduling is one of the most studied optimization problems from the dawn of the computer era to the present day. Its combinatorial nature makes it easily expressible as a constraint satisfaction problem. In this paper, we compare the performance of two constraint solvers on the job-shop scheduling problem. The solvers in question are: OR-Tools, an open-source solver developed by Google and winner of the last MiniZinc Challenge, and CP Optimizer, a proprietary IBM constraint solver targeted at industrial scheduling problems. The comparison is based on the quality of the solutions found and the time required to solve the problem instances. First, we target the classic benchmarks from the literature, then we carry out the comparison on a benchmark that was created with known optimal solutions, with sizes comparable to real-world industrial problems.
Tasks
Published	2019-09-18
URL	https://arxiv.org/abs/1909.08247v1
PDF	https://arxiv.org/pdf/1909.08247v1.pdf
PWC	https://paperswithcode.com/paper/google-vs-ibm-a-constraint-solving-challenge
Repo
Framework
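
To make the constraints of the job-shop problem concrete, here is a small sketch that evaluates schedules for a toy instance. Each job is an ordered list of (machine, duration) operations, an operation cannot start before its predecessor in the same job finishes, and a machine runs one operation at a time. The brute-force search below is only an illustration for tiny instances; it is not how OR-Tools or CP Optimizer solve the problem, and the instance and function names are made up for this example.

```python
from itertools import permutations
from typing import Dict, List, Sequence, Tuple

Operation = Tuple[int, int]  # (machine index, duration)


def makespan(jobs: Sequence[Sequence[Operation]], order: Sequence[int]) -> int:
    """Schedule operations in the given job order and return the finish time.

    `order` lists job indices; the k-th occurrence of a job index schedules
    that job's k-th operation at the earliest feasible start time.
    """
    next_op = [0] * len(jobs)
    job_ready = [0] * len(jobs)          # time each job's previous operation ends
    machine_ready: Dict[int, int] = {}   # time each machine becomes free
    for job in order:
        machine, duration = jobs[job][next_op[job]]
        start = max(job_ready[job], machine_ready.get(machine, 0))
        end = start + duration
        job_ready[job] = end
        machine_ready[machine] = end
        next_op[job] += 1
    return max(job_ready)


def best_makespan(jobs: Sequence[Sequence[Operation]]) -> int:
    """Enumerate every operation order; feasible only for very small instances."""
    tokens: List[int] = [j for j, ops in enumerate(jobs) for _ in ops]
    return min(makespan(jobs, order) for order in set(permutations(tokens)))


# Toy instance: two jobs, two machines.
toy_jobs = [
    [(0, 3), (1, 2)],  # job 0: machine 0 for 3, then machine 1 for 2
    [(1, 2), (0, 4)],  # job 1: machine 1 for 2, then machine 0 for 4
]
print(best_makespan(toy_jobs))
```

The number of orders grows factorially with the number of operations, which is why instances of industrial size call for dedicated constraint solvers such as the two compared in the paper.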

Monocular Plan View Networks for Autonomous Driving


Title	Monocular Plan View Networks for Autonomous Driving
Authors	Dequan Wang, Coline Devin, Qi-Zhi Cai, Philipp Krähenbühl, Trevor Darrell
Abstract	Convolutions on monocular dash cam videos capture spatial invariances in the image plane but do not explicitly reason about distances and depth. We propose a simple transformation of observations into a bird’s eye view, also known as plan view, for end-to-end control. We detect vehicles and pedestrians in the first person view and project them into an overhead plan view. This representation provides an abstraction of the environment from which a deep network can easily deduce the positions and directions of entities. Additionally, the plan view enables us to leverage advances in 3D object detection in conjunction with deep policy learning. We evaluate our monocular plan view network on the photo-realistic Grand Theft Auto V simulator. A network using both a plan view and front view causes less than half as many collisions as previous detection-based methods and an order of magnitude fewer collisions than pure pixel-based policies.
Tasks	3D Object Detection, Autonomous Driving, Object Detection
Published	2019-05-16
URL	https://arxiv.org/abs/1905.06937v1
PDF	https://arxiv.org/pdf/1905.06937v1.pdf
PWC	https://paperswithcode.com/paper/monocular-plan-view-networks-for-autonomous
Repo
Framework
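
The projection from a first-person detection into an overhead plan view can be sketched with a pinhole camera model. This is a minimal illustration under assumptions not stated in the abstract: a known depth estimate per detection, a known focal length and principal point in pixels, and a hypothetical grid layout with the ego vehicle at the bottom centre. It is not the authors' pipeline.

```python
import math
from typing import Tuple


def image_to_plan_view(u_px: float, depth_m: float, focal_px: float, cx_px: float) -> Tuple[float, float]:
    """Map an image column and a depth estimate to (lateral, forward) metres.

    Pinhole model: lateral offset = (u - cx) * depth / focal length.
    """
    lateral_m = (u_px - cx_px) * depth_m / focal_px
    return lateral_m, depth_m


def to_grid_cell(lateral_m: float, forward_m: float, cell_size_m: float, grid_width: int) -> Tuple[int, int]:
    """Convert metric plan-view coordinates to (row, col) in an overhead grid.

    Assumed layout: row 0 is at the ego vehicle, the centre column is straight ahead.
    """
    col = math.floor(lateral_m / cell_size_m) + grid_width // 2
    row = math.floor(forward_m / cell_size_m)
    return row, col


# Example with made-up camera parameters.
lateral, forward = image_to_plan_view(u_px=960.0, depth_m=20.0, focal_px=800.0, cx_px=640.0)
cell = to_grid_cell(lateral, forward, cell_size_m=1.0, grid_width=64)
```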

Argument Identification in Public Comments from eRulemaking


Title	Argument Identification in Public Comments from eRulemaking
Authors	Vlad Eidelman, Brian Grom
Abstract	Administrative agencies in the United States receive millions of comments each year concerning proposed agency actions during the eRulemaking process. These comments represent a diversity of arguments in support of and in opposition to the proposals. While agencies are required to identify and respond to substantive comments, they have struggled to keep pace with the volume of information. In this work we address the tasks of identifying argumentative text, classifying the type of argument claims employed, and determining the stance of the comment. First, we propose a taxonomy of argument claims based on an analysis of thousands of rules and millions of comments. Second, we collect and semi-automatically bootstrap annotations to create a dataset of millions of sentences with argument claim type annotation at the sentence level. Third, we build a system for automatically determining argumentative spans and claim type using our proposed taxonomy in a hierarchical classification model.
Tasks
Published	2019-05-02
URL	https://arxiv.org/abs/1905.00572v2
PDF	https://arxiv.org/pdf/1905.00572v2.pdf
PWC	https://paperswithcode.com/paper/argument-identification-in-public-comments
Repo
Framework

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization


Title	PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Authors	Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
Abstract	Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1000 examples.
Tasks	Abstractive Text Summarization, Text Summarization
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08777v1
PDF	https://arxiv.org/pdf/1912.08777v1.pdf
PWC	https://paperswithcode.com/paper/pegasus-pre-training-with-extracted-gap
Repo
Framework
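
The gap-sentence objective described in the abstract (remove important sentences from the input and generate them as one output sequence) can be sketched as a data-preparation step. The importance score below is a simple word-overlap ratio chosen for illustration; the `MASK_TOKEN` string and all function names are likewise assumptions and do not come from the paper or its code.

```python
from typing import List, Tuple

MASK_TOKEN = "<mask>"  # hypothetical placeholder token


def overlap_score(sentence: str, others: List[str]) -> float:
    """Fraction of the sentence's distinct words that also occur in the other sentences."""
    words = set(sentence.lower().split())
    rest = {w for s in others for w in s.lower().split()}
    if not words:
        return 0.0
    return len(words & rest) / len(words)


def make_gap_sentence_example(sentences: List[str], num_gaps: int = 1) -> Tuple[str, str]:
    """Mask the highest-scoring sentences and return (source, target) text."""
    scored = []
    for i, sentence in enumerate(sentences):
        others = sentences[:i] + sentences[i + 1:]
        scored.append((overlap_score(sentence, others), i))
    # Highest score first; ties go to the earlier sentence.
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    chosen = sorted(index for _, index in ranked[:num_gaps])
    source = " ".join(MASK_TOKEN if i in chosen else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in chosen)
    return source, target


doc = ["the cat sat", "the cat sat on the mat", "dogs bark loudly"]
source_text, target_text = make_gap_sentence_example(doc, num_gaps=1)
```

An encoder-decoder model would then be trained to produce `target_text` from `source_text`.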

NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport


Title	NeuTra-lizing Bad Geometry in Hamiltonian Monte Carlo Using Neural Transport
Authors	Matthew Hoffman, Pavel Sountsov, Joshua V. Dillon, Ian Langmore, Dustin Tran, Srinivas Vasudevan
Abstract	Hamiltonian Monte Carlo is a powerful algorithm for sampling from difficult-to-normalize posterior distributions. However, when the geometry of the posterior is unfavorable, it may take many expensive evaluations of the target distribution and its gradient to converge and mix. We propose neural transport (NeuTra) HMC, a technique for learning to correct this sort of unfavorable geometry using inverse autoregressive flows (IAF), a powerful neural variational inference technique. The IAF is trained to minimize the KL divergence from an isotropic Gaussian to the warped posterior, and then HMC sampling is performed in the warped space. We evaluate NeuTra HMC on a variety of synthetic and real problems, and find that it significantly outperforms vanilla HMC both in time to reach the stationary distribution and asymptotic effective-sample-size rates.
Tasks
Published	2019-03-09
URL	http://arxiv.org/abs/1903.03704v1
PDF	http://arxiv.org/pdf/1903.03704v1.pdf
PWC	https://paperswithcode.com/paper/neutra-lizing-bad-geometry-in-hamiltonian
Repo
Framework

Goal-Embedded Dual Hierarchical Model for Task-Oriented Dialogue Generation


Title	Goal-Embedded Dual Hierarchical Model for Task-Oriented Dialogue Generation
Authors	Yi-An Lai, Arshit Gupta, Yi Zhang
Abstract	Hierarchical neural networks are often used to model inherent structures within dialogues. For goal-oriented dialogues, these models lack a mechanism for adhering to the goals and neglect the distinct conversational patterns between two interlocutors. In this work, we propose the Goal-Embedded Dual Hierarchical Attentional Encoder-Decoder (G-DuHA), which is able to center around goals and capture interlocutor-level disparity while modeling goal-oriented dialogues. Experiments on dialogue generation, response generation, and human evaluations demonstrate that the proposed model successfully generates higher-quality, more diverse and goal-centric dialogues. Moreover, we apply data augmentation via goal-oriented dialogue generation to task-oriented dialog systems and achieve better performance.
Tasks	Data Augmentation, Dialogue Generation
Published	2019-09-19
URL	https://arxiv.org/abs/1909.09220v1
PDF	https://arxiv.org/pdf/1909.09220v1.pdf
PWC	https://paperswithcode.com/paper/goal-embedded-dual-hierarchical-model-for
Repo
Framework

VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research


Title	VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research
Authors	Xin Wang, Jiawei Wu, Junkun Chen, Lei Li, Yuan-Fang Wang, William Yang Wang
Abstract	We present a new large-scale multilingual video description dataset, VATEX, which contains over 41,250 videos and 825,000 captions in both English and Chinese. Among the captions, there are over 206,000 English-Chinese parallel translation pairs. Compared to the widely-used MSR-VTT dataset, VATEX is multilingual, larger, linguistically complex, and more diverse in terms of both video and natural language descriptions. We also introduce two tasks for video-and-language research based on VATEX: (1) Multilingual Video Captioning, aimed at describing a video in various languages with a compact unified captioning model, and (2) Video-guided Machine Translation, to translate a source language description into the target language using the video information as additional spatiotemporal context. Extensive experiments on the VATEX dataset show that, first, the unified multilingual model can not only produce both English and Chinese descriptions for a video more efficiently, but also offer improved performance over the monolingual models. Furthermore, we demonstrate that the spatiotemporal video context can be effectively utilized to align source and target languages and thus assist machine translation. Finally, we discuss the potential of using VATEX for other video-and-language research.
Tasks	Machine Translation, Video Captioning, Video Description
Published	2019-04-06
URL	https://arxiv.org/abs/1904.03493v2
PDF	https://arxiv.org/pdf/1904.03493v2.pdf
PWC	https://paperswithcode.com/paper/vatex-a-large-scale-high-quality-multilingual
Repo
Framework

Poq: Projection-based Runtime Assertions for Debugging on a Quantum Computer


Title	Poq: Projection-based Runtime Assertions for Debugging on a Quantum Computer
Authors	Gushu Li, Li Zhou, Nengkun Yu, Yufei Ding, Mingsheng Ying, Yuan Xie
Abstract	In this paper, we propose Poq, a runtime assertion scheme for debugging on a quantum computer. The predicates in the assertions are represented by projections (or equivalently, closed subspaces of the state space), following Birkhoff-von Neumann quantum logic. The satisfaction of a projection by a quantum state can be directly checked upon a small number of projective measurements rather than a large number of repeated executions. Several techniques are introduced to rotate the predicates to the computational basis, on which a realistic quantum computer usually supports its measurements, so that a satisfying tested state will not be destroyed when an assertion is checked and multi-assertion per testing execution is enabled. We compare Poq with existing quantum program assertions and demonstrate the effectiveness and efficiency of Poq by its applications to assert two sophisticated quantum algorithms.
Tasks
Published	2019-11-28
URL	https://arxiv.org/abs/1911.12855v1
PDF	https://arxiv.org/pdf/1911.12855v1.pdf
PWC	https://paperswithcode.com/paper/poq-projection-based-runtime-assertions-for
Repo
Framework
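
As a small numerical illustration of what it means for a state to satisfy a projection, the sketch below computes the ideal probability that a pure state passes a rank-one projective check, using the Born rule p = |<phi|psi>|^2. It only covers the single-vector case on a classical simulator; the rotation and measurement techniques that Poq introduces for real hardware are not modelled, and the function names are illustrative.

```python
import math
from typing import Sequence


def inner(phi: Sequence[complex], psi: Sequence[complex]) -> complex:
    """Inner product <phi|psi> for state vectors given as amplitude lists."""
    return sum(complex(a).conjugate() * complex(b) for a, b in zip(phi, psi))


def pass_probability(phi: Sequence[complex], psi: Sequence[complex]) -> float:
    """Probability that state psi passes the check for the projection |phi><phi|.

    Both vectors are assumed to be normalised.
    """
    return abs(inner(phi, psi)) ** 2


amp = 1.0 / math.sqrt(2.0)
zero = [1.0, 0.0]    # |0>
plus = [amp, amp]    # |+>
p_same = pass_probability(plus, plus)   # state lies in the asserted subspace
p_other = pass_probability(zero, plus)  # state only partly overlaps it
```

A state inside the asserted subspace passes with probability one and is left unchanged by the projective measurement, which is the property the abstract relies on for checking several assertions in one execution.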

A survey on Big Data and Machine Learning for Chemistry


Title	A survey on Big Data and Machine Learning for Chemistry
Authors	Jose F Rodrigues Jr, Larisa Florea, Maria C F de Oliveira, Dermot Diamond, Osvaldo N Oliveira Jr
Abstract	Herein we review aspects of leading-edge research and innovation in chemistry which exploits big data and machine learning (ML), two computer science fields that combine to yield machine intelligence. ML can accelerate the solution of intricate chemical problems and even solve problems that otherwise would not be tractable. But the potential benefits of ML come at the cost of big data production; that is, the algorithms, in order to learn, demand large volumes of data of various natures and from different sources, from materials properties to sensor data. In the survey, we propose a roadmap for future developments, with emphasis on materials discovery and chemical sensing, and within the context of the Internet of Things (IoT), both prominent research fields for ML in the context of big data. In addition to providing an overview of recent advances, we elaborate upon the conceptual and practical limitations of big data and ML applied to chemistry, outlining processes, discussing pitfalls, and reviewing cases of success and failure.
Tasks
Published	2019-04-23
URL	http://arxiv.org/abs/1904.10370v1
PDF	http://arxiv.org/pdf/1904.10370v1.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-big-data-and-machine-learning-for
Repo
Framework

Cooper: Cooperative Perception for Connected Autonomous Vehicles based on 3D Point Clouds


Title	Cooper: Cooperative Perception for Connected Autonomous Vehicles based on 3D Point Clouds
Authors	Qi Chen, Sihai Tang, Qing Yang, Song Fu
Abstract	Autonomous vehicles may make wrong decisions due to inaccurate detection and recognition. Therefore, an intelligent vehicle can combine its own data with that of other vehicles to enhance perceptive ability, and thus improve detection accuracy and driving safety. However, multi-vehicle cooperative perception requires the integration of real world scenes and the traffic of raw sensor data exchange far exceeds the bandwidth of existing vehicular networks. To the best of our knowledge, we are the first to conduct a study on raw-data level cooperative perception for enhancing the detection ability of self-driving systems. In this work, relying on LiDAR 3D point clouds, we fuse the sensor data collected from different positions and angles of connected vehicles. A point cloud based 3D object detection method is proposed to work on a diversity of aligned point clouds. Experimental results on KITTI and our collected dataset show that the proposed system outperforms perception by extending sensing area, improving detection accuracy and promoting augmented results. Most importantly, we demonstrate it is possible to transmit point cloud data for cooperative perception via existing vehicular network technologies.
Tasks	3D Object Detection, Autonomous Vehicles, Object Detection
Published	2019-05-13
URL	https://arxiv.org/abs/1905.05265v1
PDF	https://arxiv.org/pdf/1905.05265v1.pdf
PWC	https://paperswithcode.com/paper/cooper-cooperative-perception-for-connected
Repo
Framework
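
Fusing point clouds collected from different positions and angles requires bringing them into one coordinate frame first. The sketch below shows that alignment step in its simplest form, assuming the sender's pose relative to the receiver is already known as a yaw angle and a translation. How the poses are obtained, and the detection network that runs on the fused cloud, are outside this sketch; the function names are illustrative.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]


def transform_points(points: Iterable[Point], yaw_rad: float, offset: Point) -> List[Point]:
    """Rotate points about the vertical axis by yaw, then translate by offset."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    ox, oy, oz = offset
    return [(c * x - s * y + ox, s * x + c * y + oy, z + oz) for x, y, z in points]


def fuse_point_clouds(
    receiver_points: Iterable[Point],
    sender_points: Iterable[Point],
    sender_yaw_rad: float,
    sender_offset: Point,
) -> List[Point]:
    """Express the sender's points in the receiver's frame and concatenate both clouds."""
    aligned = transform_points(sender_points, sender_yaw_rad, sender_offset)
    return list(receiver_points) + aligned


# Made-up example: the sender is 10 m ahead and rotated 90 degrees.
fused = fuse_point_clouds(
    receiver_points=[(1.0, 0.0, 0.0)],
    sender_points=[(1.0, 0.0, 0.5)],
    sender_yaw_rad=math.pi / 2,
    sender_offset=(10.0, 0.0, 0.0),
)
```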

Deep Optics for Monocular Depth Estimation and 3D Object Detection


Title	Deep Optics for Monocular Depth Estimation and 3D Object Detection
Authors	Julie Chang, Gordon Wetzstein
Abstract	Depth estimation and 3D object detection are critical for scene understanding but remain challenging to perform with a single image due to the loss of 3D information during image capture. Recent models using deep neural networks have improved monocular depth estimation performance, but there is still difficulty in predicting absolute depth and generalizing outside a standard dataset. Here we introduce the paradigm of deep optics, i.e. end-to-end design of optics and image processing, to the monocular depth estimation problem, using coded defocus blur as an additional depth cue to be decoded by a neural network. We evaluate several optical coding strategies along with an end-to-end optimization scheme for depth estimation on three datasets, including NYU Depth v2 and KITTI. We find an optimized freeform lens design yields the best results, but chromatic aberration from a singlet lens offers significantly improved performance as well. We build a physical prototype and validate that chromatic aberrations improve depth estimation on real-world results. In addition, we train object detection networks on the KITTI dataset and show that the lens optimized for depth estimation also results in improved 3D object detection performance.
Tasks	3D Object Detection, Depth Estimation, Monocular Depth Estimation, Object Detection, Scene Understanding
Published	2019-04-18
URL	http://arxiv.org/abs/1904.08601v1
PDF	http://arxiv.org/pdf/1904.08601v1.pdf
PWC	https://paperswithcode.com/paper/deep-optics-for-monocular-depth-estimation
Repo
Framework
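
Defocus blur carries depth information because the blur size depends on how far an object is from the focus plane. The sketch below uses the standard thin-lens circle-of-confusion relation to show this dependence. It models only a conventional lens; the optimized freeform lens and chromatic-aberration coding evaluated in the paper are not modelled, and the parameter values are made up.

```python
def blur_diameter(
    aperture_m: float,
    focal_length_m: float,
    focus_distance_m: float,
    object_distance_m: float,
) -> float:
    """Thin-lens blur-circle diameter on the sensor, in metres.

    c = A * f * |s - s_f| / (s * (s_f - f)), where A is the aperture diameter,
    f the focal length, s_f the focus distance and s the object distance.
    """
    numerator = aperture_m * focal_length_m * abs(object_distance_m - focus_distance_m)
    denominator = object_distance_m * (focus_distance_m - focal_length_m)
    return numerator / denominator


# 50 mm lens, 25 mm aperture, focused at 2 m (illustrative values).
for distance in (2.0, 3.0, 4.0):
    print(distance, blur_diameter(0.025, 0.05, 2.0, distance))
```

With a plain lens the blur size is the same for some distances in front of and behind the focus plane, so it is an ambiguous cue on its own.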

Semantic Label Reduction Techniques for Autonomous Driving


Title	Semantic Label Reduction Techniques for Autonomous Driving
Authors	Qadeer Khan, Torsten Schön, Patrick Wenzel
Abstract	Semantic segmentation maps can be used as input to models for maneuvering the controls of a car. However, not all labels may be necessary for making the control decision. One would expect that certain labels such as road lanes or sidewalks would be more critical in comparison with labels for vegetation or buildings which may not have a direct influence on the car’s driving decision. In this appendix, we evaluate and quantify how sensitive and important the different semantic labels are for controlling the car. Labels that do not influence the driving decision are remapped to other classes, thereby simplifying the task by reducing to only labels critical for driving of the vehicle.
Tasks	Autonomous Driving, Semantic Segmentation
Published	2019-02-11
URL	http://arxiv.org/abs/1902.03777v1
PDF	http://arxiv.org/pdf/1902.03777v1.pdf
PWC	https://paperswithcode.com/paper/semantic-label-reduction-techniques-for
Repo
Framework
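
Remapping labels that do not influence the driving decision onto other classes amounts to a lookup applied to every pixel of the segmentation map. A minimal sketch follows; the class names and the particular remapping are invented for illustration and are not the reduction the authors arrive at.

```python
from typing import Dict, List

LabelMap = List[List[str]]


def remap_labels(label_map: LabelMap, remapping: Dict[str, str]) -> LabelMap:
    """Replace each label by its remapped class; unlisted labels are kept as-is."""
    return [[remapping.get(label, label) for label in row] for row in label_map]


# Hypothetical reduction: fold non-critical classes into a single "background" class.
reduction = {"vegetation": "background", "building": "background"}
segmentation = [
    ["building", "vegetation", "road"],
    ["sidewalk", "road", "road"],
]
reduced = remap_labels(segmentation, reduction)
remaining_classes = sorted({label for row in reduced for label in row})
```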

LibriVoxDeEn: A Corpus for German-to-English Speech Translation and German Speech Recognition


Title	LibriVoxDeEn: A Corpus for German-to-English Speech Translation and German Speech Recognition
Authors	Benjamin Beilharz, Xin Sun, Sariya Karimova, Stefan Riezler
Abstract	We present a corpus of sentence-aligned triples of German audio, German text, and English translation, based on German audiobooks. The speech translation data consist of 110 hours of audio material aligned to over 50k parallel sentences. An even larger dataset comprising 547 hours of German speech aligned to German text is available for speech recognition. The audio data is read speech and thus low in disfluencies. The quality of audio and sentence alignments has been checked by a manual evaluation, showing that speech alignment quality is in general very high. The sentence alignment quality is comparable to well-used parallel translation data and can be adjusted by cutoffs on the automatic alignment score. To our knowledge, this corpus is to date the largest resource for German speech recognition and for end-to-end German-to-English speech translation.
Tasks	Speech Recognition
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07924v3
PDF	https://arxiv.org/pdf/1910.07924v3.pdf
PWC	https://paperswithcode.com/paper/librivoxdeen-a-corpus-for-german-to-english
Repo
Framework