Paper Group ANR 1414
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports. Local Score Dependent Model Explanation for Time Dependent Covariates. Learning Pixel Representations for Generic Segmentation. Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping. Automated 3D recovery from very high r …
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports
Title | Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports |
Authors | Yuhao Zhang, Derek Merck, Emily Bao Tsai, Christopher D. Manning, Curtis P. Langlotz |
Abstract | Neural abstractive summarization models are able to generate summaries which have high overlap with human references. However, existing models are not optimized for factual correctness, a critical metric in real-world applications. In this work, we develop a general framework where we evaluate the factual correctness of a generated summary by fact-checking it against its reference using an information extraction module. We further propose a training strategy which optimizes a neural summarization model with a factual correctness reward via reinforcement learning. We apply the proposed method to the summarization of radiology reports, where factual correctness is a key requirement. On two separate datasets collected from real hospitals, we show via both automatic and human evaluation that the proposed approach substantially improves the factual correctness and overall quality of outputs over a competitive neural summarization system. |
Tasks | Abstractive Text Summarization |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02541v2 |
https://arxiv.org/pdf/1911.02541v2.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-the-factual-correctness-of-a |
Repo | |
Framework | |
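
A minimal sketch of the factual-correctness reward described above, assuming a toy keyword-based `extract_facts` as a stand-in for the paper's trained information-extraction module: the reward is the F1 overlap between facts extracted from the generated and reference summaries, which could then be mixed with an overlap term in the reinforcement-learning objective.

```python
# Minimal sketch of a factual-correctness reward. The keyword-based
# extract_facts() is a hypothetical stand-in for the paper's trained
# information-extraction module; findings and inputs are illustrative.

def extract_facts(summary: str) -> set:
    findings = ["edema", "effusion", "pneumonia", "pneumothorax", "cardiomegaly"]
    text = summary.lower()
    return {f for f in findings if f in text}

def factual_f1(generated: str, reference: str) -> float:
    gen, ref = extract_facts(generated), extract_facts(reference)
    if not gen and not ref:
        return 1.0
    if not gen or not ref:
        return 0.0
    precision = len(gen & ref) / len(gen)
    recall = len(gen & ref) / len(ref)
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# A combined RL reward could weight this against a fluency/overlap term, e.g.
# reward = w_overlap * rouge_l(gen, ref) + w_fact * factual_f1(gen, ref)
print(factual_f1("Mild edema is seen.", "There is pulmonary edema and a small effusion."))
```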
Local Score Dependent Model Explanation for Time Dependent Covariates
Title | Local Score Dependent Model Explanation for Time Dependent Covariates |
Authors | Xochitl Watts, Freddy Lecue |
Abstract | The use of deep neural networks to make high-risk decisions creates a need for global and local explanations so that users and experts have confidence in the modeling algorithms. We introduce a novel technique to find global and local explanations for time series data used in binary classification machine learning systems. We identify the most salient of the original features used by a black box model to distinguish between classes. The explanation can be made on categorical, continuous, and time series data and can be generalized to any binary classification model. The analysis is conducted on time series data to train a long short-term memory deep neural network and uses the time dependent structure of the underlying features in the explanation. The proposed technique attributes weights to features to explain an observation's risk of belonging to a class as a multiplicative factor of a base hazard rate. We use a variation of Cox Proportional Hazards regression, a Generalized Additive Model, to explain the effect of variables upon the probability of an in-class response for a score output from the black box model. The covariates incorporate time dependence structure in the features so the explanation is inclusive of the underlying time series data structure. |
Tasks | Time Series |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04839v1 |
https://arxiv.org/pdf/1908.04839v1.pdf | |
PWC | https://paperswithcode.com/paper/local-score-dependent-model-explanation-for |
Repo | |
Framework | |
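
A small numerical illustration of the explanation the abstract describes, with made-up coefficients rather than values from the paper: in a Cox-style model the hazard is the baseline hazard multiplied by exp(beta · x), so each exp(beta_j x_j) can be read as a per-feature multiplicative risk factor at time t.

```python
import numpy as np

# Illustration of the multiplicative explanation: hazard(t) = h0(t) * exp(beta . x(t)),
# so exp(beta_j * x_j) is the per-feature multiplicative risk factor.
# The coefficients and covariate values below are made up, not from the paper.

beta = np.array([0.8, -0.3, 0.05])        # fitted weights for three covariates
x_t = np.array([1.0, 2.5, 40.0])          # one observation's time-dependent covariates at time t
baseline_hazard = 0.02                    # h0(t) at the same time step

per_feature_factor = np.exp(beta * x_t)   # contribution of each covariate
total_factor = per_feature_factor.prod()  # equals exp(beta @ x_t)
hazard = baseline_hazard * total_factor

print("per-feature multiplicative factors:", per_feature_factor)
print("hazard at t:", hazard)
```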
Learning Pixel Representations for Generic Segmentation
Title | Learning Pixel Representations for Generic Segmentation |
Authors | Oran Shayer, Michael Lindenbaum |
Abstract | Deep learning approaches to generic (non-semantic) segmentation have so far been indirect and relied on edge detection. This is in contrast to semantic segmentation, where DNNs are applied directly. We propose an alternative approach called Deep Generic Segmentation (DGS) and try to follow the path used for semantic segmentation. Our main contribution is a new method for learning a pixel-wise representation that reflects segment relatedness. This representation is combined with a CRF to yield the segmentation algorithm. We show that we are able to learn meaningful representations that improve segmentation quality and that the representations themselves achieve state-of-the-art segment similarity scores. The segmentation results are competitive and promising. |
Tasks | Edge Detection, Semantic Segmentation |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11735v1 |
https://arxiv.org/pdf/1909.11735v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-pixel-representations-for-generic |
Repo | |
Framework | |
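
A minimal sketch of how learned pixel embeddings can be turned into pairwise affinities, the quantity a CRF's pairwise potentials would be built from; the random embedding map below is a placeholder for the output of the paper's trained network.

```python
import numpy as np

# Minimal sketch: per-pixel embeddings -> pairwise affinity. The random
# embedding map stands in for the representation learned by the network.

H, W, D = 4, 4, 8
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(H, W, D))          # per-pixel D-dim representation

def affinity(p, q, sigma=1.0):
    """Similarity of two pixels: high when their embeddings are close."""
    d2 = np.sum((embeddings[p] - embeddings[q]) ** 2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Affinity between horizontally adjacent pixels; a CRF would penalize assigning
# different segment labels to pairs with high affinity.
print(affinity((1, 1), (1, 2)))
```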
Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping
Title | Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping |
Authors | Rongjun Qin |
Abstract | Although advanced dense image matching (DIM) algorithms are nowadays able to produce LiDAR (Light Detection And Ranging) comparable dense point clouds from satellite stereo images, the accuracy and completeness of such point clouds heavily depend on the geometric parameters of the satellite stereo images. The intersection angle between two images is normally seen as the most important factor in stereo data acquisition, as the state-of-the-art DIM algorithms work best on narrow-baseline (smaller intersection angle) stereos (e.g., Semi-Global Matching regards 15-25 degrees as a good intersection angle). This factor is in line with the traditional aerial photogrammetry configuration, as the intersection angle directly relates to the base-height ratio and texture distortion in the parallax direction, thus affecting both the horizontal and vertical accuracy. However, our experiments found that even with very similar (and good) intersection angles, the same DIM algorithm applied on different stereo pairs (of the same area) produced point clouds with dramatically different accuracy as compared to the ground truth LiDAR data. This raises a very practical question that is often asked by practitioners: what factors constitute a good satellite stereo pair, such that it produces accurate and optimal results for mapping purposes? In this work, we provide a comprehensive analysis of this matter by performing stereo matching over 1,000 satellite stereo pairs with different acquisition parameters, including their intersection angles, off-nadir angles, sun elevation & azimuth angles, as well as time differences, to offer a thorough answer to this question. This work will potentially provide a valuable reference to researchers working on multi-view satellite image reconstruction, as well as to industrial practitioners minimizing costs for high-quality large-scale mapping. |
Tasks | 3D Reconstruction, Image Reconstruction, Stereo Matching, Stereo Matching Hand |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07476v1 |
https://arxiv.org/pdf/1905.07476v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-critical-parameters-of-satellite |
Repo | |
Framework | |
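
A minimal sketch of the kind of analysis the abstract describes, with made-up records standing in for the 1,000 evaluated stereo pairs: bucket pairs by an acquisition parameter (intersection angle here) and compare their point-cloud error against LiDAR within each bucket.

```python
import numpy as np

# Minimal sketch: group stereo pairs by intersection angle and compare the
# resulting point-cloud RMSE against LiDAR per group. Records are placeholders.

pairs = [
    {"intersection_angle": 12.0, "rmse_vs_lidar": 1.8},
    {"intersection_angle": 18.0, "rmse_vs_lidar": 1.1},
    {"intersection_angle": 22.0, "rmse_vs_lidar": 1.3},
    {"intersection_angle": 35.0, "rmse_vs_lidar": 2.9},
]

bins = [(0, 15), (15, 25), (25, 90)]
for lo, hi in bins:
    errs = [p["rmse_vs_lidar"] for p in pairs if lo <= p["intersection_angle"] < hi]
    if errs:
        print(f"{lo:>2}-{hi:<2} deg: mean RMSE {np.mean(errs):.2f} m over {len(errs)} pairs")
```

The same aggregation would be repeated over off-nadir angles, sun elevation and azimuth angles, and acquisition time differences.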
Automated 3D recovery from very high resolution multi-view satellite images
Title | Automated 3D recovery from very high resolution multi-view satellite images |
Authors | Rongjun Qin |
Abstract | This paper presents an automated pipeline for processing multi-view satellite images into 3D digital surface models (DSMs). The proposed pipeline performs automated geo-referencing and generates high-quality densely matched point clouds. In particular, a novel approach is developed that fuses multiple depth maps derived by stereo matching to generate high-quality 3D maps. By learning critical configurations of stereo pairs from sample LiDAR data, we rank the image pairs based on the proximity of their results to the sample data. Multiple depth maps derived from individual image pairs are fused with an adaptive 3D median filter that considers the image spectral similarities. We demonstrate that the proposed adaptive median filter generally delivers better results than a normal median filter, achieving an accuracy improvement of 0.36 meters RMSE in the best case. Results and analysis are presented in detail. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07475v2 |
https://arxiv.org/pdf/1905.07475v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-3d-recovery-from-very-high |
Repo | |
Framework | |
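
A minimal sketch of the fusion step under stated assumptions: several per-pixel depth candidates are combined with a median that only keeps candidates whose source images are spectrally similar to a reference image at that pixel. The arrays and the threshold `tau` are illustrative placeholders, not the paper's exact filter.

```python
import numpy as np

# Minimal sketch of adaptive median fusion of multiple depth maps, where the
# selection of candidates is driven by per-pixel spectral similarity.
# All arrays are small random placeholders.

rng = np.random.default_rng(1)
H, W, K = 3, 3, 5                                      # image size, number of depth maps
depths = rng.normal(20.0, 0.5, size=(K, H, W))         # K candidate depth maps
spectral_diff = rng.uniform(0.0, 1.0, size=(K, H, W))  # |I_k - I_ref| per pixel

def fuse(depths, spectral_diff, tau=0.5):
    fused = np.empty(depths.shape[1:])
    for i in range(depths.shape[1]):
        for j in range(depths.shape[2]):
            keep = spectral_diff[:, i, j] < tau        # adaptive candidate selection
            cand = depths[keep, i, j] if keep.any() else depths[:, i, j]
            fused[i, j] = np.median(cand)              # robust median fusion
    return fused

print(fuse(depths, spectral_diff))
```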
Sentiment Classification using N-gram IDF and Automated Machine Learning
Title | Sentiment Classification using N-gram IDF and Automated Machine Learning |
Authors | Rungroj Maipradit, Hideaki Hata, Kenichi Matsumoto |
Abstract | We propose a sentiment classification method with a general machine learning framework. For feature representation, n-gram IDF is used to extract software-engineering-related, dataset-specific, positive, neutral, and negative n-gram expressions. For classifiers, an automated machine learning tool is used. In a comparison using publicly available datasets, our method achieved the highest F1 scores for positive and negative sentences on all datasets. |
Tasks | Sentiment Analysis |
Published | 2019-04-27 |
URL | https://arxiv.org/abs/1904.12162v2 |
https://arxiv.org/pdf/1904.12162v2.pdf | |
PWC | https://paperswithcode.com/paper/sentiment-classification-using-n-gram-idf-and |
Repo | |
Framework | |
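
A minimal sketch of the overall recipe, with substitutions made plain: scikit-learn's TF-IDF over word n-grams stands in for the paper's n-gram IDF features, and a plain logistic regression stands in for the automated machine-learning tool.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Minimal sketch: n-gram features weighted by IDF feeding a classifier.
# TF-IDF over word n-grams stands in for the paper's n-gram IDF, and
# LogisticRegression stands in for the automated ML tool. Data are toy examples.

texts = ["this patch finally fixes the crash", "the build is broken again",
         "works as expected", "terrible API, confusing docs"]
labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 3)), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["the crash is fixed and the docs are clear"]))
```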
Learning from Observations Using a Single Video Demonstration and Human Feedback
Title | Learning from Observations Using a Single Video Demonstration and Human Feedback |
Authors | Sunil Gandhi, Tim Oates, Tinoosh Mohsenin, Nicholas Waytowich |
Abstract | In this paper, we present a method for learning from video demonstrations by using human feedback to construct a mapping between the standard representation of the agent and the visual representation of the demonstration. In this way, we leverage the advantages of both representations, i.e., we learn the policy using standard state representations, but are able to specify the expected behavior using a video demonstration. We train an autonomous agent using a single video demonstration and use human feedback (in the form of numerical similarity ratings) to map the standard representation to the visual representation with a neural network. We show the effectiveness of our method by teaching a hopper agent in MuJoCo to perform a backflip using a single video demonstration generated in MuJoCo as well as one from a real-world YouTube video of a person performing a backflip. Additionally, we show that our method can transfer to new tasks, such as hopping, with very little human feedback. |
Tasks | |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13392v1 |
https://arxiv.org/pdf/1909.13392v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-observations-using-a-single |
Repo | |
Framework | |
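
A minimal sketch of the mapping step, assuming placeholder dimensions and random data: a small network scores how similar an agent state is to a demonstration frame and is trained to match human numerical similarity ratings; the trained scorer could then serve as a reward signal for policy learning.

```python
import torch
import torch.nn as nn

# Minimal sketch: learn a similarity score between agent states and demo-frame
# features from human ratings. Dimensions, data, and architecture are placeholders.

STATE_DIM, FRAME_DIM = 11, 64              # e.g. hopper state size, frame feature size

scorer = nn.Sequential(
    nn.Linear(STATE_DIM + FRAME_DIM, 128), nn.ReLU(),
    nn.Linear(128, 1),
)
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

states = torch.randn(256, STATE_DIM)       # agent states
frames = torch.randn(256, FRAME_DIM)       # features of demo frames shown to raters
ratings = torch.rand(256, 1)               # human similarity ratings scaled to [0, 1]

for _ in range(100):
    pred = scorer(torch.cat([states, frames], dim=1))
    loss = nn.functional.mse_loss(pred, ratings)
    opt.zero_grad(); loss.backward(); opt.step()

# The trained scorer could then provide a reward signal for policy learning.
print(float(loss))
```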
Improving Route Choice Models by Incorporating Contextual Factors via Knowledge Distillation
Title | Improving Route Choice Models by Incorporating Contextual Factors via Knowledge Distillation |
Authors | Qun Liu, Supratik Mukhopadhyay, Yimin Zhu, Ravindra Gudishala, Sanaz Saeidi, Alimire Nabijiang |
Abstract | Route choice models predict the route choices of travelers traversing an urban area. Most route choice models link route characteristics of alternative routes to those chosen by the drivers. The models play an important role in the prediction of traffic levels on different routes and thus assist in the development of efficient traffic management strategies that minimize traffic delay and maximize effective utilization of the transport system. High-fidelity route choice models are required to predict traffic levels with higher accuracy. Existing route choice models do not take into account dynamic contextual conditions such as the occurrence of an accident, the socio-cultural and economic background of drivers, other human behaviors, the dynamic personal risk level, etc. As a result, they can only make predictions at an aggregate level and for a fixed set of contextual factors. For higher fidelity, it is highly desirable to use a model that captures the significance of subjective or contextual factors in route choice. This paper presents a novel approach for developing high-fidelity route choice models with increased predictive power by augmenting existing aggregate-level baseline models with information on drivers’ responses to contextual factors obtained from Stated Choice Experiments carried out in an Immersive Virtual Environment, through the use of knowledge distillation. |
Tasks | |
Published | 2019-03-27 |
URL | http://arxiv.org/abs/1903.11253v1 |
http://arxiv.org/pdf/1903.11253v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-route-choice-models-by |
Repo | |
Framework | |
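
A minimal sketch of a standard knowledge-distillation loss of the kind the abstract refers to, with random logits as placeholders: the baseline route-choice "student" is trained on observed choices while also matching the softened predictions of a context-aware "teacher" trained on the Stated Choice Experiment responses. The temperature and mixing weight are illustrative.

```python
import torch
import torch.nn.functional as F

# Minimal sketch of a temperature-scaled distillation loss. Logits, labels,
# temperature T, and mixing weight alpha are illustrative placeholders.

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    hard = F.cross_entropy(student_logits, labels)            # fit observed choices
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                               # match teacher's soft predictions
    return alpha * hard + (1 - alpha) * soft

student_logits = torch.randn(8, 4, requires_grad=True)        # 4 candidate routes
teacher_logits = torch.randn(8, 4)
labels = torch.randint(0, 4, (8,))                            # observed route choices
print(float(distillation_loss(student_logits, teacher_logits, labels)))
```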
Hierarchical Deep Stereo Matching on High-resolution Images
Title | Hierarchical Deep Stereo Matching on High-resolution Images |
Authors | Gengshan Yang, Joshua Manela, Michael Happold, Deva Ramanan |
Abstract | We explore the problem of real-time stereo matching on high-res imagery. Many state-of-the-art (SOTA) methods struggle to process high-res imagery because of memory constraints or speed limitations. To address this issue, we propose an end-to-end framework that searches for correspondences incrementally over a coarse-to-fine hierarchy. Because high-res stereo datasets are relatively rare, we introduce a dataset with high-res stereo pairs for both training and evaluation. Our approach achieved SOTA performance on Middlebury-v3 and KITTI-15 while running significantly faster than its competitors. The hierarchical design also naturally allows for anytime on-demand reports of disparity by capping intermediate coarse results, allowing us to accurately predict disparity for near-range structures with low latency (30ms). We demonstrate that the performance-vs-speed trade-off afforded by on-demand hierarchies may address sensing needs for time-critical applications such as autonomous driving. |
Tasks | Autonomous Driving, Stereo Matching |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06704v1 |
https://arxiv.org/pdf/1912.06704v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-deep-stereo-matching-on-high-1 |
Repo | |
Framework | |
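
A minimal sketch of coarse-to-fine disparity search along one scanline, assuming a synthetic image pair with a known disparity of 4: the coarse level searches the full (halved) range at half resolution, and the fine level only searches a small window around the upsampled coarse estimate, which is what keeps memory and compute low on high-resolution imagery.

```python
import numpy as np

# Minimal sketch of coarse-to-fine disparity search on one scanline of a
# synthetic stereo pair with true disparity 4. Real systems match image
# patches and build cost volumes; absolute pixel difference is used here
# purely for illustration.

rng = np.random.default_rng(0)
H, W, D_MAX = 32, 64, 16
left = rng.random((H, W))
right = np.roll(left, -4, axis=1)                 # left[x] matches right[x - 4]

def best_disparity(left_row, right_row, x, candidates):
    """Pick the candidate disparity with the smallest absolute difference."""
    cands = [d for d in candidates if 0 <= x - d < len(right_row)]
    if not cands:
        return 0
    costs = [abs(left_row[x] - right_row[x - d]) for d in cands]
    return cands[int(np.argmin(costs))]

row_l, row_r = left[16], right[16]
# Coarse level: half-resolution scanline, full (halved) disparity range.
coarse = [best_disparity(row_l[::2], row_r[::2], x, range(D_MAX // 2))
          for x in range(W // 2)]
# Fine level: refine around the upsampled coarse estimate only (+/- 2 pixels).
seed = np.repeat(np.array(coarse) * 2, 2)[:W]
fine = [best_disparity(row_l, row_r, x, range(max(0, s - 2), s + 3))
        for x, s in enumerate(seed)]
print(fine[8:16])                                 # recovers the true disparity, 4
```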
A Possible Reason for why Data-Driven Beats Theory-Driven Computer Vision
Title | A Possible Reason for why Data-Driven Beats Theory-Driven Computer Vision |
Authors | John K. Tsotsos, Iuliia Kotseruba, Alexander Andreopoulos, Yulong Wu |
Abstract | Why do some continue to wonder about the success and dominance of deep learning methods in computer vision and AI? Is it not enough that these methods provide practical solutions to many problems? Well no, it is not enough, at least for those who feel there should be a science that underpins all of this and that we should have a clear understanding of how this success was achieved. Here, this paper proposes that the dominance we are witnessing would not have been possible by the methods of deep learning alone: the tacit change has been the evolution of empirical practice in computer vision and AI over the past decades. We demonstrate this by examining the distribution of sensor settings in vision datasets and the performance of both classic and deep learning algorithms under various camera settings. This reveals a strong mismatch between the optimal performance ranges of classical theory-driven algorithms and the sensor setting distributions in common vision datasets, on which the data-driven models were trained. The head-to-head comparisons between data-driven and theory-driven models were therefore unknowingly biased against the theory-driven models. |
Tasks | |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10933v2 |
https://arxiv.org/pdf/1908.10933v2.pdf | |
PWC | https://paperswithcode.com/paper/a-possible-reason-for-why-data-driven-beats |
Repo | |
Framework | |
Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks
Title | Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks |
Authors | Ahmed T. Elthakeb, Prannoy Pilligundla, Alex Cloninger, Hadi Esmaeilzadeh |
Abstract | The deep layers of modern neural networks extract a rather rich set of features as an input propagates through the network. This paper sets out to harvest these rich intermediate representations for quantization with minimal accuracy loss while significantly reducing the memory footprint and compute intensity of the DNN. This paper utilizes knowledge distillation through the teacher-student paradigm (Hinton et al., 2015) in a novel setting that exploits the feature extraction capability of DNNs for higher-accuracy quantization. As such, our algorithm logically divides a pretrained full-precision DNN into multiple sections, each of which exposes intermediate features used to train a team of students independently in the quantized domain. This divide-and-conquer strategy, in fact, makes the training of each student section possible in isolation, while all these independently trained sections are later stitched together to form the equivalent fully quantized network. Our algorithm is a sectional approach to knowledge distillation and does not treat the intermediate representations as hints for pretraining before one knowledge distillation pass over the entire network (Romero et al., 2015). Experiments on various DNNs (AlexNet, LeNet, MobileNet, ResNet-18, ResNet-20, SVHN and VGG-11) show that this approach – called DCQ (Divide and Conquer Quantization) – on average improves the performance of a state-of-the-art quantized training technique, DoReFa-Net (Zhou et al., 2016), by 21.6% and 9.3% for binary and ternary quantization, respectively. Additionally, we show that incorporating DCQ into existing quantized training methods leads to improved accuracies compared to those previously reported by multiple state-of-the-art quantized training methods. |
Tasks | Quantization |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06033v4 |
https://arxiv.org/pdf/1906.06033v4.pdf | |
PWC | https://paperswithcode.com/paper/divide-and-conquer-leveraging-intermediate |
Repo | |
Framework | |
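
A minimal sketch of the sectional idea under stated assumptions: a pretrained full-precision "teacher" is split into sections and a quantized "student" section is trained to reproduce the teacher section's intermediate features. The straight-through ternary quantizer and the tiny networks below are illustrative stand-ins, not the paper's exact scheme.

```python
import torch
import torch.nn as nn

# Minimal sketch of sectional quantized training against intermediate features.
# The straight-through ternary quantizer and the tiny layers are placeholders.

def ste_ternary(w):
    snapped = torch.clamp(torch.round(w), -1.0, 1.0)   # ternary weights {-1, 0, 1}
    return w + (snapped - w).detach()                  # straight-through gradient

class QuantLinear(nn.Linear):
    def forward(self, x):
        return nn.functional.linear(x, ste_ternary(self.weight), self.bias)

teacher_section = nn.Sequential(nn.Linear(16, 32), nn.ReLU()).eval()
student_section = nn.Sequential(QuantLinear(16, 32), nn.ReLU())
opt = torch.optim.Adam(student_section.parameters(), lr=1e-3)

for _ in range(200):
    x = torch.randn(64, 16)                  # inputs reaching this section
    with torch.no_grad():
        target = teacher_section(x)          # teacher's intermediate features
    loss = nn.functional.mse_loss(student_section(x), target)
    opt.zero_grad(); loss.backward(); opt.step()

print(float(loss))
# Independently trained sections would later be stitched back together to
# form the fully quantized network.
```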
Toward Gender-Inclusive Coreference Resolution
Title | Toward Gender-Inclusive Coreference Resolution |
Authors | Yang Trista Cao, Hal Daumé III |
Abstract | Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systemic biases in coreference resolution systems, including biases that reinforce cis-normativity and can harm binary and non-binary trans (and cis) stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and investigate where in the machine learning pipeline such biases can enter a system. We inspect many existing datasets for trans-exclusionary biases, and develop two new datasets for interrogating bias in crowd annotations and in existing coreference resolution systems. Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we will build systems that fail for: quality of service, stereotyping, and over- or under-representation. |
Tasks | Coreference Resolution |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13913v3 |
https://arxiv.org/pdf/1910.13913v3.pdf | |
PWC | https://paperswithcode.com/paper/toward-gender-inclusive-coreference |
Repo | |
Framework | |
Text Readability Assessment for Second Language Learners
Title | Text Readability Assessment for Second Language Learners |
Authors | Menglin Xia, Ekaterina Kochmar, Ted Briscoe |
Abstract | This paper addresses the task of readability assessment for texts aimed at second language (L2) learners. One of the major challenges in this task is the lack of significantly sized level-annotated data. For the present work, we collected a dataset of CEFR-graded texts tailored for learners of English as an L2 and investigated text readability assessment for both native and L2 learners. We applied a generalization method to adapt models trained on larger native corpora to estimate text readability for learners, and explored domain adaptation and self-learning techniques to make use of the native data to improve system performance on the limited L2 data. In our experiments, the best-performing model for readability on learner texts achieves an accuracy of 0.797 and a PCC of 0.938. |
Tasks | Domain Adaptation |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07580v1 |
https://arxiv.org/pdf/1906.07580v1.pdf | |
PWC | https://paperswithcode.com/paper/text-readability-assessment-for-second-1 |
Repo | |
Framework | |
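
A minimal sketch of the self-learning step, with random features standing in for real readability features or document embeddings: a model trained on the small labeled L2 set pseudo-labels confident examples from the larger native corpus, which are then added to the training set. The classifier and confidence threshold are illustrative choices, not the paper's.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Minimal sketch of self-training: pseudo-label confident native examples and
# retrain. Features, labels, and the 0.4 threshold are placeholders.

rng = np.random.default_rng(0)
X_l2 = rng.normal(size=(40, 10))                 # small labeled L2 set
y_l2 = rng.integers(0, 5, size=40)               # CEFR-like level labels
X_native = rng.normal(size=(400, 10))            # larger unlabeled native corpus

model = LogisticRegression(max_iter=1000).fit(X_l2, y_l2)
proba = model.predict_proba(X_native)
confident = proba.max(axis=1) > 0.4              # keep only confident pseudo-labels
pseudo = model.predict(X_native)

X_aug = np.vstack([X_l2, X_native[confident]])
y_aug = np.concatenate([y_l2, pseudo[confident]])
model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
print(int(confident.sum()), "native texts pseudo-labelled")
```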
DiversityGAN: Diversity-Aware Vehicle Motion Prediction via Latent Semantic Sampling
Title | DiversityGAN: Diversity-Aware Vehicle Motion Prediction via Latent Semantic Sampling |
Authors | Xin Huang, Stephen G. McGill, Jonathan A. DeCastro, Luke Fletcher, John J. Leonard, Brian C. Williams, Guy Rosman |
Abstract | Vehicle trajectory prediction is crucial for autonomous driving and advanced driver assistant systems. While existing approaches may sample from a predicted distribution of vehicle trajectories, they lack the ability to explore it – a key ability for evaluating safety from a planning and verification perspective. In this work, we devise a novel approach for generating realistic and diverse vehicle trajectories. We extend the generative adversarial network (GAN) framework with a low-dimensional approximate semantic space, and shape that space to capture semantics such as merging and turning. We sample from this space in a way that mimics the predicted distribution, but allows us to control coverage of semantically distinct outcomes. We validate our approach on a publicly available dataset and show results that achieve state-of-the-art prediction performance, while providing improved coverage of the space of predicted trajectory semantics. |
Tasks | Autonomous Driving, motion prediction, Trajectory Prediction |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12736v2 |
https://arxiv.org/pdf/1911.12736v2.pdf | |
PWC | https://paperswithcode.com/paper/diversity-aware-vehicle-motion-prediction-via |
Repo | |
Framework | |
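
A minimal sketch of diversity-aware sampling in a low-dimensional semantic space, assuming a random linear projection as a stand-in for the learned mapping from the GAN latent space: draw many latent codes, project them to semantic coordinates, and pick a small set by farthest-point selection so that semantically distinct outcomes (e.g. merge vs. turn) are covered.

```python
import numpy as np

# Minimal sketch of covering semantically distinct trajectory outcomes.
# The linear projection is a placeholder for the learned semantic mapping.

rng = np.random.default_rng(0)
Z_DIM, S_DIM, N, K = 32, 2, 500, 6
latents = rng.normal(size=(N, Z_DIM))            # candidate GAN latent codes
proj = rng.normal(size=(Z_DIM, S_DIM))           # stand-in semantic projection
semantic = latents @ proj                        # low-dimensional semantic coordinates

chosen = [0]
for _ in range(K - 1):
    d = np.min(np.linalg.norm(semantic[:, None] - semantic[chosen], axis=2), axis=1)
    chosen.append(int(np.argmax(d)))             # farthest point in semantic space

diverse_latents = latents[chosen]                # these would be fed to the generator
print(chosen)
```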
Make Thunderbolts Less Frightening – Predicting Extreme Weather Using Deep Learning
Title | Make Thunderbolts Less Frightening – Predicting Extreme Weather Using Deep Learning |
Authors | Christian Schön, Jens Dittrich |
Abstract | Forecasting severe weather conditions is still a very challenging and computationally expensive task due to the enormous amount of data and the complexity of the underlying physics. Machine learning approaches, and especially deep learning, have however shown huge improvements in many research areas dealing with large datasets in recent years. In this work, we tackle one specific sub-problem of weather forecasting, namely the prediction of thunderstorms and lightning. We propose the use of a convolutional neural network architecture inspired by UNet++ and ResNet to predict thunderstorms as a binary classification problem based on satellite images and lightning recorded in the past. We achieve a probability of detection of more than 94% for lightning within the next 15 minutes while at the same time minimizing the false alarm ratio compared to previous approaches. |
Tasks | Weather Forecasting |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01277v2 |
https://arxiv.org/pdf/1912.01277v2.pdf | |
PWC | https://paperswithcode.com/paper/make-thunderbolts-less-frightening-predicting |
Repo | |
Framework | |
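
A minimal sketch of the task setup with a toy architecture and random tensors in place of real satellite and lightning data; the paper's model is UNet++/ResNet-inspired, which this placeholder does not reproduce. The final lines compute the probability of detection (recall) on the training batch, purely for illustration.

```python
import torch
import torch.nn as nn

# Minimal sketch: a small CNN maps a stack of past satellite/lightning channels
# to the probability of lightning in the next 15 minutes. Architecture and data
# are toy placeholders, not the paper's model or dataset.

model = nn.Sequential(
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

images = torch.randn(16, 4, 64, 64)            # 4 input channels per patch
labels = torch.randint(0, 2, (16, 1)).float()  # lightning within 15 min: yes/no

for _ in range(20):
    logits = model(images)
    loss = loss_fn(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()

# Probability of detection (recall) on the training batch, for illustration only.
pred = (torch.sigmoid(model(images)) > 0.5).float()
pod = (pred * labels).sum() / labels.sum().clamp(min=1)
print(float(pod))
```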