Paper Group ANR 1414
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports. Local Score Dependent Model Explanation for Time Dependent Covariates. Learning Pixel Representations for Generic Segmentation. Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping. Automated 3D recovery from very high r …
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports
Title | Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports |
Authors | Yuhao Zhang, Derek Merck, Emily Bao Tsai, Christopher D. Manning, Curtis P. Langlotz |
Abstract | Neural abstractive summarization models are able to generate summaries which have high overlap with human references. However, existing models are not optimized for factual correctness, a critical metric in real-world applications. In this work, we develop a general framework where we evaluate the factual correctness of a generated summary by fact-checking it against its reference using an information extraction module. We further propose a training strategy which optimizes a neural summarization model with a factual correctness reward via reinforcement learning. We apply the proposed method to the summarization of radiology reports, where factual correctness is a key requirement. On two separate datasets collected from real hospitals, we show via both automatic and human evaluation that the proposed approach substantially improves the factual correctness and overall quality of outputs over a competitive neural summarization system. |
Tasks | Abstractive Text Summarization |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02541v2 |
https://arxiv.org/pdf/1911.02541v2.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-the-factual-correctness-of-a |
Repo | |
Framework | |
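
A minimal sketch of the factual-correctness reward described above, assuming a toy keyword-based `extract_facts` as a stand-in for the paper's trained information-extraction module: the reward is the F1 overlap between facts extracted from the generated and reference summaries, which could then be mixed with an overlap term in the reinforcement-learning objective.

```python
# Minimal sketch of a factual-correctness reward. The keyword-based
# extract_facts() is a hypothetical stand-in for the paper's trained
# information-extraction module; findings and inputs are illustrative.

def extract_facts(summary: str) -> set:
    findings = ["edema", "effusion", "pneumonia", "pneumothorax", "cardiomegaly"]
    text = summary.lower()
    return {f for f in findings if f in text}

def factual_f1(generated: str, reference: str) -> float:
    gen, ref = extract_facts(generated), extract_facts(reference)
    if not gen and not ref:
        return 1.0
    if not gen or not ref:
        return 0.0
    precision = len(gen & ref) / len(gen)
    recall = len(gen & ref) / len(ref)
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# A combined RL reward could weight this against a fluency/overlap term, e.g.
# reward = w_overlap * rouge_l(gen, ref) + w_fact * factual_f1(gen, ref)
print(factual_f1("Mild edema is seen.", "There is pulmonary edema and a small effusion."))
```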
Local Score Dependent Model Explanation for Time Dependent Covariates
Title | Local Score Dependent Model Explanation for Time Dependent Covariates |
Authors | Xochitl Watts, Freddy Lecue |
Abstract | The use of deep neural networks to make high-risk decisions creates a need for global and local explanations so that users and experts have confidence in the modeling algorithms. We introduce a novel technique to find global and local explanations for time series data used in binary classification machine learning systems. We identify the most salient of the original features used by a black box model to distinguish between classes. The explanation can be made on categorical, continuous, and time series data and can be generalized to any binary classification model. The analysis is conducted on time series data to train a long short-term memory deep neural network and uses the time dependent structure of the underlying features in the explanation. The proposed technique attributes weights to features to explain an observation's risk of belonging to a class as a multiplicative factor of a base hazard rate. We use a variation of Cox Proportional Hazards regression, a Generalized Additive Model, to explain the effect of variables upon the probability of an in-class response for a score output from the black box model. The covariates incorporate time dependence structure in the features so the explanation is inclusive of the underlying time series data structure. |
Tasks | Time Series |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04839v1 |
https://arxiv.org/pdf/1908.04839v1.pdf | |
PWC | https://paperswithcode.com/paper/local-score-dependent-model-explanation-for |
Repo | |
Framework | |
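
A small numerical illustration of the explanation the abstract describes, with made-up coefficients rather than values from the paper: in a Cox-style model the hazard is the baseline hazard multiplied by exp(beta · x), so each exp(beta_j x_j) can be read as a per-feature multiplicative risk factor at time t.

```python
import numpy as np

# Illustration of the multiplicative explanation: hazard(t) = h0(t) * exp(beta . x(t)),
# so exp(beta_j * x_j) is the per-feature multiplicative risk factor.
# The coefficients and covariate values below are made up, not from the paper.

beta = np.array([0.8, -0.3, 0.05])        # fitted weights for three covariates
x_t = np.array([1.0, 2.5, 40.0])          # one observation's time-dependent covariates at time t
baseline_hazard = 0.02                    # h0(t) at the same time step

per_feature_factor = np.exp(beta * x_t)   # contribution of each covariate
total_factor = per_feature_factor.prod()  # equals exp(beta @ x_t)
hazard = baseline_hazard * total_factor

print("per-feature multiplicative factors:", per_feature_factor)
print("hazard at t:", hazard)
```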
Learning Pixel Representations for Generic Segmentation
Title | Learning Pixel Representations for Generic Segmentation |
Authors | Oran Shayer, Michael Lindenbaum |
Abstract | Deep learning approaches to generic (non-semantic) segmentation have so far been indirect and relied on edge detection. This is in contrast to semantic segmentation, where DNNs are applied directly. We propose an alternative approach called Deep Generic Segmentation (DGS) and try to follow the path used for semantic segmentation. Our main contribution is a new method for learning a pixel-wise representation that reflects segment relatedness. This representation is combined with a CRF to yield the segmentation algorithm. We show that we are able to learn meaningful representations that improve segmentation quality and that the representations themselves achieve state-of-the-art segment similarity scores. The segmentation results are competitive and promising. |
Tasks | Edge Detection, Semantic Segmentation |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11735v1 |
https://arxiv.org/pdf/1909.11735v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-pixel-representations-for-generic |
Repo | |
Framework | |
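
A minimal sketch of how learned pixel embeddings can be turned into pairwise affinities, the quantity a CRF's pairwise potentials would be built from; the random embedding map below is a placeholder for the output of the paper's trained network.

```python
import numpy as np

# Minimal sketch: per-pixel embeddings -> pairwise affinity. The random
# embedding map stands in for the representation learned by the network.

H, W, D = 4, 4, 8
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(H, W, D))          # per-pixel D-dim representation

def affinity(p, q, sigma=1.0):
    """Similarity of two pixels: high when their embeddings are close."""
    d2 = np.sum((embeddings[p] - embeddings[q]) ** 2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Affinity between horizontally adjacent pixels; a CRF would penalize assigning
# different segment labels to pairs with high affinity.
print(affinity((1, 1), (1, 2)))
```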
Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping
Title | Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping |
Authors | Rongjun Qin |
Abstract | Although advanced dense image matching (DIM) algorithms are nowadays able to produce LiDAR (Light Detection And Ranging) comparable dense point clouds from satellite stereo images, the accuracy and completeness of such point clouds heavily depend on the geometric parameters of the satellite stereo images. The intersection angle between two images is normally seen as the most important factor in stereo data acquisition, as the state-of-the-art DIM algorithms work best on narrow-baseline (smaller intersection angle) stereos (e.g., Semi-Global Matching regards 15-25 degrees as a good intersection angle). This factor is in line with the traditional aerial photogrammetry configuration, as the intersection angle directly relates to the base-height ratio and texture distortion in the parallax direction, thus affecting both the horizontal and vertical accuracy. However, our experiments found that even with very similar (and good) intersection angles, the same DIM algorithm applied on different stereo pairs (of the same area) produced point clouds with dramatically different accuracy as compared to the ground truth LiDAR data. This raises a very practical question that is often asked by practitioners: what factors constitute a good satellite stereo pair, such that it produces accurate and optimal results for mapping purposes? In this work, we provide a comprehensive analysis of this matter by performing stereo matching over 1,000 satellite stereo pairs with different acquisition parameters, including their intersection angles, off-nadir angles, sun elevation & azimuth angles, as well as time differences, to offer a thorough answer to this question. This work will potentially provide a valuable reference to researchers working on multi-view satellite image reconstruction, as well as to industrial practitioners minimizing costs for high-quality large-scale mapping. |
Tasks | 3D Reconstruction, Image Reconstruction, Stereo Matching, Stereo Matching Hand |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07476v1 |
https://arxiv.org/pdf/1905.07476v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-critical-parameters-of-satellite |
Repo | |
Framework | |
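
A minimal sketch of the kind of analysis the abstract describes, with made-up records standing in for the 1,000 evaluated stereo pairs: bucket pairs by an acquisition parameter (intersection angle here) and compare their point-cloud error against LiDAR within each bucket.

```python
import numpy as np

# Minimal sketch: group stereo pairs by intersection angle and compare the
# resulting point-cloud RMSE against LiDAR per group. Records are placeholders.

pairs = [
    {"intersection_angle": 12.0, "rmse_vs_lidar": 1.8},
    {"intersection_angle": 18.0, "rmse_vs_lidar": 1.1},
    {"intersection_angle": 22.0, "rmse_vs_lidar": 1.3},
    {"intersection_angle": 35.0, "rmse_vs_lidar": 2.9},
]

bins = [(0, 15), (15, 25), (25, 90)]
for lo, hi in bins:
    errs = [p["rmse_vs_lidar"] for p in pairs if lo <= p["intersection_angle"] < hi]
    if errs:
        print(f"{lo:>2}-{hi:<2} deg: mean RMSE {np.mean(errs):.2f} m over {len(errs)} pairs")
```

The same aggregation would be repeated over off-nadir angles, sun elevation and azimuth angles, and acquisition time differences.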
Automated 3D recovery from very high resolution multi-view satellite images
Title | Automated 3D recovery from very high resolution multi-view satellite images |
Authors | Rongjun Qin |
Abstract | This paper presents an automated pipeline for processing multi-view satellite images into 3D digital surface models (DSMs). The proposed pipeline performs automated geo-referencing and generates high-quality densely matched point clouds. In particular, a novel approach is developed that fuses multiple depth maps derived by stereo matching to generate high-quality 3D maps. By learning critical configurations of stereo pairs from sample LiDAR data, we rank the image pairs based on the proximity of their results to the sample data. Multiple depth maps derived from individual image pairs are fused with an adaptive 3D median filter that considers the image spectral similarities. We demonstrate that the proposed adaptive median filter generally delivers better results than a normal median filter, achieving an accuracy improvement of 0.36 meters RMSE in the best case. Results and analysis are presented in detail. |
Tasks | Stereo Matching, Stereo Matching Hand |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07475v2 |
https://arxiv.org/pdf/1905.07475v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-3d-recovery-from-very-high |
Repo | |
Framework | |
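
A minimal sketch of the fusion step under stated assumptions: several per-pixel depth candidates are combined with a median that only keeps candidates whose source images are spectrally similar to a reference image at that pixel. The arrays and the threshold `tau` are illustrative placeholders, not the paper's exact filter.

```python
import numpy as np

# Minimal sketch of adaptive median fusion of multiple depth maps, where the
# selection of candidates is driven by per-pixel spectral similarity.
# All arrays are small random placeholders.

rng = np.random.default_rng(1)
H, W, K = 3, 3, 5                                      # image size, number of depth maps
depths = rng.normal(20.0, 0.5, size=(K, H, W))         # K candidate depth maps
spectral_diff = rng.uniform(0.0, 1.0, size=(K, H, W))  # |I_k - I_ref| per pixel

def fuse(depths, spectral_diff, tau=0.5):
    fused = np.empty(depths.shape[1:])
    for i in range(depths.shape[1]):
        for j in range(depths.shape[2]):
            keep = spectral_diff[:, i, j] < tau        # adaptive candidate selection
            cand = depths[keep, i, j] if keep.any() else depths[:, i, j]
            fused[i, j] = np.median(cand)              # robust median fusion
    return fused

print(fuse(depths, spectral_diff))
```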
Sentiment Classification using N-gram IDF and Automated Machine Learning
Title | Sentiment Classification using N-gram IDF and Automated Machine Learning |
Authors | Rungroj Maipradit, Hideaki Hata, Kenichi Matsumoto |
Abstract | We propose a sentiment classification method with a general machine learning framework. For feature representation, n-gram IDF is used to extract software-engineering-related, dataset-specific, positive, neutral, and negative n-gram expressions. For classifiers, an automated machine learning tool is used. In a comparison using publicly available datasets, our method achieved the highest F1 scores for positive and negative sentences on all datasets. |
Tasks | Sentiment Analysis |
Published | 2019-04-27 |
URL | https://arxiv.org/abs/1904.12162v2 |
https://arxiv.org/pdf/1904.12162v2.pdf | |
PWC | https://paperswithcode.com/paper/sentiment-classification-using-n-gram-idf-and |
Repo | |
Framework | |
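
A minimal sketch of the overall recipe, with substitutions made plain: scikit-learn's TF-IDF over word n-grams stands in for the paper's n-gram IDF features, and a plain logistic regression stands in for the automated machine-learning tool.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Minimal sketch: n-gram features weighted by IDF feeding a classifier.
# TF-IDF over word n-grams stands in for the paper's n-gram IDF, and
# LogisticRegression stands in for the automated ML tool. Data are toy examples.

texts = ["this patch finally fixes the crash", "the build is broken again",
         "works as expected", "terrible API, confusing docs"]
labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 3)), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["the crash is fixed and the docs are clear"]))
```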
Learning from Observations Using a Single Video Demonstration and Human Feedback
Title | Learning from Observations Using a Single Video Demonstration and Human Feedback |
Authors | Sunil Gandhi, Tim Oates, Tinoosh Mohsenin, Nicholas Waytowich |
Abstract | In this paper, we present a method for learning from video demonstrations by using human feedback to construct a mapping between the standard representation of the agent and the visual representation of the demonstration. In this way, we leverage the advantages of both representations, i.e., we learn the policy using standard state representations, but are able to specify the expected behavior using a video demonstration. We train an autonomous agent using a single video demonstration and use human feedback (in the form of numerical similarity ratings) to map the standard representation to the visual representation with a neural network. We show the effectiveness of our method by teaching a hopper agent in MuJoCo to perform a backflip using a single video demonstration generated in MuJoCo as well as one from a real-world YouTube video of a person performing a backflip. Additionally, we show that our method can transfer to new tasks, such as hopping, with very little human feedback. |
Tasks | |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13392v1 |
https://arxiv.org/pdf/1909.13392v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-observations-using-a-single |
Repo | |
Framework | |
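
A minimal sketch of the mapping step, assuming placeholder dimensions and random data: a small network scores how similar an agent state is to a demonstration frame and is trained to match human numerical similarity ratings; the trained scorer could then serve as a reward signal for policy learning.

```python
import torch
import torch.nn as nn

# Minimal sketch: learn a similarity score between agent states and demo-frame
# features from human ratings. Dimensions, data, and architecture are placeholders.

STATE_DIM, FRAME_DIM = 11, 64              # e.g. hopper state size, frame feature size

scorer = nn.Sequential(
    nn.Linear(STATE_DIM + FRAME_DIM, 128), nn.ReLU(),
    nn.Linear(128, 1),
)
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)

states = torch.randn(256, STATE_DIM)       # agent states
frames = torch.randn(256, FRAME_DIM)       # features of demo frames shown to raters
ratings = torch.rand(256, 1)               # human similarity ratings scaled to [0, 1]

for _ in range(100):
    pred = scorer(torch.cat([states, frames], dim=1))
    loss = nn.functional.mse_loss(pred, ratings)
    opt.zero_grad(); loss.backward(); opt.step()

# The trained scorer could then provide a reward signal for policy learning.
print(float(loss))
```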
Improving Route Choice Models by Incorporating Contextual Factors via Knowledge Distillation
Title | Improving Route Choice Models by Incorporating Contextual Factors via Knowledge Distillation |
Authors | Qun Liu, Supratik Mukhopadhyay, Yimin Zhu, Ravindra Gudishala, Sanaz Saeidi, Alimire Nabijiang |
Abstract | Route choice models predict the route choices of travelers traversing an urban area. Most route choice models link route characteristics of alternative routes to those chosen by the drivers. The models play an important role in the prediction of traffic levels on different routes and thus assist in the development of efficient traffic management strategies that minimize traffic delay and maximize effective utilization of the transport system. High-fidelity route choice models are required to predict traffic levels with higher accuracy. Existing route choice models do not take into account dynamic contextual conditions such as the occurrence of an accident, the socio-cultural and economic background of drivers, other human behaviors, the dynamic personal risk level, etc. As a result, they can only make predictions at an aggregate level and for a fixed set of contextual factors. For higher fidelity, it is highly desirable to use a model that captures the significance of subjective or contextual factors in route choice. This paper presents a novel approach for developing high-fidelity route choice models with increased predictive power by augmenting existing aggregate-level baseline models with information on drivers’ responses to contextual factors obtained from Stated Choice Experiments carried out in an Immersive Virtual Environment, through the use of knowledge distillation. |
Tasks | |
Published | 2019-03-27 |
URL | http://arxiv.org/abs/1903.11253v1 |
http://arxiv.org/pdf/1903.11253v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-route-choice-models-by |
Repo | |
Framework | |
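
A minimal sketch of a standard knowledge-distillation loss of the kind the abstract refers to, with random logits as placeholders: the baseline route-choice "student" is trained on observed choices while also matching the softened predictions of a context-aware "teacher" trained on the Stated Choice Experiment responses. The temperature and mixing weight are illustrative.

```python
import torch
import torch.nn.functional as F

# Minimal sketch of a temperature-scaled distillation loss. Logits, labels,
# temperature T, and mixing weight alpha are illustrative placeholders.

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    hard = F.cross_entropy(student_logits, labels)            # fit observed choices
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                               # match teacher's soft predictions
    return alpha * hard + (1 - alpha) * soft

student_logits = torch.randn(8, 4, requires_grad=True)        # 4 candidate routes
teacher_logits = torch.randn(8, 4)
labels = torch.randint(0, 4, (8,))                            # observed route choices
print(float(distillation_loss(student_logits, teacher_logits, labels)))
```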
Hierarchical Deep Stereo Matching on High-resolution Images
Title | Hierarchical Deep Stereo Matching on High-resolution Images |
Authors | Gengshan Yang, Joshua Manela, Michael Happold, Deva Ramanan |
Abstract | We explore the problem of real-time stereo matching on high-res imagery. Many state-of-the-art (SOTA) methods struggle to process high-res imagery because of memory constraints or speed limitations. To address this issue, we propose an end-to-end framework that searches for correspondences incrementally over a coarse-to-fine hierarchy. Because high-res stereo datasets are relatively rare, we introduce a dataset with high-res stereo pairs for both training and evaluation. Our approach achieved SOTA performance on Middlebury-v3 and KITTI-15 while running significantly faster than its competitors. The hierarchical design also naturally allows for anytime on-demand reports of disparity by capping intermediate coarse results, allowing us to accurately predict disparity for near-range structures with low latency (30ms). We demonstrate that the performance-vs-speed trade-off afforded by on-demand hierarchies may address sensing needs for time-critical applications such as autonomous driving. |
Tasks | Autonomous Driving, Stereo Matching |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06704v1 |
https://arxiv.org/pdf/1912.06704v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-deep-stereo-matching-on-high-1 |
Repo | |
Framework | |
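
A minimal sketch of coarse-to-fine disparity search along one scanline, assuming a synthetic image pair with a known disparity of 4: the coarse level searches the full (halved) range at half resolution, and the fine level only searches a small window around the upsampled coarse estimate, which is what keeps memory and compute low on high-resolution imagery.

```python
import numpy as np

# Minimal sketch of coarse-to-fine disparity search on one scanline of a
# synthetic stereo pair with true disparity 4. Real systems match image
# patches and build cost volumes; absolute pixel difference is used here
# purely for illustration.

rng = np.random.default_rng(0)
H, W, D_MAX = 32, 64, 16
left = rng.random((H, W))
right = np.roll(left, -4, axis=1)                 # left[x] matches right[x - 4]

def best_disparity(left_row, right_row, x, candidates):
    """Pick the candidate disparity with the smallest absolute difference."""
    cands = [d for d in candidates if 0 <= x - d < len(right_row)]
    if not cands:
        return 0
    costs = [abs(left_row[x] - right_row[x - d]) for d in cands]
    return cands[int(np.argmin(costs))]

row_l, row_r = left[16], right[16]
# Coarse level: half-resolution scanline, full (halved) disparity range.
coarse = [best_disparity(row_l[::2], row_r[::2], x, range(D_MAX // 2))
          for x in range(W // 2)]
# Fine level: refine around the upsampled coarse estimate only (+/- 2 pixels).
seed = np.repeat(np.array(coarse) * 2, 2)[:W]
fine = [best_disparity(row_l, row_r, x, range(max(0, s - 2), s + 3))
        for x, s in enumerate(seed)]
print(fine[8:16])                                 # recovers the true disparity, 4
```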
A Possible Reason for why Data-Driven Beats Theory-Driven Computer Vision
Title | A Possible Reason for why Data-Driven Beats Theory-Driven Computer Vision |
Authors | John K. Tsotsos, Iuliia Kotseruba, Alexander Andreopoulos, Yulong Wu |
Abstract | Why do some continue to wonder about the success and dominance of deep learning methods in computer vision and AI? Is it not enough that these methods provide practical solutions to many problems? Well no, it is not enough, at least for those who feel there should be a science that underpins all of this and that we should have a clear understanding of how this success was achieved. Here, this paper proposes that the dominance we are witnessing would not have been possible by the methods of deep learning alone: the tacit change has been the evolution of empirical practice in computer vision and AI over the past decades. We demonstrate this by examining the distribution of sensor settings in vision datasets and the performance of both classic and deep learning algorithms under various camera settings. This reveals a strong mismatch between the optimal performance ranges of classical theory-driven algorithms and the sensor setting distributions in common vision datasets, on which the data-driven models were trained. The head-to-head comparisons between data-driven and theory-driven models were therefore unknowingly biased against the theory-driven models. |
Tasks | |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.10933v2 |
https://arxiv.org/pdf/1908.10933v2.pdf | |
PWC | https://paperswithcode.com/paper/a-possible-reason-for-why-data-driven-beats |
Repo | |
Framework | |
Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks
Title | Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks |
Authors | Ahmed T. Elthakeb, Prannoy Pilligundla, Alex Cloninger, Hadi Esmaeilzadeh |
Abstract | The deep layers of modern neural networks extract a rather rich set of features as an input propagates through the network. This paper sets out to harvest these rich intermediate representations for quantization with minimal accuracy loss while significantly reducing the memory footprint and compute intensity of the DNN. This paper utilizes knowledge distillation through the teacher-student paradigm (Hinton et al., 2015) in a novel setting that exploits the feature extraction capability of DNNs for higher-accuracy quantization. As such, our algorithm logically divides a pretrained full-precision DNN into multiple sections, each of which exposes intermediate features used to train a team of students independently in the quantized domain. This divide-and-conquer strategy, in fact, makes the training of each student section possible in isolation, while all these independently trained sections are later stitched together to form the equivalent fully quantized network. Our algorithm is a sectional approach to knowledge distillation and does not treat the intermediate representations as hints for pretraining before one knowledge distillation pass over the entire network (Romero et al., 2015). Experiments on various DNNs (AlexNet, LeNet, MobileNet, ResNet-18, ResNet-20, SVHN and VGG-11) show that this approach – called DCQ (Divide and Conquer Quantization) – on average improves the performance of a state-of-the-art quantized training technique, DoReFa-Net (Zhou et al., 2016), by 21.6% and 9.3% for binary and ternary quantization, respectively. Additionally, we show that incorporating DCQ into existing quantized training methods leads to improved accuracies compared to those previously reported by multiple state-of-the-art quantized training methods. |
Tasks | Quantization |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06033v4 |
https://arxiv.org/pdf/1906.06033v4.pdf | |
PWC | https://paperswithcode.com/paper/divide-and-conquer-leveraging-intermediate |
Repo | |
Framework | |
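
A minimal sketch of the sectional idea under stated assumptions: a pretrained full-precision "teacher" is split into sections and a quantized "student" section is trained to reproduce the teacher section's intermediate features. The straight-through ternary quantizer and the tiny networks below are illustrative stand-ins, not the paper's exact scheme.

```python
import torch
import torch.nn as nn

# Minimal sketch of sectional quantized training against intermediate features.
# The straight-through ternary quantizer and the tiny layers are placeholders.

def ste_ternary(w):
    snapped = torch.clamp(torch.round(w), -1.0, 1.0)   # ternary weights {-1, 0, 1}
    return w + (snapped - w).detach()                  # straight-through gradient

class QuantLinear(nn.Linear):
    def forward(self, x):
        return nn.functional.linear(x, ste_ternary(self.weight), self.bias)

teacher_section = nn.Sequential(nn.Linear(16, 32), nn.ReLU()).eval()
student_section = nn.Sequential(QuantLinear(16, 32), nn.ReLU())
opt = torch.optim.Adam(student_section.parameters(), lr=1e-3)

for _ in range(200):
    x = torch.randn(64, 16)                  # inputs reaching this section
    with torch.no_grad():
        target = teacher_section(x)          # teacher's intermediate features
    loss = nn.functional.mse_loss(student_section(x), target)
    opt.zero_grad(); loss.backward(); opt.step()

print(float(loss))
# Independently trained sections would later be stitched back together to
# form the fully quantized network.
```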
Toward Gender-Inclusive Coreference Resolution
Title | Toward Gender-Inclusive Coreference Resolution |
Authors | Yang Trista Cao, Hal Daumé III |
Abstract | Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systemic biases in coreference resolution systems, including biases that reinforce cis-normativity and can harm binary and non-binary trans (and cis) stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and investigate where in the machine learning pipeline such biases can enter a system. We inspect many existing datasets for trans-exclusionary biases, and develop two new datasets for interrogating bias in crowd annotations and in existing coreference resolution systems. Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we will build systems that fail for: quality of service, stereotyping, and over- or under-representation. |
Tasks | Coreference Resolution |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13913v3 |
https://arxiv.org/pdf/1910.13913v3.pdf | |
PWC | https://paperswithcode.com/paper/toward-gender-inclusive-coreference |
Repo | |
Framework | |
Text Readability Assessment for Second Language Learners
Title | Text Readability Assessment for Second Language Learners |
Authors | Menglin Xia, Ekaterina Kochmar, Ted Briscoe |
Abstract | This paper addresses the task of readability assessment for texts aimed at second language (L2) learners. One of the major challenges in this task is the lack of significantly sized level-annotated data. For the present work, we collected a dataset of CEFR-graded texts tailored for learners of English as an L2 and investigated text readability assessment for both native and L2 learners. We applied a generalization method to adapt models trained on larger native corpora to estimate text readability for learners, and explored domain adaptation and self-learning techniques to make use of the native data to improve system performance on the limited L2 data. In our experiments, the best-performing model for readability on learner texts achieves an accuracy of 0.797 and a PCC of 0.938. |
Tasks | Domain Adaptation |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07580v1 |
https://arxiv.org/pdf/1906.07580v1.pdf | |
PWC | https://paperswithcode.com/paper/text-readability-assessment-for-second-1 |
Repo | |
Framework | |
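
A minimal sketch of the self-learning step, with random features standing in for real readability features or document embeddings: a model trained on the small labeled L2 set pseudo-labels confident examples from the larger native corpus, which are then added to the training set. The classifier and confidence threshold are illustrative choices, not the paper's.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Minimal sketch of self-training: pseudo-label confident native examples and
# retrain. Features, labels, and the 0.4 threshold are placeholders.

rng = np.random.default_rng(0)
X_l2 = rng.normal(size=(40, 10))                 # small labeled L2 set
y_l2 = rng.integers(0, 5, size=40)               # CEFR-like level labels
X_native = rng.normal(size=(400, 10))            # larger unlabeled native corpus

model = LogisticRegression(max_iter=1000).fit(X_l2, y_l2)
proba = model.predict_proba(X_native)
confident = proba.max(axis=1) > 0.4              # keep only confident pseudo-labels
pseudo = model.predict(X_native)

X_aug = np.vstack([X_l2, X_native[confident]])
y_aug = np.concatenate([y_l2, pseudo[confident]])
model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
print(int(confident.sum()), "native texts pseudo-labelled")
```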
DiversityGAN: Diversity-Aware Vehicle Motion Prediction via Latent Semantic Sampling
Title | DiversityGAN: Diversity-Aware Vehicle Motion Prediction via Latent Semantic Sampling |
Authors | Xin Huang, Stephen G. McGill, Jonathan A. DeCastro, Luke Fletcher, John J. Leonard, Brian C. Williams, Guy Rosman |
Abstract | Vehicle trajectory prediction is crucial for autonomous driving and advanced driver assistant systems. While existing approaches may sample from a predicted distribution of vehicle trajectories, they lack the ability to explore it – a key ability for evaluating safety from a planning and verification perspective. In this work, we devise a novel approach for generating realistic and diverse vehicle trajectories. We extend the generative adversarial network (GAN) framework with a low-dimensional approximate semantic space, and shape that space to capture semantics such as merging and turning. We sample from this space in a way that mimics the predicted distribution, but allows us to control coverage of semantically distinct outcomes. We validate our approach on a publicly available dataset and show results that achieve state-of-the-art prediction performance, while providing improved coverage of the space of predicted trajectory semantics. |
Tasks | Autonomous Driving, motion prediction, Trajectory Prediction |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12736v2 |
https://arxiv.org/pdf/1911.12736v2.pdf | |
PWC | https://paperswithcode.com/paper/diversity-aware-vehicle-motion-prediction-via |
Repo | |
Framework | |
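
A minimal sketch of diversity-aware sampling in a low-dimensional semantic space, assuming a random linear projection as a stand-in for the learned mapping from the GAN latent space: draw many latent codes, project them to semantic coordinates, and pick a small set by farthest-point selection so that semantically distinct outcomes (e.g. merge vs. turn) are covered.

```python
import numpy as np

# Minimal sketch of covering semantically distinct trajectory outcomes.
# The linear projection is a placeholder for the learned semantic mapping.

rng = np.random.default_rng(0)
Z_DIM, S_DIM, N, K = 32, 2, 500, 6
latents = rng.normal(size=(N, Z_DIM))            # candidate GAN latent codes
proj = rng.normal(size=(Z_DIM, S_DIM))           # stand-in semantic projection
semantic = latents @ proj                        # low-dimensional semantic coordinates

chosen = [0]
for _ in range(K - 1):
    d = np.min(np.linalg.norm(semantic[:, None] - semantic[chosen], axis=2), axis=1)
    chosen.append(int(np.argmax(d)))             # farthest point in semantic space

diverse_latents = latents[chosen]                # these would be fed to the generator
print(chosen)
```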
Make Thunderbolts Less Frightening – Predicting Extreme Weather Using Deep Learning
Title | Make Thunderbolts Less Frightening – Predicting Extreme Weather Using Deep Learning |
Authors | Christian Schön, Jens Dittrich |
Abstract | Forecasting severe weather conditions is still a very challenging and computationally expensive task due to the enormous amount of data and the complexity of the underlying physics. Machine learning approaches, and especially deep learning, have however shown huge improvements in many research areas dealing with large datasets in recent years. In this work, we tackle one specific sub-problem of weather forecasting, namely the prediction of thunderstorms and lightning. We propose the use of a convolutional neural network architecture inspired by UNet++ and ResNet to predict thunderstorms as a binary classification problem based on satellite images and lightning recorded in the past. We achieve a probability of detection of more than 94% for lightning within the next 15 minutes while at the same time minimizing the false alarm ratio compared to previous approaches. |
Tasks | Weather Forecasting |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01277v2 |
https://arxiv.org/pdf/1912.01277v2.pdf | |
PWC | https://paperswithcode.com/paper/make-thunderbolts-less-frightening-predicting |
Repo | |
Framework | |
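
A minimal sketch of the task setup with a toy architecture and random tensors in place of real satellite and lightning data; the paper's model is UNet++/ResNet-inspired, which this placeholder does not reproduce. The final lines compute the probability of detection (recall) on the training batch, purely for illustration.

```python
import torch
import torch.nn as nn

# Minimal sketch: a small CNN maps a stack of past satellite/lightning channels
# to the probability of lightning in the next 15 minutes. Architecture and data
# are toy placeholders, not the paper's model or dataset.

model = nn.Sequential(
    nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

images = torch.randn(16, 4, 64, 64)            # 4 input channels per patch
labels = torch.randint(0, 2, (16, 1)).float()  # lightning within 15 min: yes/no

for _ in range(20):
    logits = model(images)
    loss = loss_fn(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()

# Probability of detection (recall) on the training batch, for illustration only.
pred = (torch.sigmoid(model(images)) > 0.5).float()
pod = (pred * labels).sum() / labels.sum().clamp(min=1)
print(float(pod))
```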