Paper Group ANR 132
Reinforcement Learning with Wasserstein Distance Regularisation, with Applications to Multipolicy Learning
Title | Reinforcement Learning with Wasserstein Distance Regularisation, with Applications to Multipolicy Learning |
Authors | Mohammed Amin Abdullah, Aldo Pacchiano, Moez Draief |
Abstract | We describe an application of the Wasserstein distance to Reinforcement Learning. The Wasserstein distance in question is between the distribution of mappings of trajectories of a policy into some metric space, and some other fixed distribution (which may, for example, come from another policy). Different policies induce different distributions, so given an underlying metric, the Wasserstein distance quantifies how different policies are. This can be used to learn multiple policies which are different in terms of such Wasserstein distances by using a Wasserstein regulariser. By changing the sign of the regularisation parameter, one can instead learn a policy for which its trajectory mapping distribution is attracted to a given fixed distribution. |
Tasks | |
Published | 2018-02-12 |
URL | https://arxiv.org/abs/1802.03976v2 |
https://arxiv.org/pdf/1802.03976v2.pdf | |
PWC | https://paperswithcode.com/paper/a-note-on-reinforcement-learning-with |
Repo | |
Framework | |
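The regulariser described in the abstract can be sketched in a minimal form. For equal-sized one-dimensional empirical distributions, the 1-Wasserstein distance reduces to the mean absolute difference between sorted samples; the scalar trajectory features and the signed weight `beta` below are illustrative assumptions, not the paper's construction.

```python
def wasserstein_1d(xs, ys):
    """W1 between two equal-sized 1-D empirical distributions:
    the mean absolute difference of the sorted samples."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

def regularised_objective(returns, features, ref_features, beta):
    """Expected return plus a signed Wasserstein regulariser.

    Illustrative: `features` are scalar mappings of the policy's
    trajectories, `ref_features` samples from the fixed reference
    distribution, and `beta` the regularisation weight.
    """
    expected_return = sum(returns) / len(returns)
    return expected_return + beta * wasserstein_1d(features, ref_features)
```

A positive `beta` rewards policies whose trajectory-feature distribution moves away from the reference, encouraging diverse multipolicy learning; flipping its sign attracts the policy's distribution to the reference, as the abstract notes.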
Sketch based Reduced Memory Hough Transform
Title | Sketch based Reduced Memory Hough Transform |
Authors | Levi Offen, Michael Werman |
Abstract | This paper proposes using sketch algorithms to represent the votes in Hough transforms. Replacing the accumulator array with a sketch (Sketch Hough Transform - SHT) significantly reduces the memory needed to compute a Hough transform. We also present a new sketch, Count Median Update, which works better than known sketch methods for replacing the accumulator array in the Hough Transform. |
Tasks | |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06287v1 |
http://arxiv.org/pdf/1811.06287v1.pdf | |
PWC | https://paperswithcode.com/paper/sketch-based-reduced-memory-hough-transform |
Repo | |
Framework | |
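The core idea — voting into a compact sketch instead of a dense accumulator array — can be illustrated with a standard Count-Min sketch. This stands in for the paper's Count Median Update sketch (which the authors report works better for this use); the sizes and salted-hash scheme here are illustrative.

```python
from math import cos, sin, pi

class CountMinSketch:
    """Approximate counter; estimates never undercount for non-negative updates."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Row-salted tuple hashing; deterministic for integer-valued keys.
        return hash((row, key)) % self.width

    def add(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))

def hough_vote(sketch, x, y, n_theta=180):
    """Cast Hough-transform line votes for point (x, y) into the sketch."""
    for t in range(n_theta):
        theta = t * pi / n_theta
        rho = round(x * cos(theta) + y * sin(theta))
        sketch.add((t, rho))
```

For points on the line y = x, every vote at theta = 135 degrees lands in the bin (135, 0), so querying that key recovers the line's vote count without ever materialising the full (theta, rho) accumulator.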
An End-to-end Neural Natural Language Interface for Databases
Title | An End-to-end Neural Natural Language Interface for Databases |
Authors | Prasetya Utama, Nathaniel Weir, Fuat Basik, Carsten Binnig, Ugur Cetintemel, Benjamin Hättasch, Amir Ilkhechi, Shekar Ramaswamy, Arif Usta |
Abstract | The ability to extract insights from new data sets is critical for decision making. Visual interactive tools play an important role in data exploration since they provide non-technical users with an effective way to visually compose queries and comprehend the results. Natural language has recently gained traction as an alternative query interface to databases with the potential to enable non-expert users to formulate complex questions and information needs efficiently and effectively. However, understanding natural language questions and translating them accurately to SQL is a challenging task, and thus Natural Language Interfaces for Databases (NLIDBs) have not yet made their way into practical tools and commercial products. In this paper, we present DBPal, a novel data exploration tool with a natural language interface. DBPal leverages recent advances in deep models to make query understanding more robust in the following ways: First, DBPal uses a deep model to translate natural language statements to SQL, making the translation process more robust to paraphrasing and other linguistic variations. Second, to support the users in phrasing questions without knowing the database schema and the query features, DBPal provides a learned auto-completion model that suggests partial query extensions to users during query formulation and thus helps to write complex queries. |
Tasks | Decision Making |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00401v1 |
http://arxiv.org/pdf/1804.00401v1.pdf | |
PWC | https://paperswithcode.com/paper/an-end-to-end-neural-natural-language |
Repo | |
Framework | |
Large-Scale Object Detection of Images from Network Cameras in Variable Ambient Lighting Conditions
Title | Large-Scale Object Detection of Images from Network Cameras in Variable Ambient Lighting Conditions |
Authors | Caleb Tung, Matthew R. Kelleher, Ryan J. Schlueter, Binhan Xu, Yung-Hsiang Lu, George K. Thiruvathukal, Yen-Kuang Chen, Yang Lu |
Abstract | Computer vision relies on labeled datasets for training and evaluation in detecting and recognizing objects. The popular computer vision program, YOLO (“You Only Look Once”), has been shown to accurately detect objects in many major image datasets. However, the images found in those datasets are independent of one another and cannot be used to test YOLO’s consistency at detecting the same object as its environment (e.g. ambient lighting) changes. This paper describes a novel effort to evaluate YOLO’s consistency for large-scale applications. It does so by (a) working at large scale and (b) using consecutive images from a curated network of public video cameras deployed in a variety of real-world situations, including traffic intersections, national parks, shopping malls, university campuses, etc. We specifically examine YOLO’s ability to detect objects in different scenarios (e.g., daytime vs. night), leveraging the cameras’ ability to rapidly retrieve many successive images for evaluating detection consistency. Using our camera network and advanced computing resources (supercomputers), we analyzed more than 5 million images captured by 140 network cameras in 24 hours. Compared with labels marked by humans (considered as “ground truth”), YOLO struggles to consistently detect the same humans and cars as their positions change from one frame to the next; it also struggles to detect objects at nighttime. Our findings suggest that state-of-the-art vision solutions should be trained with data from network cameras with contextual information before they can be deployed in applications that demand high consistency on object detection. |
Tasks | Object Detection |
Published | 2018-12-31 |
URL | http://arxiv.org/abs/1812.11901v1 |
http://arxiv.org/pdf/1812.11901v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-object-detection-of-images-from |
Repo | |
Framework | |
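The abstract does not give a formula for its frame-to-frame detection consistency. One simple, hedged stand-in treats a ground-truth object as a per-frame sequence of detected/missed flags over consecutive camera frames and scores the fraction of active consecutive-frame pairs in which the detector fired in both frames; this is an illustrative metric, not the paper's evaluation protocol.

```python
def detection_consistency(detections):
    """Consistency of a detector on one ground-truth object.

    detections: per-frame booleans (True = object detected in that frame).
    Returns the fraction of consecutive-frame pairs where the object was
    detected in both frames, out of pairs where it was detected in either.
    """
    pairs = list(zip(detections, detections[1:]))
    both = sum(1 for a, b in pairs if a and b)
    either = sum(1 for a, b in pairs if a or b)
    return both / either if either else 1.0
```

A detector that flickers on and off between frames (the failure mode the paper observes for moving humans and cars) scores near 0 even if its per-frame recall is 50%.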
Deep Semi Supervised Generative Learning for Automated PD-L1 Tumor Cell Scoring on NSCLC Tissue Needle Biopsies
Title | Deep Semi Supervised Generative Learning for Automated PD-L1 Tumor Cell Scoring on NSCLC Tissue Needle Biopsies |
Authors | Ansh Kapil, Armin Meier, Aleksandra Zuraw, Keith Steele, Marlon Rebelatto, Günter Schmidt, Nicolas Brieu |
Abstract | The level of PD-L1 expression in immunohistochemistry (IHC) assays is a key biomarker for the identification of Non-Small-Cell-Lung-Cancer (NSCLC) patients that may respond to anti PD-1/PD-L1 treatments. The quantification of PD-L1 expression currently includes the visual estimation of a Tumor Cell (TC) score by a pathologist and consists of evaluating the ratio of PD-L1 positive and PD-L1 negative tumor cells. Known challenges like differences in positivity estimation around clinically relevant cut-offs and sub-optimal quality of samples make visual scoring tedious and subjective, yielding scoring variability between pathologists. In this work, we propose a novel deep learning solution that enables the first automated and objective scoring of PD-L1 expression in late-stage NSCLC needle biopsies. To account for the low amount of tissue available in biopsy images and to restrict the amount of manual annotations necessary for training, we explore the use of semi-supervised approaches against standard fully supervised methods. We consolidate the manual annotations used for training as well as the visual TC scores used for quantitative evaluation with multiple pathologists. Concordance measures computed on a set of slides unseen during training provide evidence that our automatic scoring method matches visual scoring on the considered dataset while ensuring repeatability and objectivity. |
Tasks | |
Published | 2018-06-28 |
URL | http://arxiv.org/abs/1806.11036v1 |
http://arxiv.org/pdf/1806.11036v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-semi-supervised-generative-learning-for |
Repo | |
Framework | |
Lung Cancer Concept Annotation from Spanish Clinical Narratives
Title | Lung Cancer Concept Annotation from Spanish Clinical Narratives |
Authors | Marjan Najafabadipour, Juan Manuel Tuñas, Alejandro Rodríguez-González, Ernestina Menasalvas |
Abstract | The recent rapid increase in the generation of clinical data, together with the rapid development of computational science, enables us to extract new insights from massive datasets in the healthcare industry. Oncological clinical notes are creating rich databases documenting patients’ histories, and they potentially contain many patterns that could help in better management of the disease. However, these patterns are locked within the free-text (unstructured) portions of clinical documents, which limits health professionals’ ability to extract useful information from them and, ultimately, to perform the Query and Answering (QA) process accurately. The Information Extraction (IE) process requires Natural Language Processing (NLP) techniques to assign semantics to these patterns. Therefore, in this paper, we analyze the design of annotators for specific lung cancer concepts that can be integrated into the Apache Unstructured Information Management Architecture (UIMA) framework. In addition, we explain the details of the generation and storage of annotation outcomes. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06639v1 |
http://arxiv.org/pdf/1809.06639v1.pdf | |
PWC | https://paperswithcode.com/paper/lung-cancer-concept-annotation-from-spanish |
Repo | |
Framework | |
Structured Label Inference for Visual Understanding
Title | Structured Label Inference for Visual Understanding |
Authors | Nelson Nauata, Hexiang Hu, Guang-Tong Zhou, Zhiwei Deng, Zicheng Liao, Greg Mori |
Abstract | Visual data such as images and videos contain a rich source of structured semantic labels as well as a wide range of interacting components. Visual content could be assigned with fine-grained labels describing major components, coarse-grained labels depicting high level abstractions, or a set of labels revealing attributes. Such categorization over different, interacting layers of labels evinces the potential for a graph-based encoding of label information. In this paper, we exploit this rich structure for performing graph-based inference in label space for a number of tasks: multi-label image and video classification and action detection in untrimmed videos. We consider the use of the Bidirectional Inference Neural Network (BINN) and Structured Inference Neural Network (SINN) for performing graph-based inference in label space and propose a Long Short-Term Memory (LSTM) based extension for exploiting activity progression on untrimmed videos. The methods were evaluated on (i) the Animals with Attributes (AwA), Scene Understanding (SUN) and NUS-WIDE datasets for multi-label image classification, (ii) the first two releases of the YouTube-8M large scale dataset for multi-label video classification, and (iii) the THUMOS’14 and MultiTHUMOS video datasets for action detection. Our results demonstrate the effectiveness of structured label inference in these challenging tasks, achieving significant improvements against baselines. |
Tasks | Action Detection, Image Classification, Scene Understanding, Video Classification |
Published | 2018-02-18 |
URL | http://arxiv.org/abs/1802.06459v1 |
http://arxiv.org/pdf/1802.06459v1.pdf | |
PWC | https://paperswithcode.com/paper/structured-label-inference-for-visual |
Repo | |
Framework | |
Classifying Mammographic Breast Density by Residual Learning
Title | Classifying Mammographic Breast Density by Residual Learning |
Authors | Jingxu Xu, Cheng Li, Yongjin Zhou, Lisha Mou, Hairong Zheng, Shanshan Wang |
Abstract | Mammographic breast density, a parameter used to describe the proportion of breast tissue fibrosis, is widely adopted as an evaluation characteristic of the likelihood of breast cancer incidence. In this study, we present a radiomics approach based on residual learning for the classification of mammographic breast densities. Our method possesses several encouraging properties, such as being almost fully automatic and possessing large model capacity and flexibility. It can obtain outstanding classification results without the necessity of result compensation using mammograms taken from different views. The proposed method was instantiated with the INbreast dataset, and classification accuracies of 92.6% and 96.8% were obtained for the four BI-RADS (Breast Imaging Reporting and Data System) category task and the two BI-RADS category task, respectively. The superior performances achieved compared to the existing state-of-the-art methods, along with its encouraging properties, indicate that our method has a great potential to be applied as a computer-aided diagnosis tool. |
Tasks | |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.10241v1 |
http://arxiv.org/pdf/1809.10241v1.pdf | |
PWC | https://paperswithcode.com/paper/classifying-mammographic-breast-density-by |
Repo | |
Framework | |
A Review of Multiple Try MCMC algorithms for Signal Processing
Title | A Review of Multiple Try MCMC algorithms for Signal Processing |
Authors | Luca Martino |
Abstract | Many applications in signal processing require the estimation of some parameters of interest given a set of observed data. More specifically, Bayesian inference needs the computation of a-posteriori estimators which are often expressed as complicated multi-dimensional integrals. Unfortunately, analytical expressions for these estimators cannot be found in most real-world applications, and Monte Carlo methods are the only feasible approach. A very powerful class of Monte Carlo techniques is formed by the Markov Chain Monte Carlo (MCMC) algorithms. They generate a Markov chain whose stationary distribution coincides with the target posterior density. In this work, we perform a thorough review of MCMC methods using multiple candidates to select the next state of the chain at each iteration. With respect to the classical Metropolis-Hastings method, the use of multiple try techniques fosters the exploration of the sample space. We present different Multiple Try Metropolis schemes, Ensemble MCMC methods, Particle Metropolis-Hastings algorithms and the Delayed Rejection Metropolis technique. We highlight limitations, benefits, connections and differences among the different methods, and compare them by numerical simulations. |
Tasks | Bayesian Inference |
Published | 2018-01-27 |
URL | http://arxiv.org/abs/1801.09065v1 |
http://arxiv.org/pdf/1801.09065v1.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-multiple-try-mcmc-algorithms-for |
Repo | |
Framework | |
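A minimal Multiple Try Metropolis step — one of the scheme families the review covers — can be sketched for a standard-normal target with a symmetric Gaussian proposal and candidate weights equal to the unnormalised target density (one common weight choice); the target and tuning constants are illustrative.

```python
import math
import random

def target(x):
    # Unnormalised standard normal density.
    return math.exp(-0.5 * x * x)

def mtm_step(x, k=5, step=1.0, rng=random):
    # Draw k candidates from a symmetric Gaussian proposal around x.
    ys = [x + rng.gauss(0.0, step) for _ in range(k)]
    wy = [target(y) for y in ys]
    # Select one candidate with probability proportional to its weight.
    y = rng.choices(ys, weights=wy)[0]
    # Reference set: k-1 auxiliary draws around y, plus the current state.
    xs = [y + rng.gauss(0.0, step) for _ in range(k - 1)] + [x]
    wx = [target(z) for z in xs]
    # Generalised acceptance ratio of the multiple-try scheme.
    if rng.random() < min(1.0, sum(wy) / sum(wx)):
        return y
    return x

def run_chain(n, x0=0.0, seed=0):
    rng = random.Random(seed)
    chain, x = [], x0
    for _ in range(n):
        x = mtm_step(x, rng=rng)
        chain.append(x)
    return chain
```

With k = 1 the acceptance ratio collapses to the classical Metropolis-Hastings rule; larger k spends more target evaluations per iteration in exchange for better exploration, the trade-off the review examines.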
First Steps Toward CNN based Source Classification of Document Images Shared Over Messaging App
Title | First Steps Toward CNN based Source Classification of Document Images Shared Over Messaging App |
Authors | Sharad Joshi, Suraj Saxena, Nitin Khanna |
Abstract | Knowledge of source smartphone corresponding to a document image can be helpful in a variety of applications including copyright infringement, ownership attribution, leak identification and usage restriction. In this letter, we investigate a convolutional neural network-based approach to solve source smartphone identification problem for printed text documents which have been captured by smartphone cameras and shared over messaging platform. In absence of any publicly available dataset addressing this problem, we introduce a new image dataset consisting of 315 images of documents printed in three different fonts, captured using 21 smartphones and shared over WhatsApp. Experiments conducted on this dataset demonstrate that, in all scenarios, the proposed system performs as well as or better than the state-of-the-art system based on handcrafted features and classification of letters extracted from document images. The new dataset and code of the proposed system will be made publicly available along with this letter’s publication, presently they are submitted for review. |
Tasks | |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.05941v1 |
http://arxiv.org/pdf/1808.05941v1.pdf | |
PWC | https://paperswithcode.com/paper/first-steps-toward-cnn-based-source |
Repo | |
Framework | |
Object Detection in Videos by High Quality Object Linking
Title | Object Detection in Videos by High Quality Object Linking |
Authors | Peng Tang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wenjun Zeng, Jingdong Wang |
Abstract | Compared with object detection in static images, object detection in videos is more challenging due to degraded image qualities. An effective way to address this problem is to exploit temporal contexts by linking the same object across video to form tubelets and aggregating classification scores in the tubelets. In this paper, we focus on obtaining high quality object linking results for better classification. Unlike previous methods that link objects by checking boxes between neighboring frames, we propose to link in the same frame. To achieve this goal, we extend prior methods in the following aspects: (1) a cuboid proposal network that extracts spatio-temporal candidate cuboids which bound the movement of objects; (2) a short tubelet detection network that detects short tubelets in short video segments; (3) a short tubelet linking algorithm that links temporally-overlapping short tubelets to form long tubelets. Experiments on the ImageNet VID dataset show that our method outperforms both the static image detector and the previous state of the art. In particular, our method improves results by 8.8% over the static image detector for fast moving objects. |
Tasks | Object Detection |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1801.09823v3 |
http://arxiv.org/pdf/1801.09823v3.pdf | |
PWC | https://paperswithcode.com/paper/object-detection-in-videos-by-high-quality |
Repo | |
Framework | |
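Step (3) above — linking temporally-overlapping short tubelets into long ones — can be sketched as a greedy merge that chains tubelets whose boxes agree (IoU above a threshold) on every shared frame. The `{frame: box}` representation and the threshold are illustrative assumptions, not the paper's exact algorithm.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def link_tubelets(tubelets, thr=0.5):
    """Greedily chain short tubelets ({frame: box} dicts) into long ones.

    Two tubelets are linked if their boxes agree (IoU >= thr) on every
    temporally-overlapping frame.
    """
    tubelets = sorted(tubelets, key=lambda t: min(t))  # by start frame
    long_tubelets = []
    for t in tubelets:
        for lt in long_tubelets:
            shared = set(lt) & set(t)
            if shared and all(iou(lt[f], t[f]) >= thr for f in shared):
                lt.update(t)
                break
        else:
            long_tubelets.append(dict(t))
    return long_tubelets
```

Classification scores can then be aggregated along each long tubelet, the step the abstract identifies as the payoff of high-quality linking.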
Matrix Completion for Structured Observations
Title | Matrix Completion for Structured Observations |
Authors | Denali Molitor, Deanna Needell |
Abstract | The need to predict or fill in missing data, often referred to as matrix completion, is a common challenge in today’s data-driven world. Previous strategies typically assume that no structural difference between observed and missing entries exists. Unfortunately, this assumption is woefully unrealistic in many applications. For example, in the classic Netflix challenge, in which one hopes to predict user-movie ratings for unseen films, the fact that the viewer has not watched a given movie may indicate a lack of interest in that movie, thus suggesting a lower rating than otherwise expected. We propose adjusting the standard nuclear norm minimization strategy for matrix completion to account for such structural differences between observed and unobserved entries by regularizing the values of the unobserved entries. We show that the proposed method outperforms nuclear norm minimization in certain settings. |
Tasks | Matrix Completion |
Published | 2018-01-29 |
URL | http://arxiv.org/abs/1801.09657v1 |
http://arxiv.org/pdf/1801.09657v1.pdf | |
PWC | https://paperswithcode.com/paper/matrix-completion-for-structured-observations |
Repo | |
Framework | |
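The proposed adjustment — additionally regularising the values of the unobserved entries — can be illustrated on a toy problem. The sketch below swaps the paper's nuclear-norm formulation for a rank-1 factorisation fitted by plain gradient descent, with squared loss on observed entries plus a penalty `lam` on the squared magnitudes of unobserved ones; the factorisation, step size, and `lam` are illustrative assumptions.

```python
def complete_rank1(M, observed, lam=0.01, lr=0.02, iters=5000):
    """Fit M ~ outer(u, v) by gradient descent.

    Observed entries (i, j) in `observed` incur squared loss against
    M[i][j]; unobserved entries incur lam * (u_i * v_j)^2, shrinking
    them toward zero as the structural prior suggests.
    """
    m, n = len(M), len(M[0])
    u, v = [1.0] * m, [1.0] * n
    for _ in range(iters):
        gu, gv = [0.0] * m, [0.0] * n
        for i in range(m):
            for j in range(n):
                pred = u[i] * v[j]
                if (i, j) in observed:
                    w, resid = 1.0, pred - M[i][j]
                else:
                    w, resid = lam, pred  # pull unobserved entries to 0
                gu[i] += 2 * w * resid * v[j]
                gv[j] += 2 * w * resid * u[i]
        u = [ui - lr * g for ui, g in zip(u, gu)]
        v = [vj - lr * g for vj, g in zip(v, gv)]
    return [[u[i] * v[j] for j in range(n)] for i in range(m)]
```

With `lam = 0` the rank-1 fit would complete the missing entry of a rank-1 matrix exactly; the penalty instead shrinks it below that value, encoding the prior that unobserved entries (e.g. unwatched movies) tend to be smaller than consistency alone would predict.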
A Proof-Theoretic Approach to Scope Ambiguity in Compositional Vector Space Models
Title | A Proof-Theoretic Approach to Scope Ambiguity in Compositional Vector Space Models |
Authors | Gijs Jasper Wijnholds |
Abstract | We investigate the extent to which compositional vector space models can be used to account for scope ambiguity in quantified sentences (of the form “Every man loves some woman”). Such sentences containing two quantifiers introduce two readings, a direct scope reading and an inverse scope reading. This ambiguity has been treated in a vector space model using bialgebras by (Hedges and Sadrzadeh, 2016) and (Sadrzadeh, 2016), though without an explanation of the mechanism by which the ambiguity arises. We combine a polarised focussed sequent calculus for the non-associative Lambek calculus NL, as described in (Moortgat and Moot, 2011), with the vector based approach to quantifier scope ambiguity. In particular, we establish a procedure for obtaining a vector space model for quantifier scope ambiguity in a derivational way. |
Tasks | |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10297v2 |
http://arxiv.org/pdf/1810.10297v2.pdf | |
PWC | https://paperswithcode.com/paper/a-proof-theoretic-approach-to-scope-ambiguity |
Repo | |
Framework | |
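The two readings the abstract refers to can be stated set-theoretically before any vector-space machinery: "Every man loves some woman" is ambiguous between a direct-scope (for-every-man-there-is-some-woman) and an inverse-scope (there-is-one-woman-loved-by-every-man) reading. The sketch below evaluates both over a finite model; it illustrates only the ambiguity itself, not the paper's sequent-calculus derivation or its bialgebra semantics.

```python
def direct_scope(men, women, loves):
    """Direct reading: each man has his own (possibly different) witness."""
    return all(any(loves(m, w) for w in women) for m in men)

def inverse_scope(men, women, loves):
    """Inverse reading: a single woman is loved by every man."""
    return any(all(loves(m, w) for m in men) for w in women)
```

The inverse reading entails the direct one but not conversely, which is exactly the gap a compositional model must be able to represent.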
BézierGAN: Automatic Generation of Smooth Curves from Interpretable Low-Dimensional Parameters
Title | BézierGAN: Automatic Generation of Smooth Curves from Interpretable Low-Dimensional Parameters |
Authors | Wei Chen, Mark Fuge |
Abstract | Many real-world objects are designed with smooth curves, especially in the aerospace and marine domains, where aerodynamic shapes (e.g., airfoils) and hydrodynamic shapes (e.g., hulls) are designed. To facilitate the design process of those objects, we propose a deep learning based generative model that can synthesize smooth curves. The model maps a low-dimensional latent representation to a sequence of discrete points sampled from a rational Bézier curve. We demonstrate the performance of our method in completing both synthetic and real-world generative tasks. Results show that our method can generate diverse and realistic curves, while preserving consistent shape variation in the latent space, which is favorable for latent space design optimization or design space exploration. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08871v1 |
http://arxiv.org/pdf/1808.08871v1.pdf | |
PWC | https://paperswithcode.com/paper/beziergan-automatic-generation-of-smooth |
Repo | |
Framework | |
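The generator's output representation — discrete points sampled from a rational Bézier curve — can be sketched directly: a rational Bézier point is a weight-scaled Bernstein combination of control points, normalised by the sum of the scaled basis values. The control points and weights below are illustrative, not learned parameters from the paper.

```python
from math import comb

def rational_bezier(ctrl, weights, t):
    """Point on a rational Bézier curve at parameter t in [0, 1].

    ctrl: list of (x, y) control points; weights: one positive weight each.
    """
    n = len(ctrl) - 1
    # Weight-scaled Bernstein basis values at t.
    b = [comb(n, i) * (1 - t) ** (n - i) * t ** i * w
         for i, w in enumerate(weights)]
    s = sum(b)
    x = sum(bi * p[0] for bi, p in zip(b, ctrl)) / s
    y = sum(bi * p[1] for bi, p in zip(b, ctrl)) / s
    return (x, y)

def sample_curve(ctrl, weights, n_points=64):
    """Discretise the curve into the point sequence a decoder would emit."""
    return [rational_bezier(ctrl, weights, k / (n_points - 1))
            for k in range(n_points)]
```

The rational form matters: with control points (1,0), (1,1), (0,1) and middle weight sqrt(2)/2, the quadratic traces an exact circular arc, something no polynomial Bézier can represent, so the parameterisation covers a strictly larger family of smooth shapes.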
EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection
Title | EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection |
Authors | Mohsen Ghafoorian, Cedric Nugteren, Nóra Baka, Olaf Booij, Michael Hofmann |
Abstract | Convolutional neural networks have been successfully applied to semantic segmentation problems. However, there are many problems that are inherently not pixel-wise classification problems but are nevertheless frequently formulated as semantic segmentation. This ill-posed formulation consequently necessitates hand-crafted scenario-specific and computationally expensive post-processing methods to convert the per pixel probability maps to final desired outputs. Generative adversarial networks (GANs) can be used to make the semantic segmentation network output more realistic or better structure-preserving, decreasing the dependency on potentially complex post-processing. In this work, we propose EL-GAN: a GAN framework to mitigate the discussed problem using an embedding loss. With EL-GAN, we discriminate based on learned embeddings of both the labels and the prediction at the same time. This results in more stable training due to having better discriminative information, benefiting from seeing both ‘fake’ and ‘real’ predictions at the same time. This substantially stabilizes the adversarial training process. We use the TuSimple lane marking challenge to demonstrate that with our proposed framework it is viable to overcome the inherent anomalies of posing it as a semantic segmentation problem. Not only is the output considerably more similar to the labels when compared to conventional methods, the subsequent post-processing is also simpler and crosses the competitive 96% accuracy threshold. |
Tasks | Lane Detection, Semantic Segmentation |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05525v2 |
http://arxiv.org/pdf/1806.05525v2.pdf | |
PWC | https://paperswithcode.com/paper/el-gan-embedding-loss-driven-generative |
Repo | |
Framework | |