January 26, 2020

Paper Group ANR 1378

WCE Polyp Detection with Triplet based Embeddings


Title	WCE Polyp Detection with Triplet based Embeddings
Authors	Pablo Laiz, Jordi Vitrià, Hagen Wenzek, Carolina Malagelada, Fernando Azpiroz, Santi Seguí
Abstract	Wireless capsule endoscopy is a medical procedure used to visualize the entire gastrointestinal tract and to diagnose intestinal conditions, such as polyps or bleeding. Current analyses are performed by manually inspecting nearly every frame of the video, a tedious and error-prone task. Automatic image analysis methods can be used to reduce the time needed for physicians to evaluate a capsule endoscopy video; however, these methods are still in a research phase. In this paper we focus on computer-aided polyp detection in capsule endoscopy images. This is a challenging problem because of the diversity of polyp appearance, the imbalanced dataset structure and the scarcity of data. We have developed a new polyp computer-aided decision system that combines a deep convolutional neural network and metric learning. The key point of the method is the use of the triplet loss function with the aim of improving feature extraction from the images when the dataset is small. The triplet loss function makes it possible to train robust detectors by forcing images from the same category to be represented by similar embedding vectors while ensuring that images from different categories are represented by dissimilar vectors. Empirical results show a meaningful increase of AUC values compared to baseline methods. Good performance is not the only requirement when considering the adoption of this technology in clinical practice. Trust and explainability of decisions are as important as performance. For this purpose, we also provide a method to generate visual explanations of the outcome of our polyp detector. These explanations can be used to build a physician’s trust in the system and also to convey information about the inner workings of the method to the designer for debugging purposes.
Tasks	Metric Learning
Published	2019-12-10
URL	https://arxiv.org/abs/1912.04643v1
PDF	https://arxiv.org/pdf/1912.04643v1.pdf
PWC	https://paperswithcode.com/paper/wce-polyp-detection-with-triplet-based
Repo
Framework
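
The triplet loss described in the abstract above can be illustrated with a minimal sketch. The abstract does not state the distance function, the margin, or how triplets are mined, so the Euclidean distance, the `margin=1.0` default, and the function names below are assumptions made only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is; otherwise it grows linearly with the violation.
    The margin value is an assumption, not taken from the paper.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Same-class embedding close to the anchor, other-class embedding far away: no loss.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Other-class embedding closer to the anchor than the same-class one: positive loss.
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 0.5])
```

In training, the embeddings would come from the convolutional network and the loss would be averaged over many triplets; some formulations use squared distances instead.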

Accurate Vision-based Manipulation through Contact Reasoning


Title	Accurate Vision-based Manipulation through Contact Reasoning
Authors	Alina Kloss, Maria Bauza, Jiajun Wu, Joshua B. Tenenbaum, Alberto Rodriguez, Jeannette Bohg
Abstract	Planning contact interactions is one of the core challenges of many robotic tasks. Optimizing contact locations while taking dynamics into account is computationally costly, and in only partially observed environments the execution of contact-based tasks often suffers from low accuracy. We present an approach that addresses these two challenges for the problem of vision-based manipulation. First, we propose to disentangle contact from motion optimization. Thereby, we improve planning efficiency by focusing computation on promising contact locations. Second, we use a hybrid approach for perception and state estimation that combines neural networks with a physically meaningful state representation. In simulation and real-world experiments on the task of planar pushing, we show that our method is more efficient and achieves a higher manipulation accuracy than previous vision-based approaches.
Tasks
Published	2019-11-08
URL	https://arxiv.org/abs/1911.03112v1
PDF	https://arxiv.org/pdf/1911.03112v1.pdf
PWC	https://paperswithcode.com/paper/accurate-vision-based-manipulation-through
Repo
Framework

Enhancing PIO Element Detection in Medical Text Using Contextualized Embedding


Title	Enhancing PIO Element Detection in Medical Text Using Contextualized Embedding
Authors	Hichem Mezaoui, Aleksandr Gontcharov, Isuru Gunasekara
Abstract	In this paper, we investigate a new approach to Population, Intervention and Outcome (PIO) element detection, a common task in Evidence Based Medicine (EBM). The purpose of this study is two-fold: to build a training dataset for PIO element detection with minimum redundancy and ambiguity, and to investigate possible options for using state-of-the-art embedding methods for the task of PIO element detection. For the former purpose, we build a new and improved dataset by investigating the shortcomings of previously released datasets. For the latter purpose, we leverage the state-of-the-art text embedding, Bidirectional Encoder Representations from Transformers (BERT), and build a multi-label classifier. We show that choosing a domain-specific pre-trained embedding further improves the performance of the classifier. Furthermore, we show that the model could be enhanced by using ensemble methods and boosting techniques, provided that features are adequately chosen.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11085v1
PDF	https://arxiv.org/pdf/1906.11085v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-pio-element-detection-in-medical
Repo
Framework

Are Perceptually-Aligned Gradients a General Property of Robust Classifiers?


Title	Are Perceptually-Aligned Gradients a General Property of Robust Classifiers?
Authors	Simran Kaur, Jeremy Cohen, Zachary C. Lipton
Abstract	For a standard convolutional neural network, optimizing over the input pixels to maximize the score of some target class will generally produce a grainy-looking version of the original image. However, Santurkar et al. (2019) demonstrated that for adversarially-trained neural networks, this optimization produces images that uncannily resemble the target class. In this paper, we show that these “perceptually-aligned gradients” also occur under randomized smoothing, an alternative means of constructing adversarially-robust classifiers. Our finding supports the hypothesis that perceptually-aligned gradients may be a general property of robust classifiers. We hope that our results will inspire research aimed at explaining this link between perceptually-aligned gradients and adversarial robustness.
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.08640v2
PDF	https://arxiv.org/pdf/1910.08640v2.pdf
PWC	https://paperswithcode.com/paper/are-perceptually-aligned-gradients-a-general
Repo
Framework
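
Randomized smoothing, mentioned in the abstract above, builds a robust classifier by taking a majority vote of a base classifier over Gaussian-perturbed copies of the input. The sketch below shows only that voting step, estimated by Monte Carlo sampling; it does not reproduce the paper's gradient experiments or any robustness certification. The names `smoothed_predict` and `toy_classifier`, and the values of `sigma` and `n_samples`, are hypothetical.

```python
import random
from collections import Counter


def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, seed=0):
    """Monte Carlo estimate of the smoothed classifier's prediction.

    Each sample adds independent Gaussian noise N(0, sigma^2) to every input
    coordinate, queries the base classifier, and records its vote. The most
    frequent label is returned.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    return votes.most_common(1)[0][0]


def toy_classifier(x):
    """Stand-in base classifier: class 1 if the coordinates sum to a positive value."""
    return 1 if sum(x) > 0 else 0


label_pos = smoothed_predict(toy_classifier, [2.0, 2.0])
label_neg = smoothed_predict(toy_classifier, [-2.0, -2.0])
```

A real base classifier would be a neural network operating on images, and `sigma` would control the trade-off between accuracy and robustness.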

How Robust Are Graph Neural Networks to Structural Noise?


Title	How Robust Are Graph Neural Networks to Structural Noise?
Authors	James Fox, Sivasankaran Rajamanickam
Abstract	Graph neural networks (GNNs) are an emerging model for learning graph embeddings and making predictions on graph structured data. However, the robustness of graph neural networks is not yet well understood. In this work, we focus on node structural identity predictions, where a representative GNN model is able to achieve near-perfect accuracy. We also show, through a controlled dataset and set of experiments, that the same GNN model is not robust to the addition of structural noise. Finally, we show that under the right conditions, graph-augmented training is capable of significantly improving robustness to structural noise.
Tasks
Published	2019-12-21
URL	https://arxiv.org/abs/1912.10206v1
PDF	https://arxiv.org/pdf/1912.10206v1.pdf
PWC	https://paperswithcode.com/paper/how-robust-are-graph-neural-networks-to
Repo
Framework

A Fast-Optimal Guaranteed Algorithm For Learning Sub-Interval Relationships in Time Series


Title	A Fast-Optimal Guaranteed Algorithm For Learning Sub-Interval Relationships in Time Series
Authors	Saurabh Agrawal, Saurabh Verma, Anuj Karpatne, Stefan Liess, Snigdhansu Chatterjee, Vipin Kumar
Abstract	Traditional approaches focus on finding relationships between two entire time series; however, many interesting relationships exist in small sub-intervals of time and remain feeble during other sub-intervals. We define the notion of a sub-interval relationship (SIR) to capture such interactions that are prominent only in certain sub-intervals of time. To that end, we propose a fast-optimal guaranteed algorithm to find the most interesting SIR in a pair of time series. Lastly, we demonstrate the utility of our method in the climate science domain on a real-world dataset, examine its scalability, and obtain useful domain insights.
Tasks	Time Series
Published	2019-06-03
URL	https://arxiv.org/abs/1906.01450v1
PDF	https://arxiv.org/pdf/1906.01450v1.pdf
PWC	https://paperswithcode.com/paper/a-fast-optimal-guaranteed-algorithm-for
Repo
Framework

Empirical Study of Deep Learning for Text Classification in Legal Document Review


Title	Empirical Study of Deep Learning for Text Classification in Legal Document Review
Authors	Fusheng Wei, Han Qin, Shi Ye, Haozhen Zhao
Abstract	Predictive coding has been widely used in legal matters to find relevant or privileged documents in large sets of electronically stored information. It saves significant time and cost. Logistic Regression (LR) and Support Vector Machines (SVM) are two popular machine learning algorithms used in predictive coding. Recently, deep learning has received a lot of attention in many industries. This paper reports our preliminary studies in using deep learning in legal document review. Specifically, we conducted experiments to compare deep learning results with results obtained using an SVM algorithm on four datasets of real legal matters. Our results showed that CNN performed better with larger volumes of training data and should be a suitable method for text classification in the legal industry.
Tasks	Text Classification
Published	2019-04-03
URL	http://arxiv.org/abs/1904.01723v1
PDF	http://arxiv.org/pdf/1904.01723v1.pdf
PWC	https://paperswithcode.com/paper/empirical-study-of-deep-learning-for-text
Repo
Framework

Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving


Title	Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving
Authors	Carl-Johan Hoel, Katherine Driggs-Campbell, Krister Wolff, Leo Laine, Mykel J. Kochenderfer
Abstract	Tactical decision making for autonomous driving is challenging due to the diversity of environments, the uncertainty in the sensor information, and the complex interaction with other road users. This paper introduces a general framework for tactical decision making, which combines the concepts of planning and learning, in the form of Monte Carlo tree search and deep reinforcement learning. The method is based on the AlphaGo Zero algorithm, which is extended to a domain with a continuous state space where self-play cannot be used. The framework is applied to two different highway driving cases in a simulated environment and it is shown to perform better than a commonly used baseline method. The strength of combining planning and learning is also illustrated by a comparison to using the Monte Carlo tree search or the neural network policy separately.
Tasks	Autonomous Driving, Decision Making
Published	2019-05-06
URL	https://arxiv.org/abs/1905.02680v1
PDF	https://arxiv.org/pdf/1905.02680v1.pdf
PWC	https://paperswithcode.com/paper/combining-planning-and-deep-reinforcement
Repo
Framework
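
The abstract above states that the method builds on AlphaGo Zero, where a neural network's action priors guide Monte Carlo tree search. As a rough illustration, the sketch below implements the PUCT-style selection rule used in AlphaGo Zero; the paper's extension to continuous state spaces may select actions differently. The action names, statistics, and `c_puct` value are hypothetical.

```python
import math


def puct_select(stats, c_puct=1.0):
    """Pick the action maximizing Q + U at one tree node.

    `stats` maps each action to a tuple (prior, visit_count, mean_value), where
    the prior comes from the policy network and the mean value from earlier
    search iterations. The exploration bonus is
    U = c_puct * prior * sqrt(total_visits) / (1 + visit_count).
    """
    total_visits = sum(n for _, n, _ in stats.values())

    def score(item):
        _, (prior, n, q) = item
        return q + c_puct * prior * math.sqrt(total_visits) / (1 + n)

    return max(stats.items(), key=score)[0]


# Hypothetical node statistics for a highway driving decision.
node = {
    "keep_lane": (0.6, 10, 0.5),
    "change_left": (0.3, 0, 0.0),
    "brake": (0.1, 2, 0.1),
}
explore_choice = puct_select(node, c_puct=1.0)
greedy_choice = puct_select(node, c_puct=0.0)
```

With exploration enabled, the unvisited action with a reasonable prior wins; with `c_puct=0.0` the rule reduces to picking the highest mean value.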

Strategies to architect AI Safety: Defense to guard AI from Adversaries


Title	Strategies to architect AI Safety: Defense to guard AI from Adversaries
Authors	Rajagopal. A, Nirmala. V
Abstract	The impact of designing for security of AI is critical for humanity in the AI era. With humans increasingly becoming dependent upon AI, there is a need for neural networks that work reliably in spite of adversarial attacks. The vision for safe and secure AI for popular use is achievable. To achieve safety of AI, this paper explores strategies and a novel deep learning architecture. To guard AI from adversaries, the paper explores a combination of three strategies: 1. Introduce randomness at inference time to hide the representation learning from adversaries. 2. Detect the presence of adversaries by analyzing the sequence of inferences. 3. Exploit visual similarity. To realize these strategies, this paper designs a novel architecture, Dynamic Neural Defense (DND). This defense has three deep learning architectural features: 1. By using a random computation graph to hide the way a neural network learns from exploratory attacks, DND evades attack. 2. By analyzing the input sequence to a cloud AI inference engine with an LSTM, DND detects attack sequences. 3. By inferring with visually similar inputs generated by a VAE, any AI defended by the DND approach does not succumb to hackers. Thus, a roadmap to develop reliable, safe and secure AI is presented.
Tasks	Representation Learning
Published	2019-06-08
URL	https://arxiv.org/abs/1906.03466v1
PDF	https://arxiv.org/pdf/1906.03466v1.pdf
PWC	https://paperswithcode.com/paper/strategies-to-architect-ai-safety-defense-to
Repo
Framework

Flood Prediction Using Machine Learning Models: Literature Review


Title	Flood Prediction Using Machine Learning Models: Literature Review
Authors	Amir Mosavi, Pinar Ozturk, Kwok-wing Chau
Abstract	Floods are among the most destructive natural disasters and are highly complex to model. Research on the advancement of flood prediction models has contributed to risk reduction, policy suggestion, minimization of the loss of human life, and reduction of the property damage associated with floods. To mimic the complex mathematical expressions of the physical processes of floods, machine learning (ML) methods have contributed highly to the advancement of prediction systems during the past two decades, providing better performance and cost-effective solutions. Due to the vast benefits and potential of ML, its popularity has dramatically increased among hydrologists. By introducing novel ML methods and hybridizing existing ones, researchers aim to discover more accurate and efficient prediction models. The main contribution of this paper is to demonstrate the state of the art of ML models in flood prediction and to give insight into the most suitable models. In this paper, the literature where ML models were benchmarked through a qualitative analysis of robustness, accuracy, effectiveness, and speed is particularly investigated to provide an extensive overview of the various ML algorithms used in the field. The performance comparison of ML models presents an in-depth understanding of the different techniques within the framework of a comprehensive evaluation and discussion. As a result, this paper introduces the most promising prediction methods for both long-term and short-term floods. Furthermore, the major trends in improving the quality of flood prediction models are investigated. Among them, hybridization, data decomposition, algorithm ensemble, and model optimization are reported as the most effective strategies for the improvement of ML methods.
Tasks
Published	2019-08-07
URL	https://arxiv.org/abs/1908.02781v1
PDF	https://arxiv.org/pdf/1908.02781v1.pdf
PWC	https://paperswithcode.com/paper/flood-prediction-using-machine-learning
Repo
Framework

Vanishing Nodes: Another Phenomenon That Makes Training Deep Neural Networks Difficult


Title	Vanishing Nodes: Another Phenomenon That Makes Training Deep Neural Networks Difficult
Authors	Wen-Yu Chang, Tsung-Nan Lin
Abstract	It is well known that the problem of vanishing/exploding gradients is a challenge when training deep networks. In this paper, we describe another phenomenon, called vanishing nodes, that also increases the difficulty of training deep neural networks. As the depth of a neural network increases, the network’s hidden nodes have more highly correlated behavior. This results in great similarities between these nodes. The redundancy of hidden nodes thus increases as the network becomes deeper. We call this problem vanishing nodes, and we propose the metric vanishing node indicator (VNI) for quantitatively measuring the degree of vanishing nodes. The VNI can be characterized by the network parameters, and it is shown analytically to be proportional to the depth of the network and inversely proportional to the network width. The theoretical results show that the effective number of nodes vanishes to one when the VNI increases to one (its maximal value), and that vanishing/exploding gradients and vanishing nodes are two different challenges that increase the difficulty of training deep neural networks. The numerical results from the experiments suggest that the degree of vanishing nodes will become more evident during back-propagation training, and that when the VNI is equal to 1, the network cannot learn simple tasks (e.g. the XOR problem) even when the gradients are neither vanishing nor exploding. We refer to such gradients as walking dead gradients, which cannot help the network converge even when their scale is sufficiently large. Finally, the experiments show that the likelihood of failed training increases as the depth of the network increases. The training will become much more difficult due to the lack of network representation capability.
Tasks
Published	2019-10-22
URL	https://arxiv.org/abs/1910.09745v1
PDF	https://arxiv.org/pdf/1910.09745v1.pdf
PWC	https://paperswithcode.com/paper/vanishing-nodes-another-phenomenon-that-makes
Repo
Framework

U-Net with spatial pyramid pooling for drusen segmentation in optical coherence tomography


Title	U-Net with spatial pyramid pooling for drusen segmentation in optical coherence tomography
Authors	Rhona Asgari, Sebastian Waldstein, Ferdinand Schlanitz, Magdalena Baratsits, Ursula Schmidt-Erfurth, Hrvoje Bogunović
Abstract	The presence of drusen is the main hallmark of early/intermediate age-related macular degeneration (AMD). Therefore, automated drusen segmentation is an important step in image-guided management of AMD. There are two common approaches to drusen segmentation. In the first, the drusen are segmented directly as a binary classification task. In the second approach, the surrounding retinal layers (outer boundary retinal pigment epithelium (OBRPE) and Bruch’s membrane (BM)) are segmented and the remaining space between these two layers is extracted as drusen. In this work, we extend the standard U-Net architecture with spatial pyramid pooling components to introduce global feature context. We apply the model to the task of segmenting drusen together with BM and OBRPE. The proposed network was trained and evaluated on a longitudinal OCT dataset of 425 scans from 38 patients with early/intermediate AMD. This preliminary study showed that the proposed network consistently outperformed the standard U-Net model.
Tasks
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05404v1
PDF	https://arxiv.org/pdf/1912.05404v1.pdf
PWC	https://paperswithcode.com/paper/u-net-with-spatial-pyramid-pooling-for-drusen
Repo
Framework
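
Spatial pyramid pooling, named in the abstract above, summarizes a feature map at several grid resolutions so that coarse levels capture global context. The sketch below shows the generic pooling step on a single-channel map in plain Python. The abstract does not describe how the components are wired into the U-Net, so the max-pooling choice, the `levels` setting, and the function name are assumptions.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2)):
    """Max-pool a 2D feature map over an n x n grid for each n in `levels`.

    Returns the pooled values concatenated into one flat list whose length
    depends only on `levels` (sum of n*n), not on the input size. Each
    dimension of the map must be at least as large as the largest level.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range covered by grid cell i (cells may overlap by one row).
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                pooled.append(
                    max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1))
                )
    return pooled


square = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
pooled_square = spatial_pyramid_pool(square)

# A map with a different shape yields a vector of the same length.
wide = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [1, 1, 1, 1, 1]]
pooled_wide = spatial_pyramid_pool(wide)
```

In a segmentation network the pooled summaries are typically upsampled and concatenated back onto the feature map rather than flattened, so this shows only the multi-scale pooling idea.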

Generative Adversarial Network with Multi-Branch Discriminator for Cross-Species Image-to-Image Translation


Title	Generative Adversarial Network with Multi-Branch Discriminator for Cross-Species Image-to-Image Translation
Authors	Ziqiang Zheng, Zhibin Yu, Haiyong Zheng, Yang Wu, Bing Zheng, Ping Lin
Abstract	Current approaches have made great progress on image-to-image translation tasks benefiting from the success of image synthesis methods especially generative adversarial networks (GANs). However, existing methods are limited to handling translation tasks between two species while keeping the content matching on the semantic level. A more challenging task would be the translation among more than two species. To explore this new area, we propose a simple yet effective structure of a multi-branch discriminator for enhancing an arbitrary generative adversarial architecture (GAN), named GAN-MBD. It takes advantage of the boosting strategy to break a common discriminator into several smaller ones with fewer parameters, which can enhance the generation and synthesis abilities of GANs efficiently and effectively. Comprehensive experiments show that the proposed multi-branch discriminator can dramatically improve the performance of popular GANs on cross-species image-to-image translation tasks while reducing the number of parameters for computation. The code and some datasets are attached as supplementary materials for reference.
Tasks	Image Generation, Image-to-Image Translation
Published	2019-01-24
URL	http://arxiv.org/abs/1901.10895v1
PDF	http://arxiv.org/pdf/1901.10895v1.pdf
PWC	https://paperswithcode.com/paper/generative-adversarial-network-with-multi
Repo
Framework

Two-Staged Acoustic Modeling Adaption for Robust Speech Recognition by the Example of German Oral History Interviews


Title	Two-Staged Acoustic Modeling Adaption for Robust Speech Recognition by the Example of German Oral History Interviews
Authors	Michael Gref, Christoph Schmidt, Sven Behnke, Joachim Köhler
Abstract	In automatic speech recognition, often little training data is available for specific challenging tasks, but training of state-of-the-art automatic speech recognition systems requires large amounts of annotated speech. To address this issue, we propose a two-staged approach to acoustic modeling that combines noise and reverberation data augmentation with transfer learning to robustly address challenges such as difficult acoustic recording conditions, spontaneous speech, and speech of elderly people. We evaluate our approach using the example of German oral history interviews, where a relative average reduction of the word error rate by 19.3% is achieved.
Tasks	Data Augmentation, Robust Speech Recognition, Speech Recognition, Transfer Learning
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06709v1
PDF	https://arxiv.org/pdf/1908.06709v1.pdf
PWC	https://paperswithcode.com/paper/two-staged-acoustic-modeling-adaption-for
Repo
Framework
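
The 19.3% figure in the abstract above is a relative reduction of the word error rate (WER), meaning a fraction of the baseline WER and not a difference in percentage points. The sketch below computes WER with a standard word-level edit distance and then the relative reduction. The example sentences and the baseline value of 0.30 are hypothetical and not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer, new_wer):
    """Fraction of the baseline error removed by the new system."""
    return (baseline_wer - new_wer) / baseline_wer


# One deleted word out of four reference words.
wer_example = word_error_rate("das ist ein test", "das ist test")

# With a hypothetical baseline WER of 0.30, a 19.3% relative reduction
# corresponds to a new WER of 0.2421, not 0.107.
reduction = relative_reduction(0.30, 0.2421)
```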

Efficient Intrinsically Motivated Robotic Grasping with Learning-Adaptive Imagination in Latent Space


Title	Efficient Intrinsically Motivated Robotic Grasping with Learning-Adaptive Imagination in Latent Space
Authors	Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
Abstract	Combining model-based and model-free deep reinforcement learning has shown great promise for improving sample efficiency on complex control tasks while still retaining high performance. Incorporating imagination is a recent effort in this direction inspired by human mental simulation of motor behavior. We propose a learning-adaptive imagination approach which, unlike previous approaches, takes into account the reliability of the learned dynamics model used for imagining the future. Our approach learns an ensemble of disjoint local dynamics models in latent space and derives an intrinsic reward based on learning progress, motivating the controller to take actions leading to data that improves the models. The learned models are used to generate imagined experiences, augmenting the training set of real experiences. We evaluate our approach on learning vision-based robotic grasping and show that it significantly improves sample efficiency and achieves near-optimal performance in a sparse reward environment.
Tasks	Robotic Grasping
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04729v1
PDF	https://arxiv.org/pdf/1910.04729v1.pdf
PWC	https://paperswithcode.com/paper/efficient-intrinsically-motivated-robotic
Repo
Framework