July 27, 2019

Paper Group ANR 697

Convolutional Recurrent Neural Networks for Bird Audio Detection


Title	Convolutional Recurrent Neural Networks for Bird Audio Detection
Authors	Emre Çakır, Sharath Adavanne, Giambattista Parascandolo, Konstantinos Drossos, Tuomas Virtanen
Abstract	Bird sounds possess distinctive spectral structure which may exhibit small shifts in spectrum depending on the bird species and environmental conditions. In this paper, we propose using convolutional recurrent neural networks on the task of automated bird audio detection in real-life environments. In the proposed method, convolutional layers extract high dimensional, local frequency shift invariant features, while recurrent layers capture longer term dependencies between the features extracted from short time frames. This method achieves 88.5% Area Under ROC Curve (AUC) score on the unseen evaluation data and obtains the second place in the Bird Audio Detection challenge.
Tasks	Bird Audio Detection
Published	2017-03-07
URL	http://arxiv.org/abs/1703.02317v1
PDF	http://arxiv.org/pdf/1703.02317v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-recurrent-neural-networks-for-4
Repo
Framework
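
The abstract above reports its result as Area Under the ROC Curve. As a rough illustration of what that score measures, the sketch below computes AUC through the pairwise-ranking identity: the fraction of (positive, negative) clip pairs in which the positive clip gets the higher score. The function name `roc_auc` and the toy labels and scores are assumptions for illustration, not the challenge's evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity.

    labels: iterable of 0/1 ground-truth flags (1 = bird present).
    scores: iterable of detector scores, higher = more confident.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count pairs where the positive outranks the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: four clips, two with birds.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

The double loop is quadratic in the number of clips, which is fine for a sketch; a sort-based rank computation would be the usual choice for large evaluation sets.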

Learning Fine-Grained Knowledge about Contingent Relations between Everyday Events


Title	Learning Fine-Grained Knowledge about Contingent Relations between Everyday Events
Authors	Elahe Rahimtoroghi, Ernesto Hernandez, Marilyn A Walker
Abstract	Much of the user-generated content on social media is provided by ordinary people telling stories about their daily lives. We develop and test a novel method for learning fine-grained common-sense knowledge from these stories about contingent (causal and conditional) relationships between everyday events. This type of knowledge is useful for text and story understanding, information extraction, question answering, and text summarization. We test and compare different methods for learning contingency relations, and compare what is learned from topic-sorted story collections vs. general-domain stories. Our experiments show that using topic-specific datasets enables learning finer-grained knowledge about events and results in significant improvement over the baselines. An evaluation on Amazon Mechanical Turk shows 82% of the relations between events that we learn from topic-sorted stories are judged as contingent.
Tasks	Common Sense Reasoning, Question Answering, Text Summarization
Published	2017-08-30
URL	http://arxiv.org/abs/1708.09450v1
PDF	http://arxiv.org/pdf/1708.09450v1.pdf
PWC	https://paperswithcode.com/paper/learning-fine-grained-knowledge-about
Repo
Framework

Skin Lesion Segmentation: U-Nets versus Clustering


Title	Skin Lesion Segmentation: U-Nets versus Clustering
Authors	Bill S. Lin, Kevin Michael, Shivam Kalra, H. R. Tizhoosh
Abstract	Many automatic skin lesion diagnosis systems use segmentation as a preprocessing step to diagnose skin conditions because skin lesion shape, border irregularity, and size can influence the likelihood of malignancy. This paper presents, examines and compares two different approaches to skin lesion segmentation. The first approach uses U-Nets and introduces a histogram equalization based preprocessing step. The second approach is a C-Means clustering based approach that is much simpler to implement and faster to execute. The Jaccard Index between the algorithm output and images hand-segmented by dermatologists is used to evaluate the proposed algorithms. While many recently proposed deep neural networks to segment skin lesions require a significant amount of computational power for training (i.e., a computer with GPUs), the main objective of this paper is to present methods that can be used with only a CPU. This severely limits, for example, the number of training instances that can be presented to the U-Net. Comparing the two proposed algorithms, U-Nets achieved a significantly higher Jaccard Index compared to the clustering approach. Moreover, using histogram equalization as a preprocessing step significantly improved the U-Net segmentation results.
Tasks	Lesion Segmentation
Published	2017-09-27
URL	http://arxiv.org/abs/1710.01248v1
PDF	http://arxiv.org/pdf/1710.01248v1.pdf
PWC	https://paperswithcode.com/paper/skin-lesion-segmentation-u-nets-versus
Repo
Framework
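
The abstract above evaluates both approaches with the Jaccard Index between the algorithm output and the reference mask. A minimal sketch of that metric on binary masks follows; the nested-list mask representation, the function name `jaccard_index`, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two same-shaped binary masks.

    pred, truth: nested lists of 0/1 values (rows of pixels).
    """
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks agree perfectly.
    return inter / union if union else 1.0


if __name__ == "__main__":
    # Hypothetical 2x2 masks: one shared lesion pixel, three pixels in the union.
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [1, 0]]
    print(jaccard_index(predicted, reference))
```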

Tversky loss function for image segmentation using 3D fully convolutional deep networks


Title	Tversky loss function for image segmentation using 3D fully convolutional deep networks
Authors	Seyed Sadegh Mohseni Salehi, Deniz Erdogmus, Ali Gholipour
Abstract	Fully convolutional deep neural networks carry excellent potential for fast and accurate image segmentation. One of the main challenges in training these networks is data imbalance, which is particularly problematic in medical imaging applications such as lesion segmentation where the number of lesion voxels is often much lower than the number of non-lesion voxels. Training with unbalanced data can lead to predictions that are severely biased towards high precision but low recall (sensitivity), which is undesired especially in medical applications where false negatives are much less tolerable than false positives. Several methods have been proposed to deal with this problem including balanced sampling, two step training, sample re-weighting, and similarity loss functions. In this paper, we propose a generalized loss function based on the Tversky index to address the issue of data imbalance and achieve a much better trade-off between precision and recall in training 3D fully convolutional deep neural networks. Experimental results in multiple sclerosis lesion segmentation on magnetic resonance images show improved F2 score, Dice coefficient, and the area under the precision-recall curve in test data. Based on these results we suggest the Tversky loss function as a generalized framework to effectively train deep neural networks.
Tasks	Lesion Segmentation, Semantic Segmentation
Published	2017-06-18
URL	http://arxiv.org/abs/1706.05721v1
PDF	http://arxiv.org/pdf/1706.05721v1.pdf
PWC	https://paperswithcode.com/paper/tversky-loss-function-for-image-segmentation
Repo
Framework
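
To make the idea of a Tversky-index loss concrete, here is a minimal sketch on flattened voxel lists. It uses the standard Tversky index TP / (TP + alpha * FP + beta * FN) with soft counts, so predictions may be probabilities in [0, 1]. The function names, the `eps` smoothing term, and the toy data are assumptions; the paper's exact formulation over network outputs may differ.

```python
def tversky_index(pred, truth, alpha=0.5, beta=0.5, eps=1e-8):
    """Soft Tversky index between predictions and binary ground truth.

    pred:  list of foreground probabilities (or 0/1 labels) per voxel.
    truth: list of 0/1 ground-truth labels per voxel.
    alpha weights false positives, beta weights false negatives.
    """
    tp = sum(p * t for p, t in zip(pred, truth))
    fp = sum(p * (1 - t) for p, t in zip(pred, truth))
    fn = sum((1 - p) * t for p, t in zip(pred, truth))
    # eps (an assumption here) avoids division by zero on empty masks.
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def tversky_loss(pred, truth, alpha=0.5, beta=0.5):
    """Loss to minimize: one minus the Tversky index."""
    return 1.0 - tversky_index(pred, truth, alpha, beta)


if __name__ == "__main__":
    pred = [1, 1, 0, 0]    # hypothetical hard predictions
    truth = [1, 0, 1, 0]   # one TP, one FP, one FN
    print(tversky_index(pred, truth, 0.5, 0.5))  # coincides with Dice
    print(tversky_index(pred, truth, 1.0, 1.0))  # coincides with Jaccard
```

With alpha = beta = 0.5 the index reduces to the Dice coefficient and with alpha = beta = 1 to the Jaccard index. Choosing beta larger than alpha penalizes false negatives more heavily, which shifts the trade-off toward recall.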

Liver lesion segmentation informed by joint liver segmentation


Title	Liver lesion segmentation informed by joint liver segmentation
Authors	Eugene Vorontsov, An Tang, Chris Pal, Samuel Kadoury
Abstract	We propose a model for the joint segmentation of the liver and liver lesions in computed tomography (CT) volumes. We build the model from two fully convolutional networks, connected in tandem and trained together end-to-end. We evaluate our approach on the 2017 MICCAI Liver Tumour Segmentation Challenge, attaining competitive liver and liver lesion detection and segmentation scores across a wide range of metrics. Unlike other top performing methods, our model output post-processing is trivial, we do not use data external to the challenge, and we propose a simple single-stage model that is trained end-to-end. Nevertheless, our method nearly matches the top lesion segmentation performance and achieves the second highest precision for lesion detection while maintaining high recall.
Tasks	Computed Tomography (CT), Lesion Segmentation, Liver Segmentation
Published	2017-07-24
URL	http://arxiv.org/abs/1707.07734v3
PDF	http://arxiv.org/pdf/1707.07734v3.pdf
PWC	https://paperswithcode.com/paper/liver-lesion-segmentation-informed-by-joint
Repo
Framework

Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation


Title	Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation
Authors	Kuan Fang, Yunfei Bai, Stefan Hinterstoisser, Silvio Savarese, Mrinal Kalakrishnan
Abstract	Learning-based approaches to robotic manipulation are limited by the scalability of data collection and accessibility of labels. In this paper, we present a multi-task domain adaptation framework for instance grasping in cluttered scenes by utilizing simulated robot experiments. Our neural network takes monocular RGB images and the instance segmentation mask of a specified target object as inputs, and predicts the probability of successfully grasping the specified object for each candidate motor command. The proposed transfer learning framework trains a model for instance grasping in simulation and uses a domain-adversarial loss to transfer the trained model to real robots using indiscriminate grasping data, which is available both in simulation and the real world. We evaluate our model in real-world robot experiments, comparing it with alternative model architectures as well as an indiscriminate grasping baseline.
Tasks	Domain Adaptation, Instance Segmentation, Semantic Segmentation, Transfer Learning
Published	2017-10-17
URL	http://arxiv.org/abs/1710.06422v2
PDF	http://arxiv.org/pdf/1710.06422v2.pdf
PWC	https://paperswithcode.com/paper/multi-task-domain-adaptation-for-deep
Repo
Framework

Using Reinforcement Learning for Demand Response of Domestic Hot Water Buffers: a Real-Life Demonstration


Title	Using Reinforcement Learning for Demand Response of Domestic Hot Water Buffers: a Real-Life Demonstration
Authors	Oscar De Somer, Ana Soares, Tristan Kuijpers, Koen Vossen, Koen Vanthournout, Fred Spiessens
Abstract	This paper demonstrates a data-driven control approach for demand response in real-life residential buildings. The objective is to optimally schedule the heating cycles of the Domestic Hot Water (DHW) buffer to maximize the self-consumption of the local photovoltaic (PV) production. A model-based reinforcement learning technique is used to tackle the underlying sequential decision-making problem. The proposed algorithm learns the stochastic occupant behavior, predicts the PV production and takes into account the dynamics of the system. A real-life experiment with six residential buildings is performed using this algorithm. The results show that the self-consumption of the PV production is significantly increased, compared to the default thermostat control.
Tasks	Decision Making
Published	2017-03-16
URL	http://arxiv.org/abs/1703.05486v1
PDF	http://arxiv.org/pdf/1703.05486v1.pdf
PWC	https://paperswithcode.com/paper/using-reinforcement-learning-for-demand
Repo
Framework

Group Re-Identification via Unsupervised Transfer of Sparse Features Encoding


Title	Group Re-Identification via Unsupervised Transfer of Sparse Features Encoding
Authors	Giuseppe Lisanti, Niki Martinel, Alberto Del Bimbo, Gian Luca Foresti
Abstract	Person re-identification is best known as the problem of associating a single person that is observed from one or more disjoint cameras. The existing literature has mainly addressed such an issue, neglecting the fact that people usually move in groups, as in crowded scenarios. We believe that the additional information carried by neighboring individuals provides a relevant visual context that can be exploited to obtain a more robust match of single persons within the group. Despite this, re-identifying groups of people compounds the common single person re-identification problems by introducing changes in the relative position of persons within the group and severe self-occlusions. In this paper, we propose a solution for group re-identification that builds on transferring knowledge from single person re-identification to group re-identification by exploiting sparse dictionary learning. First, a dictionary of sparse atoms is learned using patches extracted from single person images. Then, the learned dictionary is exploited to obtain a sparsity-driven residual group representation, which is finally matched to perform the re-identification. Extensive experiments on the i-LIDS groups and two newly collected datasets show that the proposed solution outperforms state-of-the-art approaches.
Tasks	Dictionary Learning, Person Re-Identification
Published	2017-07-28
URL	http://arxiv.org/abs/1707.09173v1
PDF	http://arxiv.org/pdf/1707.09173v1.pdf
PWC	https://paperswithcode.com/paper/group-re-identification-via-unsupervised
Repo
Framework

Domain Randomization and Generative Models for Robotic Grasping


Title	Domain Randomization and Generative Models for Robotic Grasping
Authors	Joshua Tobin, Lukas Biewald, Rocky Duan, Marcin Andrychowicz, Ankur Handa, Vikash Kumar, Bob McGrew, Jonas Schneider, Peter Welinder, Wojciech Zaremba, Pieter Abbeel
Abstract	Deep learning-based robotic grasping has made significant progress thanks to algorithmic improvements and increased data availability. However, state-of-the-art models are often trained on as few as hundreds or thousands of unique object instances, and as a result generalization can be a challenge. In this work, we explore a novel data generation pipeline for training a deep neural network to perform grasp planning that applies the idea of domain randomization to object synthesis. We generate millions of unique, unrealistic procedurally generated objects, and train a deep neural network to perform grasp planning on these objects. Since the distribution of successful grasps for a given object can be highly multimodal, we propose an autoregressive grasp planning model that maps sensor inputs of a scene to a probability distribution over possible grasps. This model allows us to sample grasps efficiently at test time (or avoid sampling entirely). We evaluate our model architecture and data generation pipeline in simulation and the real world. We find we can achieve a >90% success rate on previously unseen realistic objects at test time in simulation despite having only been trained on random objects. We also demonstrate an 80% success rate on real-world grasp attempts despite having only been trained on random simulated objects.
Tasks	Robotic Grasping
Published	2017-10-17
URL	http://arxiv.org/abs/1710.06425v2
PDF	http://arxiv.org/pdf/1710.06425v2.pdf
PWC	https://paperswithcode.com/paper/domain-randomization-and-generative-models
Repo
Framework

Context-Aware Prediction of Derivational Word-forms


Title	Context-Aware Prediction of Derivational Word-forms
Authors	Ekaterina Vylomova, Ryan Cotterell, Timothy Baldwin, Trevor Cohn
Abstract	Derivational morphology is a fundamental and complex characteristic of language. In this paper we propose the new task of predicting the derivational form of a given base-form lemma that is appropriate for a given context. We present an encoder–decoder style neural network to produce a derived form character-by-character, based on its corresponding character-level representation of the base form and the context. We demonstrate that our model is able to generate valid context-sensitive derivations from known base forms, but is less accurate under a lexicon agnostic setting.
Tasks
Published	2017-02-22
URL	http://arxiv.org/abs/1702.06675v1
PDF	http://arxiv.org/pdf/1702.06675v1.pdf
PWC	https://paperswithcode.com/paper/context-aware-prediction-of-derivational-word
Repo
Framework

In-Order Transition-based Constituent Parsing


Title	In-Order Transition-based Constituent Parsing
Authors	Jiangming Liu, Yue Zhang
Abstract	Both bottom-up and top-down strategies have been used for neural transition-based constituent parsing. The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree, where bottom-up strategies and top-down strategies take post-order and pre-order traversal over trees, respectively. Bottom-up parsers benefit from rich features from readily built partial parses, but lack lookahead guidance in the parsing process; top-down parsers benefit from non-local guidance for local decisions, but rely on a strong encoder over the input to predict a constituent hierarchy before its construction. To mitigate both issues, we propose a novel parsing system based on in-order traversal over syntactic trees, designing a set of transition actions to find a compromise between bottom-up constituent information and top-down lookahead information. Based on stack-LSTM, our psycholinguistically motivated constituent parsing system achieves 91.8 F1 on the WSJ benchmark. Furthermore, the system achieves 93.6 F1 with supervised reranking and 94.2 F1 with semi-supervised reranking, which are the best results on the WSJ benchmark.
Tasks
Published	2017-07-17
URL	http://arxiv.org/abs/1707.05000v1
PDF	http://arxiv.org/pdf/1707.05000v1.pdf
PWC	https://paperswithcode.com/paper/in-order-transition-based-constituent-parsing
Repo
Framework
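
The three traversal orders named in the abstract above can be compared on a small constituent tree. The sketch below assumes one common n-ary reading of in-order traversal (first child, then the parent label, then the remaining children); the tree encoding as `(label, children)` tuples, the function names, and the example sentence are illustrative assumptions, and the paper's transition actions themselves are not reproduced here.

```python
def pre_order(tree):
    """Top-down order: parent label before any of its children."""
    if isinstance(tree, str):          # a leaf is a plain word
        return [tree]
    label, children = tree
    out = [label]
    for child in children:
        out += pre_order(child)
    return out


def post_order(tree):
    """Bottom-up order: parent label after all of its children."""
    if isinstance(tree, str):
        return [tree]
    label, children = tree
    out = []
    for child in children:
        out += post_order(child)
    return out + [label]


def in_order(tree):
    """Assumed n-ary in-order: first child, parent label, remaining children."""
    if isinstance(tree, str):
        return [tree]
    label, children = tree
    out = in_order(children[0]) + [label]
    for child in children[1:]:
        out += in_order(child)
    return out


if __name__ == "__main__":
    # Hypothetical parse of "The boy likes tomatoes".
    tree = ("S", [("NP", ["The", "boy"]),
                  ("VP", ["likes", ("NP", ["tomatoes"])])])
    print(" ".join(pre_order(tree)))
    print(" ".join(post_order(tree)))
    print(" ".join(in_order(tree)))
```

In the in-order sequence each constituent label appears after its first child has been seen, so a parser following this order has some bottom-up evidence before committing to a label, while the label is still available as lookahead for the remaining children.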

Toward Optimal Run Racing: Application to Deep Learning Calibration


Title	Toward Optimal Run Racing: Application to Deep Learning Calibration
Authors	Olivier Bousquet, Sylvain Gelly, Karol Kurach, Marc Schoenauer, Michele Sebag, Olivier Teytaud, Damien Vincent
Abstract	This paper aims at one-shot learning of deep neural nets, where a highly parallel setting is considered to address the algorithm calibration problem - selecting the best neural architecture and learning hyper-parameter values depending on the dataset at hand. The notoriously expensive calibration problem is optimally reduced by detecting and early stopping non-optimal runs. The theoretical contribution regards the optimality guarantees within the multiple hypothesis testing framework. Experiments on the Cifar10, PTB and Wiki benchmarks demonstrate the relevance of the approach with a principled and consistent improvement on the state of the art with no extra hyper-parameter.
Tasks	Calibration, One-Shot Learning
Published	2017-06-10
URL	http://arxiv.org/abs/1706.03199v2
PDF	http://arxiv.org/pdf/1706.03199v2.pdf
PWC	https://paperswithcode.com/paper/toward-optimal-run-racing-application-to-deep
Repo
Framework
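
The general idea of stopping non-optimal runs early can be illustrated with a classic Hoeffding race. This is a swapped-in textbook technique, not the algorithm from the paper: each run reports a reward in [0, 1] per step (for example, a validation accuracy), and a run is dropped once its upper confidence bound falls below the best lower bound. The function name, the union-bound correction over runs and steps, and the toy reward streams are all assumptions.

```python
import math


def hoeffding_race(reward_streams, delta=0.05):
    """Return the indices of runs that survive the race.

    reward_streams: list of equal-length lists, one per run,
                    with rewards in [0, 1] (higher is better).
    delta:          overall probability of wrongly dropping the best run.
    """
    k = len(reward_streams)
    horizon = len(reward_streams[0])
    alive = set(range(k))
    sums = [0.0] * k
    for step in range(1, horizon + 1):
        for i in alive:
            sums[i] += reward_streams[i][step - 1]
        # Hoeffding radius with a union bound over all runs and steps,
        # a simple (conservative) multiple-testing correction.
        radius = math.sqrt(math.log(2 * k * horizon / delta) / (2 * step))
        means = {i: sums[i] / step for i in alive}
        best_lower = max(m - radius for m in means.values())
        # Keep only runs that could still be the best one.
        alive = {i for i in alive if means[i] + radius >= best_lower}
    return alive


if __name__ == "__main__":
    # Hypothetical runs: one strong, one clearly weak, one close second.
    streams = [[0.9] * 200, [0.1] * 200, [0.85] * 200]
    print(sorted(hoeffding_race(streams)))
```

The clearly weak run is eliminated after a few dozen steps, while the close second survives because 200 observations are not enough to separate it from the leader at this confidence level.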

Convolutive Audio Source Separation using Robust ICA and an intelligent evolving permutation ambiguity solution


Title	Convolutive Audio Source Separation using Robust ICA and an intelligent evolving permutation ambiguity solution
Authors	Dimitrios Mallis, Thomas Sgouros, Nikolaos Mitianoudis
Abstract	Audio source separation is the task of isolating sound sources that are active simultaneously in a room captured by a set of microphones. Convolutive audio source separation with an equal number of sources and microphones has a number of shortcomings, including the complexity of frequency-domain ICA, the permutation ambiguity and the problem's scalability with an increasing number of sensors. In this paper, the authors propose a multiple-microphone audio source separation algorithm based on a previous work of Mitianoudis and Davies (2003). Complex FastICA is substituted by Robust ICA, increasing robustness and performance. Permutation ambiguity is solved using two methodologies. The first uses the Likelihood Ratio Jump solution, which is now modified to decrease computational complexity in the case of multiple microphones. The application of the MuSIC algorithm, as a preprocessing step to the previous solution, forms a second methodology with promising results.
Tasks
Published	2017-08-14
URL	http://arxiv.org/abs/1708.03989v1
PDF	http://arxiv.org/pdf/1708.03989v1.pdf
PWC	https://paperswithcode.com/paper/convolutive-audio-source-separation-using
Repo
Framework

Picasso: A Modular Framework for Visualizing the Learning Process of Neural Network Image Classifiers


Title	Picasso: A Modular Framework for Visualizing the Learning Process of Neural Network Image Classifiers
Authors	Ryan Henderson, Rasmus Rothe
Abstract	Picasso is a free open-source (Eclipse Public License) web application written in Python for rendering standard visualizations useful for analyzing convolutional neural networks. Picasso ships with occlusion maps and saliency maps, two visualizations which help reveal issues that evaluation metrics like loss and accuracy might hide: for example, learning a proxy classification task. Picasso works with the TensorFlow deep learning framework, and Keras (when the model can be loaded into the TensorFlow backend). Picasso can be used with minimal configuration by deep learning researchers and engineers alike across various neural network architectures. Adding new visualizations is simple: the user can specify their visualization code and HTML template separately from the application code.
Tasks
Published	2017-05-16
URL	http://arxiv.org/abs/1705.05627v3
PDF	http://arxiv.org/pdf/1705.05627v3.pdf
PWC	https://paperswithcode.com/paper/picasso-a-modular-framework-for-visualizing
Repo
Framework

Learning Inverse Statics Models Efficiently


Title	Learning Inverse Statics Models Efficiently
Authors	Rania Rayyes, Daniel Kubus, Carsten Hartmann, Jochen Steil
Abstract	Online Goal Babbling and Direction Sampling are recently proposed methods for direct learning of inverse kinematics mappings from scratch even in high-dimensional sensorimotor spaces following the paradigm of “learning while behaving”. To learn inverse statics mappings - primarily for gravity compensation - from scratch and without using any closed-loop controller, we modify and enhance the Online Goal Babbling and Direction Sampling schemes. Moreover, we exploit symmetries in the inverse statics mappings to drastically reduce the number of samples required for learning inverse statics models. Results for a 2R planar robot, a 3R simplified human arm, and a 4R humanoid robot arm clearly demonstrate that their inverse statics mappings can be learned successfully with our modified Online Goal Babbling scheme. Furthermore, we show that the number of samples required for the 2R and 3R arms can be reduced by a factor of at least 8 and 16, respectively, depending on the number of discovered symmetries.
Tasks
Published	2017-10-17
URL	http://arxiv.org/abs/1710.06463v1
PDF	http://arxiv.org/pdf/1710.06463v1.pdf
PWC	https://paperswithcode.com/paper/learning-inverse-statics-models-efficiently
Repo
Framework