July 27, 2019

Paper Group ANR 748

Motion Artifact Detection in Confocal Laser Endomicroscopy Images


Title	Motion Artifact Detection in Confocal Laser Endomicroscopy Images
Authors	Maike P. Stoeve, Marc Aubreville, Nicolai Oetter, Christian Knipfer, Helmut Neumann, Florian Stelzle, Andreas Maier
Abstract	Confocal Laser Endomicroscopy (CLE), an optical imaging technique allowing non-invasive examination of the mucosa on a (sub)cellular level, has proven to be a valuable diagnostic tool in gastroenterology and shows promising results in various anatomical regions including the oral cavity. Recently, the feasibility of automatic carcinoma detection for CLE images of sufficient quality was shown. However, in real-world data sets a large number of CLE images are corrupted by artifacts. Among the most prevalent artifact types are motion-induced image deteriorations. In the scope of this work, algorithmic approaches for the automatic detection of motion artifact-tainted image regions were developed. Hence, this work provides an important step towards clinical applicability of automatic carcinoma detection. Both conventional machine learning and novel deep learning-based approaches were assessed. The deep learning-based approach outperforms the conventional approaches, attaining an AUC of 0.90.
Tasks
Published	2017-11-03
URL	http://arxiv.org/abs/1711.01117v2
PDF	http://arxiv.org/pdf/1711.01117v2.pdf
PWC	https://paperswithcode.com/paper/motion-artifact-detection-in-confocal-laser
Repo
Framework

Rapid Adaptation with Conditionally Shifted Neurons


Title	Rapid Adaptation with Conditionally Shifted Neurons
Authors	Tsendsuren Munkhdalai, Xingdi Yuan, Soroush Mehri, Adam Trischler
Abstract	We describe a mechanism by which artificial neural networks can learn rapid adaptation - the ability to adapt on the fly, with little data, to new tasks - that we call conditionally shifted neurons. We apply this mechanism in the framework of metalearning, where the aim is to replicate some of the flexibility of human learning in machines. Conditionally shifted neurons modify their activation values with task-specific shifts retrieved from a memory module, which is populated rapidly based on limited task experience. On metalearning benchmarks from the vision and language domains, models augmented with conditionally shifted neurons achieve state-of-the-art results.
Tasks	Few-Shot Image Classification
Published	2017-12-28
URL	http://arxiv.org/abs/1712.09926v3
PDF	http://arxiv.org/pdf/1712.09926v3.pdf
PWC	https://paperswithcode.com/paper/rapid-adaptation-with-conditionally-shifted
Repo
Framework
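
The abstract describes neurons whose activations are modified by task-specific shifts retrieved from a memory module. A minimal sketch of that idea follows. The dot-product attention used for retrieval, the ReLU nonlinearity, and the names `retrieve_shift` and `shifted_relu` are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve_shift(query, keys, values):
    """Soft-attention read from a key/value memory (assumed retrieval rule).

    Each key is compared to the query by dot product; the returned shift is
    the attention-weighted average of the stored value vectors.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def shifted_relu(pre_activation, shift):
    """Add the task-specific shift to the pre-activation, then apply ReLU."""
    return [max(0.0, a + b) for a, b in zip(pre_activation, shift)]


# Toy memory with two entries, populated from a (hypothetical) small task sample.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[2.0, -1.0], [4.0, -3.0]]
shift = retrieve_shift([0.0, 0.0], keys, values)  # equal attention on both entries
activation = shifted_relu([-1.0, 0.5], shift)
```

With a zero query both memory entries receive equal weight, so the shift is the plain average of the stored values.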

Natural Language Inference from Multiple Premises


Title	Natural Language Inference from Multiple Premises
Authors	Alice Lai, Yonatan Bisk, Julia Hockenmaier
Abstract	We define a novel textual entailment task that requires inference over multiple premise sentences. We present a new dataset for this task that minimizes trivial lexical inferences, emphasizes knowledge of everyday events, and presents a more challenging setting for textual entailment. We evaluate several strong neural baselines and analyze how the multiple premise task differs from standard textual entailment.
Tasks	Natural Language Inference
Published	2017-10-09
URL	http://arxiv.org/abs/1710.02925v1
PDF	http://arxiv.org/pdf/1710.02925v1.pdf
PWC	https://paperswithcode.com/paper/natural-language-inference-from-multiple
Repo
Framework

On the Power of Learning from $k$-Wise Queries


Title	On the Power of Learning from $k$-Wise Queries
Authors	Vitaly Feldman, Badih Ghazi
Abstract	Several well-studied models of access to data samples, including statistical queries, local differential privacy and low-communication algorithms, rely on queries that provide information about a function of a single sample. (For example, a statistical query (SQ) gives an estimate of $\mathbf{E}_{x \sim D}[q(x)]$ for any choice of the query function $q$ mapping $X$ to the reals, where $D$ is an unknown data distribution over $X$.) Yet some data analysis algorithms rely on properties of functions that depend on multiple samples. Such algorithms would be naturally implemented using $k$-wise queries, each of which is specified by a function $q$ mapping $X^k$ to the reals. Hence it is natural to ask whether algorithms using $k$-wise queries can solve learning problems more efficiently and by how much. Blum, Kalai and Wasserman (2003) showed that for any weak PAC learning problem over a fixed distribution, the complexity of learning with $k$-wise SQs is smaller than the (unary) SQ complexity by a factor of at most $2^k$. We show that for more general problems over distributions the picture is substantially richer. For every $k$, the complexity of distribution-independent PAC learning with $k$-wise queries can be exponentially larger than learning with $(k+1)$-wise queries. We then give two approaches for simulating a $k$-wise query using unary queries. The first approach exploits the structure of the problem that needs to be solved. It generalizes and strengthens (exponentially) the results of Blum et al. It allows us to derive strong lower bounds for learning DNF formulas and stochastic constraint satisfaction problems that hold against algorithms using $k$-wise queries. The second approach exploits the $k$-party communication complexity of the $k$-wise query function.
Tasks
Published	2017-02-28
URL	http://arxiv.org/abs/1703.00066v1
PDF	http://arxiv.org/pdf/1703.00066v1.pdf
PWC	https://paperswithcode.com/paper/on-the-power-of-learning-from-k-wise-queries
Repo
Framework
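
To make the difference between a unary statistical query and a $k$-wise query concrete, the sketch below estimates both from samples. It is only an illustration: a real SQ oracle may return any value within some tolerance of the true expectation, and the empirical mean is used here as a stand-in. The function names and the use of disjoint $k$-tuples are assumptions, not constructions from the paper.

```python
import random


def statistical_query(samples, q):
    """Empirical estimate of E_{x ~ D}[q(x)] from a list of samples."""
    return sum(q(x) for x in samples) / len(samples)


def k_wise_query(samples, q, k):
    """Empirical estimate of E[q(x_1, ..., x_k)] over disjoint k-tuples.

    Leftover samples that do not fill a complete tuple are ignored.
    """
    tuples = [samples[i:i + k] for i in range(0, len(samples) - k + 1, k)]
    return sum(q(*t) for t in tuples) / len(tuples)


# Toy distribution D: uniform over {0, 1}.
rng = random.Random(0)
samples = [rng.randint(0, 1) for _ in range(10_000)]

# Unary query: the mean of x.
unary = statistical_query(samples, lambda x: x)

# 2-wise query: the probability that two independent samples collide,
# a property that depends on pairs of samples rather than single ones.
pairwise = k_wise_query(samples, lambda a, b: float(a == b), k=2)
```

The collision probability is a simple example of a quantity that is naturally expressed as a function of two samples at once.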

Interpreting Deep Visual Representations via Network Dissection


Title	Interpreting Deep Visual Representations via Network Dissection
Authors	Bolei Zhou, David Bau, Aude Oliva, Antonio Torralba
Abstract	The success of recent deep convolutional neural networks (CNNs) depends on learning hidden representations that can summarize the important factors of variation behind the data. However, CNNs are often criticized as being black boxes that lack interpretability, since they have millions of unexplained model parameters. In this work, we describe Network Dissection, a method that interprets networks by providing labels for the units of their deep visual representations. The proposed method quantifies the interpretability of CNN representations by evaluating the alignment between individual hidden units and a set of visual semantic concepts. By identifying the best alignments, units are given human interpretable labels across a range of objects, parts, scenes, textures, materials, and colors. The method reveals that deep representations are more transparent and interpretable than expected: we find that representations are significantly more interpretable than they would be under a random equivalently powerful basis. We apply the method to interpret and compare the latent representations of various network architectures trained to solve different supervised and self-supervised training tasks. We then examine factors affecting the network interpretability such as the number of training iterations, regularizations, different initializations, and the network depth and width. Finally, we show that the interpreted units can be used to provide explicit explanations of a prediction given by a CNN for an image. Our results highlight that interpretability is an important property of deep neural networks that provides new insights into their hierarchical structure.
Tasks
Published	2017-11-15
URL	http://arxiv.org/abs/1711.05611v2
PDF	http://arxiv.org/pdf/1711.05611v2.pdf
PWC	https://paperswithcode.com/paper/interpreting-deep-visual-representations-via
Repo
Framework
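
The abstract speaks of scoring the alignment between a hidden unit and a set of visual concepts, then labeling the unit with the best-aligned concept. One plausible way to sketch this is with intersection over union (IoU) between a thresholded activation map and binary concept masks. The fixed threshold, the flattened masks, and the helper names below are assumptions for illustration and do not reproduce the paper's exact procedure.

```python
def iou(unit_mask, concept_mask):
    """Intersection over union of two equal-length binary masks."""
    inter = sum(1 for u, c in zip(unit_mask, concept_mask) if u and c)
    union = sum(1 for u, c in zip(unit_mask, concept_mask) if u or c)
    return inter / union if union else 0.0


def best_label(activations, concept_masks, threshold):
    """Binarize a unit's activation map and return the best-aligned concept."""
    unit_mask = [a > threshold for a in activations]
    scores = {name: iou(unit_mask, mask) for name, mask in concept_masks.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]


# Toy example: a 6-pixel flattened activation map and two concept masks.
activations = [0.9, 0.8, 0.1, 0.0, 0.7, 0.2]
concepts = {
    "sky": [1, 1, 0, 0, 1, 0],
    "grass": [0, 0, 1, 1, 0, 1],
}
label, score = best_label(activations, concepts, threshold=0.5)
```

Here the unit fires exactly on the pixels marked "sky", so that concept wins with a perfect overlap score.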

DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling


Title	DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling
Authors	Xiaoguang Han, Chang Gao, Yizhou Yu
Abstract	Face modeling has received much attention in the field of visual computing. There exist many scenarios, including cartoon characters, avatars for social media, 3D face caricatures as well as face-related art and design, where low-cost interactive face modeling is a popular approach, especially among amateur users. In this paper, we propose a deep learning based sketching system for 3D face and caricature modeling. This system has a labor-efficient sketching interface that allows the user to draw freehand, imprecise yet expressive 2D lines representing the contours of facial features. A novel CNN based deep regression network is designed for inferring 3D face models from 2D sketches. Our network fuses both CNN and shape based features of the input sketch, and has two independent branches of fully connected layers generating independent subsets of coefficients for a bilinear face representation. Our system also supports gesture based interactions for users to further manipulate initial face models. Both user studies and numerical results indicate that our sketching system can help users create face models quickly and effectively. A significantly expanded face database with diverse identities, expressions and levels of exaggeration is constructed to promote further research and evaluation of face modeling techniques.
Tasks	Caricature
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02042v1
PDF	http://arxiv.org/pdf/1706.02042v1.pdf
PWC	https://paperswithcode.com/paper/deepsketch2face-a-deep-learning-based
Repo
Framework

Empirical Study of Drone Sound Detection in Real-Life Environment with Deep Neural Networks


Title	Empirical Study of Drone Sound Detection in Real-Life Environment with Deep Neural Networks
Authors	Sungho Jeon, Jong-Woo Shin, Young-Jun Lee, Woong-Hee Kim, YoungHyoun Kwon, Hae-Yong Yang
Abstract	This work aims to investigate the use of deep neural networks to detect commercial hobby drones in real-life environments by analyzing their sound data. The purpose of this work is to contribute to a system for detecting drones used for malicious purposes, such as terrorism. Specifically, we present a method capable of detecting the presence of commercial hobby drones as a binary classification problem based on sound event detection. We recorded the sound produced by a few popular commercial hobby drones, and then augmented this data with diverse environmental sound data to remedy the scarcity of drone sound data in diverse environments. We investigated the effectiveness of state-of-the-art event sound classification methods, i.e., a Gaussian Mixture Model (GMM), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN), for drone sound detection. Our empirical results, which were obtained with a testing dataset collected on an urban street, confirmed the effectiveness of these models for operating in a real environment. In summary, our RNN models showed the best detection performance, with an F-Score of 0.8009 on 240 ms of input audio and a short processing time, indicating their applicability to real-time detection systems.
Tasks	Sound Event Detection
Published	2017-01-20
URL	http://arxiv.org/abs/1701.05779v1
PDF	http://arxiv.org/pdf/1701.05779v1.pdf
PWC	https://paperswithcode.com/paper/empirical-study-of-drone-sound-detection-in
Repo
Framework

Leipzig Corpus Miner - A Text Mining Infrastructure for Qualitative Data Analysis


Title	Leipzig Corpus Miner - A Text Mining Infrastructure for Qualitative Data Analysis
Authors	Andreas Niekler, Gregor Wiedemann, Gerhard Heyer
Abstract	This paper presents the “Leipzig Corpus Miner”, a technical infrastructure for supporting qualitative and quantitative content analysis. The infrastructure aims at the integration of ‘close reading’ procedures on individual documents with procedures of ‘distant reading’, e.g. lexical characteristics of large document collections. Therefore information retrieval systems, lexicometric statistics and machine learning procedures are combined in a coherent framework which enables qualitative data analysts to make use of state-of-the-art Natural Language Processing techniques on very large document collections. Applicability of the framework ranges from social sciences to media studies and market research. As an example we introduce the usage of the framework in a political science study on post-democracy and neoliberalism.
Tasks	Information Retrieval
Published	2017-07-11
URL	http://arxiv.org/abs/1707.03253v1
PDF	http://arxiv.org/pdf/1707.03253v1.pdf
PWC	https://paperswithcode.com/paper/leipzig-corpus-miner-a-text-mining
Repo
Framework

Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decoding


Title	Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decoding
Authors	Kay Gregor Hartmann, Robin Tibor Schirrmeister, Tonio Ball
Abstract	Recently, there has been increasing interest in and research on the interpretability of machine learning models, for example how they transform and internally represent EEG signals in Brain-Computer Interface (BCI) applications. This can help to understand the limits of the model and how it may be improved, in addition to possibly providing insight about the data itself. Schirrmeister et al. (2017) have recently reported promising results for EEG decoding with deep convolutional neural networks (ConvNets) trained in an end-to-end manner and, with a causal visualization approach, showed that they learn to use spectral amplitude changes in the input. In this study, we investigate how ConvNets represent spectral features through the sequence of intermediate stages of the network. We show higher sensitivity to EEG phase features at earlier stages and higher sensitivity to EEG amplitude features at later stages. Intriguingly, we observed a specialization of individual stages of the network to the classical EEG frequency bands alpha, beta, and high gamma. Furthermore, we find first evidence that particularly in the last convolutional layer, the network learns to detect more complex oscillatory patterns beyond spectral phase and amplitude, reminiscent of the representation of complex visual features in later layers of ConvNets in computer vision tasks. Our findings thus provide insights into how ConvNets hierarchically represent spectral EEG features in their intermediate layers and suggest that ConvNets can exploit and might help to better understand the compositional structure of EEG time series.
Tasks	EEG, EEG Decoding, Time Series
Published	2017-11-21
URL	http://arxiv.org/abs/1711.07792v3
PDF	http://arxiv.org/pdf/1711.07792v3.pdf
PWC	https://paperswithcode.com/paper/hierarchical-internal-representation-of
Repo
Framework

Segmentation and Classification of Cine-MR Images Using Fully Convolutional Networks and Handcrafted Features


Title	Segmentation and Classification of Cine-MR Images Using Fully Convolutional Networks and Handcrafted Features
Authors	M. Hossein Eybposh, Mohammad Haghir Ebrahim-Abadi, Mohammad Jalilpour-Monesi, Seyed Saman Saboksayr
Abstract	Three-dimensional cine-MRI is of crucial importance for assessing cardiac function. Features that describe the anatomy and function of cardiac structures (e.g. Left Ventricle (LV), Right Ventricle (RV), and Myocardium (MC)) are known to have significant diagnostic value and can be computed from 3D cine-MR images. However, these features require precise segmentation of cardiac structures. Among the fully automated segmentation methods, Fully Convolutional Networks (FCN) with Skip Connections have shown robustness in medical segmentation problems. In this study, we develop a complete pipeline for classification of subjects with cardiac conditions based on 3D cine-MRI. For the segmentation task, we develop a 2D FCN and introduce Parallel Paths (PP) as a way to exploit the 3D information of the cine-MR image. For the classification task, 125 features were extracted from the segmented structures, describing their anatomy and function. Next, a two-stage pipeline for feature selection using the LASSO method is developed. A subset of 20 features is selected for classification. Each subject is classified using an ensemble of Logistic Regression, Multi-Layer Perceptron, and Support Vector Machine classifiers through majority voting. The Dice Coefficient for segmentation was 0.95±0.03, 0.89±0.13, and 0.90±0.03 for LV, RV, and MC respectively. The 8-fold cross validation accuracy for the classification task was 95.05% and 92.77% based on the ground truth and the proposed method's segmentations respectively. The results show that the PPs increase the segmentation accuracy by exploiting the spatial relations. Moreover, the classification algorithm and the features showed discriminability while keeping the sensitivity to segmentation error as low as possible.
Tasks	Feature Selection
Published	2017-09-08
URL	http://arxiv.org/abs/1709.02565v2
PDF	http://arxiv.org/pdf/1709.02565v2.pdf
PWC	https://paperswithcode.com/paper/segmentation-and-classification-of-cine-mr
Repo
Framework
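
Two quantities from the abstract above lend themselves to a short worked example: the Dice coefficient used to score segmentations, and majority voting across an ensemble of classifiers. The sketch below uses flattened binary masks and plain label lists. The function names, the toy masks, and the class labels "A" and "B" are hypothetical and serve only as illustration.

```python
from collections import Counter


def dice(pred, truth):
    """Dice coefficient 2|P ∩ T| / (|P| + |T|) for two binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 2 * inter / total if total else 1.0


def majority_vote(predictions):
    """Return the most frequent label among the classifiers' predictions.

    On a tie, Counter.most_common keeps the label that was seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


# Segmentation: predicted mask covers two pixels, ground truth covers one.
score = dice([1, 1, 0, 0], [1, 0, 0, 0])

# Classification: three hypothetical classifiers vote on one subject.
decision = majority_vote(["A", "B", "A"])
```

With three voters and more than two classes a three-way tie is possible, so a real pipeline would need an explicit tie-breaking rule.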

Multi-scale Forest Species Recognition Systems for Reduced Cost


Title	Multi-scale Forest Species Recognition Systems for Reduced Cost
Authors	Paulo R. Cavalin, Marcelo N. Kapp, Luiz S. Oliveira
Abstract	This work focuses on cost reduction methods for forest species recognition systems. The current state of the art shows that the accuracy of these systems has increased considerably in the past years, but the cost in time to perform the recognition of input samples has also increased proportionally. For this reason, in this work we focus on investigating methods for cost reduction locally (at either feature extraction or classification level individually) and globally (at both levels combined), and evaluate two main aspects: 1) the impact on cost reduction, given the proposed measures for it; and 2) the impact on recognition accuracy. The experimental evaluation conducted on two forest species datasets demonstrated that, with global cost reduction, the cost of the system can be reduced to less than 1/20 and recognition rates that are better than those of the original system can be achieved.
Tasks
Published	2017-09-12
URL	http://arxiv.org/abs/1709.04056v1
PDF	http://arxiv.org/pdf/1709.04056v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-forest-species-recognition
Repo
Framework

Learning Representations from Road Network for End-to-End Urban Growth Simulation


Title	Learning Representations from Road Network for End-to-End Urban Growth Simulation
Authors	Saptarshi Pal, Soumya K Ghosh
Abstract	From past experience, we have seen that the growth of cities is very much dependent on the transportation networks. In mega cities, transportation networks determine to a significant extent where people will move and houses will be built. Hence, transportation network data is crucial to an urban growth prediction system. Existing works have used manually derived distance based features based on the road networks to build models of urban growth. But due to the non-generic and laborious nature of the manual feature engineering process, we can shift to End-to-End systems which do not rely on manual feature engineering. In this paper, we propose a method to integrate road network data into an existing Rule based End-to-End framework without manual feature engineering. Our method employs recurrent neural networks to represent road networks in a structured way such that they can be plugged into the previously proposed End-to-End framework. The proposed approach enhances the performance in terms of Figure of Merit, Producer’s accuracy, User’s accuracy and Overall accuracy of the existing Rule based End-to-End framework.
Tasks	Feature Engineering
Published	2017-12-19
URL	http://arxiv.org/abs/1712.06778v3
PDF	http://arxiv.org/pdf/1712.06778v3.pdf
PWC	https://paperswithcode.com/paper/learning-representations-from-road-network
Repo
Framework

Story Cloze Ending Selection Baselines and Data Examination


Title	Story Cloze Ending Selection Baselines and Data Examination
Authors	Todor Mihaylov, Anette Frank
Abstract	This paper describes two supervised baseline systems for the Story Cloze Test Shared Task (Mostafazadeh et al., 2016a). We first build a classifier using features based on word embeddings and semantic similarity computation. We further implement a neural LSTM system with different encoding strategies that try to model the relation between the story and the provided endings. Our experiments show that a model using representation features based on average word embedding vectors over the given story words and the words of the candidate ending sentences, combined with similarity features between the story and candidate ending representations, performed better than the neural models. Our best model achieves an accuracy of 72.42, ranking 3rd in the official evaluation.
Tasks	Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published	2017-03-13
URL	http://arxiv.org/abs/1703.04330v1
PDF	http://arxiv.org/pdf/1703.04330v1.pdf
PWC	https://paperswithcode.com/paper/story-cloze-ending-selection-baselines-and
Repo
Framework
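
The best-performing features in the abstract above are averaged word embeddings of the story and of each candidate ending, together with similarity features between the two. The sketch below shows that representation in a simplified form. It picks the ending by cosine similarity alone rather than training a classifier as the paper does, and the toy two-dimensional embeddings and function names are assumptions made for illustration.

```python
import math


def average_vector(words, embeddings):
    """Mean of the embedding vectors of the known words (zeros if none)."""
    dim = len(next(iter(embeddings.values())))
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pick_ending(story, endings, embeddings):
    """Index of the ending whose averaged embedding is closest to the story's."""
    s = average_vector(story, embeddings)
    sims = [cosine(s, average_vector(e, embeddings)) for e in endings]
    return max(range(len(endings)), key=lambda i: sims[i])


# Hypothetical 2-d embeddings; real systems would load pretrained vectors.
toy = {
    "rain": [1.0, 0.0], "umbrella": [0.9, 0.1], "wet": [0.8, 0.2],
    "sun": [0.0, 1.0], "beach": [0.1, 0.9],
}
choice = pick_ending(["rain", "umbrella"], [["sun", "beach"], ["wet"]], toy)
```

In the paper's setting these similarity scores would be fed to a supervised classifier as features instead of being used directly for the decision.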

Saliency Fusion in Eigenvector Space with Multi-Channel Pulse Coupled Neural Network


Title	Saliency Fusion in Eigenvector Space with Multi-Channel Pulse Coupled Neural Network
Authors	Nevrez Imamoglu, Zhixuan Wei, Huangjun Shi, Yuki Yoshida, Myagmarbayar Nergui, Jose Gonzalez, Dongyun Gu, Weidong Chen, Kenzo Nonami, Wenwei Yu
Abstract	Saliency computation has become a popular research field for many applications due to the useful information provided by saliency maps. For a saliency map, local relations around the salient regions in a multi-channel perspective should be taken into consideration, aiming for uniformity on the region of interest as an internal approach. In addition, irrelevant salient regions have to be avoided as much as possible. Most works achieve these criteria with external processing modules; however, these can be accomplished during the conspicuity map fusion process. Therefore, in this paper, a new model is proposed for saliency/conspicuity map fusion with two concepts: a) input image transformation relying on principal component analysis (PCA), and b) saliency conspicuity map fusion with a multi-channel pulse coupled neural network (m-PCNN). Experimental results, which are evaluated by precision, recall, F-measure, and area under curve (AUC), support the reliability of the proposed method by enhancing the saliency computation.
Tasks
Published	2017-03-01
URL	http://arxiv.org/abs/1703.00160v1
PDF	http://arxiv.org/pdf/1703.00160v1.pdf
PWC	https://paperswithcode.com/paper/saliency-fusion-in-eigenvector-space-with
Repo
Framework

Misdirected Registration Uncertainty


Title	Misdirected Registration Uncertainty
Authors	Jie Luo, Karteek Popuri, Dana Cobzas, Hongyi Ding, William M. Wells III, Masashi Sugiyama
Abstract	Being a task of establishing spatial correspondences, medical image registration is often formalized as finding the optimal transformation that best aligns two images. Since the transformation is such an essential component of registration, most existing research conventionally quantifies the registration uncertainty, which is the confidence in the estimated spatial correspondences, by the transformation uncertainty. In this paper, we give concrete examples and reveal that using the transformation uncertainty to quantify the registration uncertainty is inappropriate and sometimes misleading. Based on this finding, we also raise attention to an important yet subtle aspect of probabilistic image registration, that is, whether it is reasonable to determine the correspondence of a registered voxel solely by the mode of its transformation distribution.
Tasks	Image Registration, Medical Image Registration
Published	2017-04-26
URL	http://arxiv.org/abs/1704.08121v2
PDF	http://arxiv.org/pdf/1704.08121v2.pdf
PWC	https://paperswithcode.com/paper/misdirected-registration-uncertainty
Repo
Framework