July 30, 2019


Paper Group AWR 47

Automatic skin lesion segmentation with fully convolutional-deconvolutional networks. Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings. Incorporat …

Automatic skin lesion segmentation with fully convolutional-deconvolutional networks


Title	Automatic skin lesion segmentation with fully convolutional-deconvolutional networks
Authors	Yading Yuan
Abstract	This paper summarizes our method and validation results for the ISBI Challenge 2017 - Skin Lesion Analysis Towards Melanoma Detection - Part I: Lesion Segmentation
Tasks	Lesion Segmentation
Published	2017-03-15
URL	http://arxiv.org/abs/1703.05165v2
PDF	http://arxiv.org/pdf/1703.05165v2.pdf
PWC	https://paperswithcode.com/paper/automatic-skin-lesion-segmentation-with-fully
Repo	https://github.com/manideep2510/melanoma_segmentation
Framework	tf

Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss


Title	Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss
Authors	Yehezkel S. Resheff, Amit Mandelbaum, Daphna Weinshall
Abstract	Deep learning has become the method of choice in many application domains of machine learning in recent years, especially for multi-class classification tasks. The most common loss function used in this context is the cross-entropy loss, which reduces to the log loss in the typical case when there is a single correct response label. While this loss is insensitive to the identity of the assigned class in the case of misclassification, in practice it is often the case that some errors may be more detrimental than others. Here we present the bilinear-loss (and related log-bilinear-loss) which differentially penalizes the different wrong assignments of the model. We thoroughly test this method using standard models and benchmark image datasets. As one application, we show the ability of this method to better contain error within the correct super-class, in the hierarchically labeled CIFAR100 dataset, without affecting the overall performance of the classifier.
Tasks
Published	2017-04-20
URL	http://arxiv.org/abs/1704.06062v1
PDF	http://arxiv.org/pdf/1704.06062v1.pdf
PWC	https://paperswithcode.com/paper/every-untrue-label-is-untrue-in-its-own-way
Repo	https://github.com/Hezi-Resheff/paper-log-bilinear-loss
Framework	tf
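
The abstract above describes a loss that penalizes different wrong assignments differently, but it does not state the formula. The following is a minimal sketch of that idea only, assuming a hypothetical cost matrix `cost[i][j]` (the penalty for putting probability on class `j` when the true class is `i`, with a zero diagonal). The function names and the exact functional forms are illustrative assumptions, not the authors' definitions.

```python
import math

def bilinear_penalty(true_class, probs, cost):
    """Expected cost of the predicted distribution under a cost matrix.

    cost[i][j] is an assumed penalty for mass on class j when the truth is i.
    """
    return sum(cost[true_class][j] * p for j, p in enumerate(probs))

def log_bilinear_penalty(true_class, probs, cost, eps=1e-12):
    """Log-style variant: penalize -log(1 - p_j), weighted by the cost matrix."""
    return -sum(
        cost[true_class][j] * math.log(max(1.0 - p, eps))
        for j, p in enumerate(probs)
    )

# Hypothetical 3-class example: confusing class 0 with class 2 costs more
# than confusing it with class 1.
COST = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 5.0],
    [5.0, 5.0, 0.0],
]
cheap_error = [0.6, 0.4, 0.0]   # wrong mass lands on the low-cost class
costly_error = [0.6, 0.0, 0.4]  # same wrong mass lands on the high-cost class
```

Both predictions give the true class the same probability, so plain cross-entropy would score them identically, while either penalty above scores `costly_error` higher than `cheap_error`.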

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer


Title	Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Authors	Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean
Abstract	The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation, where parts of the network are active on a per-example basis, has been proposed in theory as a way of dramatically increasing model capacity without a proportional increase in computation. In practice, however, there are significant algorithmic and performance challenges. In this work, we address these challenges and finally realize the promise of conditional computation, achieving greater than 1000x improvements in model capacity with only minor losses in computational efficiency on modern GPU clusters. We introduce a Sparsely-Gated Mixture-of-Experts layer (MoE), consisting of up to thousands of feed-forward sub-networks. A trainable gating network determines a sparse combination of these experts to use for each example. We apply the MoE to the tasks of language modeling and machine translation, where model capacity is critical for absorbing the vast quantities of knowledge available in the training corpora. We present model architectures in which a MoE with up to 137 billion parameters is applied convolutionally between stacked LSTM layers. On large language modeling and machine translation benchmarks, these models achieve significantly better results than state-of-the-art at lower computational cost.
Tasks	Language Modelling, Machine Translation
Published	2017-01-23
URL	http://arxiv.org/abs/1701.06538v1
PDF	http://arxiv.org/pdf/1701.06538v1.pdf
PWC	https://paperswithcode.com/paper/outrageously-large-neural-networks-the
Repo	https://github.com/unconst/MACH
Framework	tf
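
The sparse gating idea in the abstract above (a gating network picks a sparse combination of experts per example) can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the gate logits are passed in directly rather than produced by a trained network, the experts are plain Python callables, and the top-k-then-softmax rule is an assumption, since the abstract does not spell out the gating function.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gates(logits, k):
    """Return {expert_index: weight} for the k largest logits.

    Weights are a softmax over the selected logits only, so every other
    expert has an implicit weight of zero.
    """
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    return dict(zip(top, weights))

def moe_forward(x, gate_logits, experts, k=2):
    """Combine the outputs of the selected experts; unselected ones never run."""
    gates = top_k_gates(gate_logits, k)
    return sum(w * experts[i](x) for i, w in gates.items())

# Hypothetical toy experts acting on a scalar input.
EXPERTS = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
```

Because only `k` experts are evaluated per example, the compute per example depends on `k` rather than on the total number of experts, which is the property the abstract relies on to grow capacity without a proportional increase in computation.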

Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings


Title	Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings
Authors	Shane Settle, Keith Levin, Herman Kamper, Karen Livescu
Abstract	Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments. Recent work has shown that comparing speech segments by representing them as fixed-dimensional vectors — acoustic word embeddings — and measuring their vector distance (e.g., cosine distance) can discriminate between words more accurately than DTW-based approaches. We consider an approach to query-by-example search that embeds both the query and database segments according to a neural model, followed by nearest-neighbor search to find the matching segments. Earlier work on embedding-based query-by-example, using template-based acoustic word embeddings, achieved competitive performance. We find that our embeddings, based on recurrent neural networks trained to optimize word discrimination, achieve substantial improvements in performance and run-time efficiency over the previous approaches.
Tasks	Word Embeddings
Published	2017-06-12
URL	http://arxiv.org/abs/1706.03818v1
PDF	http://arxiv.org/pdf/1706.03818v1.pdf
PWC	https://paperswithcode.com/paper/query-by-example-search-with-discriminative
Repo	https://github.com/kamperh/recipe_semantic_flickraudio
Framework	tf
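
The search step described in the abstract above (embed the query and the database segments, then run nearest-neighbor search under a vector distance such as cosine distance) can be sketched as follows. The embeddings here are hand-written toy vectors; in the paper they come from a trained recurrent model, which this sketch does not attempt to reproduce. The function names are illustrative.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def nearest_segments(query, database, k=1):
    """Return indices of the k database embeddings closest to the query."""
    ranked = sorted(
        range(len(database)),
        key=lambda i: cosine_distance(query, database[i]),
    )
    return ranked[:k]

# Hypothetical 2-dimensional embeddings for three database segments.
DATABASE = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
QUERY = [1.0, 0.05]
```

Calling `nearest_segments(QUERY, DATABASE, k=2)` ranks segments 0 and 2 ahead of segment 1. A brute-force scan like this is linear in the database size; larger collections would typically use an approximate nearest-neighbor index instead.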

Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for the Diagnosis of Skin Lesions


Title	Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for the Diagnosis of Skin Lesions
Authors	Iván González Díaz
Abstract	This report describes our submission to the ISIC 2017 Challenge in Skin Lesion Analysis Towards Melanoma Detection. We have participated in the Part 3: Lesion Classification with a system for automatic diagnosis of nevus, melanoma and seborrheic keratosis. Our approach aims to incorporate the expert knowledge of dermatologists into the well known framework of Convolutional Neural Networks (CNN), which have shown impressive performance in many visual recognition tasks. In particular, we have designed several networks providing lesion area identification, lesion segmentation into structural patterns and final diagnosis of clinical cases. Furthermore, novel blocks for CNNs have been designed to integrate this information with the diagnosis processing pipeline.
Tasks	Lesion Segmentation
Published	2017-03-06
URL	http://arxiv.org/abs/1703.01976v3
PDF	http://arxiv.org/pdf/1703.01976v3.pdf
PWC	https://paperswithcode.com/paper/incorporating-the-knowledge-of-dermatologists
Repo	https://github.com/igondia/matconvnet-dermoscopy
Framework	none

Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks


Title	Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks
Authors	Diarmaid Conaty, Denis D. Mauá, Cassio P. de Campos
Abstract	We discuss the computational complexity of approximating maximum a posteriori inference in sum-product networks. We first show NP-hardness in trees of height two by a reduction from maximum independent set; this implies non-approximability within a sublinear factor. We show that this is a tight bound, as we can find an approximation within a linear factor in networks of height two. We then show that, in trees of height three, it is NP-hard to approximate the problem within a factor $2^{f(n)}$ for any sublinear function $f$ of the size of the input $n$. Again, this bound is tight, as we prove that the usual max-product algorithm finds (in any network) approximations within factor $2^{c \cdot n}$ for some constant $c < 1$. Last, we present a simple algorithm, and show that it provably produces solutions at least as good as, and potentially much better than, the max-product algorithm. We empirically analyze the proposed algorithm against max-product using synthetic and realistic networks.
Tasks
Published	2017-03-17
URL	http://arxiv.org/abs/1703.06045v5
PDF	http://arxiv.org/pdf/1703.06045v5.pdf
PWC	https://paperswithcode.com/paper/approximation-complexity-of-maximum-a
Repo	https://github.com/RenatoGeh/gospn
Framework	none

Deep Learning with Topological Signatures


Title	Deep Learning with Topological Signatures
Authors	Christoph Hofer, Roland Kwitt, Marc Niethammer, Andreas Uhl
Abstract	Inferring topological and geometrical information from data can offer an alternative perspective on machine learning problems. Methods from topological data analysis, e.g., persistent homology, enable us to obtain such information, typically in the form of summary representations of topological features. However, such topological signatures often come with an unusual structure (e.g., multisets of intervals) that is highly impractical for most machine learning techniques. While many strategies have been proposed to map these topological signatures into machine learning compatible representations, they suffer from being agnostic to the target learning task. In contrast, we propose a technique that enables us to input topological signatures to deep neural networks and learn a task-optimal representation during training. Our approach is realized as a novel input layer with favorable theoretical properties. Classification experiments on 2D object shapes and social network graphs demonstrate the versatility of the approach and, in case of the latter, we even outperform the state-of-the-art by a large margin.
Tasks	Topological Data Analysis
Published	2017-07-13
URL	http://arxiv.org/abs/1707.04041v3
PDF	http://arxiv.org/pdf/1707.04041v3.pdf
PWC	https://paperswithcode.com/paper/deep-learning-with-topological-signatures
Repo	https://github.com/billy-mosse/spiderman
Framework	none

Semi-supervised emotion lexicon expansion with label propagation and specialized word embeddings


Title	Semi-supervised emotion lexicon expansion with label propagation and specialized word embeddings
Authors	Mario Giulianelli
Abstract	There exist two main approaches to automatically extract affective orientation: lexicon-based and corpus-based. In this work, we argue that these two methods are compatible and show that combining them can improve the accuracy of emotion classifiers. In particular, we introduce a novel variant of the Label Propagation algorithm that is tailored to distributed word representations, we apply batch gradient descent to accelerate the optimization of label propagation and to make the optimization feasible for large graphs, and we propose a reproducible method for emotion lexicon expansion. We conclude that label propagation can expand an emotion lexicon in a meaningful way and that the expanded emotion lexicon can be leveraged to improve the accuracy of an emotion classifier.
Tasks	Word Embeddings
Published	2017-08-13
URL	http://arxiv.org/abs/1708.03910v1
PDF	http://arxiv.org/pdf/1708.03910v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-emotion-lexicon-expansion
Repo	https://github.com/Procope/emo2vec
Framework	tf
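
As a rough illustration of the label propagation idea in the abstract above, the sketch below runs the textbook iterative scheme on a small weighted graph: seed words keep their lexicon score, and every other word repeatedly takes the weighted average of its neighbours. This is not the paper's variant (which is tailored to word embeddings and optimized with batch gradient descent); the graph, the seed scores, and the function name are all hypothetical.

```python
def propagate_labels(weights, seeds, n_iters=50):
    """Iterative label propagation with clamped seeds.

    weights: symmetric adjacency matrix (list of lists), e.g. word similarities.
    seeds:   {node_index: score} for words already in the lexicon.
    Returns one score per node.
    """
    n = len(weights)
    scores = [seeds.get(i, 0.0) for i in range(n)]
    for _ in range(n_iters):
        updated = []
        for i in range(n):
            if i in seeds:
                updated.append(seeds[i])  # seed scores stay fixed
                continue
            total = sum(weights[i])
            if total == 0:
                updated.append(0.0)  # isolated node: no evidence
            else:
                updated.append(
                    sum(w * s for w, s in zip(weights[i], scores)) / total
                )
        scores = updated
    return scores

# Hypothetical chain graph: node 0 -- node 1 -- node 2.
CHAIN = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
```

With seeds `{0: 1.0, 2: 0.0}`, the unlabeled middle node settles at 0.5, halfway between its two equally weighted neighbours.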

Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning


Title	Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning
Authors	Hongge Chen, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Cho-Jui Hsieh
Abstract	Visual language grounding is widely studied in modern neural image captioning systems, which typically adopt an encoder-decoder framework consisting of two principal components: a convolutional neural network (CNN) for image feature extraction and a recurrent neural network (RNN) for language caption generation. To study the robustness of language grounding to adversarial perturbations in machine vision and perception, we propose Show-and-Fool, a novel algorithm for crafting adversarial examples in neural image captioning. The proposed algorithm provides two evaluation approaches, which check whether neural image captioning systems can be misled to output some randomly chosen captions or keywords. Our extensive experiments show that our algorithm can successfully craft visually-similar adversarial examples with randomly targeted captions or keywords, and the adversarial examples can be made highly transferable to other image captioning systems. Consequently, our approach leads to new robustness implications of neural image captioning and novel insights in visual language grounding.
Tasks	Image Captioning
Published	2017-12-06
URL	http://arxiv.org/abs/1712.02051v2
PDF	http://arxiv.org/pdf/1712.02051v2.pdf
PWC	https://paperswithcode.com/paper/attacking-visual-language-grounding-with
Repo	https://github.com/IBM/Image-Captioning-Attack
Framework	tf

DeepArchitect: Automatically Designing and Training Deep Architectures


Title	DeepArchitect: Automatically Designing and Training Deep Architectures
Authors	Renato Negrinho, Geoff Gordon
Abstract	In deep learning, performance is strongly affected by the choice of architecture and hyperparameters. While there has been extensive work on automatic hyperparameter optimization for simple spaces, complex spaces such as the space of deep architectures remain largely unexplored. As a result, the choice of architecture is done manually by the human expert through a slow trial and error process guided mainly by intuition. In this paper we describe a framework for automatically designing and training deep models. We propose an extensible and modular language that allows the human expert to compactly represent complex search spaces over architectures and their hyperparameters. The resulting search spaces are tree-structured and therefore easy to traverse. Models can be automatically compiled to computational graphs once values for all hyperparameters have been chosen. We can leverage the structure of the search space to introduce different model search algorithms, such as random search, Monte Carlo tree search (MCTS), and sequential model-based optimization (SMBO). We present experiments comparing the different algorithms on CIFAR-10 and show that MCTS and SMBO outperform random search. In addition, these experiments show that our framework can be used effectively for model discovery, as it is possible to describe expressive search spaces and discover competitive models without much effort from the human expert. Code for our framework and experiments has been made publicly available.
Tasks	Hyperparameter Optimization
Published	2017-04-28
URL	http://arxiv.org/abs/1704.08792v1
PDF	http://arxiv.org/pdf/1704.08792v1.pdf
PWC	https://paperswithcode.com/paper/deeparchitect-automatically-designing-and
Repo	https://github.com/negrinho/deep_architect_legacy
Framework	tf
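
To make the search-space idea in the abstract above concrete, here is a minimal sketch of random search over a tree-structured space, written as nested dictionaries whose leaves are lists of choices. It does not use the DeepArchitect language or API; the space, the `toy_score` function (a stand-in for training and validating a model), and all names are hypothetical.

```python
import random

def sample_config(space, rng):
    """Walk the tree: dicts are internal nodes, lists are choice points."""
    if isinstance(space, dict):
        return {name: sample_config(sub, rng) for name, sub in space.items()}
    return rng.choice(space)

def random_search(space, score_fn, n_trials=20, seed=0):
    """Sample n_trials configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_config(space, rng)
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Hypothetical search space over a small feed-forward architecture.
SPACE = {
    "num_layers": [1, 2, 3],
    "layer": {"units": [32, 64, 128], "activation": ["relu", "tanh"]},
}

def toy_score(cfg):
    """Placeholder for validation accuracy; a real run would train a model."""
    return cfg["num_layers"] * cfg["layer"]["units"]
```

Random search is the simplest of the three strategies the abstract lists; MCTS and SMBO would replace the uniform `rng.choice` with choices informed by the scores of earlier trials.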

Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces


Title	Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces
Authors	Garrett Warnell, Nicholas Waytowich, Vernon Lawhern, Peter Stone
Abstract	While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot of training data. One way to increase the speed at which agents are able to learn to perform tasks is by leveraging the input of human trainers. Although such input can take many forms, real-time, scalar-valued feedback is especially useful in situations where it proves difficult or impossible for humans to provide expert demonstrations. Previous approaches have shown the usefulness of human input provided in this fashion (e.g., the TAMER framework), but they have thus far not considered high-dimensional state spaces or employed the use of deep learning. In this paper, we do both: we propose Deep TAMER, an extension of the TAMER framework that leverages the representational power of deep neural networks in order to learn complex tasks in just a short amount of time with a human trainer. We demonstrate Deep TAMER’s success by using it and just 15 minutes of human-provided feedback to train an agent that performs better than humans on the Atari game of Bowling - a task that has proven difficult for even state-of-the-art reinforcement learning methods.
Tasks
Published	2017-09-28
URL	http://arxiv.org/abs/1709.10163v2
PDF	http://arxiv.org/pdf/1709.10163v2.pdf
PWC	https://paperswithcode.com/paper/deep-tamer-interactive-agent-shaping-in-high
Repo	https://github.com/JulienDesvergnes/human-reinforcement-learning
Framework	tf

Active Convolution: Learning the Shape of Convolution for Image Classification


Title	Active Convolution: Learning the Shape of Convolution for Image Classification
Authors	Yunho Jeon, Junmo Kim
Abstract	In recent years, deep learning has achieved great success in many computer vision applications. Convolutional neural networks (CNNs) have lately emerged as a major approach to image classification. Most research on CNNs thus far has focused on developing architectures such as the Inception and residual networks. The convolution layer is the core of the CNN, but few studies have addressed the convolution unit itself. In this paper, we introduce a convolution unit called the active convolution unit (ACU). The new convolution has no fixed shape, so we can define any form of convolution. Its shape can be learned through backpropagation during training. Our proposed unit has a few advantages. First, the ACU is a generalization of convolution; it can define not only all conventional convolutions, but also convolutions with fractional pixel coordinates. We can freely change the shape of the convolution, which provides greater freedom to form CNN structures. Second, the shape of the convolution is learned while training and there is no need to tune it by hand. Third, the ACU can learn better than a conventional unit: we obtained the improvement simply by changing the conventional convolution to an ACU. We tested our proposed method on plain and residual networks, and the results showed significant improvement using our method on various datasets and architectures in comparison with the baseline.
Tasks	Image Classification
Published	2017-03-27
URL	http://arxiv.org/abs/1703.09076v1
PDF	http://arxiv.org/pdf/1703.09076v1.pdf
PWC	https://paperswithcode.com/paper/active-convolution-learning-the-shape-of
Repo	https://github.com/jyh2986/Active-Convolution
Framework	none
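
The abstract above mentions convolutions with fractional pixel coordinates. One way to read a pixel value at a fractional position is bilinear interpolation; the sketch below assumes that choice and evaluates a single output of a convolution whose taps sit at arbitrary offsets from the centre. The offsets and weights are fixed here, whereas in the paper the shape is learned by backpropagation. All names are illustrative.

```python
import math

def bilinear_sample(image, y, x):
    """Interpolate image (list of rows) at a fractional (y, x) position.

    Assumes 0 <= y <= height - 1 and 0 <= x <= width - 1.
    """
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * image[y0][x0] + dx * image[y0][x1]
    bottom = (1 - dx) * image[y1][x0] + dx * image[y1][x1]
    return (1 - dy) * top + dy * bottom

def active_conv_at(image, cy, cx, offsets, weights):
    """One convolution output at (cy, cx) with taps at (possibly fractional) offsets."""
    return sum(
        w * bilinear_sample(image, cy + oy, cx + ox)
        for (oy, ox), w in zip(offsets, weights)
    )

# Hypothetical 3x3 image with values 0..8 in row-major order.
IMAGE = [
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
]
```

With integer offsets this reduces to an ordinary convolution tap; a tap at offset `(0.5, 0.5)` from the centre reads the average of the four surrounding pixels. Because the interpolated value varies smoothly with the offset, gradients with respect to the offsets are well defined, which is what allows the shape to be trained.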

Long Text Generation via Adversarial Training with Leaked Information


Title	Long Text Generation via Adversarial Training with Leaked Information
Authors	Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, Jun Wang
Abstract	Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets (GAN) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process. As such, it limits its success when the length of the generated text samples is long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address the problem for long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance. The generator incorporates such informative signals into all generation steps through an additional Manager module, which takes the extracted features of current generated words and outputs a latent vector to guide the Worker module for next-word generation. Our extensive experiments on synthetic data and various real-world tasks with Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios. More importantly, without any supervision, LeakGAN would be able to implicitly learn sentence structures only through the interaction between Manager and Worker.
Tasks	Text Generation
Published	2017-09-24
URL	http://arxiv.org/abs/1709.08624v2
PDF	http://arxiv.org/pdf/1709.08624v2.pdf
PWC	https://paperswithcode.com/paper/long-text-generation-via-adversarial-training
Repo	https://github.com/liyzcj/leakgan-py3
Framework	tf

SmoothGrad: removing noise by adding noise


Title	SmoothGrad: removing noise by adding noise
Authors	Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, Martin Wattenberg
Abstract	Explaining the output of a deep network remains a challenge. In the case of an image classifier, one type of explanation is to identify pixels that strongly influence the final decision. A starting point for this strategy is the gradient of the class score function with respect to the input image. This gradient can be interpreted as a sensitivity map, and there are several techniques that elaborate on this basic idea. This paper makes two contributions: it introduces SmoothGrad, a simple method that can help visually sharpen gradient-based sensitivity maps, and it discusses lessons in the visualization of these maps. We publish the code for our experiments and a website with our results.
Tasks	Interpretable Machine Learning
Published	2017-06-12
URL	http://arxiv.org/abs/1706.03825v1
PDF	http://arxiv.org/pdf/1706.03825v1.pdf
PWC	https://paperswithcode.com/paper/smoothgrad-removing-noise-by-adding-noise
Repo	https://github.com/saivarunr/xshap
Framework	tf
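
The abstract above does not give the SmoothGrad formula. The sketch below assumes the form in which the method is commonly described: average the gradient over several copies of the input perturbed with Gaussian noise. The gradient function is supplied by the caller; the toy `grad_sum_of_squares` stands in for a real network's class-score gradient, and the default sample count and noise level are arbitrary.

```python
import random

def smoothgrad(grad_fn, x, n_samples=50, sigma=0.1, seed=0):
    """Average grad_fn over n_samples Gaussian-perturbed copies of x.

    grad_fn: maps an input vector (list of floats) to its gradient vector.
    sigma:   standard deviation of the added noise (an assumed default).
    """
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        grad = grad_fn(noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]

def grad_sum_of_squares(x):
    """Gradient of f(x) = sum(x_i ** 2), used only as a toy example."""
    return [2.0 * xi for xi in x]
```

For this toy function the gradient is linear, so the smoothed map stays close to the plain gradient; the averaging matters for networks whose gradients fluctuate sharply between nearby inputs.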

Incremental Skip-gram Model with Negative Sampling


Title	Incremental Skip-gram Model with Negative Sampling
Authors	Nobuhiro Kaji, Hayato Kobayashi
Abstract	This paper explores an incremental training strategy for the skip-gram model with negative sampling (SGNS) from both empirical and theoretical perspectives. Existing methods of neural word embeddings, including SGNS, are multi-pass algorithms and thus cannot perform incremental model updates. To address this problem, we present a simple incremental extension of SGNS and provide a thorough theoretical analysis to demonstrate its validity. Empirical experiments demonstrated the correctness of the theoretical analysis as well as the practical usefulness of the incremental algorithm.
Tasks	Word Embeddings
Published	2017-04-13
URL	http://arxiv.org/abs/1704.03956v2
PDF	http://arxiv.org/pdf/1704.03956v2.pdf
PWC	https://paperswithcode.com/paper/incremental-skip-gram-model-with-negative
Repo	https://github.com/yahoojapan/yskip
Framework	none
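
For readers unfamiliar with SGNS, the sketch below computes the standard skip-gram negative-sampling loss for a single (center, context) pair and a set of sampled negative words. It covers only the usual per-pair objective, not the incremental extension the paper proposes, whose details the abstract does not give. Vectors are plain Python lists and the function names are illustrative.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sgns_loss(center, context, negatives):
    """Negative-sampling loss for one training pair.

    center:    embedding of the center word.
    context:   output embedding of the observed context word.
    negatives: output embeddings of sampled noise words.
    """
    loss = -log_sigmoid(dot(center, context))
    for neg in negatives:
        loss -= log_sigmoid(-dot(center, neg))
    return loss
```

The loss falls as the center vector aligns with the observed context vector and moves away from the sampled negatives. Drawing those negatives requires a noise distribution built from word counts, which in the usual setup is computed in a separate pass over the corpus.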