July 30, 2019

2946 words 14 mins read

Paper Group AWR 47

Automatic skin lesion segmentation with fully convolutional-deconvolutional networks. Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings. Incorporat …

Automatic skin lesion segmentation with fully convolutional-deconvolutional networks

Title Automatic skin lesion segmentation with fully convolutional-deconvolutional networks
Authors Yading Yuan
Abstract This paper summarizes our method and validation results for the ISBI Challenge 2017 - Skin Lesion Analysis Towards Melanoma Detection - Part I: Lesion Segmentation
Tasks Lesion Segmentation
Published 2017-03-15
URL http://arxiv.org/abs/1703.05165v2
PDF http://arxiv.org/pdf/1703.05165v2.pdf
PWC https://paperswithcode.com/paper/automatic-skin-lesion-segmentation-with-fully
Repo https://github.com/manideep2510/melanoma_segmentation
Framework tf

Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss

Title Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss
Authors Yehezkel S. Resheff, Amit Mandelbaum, Daphna Weinshall
Abstract Deep learning has become the method of choice in many application domains of machine learning in recent years, especially for multi-class classification tasks. The most common loss function used in this context is the cross-entropy loss, which reduces to the log loss in the typical case when there is a single correct response label. While this loss is insensitive to the identity of the assigned class in the case of misclassification, in practice it is often the case that some errors may be more detrimental than others. Here we present the bilinear-loss (and related log-bilinear-loss) which differentially penalizes the different wrong assignments of the model. We thoroughly test this method using standard models and benchmark image datasets. As one application, we show the ability of this method to better contain error within the correct super-class, in the hierarchically labeled CIFAR100 dataset, without affecting the overall performance of the classifier.
Tasks
Published 2017-04-20
URL http://arxiv.org/abs/1704.06062v1
PDF http://arxiv.org/pdf/1704.06062v1.pdf
PWC https://paperswithcode.com/paper/every-untrue-label-is-untrue-in-its-own-way
Repo https://github.com/Hezi-Resheff/paper-log-bilinear-loss
Framework tf
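
A minimal NumPy sketch of the cost-matrix idea behind the bilinear and log-bilinear losses follows; the exact functional form in the paper differs, and the 3-class cost matrix here is purely illustrative.

```python
import numpy as np

def bilinear_loss(probs, labels, cost):
    """Class-weighted loss in the spirit of the bilinear loss (a sketch).

    probs  : (batch, n_classes) softmax outputs
    labels : (batch,) integer true classes
    cost   : (n_classes, n_classes) penalty for predicting column j
             when the true class is row i (0 on the diagonal)
    """
    # Expected penalty under the predicted distribution, per example.
    return np.mean(np.sum(cost[labels] * probs, axis=1))

def log_bilinear_loss(probs, labels, cost, eps=1e-12):
    # Same idea, but penalties scale -log(1 - p_j), so confident wrong
    # predictions are punished much more heavily.
    return np.mean(np.sum(cost[labels] * -np.log(1.0 - probs + eps), axis=1))

# Toy example: confusing class 0 with class 1 costs twice as much
# as confusing it with class 2.
cost = np.array([[0., 2., 1.],
                 [2., 0., 1.],
                 [1., 1., 0.]])
probs = np.array([[0.7, 0.2, 0.1]])
print(bilinear_loss(probs, np.array([0]), cost))
```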

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

Title Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Authors Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean
Abstract The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation, where parts of the network are active on a per-example basis, has been proposed in theory as a way of dramatically increasing model capacity without a proportional increase in computation. In practice, however, there are significant algorithmic and performance challenges. In this work, we address these challenges and finally realize the promise of conditional computation, achieving greater than 1000x improvements in model capacity with only minor losses in computational efficiency on modern GPU clusters. We introduce a Sparsely-Gated Mixture-of-Experts layer (MoE), consisting of up to thousands of feed-forward sub-networks. A trainable gating network determines a sparse combination of these experts to use for each example. We apply the MoE to the tasks of language modeling and machine translation, where model capacity is critical for absorbing the vast quantities of knowledge available in the training corpora. We present model architectures in which a MoE with up to 137 billion parameters is applied convolutionally between stacked LSTM layers. On large language modeling and machine translation benchmarks, these models achieve significantly better results than state-of-the-art at lower computational cost.
Tasks Language Modelling, Machine Translation
Published 2017-01-23
URL http://arxiv.org/abs/1701.06538v1
PDF http://arxiv.org/pdf/1701.06538v1.pdf
PWC https://paperswithcode.com/paper/outrageously-large-neural-networks-the
Repo https://github.com/unconst/MACH
Framework tf
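
Below is a rough NumPy sketch of the sparse gating described in the abstract, with top-k expert selection; the paper's layer additionally uses tunable Gaussian noise in the gate and auxiliary load-balancing losses, all omitted here, and the experts are toy one-layer networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, expert_weights, gate_weights, k=2):
    """Sketch of a sparsely-gated mixture-of-experts layer (illustrative only).

    x              : (d_in,) input vector
    expert_weights : list of (d_in, d_out) matrices, one per expert
    gate_weights   : (d_in, n_experts) gating matrix
    k              : number of experts evaluated per example
    """
    logits = x @ gate_weights                      # gating scores
    top = np.argsort(logits)[-k:]                  # indices of the top-k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                           # softmax over the selected experts
    # Only the selected experts are evaluated; the rest are skipped entirely.
    return sum(g * np.tanh(x @ expert_weights[i]) for g, i in zip(gates, top))

d_in, d_out, n_experts = 8, 4, 16
experts = [rng.normal(size=(d_in, d_out)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d_in, n_experts))
print(moe_layer(rng.normal(size=d_in), experts, gate_w))
```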

Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings

Title Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings
Authors Shane Settle, Keith Levin, Herman Kamper, Karen Livescu
Abstract Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments. Recent work has shown that comparing speech segments by representing them as fixed-dimensional vectors — acoustic word embeddings — and measuring their vector distance (e.g., cosine distance) can discriminate between words more accurately than DTW-based approaches. We consider an approach to query-by-example search that embeds both the query and database segments according to a neural model, followed by nearest-neighbor search to find the matching segments. Earlier work on embedding-based query-by-example, using template-based acoustic word embeddings, achieved competitive performance. We find that our embeddings, based on recurrent neural networks trained to optimize word discrimination, achieve substantial improvements in performance and run-time efficiency over the previous approaches.
Tasks Word Embeddings
Published 2017-06-12
URL http://arxiv.org/abs/1706.03818v1
PDF http://arxiv.org/pdf/1706.03818v1.pdf
PWC https://paperswithcode.com/paper/query-by-example-search-with-discriminative
Repo https://github.com/kamperh/recipe_semantic_flickraudio
Framework tf
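
The search procedure itself is simple once an embedding function exists. Here is a hedged NumPy sketch in which `embed` is a hypothetical stand-in for the trained recurrent embedding network; ranking is by cosine similarity, as in the abstract.

```python
import numpy as np

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def query_by_example(embed, query_segment, database_segments, top_n=5):
    """Embed the query and every database segment, then rank by cosine
    similarity.  `embed` is a placeholder for the trained embedding model."""
    q = embed(query_segment)
    scored = [(cosine_sim(q, embed(seg)), idx)
              for idx, seg in enumerate(database_segments)]
    return sorted(scored, reverse=True)[:top_n]

# Toy stand-in embedding: mean-pool the frames of a segment.
toy_embed = lambda segment: segment.mean(axis=0)

rng = np.random.default_rng(0)
db = [rng.normal(size=(rng.integers(20, 60), 39)) for _ in range(100)]  # MFCC-like frames
print(query_by_example(toy_embed, db[3], db))
```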

Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for the Diagnosis of Skin Lesions

Title Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for the Diagnosis of Skin Lesions
Authors Iván González Díaz
Abstract This report describes our submission to the ISIC 2017 Challenge in Skin Lesion Analysis Towards Melanoma Detection. We have participated in the Part 3: Lesion Classification with a system for automatic diagnosis of nevus, melanoma and seborrheic keratosis. Our approach aims to incorporate the expert knowledge of dermatologists into the well known framework of Convolutional Neural Networks (CNN), which have shown impressive performance in many visual recognition tasks. In particular, we have designed several networks providing lesion area identification, lesion segmentation into structural patterns and final diagnosis of clinical cases. Furthermore, novel blocks for CNNs have been designed to integrate this information with the diagnosis processing pipeline.
Tasks Lesion Segmentation
Published 2017-03-06
URL http://arxiv.org/abs/1703.01976v3
PDF http://arxiv.org/pdf/1703.01976v3.pdf
PWC https://paperswithcode.com/paper/incorporating-the-knowledge-of-dermatologists
Repo https://github.com/igondia/matconvnet-dermoscopy
Framework none

Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks

Title Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks
Authors Diarmaid Conaty, Denis D. Mauá, Cassio P. de Campos
Abstract We discuss the computational complexity of approximating maximum a posteriori inference in sum-product networks. We first show NP-hardness in trees of height two by a reduction from maximum independent set; this implies non-approximability within a sublinear factor. We show that this is a tight bound, as we can find an approximation within a linear factor in networks of height two. We then show that, in trees of height three, it is NP-hard to approximate the problem within a factor $2^{f(n)}$ for any sublinear function $f$ of the size of the input $n$. Again, this bound is tight, as we prove that the usual max-product algorithm finds (in any network) approximations within factor $2^{c \cdot n}$ for some constant $c < 1$. Last, we present a simple algorithm, and show that it provably produces solutions at least as good as, and potentially much better than, the max-product algorithm. We empirically analyze the proposed algorithm against max-product using synthetic and realistic networks.
Tasks
Published 2017-03-17
URL http://arxiv.org/abs/1703.06045v5
PDF http://arxiv.org/pdf/1703.06045v5.pdf
PWC https://paperswithcode.com/paper/approximation-complexity-of-maximum-a
Repo https://github.com/RenatoGeh/gospn
Framework none
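
As a point of reference for the max-product baseline discussed above, here is a small Python sketch that approximates MAP in a toy SPN by replacing each sum node with a max over its weighted children; the node encoding is ad hoc, evidence handling is omitted, and the paper's improved algorithm is not shown.

```python
# An SPN node is ('leaf', var, {value: prob}), ('prod', children) or
# ('sum', [(weight, child), ...]).  Max-product replaces each sum node
# by a max over its weighted children and follows the chosen branch.

def max_product(node, assignment):
    kind = node[0]
    if kind == 'leaf':
        _, var, dist = node
        value, prob = max(dist.items(), key=lambda kv: kv[1])
        assignment[var] = value
        return prob
    if kind == 'prod':
        result = 1.0
        for child in node[1]:
            result *= max_product(child, assignment)
        return result
    # Sum node: keep only the single best weighted child (the approximation).
    best_w, best_child = max(node[1], key=lambda wc: wc[0] * max_product(wc[1], {}))
    return best_w * max_product(best_child, assignment)

spn = ('sum', [(0.6, ('prod', [('leaf', 'X', {0: 0.3, 1: 0.7}),
                               ('leaf', 'Y', {0: 0.9, 1: 0.1})])),
               (0.4, ('prod', [('leaf', 'X', {0: 0.8, 1: 0.2}),
                               ('leaf', 'Y', {0: 0.5, 1: 0.5})]))])
best = {}
print(max_product(spn, best), best)
```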

Deep Learning with Topological Signatures

Title Deep Learning with Topological Signatures
Authors Christoph Hofer, Roland Kwitt, Marc Niethammer, Andreas Uhl
Abstract Inferring topological and geometrical information from data can offer an alternative perspective on machine learning problems. Methods from topological data analysis, e.g., persistent homology, enable us to obtain such information, typically in the form of summary representations of topological features. However, such topological signatures often come with an unusual structure (e.g., multisets of intervals) that is highly impractical for most machine learning techniques. While many strategies have been proposed to map these topological signatures into machine learning compatible representations, they suffer from being agnostic to the target learning task. In contrast, we propose a technique that enables us to input topological signatures to deep neural networks and learn a task-optimal representation during training. Our approach is realized as a novel input layer with favorable theoretical properties. Classification experiments on 2D object shapes and social network graphs demonstrate the versatility of the approach and, in case of the latter, we even outperform the state-of-the-art by a large margin.
Tasks Topological Data Analysis
Published 2017-07-13
URL http://arxiv.org/abs/1707.04041v3
PDF http://arxiv.org/pdf/1707.04041v3.pdf
PWC https://paperswithcode.com/paper/deep-learning-with-topological-signatures
Repo https://github.com/billy-mosse/spiderman
Framework none
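
A heavily simplified NumPy sketch of the core idea is shown below: a persistence diagram (a multiset of birth-death points) is mapped to a fixed-size vector by summing parameterized Gaussian-like structure elements whose centers and sharpness would be learned. The paper's actual parametrization (rotated coordinates, special treatment of points near the diagonal) is more involved.

```python
import numpy as np

def topo_input_layer(diagram, centers, sharpness):
    """Map a persistence diagram to a fixed-size vector (a sketch).

    diagram   : iterable of (birth, death) points
    centers   : (n_units, 2) learnable centers
    sharpness : (n_units,) learnable sharpness parameters
    """
    pts = np.asarray(diagram, dtype=float)                  # (n_points, 2)
    diffs = pts[:, None, :] - centers[None, :, :]           # (n_points, n_units, 2)
    sq = np.sum(diffs ** 2, axis=-1)                        # squared distances
    return np.sum(np.exp(-(sharpness ** 2) * sq), axis=0)   # (n_units,)

rng = np.random.default_rng(0)
diagram = [(0.1, 0.9), (0.2, 0.4), (0.5, 0.6)]              # toy persistence diagram
centers = rng.uniform(0, 1, size=(8, 2))                    # learnable in a real model
sharpness = np.ones(8)
print(topo_input_layer(diagram, centers, sharpness))
```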

Semi-supervised emotion lexicon expansion with label propagation and specialized word embeddings

Title Semi-supervised emotion lexicon expansion with label propagation and specialized word embeddings
Authors Mario Giulianelli
Abstract There exist two main approaches to automatically extract affective orientation: lexicon-based and corpus-based. In this work, we argue that these two methods are compatible and show that combining them can improve the accuracy of emotion classifiers. In particular, we introduce a novel variant of the Label Propagation algorithm that is tailored to distributed word representations, we apply batch gradient descent to accelerate the optimization of label propagation and to make the optimization feasible for large graphs, and we propose a reproducible method for emotion lexicon expansion. We conclude that label propagation can expand an emotion lexicon in a meaningful way and that the expanded emotion lexicon can be leveraged to improve the accuracy of an emotion classifier.
Tasks Word Embeddings
Published 2017-08-13
URL http://arxiv.org/abs/1708.03910v1
PDF http://arxiv.org/pdf/1708.03910v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-emotion-lexicon-expansion
Repo https://github.com/Procope/emo2vec
Framework tf
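
For orientation, here is a NumPy sketch of plain label spreading over a k-NN graph built from word embeddings; the paper's variant of Label Propagation is optimized with batch gradient descent rather than this fixed-point iteration, and the seed lexicon below is a toy placeholder.

```python
import numpy as np

def propagate_labels(embeddings, seed_labels, n_classes, k=10,
                     alpha=0.9, n_iters=50):
    """Spread a few seed emotion labels over a k-NN similarity graph."""
    n = len(embeddings)
    normed = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-12)
    sim = normed @ normed.T                                 # cosine similarities
    np.fill_diagonal(sim, -np.inf)
    W = np.zeros((n, n))
    for i in range(n):                                      # keep only the k nearest neighbours
        nbrs = np.argsort(sim[i])[-k:]
        W[i, nbrs] = np.maximum(sim[i, nbrs], 0.0)
    W = np.maximum(W, W.T)                                  # symmetrize
    d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1) + 1e-12))
    S = d_inv_sqrt @ W @ d_inv_sqrt                         # normalized affinity

    Y = np.zeros((n, n_classes))
    for idx, cls in seed_labels.items():                    # one-hot seed labels
        Y[idx, cls] = 1.0
    F = Y.copy()
    for _ in range(n_iters):                                # label-spreading iteration
        F = alpha * S @ F + (1 - alpha) * Y
    return F.argmax(axis=1)

rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 50))                            # toy word embeddings
print(propagate_labels(emb, {0: 0, 1: 1, 2: 2}, n_classes=3)[:10])
```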

Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning

Title Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning
Authors Hongge Chen, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Cho-Jui Hsieh
Abstract Visual language grounding is widely studied in modern neural image captioning systems, which typically adopt an encoder-decoder framework consisting of two principal components: a convolutional neural network (CNN) for image feature extraction and a recurrent neural network (RNN) for language caption generation. To study the robustness of language grounding to adversarial perturbations in machine vision and perception, we propose Show-and-Fool, a novel algorithm for crafting adversarial examples in neural image captioning. The proposed algorithm provides two evaluation approaches, which check whether neural image captioning systems can be misled to output some randomly chosen captions or keywords. Our extensive experiments show that our algorithm can successfully craft visually-similar adversarial examples with randomly targeted captions or keywords, and the adversarial examples can be made highly transferable to other image captioning systems. Consequently, our approach leads to new robustness implications of neural image captioning and novel insights in visual language grounding.
Tasks Image Captioning
Published 2017-12-06
URL http://arxiv.org/abs/1712.02051v2
PDF http://arxiv.org/pdf/1712.02051v2.pdf
PWC https://paperswithcode.com/paper/attacking-visual-language-grounding-with
Repo https://github.com/IBM/Image-Captioning-Attack
Framework tf
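
The attack reduces to minimizing a targeted caption loss plus a distortion penalty over an image perturbation. The sketch below is generic and assumes a hypothetical `caption_loss` callable that returns the loss and its gradient; Show-and-Fool itself uses a more careful objective and an Adam-style optimizer.

```python
import numpy as np

def targeted_attack(image, caption_loss, step=0.01, c=1.0, n_iters=100):
    """Find a small perturbation that drives a captioner toward a target
    caption (a generic sketch, not the paper's exact procedure)."""
    delta = np.zeros_like(image)
    for _ in range(n_iters):
        loss, grad = caption_loss(image + delta)
        total_grad = grad + c * 2.0 * delta                 # caption loss + distortion penalty
        delta -= step * total_grad
        delta = np.clip(image + delta, 0.0, 1.0) - image    # keep the adversarial image valid
    return delta

# Toy stand-in "model": the loss is the squared distance of the image mean
# to a target value, just to make the sketch runnable end to end.
target_stat = 0.25
toy_loss = lambda img: ((img.mean() - target_stat) ** 2,
                        np.full_like(img, 2 * (img.mean() - target_stat) / img.size))
rng = np.random.default_rng(0)
img = rng.uniform(0, 1, size=(32, 32, 3))
print(np.abs(targeted_attack(img, toy_loss)).max())
```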

DeepArchitect: Automatically Designing and Training Deep Architectures

Title DeepArchitect: Automatically Designing and Training Deep Architectures
Authors Renato Negrinho, Geoff Gordon
Abstract In deep learning, performance is strongly affected by the choice of architecture and hyperparameters. While there has been extensive work on automatic hyperparameter optimization for simple spaces, complex spaces such as the space of deep architectures remain largely unexplored. As a result, the choice of architecture is done manually by the human expert through a slow trial and error process guided mainly by intuition. In this paper we describe a framework for automatically designing and training deep models. We propose an extensible and modular language that allows the human expert to compactly represent complex search spaces over architectures and their hyperparameters. The resulting search spaces are tree-structured and therefore easy to traverse. Models can be automatically compiled to computational graphs once values for all hyperparameters have been chosen. We can leverage the structure of the search space to introduce different model search algorithms, such as random search, Monte Carlo tree search (MCTS), and sequential model-based optimization (SMBO). We present experiments comparing the different algorithms on CIFAR-10 and show that MCTS and SMBO outperform random search. In addition, these experiments show that our framework can be used effectively for model discovery, as it is possible to describe expressive search spaces and discover competitive models without much effort from the human expert. Code for our framework and experiments has been made publicly available.
Tasks Hyperparameter Optimization
Published 2017-04-28
URL http://arxiv.org/abs/1704.08792v1
PDF http://arxiv.org/pdf/1704.08792v1.pdf
PWC https://paperswithcode.com/paper/deeparchitect-automatically-designing-and
Repo https://github.com/negrinho/deep_architect_legacy
Framework tf
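
To illustrate searching a tree-structured space, here is a small Python sketch of random search over a toy, hand-written space; it is not the paper's search-space language, and the evaluator is a placeholder for compile-train-validate.

```python
import random

# Toy tree-structured search space: nested dicts are composite choices,
# lists are the values a single hyperparameter can take.
search_space = {
    'n_layers': [2, 3, 4],
    'layer': {
        'units': [64, 128, 256],
        'activation': ['relu', 'tanh'],
    },
    'optimizer': {
        'name': ['sgd', 'adam'],
        'lr': [1e-1, 1e-2, 1e-3],
    },
}

def sample(space):
    """Recursively pick one value for every choice in the space."""
    if isinstance(space, dict):
        return {key: sample(value) for key, value in space.items()}
    return random.choice(space)

def random_search(evaluate, n_trials=20):
    best_score, best_arch = float('-inf'), None
    for _ in range(n_trials):
        arch = sample(search_space)
        score = evaluate(arch)            # train-and-validate in a real setting
        if score > best_score:
            best_score, best_arch = score, arch
    return best_score, best_arch

# Hypothetical evaluator standing in for compiling the architecture to a
# computational graph, training it, and returning validation accuracy.
toy_evaluate = lambda arch: random.random()
print(random_search(toy_evaluate))
```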

Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces

Title Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces
Authors Garrett Warnell, Nicholas Waytowich, Vernon Lawhern, Peter Stone
Abstract While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot of training data. One way to increase the speed at which agents are able to learn to perform tasks is by leveraging the input of human trainers. Although such input can take many forms, real-time, scalar-valued feedback is especially useful in situations where it proves difficult or impossible for humans to provide expert demonstrations. Previous approaches have shown the usefulness of human input provided in this fashion (e.g., the TAMER framework), but they have thus far not considered high-dimensional state spaces or employed the use of deep learning. In this paper, we do both: we propose Deep TAMER, an extension of the TAMER framework that leverages the representational power of deep neural networks in order to learn complex tasks in just a short amount of time with a human trainer. We demonstrate Deep TAMER’s success by using it and just 15 minutes of human-provided feedback to train an agent that performs better than humans on the Atari game of Bowling - a task that has proven difficult for even state-of-the-art reinforcement learning methods.
Tasks
Published 2017-09-28
URL http://arxiv.org/abs/1709.10163v2
PDF http://arxiv.org/pdf/1709.10163v2.pdf
PWC https://paperswithcode.com/paper/deep-tamer-interactive-agent-shaping-in-high
Repo https://github.com/JulienDesvergnes/human-reinforcement-learning
Framework tf
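
The core loop is to regress a model of the human's scalar feedback and act greedily with respect to it. The sketch below uses a linear model in place of the deep network and omits the paper's credit-assignment window and replay buffer.

```python
import numpy as np

rng = np.random.default_rng(0)

class RewardModel:
    """Toy stand-in for the Deep TAMER feedback model H(s, a)."""

    def __init__(self, state_dim, n_actions, lr=0.05):
        self.w = np.zeros((n_actions, state_dim))
        self.lr = lr

    def predict(self, state):
        return self.w @ state                       # predicted human feedback per action

    def update(self, state, action, human_feedback):
        # One SGD step on the squared error between predicted and given feedback.
        error = self.predict(state)[action] - human_feedback
        self.w[action] -= self.lr * error * state

    def act(self, state):
        return int(np.argmax(self.predict(state)))  # greedy w.r.t. predicted feedback

model = RewardModel(state_dim=4, n_actions=3)
state = rng.normal(size=4)
model.update(state, action=model.act(state), human_feedback=+1.0)
print(model.act(state))
```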

Active Convolution: Learning the Shape of Convolution for Image Classification

Title Active Convolution: Learning the Shape of Convolution for Image Classification
Authors Yunho Jeon, Junmo Kim
Abstract In recent years, deep learning has achieved great success in many computer vision applications. Convolutional neural networks (CNNs) have lately emerged as a major approach to image classification. Most research on CNNs thus far has focused on developing architectures such as the Inception and residual networks. The convolution layer is the core of the CNN, but few studies have addressed the convolution unit itself. In this paper, we introduce a convolution unit called the active convolution unit (ACU). A new convolution has no fixed shape, because of which we can define any form of convolution. Its shape can be learned through backpropagation during training. Our proposed unit has a few advantages. First, the ACU is a generalization of convolution; it can define not only all conventional convolutions, but also convolutions with fractional pixel coordinates. We can freely change the shape of the convolution, which provides greater freedom to form CNN structures. Second, the shape of the convolution is learned while training and there is no need to tune it by hand. Third, the ACU can learn better than a conventional unit, where we obtained the improvement simply by changing the conventional convolution to an ACU. We tested our proposed method on plain and residual networks, and the results showed significant improvement using our method on various datasets and architectures in comparison with the baseline.
Tasks Image Classification
Published 2017-03-27
URL http://arxiv.org/abs/1703.09076v1
PDF http://arxiv.org/pdf/1703.09076v1.pdf
PWC https://paperswithcode.com/paper/active-convolution-learning-the-shape-of
Repo https://github.com/jyh2986/Active-Convolution
Framework none
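
A rough NumPy sketch of the sampling step is given below: each tap of the filter has a fractional offset and reads the input through bilinear interpolation; in the paper both the weights and the offsets are trained by backpropagation, which is not shown here.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample a 2-D array at fractional coordinates via bilinear interpolation."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, img.shape[0] - 1), min(x0 + 1, img.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (img[y0, x0] * (1 - dy) * (1 - dx) + img[y0, x1] * (1 - dy) * dx +
            img[y1, x0] * dy * (1 - dx) + img[y1, x1] * dy * dx)

def active_conv2d(img, weights, offsets):
    """Sketch of an active convolution unit: each tap has a fractional
    (dy, dx) position instead of a fixed grid location."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            out[i - 1, j - 1] = sum(
                w * bilinear_sample(img, i + dy, j + dx)
                for w, (dy, dx) in zip(weights, offsets))
    return out

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))
offsets = [(-1.0, -0.5), (0.0, 0.0), (0.3, 1.2), (1.0, -1.0)]   # fractional taps
weights = rng.normal(size=len(offsets))
print(active_conv2d(img, weights, offsets).shape)
```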

Long Text Generation via Adversarial Training with Leaked Information

Title Long Text Generation via Adversarial Training with Leaked Information
Authors Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, Jun Wang
Abstract Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets (GANs) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process. As such, this limits their success when the length of the generated text samples is long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address the problem for long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance. The generator incorporates such informative signals into all generation steps through an additional Manager module, which takes the extracted features of current generated words and outputs a latent vector to guide the Worker module for next-word generation. Our extensive experiments on synthetic data and various real-world tasks with Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios. More importantly, without any supervision, LeakGAN would be able to implicitly learn sentence structures only through the interaction between Manager and Worker.
Tasks Text Generation
Published 2017-09-24
URL http://arxiv.org/abs/1709.08624v2
PDF http://arxiv.org/pdf/1709.08624v2.pdf
PWC https://paperswithcode.com/paper/long-text-generation-via-adversarial-training
Repo https://github.com/liyzcj/leakgan-py3
Framework tf
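
The sketch below shows one generation step with toy linear modules: the discriminator's leaked feature is mapped by the Manager to a goal vector, and the Worker scores the next word conditioned on that goal. The real model uses LSTMs for both modules and trains them with policy gradients; all matrices here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, feat_dim, goal_dim, hidden_dim = 1000, 64, 16, 32
W_manager = rng.normal(scale=0.1, size=(goal_dim, feat_dim))   # Manager: feature -> goal
W_goal = rng.normal(scale=0.1, size=(hidden_dim, goal_dim))    # goal -> Worker hidden space
W_worker = rng.normal(scale=0.1, size=(vocab_size, hidden_dim))

def generate_step(leaked_feature, worker_hidden):
    goal = W_manager @ leaked_feature                     # Manager turns the leaked feature into a goal
    goal = goal / (np.linalg.norm(goal) + 1e-12)
    logits = W_worker @ (worker_hidden + W_goal @ goal)   # Worker: goal-conditioned word scores
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(vocab_size, p=probs)                # sample the next word

f_t = rng.normal(size=feat_dim)        # feature leaked by the discriminator
h_t = rng.normal(size=hidden_dim)      # Worker's current hidden state
print(generate_step(f_t, h_t))
```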

SmoothGrad: removing noise by adding noise

Title SmoothGrad: removing noise by adding noise
Authors Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, Martin Wattenberg
Abstract Explaining the output of a deep network remains a challenge. In the case of an image classifier, one type of explanation is to identify pixels that strongly influence the final decision. A starting point for this strategy is the gradient of the class score function with respect to the input image. This gradient can be interpreted as a sensitivity map, and there are several techniques that elaborate on this basic idea. This paper makes two contributions: it introduces SmoothGrad, a simple method that can help visually sharpen gradient-based sensitivity maps, and it discusses lessons in the visualization of these maps. We publish the code for our experiments and a website with our results.
Tasks Interpretable Machine Learning
Published 2017-06-12
URL http://arxiv.org/abs/1706.03825v1
PDF http://arxiv.org/pdf/1706.03825v1.pdf
PWC https://paperswithcode.com/paper/smoothgrad-removing-noise-by-adding-noise
Repo https://github.com/saivarunr/xshap
Framework tf
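
The method itself is a few lines: average the gradient-based sensitivity map over several noisy copies of the input. Below is a NumPy sketch in which `grad_fn` stands in for the gradient of the class score with respect to the image.

```python
import numpy as np

def smoothgrad(grad_fn, image, n_samples=50, noise_level=0.15, seed=0):
    """Average sensitivity maps over noisy copies of the input (a sketch)."""
    rng = np.random.default_rng(seed)
    sigma = noise_level * (image.max() - image.min())   # noise scaled to the input range
    total = np.zeros_like(image)
    for _ in range(n_samples):
        noisy = image + rng.normal(scale=sigma, size=image.shape)
        total += grad_fn(noisy)
    return total / n_samples

# Toy gradient function, just to make the sketch runnable.
toy_grad = lambda img: np.sign(img)
rng = np.random.default_rng(1)
img = rng.uniform(size=(8, 8))
print(smoothgrad(toy_grad, img).shape)
```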

Incremental Skip-gram Model with Negative Sampling

Title Incremental Skip-gram Model with Negative Sampling
Authors Nobuhiro Kaji, Hayato Kobayashi
Abstract This paper explores an incremental training strategy for the skip-gram model with negative sampling (SGNS) from both empirical and theoretical perspectives. Existing methods of neural word embeddings, including SGNS, are multi-pass algorithms and thus cannot perform incremental model update. To address this problem, we present a simple incremental extension of SGNS and provide a thorough theoretical analysis to demonstrate its validity. Empirical experiments demonstrated the correctness of the theoretical analysis as well as the practical usefulness of the incremental algorithm.
Tasks Word Embeddings
Published 2017-04-13
URL http://arxiv.org/abs/1704.03956v2
PDF http://arxiv.org/pdf/1704.03956v2.pdf
PWC https://paperswithcode.com/paper/incremental-skip-gram-model-with-negative
Repo https://github.com/yahoojapan/yskip
Framework none
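
For reference, here is a NumPy sketch of a single incremental SGNS update, applied as soon as a (target, context) pair is observed rather than after a full pass over the corpus; the paper's incremental maintenance of the vocabulary and noise distribution is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_update(W_in, W_out, target, context, negatives, lr=0.025):
    """One SGNS step on a single observed pair plus k negative samples."""
    v = W_in[target]
    grad_v = np.zeros_like(v)
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = W_out[word]
        g = sigmoid(v @ u) - label          # gradient of the logistic loss
        grad_v += g * u
        W_out[word] -= lr * g * v
    W_in[target] -= lr * grad_v

vocab, dim = 1000, 50
W_in = rng.normal(scale=0.1, size=(vocab, dim))   # input (word) embeddings
W_out = np.zeros((vocab, dim))                    # output (context) embeddings
sgns_update(W_in, W_out, target=3, context=17, negatives=rng.integers(0, vocab, size=5))
print(W_in[3][:5])
```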