Paper Group AWR 47
Automatic skin lesion segmentation with fully convolutional-deconvolutional networks
Title | Automatic skin lesion segmentation with fully convolutional-deconvolutional networks |
Authors | Yading Yuan |
Abstract | This paper summarizes our method and validation results for the ISBI Challenge 2017 - Skin Lesion Analysis Towards Melanoma Detection - Part I: Lesion Segmentation |
Tasks | Lesion Segmentation |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05165v2 |
PDF | http://arxiv.org/pdf/1703.05165v2.pdf |
PWC | https://paperswithcode.com/paper/automatic-skin-lesion-segmentation-with-fully |
Repo | https://github.com/manideep2510/melanoma_segmentation |
Framework | tf |
Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss
Title | Every Untrue Label is Untrue in its Own Way: Controlling Error Type with the Log Bilinear Loss |
Authors | Yehezkel S. Resheff, Amit Mandelbaum, Daphna Weinshall |
Abstract | Deep learning has become the method of choice in many application domains of machine learning in recent years, especially for multi-class classification tasks. The most common loss function used in this context is the cross-entropy loss, which reduces to the log loss in the typical case when there is a single correct response label. While this loss is insensitive to the identity of the assigned class in the case of misclassification, in practice it is often the case that some errors may be more detrimental than others. Here we present the bilinear-loss (and related log-bilinear-loss) which differentially penalizes the different wrong assignments of the model. We thoroughly test this method using standard models and benchmark image datasets. As one application, we show the ability of this method to better contain error within the correct super-class, in the hierarchically labeled CIFAR100 dataset, without affecting the overall performance of the classifier. |
Tasks | |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06062v1 |
PDF | http://arxiv.org/pdf/1704.06062v1.pdf |
PWC | https://paperswithcode.com/paper/every-untrue-label-is-untrue-in-its-own-way |
Repo | https://github.com/Hezi-Resheff/paper-log-bilinear-loss |
Framework | tf |
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Title | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
Authors | Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean |
Abstract | The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation, where parts of the network are active on a per-example basis, has been proposed in theory as a way of dramatically increasing model capacity without a proportional increase in computation. In practice, however, there are significant algorithmic and performance challenges. In this work, we address these challenges and finally realize the promise of conditional computation, achieving greater than 1000x improvements in model capacity with only minor losses in computational efficiency on modern GPU clusters. We introduce a Sparsely-Gated Mixture-of-Experts layer (MoE), consisting of up to thousands of feed-forward sub-networks. A trainable gating network determines a sparse combination of these experts to use for each example. We apply the MoE to the tasks of language modeling and machine translation, where model capacity is critical for absorbing the vast quantities of knowledge available in the training corpora. We present model architectures in which a MoE with up to 137 billion parameters is applied convolutionally between stacked LSTM layers. On large language modeling and machine translation benchmarks, these models achieve significantly better results than state-of-the-art at lower computational cost. |
Tasks | Language Modelling, Machine Translation |
Published | 2017-01-23 |
URL | http://arxiv.org/abs/1701.06538v1 |
PDF | http://arxiv.org/pdf/1701.06538v1.pdf |
PWC | https://paperswithcode.com/paper/outrageously-large-neural-networks-the |
Repo | https://github.com/unconst/MACH |
Framework | tf |
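The sparse gating described in the abstract above (a trainable gating network picks a sparse combination of experts per example) can be illustrated with a minimal sketch. It assumes plain top-k gating with a softmax over the kept logits; the noise term and load-balancing losses used in practice are omitted, and the names `sparse_gate`, `moe_forward`, and the toy experts are hypothetical, not taken from the paper or the linked repository.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_gate(logits, k):
    """Keep the k largest gate logits, softmax over them, zero elsewhere."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    gates = [0.0] * len(logits)
    for i, w in zip(top, weights):
        gates[i] = w
    return gates

def moe_forward(x, gate_weights, experts, k=2):
    """Combine the outputs of the k selected experts for one example.

    x: input vector; gate_weights: one weight vector per expert (assumed
    linear gating); experts: callables mapping a vector to a vector.
    """
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    gates = sparse_gate(logits, k)
    out = None
    for g, expert in zip(gates, experts):
        if g == 0.0:
            continue  # unselected expert is never evaluated
        scaled = [g * v for v in expert(x)]
        out = scaled if out is None else [a + b for a, b in zip(out, scaled)]
    return out, gates

# Toy usage: four "experts" that just scale the input (illustrative only).
toy_experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
toy_gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, gates = moe_forward([2.0, 1.0], toy_gate_weights, toy_experts, k=2)
print(gates, output)
```

Only the selected experts are called, which is where the saving from conditional computation comes from: the parameter count grows with the number of experts while per-example compute grows with k.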
Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings
Title | Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings |
Authors | Shane Settle, Keith Levin, Herman Kamper, Karen Livescu |
Abstract | Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments. Recent work has shown that comparing speech segments by representing them as fixed-dimensional vectors — acoustic word embeddings — and measuring their vector distance (e.g., cosine distance) can discriminate between words more accurately than DTW-based approaches. We consider an approach to query-by-example search that embeds both the query and database segments according to a neural model, followed by nearest-neighbor search to find the matching segments. Earlier work on embedding-based query-by-example, using template-based acoustic word embeddings, achieved competitive performance. We find that our embeddings, based on recurrent neural networks trained to optimize word discrimination, achieve substantial improvements in performance and run-time efficiency over the previous approaches. |
Tasks | Word Embeddings |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03818v1 |
PDF | http://arxiv.org/pdf/1706.03818v1.pdf |
PWC | https://paperswithcode.com/paper/query-by-example-search-with-discriminative |
Repo | https://github.com/kamperh/recipe_semantic_flickraudio |
Framework | tf |
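The search step described in the abstract above (embed the query and the database segments, then run nearest-neighbor search under a vector distance such as cosine distance) can be sketched as follows. The embeddings are assumed to be precomputed; the recurrent embedding model itself is not shown, and the function names and toy vectors are hypothetical.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def search(query_vec, database, top_n=1):
    """Return the top_n (segment_id, distance) pairs closest to the query.

    database: list of (segment_id, embedding) pairs, embeddings precomputed.
    """
    scored = [(seg_id, cosine_distance(query_vec, emb)) for seg_id, emb in database]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_n]

# Toy usage with 2-dimensional stand-ins for acoustic word embeddings.
toy_database = [("seg_a", [1.0, 0.0]), ("seg_b", [0.0, 1.0]), ("seg_c", [0.9, 0.1])]
print(search([1.0, 0.05], toy_database, top_n=2))
```

This brute-force scan is linear in the database size; an approximate nearest-neighbor index would be a natural replacement for large collections, though that choice is not discussed in the abstract.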
Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for the Diagnosis of Skin Lesions
Title | Incorporating the Knowledge of Dermatologists to Convolutional Neural Networks for the Diagnosis of Skin Lesions |
Authors | Iván González Díaz |
Abstract | This report describes our submission to the ISIC 2017 Challenge in Skin Lesion Analysis Towards Melanoma Detection. We participated in Part 3: Lesion Classification with a system for automatic diagnosis of nevus, melanoma and seborrheic keratosis. Our approach aims to incorporate the expert knowledge of dermatologists into the well-known framework of Convolutional Neural Networks (CNN), which have shown impressive performance in many visual recognition tasks. In particular, we have designed several networks providing lesion area identification, lesion segmentation into structural patterns and final diagnosis of clinical cases. Furthermore, novel blocks for CNNs have been designed to integrate this information with the diagnosis processing pipeline. |
Tasks | Lesion Segmentation |
Published | 2017-03-06 |
URL | http://arxiv.org/abs/1703.01976v3 |
PDF | http://arxiv.org/pdf/1703.01976v3.pdf |
PWC | https://paperswithcode.com/paper/incorporating-the-knowledge-of-dermatologists |
Repo | https://github.com/igondia/matconvnet-dermoscopy |
Framework | none |
Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks
Title | Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks |
Authors | Diarmaid Conaty, Denis D. Mauá, Cassio P. de Campos |
Abstract | We discuss the computational complexity of approximating maximum a posteriori inference in sum-product networks. We first show NP-hardness in trees of height two by a reduction from maximum independent set; this implies non-approximability within a sublinear factor. We show that this is a tight bound, as we can find an approximation within a linear factor in networks of height two. We then show that, in trees of height three, it is NP-hard to approximate the problem within a factor $2^{f(n)}$ for any sublinear function $f$ of the size of the input $n$. Again, this bound is tight, as we prove that the usual max-product algorithm finds (in any network) approximations within factor $2^{c \cdot n}$ for some constant $c < 1$. Last, we present a simple algorithm, and show that it provably produces solutions at least as good as, and potentially much better than, the max-product algorithm. We empirically analyze the proposed algorithm against max-product using synthetic and realistic networks. |
Tasks | |
Published | 2017-03-17 |
URL | http://arxiv.org/abs/1703.06045v5 |
PDF | http://arxiv.org/pdf/1703.06045v5.pdf |
PWC | https://paperswithcode.com/paper/approximation-complexity-of-maximum-a |
Repo | https://github.com/RenatoGeh/gospn |
Framework | none |
Deep Learning with Topological Signatures
Title | Deep Learning with Topological Signatures |
Authors | Christoph Hofer, Roland Kwitt, Marc Niethammer, Andreas Uhl |
Abstract | Inferring topological and geometrical information from data can offer an alternative perspective on machine learning problems. Methods from topological data analysis, e.g., persistent homology, enable us to obtain such information, typically in the form of summary representations of topological features. However, such topological signatures often come with an unusual structure (e.g., multisets of intervals) that is highly impractical for most machine learning techniques. While many strategies have been proposed to map these topological signatures into machine learning compatible representations, they suffer from being agnostic to the target learning task. In contrast, we propose a technique that enables us to input topological signatures to deep neural networks and learn a task-optimal representation during training. Our approach is realized as a novel input layer with favorable theoretical properties. Classification experiments on 2D object shapes and social network graphs demonstrate the versatility of the approach and, in case of the latter, we even outperform the state-of-the-art by a large margin. |
Tasks | Topological Data Analysis |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.04041v3 |
PDF | http://arxiv.org/pdf/1707.04041v3.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-with-topological-signatures |
Repo | https://github.com/billy-mosse/spiderman |
Framework | none |
Semi-supervised emotion lexicon expansion with label propagation and specialized word embeddings
Title | Semi-supervised emotion lexicon expansion with label propagation and specialized word embeddings |
Authors | Mario Giulianelli |
Abstract | There exist two main approaches to automatically extract affective orientation: lexicon-based and corpus-based. In this work, we argue that these two methods are compatible and show that combining them can improve the accuracy of emotion classifiers. In particular, we introduce a novel variant of the Label Propagation algorithm that is tailored to distributed word representations, we apply batch gradient descent to accelerate the optimization of label propagation and to make the optimization feasible for large graphs, and we propose a reproducible method for emotion lexicon expansion. We conclude that label propagation can expand an emotion lexicon in a meaningful way and that the expanded emotion lexicon can be leveraged to improve the accuracy of an emotion classifier. |
Tasks | Word Embeddings |
Published | 2017-08-13 |
URL | http://arxiv.org/abs/1708.03910v1 |
PDF | http://arxiv.org/pdf/1708.03910v1.pdf |
PWC | https://paperswithcode.com/paper/semi-supervised-emotion-lexicon-expansion |
Repo | https://github.com/Procope/emo2vec |
Framework | tf |
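As background for the lexicon-expansion entry above, the following is a minimal sketch of classic iterative label propagation on a weighted word graph. It is not the paper's variant (which is tailored to word embeddings and optimized with batch gradient descent); the graph, edge weights, and function name are illustrative assumptions.

```python
def propagate(weights, seeds, n_labels, n_iters=50):
    """Spread seed labels over a weighted graph.

    weights: dict node -> dict neighbor -> positive edge weight (every
             neighbor must also be a key of `weights`).
    seeds:   dict node -> label index; seed nodes stay clamped.
    Returns: dict node -> list of label probabilities.
    """
    dist = {}
    for node in weights:
        if node in seeds:
            dist[node] = [1.0 if i == seeds[node] else 0.0 for i in range(n_labels)]
        else:
            dist[node] = [1.0 / n_labels] * n_labels
    for _ in range(n_iters):
        updated = {}
        for node, nbrs in weights.items():
            if node in seeds:
                updated[node] = dist[node]  # clamp known lexicon entries
                continue
            total = sum(nbrs.values())
            # Weighted average of the neighbors' current label distributions.
            updated[node] = [
                sum(w * dist[n][i] for n, w in nbrs.items()) / total
                for i in range(n_labels)
            ]
        dist = updated
    return dist

# Toy graph: label 0 = "joy", label 1 = "sadness"; weights are made up.
toy_graph = {
    "joy": {"happy": 0.9, "glad": 0.5},
    "happy": {"joy": 0.9, "sad": 0.1, "glad": 0.8},
    "glad": {"happy": 0.8, "joy": 0.5},
    "sad": {"grief": 0.9, "happy": 0.1},
    "grief": {"sad": 0.9},
}
print(propagate(toy_graph, {"joy": 0, "grief": 1}, n_labels=2))
```

In an embedding-based setting the edge weights would typically come from vector similarity between words; here they are hard-coded so the example stays self-contained.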
Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning
Title | Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning |
Authors | Hongge Chen, Huan Zhang, Pin-Yu Chen, Jinfeng Yi, Cho-Jui Hsieh |
Abstract | Visual language grounding is widely studied in modern neural image captioning systems, which typically adopt an encoder-decoder framework consisting of two principal components: a convolutional neural network (CNN) for image feature extraction and a recurrent neural network (RNN) for language caption generation. To study the robustness of language grounding to adversarial perturbations in machine vision and perception, we propose Show-and-Fool, a novel algorithm for crafting adversarial examples in neural image captioning. The proposed algorithm provides two evaluation approaches, which check whether neural image captioning systems can be misled to output some randomly chosen captions or keywords. Our extensive experiments show that our algorithm can successfully craft visually-similar adversarial examples with randomly targeted captions or keywords, and the adversarial examples can be made highly transferable to other image captioning systems. Consequently, our approach leads to new robustness implications of neural image captioning and novel insights in visual language grounding. |
Tasks | Image Captioning |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02051v2 |
PDF | http://arxiv.org/pdf/1712.02051v2.pdf |
PWC | https://paperswithcode.com/paper/attacking-visual-language-grounding-with |
Repo | https://github.com/IBM/Image-Captioning-Attack |
Framework | tf |
DeepArchitect: Automatically Designing and Training Deep Architectures
Title | DeepArchitect: Automatically Designing and Training Deep Architectures |
Authors | Renato Negrinho, Geoff Gordon |
Abstract | In deep learning, performance is strongly affected by the choice of architecture and hyperparameters. While there has been extensive work on automatic hyperparameter optimization for simple spaces, complex spaces such as the space of deep architectures remain largely unexplored. As a result, the choice of architecture is done manually by the human expert through a slow trial and error process guided mainly by intuition. In this paper we describe a framework for automatically designing and training deep models. We propose an extensible and modular language that allows the human expert to compactly represent complex search spaces over architectures and their hyperparameters. The resulting search spaces are tree-structured and therefore easy to traverse. Models can be automatically compiled to computational graphs once values for all hyperparameters have been chosen. We can leverage the structure of the search space to introduce different model search algorithms, such as random search, Monte Carlo tree search (MCTS), and sequential model-based optimization (SMBO). We present experiments comparing the different algorithms on CIFAR-10 and show that MCTS and SMBO outperform random search. In addition, these experiments show that our framework can be used effectively for model discovery, as it is possible to describe expressive search spaces and discover competitive models without much effort from the human expert. Code for our framework and experiments has been made publicly available. |
Tasks | Hyperparameter Optimization |
Published | 2017-04-28 |
URL | http://arxiv.org/abs/1704.08792v1 |
PDF | http://arxiv.org/pdf/1704.08792v1.pdf |
PWC | https://paperswithcode.com/paper/deeparchitect-automatically-designing-and |
Repo | https://github.com/negrinho/deep_architect_legacy |
Framework | tf |
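To make the idea of a tree-structured search space in the DeepArchitect entry above concrete, here is a minimal sketch of random search over such a space. The nested-dictionary encoding, the hyperparameter names, and the toy evaluation function are all hypothetical; this is not the framework's own language or API.

```python
import random

# Hypothetical tree-structured space: each hyperparameter maps to a list of
# (value, subspace) options; choosing a value opens its own subspace.
SPACE = {
    "optimizer": [
        ("sgd", {"momentum": [(0.0, {}), (0.9, {})]}),
        ("adam", {}),
    ],
    "num_layers": [(2, {}), (4, {})],
}

def sample(space, rng):
    """Walk the tree top-down, picking one option per hyperparameter."""
    config = {}
    for name, options in space.items():
        value, subspace = rng.choice(options)
        config[name] = value
        config.update(sample(subspace, rng))
    return config

def random_search(space, evaluate, n_trials, seed=0):
    """Return (best_score, best_config) over n_trials random samples."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = sample(space, rng)
        score = evaluate(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best

# Toy usage: the scoring function stands in for training and validating a model.
def toy_score(config):
    return config["num_layers"] + config.get("momentum", 0.0)

print(random_search(SPACE, toy_score, n_trials=20))
```

Because a hyperparameter such as `momentum` only exists under one branch, a sampled configuration never contains settings that are irrelevant to the choices made above it. MCTS or SMBO would replace the uniform `rng.choice` with a guided choice over the same tree.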
Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces
Title | Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces |
Authors | Garrett Warnell, Nicholas Waytowich, Vernon Lawhern, Peter Stone |
Abstract | While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot of training data. One way to increase the speed at which agents are able to learn to perform tasks is by leveraging the input of human trainers. Although such input can take many forms, real-time, scalar-valued feedback is especially useful in situations where it proves difficult or impossible for humans to provide expert demonstrations. Previous approaches have shown the usefulness of human input provided in this fashion (e.g., the TAMER framework), but they have thus far not considered high-dimensional state spaces or employed the use of deep learning. In this paper, we do both: we propose Deep TAMER, an extension of the TAMER framework that leverages the representational power of deep neural networks in order to learn complex tasks in just a short amount of time with a human trainer. We demonstrate Deep TAMER’s success by using it and just 15 minutes of human-provided feedback to train an agent that performs better than humans on the Atari game of Bowling - a task that has proven difficult for even state-of-the-art reinforcement learning methods. |
Tasks | |
Published | 2017-09-28 |
URL | http://arxiv.org/abs/1709.10163v2 |
PDF | http://arxiv.org/pdf/1709.10163v2.pdf |
PWC | https://paperswithcode.com/paper/deep-tamer-interactive-agent-shaping-in-high |
Repo | https://github.com/JulienDesvergnes/human-reinforcement-learning |
Framework | tf |
Active Convolution: Learning the Shape of Convolution for Image Classification
Title | Active Convolution: Learning the Shape of Convolution for Image Classification |
Authors | Yunho Jeon, Junmo Kim |
Abstract | In recent years, deep learning has achieved great success in many computer vision applications. Convolutional neural networks (CNNs) have lately emerged as a major approach to image classification. Most research on CNNs thus far has focused on developing architectures such as the Inception and residual networks. The convolution layer is the core of the CNN, but few studies have addressed the convolution unit itself. In this paper, we introduce a convolution unit called the active convolution unit (ACU). The new convolution has no fixed shape, so we can define any form of convolution. Its shape can be learned through backpropagation during training. Our proposed unit has a few advantages. First, the ACU is a generalization of convolution; it can define not only all conventional convolutions, but also convolutions with fractional pixel coordinates. We can freely change the shape of the convolution, which provides greater freedom to form CNN structures. Second, the shape of the convolution is learned while training and there is no need to tune it by hand. Third, the ACU can learn better than a conventional unit: we obtained the improvement simply by changing the conventional convolution to an ACU. We tested our proposed method on plain and residual networks, and the results showed significant improvement using our method on various datasets and architectures in comparison with the baseline. |
Tasks | Image Classification |
Published | 2017-03-27 |
URL | http://arxiv.org/abs/1703.09076v1 |
PDF | http://arxiv.org/pdf/1703.09076v1.pdf |
PWC | https://paperswithcode.com/paper/active-convolution-learning-the-shape-of |
Repo | https://github.com/jyh2986/Active-Convolution |
Framework | none |
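The Active Convolution abstract above mentions convolutions with fractional pixel coordinates. One way to read an image at such coordinates is bilinear interpolation, sketched below. Treating bilinear interpolation as the sampling rule and clamping at the image border are assumptions of this sketch, and the function names and toy offsets are hypothetical.

```python
import math

def bilinear_sample(image, y, x):
    """Read a 2D image (list of rows) at a fractional (y, x) position.

    Coordinates outside the image are clamped to the border (an assumption).
    """
    h, w = len(image), len(image[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * image[y0][x0] + dx * image[y0][x1]
    bottom = (1 - dx) * image[y1][x0] + dx * image[y1][x1]
    return (1 - dy) * top + dy * bottom

def active_conv_at(image, cy, cx, offsets, weights):
    """Weighted sum of samples taken at (cy + dy, cx + dx) for each offset.

    offsets: list of (dy, dx) pairs, possibly fractional; in a learned unit
    these would be trainable parameters rather than fixed grid positions.
    """
    return sum(
        w * bilinear_sample(image, cy + dy, cx + dx)
        for (dy, dx), w in zip(offsets, weights)
    )

toy_image = [[0.0, 1.0], [2.0, 3.0]]
print(bilinear_sample(toy_image, 0.5, 0.5))
print(active_conv_at(toy_image, 0, 0, offsets=[(0.0, 0.0), (0.5, 0.5)], weights=[1.0, 2.0]))
```

Because the interpolated value varies smoothly with the offsets, gradients with respect to the sampling positions are well defined, which is what makes learning the shape through backpropagation possible.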
Long Text Generation via Adversarial Training with Leaked Information
Title | Long Text Generation via Adversarial Training with Leaked Information |
Authors | Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, Jun Wang |
Abstract | Automatically generating coherent and semantically meaningful text has many applications in machine translation, dialogue systems, image captioning, etc. Recently, by combining with policy gradient, Generative Adversarial Nets (GAN) that use a discriminative model to guide the training of the generative model as a reinforcement learning policy have shown promising results in text generation. However, the scalar guiding signal is only available after the entire text has been generated and lacks intermediate information about text structure during the generative process. As such, it limits its success when the length of the generated text samples is long (more than 20 words). In this paper, we propose a new framework, called LeakGAN, to address the problem for long text generation. We allow the discriminative net to leak its own high-level extracted features to the generative net to further help the guidance. The generator incorporates such informative signals into all generation steps through an additional Manager module, which takes the extracted features of current generated words and outputs a latent vector to guide the Worker module for next-word generation. Our extensive experiments on synthetic data and various real-world tasks with Turing test demonstrate that LeakGAN is highly effective in long text generation and also improves the performance in short text generation scenarios. More importantly, without any supervision, LeakGAN would be able to implicitly learn sentence structures only through the interaction between Manager and Worker. |
Tasks | Text Generation |
Published | 2017-09-24 |
URL | http://arxiv.org/abs/1709.08624v2 |
PDF | http://arxiv.org/pdf/1709.08624v2.pdf |
PWC | https://paperswithcode.com/paper/long-text-generation-via-adversarial-training |
Repo | https://github.com/liyzcj/leakgan-py3 |
Framework | tf |
SmoothGrad: removing noise by adding noise
Title | SmoothGrad: removing noise by adding noise |
Authors | Daniel Smilkov, Nikhil Thorat, Been Kim, Fernanda Viégas, Martin Wattenberg |
Abstract | Explaining the output of a deep network remains a challenge. In the case of an image classifier, one type of explanation is to identify pixels that strongly influence the final decision. A starting point for this strategy is the gradient of the class score function with respect to the input image. This gradient can be interpreted as a sensitivity map, and there are several techniques that elaborate on this basic idea. This paper makes two contributions: it introduces SmoothGrad, a simple method that can help visually sharpen gradient-based sensitivity maps, and it discusses lessons in the visualization of these maps. We publish the code for our experiments and a website with our results. |
Tasks | Interpretable Machine Learning |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03825v1 |
PDF | http://arxiv.org/pdf/1706.03825v1.pdf |
PWC | https://paperswithcode.com/paper/smoothgrad-removing-noise-by-adding-noise |
Repo | https://github.com/saivarunr/xshap |
Framework | tf |
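The core idea behind the SmoothGrad entry above, as its title suggests, is to average the gradient-based sensitivity map over several noisy copies of the input. The sketch below assumes Gaussian noise and takes the gradient as a user-supplied function; `grad_fn`, the default noise level, and the sample count are illustrative choices, not values from the paper.

```python
import random

def smoothgrad(grad_fn, x, noise_std=0.1, n_samples=50, seed=0):
    """Average grad_fn over Gaussian-perturbed copies of the input x.

    grad_fn: callable mapping an input vector to the gradient of the class
             score with respect to that input (same length as x).
    """
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]
        grad = grad_fn(noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]

# Toy usage: a score whose gradient flips sign abruptly around zero, standing
# in for the sharp local fluctuations of a real network's gradient.
def toy_grad(x):
    return [1.0 if v > 0 else -1.0 for v in x]

print(toy_grad([0.01, 2.0]))                       # raw sensitivity map
print(smoothgrad(toy_grad, [0.01, 2.0], noise_std=0.5))
```

For the input near zero, the raw gradient reports full sensitivity while the averaged map reports a much weaker, more stable value; the input far from the discontinuity is essentially unchanged.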
Incremental Skip-gram Model with Negative Sampling
Title | Incremental Skip-gram Model with Negative Sampling |
Authors | Nobuhiro Kaji, Hayato Kobayashi |
Abstract | This paper explores an incremental training strategy for the skip-gram model with negative sampling (SGNS) from both empirical and theoretical perspectives. Existing methods of neural word embeddings, including SGNS, are multi-pass algorithms and thus cannot perform incremental model updates. To address this problem, we present a simple incremental extension of SGNS and provide a thorough theoretical analysis to demonstrate its validity. Empirical experiments demonstrated the correctness of the theoretical analysis as well as the practical usefulness of the incremental algorithm. |
Tasks | Word Embeddings |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.03956v2 |
PDF | http://arxiv.org/pdf/1704.03956v2.pdf |
PWC | https://paperswithcode.com/paper/incremental-skip-gram-model-with-negative |
Repo | https://github.com/yahoojapan/yskip |
Framework | none |
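For readers unfamiliar with SGNS, the per-pair objective that the entry above builds on can be sketched as follows. This shows only the standard negative-sampling loss for one (target, context) pair, not the paper's incremental extension; the function names and toy vectors are hypothetical.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sgns_loss(target_vec, context_vec, negative_vecs):
    """Negative-sampling loss for one observed (target, context) pair.

    Pulls the observed context toward the target and pushes each sampled
    negative context away from it.
    """
    loss = -log_sigmoid(dot(target_vec, context_vec))
    for neg in negative_vecs:
        loss -= log_sigmoid(-dot(target_vec, neg))
    return loss

# Toy usage with 2-dimensional vectors.
print(sgns_loss([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]]))  # well-separated pair
print(sgns_loss([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]]))  # badly placed pair
```

In conventional training the negatives are drawn from a noise distribution built from corpus-wide word counts, which requires a pass over the data before training starts; that dependency is one reason the standard algorithm is multi-pass.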