February 2, 2020

3277 words 16 mins read

Paper Group AWR 8

Feature-Less End-to-End Nested Term Extraction. A Survey on Deep Learning of Small Sample in Biomedical Image Analysis. Processing Megapixel Images with Deep Attention-Sampling Models. Visualizing Trends of Key Roles in News Articles. “The Michael Jordan of Greatness”: Extracting Vossian Antonomasia from Two Decades of the New York Times, 1987-2007 …

Feature-Less End-to-End Nested Term Extraction

Title Feature-Less End-to-End Nested Term Extraction
Authors Yuze Gao, Yu Yuan
Abstract In this paper, we propose a deep learning-based end-to-end method for domain-specific automatic term extraction (ATE). The model considers all possible term spans within a fixed length in a sentence and predicts whether each span is a conceptual term. Compared with current ATE methods, it supports nested term extraction and does not crucially need extra (extracted) features. Results show that it achieves high recall and comparable precision on the term extraction task when given segmented raw text as input.
Tasks
Published 2019-08-15
URL https://arxiv.org/abs/1908.05426v1
PDF https://arxiv.org/pdf/1908.05426v1.pdf
PWC https://paperswithcode.com/paper/feature-less-end-to-end-nested-term
Repo https://github.com/CooDL/NestedTermExtraction
Framework pytorch
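The core idea, enumerating every span up to a fixed length as a term candidate, is easy to sketch. Below is a minimal illustrative Python version (the function name and the downstream classifier it would feed are assumptions, not taken from the linked repo); nested terms come for free because overlapping spans are scored independently:

```python
def enumerate_spans(tokens, max_len=5):
    """List every candidate span of up to max_len tokens.

    Overlapping and nested spans all appear as separate candidates,
    which is what makes nested term extraction possible downstream."""
    spans = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            spans.append((start, end, tokens[start:end]))
    return spans

# Both ("deep", "neural") and ("deep", "neural", "network") show up,
# so a span classifier can accept a nested term and its parent.
spans = enumerate_spans(["deep", "neural", "network", "model"], max_len=3)
```

A real model would score each span with a neural classifier; this sketch only shows the candidate generation.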

A Survey on Deep Learning of Small Sample in Biomedical Image Analysis

Title A Survey on Deep Learning of Small Sample in Biomedical Image Analysis
Authors Pengyi Zhang, Yunxin Zhong, Yulin Deng, Xiaoying Tang, Xiaoqiong Li
Abstract Deep learning has emerged as a promising technique for computer-aided biomedical image analysis, thanks to its end-to-end learning framework and the availability of large-scale labelled samples. However, in many cases of biomedical image analysis, deep learning techniques suffer from the small-sample learning (SSL) dilemma, caused mainly by a lack of annotations. To make deep learning more practical for biomedical image analysis, in this paper we survey the key SSL techniques that help relieve this problem, drawing on the development of related techniques in computer vision applications. To accelerate the clinical adoption of deep-learning-based biomedical image analysis, we intentionally expand this survey to include explanation methods for deep models, which are important to clinical decision making. We divide the key SSL techniques into five categories: (1) explanation techniques, (2) weakly supervised learning techniques, (3) transfer learning techniques, (4) active learning techniques, and (5) miscellaneous techniques involving data augmentation, domain knowledge, traditional shallow methods and attention mechanisms. These techniques are expected to effectively support the application of deep learning in clinical biomedical image analysis and to further improve analysis performance, especially when large-scale annotated samples are not available. We build demos at https://github.com/PengyiZhang/MIADeepSSL.
Tasks Active Learning, Data Augmentation, Decision Making, Transfer Learning
Published 2019-08-01
URL https://arxiv.org/abs/1908.00473v1
PDF https://arxiv.org/pdf/1908.00473v1.pdf
PWC https://paperswithcode.com/paper/a-survey-on-deep-learning-of-small-sample-in
Repo https://github.com/PengyiZhang/MIADeepSSL
Framework pytorch

Processing Megapixel Images with Deep Attention-Sampling Models

Title Processing Megapixel Images with Deep Attention-Sampling Models
Authors Angelos Katharopoulos, François Fleuret
Abstract Existing deep architectures cannot operate on very large signals such as megapixel images due to computational and memory constraints. To tackle this limitation, we propose a fully differentiable end-to-end trainable model that samples and processes only a fraction of the full-resolution input image. The locations to process are sampled from an attention distribution computed from a low-resolution view of the input. We refer to our method as attention sampling; it can process images of several megapixels on a standard single-GPU setup. We show that sampling from the attention distribution results in an unbiased estimator of the full model with minimal variance, and we derive an unbiased estimator of the gradient that we use to train our model end-to-end with a normal SGD procedure. This new method is evaluated on three classification tasks, where we show that it reduces computation and memory footprint by an order of magnitude for the same accuracy as classical architectures. We also show the consistency of the sampling, which indeed focuses on informative parts of the input images.
Tasks Deep Attention
Published 2019-05-03
URL https://arxiv.org/abs/1905.03711v2
PDF https://arxiv.org/pdf/1905.03711v2.pdf
PWC https://paperswithcode.com/paper/190503711
Repo https://github.com/idiap/attention-sampling
Framework tf
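The unbiased-estimator claim can be seen in miniature: if attention weights a_i sum to one, averaging the features of patches sampled with probability a_i estimates the attention-weighted sum without touching every patch. A toy numpy sketch (the attention values and features below are made up; the real model computes attention from a low-resolution view of the image):

```python
import numpy as np

def attention_sample_estimate(attention, patch_features, n_samples, rng):
    """Monte Carlo estimate of sum_i a_i * f_i: sample patch indices
    i ~ a and average their features. E[f_I] = sum_i a_i f_i, so the
    estimator is unbiased while touching only n_samples patches."""
    idx = rng.choice(len(attention), size=n_samples, p=attention)
    return patch_features[idx].mean(axis=0)

rng = np.random.default_rng(0)
attention = np.array([0.7, 0.2, 0.1])          # assumed low-res attention
features = np.array([[1.0], [10.0], [100.0]])  # assumed per-patch features
exact = (attention[:, None] * features).sum(axis=0)        # 12.7
approx = attention_sample_estimate(attention, features, 100000, rng)
```

With enough samples the estimate concentrates around the exact attention-weighted value, which is the mechanism that lets the model skip most of a megapixel image.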

Visualizing Trends of Key Roles in News Articles

Title Visualizing Trends of Key Roles in News Articles
Authors Chen Xia, Haoxiang Zhang, Jacob Moghtader, Allen Wu, Kai-Wei Chang
Abstract Vast numbers of news articles are generated every day, reflecting the activities of key roles such as people, organizations and political parties. Analyzing these key roles allows us to understand trends in the news. In this paper, we present a demonstration system that visualizes the trends of key roles in news articles based on natural language processing techniques. Specifically, we apply a semantic role labeler and the dynamic word embedding technique to understand relationships between key roles in the news across different time periods, and visualize how the trends of key roles and news topics change over time.
Tasks
Published 2019-09-12
URL https://arxiv.org/abs/1909.05449v1
PDF https://arxiv.org/pdf/1909.05449v1.pdf
PWC https://paperswithcode.com/paper/visualizing-trends-of-key-roles-in-news
Repo https://github.com/kasinxc/Visualizing-Trend-of-Key-Roles-in-News-Articles
Framework tf

“The Michael Jordan of Greatness”: Extracting Vossian Antonomasia from Two Decades of the New York Times, 1987-2007

Title “The Michael Jordan of Greatness”: Extracting Vossian Antonomasia from Two Decades of the New York Times, 1987-2007
Authors Frank Fischer, Robert Jäschke
Abstract Vossian Antonomasia is a prolific stylistic device, in use since antiquity. It can compress the introduction or description of a person or another named entity into a terse, poignant formulation and can best be explained by an example: When Norwegian world champion Magnus Carlsen is described as “the Mozart of chess”, it is Vossian Antonomasia we are dealing with. The pattern is simple: A source (Mozart) is used to describe a target (Magnus Carlsen), the transfer of meaning is reached via a modifier (“of chess”). This phenomenon has been discussed before (as ‘metaphorical antonomasia’ or, with special focus on the source object, as ‘paragons’), but no corpus-based approach has been undertaken as yet to explore its breadth and variety. We are looking into a full-text newspaper corpus (The New York Times, 1987-2007) and describe a new method for the automatic extraction of Vossian Antonomasia based on Wikidata entities. Our analysis offers new insights into the occurrence of popular paragons and their distribution.
Tasks
Published 2019-02-18
URL http://arxiv.org/abs/1902.06428v1
PDF http://arxiv.org/pdf/1902.06428v1.pdf
PWC https://paperswithcode.com/paper/the-michael-jordan-of-greatness-extracting
Repo https://github.com/weltliteratur/vossanto
Framework none
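The surface pattern ("the SOURCE of MODIFIER") can be sketched with a naive regex, though the paper's actual method validates candidates against Wikidata entities, which this toy version omits entirely:

```python
import re

# Naive surface pattern for "the <Source> of <modifier>"; the paper's
# pipeline additionally links candidate sources to Wikidata entities.
PATTERN = re.compile(r"\bthe ((?:[A-Z]\w+ ?)+) of ([\w ]+)")

def extract_candidates(text):
    """Return (source, modifier) pairs for each naive pattern match."""
    return [(m.group(1).strip(), m.group(2).strip())
            for m in PATTERN.finditer(text)]

extract_candidates("He is the Mozart of chess.")   # [("Mozart", "chess")]
```

A pattern this loose would over-generate badly on real newspaper text ("the United States of America"), which is exactly why entity-based filtering is the paper's contribution.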

Adaptively Truncating Backpropagation Through Time to Control Gradient Bias

Title Adaptively Truncating Backpropagation Through Time to Control Gradient Bias
Authors Christopher Aicher, Nicholas J. Foti, Emily B. Fox
Abstract Truncated backpropagation through time (TBPTT) is a popular method for learning in recurrent neural networks (RNNs) that saves computation and memory at the cost of bias by truncating backpropagation after a fixed number of lags. In practice, choosing the optimal truncation length is difficult: TBPTT will not converge if the truncation length is too small, or will converge slowly if it is too large. We propose an adaptive TBPTT scheme that converts the problem from choosing a temporal lag to one of choosing a tolerable amount of gradient bias. For many realistic RNNs, the TBPTT gradients decay geometrically in expectation for large lags; under this condition, we can control the bias by varying the truncation length adaptively. For RNNs with smooth activation functions, we prove that this bias controls the convergence rate of SGD with biased gradients for our non-convex loss. Using this theory, we develop a practical method for adaptively estimating the truncation length during training. We evaluate our adaptive TBPTT method on synthetic data and language modeling tasks and find that our adaptive TBPTT ameliorates the computational pitfalls of fixed TBPTT.
Tasks Language Modelling
Published 2019-05-17
URL https://arxiv.org/abs/1905.07473v2
PDF https://arxiv.org/pdf/1905.07473v2.pdf
PWC https://paperswithcode.com/paper/adaptively-truncating-backpropagation-through
Repo https://github.com/aicherc/adaptive_tbptt
Framework pytorch
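Under the paper's geometric-decay condition, the adaptive choice reduces to picking the smallest truncation whose tail of gradient mass falls below a bias tolerance. A toy numpy sketch of that selection rule (an illustration of the idea, not the authors' estimator, which works from gradient norms measured during training):

```python
import numpy as np

def choose_truncation(grad_norms, tol=0.05):
    """Smallest truncation length K whose estimated relative gradient
    bias (mass of the lag-gradient tail beyond K) is below tol."""
    total = grad_norms.sum()
    tail = np.cumsum(grad_norms[::-1])[::-1]   # tail[k] = sum_{j >= k}
    for K in range(1, len(grad_norms)):
        if tail[K] / total < tol:
            return K
    return len(grad_norms)

# Lag-gradient norms decaying geometrically with rate 0.5 (assumed):
norms = 0.5 ** np.arange(10)
K = choose_truncation(norms, tol=0.05)   # truncate after 5 lags
```

Tightening `tol` trades computation for lower bias, which is the knob the paper proposes in place of picking a lag directly.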

X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies

Title X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies
Authors Kehan Qi, Hao Yang, Cheng Li, Zaiyi Liu, Meiyun Wang, Qiegen Liu, Shanshan Wang
Abstract The morbidity of brain stroke has increased rapidly in the past few years. To help specialists with lesion measurement and treatment planning, automatic segmentation methods are critically required for clinical practice. Recently, approaches based on deep learning and methods for contextual information extraction have served in many image segmentation tasks. However, their performance is limited by the insufficient training of a large number of parameters, which sometimes fail to capture long-range dependencies. To address these issues, we propose a depthwise-separable-convolution-based X-Net with a nonlocal operation, namely the Feature Similarity Module (FSM), to capture long-range dependencies. The adopted depthwise convolution reduces the network size, while the developed FSM provides more effective, dense contextual information extraction and thus facilitates better segmentation. The effectiveness of X-Net was evaluated on the open dataset Anatomical Tracings of Lesions After Stroke (ATLAS), with superior performance achieved compared to six other state-of-the-art approaches. We make our code and models available at https://github.com/Andrewsher/X-Net.
Tasks Lesion Segmentation, Semantic Segmentation
Published 2019-07-16
URL https://arxiv.org/abs/1907.07000v2
PDF https://arxiv.org/pdf/1907.07000v2.pdf
PWC https://paperswithcode.com/paper/x-net-brain-stroke-lesion-segmentation-based
Repo https://github.com/Andrewsher/X-Net
Framework tf
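Why depthwise separable convolutions shrink the network is just arithmetic: a per-channel spatial pass plus a 1x1 pointwise pass replaces the full channel-mixing kernel. A back-of-the-envelope sketch (parameter counts only, biases ignored; the channel sizes are illustrative, not X-Net's):

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one spatial filter per input channel)
    followed by a 1 x 1 pointwise conv that mixes channels."""
    return c_in * k * k + c_in * c_out

standard = conv_params(64, 128, 3)                   # 73728 parameters
separable = depthwise_separable_params(64, 128, 3)   # 8768 parameters
ratio = standard / separable                         # roughly 8.4x smaller
```

The saving grows with kernel size and channel count, which is what makes room for the extra nonlocal FSM in the same parameter budget.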

Comprehensive Evaluation of Deep Learning Architectures for Prediction of DNA/RNA Sequence Binding Specificities

Title Comprehensive Evaluation of Deep Learning Architectures for Prediction of DNA/RNA Sequence Binding Specificities
Authors Ameni Trabelsi, Mohamed Chaabane, Asa Ben Hur
Abstract Motivation: Deep learning architectures have recently demonstrated their power in predicting DNA- and RNA-binding specificities. Existing methods fall into three classes: some are based on Convolutional Neural Networks (CNNs), others use Recurrent Neural Networks (RNNs), and others rely on hybrid architectures combining CNNs and RNNs. However, based on existing studies it is still unclear which deep learning architecture achieves the best performance. Thus, an in-depth analysis and evaluation of the different methods is needed to fully evaluate their relative performance. Results: In this study, we present a systematic exploration of various deep learning architectures for predicting DNA- and RNA-binding specificities. For this purpose, we present deepRAM, an end-to-end deep learning tool that provides an implementation of novel and previously proposed architectures; its fully automatic model selection procedure allows us to perform a fair and unbiased comparison of deep learning architectures. We find that an architecture that uses k-mer embedding to represent the sequence, a convolutional layer and a recurrent layer outperforms all other methods in terms of model accuracy. Our work provides guidelines that will assist the practitioner in choosing the best architecture for the task at hand, and provides some insight into the differences between the models learned by convolutional and recurrent networks. In particular, we find that although recurrent networks improve model accuracy, this comes at the expense of a loss in the interpretability of the features learned by the model. Availability and implementation: The source code for deepRAM is available at https://github.com/MedChaabane/deepRAM
Tasks Automatic Machine Learning Model Selection, Model Selection, Multi-Label Text Classification, Text Classification
Published 2019-01-29
URL http://arxiv.org/abs/1901.10526v1
PDF http://arxiv.org/pdf/1901.10526v1.pdf
PWC https://paperswithcode.com/paper/comprehensive-evaluation-of-deep-learning
Repo https://github.com/MedChaabane/deepRAM
Framework pytorch
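The k-mer embedding input used by the best-performing architecture can be sketched directly: slide a window of width k over the sequence and map each k-mer to an integer id for an embedding-layer lookup. A minimal illustrative version (not deepRAM's actual preprocessing code):

```python
from itertools import product

def kmer_vocab(k, alphabet="ACGT"):
    """Assign an integer id to every possible k-mer."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}

def encode_kmers(seq, k, vocab):
    """Slide a width-k window over the sequence; each position becomes
    one token id, ready for an embedding-layer lookup."""
    return [vocab[seq[i:i + k]] for i in range(len(seq) - k + 1)]

vocab = kmer_vocab(3)                    # 64 possible 3-mers
ids = encode_kmers("ACGTAC", 3, vocab)   # 4 overlapping 3-mers
```

The resulting id sequence would feed an embedding layer ahead of the convolutional and recurrent layers the paper compares.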

Synchronous Bidirectional Neural Machine Translation

Title Synchronous Bidirectional Neural Machine Translation
Authors Long Zhou, Jiajun Zhang, Chengqing Zong
Abstract Existing approaches to neural machine translation (NMT) generate the target language sequence token by token from left to right. However, this kind of unidirectional decoding framework cannot make full use of the target-side future contexts that can be produced in a right-to-left decoding direction, and thus suffers from the issue of unbalanced outputs. In this paper, we introduce a synchronous bidirectional neural machine translation (SB-NMT) model that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. Then, we present an interactive decoding model in which left-to-right (right-to-left) generation depends not only on its previously generated outputs, but also on future contexts predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on large-scale NIST Chinese-English, WMT14 English-German, and WMT18 Russian-English translation tasks. Experimental results demonstrate that our model achieves significant improvements over the strong Transformer model by 3.92, 1.49 and 1.04 BLEU points respectively, and obtains state-of-the-art performance on the Chinese-English and English-German translation tasks.
Tasks Machine Translation
Published 2019-05-13
URL https://arxiv.org/abs/1905.04847v1
PDF https://arxiv.org/pdf/1905.04847v1.pdf
PWC https://paperswithcode.com/paper/synchronous-bidirectional-neural-machine
Repo https://github.com/ZNLP/sb-nmt
Framework tf

Learning Deep Bilinear Transformation for Fine-grained Image Representation

Title Learning Deep Bilinear Transformation for Fine-grained Image Representation
Authors Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
Abstract Bilinear feature transformation has shown state-of-the-art performance in learning fine-grained image representations. However, the computational cost of learning pairwise interactions between deep feature channels is prohibitively expensive, which restricts this powerful transformation from being used in deep neural networks. In this paper, we propose a deep bilinear transformation (DBT) block, which can be deeply stacked in convolutional neural networks to learn fine-grained image representations. The DBT block uniformly divides input channels into several semantic groups. As the bilinear transformation can be represented by calculating pairwise interactions within each group, the computational cost can be substantially reduced. The output of each block is further obtained by aggregating intra-group bilinear features, with residuals from the entire input features. We find that the proposed network achieves a new state of the art on several fine-grained image recognition benchmarks, including CUB-Bird, Stanford-Car, and FGVC-Aircraft.
Tasks Fine-Grained Image Recognition
Published 2019-11-09
URL https://arxiv.org/abs/1911.03621v1
PDF https://arxiv.org/pdf/1911.03621v1.pdf
PWC https://paperswithcode.com/paper/learning-deep-bilinear-transformation-for
Repo https://github.com/researchmm/DBTNet
Framework pytorch
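The cost saving from grouping is easy to demonstrate: pairwise interactions within g groups of c/g channels give g*(c/g)^2 terms instead of c^2. A toy numpy sketch of the intra-group bilinear features (the real DBT block also aggregates the outputs and adds residuals, which is omitted here):

```python
import numpy as np

def grouped_bilinear(x, n_groups):
    """Pairwise channel interactions computed within each group only:
    g groups of c/g channels give g * (c/g)^2 terms instead of c^2."""
    gsz = len(x) // n_groups
    feats = [np.outer(x[g * gsz:(g + 1) * gsz],
                      x[g * gsz:(g + 1) * gsz]).ravel()
             for g in range(n_groups)]
    return np.concatenate(feats)

x = np.arange(8, dtype=float)
full = np.outer(x, x).ravel()              # 64 interaction terms
grouped = grouped_bilinear(x, n_groups=4)  # only 4 * 2^2 = 16 terms
```

With one group the grouped form recovers the full bilinear feature, so grouping is a pure compute/expressiveness trade-off.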

Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions

Title Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions
Authors Ali Girayhan Özbay, Sylvain Laizet, Panagiotis Tzirakis, Georgios Rizos, Björn Schuller
Abstract The Poisson equation is commonly encountered in engineering, including in computational fluid dynamics, where it is needed to compute corrections to the pressure field. We propose a novel fully convolutional neural network (CNN) architecture to infer the solution of the Poisson equation on a 2D Cartesian grid of varying size and spacing, given the right-hand-side term, arbitrary Dirichlet boundary conditions and grid parameters, which provides unprecedented versatility in this application. The boundary conditions are handled with a novel approach that decomposes the original Poisson problem into a homogeneous Poisson problem plus four inhomogeneous Laplace sub-problems. The model is trained using a novel loss function approximating the continuous $L^p$ norm between the prediction and the target. Analytical test cases indicate that our CNN architecture is capable of predicting the correct solution of a Poisson problem with mean percentage errors of 15% and promises improvements in wall-clock runtimes for large problems. Furthermore, even when predicting on meshes denser than previously encountered, our model demonstrates an encouraging capacity to reproduce the correct solution profile.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.08613v1
PDF https://arxiv.org/pdf/1910.08613v1.pdf
PWC https://paperswithcode.com/paper/poisson-cnn-convolutional-neural-networks-for
Repo https://github.com/aligirayhanozbay/poisson_CNN_jupyter
Framework tf
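The loss is described as approximating the continuous $L^p$ norm; on a uniform grid that amounts to weighting the discrete error sum by the cell area, so that grids of different size and spacing are penalized consistently. A small illustrative numpy version (a sketch of the idea; the paper's exact formulation may differ):

```python
import numpy as np

def lp_loss(pred, target, dx, dy, p=2):
    """Discrete approximation of the continuous L^p error norm:
    (integral |e|^p dA)^(1/p) ~ (sum |e|^p * dx * dy)^(1/p).
    Including the cell area dx*dy keeps grids of different size
    and spacing on a comparable scale."""
    err = np.abs(pred - target) ** p
    return (err.sum() * dx * dy) ** (1.0 / p)

# Unit error on a 1x1 domain split into 4x4 cells: loss is exactly 1.
loss = lp_loss(np.ones((4, 4)), np.zeros((4, 4)), dx=0.25, dy=0.25, p=2)
```

Without the area weighting, refining the mesh would inflate the loss even when the underlying error field is unchanged.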

Pay attention to the activations: a modular attention mechanism for fine-grained image recognition

Title Pay attention to the activations: a modular attention mechanism for fine-grained image recognition
Authors Pau Rodríguez López, Diego Velazquez Dorta, Guillem Cucurull Preixens, Josep M. Gonfaus, F. Xavier Roca Marva, Jordi Gonzàlez Sabaté
Abstract Fine-grained image recognition is central to many multimedia tasks such as search, retrieval and captioning. Unfortunately, these tasks remain challenging since samples of the same class can look more different from one another than samples from different classes. Attention has typically been implemented in neural networks by selecting the most informative regions of the image that improve classification. In contrast, in this paper, attention is not applied at the image level but to the convolutional feature activations. In essence, with our approach, the neural model learns to attend to lower-level feature activations without requiring part annotations and uses those activations to update and rectify the output likelihood distribution. The proposed mechanism is modular, architecture-independent and efficient in terms of both parameters and computation required. Experiments demonstrate that well-known networks such as Wide Residual Networks and ResNeXt, when augmented with our approach, systematically improve their classification accuracy and become more robust to changes in deformation and pose and to the presence of clutter. As a result, our proposal reaches state-of-the-art classification accuracies on CIFAR-10, the Adience gender recognition task, Stanford Dogs, and UEC-Food100 while obtaining competitive performance on ImageNet, CIFAR-100, CUB200 Birds, and Stanford Cars. In addition, we analyze the different components of our model, showing that the proposed attention modules succeed in finding the most discriminative regions of the image. Finally, as a proof of concept, we demonstrate that with only local predictions, an augmented neural network can successfully classify an image before reaching any fully connected layer, thus reducing the amount of computation by up to 10%.
Tasks Fine-Grained Image Recognition
Published 2019-07-30
URL https://arxiv.org/abs/1907.13075v1
PDF https://arxiv.org/pdf/1907.13075v1.pdf
PWC https://paperswithcode.com/paper/pay-attention-to-the-activations-a-modular
Repo https://github.com/prlz77/attend-and-rectify
Framework pytorch
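Attention over activations rather than image regions can be sketched as a softmax over the spatial positions of a feature map, followed by attention-weighted pooling into a local classifier. All weights below are random placeholders; this illustrates the general idea, not the paper's actual modules:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def activation_attention(feats, w_att, w_cls):
    """Attend over spatial positions of a (positions x channels)
    activation map; the attention-pooled feature feeds a classifier."""
    scores = softmax(feats @ w_att)   # one attention weight per position
    pooled = scores @ feats           # attention-weighted feature pooling
    return pooled @ w_cls             # local class logits

rng = np.random.default_rng(0)
feats = rng.normal(size=(9, 16))     # e.g. a 3x3 map with 16 channels
w_att = rng.normal(size=16)          # placeholder attention weights
w_cls = rng.normal(size=(16, 10))    # placeholder classifier weights
logits = activation_attention(feats, w_att, w_cls)  # shape (10,)
```

Because the module only reads an activation map and emits logits, it can be bolted onto any intermediate layer, which is the architecture-independence the abstract claims.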

Empirical confidence estimates for classification by deep neural networks

Title Empirical confidence estimates for classification by deep neural networks
Authors Chris Finlay, Adam M. Oberman
Abstract How well can we estimate the probability that the classification predicted by a deep neural network is correct (or in the Top 5)? It is well-known that the softmax values of the network are not estimates of the probabilities of class labels. However, there is a misconception that these values are not informative. We define the notion of \emph{implied loss} and prove that if an uncertainty measure is an implied loss, then low uncertainty means high probability of correct (or top $k$) classification on the test set. We demonstrate empirically that these values can be used to measure the confidence that the classification is correct. Our method is simple to use on existing networks: we propose confidence measures for Top $k$ that can be evaluated by binning values on the test set.
Tasks
Published 2019-03-21
URL https://arxiv.org/abs/1903.09215v2
PDF https://arxiv.org/pdf/1903.09215v2.pdf
PWC https://paperswithcode.com/paper/empirical-confidence-estimates-for
Repo https://github.com/cfinlay/confident-nn
Framework none
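The binning procedure is straightforward to sketch: bin held-out examples by an uncertainty value, record the empirical accuracy per bin, and read off a test example's confidence from its bin. A toy numpy version on synthetic data (the paper's implied-loss machinery is not reproduced here):

```python
import numpy as np

def binned_confidence(uncertainty, correct, n_bins=10):
    """Bin examples by uncertainty; empirical accuracy per bin then
    serves as the confidence estimate for examples in that bin."""
    edges = np.quantile(uncertainty, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, uncertainty, side="right") - 1,
                  0, n_bins - 1)
    acc = np.array([correct[idx == b].mean() for b in range(n_bins)])
    return edges, acc

# Synthetic check: make lower uncertainty mean higher accuracy.
rng = np.random.default_rng(0)
u = rng.uniform(size=1000)
correct = (rng.uniform(size=1000) > u).astype(float)
edges, acc = binned_confidence(u, correct, n_bins=5)  # acc falls as u rises
```

If the uncertainty measure is informative, accuracy decreases monotonically across bins, which is exactly the property the implied-loss result guarantees.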

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Title Language as an Abstraction for Hierarchical Deep Reinforcement Learning
Authors Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn
Abstract Solving complex, temporally-extended tasks is a long-standing problem in reinforcement learning (RL). We hypothesize that one critical element of solving such problems is the notion of compositionality. With the ability to learn concepts and sub-skills that can be composed to solve longer tasks, i.e. hierarchical RL, we can acquire temporally-extended behaviors. However, acquiring effective yet general abstractions for hierarchical RL is remarkably challenging. In this paper, we propose to use language as the abstraction, as it provides unique compositional structure, enabling fast learning and combinatorial generalization, while retaining tremendous flexibility, making it suitable for a variety of problems. Our approach learns an instruction-following low-level policy and a high-level policy that can reuse abstractions across tasks, in essence, permitting agents to reason using structured language. To study compositional task learning, we introduce an open-source object interaction environment built using the MuJoCo physics engine and the CLEVR engine. We find that, using our approach, agents can learn to solve diverse, temporally-extended tasks such as object sorting and multi-object rearrangement, including from raw pixel observations. Our analysis reveals that the compositional nature of language is critical for learning diverse sub-skills and systematically generalizing to new sub-skills in comparison to non-compositional abstractions that use the same supervision.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07343v2
PDF https://arxiv.org/pdf/1906.07343v2.pdf
PWC https://paperswithcode.com/paper/language-as-an-abstraction-for-hierarchical
Repo https://github.com/bhiziroglu/Language-as-an-Abstraction-for-Hierarchical-Deep-Reinforcement-Learning
Framework pytorch

Optimizing Through Learned Errors for Accurate Sports Field Registration

Title Optimizing Through Learned Errors for Accurate Sports Field Registration
Authors Wei Jiang, Juan Camilo Gamboa Higuera, Baptiste Angles, Weiwei Sun, Mehrsan Javan, Kwang Moo Yi
Abstract We propose an optimization-based framework to register sports field templates onto broadcast videos. For accurate registration we go beyond the prevalent feed-forward paradigm. Instead, we propose to train a deep network that regresses the registration error, and then register images by finding the registration parameters that minimize the regressed error. We demonstrate the effectiveness of our method by applying it to real-world sports broadcast videos, outperforming the state of the art. We further apply our method on a synthetic toy example and demonstrate that our method brings significant gains even when the problem is simplified and unlimited training data is available.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.08034v1
PDF https://arxiv.org/pdf/1909.08034v1.pdf
PWC https://paperswithcode.com/paper/optimizing-through-learned-errors-for
Repo https://github.com/vcg-uvic/sportsfield_release
Framework pytorch
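The optimize-through-the-error idea, in miniature: rather than predicting registration parameters feed-forward, descend on a learned error regressor with respect to the warp parameters. Below, a quadratic bowl stands in for the trained error network; everything here is a toy illustration, not the authors' pipeline:

```python
import numpy as np

def register(grad_fn, theta0, lr=0.1, steps=100):
    """Registration by optimization: gradient descent on a (learned)
    estimate of the registration error w.r.t. warp parameters."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta

# Toy stand-in for the learned error network: a quadratic bowl whose
# minimum sits at the (assumed) true warp parameters theta_star.
theta_star = np.array([0.5, -1.0])
grad_fn = lambda th: 2.0 * (th - theta_star)
theta_hat = register(grad_fn, np.zeros(2))  # converges toward theta_star
```

In the real system the gradient comes from backpropagating through the error-regression network with respect to its homography inputs, rather than from a closed-form expression.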