February 2, 2020

3277 words 16 mins read

Paper Group AWR 8

Feature-Less End-to-End Nested Term Extraction. A Survey on Deep Learning of Small Sample in Biomedical Image Analysis. Processing Megapixel Images with Deep Attention-Sampling Models. Visualizing Trends of Key Roles in News Articles. “The Michael Jordan of Greatness”: Extracting Vossian Antonomasia from Two Decades of the New York Times, 1987-2007 …

Feature-Less End-to-End Nested Term Extraction

Title Feature-Less End-to-End Nested Term Extraction
Authors Yuze Gao, Yu Yuan
Abstract In this paper, we propose a deep learning-based end-to-end method for domain-specific automatic term extraction (ATE). The model considers all possible term spans within a fixed length in a sentence and predicts whether each span is a conceptual term. Compared with current ATE methods, it supports nested term extraction and does not crucially need extra (extracted) features. Results show that it achieves high recall and comparable precision on the term extraction task when given segmented raw text as input.
Tasks
Published 2019-08-15
URL https://arxiv.org/abs/1908.05426v1
PDF https://arxiv.org/pdf/1908.05426v1.pdf
PWC https://paperswithcode.com/paper/feature-less-end-to-end-nested-term
Repo https://github.com/CooDL/NestedTermExtraction
Framework pytorch
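The core idea, enumerating every span up to a fixed length as a term candidate, is easy to sketch. Below is a minimal illustrative Python version (the function name and the downstream classifier it would feed are assumptions, not taken from the linked repo); nested terms come for free because overlapping spans are scored independently:

```python
def enumerate_spans(tokens, max_len=5):
    """List every candidate span of up to max_len tokens.

    Overlapping and nested spans all appear as separate candidates,
    which is what makes nested term extraction possible downstream."""
    spans = []
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            spans.append((start, end, tokens[start:end]))
    return spans

# Both ("deep", "neural") and ("deep", "neural", "network") show up,
# so a span classifier can accept a nested term and its parent.
spans = enumerate_spans(["deep", "neural", "network", "model"], max_len=3)
```

A real model would score each span with a neural classifier; this sketch only shows the candidate generation.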

A Survey on Deep Learning of Small Sample in Biomedical Image Analysis

Title A Survey on Deep Learning of Small Sample in Biomedical Image Analysis
Authors Pengyi Zhang, Yunxin Zhong, Yulin Deng, Xiaoying Tang, Xiaoqiong Li
Abstract Deep learning has emerged as a promising technique for computer-aided biomedical image analysis, thanks to its end-to-end learning framework and the availability of large-scale labelled samples. However, in many cases of biomedical image analysis, deep learning techniques suffer from the small-sample learning (SSL) dilemma, caused mainly by a lack of annotations. To make deep learning more practical for biomedical image analysis, in this paper we survey the key SSL techniques that help relieve this problem, drawing on the development of related techniques in computer vision applications. To accelerate the clinical adoption of deep-learning-based biomedical image analysis, we intentionally expand this survey to include explanation methods for deep models, which are important to clinical decision making. We divide the key SSL techniques into five categories: (1) explanation techniques, (2) weakly supervised learning techniques, (3) transfer learning techniques, (4) active learning techniques, and (5) miscellaneous techniques involving data augmentation, domain knowledge, traditional shallow methods and attention mechanisms. These techniques are expected to effectively support the application of deep learning in clinical biomedical image analysis and to further improve analysis performance, especially when large-scale annotated samples are not available. We build demos at https://github.com/PengyiZhang/MIADeepSSL.
Tasks Active Learning, Data Augmentation, Decision Making, Transfer Learning
Published 2019-08-01
URL https://arxiv.org/abs/1908.00473v1
PDF https://arxiv.org/pdf/1908.00473v1.pdf
PWC https://paperswithcode.com/paper/a-survey-on-deep-learning-of-small-sample-in
Repo https://github.com/PengyiZhang/MIADeepSSL
Framework pytorch

Processing Megapixel Images with Deep Attention-Sampling Models

Title Processing Megapixel Images with Deep Attention-Sampling Models
Authors Angelos Katharopoulos, François Fleuret
Abstract Existing deep architectures cannot operate on very large signals such as megapixel images due to computational and memory constraints. To tackle this limitation, we propose a fully differentiable end-to-end trainable model that samples and processes only a fraction of the full-resolution input image. The locations to process are sampled from an attention distribution computed from a low-resolution view of the input. We refer to our method as attention sampling; it can process images of several megapixels on a standard single-GPU setup. We show that sampling from the attention distribution results in an unbiased estimator of the full model with minimal variance, and we derive an unbiased estimator of the gradient that we use to train our model end-to-end with a normal SGD procedure. This new method is evaluated on three classification tasks, where we show that it reduces computation and memory footprint by an order of magnitude for the same accuracy as classical architectures. We also show the consistency of the sampling, which indeed focuses on informative parts of the input images.
Tasks Deep Attention
Published 2019-05-03
URL https://arxiv.org/abs/1905.03711v2
PDF https://arxiv.org/pdf/1905.03711v2.pdf
PWC https://paperswithcode.com/paper/190503711
Repo https://github.com/idiap/attention-sampling
Framework tf
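The unbiased-estimator claim can be seen in miniature: if attention weights a_i sum to one, averaging the features of patches sampled with probability a_i estimates the attention-weighted sum without touching every patch. A toy numpy sketch (the attention values and features below are made up; the real model computes attention from a low-resolution view of the image):

```python
import numpy as np

def attention_sample_estimate(attention, patch_features, n_samples, rng):
    """Monte Carlo estimate of sum_i a_i * f_i: sample patch indices
    i ~ a and average their features. E[f_I] = sum_i a_i f_i, so the
    estimator is unbiased while touching only n_samples patches."""
    idx = rng.choice(len(attention), size=n_samples, p=attention)
    return patch_features[idx].mean(axis=0)

rng = np.random.default_rng(0)
attention = np.array([0.7, 0.2, 0.1])          # assumed low-res attention
features = np.array([[1.0], [10.0], [100.0]])  # assumed per-patch features
exact = (attention[:, None] * features).sum(axis=0)        # 12.7
approx = attention_sample_estimate(attention, features, 100000, rng)
```

With enough samples the estimate concentrates around the exact attention-weighted value, which is the mechanism that lets the model skip most of a megapixel image.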

Visualizing Trends of Key Roles in News Articles

Title Visualizing Trends of Key Roles in News Articles
Authors Chen Xia, Haoxiang Zhang, Jacob Moghtader, Allen Wu, Kai-Wei Chang
Abstract Vast numbers of news articles are generated every day, reflecting the activities of key roles such as people, organizations and political parties. Analyzing these key roles allows us to understand trends in the news. In this paper, we present a demonstration system that visualizes the trends of key roles in news articles based on natural language processing techniques. Specifically, we apply a semantic role labeler and the dynamic word embedding technique to understand relationships between key roles in the news across different time periods, and visualize how the trends of key roles and news topics change over time.
Tasks
Published 2019-09-12
URL https://arxiv.org/abs/1909.05449v1
PDF https://arxiv.org/pdf/1909.05449v1.pdf
PWC https://paperswithcode.com/paper/visualizing-trends-of-key-roles-in-news
Repo https://github.com/kasinxc/Visualizing-Trend-of-Key-Roles-in-News-Articles
Framework tf

“The Michael Jordan of Greatness”: Extracting Vossian Antonomasia from Two Decades of the New York Times, 1987-2007

Title “The Michael Jordan of Greatness”: Extracting Vossian Antonomasia from Two Decades of the New York Times, 1987-2007
Authors Frank Fischer, Robert Jäschke
Abstract Vossian Antonomasia is a prolific stylistic device, in use since antiquity. It can compress the introduction or description of a person or another named entity into a terse, poignant formulation and can best be explained by an example: When Norwegian world champion Magnus Carlsen is described as “the Mozart of chess”, it is Vossian Antonomasia we are dealing with. The pattern is simple: A source (Mozart) is used to describe a target (Magnus Carlsen), the transfer of meaning is reached via a modifier (“of chess”). This phenomenon has been discussed before (as ‘metaphorical antonomasia’ or, with special focus on the source object, as ‘paragons’), but no corpus-based approach has been undertaken as yet to explore its breadth and variety. We are looking into a full-text newspaper corpus (The New York Times, 1987-2007) and describe a new method for the automatic extraction of Vossian Antonomasia based on Wikidata entities. Our analysis offers new insights into the occurrence of popular paragons and their distribution.
Tasks
Published 2019-02-18
URL http://arxiv.org/abs/1902.06428v1
PDF http://arxiv.org/pdf/1902.06428v1.pdf
PWC https://paperswithcode.com/paper/the-michael-jordan-of-greatness-extracting
Repo https://github.com/weltliteratur/vossanto
Framework none
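The surface pattern ("the SOURCE of MODIFIER") can be sketched with a naive regex, though the paper's actual method validates candidates against Wikidata entities, which this toy version omits entirely:

```python
import re

# Naive surface pattern for "the <Source> of <modifier>"; the paper's
# pipeline additionally links candidate sources to Wikidata entities.
PATTERN = re.compile(r"\bthe ((?:[A-Z]\w+ ?)+) of ([\w ]+)")

def extract_candidates(text):
    """Return (source, modifier) pairs for each naive pattern match."""
    return [(m.group(1).strip(), m.group(2).strip())
            for m in PATTERN.finditer(text)]

extract_candidates("He is the Mozart of chess.")   # [("Mozart", "chess")]
```

A pattern this loose would over-generate badly on real newspaper text ("the United States of America"), which is exactly why entity-based filtering is the paper's contribution.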

Adaptively Truncating Backpropagation Through Time to Control Gradient Bias

Title Adaptively Truncating Backpropagation Through Time to Control Gradient Bias
Authors Christopher Aicher, Nicholas J. Foti, Emily B. Fox
Abstract Truncated backpropagation through time (TBPTT) is a popular method for learning in recurrent neural networks (RNNs) that saves computation and memory at the cost of bias by truncating backpropagation after a fixed number of lags. In practice, choosing the optimal truncation length is difficult: TBPTT will not converge if the truncation length is too small, or will converge slowly if it is too large. We propose an adaptive TBPTT scheme that converts the problem from choosing a temporal lag to one of choosing a tolerable amount of gradient bias. For many realistic RNNs, the TBPTT gradients decay geometrically in expectation for large lags; under this condition, we can control the bias by varying the truncation length adaptively. For RNNs with smooth activation functions, we prove that this bias controls the convergence rate of SGD with biased gradients for our non-convex loss. Using this theory, we develop a practical method for adaptively estimating the truncation length during training. We evaluate our adaptive TBPTT method on synthetic data and language modeling tasks and find that our adaptive TBPTT ameliorates the computational pitfalls of fixed TBPTT.
Tasks Language Modelling
Published 2019-05-17
URL https://arxiv.org/abs/1905.07473v2
PDF https://arxiv.org/pdf/1905.07473v2.pdf
PWC https://paperswithcode.com/paper/adaptively-truncating-backpropagation-through
Repo https://github.com/aicherc/adaptive_tbptt
Framework pytorch
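Under the paper's geometric-decay condition, the adaptive choice reduces to picking the smallest truncation whose tail of gradient mass falls below a bias tolerance. A toy numpy sketch of that selection rule (an illustration of the idea, not the authors' estimator, which works from gradient norms measured during training):

```python
import numpy as np

def choose_truncation(grad_norms, tol=0.05):
    """Smallest truncation length K whose estimated relative gradient
    bias (mass of the lag-gradient tail beyond K) is below tol."""
    total = grad_norms.sum()
    tail = np.cumsum(grad_norms[::-1])[::-1]   # tail[k] = sum_{j >= k}
    for K in range(1, len(grad_norms)):
        if tail[K] / total < tol:
            return K
    return len(grad_norms)

# Lag-gradient norms decaying geometrically with rate 0.5 (assumed):
norms = 0.5 ** np.arange(10)
K = choose_truncation(norms, tol=0.05)   # truncate after 5 lags
```

Tightening `tol` trades computation for lower bias, which is the knob the paper proposes in place of picking a lag directly.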

X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies

Title X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies
Authors Kehan Qi, Hao Yang, Cheng Li, Zaiyi Liu, Meiyun Wang, Qiegen Liu, Shanshan Wang
Abstract The morbidity of brain stroke has increased rapidly in the past few years. To help specialists with lesion measurement and treatment planning, automatic segmentation methods are critically required for clinical practice. Recently, approaches based on deep learning and methods for contextual information extraction have served in many image segmentation tasks. However, their performance is limited by the insufficient training of a large number of parameters, which sometimes fail to capture long-range dependencies. To address these issues, we propose a depthwise-separable-convolution-based X-Net with a nonlocal operation, namely the Feature Similarity Module (FSM), to capture long-range dependencies. The adopted depthwise convolution reduces the network size, while the developed FSM provides more effective, dense contextual information extraction and thus facilitates better segmentation. The effectiveness of X-Net was evaluated on the open dataset Anatomical Tracings of Lesions After Stroke (ATLAS), with superior performance achieved compared to six other state-of-the-art approaches. We make our code and models available at https://github.com/Andrewsher/X-Net.
Tasks Lesion Segmentation, Semantic Segmentation
Published 2019-07-16
URL https://arxiv.org/abs/1907.07000v2
PDF https://arxiv.org/pdf/1907.07000v2.pdf
PWC https://paperswithcode.com/paper/x-net-brain-stroke-lesion-segmentation-based
Repo https://github.com/Andrewsher/X-Net
Framework tf
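Why depthwise separable convolutions shrink the network is just arithmetic: a per-channel spatial pass plus a 1x1 pointwise pass replaces the full channel-mixing kernel. A back-of-the-envelope sketch (parameter counts only, biases ignored; the channel sizes are illustrative, not X-Net's):

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv (one spatial filter per input channel)
    followed by a 1 x 1 pointwise conv that mixes channels."""
    return c_in * k * k + c_in * c_out

standard = conv_params(64, 128, 3)                   # 73728 parameters
separable = depthwise_separable_params(64, 128, 3)   # 8768 parameters
ratio = standard / separable                         # roughly 8.4x smaller
```

The saving grows with kernel size and channel count, which is what makes room for the extra nonlocal FSM in the same parameter budget.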

Comprehensive Evaluation of Deep Learning Architectures for Prediction of DNA/RNA Sequence Binding Specificities

Title Comprehensive Evaluation of Deep Learning Architectures for Prediction of DNA/RNA Sequence Binding Specificities
Authors Ameni Trabelsi, Mohamed Chaabane, Asa Ben Hur
Abstract Motivation: Deep learning architectures have recently demonstrated their power in predicting DNA- and RNA-binding specificities. Existing methods fall into three classes: some are based on Convolutional Neural Networks (CNNs), others use Recurrent Neural Networks (RNNs), and others rely on hybrid architectures combining CNNs and RNNs. However, based on existing studies it is still unclear which deep learning architecture achieves the best performance. Thus, an in-depth analysis and evaluation of the different methods is needed to fully evaluate their relative performance. Results: In this study, we present a systematic exploration of various deep learning architectures for predicting DNA- and RNA-binding specificities. For this purpose, we present deepRAM, an end-to-end deep learning tool that provides an implementation of novel and previously proposed architectures; its fully automatic model selection procedure allows us to perform a fair and unbiased comparison of deep learning architectures. We find that an architecture that uses k-mer embedding to represent the sequence, a convolutional layer and a recurrent layer outperforms all other methods in terms of model accuracy. Our work provides guidelines that will assist the practitioner in choosing the best architecture for the task at hand, and provides some insight into the differences between the models learned by convolutional and recurrent networks. In particular, we find that although recurrent networks improve model accuracy, this comes at the expense of a loss in the interpretability of the features learned by the model. Availability and implementation: The source code for deepRAM is available at https://github.com/MedChaabane/deepRAM
Tasks Automatic Machine Learning Model Selection, Model Selection, Multi-Label Text Classification, Text Classification
Published 2019-01-29
URL http://arxiv.org/abs/1901.10526v1
PDF http://arxiv.org/pdf/1901.10526v1.pdf
PWC https://paperswithcode.com/paper/comprehensive-evaluation-of-deep-learning
Repo https://github.com/MedChaabane/deepRAM
Framework pytorch
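The k-mer embedding input used by the best-performing architecture can be sketched directly: slide a window of width k over the sequence and map each k-mer to an integer id for an embedding-layer lookup. A minimal illustrative version (not deepRAM's actual preprocessing code):

```python
from itertools import product

def kmer_vocab(k, alphabet="ACGT"):
    """Assign an integer id to every possible k-mer."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}

def encode_kmers(seq, k, vocab):
    """Slide a width-k window over the sequence; each position becomes
    one token id, ready for an embedding-layer lookup."""
    return [vocab[seq[i:i + k]] for i in range(len(seq) - k + 1)]

vocab = kmer_vocab(3)                    # 64 possible 3-mers
ids = encode_kmers("ACGTAC", 3, vocab)   # 4 overlapping 3-mers
```

The resulting id sequence would feed an embedding layer ahead of the convolutional and recurrent layers the paper compares.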

Synchronous Bidirectional Neural Machine Translation

Title Synchronous Bidirectional Neural Machine Translation
Authors Long Zhou, Jiajun Zhang, Chengqing Zong
Abstract Existing approaches to neural machine translation (NMT) generate the target language sequence token by token from left to right. However, this kind of unidirectional decoding framework cannot make full use of the target-side future contexts that can be produced in a right-to-left decoding direction, and thus suffers from the issue of unbalanced outputs. In this paper, we introduce a synchronous bidirectional neural machine translation (SB-NMT) model that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. Then, we present an interactive decoding model in which left-to-right (right-to-left) generation depends not only on its previously generated outputs, but also on future contexts predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on large-scale NIST Chinese-English, WMT14 English-German, and WMT18 Russian-English translation tasks. Experimental results demonstrate that our model achieves significant improvements over the strong Transformer model by 3.92, 1.49 and 1.04 BLEU points respectively, and obtains state-of-the-art performance on the Chinese-English and English-German translation tasks.
Tasks Machine Translation
Published 2019-05-13
URL https://arxiv.org/abs/1905.04847v1
PDF https://arxiv.org/pdf/1905.04847v1.pdf
PWC https://paperswithcode.com/paper/synchronous-bidirectional-neural-machine
Repo https://github.com/ZNLP/sb-nmt
Framework tf

Learning Deep Bilinear Transformation for Fine-grained Image Representation

Title Learning Deep Bilinear Transformation for Fine-grained Image Representation
Authors Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
Abstract Bilinear feature transformation has shown state-of-the-art performance in learning fine-grained image representations. However, the computational cost of learning pairwise interactions between deep feature channels is prohibitively expensive, which restricts this powerful transformation from being used in deep neural networks. In this paper, we propose a deep bilinear transformation (DBT) block, which can be deeply stacked in convolutional neural networks to learn fine-grained image representations. The DBT block uniformly divides input channels into several semantic groups. As the bilinear transformation can be represented by calculating pairwise interactions within each group, the computational cost can be substantially reduced. The output of each block is further obtained by aggregating intra-group bilinear features, with residuals from the entire input features. We find that the proposed network achieves a new state of the art on several fine-grained image recognition benchmarks, including CUB-Bird, Stanford-Car, and FGVC-Aircraft.
Tasks Fine-Grained Image Recognition
Published 2019-11-09
URL https://arxiv.org/abs/1911.03621v1
PDF https://arxiv.org/pdf/1911.03621v1.pdf
PWC https://paperswithcode.com/paper/learning-deep-bilinear-transformation-for
Repo https://github.com/researchmm/DBTNet
Framework pytorch
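The cost saving from grouping is easy to demonstrate: pairwise interactions within g groups of c/g channels give g*(c/g)^2 terms instead of c^2. A toy numpy sketch of the intra-group bilinear features (the real DBT block also aggregates the outputs and adds residuals, which is omitted here):

```python
import numpy as np

def grouped_bilinear(x, n_groups):
    """Pairwise channel interactions computed within each group only:
    g groups of c/g channels give g * (c/g)^2 terms instead of c^2."""
    gsz = len(x) // n_groups
    feats = [np.outer(x[g * gsz:(g + 1) * gsz],
                      x[g * gsz:(g + 1) * gsz]).ravel()
             for g in range(n_groups)]
    return np.concatenate(feats)

x = np.arange(8, dtype=float)
full = np.outer(x, x).ravel()              # 64 interaction terms
grouped = grouped_bilinear(x, n_groups=4)  # only 4 * 2^2 = 16 terms
```

With one group the grouped form recovers the full bilinear feature, so grouping is a pure compute/expressiveness trade-off.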

Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions

Title Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions
Authors Ali Girayhan Özbay, Sylvain Laizet, Panagiotis Tzirakis, Georgios Rizos, Björn Schuller
Abstract The Poisson equation is commonly encountered in engineering, including in computational fluid dynamics, where it is needed to compute corrections to the pressure field. We propose a novel fully convolutional neural network (CNN) architecture to infer the solution of the Poisson equation on a 2D Cartesian grid of varying size and spacing, given the right-hand-side term, arbitrary Dirichlet boundary conditions and grid parameters, which provides unprecedented versatility in this application. The boundary conditions are handled with a novel approach that decomposes the original Poisson problem into a homogeneous Poisson problem plus four inhomogeneous Laplace sub-problems. The model is trained using a novel loss function approximating the continuous $L^p$ norm between the prediction and the target. Analytical test cases indicate that our CNN architecture is capable of predicting the correct solution of a Poisson problem with mean percentage errors of 15% and promises improvements in wall-clock runtimes for large problems. Furthermore, even when predicting on meshes denser than previously encountered, our model demonstrates an encouraging capacity to reproduce the correct solution profile.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.08613v1
PDF https://arxiv.org/pdf/1910.08613v1.pdf
PWC https://paperswithcode.com/paper/poisson-cnn-convolutional-neural-networks-for
Repo https://github.com/aligirayhanozbay/poisson_CNN_jupyter
Framework tf
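The loss is described as approximating the continuous $L^p$ norm; on a uniform grid that amounts to weighting the discrete error sum by the cell area, so that grids of different size and spacing are penalized consistently. A small illustrative numpy version (a sketch of the idea; the paper's exact formulation may differ):

```python
import numpy as np

def lp_loss(pred, target, dx, dy, p=2):
    """Discrete approximation of the continuous L^p error norm:
    (integral |e|^p dA)^(1/p) ~ (sum |e|^p * dx * dy)^(1/p).
    Including the cell area dx*dy keeps grids of different size
    and spacing on a comparable scale."""
    err = np.abs(pred - target) ** p
    return (err.sum() * dx * dy) ** (1.0 / p)

# Unit error on a 1x1 domain split into 4x4 cells: loss is exactly 1.
loss = lp_loss(np.ones((4, 4)), np.zeros((4, 4)), dx=0.25, dy=0.25, p=2)
```

Without the area weighting, refining the mesh would inflate the loss even when the underlying error field is unchanged.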

Pay attention to the activations: a modular attention mechanism for fine-grained image recognition

Title Pay attention to the activations: a modular attention mechanism for fine-grained image recognition
Authors Pau Rodríguez López, Diego Velazquez Dorta, Guillem Cucurull Preixens, Josep M. Gonfaus, F. Xavier Roca Marva, Jordi Gonzàlez Sabaté
Abstract Fine-grained image recognition is central to many multimedia tasks such as search, retrieval and captioning. Unfortunately, these tasks remain challenging since samples of the same class can look more different from one another than samples from different classes. Attention has typically been implemented in neural networks by selecting the most informative regions of the image that improve classification. In contrast, in this paper, attention is not applied at the image level but to the convolutional feature activations. In essence, with our approach, the neural model learns to attend to lower-level feature activations without requiring part annotations and uses those activations to update and rectify the output likelihood distribution. The proposed mechanism is modular, architecture-independent and efficient in terms of both parameters and computation required. Experiments demonstrate that well-known networks such as Wide Residual Networks and ResNeXt, when augmented with our approach, systematically improve their classification accuracy and become more robust to changes in deformation and pose and to the presence of clutter. As a result, our proposal reaches state-of-the-art classification accuracies on CIFAR-10, the Adience gender recognition task, Stanford Dogs, and UEC-Food100 while obtaining competitive performance on ImageNet, CIFAR-100, CUB200 Birds, and Stanford Cars. In addition, we analyze the different components of our model, showing that the proposed attention modules succeed in finding the most discriminative regions of the image. Finally, as a proof of concept, we demonstrate that with only local predictions, an augmented neural network can successfully classify an image before reaching any fully connected layer, thus reducing the amount of computation by up to 10%.
Tasks Fine-Grained Image Recognition
Published 2019-07-30
URL https://arxiv.org/abs/1907.13075v1
PDF https://arxiv.org/pdf/1907.13075v1.pdf
PWC https://paperswithcode.com/paper/pay-attention-to-the-activations-a-modular
Repo https://github.com/prlz77/attend-and-rectify
Framework pytorch
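Attention over activations rather than image regions can be sketched as a softmax over the spatial positions of a feature map, followed by attention-weighted pooling into a local classifier. All weights below are random placeholders; this illustrates the general idea, not the paper's actual modules:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def activation_attention(feats, w_att, w_cls):
    """Attend over spatial positions of a (positions x channels)
    activation map; the attention-pooled feature feeds a classifier."""
    scores = softmax(feats @ w_att)   # one attention weight per position
    pooled = scores @ feats           # attention-weighted feature pooling
    return pooled @ w_cls             # local class logits

rng = np.random.default_rng(0)
feats = rng.normal(size=(9, 16))     # e.g. a 3x3 map with 16 channels
w_att = rng.normal(size=16)          # placeholder attention weights
w_cls = rng.normal(size=(16, 10))    # placeholder classifier weights
logits = activation_attention(feats, w_att, w_cls)  # shape (10,)
```

Because the module only reads an activation map and emits logits, it can be bolted onto any intermediate layer, which is the architecture-independence the abstract claims.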

Empirical confidence estimates for classification by deep neural networks

Title Empirical confidence estimates for classification by deep neural networks
Authors Chris Finlay, Adam M. Oberman
Abstract How well can we estimate the probability that the classification predicted by a deep neural network is correct (or in the Top 5)? It is well-known that the softmax values of the network are not estimates of the probabilities of class labels. However, there is a misconception that these values are not informative. We define the notion of \emph{implied loss} and prove that if an uncertainty measure is an implied loss, then low uncertainty means high probability of correct (or top $k$) classification on the test set. We demonstrate empirically that these values can be used to measure the confidence that the classification is correct. Our method is simple to use on existing networks: we propose confidence measures for Top $k$ that can be evaluated by binning values on the test set.
Tasks
Published 2019-03-21
URL https://arxiv.org/abs/1903.09215v2
PDF https://arxiv.org/pdf/1903.09215v2.pdf
PWC https://paperswithcode.com/paper/empirical-confidence-estimates-for
Repo https://github.com/cfinlay/confident-nn
Framework none
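The binning procedure is straightforward to sketch: bin held-out examples by an uncertainty value, record the empirical accuracy per bin, and read off a test example's confidence from its bin. A toy numpy version on synthetic data (the paper's implied-loss machinery is not reproduced here):

```python
import numpy as np

def binned_confidence(uncertainty, correct, n_bins=10):
    """Bin examples by uncertainty; empirical accuracy per bin then
    serves as the confidence estimate for examples in that bin."""
    edges = np.quantile(uncertainty, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, uncertainty, side="right") - 1,
                  0, n_bins - 1)
    acc = np.array([correct[idx == b].mean() for b in range(n_bins)])
    return edges, acc

# Synthetic check: make lower uncertainty mean higher accuracy.
rng = np.random.default_rng(0)
u = rng.uniform(size=1000)
correct = (rng.uniform(size=1000) > u).astype(float)
edges, acc = binned_confidence(u, correct, n_bins=5)  # acc falls as u rises
```

If the uncertainty measure is informative, accuracy decreases monotonically across bins, which is exactly the property the implied-loss result guarantees.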

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Title Language as an Abstraction for Hierarchical Deep Reinforcement Learning
Authors Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn
Abstract Solving complex, temporally-extended tasks is a long-standing problem in reinforcement learning (RL). We hypothesize that one critical element of solving such problems is the notion of compositionality. With the ability to learn concepts and sub-skills that can be composed to solve longer tasks, i.e. hierarchical RL, we can acquire temporally-extended behaviors. However, acquiring effective yet general abstractions for hierarchical RL is remarkably challenging. In this paper, we propose to use language as the abstraction, as it provides unique compositional structure, enabling fast learning and combinatorial generalization, while retaining tremendous flexibility, making it suitable for a variety of problems. Our approach learns an instruction-following low-level policy and a high-level policy that can reuse abstractions across tasks, in essence, permitting agents to reason using structured language. To study compositional task learning, we introduce an open-source object interaction environment built using the MuJoCo physics engine and the CLEVR engine. We find that, using our approach, agents can learn to solve diverse, temporally-extended tasks such as object sorting and multi-object rearrangement, including from raw pixel observations. Our analysis reveals that the compositional nature of language is critical for learning diverse sub-skills and systematically generalizing to new sub-skills in comparison to non-compositional abstractions that use the same supervision.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07343v2
PDF https://arxiv.org/pdf/1906.07343v2.pdf
PWC https://paperswithcode.com/paper/language-as-an-abstraction-for-hierarchical
Repo https://github.com/bhiziroglu/Language-as-an-Abstraction-for-Hierarchical-Deep-Reinforcement-Learning
Framework pytorch

Optimizing Through Learned Errors for Accurate Sports Field Registration

Title Optimizing Through Learned Errors for Accurate Sports Field Registration
Authors Wei Jiang, Juan Camilo Gamboa Higuera, Baptiste Angles, Weiwei Sun, Mehrsan Javan, Kwang Moo Yi
Abstract We propose an optimization-based framework to register sports field templates onto broadcast videos. For accurate registration we go beyond the prevalent feed-forward paradigm. Instead, we propose to train a deep network that regresses the registration error, and then register images by finding the registration parameters that minimize the regressed error. We demonstrate the effectiveness of our method by applying it to real-world sports broadcast videos, outperforming the state of the art. We further apply our method on a synthetic toy example and demonstrate that our method brings significant gains even when the problem is simplified and unlimited training data is available.
Tasks
Published 2019-09-17
URL https://arxiv.org/abs/1909.08034v1
PDF https://arxiv.org/pdf/1909.08034v1.pdf
PWC https://paperswithcode.com/paper/optimizing-through-learned-errors-for
Repo https://github.com/vcg-uvic/sportsfield_release
Framework pytorch
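The optimize-through-the-error idea, in miniature: rather than predicting registration parameters feed-forward, descend on a learned error regressor with respect to the warp parameters. Below, a quadratic bowl stands in for the trained error network; everything here is a toy illustration, not the authors' pipeline:

```python
import numpy as np

def register(grad_fn, theta0, lr=0.1, steps=100):
    """Registration by optimization: gradient descent on a (learned)
    estimate of the registration error w.r.t. warp parameters."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta

# Toy stand-in for the learned error network: a quadratic bowl whose
# minimum sits at the (assumed) true warp parameters theta_star.
theta_star = np.array([0.5, -1.0])
grad_fn = lambda th: 2.0 * (th - theta_star)
theta_hat = register(grad_fn, np.zeros(2))  # converges toward theta_star
```

In the real system the gradient comes from backpropagating through the error-regression network with respect to its homography inputs, rather than from a closed-form expression.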