Paper Group AWR 30
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies. Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis. Deep Declarative Networks: A New Hope. Discovering Neural Wirings. DVDnet: A Fast Network for Deep Video Denoising. PaperRobot: Incremental Draft Generation of Scientific Ideas. …
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies
Title | The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies |
Authors | Ronen Basri, David Jacobs, Yoni Kasten, Shira Kritchman |
Abstract | We study the relationship between the frequency of a function and the speed at which a neural network learns it. We build on recent results that show that the dynamics of overparameterized neural networks trained with gradient descent can be well approximated by a linear system. When normalized training data is uniformly distributed on a hypersphere, the eigenfunctions of this linear system are spherical harmonic functions. We derive the corresponding eigenvalues for each frequency after introducing a bias term in the model. This bias term had been omitted from the linear network model without significantly affecting previous theoretical results. However, we show theoretically and experimentally that a shallow neural network without bias cannot represent or learn simple, low frequency functions with odd frequencies. Our results lead to specific predictions of the time it will take a network to learn functions of varying frequency. These predictions match the empirical behavior of both shallow and deep networks. |
Tasks | |
Published | 2019-06-02 |
URL | https://arxiv.org/abs/1906.00425v3 |
PDF | https://arxiv.org/pdf/1906.00425v3.pdf |
PWC | https://paperswithcode.com/paper/190600425 |
Repo | https://github.com/ykasten/Convergence-Rate-NN-Different-Frequencies |
Framework | pytorch |
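A minimal sketch (not the authors' code) of the headline effect: a shallow ReLU network with bias is trained on pure-frequency targets cos(kθ) on the unit circle (the 1D instance of the paper's hypersphere setting), and higher frequencies need visibly more steps to reach the same loss. Width, learning rate, and the tested frequencies are arbitrary choices here.

```python
import math
import torch

torch.manual_seed(0)

# Normalized inputs: points on the unit circle.
theta = torch.rand(256) * 2 * math.pi
x = torch.stack([torch.cos(theta), torch.sin(theta)], dim=1)

def steps_to_fit(k, width=1024, lr=0.1, tol=1e-3, max_steps=20000):
    """Train a shallow ReLU net (with bias) on cos(k*theta); return steps used."""
    y = torch.cos(k * theta).unsqueeze(1)
    net = torch.nn.Sequential(
        torch.nn.Linear(2, width), torch.nn.ReLU(), torch.nn.Linear(width, 1)
    )
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for step in range(max_steps):
        loss = torch.mean((net(x) - y) ** 2)
        if loss.item() < tol:
            return step
        opt.zero_grad()
        loss.backward()
        opt.step()
    return max_steps

for k in [1, 2, 4]:  # higher k should take markedly longer
    print(f"frequency {k}: {steps_to_fit(k)} steps")
```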
Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis
Title | Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis |
Authors | Jialong Tang, Ziyao Lu, Jinsong Su, Yubin Ge, Linfeng Song, Le Sun, Jiebo Luo |
Abstract | In aspect-level sentiment classification (ASC), it is prevalent to equip dominant neural models with attention mechanisms, for the sake of acquiring the importance of each context word on the given aspect. However, such a mechanism tends to excessively focus on a few frequent words with sentiment polarities, while ignoring infrequent ones. In this paper, we propose a progressive self-supervised attention learning approach for neural ASC models, which automatically mines useful attention supervision information from a training corpus to refine attention mechanisms. Specifically, we iteratively conduct sentiment predictions on all training instances. Particularly, at each iteration, the context word with the maximum attention weight is extracted as the one with active/misleading influence on the correct/incorrect prediction of every instance, and then the word itself is masked for subsequent iterations. Finally, we augment the conventional training objective with a regularization term, which enables ASC models to continue equally focusing on the extracted active context words while decreasing weights of those misleading ones. Experimental results on multiple datasets show that our proposed approach yields better attention mechanisms, leading to substantial improvements over the two state-of-the-art neural ASC models. Source code and trained models are available at https://github.com/DeepLearnXMU/PSSAttention. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01213v3 |
PDF | https://arxiv.org/pdf/1906.01213v3.pdf |
PWC | https://paperswithcode.com/paper/progressive-self-supervised-attention |
Repo | https://github.com/DeepLearnXMU/PSSAttention |
Framework | tf |
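The mining loop can be sketched framework-agnostically. Here `predict` is a hypothetical wrapper around any attention-based ASC model that returns a predicted label and attention weights while honoring a mask over context words; it stands in for the models used in the paper.

```python
import numpy as np

def mine_attention_supervision(predict, tokens, label, n_iters=3):
    """Sketch of the paper's mining loop for one training instance."""
    mask = np.ones(len(tokens), dtype=bool)
    active, misleading = [], []
    for _ in range(n_iters):
        pred, attn = predict(tokens, mask)
        attn = np.where(mask, attn, -np.inf)
        top = int(np.argmax(attn))          # most-attended unmasked word
        if pred == label:
            active.append(top)              # helpful influence
        else:
            misleading.append(top)          # misleading influence
        mask[top] = False                   # hide it for subsequent iterations
    return active, misleading
```

The extracted indices would then feed the regularization term that encourages the model to keep attending to active words and down-weight misleading ones.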
Deep Declarative Networks: A New Hope
Title | Deep Declarative Networks: A New Hope |
Authors | Stephen Gould, Richard Hartley, Dylan Campbell |
Abstract | We explore a new class of end-to-end learnable models wherein data processing nodes (or network layers) are defined in terms of desired behavior rather than an explicit forward function. Specifically, the forward function is implicitly defined as the solution to a mathematical optimization problem. Consistent with nomenclature in the programming languages community, we name these models deep declarative networks. Importantly, we show that the class of deep declarative networks subsumes current deep learning models. Moreover, invoking the implicit function theorem, we show how gradients can be back-propagated through many declaratively defined data processing nodes thereby enabling end-to-end learning. We show how these declarative processing nodes can be implemented in the popular PyTorch deep learning software library allowing declarative and imperative nodes to co-exist within the same network. We also provide numerous insights and illustrative examples of declarative nodes and demonstrate their application for image and point cloud classification tasks. |
Tasks | |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.04866v2 |
PDF | https://arxiv.org/pdf/1909.04866v2.pdf |
PWC | https://paperswithcode.com/paper/deep-declarative-networks-a-new-hope |
Repo | https://github.com/anucvml/ddn |
Framework | pytorch |
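A toy declarative node, sketched under the paper's recipe: the forward pass solves an inner optimization problem and the backward pass differentiates its solution via the implicit function theorem, dL/dx = -(dL/dy) H⁻¹ B with H = ∂²f/∂y² and B = ∂²f/∂y∂x. The inner objective below is an arbitrary example chosen for illustration, not one from the paper; the authors' ddn library wraps this pattern more generally.

```python
import torch

def solve_inner(x, steps=200, lr=0.1):
    # Inner optimization: y* = argmin_y f(x, y) by plain gradient descent.
    y = x.detach().clone()
    for _ in range(steps):
        y = y - lr * ((y - x) + y * (y ** 2).sum())
    return y

class DeclarativeNode(torch.autograd.Function):
    """Toy declarative node for f(x, y) = 0.5||y - x||^2 + 0.25||y||^4."""

    @staticmethod
    def forward(ctx, x):
        y = solve_inner(x)
        ctx.save_for_backward(x, y)
        return y

    @staticmethod
    def backward(ctx, grad_out):
        x, y = ctx.saved_tensors
        x = x.detach().requires_grad_(True)
        y = y.detach().requires_grad_(True)
        f = 0.5 * ((y - x) ** 2).sum() + 0.25 * ((y ** 2).sum()) ** 2
        (fy,) = torch.autograd.grad(f, y, create_graph=True)
        rows = [torch.autograd.grad(fy[i], (y, x), retain_graph=True)
                for i in range(y.numel())]
        H = torch.stack([r[0] for r in rows])   # d2f/dy2
        B = torch.stack([r[1] for r in rows])   # d2f/dydx
        w = torch.linalg.solve(H, grad_out)     # implicit function theorem:
        return -w @ B                           # dL/dx = -(dL/dy) H^{-1} B

x = torch.tensor([1.0, -2.0, 0.5], requires_grad=True)
DeclarativeNode.apply(x).sum().backward()
print(x.grad)
```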
Discovering Neural Wirings
Title | Discovering Neural Wirings |
Authors | Mitchell Wortsman, Ali Farhadi, Mohammad Rastegari |
Abstract | The success of neural networks has driven a shift in focus from feature engineering to architecture engineering. However, successful networks today are constructed using a small and manually defined set of building blocks. Even in methods of neural architecture search (NAS) the network connectivity patterns are largely constrained. In this work we propose a method for discovering neural wirings. We relax the typical notion of layers and instead enable channels to form connections independent of each other. This allows for a much larger space of possible networks. The wiring of our network is not fixed during training – as we learn the network parameters we also learn the structure itself. Our experiments demonstrate that our learned connectivity outperforms hand-engineered and randomly wired networks. By learning the connectivity of MobileNetV1 we boost the ImageNet accuracy by 10% at ~41M FLOPs. Moreover, we show that our method generalizes to recurrent and continuous time networks. Our work may also be regarded as unifying core aspects of the neural architecture search problem with sparse neural network learning. As NAS becomes more fine-grained, finding a good architecture is akin to finding a sparse subnetwork of the complete graph. Accordingly, DNW provides an effective mechanism for discovering sparse subnetworks of predefined architectures in a single training run. Though we only ever use a small percentage of the weights during the forward pass, we still play the so-called initialization lottery with a combinatorial number of subnetworks. Code and pretrained models are available at https://github.com/allenai/dnw while additional visualizations may be found at https://mitchellnw.github.io/blog/2019/dnw/. |
Tasks | Feature Engineering, Neural Architecture Search |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00586v5 |
PDF | https://arxiv.org/pdf/1906.00586v5.pdf |
PWC | https://paperswithcode.com/paper/190600586 |
Repo | https://github.com/allenai/dnw |
Framework | pytorch |
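A sketch, in the spirit of the method, of the discrete wiring choice: keep only the top-k edges by weight magnitude in the forward pass while letting gradients reach every edge (a straight-through estimator), so currently unused edges can strengthen and re-enter the wiring. The repo's actual update rule may differ in its details.

```python
import torch

class ChooseTopEdges(torch.autograd.Function):
    """Forward: keep the k largest-magnitude edges of a dense edge-weight
    matrix. Backward: pass gradients to *all* edges (straight-through)."""

    @staticmethod
    def forward(ctx, weights, k):
        # Threshold = the (numel - k + 1)-th smallest magnitude.
        thresh = weights.abs().flatten().kthvalue(weights.numel() - k + 1).values
        mask = (weights.abs() >= thresh).float()
        return weights * mask

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None   # gradient flows to every edge

w = torch.randn(8, 8, requires_grad=True)  # dense "complete graph" over channels
sparse_w = ChooseTopEdges.apply(w, 16)     # discovered wiring: 16 live edges
```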
DVDnet: A Fast Network for Deep Video Denoising
Title | DVDnet: A Fast Network for Deep Video Denoising |
Authors | Matias Tassano, Julie Delon, Thomas Veit |
Abstract | In this paper, we propose a state-of-the-art video denoising algorithm based on a convolutional neural network architecture. Previous neural network based approaches to video denoising have been unsuccessful as their performance cannot compete with the performance of patch-based methods. However, our approach outperforms other patch-based competitors with significantly lower computing times. In contrast to other existing neural network denoisers, our algorithm exhibits several desirable properties such as a small memory footprint, and the ability to handle a wide range of noise levels with a single network model. The combination of its denoising performance and lower computational load makes this algorithm attractive for practical denoising applications. We compare our method with different state-of-the-art algorithms, both visually and with respect to objective quality metrics. The experiments show that our algorithm compares favorably to other state-of-the-art methods. Video examples, code and models are publicly available at https://github.com/m-tassano/dvdnet. |
Tasks | Denoising, Video Denoising |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.11890v1 |
PDF | https://arxiv.org/pdf/1906.11890v1.pdf |
PWC | https://paperswithcode.com/paper/dvdnet-a-fast-network-for-deep-video |
Repo | https://github.com/m-tassano/dvdnet |
Framework | pytorch |
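A schematic of the two-step design the paper describes, with placeholder conv stacks and illustrative shapes (not the paper's architecture): each frame is denoised spatially, then a window of denoised frames is fused temporally, with a per-pixel noise map as extra input so one model covers a range of noise levels. The paper additionally motion-compensates neighboring frames before the temporal step, which is omitted here.

```python
import torch
import torch.nn as nn

class TwoStageVideoDenoiser(nn.Module):
    def __init__(self, window=5, ch=3, width=32):
        super().__init__()
        # Spatial stage: frame + noise map -> spatially denoised frame.
        self.spatial = nn.Sequential(
            nn.Conv2d(ch + 1, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, ch, 3, padding=1),
        )
        # Temporal stage: window of denoised frames -> denoised center frame.
        self.temporal = nn.Sequential(
            nn.Conv2d(window * ch + 1, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, ch, 3, padding=1),
        )

    def forward(self, frames, noise_map):
        # frames: (window, C, H, W); noise_map: (1, H, W).
        den = [self.spatial(torch.cat([f, noise_map], 0).unsqueeze(0))[0]
               for f in frames]
        stacked = torch.cat(den + [noise_map], 0).unsqueeze(0)
        return self.temporal(stacked)[0]

net = TwoStageVideoDenoiser()
frames = torch.rand(5, 3, 64, 64)
print(net(frames, torch.full((1, 64, 64), 0.1)).shape)  # (3, 64, 64)
```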
PaperRobot: Incremental Draft Generation of Scientific Ideas
Title | PaperRobot: Incremental Draft Generation of Scientific Ideas |
Authors | Qingyun Wang, Lifu Huang, Zhiying Jiang, Kevin Knight, Heng Ji, Mohit Bansal, Yi Luan |
Abstract | We present PaperRobot, which performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, combining graph attention and contextual text attention; (3) incrementally writing key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities it generates a paper abstract, from the abstract it generates the conclusion and future work, and finally from the future work it generates a title for a follow-on paper. Turing Tests, where a biomedical domain expert is asked to compare a system output and a human-authored string, show that PaperRobot-generated abstracts, conclusion and future work sections, and new titles are chosen over human-written ones up to 30%, 24% and 12% of the time, respectively. |
Tasks | Knowledge Graphs, Paper generation, Text Generation |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.07870v4 |
PDF | https://arxiv.org/pdf/1905.07870v4.pdf |
PWC | https://paperswithcode.com/paper/paperrobot-incremental-draft-generation-of |
Repo | https://github.com/EagleW/PaperRobot |
Framework | pytorch |
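The incremental drafting chain reads naturally as function composition. `generate` below is a hypothetical stand-in for the paper's memory-attention seq2seq model; each stage conditions on the previous stage's output.

```python
def draft_paper(generate, title, related_entities):
    """Sketch of the incremental drafting chain described above."""
    abstract = generate(source=title, entities=related_entities)
    conclusion_future = generate(source=abstract)
    next_title = generate(source=conclusion_future)   # seed for a follow-on paper
    return abstract, conclusion_future, next_title
```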
Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation
Title | Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation |
Authors | Hao Tang, Dan Xu, Nicu Sebe, Yan Yan |
Abstract | The state-of-the-art approaches in Generative Adversarial Networks (GANs) are able to learn a mapping function from one image domain to another with unpaired image data. However, these methods often produce artifacts and can only transfer low-level information, failing to transfer the high-level semantic part of images. The main reason is that generators do not have the ability to detect the most discriminative semantic part of images, which leaves the generated images low-quality. To handle this limitation, in this paper we propose a novel Attention-Guided Generative Adversarial Network (AGGAN), which can detect the most discriminative semantic object and minimize changes to unwanted parts for semantic manipulation problems without using extra data or models. The attention-guided generators in AGGAN are able to produce attention masks via a built-in attention mechanism, and then fuse the input image with the attention mask to obtain a high-quality target image. Moreover, we propose a novel attention-guided discriminator which only considers attended regions. The proposed AGGAN is trained in an end-to-end fashion with an adversarial loss, cycle-consistency loss, pixel loss and attention loss. Both qualitative and quantitative results demonstrate that our approach generates sharper and more accurate images than existing models. The code is available at https://github.com/Ha0Tang/AttentionGAN. |
Tasks | Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2019-03-28 |
URL | https://arxiv.org/abs/1903.12296v3 |
PDF | https://arxiv.org/pdf/1903.12296v3.pdf |
PWC | https://paperswithcode.com/paper/attention-guided-generative-adversarial |
Repo | https://github.com/Ha0Tang/AGGAN |
Framework | pytorch |
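A sketch of the attention-guided fusion at the heart of AGGAN: the generator emits both a content image and an attention mask, and the output keeps the input pixels wherever the mask marks a region as irrelevant to the translation. The backbone below is a placeholder, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttentionGuidedGenerator(nn.Module):
    def __init__(self, ch=3, width=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(ch, width, 3, padding=1), nn.ReLU(),
        )
        self.to_content = nn.Conv2d(width, ch, 3, padding=1)
        self.to_mask = nn.Conv2d(width, 1, 3, padding=1)

    def forward(self, x):
        h = self.backbone(x)
        content = torch.tanh(self.to_content(h))
        mask = torch.sigmoid(self.to_mask(h))       # built-in attention mask
        # Translate only attended regions; keep the input elsewhere.
        return mask * content + (1 - mask) * x, mask
```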
sktime: A Unified Interface for Machine Learning with Time Series
Title | sktime: A Unified Interface for Machine Learning with Time Series |
Authors | Markus Löning, Anthony Bagnall, Sajaysurya Ganesh, Viktor Kazakov, Jason Lines, Franz J. Király |
Abstract | We present sktime – a new scikit-learn compatible Python library with a unified interface for machine learning with time series. Time series data gives rise to various distinct but closely related learning tasks, such as forecasting and time series classification, many of which can be solved by reducing them to related simpler tasks. We discuss the main rationale for creating a unified interface, including reduction, as well as the design of sktime’s core API, supported by a clear overview of common time series tasks and reduction approaches. |
Tasks | Time Series, Time Series Analysis, Time Series Classification, Time Series Forecasting |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07872v1 |
PDF | https://arxiv.org/pdf/1909.07872v1.pdf |
PWC | https://paperswithcode.com/paper/sktime-a-unified-interface-for-machine |
Repo | https://github.com/alan-turing-institute/sktime |
Framework | none |
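A short usage example of the unified fit/predict interface (exact API details may differ across sktime versions):

```python
import numpy as np
import pandas as pd
from sktime.forecasting.naive import NaiveForecaster

# Univariate series -> forecast the next three steps with the
# scikit-learn-style fit/predict interface.
y = pd.Series(np.sin(np.arange(100) / 5.0))
forecaster = NaiveForecaster(strategy="last")
forecaster.fit(y)
print(forecaster.predict(fh=[1, 2, 3]))
```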
SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition
Title | SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition |
Authors | Carlos Caetano, Jessica Sena, François Brémond, Jefersson A. dos Santos, William Robson Schwartz |
Abstract | Due to the availability of large-scale skeleton datasets, 3D human action recognition has recently attracted the attention of the computer vision community. Many works have focused on encoding skeleton data as skeleton image representations based on the spatial structure of the skeleton joints, in which the temporal dynamics of the sequence are encoded as variations in columns and the spatial structure of each frame is represented as rows of a matrix. To further improve such representations, we introduce a novel skeleton image representation to be used as input to Convolutional Neural Networks (CNNs), named SkeleMotion. The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Different temporal scales are employed to compute motion values, aggregating more temporal dynamics into the representation and making it able to capture long-range joint interactions involved in actions as well as filter noisy motion values. Experimental results demonstrate the effectiveness of the proposed representation on 3D action recognition, outperforming the state-of-the-art on the NTU RGB+D 120 dataset. |
Tasks | 3D Human Action Recognition, Action Recognition In Videos, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.13025v1 |
PDF | https://arxiv.org/pdf/1907.13025v1.pdf |
PWC | https://paperswithcode.com/paper/skelemotion-a-new-representation-of-skeleton |
Repo | https://github.com/carloscaetano/skeleton-images |
Framework | none |
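A sketch of the motion encoding described above, assuming a `(frames, J, 3)` joint array; the exact channel layout and orientation definition in the paper may differ:

```python
import numpy as np

def skelemotion(joints, scales=(1, 2, 4)):
    """Per joint and temporal scale t, summarize the displacement
    joints[f + t] - joints[f] by its magnitude and orientation."""
    mags, orients = [], []
    for t in scales:                                  # requires t < frames
        d = joints[t:] - joints[:-t]                  # (frames - t, J, 3)
        mags.append(np.linalg.norm(d, axis=-1))       # magnitude channel
        # Orientation as displacement angles in the xy and xz planes.
        orients.append(np.stack([np.arctan2(d[..., 1], d[..., 0]),
                                 np.arctan2(d[..., 2], d[..., 0])], axis=-1))
    return mags, orients
```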
Multilingual Named Entity Recognition Using Pretrained Embeddings, Attention Mechanism and NCRF
Title | Multilingual Named Entity Recognition Using Pretrained Embeddings, Attention Mechanism and NCRF |
Authors | Anton A. Emelyanov, Ekaterina Artemova |
Abstract | In this paper we tackle the multilingual named entity recognition task. We use the BERT language model to produce embeddings, with a bidirectional recurrent network, attention, and an NCRF layer on top. We apply multilingual BERT only as an embedder, without any fine-tuning. We test our model on the dataset of the BSNLP shared task, which consists of texts in Bulgarian, Czech, Polish, and Russian. |
Tasks | Joint NER and Classification, Language Modelling, Multilingual text classification, Named Entity Recognition |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09978v1 |
PDF | https://arxiv.org/pdf/1906.09978v1.pdf |
PWC | https://paperswithcode.com/paper/multilingual-named-entity-recognition-using-1 |
Repo | https://github.com/king-menin/slavic-ner |
Framework | pytorch |
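A skeleton of the described stack, assuming the HuggingFace transformers package: frozen multilingual BERT as an embedder, a BiLSTM on top, and per-token emission scores. The attention layer and NCRF decoding from the paper are omitted here.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class BertBiLSTMTagger(nn.Module):
    def __init__(self, n_tags, hidden=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-multilingual-cased")
        for p in self.bert.parameters():
            p.requires_grad = False        # embeddings only, no fine-tuning
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hidden, n_tags)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            h = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(h)
        return self.emit(out)              # emissions for CRF-style decoding
```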
Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints
Title | Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints |
Authors | Carlos Caetano, François Brémond, William Robson Schwartz |
Abstract | In recent years, the computer vision research community has studied how to model temporal dynamics in videos for 3D human action recognition. To that end, two main baseline approaches have been researched: (i) Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM); and (ii) skeleton image representations used as input to a Convolutional Neural Network (CNN). Although RNN approaches present excellent results, such methods lack the ability to efficiently learn the spatial relations between the skeleton joints. On the other hand, the representations used to feed CNN approaches have the advantage of naturally learning structural information from 2D arrays (i.e., they learn spatial relations from the skeleton joints). To further improve such representations, we introduce the Tree Structure Reference Joints Image (TSRJI), a novel skeleton image representation to be used as input to CNNs. The proposed representation has the advantage of combining the use of reference joints and a tree structure skeleton. While the former incorporates different spatial relationships between the joints, the latter preserves important spatial relations by traversing a skeleton tree with a depth-first order algorithm. Experimental results demonstrate the effectiveness of the proposed representation for 3D action recognition on two datasets, achieving state-of-the-art results on the recent NTU RGB+D 120 dataset. |
Tasks | 3D Human Action Recognition, Action Recognition In Videos, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05704v1 |
PDF | https://arxiv.org/pdf/1909.05704v1.pdf |
PWC | https://paperswithcode.com/paper/skeleton-image-representation-for-3d-action |
Repo | https://github.com/carloscaetano/skeleton-images |
Framework | none |
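A sketch of the representation with a hypothetical 5-joint skeleton: joints are visited in tree depth-first order and expressed relative to a reference joint. The paper concatenates several reference joints and lays the result out as an image (rows indexed by joints, columns by frames); a single reference joint is used here for brevity.

```python
import numpy as np

# Hypothetical 5-joint skeleton tree rooted at joint 0.
TREE = {0: [1, 3, 4], 1: [2], 2: [], 3: [], 4: []}

def dfs_order(root=0):
    """Depth-first traversal that keeps neighboring joints adjacent."""
    order, stack = [], [root]
    while stack:
        j = stack.pop()
        order.append(j)
        stack.extend(reversed(TREE[j]))
    return order

def tsrji(joints, ref=0):
    """joints: (frames, J, 3) -> joints in DFS order, relative to joint `ref`."""
    rel = joints - joints[:, ref:ref + 1, :]   # reference-joint coordinates
    return rel[:, dfs_order(), :]
```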
Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment
Title | Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment |
Authors | Jaap Jumelet, Willem Zuidema, Dieuwke Hupkes |
Abstract | Extensive research has recently shown that recurrent neural language models are able to process a wide range of grammatical phenomena. How these models are able to perform these remarkable feats so well, however, is still an open question. To gain more insight into what information LSTMs base their decisions on, we propose a generalisation of Contextual Decomposition (GCD). In particular, this setup enables us to accurately distil which part of a prediction stems from semantic heuristics, which part truly emanates from syntactic cues, and which part arises from the model's biases. We investigate this technique on tasks pertaining to syntactic agreement and co-reference resolution and discover that the model strongly relies on a default reasoning effect to perform these tasks. |
Tasks | |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.08975v1 |
PDF | https://arxiv.org/pdf/1909.08975v1.pdf |
PWC | https://paperswithcode.com/paper/analysing-neural-language-models-contextual |
Repo | https://github.com/i-machine-think/diagnnose |
Framework | pytorch |
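For concreteness, one contextual-decomposition step through a linear layer followed by ReLU, following Murdoch et al.'s linearization. How the bias is attributed is exactly the kind of choice GCD makes explicit; it is reduced to a simple flag in this sketch.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def cd_linear_relu(beta, gamma, W, b, bias_to_relevant=False):
    """One CD step through ReLU(Wx + b), with the hidden state kept split
    as x = beta (tracked phrase) + gamma (everything else)."""
    zb = W @ beta + (b if bias_to_relevant else 0.0)
    zg = W @ gamma + (0.0 if bias_to_relevant else b)
    # Shapley-style linearization of the ReLU (Murdoch et al., 2018).
    new_beta = 0.5 * (relu(zb) + (relu(zb + zg) - relu(zg)))
    new_gamma = relu(zb + zg) - new_beta
    return new_beta, new_gamma
```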
Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation
Title | Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation |
Authors | Xiao Huang, Li Dong, Elizabeth Boschee, Nanyun Peng |
Abstract | Named entity recognition (NER) identifies typed entity mentions in raw text. While the task is well-established, there is no universally used tagset: often, datasets are annotated for use in downstream applications and accordingly only cover a small set of entity types relevant to a particular task. For instance, in the biomedical domain, one corpus might annotate genes, another chemicals, and another diseases—despite the texts in each corpus containing references to all three types of entities. In this paper, we propose a deep structured model to integrate these “partially annotated” datasets to jointly identify all entity types appearing in the training corpora. By leveraging multiple datasets, the model can learn robust input representations; by building a joint structured model, it avoids potential conflicts caused by combining several models’ predictions at test time. Experiments show that the proposed model significantly outperforms strong multi-task learning baselines when training on multiple, partially annotated datasets and testing on datasets that contain tags from more than one of the training corpora. |
Tasks | Multi-Task Learning, Named Entity Recognition |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11535v2 |
PDF | https://arxiv.org/pdf/1909.11535v2.pdf |
PWC | https://paperswithcode.com/paper/learning-a-unified-named-entity-tagger-from |
Repo | https://github.com/xhuang28/NewBioNer |
Framework | pytorch |
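A sketch of how partial annotation can be encoded for training: each token gets a set of tags it is allowed to carry, and a structured model (e.g., a CRF) can then be trained by marginalizing over those sets. The data format here (`annotations` dict, BIO-style tags) is hypothetical, not the paper's.

```python
def allowed_tag_mask(tokens, annotations, unified_tags):
    """A token labeled in this corpus is pinned to its gold tag; an
    unlabeled token may carry 'O' or any tag whose entity type this
    corpus does not annotate (it might be an entity of another type)."""
    covered = annotations["annotated_types"]      # types this corpus labels
    masks = []
    for i, _ in enumerate(tokens):
        gold = annotations["labels"].get(i)
        if gold is not None:
            masks.append({gold})
        else:
            masks.append({t for t in unified_tags
                          if t == "O" or t.split("-")[-1] not in covered})
    return masks
```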
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
Title | Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer |
Authors | Edward Choi, Zhen Xu, Yujia Li, Michael W. Dusenberry, Gerardo Flores, Yuan Xue, Andrew M. Dai |
Abstract | Effective modeling of electronic health records (EHR) is rapidly becoming an important topic in both academia and industry. A recent study showed that using the graphical structure underlying EHR data (e.g. relationship between diagnoses and treatments) improves the performance of prediction tasks such as heart failure prediction. However, EHR data do not always contain complete structure information. Moreover, when it comes to claims data, structure information is completely unavailable to begin with. Under such circumstances, can we still do better than just treating EHR data as a flat-structured bag-of-features? In this paper, we study the possibility of jointly learning the hidden structure of EHR while performing supervised prediction tasks on EHR data. Specifically, we discuss that Transformer is a suitable basis model to learn the hidden EHR structure, and propose Graph Convolutional Transformer, which uses data statistics to guide the structure learning process. The proposed model consistently outperformed previous approaches empirically, on both synthetic data and publicly available EHR data, for various prediction tasks such as graph reconstruction and readmission prediction, indicating that it can serve as an effective general-purpose representation learning algorithm for EHR data. |
Tasks | Readmission Prediction, Representation Learning |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04716v3 |
PDF | https://arxiv.org/pdf/1906.04716v3.pdf |
PWC | https://paperswithcode.com/paper/graph-convolutional-transformer-learning-the |
Repo | https://github.com/mp2893/mime |
Framework | tf |
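A simplified sketch of statistics-guided attention: the learned attention distribution is blended with a prior matrix P derived from co-occurrence statistics of the EHR (e.g., conditional probabilities between diagnoses and treatments). The paper's actual mechanism, which uses the prior in early layers and regularizes deviations from it, is more involved.

```python
import torch

def prior_guided_attention(q, k, v, prior, lam=0.5):
    """q, k, v: (tokens, d); prior: (tokens, tokens) row-stochastic matrix."""
    d = q.shape[-1]
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    guided = (1 - lam) * attn + lam * prior   # statistics guide structure learning
    return guided @ v
```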
PLMP – Point-Line Minimal Problems in Complete Multi-View Visibility
Title | PLMP – Point-Line Minimal Problems in Complete Multi-View Visibility |
Authors | Timothy Duff, Kathlén Kohn, Anton Leykin, Tomas Pajdla |
Abstract | We present a complete classification of all minimal problems for generic arrangements of points and lines completely observed by calibrated perspective cameras. We show that there are only 30 minimal problems in total; no problems exist for more than 6 cameras, more than 5 points, or more than 6 lines. We present a sequence of tests for detecting minimality starting with counting degrees of freedom and ending with full symbolic and numeric verification of representative examples. For all minimal problems discovered, we present their algebraic degrees, i.e. the number of solutions, which measure their intrinsic difficulty. This shows exactly how the difficulty of problems grows with the number of views. Importantly, several new minimal problems have small degrees that might be practical in image matching and 3D reconstruction. |
Tasks | 3D Reconstruction |
Published | 2019-03-24 |
URL | https://arxiv.org/abs/1903.10008v2 |
PDF | https://arxiv.org/pdf/1903.10008v2.pdf |
PWC | https://paperswithcode.com/paper/plmp-point-line-minimal-problems-in-complete |
Repo | https://github.com/extreme-assistant/iccv2019 |
Framework | none |
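The paper's first minimality test is a counting argument: unknowns (6 per calibrated camera, 3 per point, 4 per line, minus the 7-dof similarity gauge) must equal constraints (2 per point or line observation per camera). A quick enumeration over the bounds stated in the abstract lists the candidates this necessary (but not sufficient) condition admits; e.g., the classical 5-point two-view problem balances as 6·2 − 7 + 3·5 = 2·2·5 = 20.

```python
# Enumerate (cameras, points, lines) satisfying the counting condition
# within the bounds from the abstract. Balance is necessary, not
# sufficient; the paper follows it with symbolic/numeric verification.
for c in range(2, 7):            # cameras (none exist beyond 6)
    for p in range(0, 6):        # points (none beyond 5)
        for nl in range(0, 7):   # lines (none beyond 6)
            unknowns = 6 * c - 7 + 3 * p + 4 * nl
            constraints = 2 * c * (p + nl)
            if p + nl > 0 and unknowns == constraints:
                print(f"candidate: {c} cameras, {p} points, {nl} lines")
```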