Paper Group AWR 30
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies. Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis. Deep Declarative Networks: A New Hope. Discovering Neural Wirings. DVDnet: A Fast Network for Deep Video Denoising. PaperRobot: Incremental Draft Generation of Scientific Ideas. …
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies
Title | The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies |
Authors | Ronen Basri, David Jacobs, Yoni Kasten, Shira Kritchman |
Abstract | We study the relationship between the frequency of a function and the speed at which a neural network learns it. We build on recent results that show that the dynamics of overparameterized neural networks trained with gradient descent can be well approximated by a linear system. When normalized training data is uniformly distributed on a hypersphere, the eigenfunctions of this linear system are spherical harmonic functions. We derive the corresponding eigenvalues for each frequency after introducing a bias term in the model. This bias term had been omitted from the linear network model without significantly affecting previous theoretical results. However, we show theoretically and experimentally that a shallow neural network without bias cannot represent or learn simple, low frequency functions with odd frequencies. Our results lead to specific predictions of the time it will take a network to learn functions of varying frequency. These predictions match the empirical behavior of both shallow and deep networks. |
Tasks | |
Published | 2019-06-02 |
URL | https://arxiv.org/abs/1906.00425v3 |
PDF | https://arxiv.org/pdf/1906.00425v3.pdf |
PWC | https://paperswithcode.com/paper/190600425 |
Repo | https://github.com/ykasten/Convergence-Rate-NN-Different-Frequencies |
Framework | pytorch |
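A minimal sketch (not the authors' code) of the headline effect: a shallow ReLU network with bias is trained on pure-frequency targets cos(kθ) on the unit circle (the 1D instance of the paper's hypersphere setting), and higher frequencies need visibly more steps to reach the same loss. Width, learning rate, and the tested frequencies are arbitrary choices here.

```python
import math
import torch

torch.manual_seed(0)

# Normalized inputs: points on the unit circle.
theta = torch.rand(256) * 2 * math.pi
x = torch.stack([torch.cos(theta), torch.sin(theta)], dim=1)

def steps_to_fit(k, width=1024, lr=0.1, tol=1e-3, max_steps=20000):
    """Train a shallow ReLU net (with bias) on cos(k*theta); return steps used."""
    y = torch.cos(k * theta).unsqueeze(1)
    net = torch.nn.Sequential(
        torch.nn.Linear(2, width), torch.nn.ReLU(), torch.nn.Linear(width, 1)
    )
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for step in range(max_steps):
        loss = torch.mean((net(x) - y) ** 2)
        if loss.item() < tol:
            return step
        opt.zero_grad()
        loss.backward()
        opt.step()
    return max_steps

for k in [1, 2, 4]:  # higher k should take markedly longer
    print(f"frequency {k}: {steps_to_fit(k)} steps")
```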
Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis
Title | Progressive Self-Supervised Attention Learning for Aspect-Level Sentiment Analysis |
Authors | Jialong Tang, Ziyao Lu, Jinsong Su, Yubin Ge, Linfeng Song, Le Sun, Jiebo Luo |
Abstract | In aspect-level sentiment classification (ASC), it is prevalent to equip dominant neural models with attention mechanisms, for the sake of acquiring the importance of each context word on the given aspect. However, such a mechanism tends to excessively focus on a few frequent words with sentiment polarities, while ignoring infrequent ones. In this paper, we propose a progressive self-supervised attention learning approach for neural ASC models, which automatically mines useful attention supervision information from a training corpus to refine attention mechanisms. Specifically, we iteratively conduct sentiment predictions on all training instances. Particularly, at each iteration, the context word with the maximum attention weight is extracted as the one with active/misleading influence on the correct/incorrect prediction of every instance, and then the word itself is masked for subsequent iterations. Finally, we augment the conventional training objective with a regularization term, which enables ASC models to continue equally focusing on the extracted active context words while decreasing weights of those misleading ones. Experimental results on multiple datasets show that our proposed approach yields better attention mechanisms, leading to substantial improvements over the two state-of-the-art neural ASC models. Source code and trained models are available at https://github.com/DeepLearnXMU/PSSAttention. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01213v3 |
PDF | https://arxiv.org/pdf/1906.01213v3.pdf |
PWC | https://paperswithcode.com/paper/progressive-self-supervised-attention |
Repo | https://github.com/DeepLearnXMU/PSSAttention |
Framework | tf |
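The mining loop can be sketched framework-agnostically. Here `predict` is a hypothetical wrapper around any attention-based ASC model that returns a predicted label and attention weights while honoring a mask over context words; it stands in for the models used in the paper.

```python
import numpy as np

def mine_attention_supervision(predict, tokens, label, n_iters=3):
    """Sketch of the paper's mining loop for one training instance."""
    mask = np.ones(len(tokens), dtype=bool)
    active, misleading = [], []
    for _ in range(n_iters):
        pred, attn = predict(tokens, mask)
        attn = np.where(mask, attn, -np.inf)
        top = int(np.argmax(attn))          # most-attended unmasked word
        if pred == label:
            active.append(top)              # helpful influence
        else:
            misleading.append(top)          # misleading influence
        mask[top] = False                   # hide it for subsequent iterations
    return active, misleading
```

The extracted indices would then feed the regularization term that encourages the model to keep attending to active words and down-weight misleading ones.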
Deep Declarative Networks: A New Hope
Title | Deep Declarative Networks: A New Hope |
Authors | Stephen Gould, Richard Hartley, Dylan Campbell |
Abstract | We explore a new class of end-to-end learnable models wherein data processing nodes (or network layers) are defined in terms of desired behavior rather than an explicit forward function. Specifically, the forward function is implicitly defined as the solution to a mathematical optimization problem. Consistent with nomenclature in the programming languages community, we name these models deep declarative networks. Importantly, we show that the class of deep declarative networks subsumes current deep learning models. Moreover, invoking the implicit function theorem, we show how gradients can be back-propagated through many declaratively defined data processing nodes thereby enabling end-to-end learning. We show how these declarative processing nodes can be implemented in the popular PyTorch deep learning software library allowing declarative and imperative nodes to co-exist within the same network. We also provide numerous insights and illustrative examples of declarative nodes and demonstrate their application for image and point cloud classification tasks. |
Tasks | |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.04866v2 |
PDF | https://arxiv.org/pdf/1909.04866v2.pdf |
PWC | https://paperswithcode.com/paper/deep-declarative-networks-a-new-hope |
Repo | https://github.com/anucvml/ddn |
Framework | pytorch |
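A toy declarative node, sketched under the paper's recipe: the forward pass solves an inner optimization problem and the backward pass differentiates its solution via the implicit function theorem, dL/dx = -(dL/dy) H⁻¹ B with H = ∂²f/∂y² and B = ∂²f/∂y∂x. The inner objective below is an arbitrary example chosen for illustration, not one from the paper; the authors' ddn library wraps this pattern more generally.

```python
import torch

def solve_inner(x, steps=200, lr=0.1):
    # Inner optimization: y* = argmin_y f(x, y) by plain gradient descent.
    y = x.detach().clone()
    for _ in range(steps):
        y = y - lr * ((y - x) + y * (y ** 2).sum())
    return y

class DeclarativeNode(torch.autograd.Function):
    """Toy declarative node for f(x, y) = 0.5||y - x||^2 + 0.25||y||^4."""

    @staticmethod
    def forward(ctx, x):
        y = solve_inner(x)
        ctx.save_for_backward(x, y)
        return y

    @staticmethod
    def backward(ctx, grad_out):
        x, y = ctx.saved_tensors
        x = x.detach().requires_grad_(True)
        y = y.detach().requires_grad_(True)
        f = 0.5 * ((y - x) ** 2).sum() + 0.25 * ((y ** 2).sum()) ** 2
        (fy,) = torch.autograd.grad(f, y, create_graph=True)
        rows = [torch.autograd.grad(fy[i], (y, x), retain_graph=True)
                for i in range(y.numel())]
        H = torch.stack([r[0] for r in rows])   # d2f/dy2
        B = torch.stack([r[1] for r in rows])   # d2f/dydx
        w = torch.linalg.solve(H, grad_out)     # implicit function theorem:
        return -w @ B                           # dL/dx = -(dL/dy) H^{-1} B

x = torch.tensor([1.0, -2.0, 0.5], requires_grad=True)
DeclarativeNode.apply(x).sum().backward()
print(x.grad)
```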
Discovering Neural Wirings
Title | Discovering Neural Wirings |
Authors | Mitchell Wortsman, Ali Farhadi, Mohammad Rastegari |
Abstract | The success of neural networks has driven a shift in focus from feature engineering to architecture engineering. However, successful networks today are constructed using a small and manually defined set of building blocks. Even in methods of neural architecture search (NAS) the network connectivity patterns are largely constrained. In this work we propose a method for discovering neural wirings. We relax the typical notion of layers and instead enable channels to form connections independent of each other. This allows for a much larger space of possible networks. The wiring of our network is not fixed during training – as we learn the network parameters we also learn the structure itself. Our experiments demonstrate that our learned connectivity outperforms hand-engineered and randomly wired networks. By learning the connectivity of MobileNetV1 we boost the ImageNet accuracy by 10% at ~41M FLOPs. Moreover, we show that our method generalizes to recurrent and continuous time networks. Our work may also be regarded as unifying core aspects of the neural architecture search problem with sparse neural network learning. As NAS becomes more fine-grained, finding a good architecture is akin to finding a sparse subnetwork of the complete graph. Accordingly, DNW provides an effective mechanism for discovering sparse subnetworks of predefined architectures in a single training run. Though we only ever use a small percentage of the weights during the forward pass, we still play the so-called initialization lottery with a combinatorial number of subnetworks. Code and pretrained models are available at https://github.com/allenai/dnw while additional visualizations may be found at https://mitchellnw.github.io/blog/2019/dnw/. |
Tasks | Feature Engineering, Neural Architecture Search |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00586v5 |
PDF | https://arxiv.org/pdf/1906.00586v5.pdf |
PWC | https://paperswithcode.com/paper/190600586 |
Repo | https://github.com/allenai/dnw |
Framework | pytorch |
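A sketch, in the spirit of the method, of the discrete wiring choice: keep only the top-k edges by weight magnitude in the forward pass while letting gradients reach every edge (a straight-through estimator), so currently unused edges can strengthen and re-enter the wiring. The repo's actual update rule may differ in its details.

```python
import torch

class ChooseTopEdges(torch.autograd.Function):
    """Forward: keep the k largest-magnitude edges of a dense edge-weight
    matrix. Backward: pass gradients to *all* edges (straight-through)."""

    @staticmethod
    def forward(ctx, weights, k):
        # Threshold = the (numel - k + 1)-th smallest magnitude.
        thresh = weights.abs().flatten().kthvalue(weights.numel() - k + 1).values
        mask = (weights.abs() >= thresh).float()
        return weights * mask

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None   # gradient flows to every edge

w = torch.randn(8, 8, requires_grad=True)  # dense "complete graph" over channels
sparse_w = ChooseTopEdges.apply(w, 16)     # discovered wiring: 16 live edges
```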
DVDnet: A Fast Network for Deep Video Denoising
Title | DVDnet: A Fast Network for Deep Video Denoising |
Authors | Matias Tassano, Julie Delon, Thomas Veit |
Abstract | In this paper, we propose a state-of-the-art video denoising algorithm based on a convolutional neural network architecture. Previous neural network based approaches to video denoising have been unsuccessful as their performance cannot compete with the performance of patch-based methods. However, our approach outperforms other patch-based competitors with significantly lower computing times. In contrast to other existing neural network denoisers, our algorithm exhibits several desirable properties such as a small memory footprint, and the ability to handle a wide range of noise levels with a single network model. The combination of its denoising performance and lower computational load makes this algorithm attractive for practical denoising applications. We compare our method with different state-of-the-art algorithms, both visually and with respect to objective quality metrics. The experiments show that our algorithm compares favorably to other state-of-the-art methods. Video examples, code and models are publicly available at https://github.com/m-tassano/dvdnet. |
Tasks | Denoising, Video Denoising |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.11890v1 |
PDF | https://arxiv.org/pdf/1906.11890v1.pdf |
PWC | https://paperswithcode.com/paper/dvdnet-a-fast-network-for-deep-video |
Repo | https://github.com/m-tassano/dvdnet |
Framework | pytorch |
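A schematic of the two-step design the paper describes, with placeholder conv stacks and illustrative shapes (not the paper's architecture): each frame is denoised spatially, then a window of denoised frames is fused temporally, with a per-pixel noise map as extra input so one model covers a range of noise levels. The paper additionally motion-compensates neighboring frames before the temporal step, which is omitted here.

```python
import torch
import torch.nn as nn

class TwoStageVideoDenoiser(nn.Module):
    def __init__(self, window=5, ch=3, width=32):
        super().__init__()
        # Spatial stage: frame + noise map -> spatially denoised frame.
        self.spatial = nn.Sequential(
            nn.Conv2d(ch + 1, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, ch, 3, padding=1),
        )
        # Temporal stage: window of denoised frames -> denoised center frame.
        self.temporal = nn.Sequential(
            nn.Conv2d(window * ch + 1, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, ch, 3, padding=1),
        )

    def forward(self, frames, noise_map):
        # frames: (window, C, H, W); noise_map: (1, H, W).
        den = [self.spatial(torch.cat([f, noise_map], 0).unsqueeze(0))[0]
               for f in frames]
        stacked = torch.cat(den + [noise_map], 0).unsqueeze(0)
        return self.temporal(stacked)[0]

net = TwoStageVideoDenoiser()
frames = torch.rand(5, 3, 64, 64)
print(net(frames, torch.full((1, 64, 64), 0.1)).shape)  # (3, 64, 64)
```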
PaperRobot: Incremental Draft Generation of Scientific Ideas
Title | PaperRobot: Incremental Draft Generation of Scientific Ideas |
Authors | Qingyun Wang, Lifu Huang, Zhiying Jiang, Kevin Knight, Heng Ji, Mohit Bansal, Yi Luan |
Abstract | We present PaperRobot, which performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, combining graph attention and contextual text attention; (3) incrementally writing key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities it generates a paper abstract, from the abstract it generates the conclusion and future work, and finally from the future work it generates a title for a follow-on paper. Turing Tests, where a biomedical domain expert is asked to compare a system output and a human-authored string, show that PaperRobot-generated abstracts, conclusion and future work sections, and new titles are chosen over human-written ones up to 30%, 24% and 12% of the time, respectively. |
Tasks | Knowledge Graphs, Paper generation, Text Generation |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.07870v4 |
PDF | https://arxiv.org/pdf/1905.07870v4.pdf |
PWC | https://paperswithcode.com/paper/paperrobot-incremental-draft-generation-of |
Repo | https://github.com/EagleW/PaperRobot |
Framework | pytorch |
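The incremental drafting chain reads naturally as function composition. `generate` below is a hypothetical stand-in for the paper's memory-attention seq2seq model; each stage conditions on the previous stage's output.

```python
def draft_paper(generate, title, related_entities):
    """Sketch of the incremental drafting chain described above."""
    abstract = generate(source=title, entities=related_entities)
    conclusion_future = generate(source=abstract)
    next_title = generate(source=conclusion_future)   # seed for a follow-on paper
    return abstract, conclusion_future, next_title
```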
Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation
Title | Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation |
Authors | Hao Tang, Dan Xu, Nicu Sebe, Yan Yan |
Abstract | The state-of-the-art approaches in Generative Adversarial Networks (GANs) are able to learn a mapping function from one image domain to another with unpaired image data. However, these methods often produce artifacts and can only transfer low-level information, failing to transfer the high-level semantic part of images. The main reason is that generators do not have the ability to detect the most discriminative semantic part of images, which leaves the generated images low-quality. To handle this limitation, in this paper we propose a novel Attention-Guided Generative Adversarial Network (AGGAN), which can detect the most discriminative semantic object and minimize changes to unwanted parts for semantic manipulation problems without using extra data or models. The attention-guided generators in AGGAN are able to produce attention masks via a built-in attention mechanism, and then fuse the input image with the attention mask to obtain a high-quality target image. Moreover, we propose a novel attention-guided discriminator which only considers attended regions. The proposed AGGAN is trained in an end-to-end fashion with an adversarial loss, cycle-consistency loss, pixel loss and attention loss. Both qualitative and quantitative results demonstrate that our approach generates sharper and more accurate images than existing models. The code is available at https://github.com/Ha0Tang/AttentionGAN. |
Tasks | Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2019-03-28 |
URL | https://arxiv.org/abs/1903.12296v3 |
PDF | https://arxiv.org/pdf/1903.12296v3.pdf |
PWC | https://paperswithcode.com/paper/attention-guided-generative-adversarial |
Repo | https://github.com/Ha0Tang/AGGAN |
Framework | pytorch |
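A sketch of the attention-guided fusion at the heart of AGGAN: the generator emits both a content image and an attention mask, and the output keeps the input pixels wherever the mask marks a region as irrelevant to the translation. The backbone below is a placeholder, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttentionGuidedGenerator(nn.Module):
    def __init__(self, ch=3, width=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(ch, width, 3, padding=1), nn.ReLU(),
        )
        self.to_content = nn.Conv2d(width, ch, 3, padding=1)
        self.to_mask = nn.Conv2d(width, 1, 3, padding=1)

    def forward(self, x):
        h = self.backbone(x)
        content = torch.tanh(self.to_content(h))
        mask = torch.sigmoid(self.to_mask(h))       # built-in attention mask
        # Translate only attended regions; keep the input elsewhere.
        return mask * content + (1 - mask) * x, mask
```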
sktime: A Unified Interface for Machine Learning with Time Series
Title | sktime: A Unified Interface for Machine Learning with Time Series |
Authors | Markus Löning, Anthony Bagnall, Sajaysurya Ganesh, Viktor Kazakov, Jason Lines, Franz J. Király |
Abstract | We present sktime – a new scikit-learn compatible Python library with a unified interface for machine learning with time series. Time series data gives rise to various distinct but closely related learning tasks, such as forecasting and time series classification, many of which can be solved by reducing them to related simpler tasks. We discuss the main rationale for creating a unified interface, including reduction, as well as the design of sktime’s core API, supported by a clear overview of common time series tasks and reduction approaches. |
Tasks | Time Series, Time Series Analysis, Time Series Classification, Time Series Forecasting |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07872v1 |
PDF | https://arxiv.org/pdf/1909.07872v1.pdf |
PWC | https://paperswithcode.com/paper/sktime-a-unified-interface-for-machine |
Repo | https://github.com/alan-turing-institute/sktime |
Framework | none |
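A short usage example of the unified fit/predict interface (exact API details may differ across sktime versions):

```python
import numpy as np
import pandas as pd
from sktime.forecasting.naive import NaiveForecaster

# Univariate series -> forecast the next three steps with the
# scikit-learn-style fit/predict interface.
y = pd.Series(np.sin(np.arange(100) / 5.0))
forecaster = NaiveForecaster(strategy="last")
forecaster.fit(y)
print(forecaster.predict(fh=[1, 2, 3]))
```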
SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition
Title | SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition |
Authors | Carlos Caetano, Jessica Sena, François Brémond, Jefersson A. dos Santos, William Robson Schwartz |
Abstract | Due to the availability of large-scale skeleton datasets, 3D human action recognition has recently attracted the attention of the computer vision community. Many works have focused on encoding skeleton data as skeleton image representations based on the spatial structure of the skeleton joints, in which the temporal dynamics of the sequence are encoded as variations in columns and the spatial structure of each frame is represented as rows of a matrix. To further improve such representations, we introduce a novel skeleton image representation to be used as input to Convolutional Neural Networks (CNNs), named SkeleMotion. The proposed approach encodes the temporal dynamics by explicitly computing the magnitude and orientation values of the skeleton joints. Different temporal scales are employed to compute motion values, aggregating more temporal dynamics into the representation and making it able to capture long-range joint interactions involved in actions as well as filter noisy motion values. Experimental results demonstrate the effectiveness of the proposed representation on 3D action recognition, outperforming the state-of-the-art on the NTU RGB+D 120 dataset. |
Tasks | 3D Human Action Recognition, Action Recognition In Videos, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.13025v1 |
PDF | https://arxiv.org/pdf/1907.13025v1.pdf |
PWC | https://paperswithcode.com/paper/skelemotion-a-new-representation-of-skeleton |
Repo | https://github.com/carloscaetano/skeleton-images |
Framework | none |
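A sketch of the motion encoding described above, assuming a `(frames, J, 3)` joint array; the exact channel layout and orientation definition in the paper may differ:

```python
import numpy as np

def skelemotion(joints, scales=(1, 2, 4)):
    """Per joint and temporal scale t, summarize the displacement
    joints[f + t] - joints[f] by its magnitude and orientation."""
    mags, orients = [], []
    for t in scales:                                  # requires t < frames
        d = joints[t:] - joints[:-t]                  # (frames - t, J, 3)
        mags.append(np.linalg.norm(d, axis=-1))       # magnitude channel
        # Orientation as displacement angles in the xy and xz planes.
        orients.append(np.stack([np.arctan2(d[..., 1], d[..., 0]),
                                 np.arctan2(d[..., 2], d[..., 0])], axis=-1))
    return mags, orients
```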
Multilingual Named Entity Recognition Using Pretrained Embeddings, Attention Mechanism and NCRF
Title | Multilingual Named Entity Recognition Using Pretrained Embeddings, Attention Mechanism and NCRF |
Authors | Anton A. Emelyanov, Ekaterina Artemova |
Abstract | In this paper we tackle the multilingual named entity recognition task. We use the BERT language model to produce embeddings, with a bidirectional recurrent network, attention, and an NCRF layer on top. We apply multilingual BERT only as an embedder, without any fine-tuning. We test our model on the dataset of the BSNLP shared task, which consists of texts in Bulgarian, Czech, Polish, and Russian. |
Tasks | Joint NER and Classification, Language Modelling, Multilingual text classification, Named Entity Recognition |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09978v1 |
PDF | https://arxiv.org/pdf/1906.09978v1.pdf |
PWC | https://paperswithcode.com/paper/multilingual-named-entity-recognition-using-1 |
Repo | https://github.com/king-menin/slavic-ner |
Framework | pytorch |
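A skeleton of the described stack, assuming the HuggingFace transformers package: frozen multilingual BERT as an embedder, a BiLSTM on top, and per-token emission scores. The attention layer and NCRF decoding from the paper are omitted here.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class BertBiLSTMTagger(nn.Module):
    def __init__(self, n_tags, hidden=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-multilingual-cased")
        for p in self.bert.parameters():
            p.requires_grad = False        # embeddings only, no fine-tuning
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hidden, n_tags)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():
            h = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(h)
        return self.emit(out)              # emissions for CRF-style decoding
```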
Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints
Title | Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints |
Authors | Carlos Caetano, François Brémond, William Robson Schwartz |
Abstract | In recent years, the computer vision research community has studied how to model temporal dynamics in videos for 3D human action recognition. To that end, two main baseline approaches have been researched: (i) Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM); and (ii) skeleton image representations used as input to a Convolutional Neural Network (CNN). Although RNN approaches present excellent results, such methods lack the ability to efficiently learn the spatial relations between the skeleton joints. On the other hand, the representations used to feed CNN approaches have the advantage of naturally learning structural information from 2D arrays (i.e., they learn spatial relations from the skeleton joints). To further improve such representations, we introduce the Tree Structure Reference Joints Image (TSRJI), a novel skeleton image representation to be used as input to CNNs. The proposed representation has the advantage of combining the use of reference joints and a tree structure skeleton. While the former incorporates different spatial relationships between the joints, the latter preserves important spatial relations by traversing a skeleton tree with a depth-first order algorithm. Experimental results demonstrate the effectiveness of the proposed representation for 3D action recognition on two datasets, achieving state-of-the-art results on the recent NTU RGB+D 120 dataset. |
Tasks | 3D Human Action Recognition, Action Recognition In Videos, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05704v1 |
PDF | https://arxiv.org/pdf/1909.05704v1.pdf |
PWC | https://paperswithcode.com/paper/skeleton-image-representation-for-3d-action |
Repo | https://github.com/carloscaetano/skeleton-images |
Framework | none |
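A sketch of the representation with a hypothetical 5-joint skeleton: joints are visited in tree depth-first order and expressed relative to a reference joint. The paper concatenates several reference joints and lays the result out as an image (rows indexed by joints, columns by frames); a single reference joint is used here for brevity.

```python
import numpy as np

# Hypothetical 5-joint skeleton tree rooted at joint 0.
TREE = {0: [1, 3, 4], 1: [2], 2: [], 3: [], 4: []}

def dfs_order(root=0):
    """Depth-first traversal that keeps neighboring joints adjacent."""
    order, stack = [], [root]
    while stack:
        j = stack.pop()
        order.append(j)
        stack.extend(reversed(TREE[j]))
    return order

def tsrji(joints, ref=0):
    """joints: (frames, J, 3) -> joints in DFS order, relative to joint `ref`."""
    rel = joints - joints[:, ref:ref + 1, :]   # reference-joint coordinates
    return rel[:, dfs_order(), :]
```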
Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment
Title | Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment |
Authors | Jaap Jumelet, Willem Zuidema, Dieuwke Hupkes |
Abstract | Extensive research has recently shown that recurrent neural language models are able to process a wide range of grammatical phenomena. How these models are able to perform these remarkable feats so well, however, is still an open question. To gain more insight into what information LSTMs base their decisions on, we propose a generalisation of Contextual Decomposition (GCD). In particular, this setup enables us to accurately distil which part of a prediction stems from semantic heuristics, which part truly emanates from syntactic cues, and which part arises from the model's biases. We investigate this technique on tasks pertaining to syntactic agreement and co-reference resolution and discover that the model strongly relies on a default reasoning effect to perform these tasks. |
Tasks | |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.08975v1 |
PDF | https://arxiv.org/pdf/1909.08975v1.pdf |
PWC | https://paperswithcode.com/paper/analysing-neural-language-models-contextual |
Repo | https://github.com/i-machine-think/diagnnose |
Framework | pytorch |
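For concreteness, one contextual-decomposition step through a linear layer followed by ReLU, following Murdoch et al.'s linearization. How the bias is attributed is exactly the kind of choice GCD makes explicit; it is reduced to a simple flag in this sketch.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def cd_linear_relu(beta, gamma, W, b, bias_to_relevant=False):
    """One CD step through ReLU(Wx + b), with the hidden state kept split
    as x = beta (tracked phrase) + gamma (everything else)."""
    zb = W @ beta + (b if bias_to_relevant else 0.0)
    zg = W @ gamma + (0.0 if bias_to_relevant else b)
    # Shapley-style linearization of the ReLU (Murdoch et al., 2018).
    new_beta = 0.5 * (relu(zb) + (relu(zb + zg) - relu(zg)))
    new_gamma = relu(zb + zg) - new_beta
    return new_beta, new_gamma
```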
Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation
Title | Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation |
Authors | Xiao Huang, Li Dong, Elizabeth Boschee, Nanyun Peng |
Abstract | Named entity recognition (NER) identifies typed entity mentions in raw text. While the task is well-established, there is no universally used tagset: often, datasets are annotated for use in downstream applications and accordingly only cover a small set of entity types relevant to a particular task. For instance, in the biomedical domain, one corpus might annotate genes, another chemicals, and another diseases—despite the texts in each corpus containing references to all three types of entities. In this paper, we propose a deep structured model to integrate these “partially annotated” datasets to jointly identify all entity types appearing in the training corpora. By leveraging multiple datasets, the model can learn robust input representations; by building a joint structured model, it avoids potential conflicts caused by combining several models’ predictions at test time. Experiments show that the proposed model significantly outperforms strong multi-task learning baselines when training on multiple, partially annotated datasets and testing on datasets that contain tags from more than one of the training corpora. |
Tasks | Multi-Task Learning, Named Entity Recognition |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11535v2 |
PDF | https://arxiv.org/pdf/1909.11535v2.pdf |
PWC | https://paperswithcode.com/paper/learning-a-unified-named-entity-tagger-from |
Repo | https://github.com/xhuang28/NewBioNer |
Framework | pytorch |
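A sketch of how partial annotation can be encoded for training: each token gets a set of tags it is allowed to carry, and a structured model (e.g., a CRF) can then be trained by marginalizing over those sets. The data format here (`annotations` dict, BIO-style tags) is hypothetical, not the paper's.

```python
def allowed_tag_mask(tokens, annotations, unified_tags):
    """A token labeled in this corpus is pinned to its gold tag; an
    unlabeled token may carry 'O' or any tag whose entity type this
    corpus does not annotate (it might be an entity of another type)."""
    covered = annotations["annotated_types"]      # types this corpus labels
    masks = []
    for i, _ in enumerate(tokens):
        gold = annotations["labels"].get(i)
        if gold is not None:
            masks.append({gold})
        else:
            masks.append({t for t in unified_tags
                          if t == "O" or t.split("-")[-1] not in covered})
    return masks
```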
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
Title | Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer |
Authors | Edward Choi, Zhen Xu, Yujia Li, Michael W. Dusenberry, Gerardo Flores, Yuan Xue, Andrew M. Dai |
Abstract | Effective modeling of electronic health records (EHR) is rapidly becoming an important topic in both academia and industry. A recent study showed that using the graphical structure underlying EHR data (e.g. relationship between diagnoses and treatments) improves the performance of prediction tasks such as heart failure prediction. However, EHR data do not always contain complete structure information. Moreover, when it comes to claims data, structure information is completely unavailable to begin with. Under such circumstances, can we still do better than just treating EHR data as a flat-structured bag-of-features? In this paper, we study the possibility of jointly learning the hidden structure of EHR while performing supervised prediction tasks on EHR data. Specifically, we discuss that Transformer is a suitable basis model to learn the hidden EHR structure, and propose Graph Convolutional Transformer, which uses data statistics to guide the structure learning process. The proposed model consistently outperformed previous approaches empirically, on both synthetic data and publicly available EHR data, for various prediction tasks such as graph reconstruction and readmission prediction, indicating that it can serve as an effective general-purpose representation learning algorithm for EHR data. |
Tasks | Readmission Prediction, Representation Learning |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04716v3 |
PDF | https://arxiv.org/pdf/1906.04716v3.pdf |
PWC | https://paperswithcode.com/paper/graph-convolutional-transformer-learning-the |
Repo | https://github.com/mp2893/mime |
Framework | tf |
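A simplified sketch of statistics-guided attention: the learned attention distribution is blended with a prior matrix P derived from co-occurrence statistics of the EHR (e.g., conditional probabilities between diagnoses and treatments). The paper's actual mechanism, which uses the prior in early layers and regularizes deviations from it, is more involved.

```python
import torch

def prior_guided_attention(q, k, v, prior, lam=0.5):
    """q, k, v: (tokens, d); prior: (tokens, tokens) row-stochastic matrix."""
    d = q.shape[-1]
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    guided = (1 - lam) * attn + lam * prior   # statistics guide structure learning
    return guided @ v
```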
PLMP – Point-Line Minimal Problems in Complete Multi-View Visibility
Title | PLMP – Point-Line Minimal Problems in Complete Multi-View Visibility |
Authors | Timothy Duff, Kathlén Kohn, Anton Leykin, Tomas Pajdla |
Abstract | We present a complete classification of all minimal problems for generic arrangements of points and lines completely observed by calibrated perspective cameras. We show that there are only 30 minimal problems in total; no problems exist for more than 6 cameras, more than 5 points, or more than 6 lines. We present a sequence of tests for detecting minimality starting with counting degrees of freedom and ending with full symbolic and numeric verification of representative examples. For all minimal problems discovered, we present their algebraic degrees, i.e. the number of solutions, which measure their intrinsic difficulty. This shows exactly how the difficulty of problems grows with the number of views. Importantly, several new minimal problems have small degrees that might be practical in image matching and 3D reconstruction. |
Tasks | 3D Reconstruction |
Published | 2019-03-24 |
URL | https://arxiv.org/abs/1903.10008v2 |
PDF | https://arxiv.org/pdf/1903.10008v2.pdf |
PWC | https://paperswithcode.com/paper/plmp-point-line-minimal-problems-in-complete |
Repo | https://github.com/extreme-assistant/iccv2019 |
Framework | none |
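The paper's first minimality test is a counting argument: unknowns (6 per calibrated camera, 3 per point, 4 per line, minus the 7-dof similarity gauge) must equal constraints (2 per point or line observation per camera). A quick enumeration over the bounds stated in the abstract lists the candidates this necessary (but not sufficient) condition admits; e.g., the classical 5-point two-view problem balances as 6·2 − 7 + 3·5 = 2·2·5 = 20.

```python
# Enumerate (cameras, points, lines) satisfying the counting condition
# within the bounds from the abstract. Balance is necessary, not
# sufficient; the paper follows it with symbolic/numeric verification.
for c in range(2, 7):            # cameras (none exist beyond 6)
    for p in range(0, 6):        # points (none beyond 5)
        for nl in range(0, 7):   # lines (none beyond 6)
            unknowns = 6 * c - 7 + 3 * p + 4 * nl
            constraints = 2 * c * (p + nl)
            if p + nl > 0 and unknowns == constraints:
                print(f"candidate: {c} cameras, {p} points, {nl} lines")
```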