Paper Group AWR 299
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. Deep Graph Infomax. Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking. Bootstrapping Generators from Noisy Data. Understanding Back-Translation at Scale. CTAP: Complementary Temporal Action Proposal Generation. FlowQA: Grasping Flow in History for Conversational Machine Comprehension. A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization. CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction. Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification. signSGD: Compressed Optimisation for Non-Convex Problems. Neural Proximal Gradient Descent for Compressive Imaging. Graphene: Semantically-Linked Propositions in Open Information Extraction. Deep Enhanced Representation for Implicit Discourse Relation Recognition. NeVAE: A Deep Generative Model for Molecular Graphs.
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Title | UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction |
Authors | Leland McInnes, John Healy, James Melville |
Abstract | UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning. |
Tasks | Dimensionality Reduction |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.03426v2 |
http://arxiv.org/pdf/1802.03426v2.pdf | |
PWC | https://paperswithcode.com/paper/umap-uniform-manifold-approximation-and |
Repo | https://github.com/NCBI-Hackathons/DiseaseCluster |
Framework | none |
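For readers who want to try the method, the reference implementation `umap-learn` exposes a scikit-learn-style interface. A minimal usage sketch, assuming the `umap-learn` package is installed and using toy random data:

```python
# Minimal usage sketch of the reference implementation (umap-learn),
# which follows a scikit-learn-style fit/transform API.
import numpy as np
import umap  # pip install umap-learn

X = np.random.rand(500, 64)          # toy data: 500 points in 64 dimensions

reducer = umap.UMAP(
    n_neighbors=15,   # size of the local neighborhood used to build the fuzzy graph
    min_dist=0.1,     # minimum spacing of points in the embedding
    n_components=2,   # unlike t-SNE, higher embedding dimensions are also cheap
)
embedding = reducer.fit_transform(X)  # shape (500, 2)
```

Because the abstract notes there is no computational restriction on embedding dimension, `n_components` can be raised well above 2 when UMAP is used as a general-purpose preprocessing step rather than for visualization.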
Deep Graph Infomax
Title | Deep Graph Infomax |
Authors | Petar Veličković, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, R Devon Hjelm |
Abstract | We present Deep Graph Infomax (DGI), a general approach for learning node representations within graph-structured data in an unsupervised manner. DGI relies on maximizing mutual information between patch representations and corresponding high-level summaries of graphs—both derived using established graph convolutional network architectures. The learnt patch representations summarize subgraphs centered around nodes of interest, and can thus be reused for downstream node-wise learning tasks. In contrast to most prior approaches to unsupervised learning with GCNs, DGI does not rely on random walk objectives, and is readily applicable to both transductive and inductive learning setups. We demonstrate competitive performance on a variety of node classification benchmarks, which at times even exceeds the performance of supervised learning. |
Tasks | Node Classification |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10341v2 |
http://arxiv.org/pdf/1809.10341v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-graph-infomax |
Repo | https://github.com/PetarV-/DGI |
Framework | pytorch |
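A minimal PyTorch sketch of the DGI objective on a dense toy adjacency matrix: a one-layer GCN encoder produces patch representations, the corruption function shuffles node feature rows, and a bilinear discriminator separates real from corrupted patches against the graph summary. All names and sizes here are illustrative, not the paper's code:

```python
# Minimal PyTorch sketch of the DGI mutual-information objective.
import torch
import torch.nn as nn

n, f, d = 100, 32, 64                      # nodes, input features, hidden size
A = torch.eye(n)                           # toy (self-loop-only) normalized adjacency
X = torch.randn(n, f)                      # node features

gcn = nn.Linear(f, d)                      # one-layer GCN: H = PReLU(A X W)
act = nn.PReLU()
W = nn.Parameter(torch.empty(d, d)); nn.init.xavier_uniform_(W)

def encode(x):                             # patch (node) representations
    return act(A @ gcn(x))

H_pos = encode(X)                          # real graph
H_neg = encode(X[torch.randperm(n)])       # corruption: shuffle feature rows
s = torch.sigmoid(H_pos.mean(dim=0))       # graph-level summary (readout)

logits_pos = H_pos @ W @ s                 # bilinear discriminator scores
logits_neg = H_neg @ W @ s
loss = nn.functional.binary_cross_entropy_with_logits(
    torch.cat([logits_pos, logits_neg]),
    torch.cat([torch.ones(n), torch.zeros(n)]),  # real patches = 1, corrupted = 0
)
loss.backward()                            # maximizing MI = minimizing this BCE
```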
Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking
Title | Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking |
Authors | Yingjie Yao, Xiaohe Wu, Lei Zhang, Shiguang Shan, Wangmeng Zuo |
Abstract | Correlation filter (CF) based trackers generally include two modules, i.e., feature representation and online model adaptation. In existing off-line deep learning models for CF trackers, the model adaptation is usually either abandoned or given a closed-form solution to make it feasible to learn the deep representation in an end-to-end manner. However, such solutions fail to exploit the advances in CF models, and cannot achieve competitive accuracy in comparison with state-of-the-art CF trackers. In this paper, we investigate the joint learning of deep representation and model adaptation, where an updater network is introduced for better tracking on the future frame by taking the current frame representation, the tracking result, and the last CF tracker as input. By modeling the representor as a convolutional neural network (CNN), we truncate the alternating direction method of multipliers (ADMM) and interpret it as a deep updater network, resulting in our model for learning representation and truncated inference (RTINet). Experiments demonstrate that our RTINet tracker achieves favorable tracking accuracy against state-of-the-art trackers, and its rapid version can run at a real-time speed of 24 fps. The code and pre-trained models will be publicly available at https://github.com/tourmaline612/RTINet. |
Tasks | |
Published | 2018-07-29 |
URL | http://arxiv.org/abs/1807.11071v1 |
http://arxiv.org/pdf/1807.11071v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-representation-and-truncated-inference |
Repo | https://github.com/tourmaline612/RTINet |
Framework | none |
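For background, the closed-form model adaptation the abstract refers to is typified by Fourier-domain correlation filter solutions. Below is a minimal MOSSE-style sketch of such a closed-form CF update in numpy; this is generic CF background under a ridge-regression formulation, not RTINet's learned ADMM updater itself:

```python
# Minimal numpy sketch of a closed-form correlation filter update in the
# Fourier domain (MOSSE-style); generic CF background, not RTINet itself.
import numpy as np

def train_cf(patch, target, lam=1e-2):
    """Solve the ridge-regularized correlation objective element-wise in Fourier space."""
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(target)
    return (np.conj(X) * Y) / (np.conj(X) * X + lam)    # closed-form filter

def track(W, patch):
    """Correlate the filter with a new patch; the response peak is the target location."""
    response = np.real(np.fft.ifft2(W * np.fft.fft2(patch)))
    return np.unravel_index(response.argmax(), response.shape)

patch = np.random.rand(64, 64)                      # toy image patch
target = np.zeros((64, 64)); target[32, 32] = 1.0   # desired response peak
W = train_cf(patch, target)
print(track(W, patch))                              # peak recovered near (32, 32)
```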
Bootstrapping Generators from Noisy Data
Title | Bootstrapping Generators from Noisy Data |
Authors | Laura Perez-Beltrachini, Mirella Lapata |
Abstract | A core step in statistical data-to-text generation concerns learning correspondences between structured data representations (e.g., facts in a database) and associated texts. In this paper we aim to bootstrap generators from large scale datasets where the data (e.g., DBPedia facts) and related texts (e.g., Wikipedia abstracts) are loosely aligned. We tackle this challenging task by introducing a special-purpose content selection mechanism. We use multi-instance learning to automatically discover correspondences between data and text pairs and show how these can be used to enhance the content signal while training an encoder-decoder architecture. Experimental results demonstrate that models trained with content-specific objectives improve upon a vanilla encoder-decoder which solely relies on soft attention. |
Tasks | Data-to-Text Generation, Text Generation |
Published | 2018-04-17 |
URL | https://arxiv.org/abs/1804.06385v4 |
https://arxiv.org/pdf/1804.06385v4.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-generators-from-noisy-data |
Repo | https://github.com/EdinburghNLP/wikigen |
Framework | none |
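A schematic sketch of the multi-instance learning idea: each loosely aligned (facts, abstract) pair is treated as a bag, every fact is scored against every word, and scores are aggregated with a max, which is one standard MIL assumption. The embeddings and the selection threshold below are random stand-ins, not the paper's model:

```python
# Schematic numpy sketch of MIL-style fact/word alignment scoring with
# max-aggregation. Embeddings here are random stand-ins for learned ones.
import numpy as np

rng = np.random.default_rng(0)
fact_emb = rng.normal(size=(4, 50))    # 4 DBPedia-style facts, 50-d embeddings
word_emb = rng.normal(size=(30, 50))   # 30 words of a Wikipedia abstract

sim = fact_emb @ word_emb.T            # (4, 30) fact-word similarity scores
support = sim.max(axis=1)              # MIL max-aggregation: best word per fact
aligned = support > support.mean()     # keep facts the text actually verbalizes
print(aligned)                         # content-selection signal for the generator
```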
Understanding Back-Translation at Scale
Title | Understanding Back-Translation at Scale |
Authors | Sergey Edunov, Myle Ott, Michael Auli, David Grangier |
Abstract | An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences. This work broadens the understanding of back-translation and investigates a number of methods to generate synthetic source sentences. We find that in all but resource-poor settings, back-translations obtained via sampling or noised beam outputs are most effective. Our analysis shows that sampled or noisy synthetic data gives a much stronger training signal than data generated by beam or greedy search. We also study how synthetic data compares to genuine bitext and examine various domain effects. Finally, we scale to hundreds of millions of monolingual sentences and achieve a new state of the art of 35 BLEU on the WMT’14 English-German test set. |
Tasks | Machine Translation |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09381v2 |
http://arxiv.org/pdf/1808.09381v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-back-translation-at-scale |
Repo | https://github.com/pytorch/fairseq |
Framework | pytorch |
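The paper's central contrast, sampling versus beam or greedy outputs of the backward model, can be illustrated with a toy next-token distribution; the distribution below is a stand-in for a real target-to-source translation model:

```python
# Toy numpy sketch contrasting greedy decoding with sampling when
# generating synthetic source sentences for back-translation.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "a", "this", "that", "one"]
p = np.array([0.45, 0.25, 0.15, 0.10, 0.05])   # model's next-token distribution

greedy = [vocab[p.argmax()] for _ in range(5)]           # always picks the mode
sampled = [vocab[rng.choice(len(vocab), p=p)] for _ in range(5)]

print(greedy)    # ['the', 'the', 'the', 'the', 'the']: low-diversity signal
print(sampled)   # varied tokens: the richer training signal the paper observes
```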
CTAP: Complementary Temporal Action Proposal Generation
Title | CTAP: Complementary Temporal Action Proposal Generation |
Authors | Jiyang Gao, Kan Chen, Ram Nevatia |
Abstract | Temporal action proposal generation is an important task: akin to object proposals, temporal action proposals are intended to capture “clips” or temporal intervals in videos that are likely to contain an action. Previous methods can be divided into two groups: sliding window ranking and actionness score grouping. Sliding windows uniformly cover all segments in videos, but their temporal boundaries are imprecise; grouping-based methods may have more precise boundaries but may omit some proposals when the quality of the actionness scores is low. Based on the complementary characteristics of these two methods, we propose a novel Complementary Temporal Action Proposal (CTAP) generator. Specifically, we apply a Proposal-level Actionness Trustworthiness Estimator (PATE) to the sliding-window proposals to generate probabilities indicating whether the actions can be correctly detected by actionness scores; the windows with high scores are collected. The collected sliding windows and actionness proposals are then processed by a temporal convolutional neural network for proposal ranking and boundary adjustment. CTAP outperforms state-of-the-art methods on average recall (AR) by a large margin on the THUMOS-14 and ActivityNet 1.3 datasets. We further apply CTAP as a proposal generation method in an existing action detector, and show consistent, significant improvements. |
Tasks | Temporal Action Proposal Generation |
Published | 2018-07-12 |
URL | http://arxiv.org/abs/1807.04821v2 |
http://arxiv.org/pdf/1807.04821v2.pdf | |
PWC | https://paperswithcode.com/paper/ctap-complementary-temporal-action-proposal |
Repo | https://github.com/jiyanggao/CTAP |
Framework | tf |
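A schematic sketch of the complementary fusion step: sliding-window proposals are scored by a PATE-style trust estimator, the high-scoring windows are collected, and they are pooled with the actionness proposals before ranking. The trust function here is a toy stand-in for the learned network:

```python
# Schematic sketch of complementary proposal fusion: PATE-gated sliding
# windows pooled with actionness proposals. pate_score is a toy stand-in
# for the learned trustworthiness estimator.
def fuse_proposals(windows, actionness_props, pate_score, thresh=0.5):
    """windows / actionness_props: lists of (start, end) intervals in seconds."""
    collected = [w for w in windows if pate_score(w) > thresh]
    return collected + actionness_props   # complementary set sent to the ranker

windows = [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0)]
actionness_props = [(1.2, 3.7)]
# Toy trust score: pretend actionness grouping is only reliable late in the video.
print(fuse_proposals(windows, actionness_props,
                     lambda w: 1.0 if w[0] < 3 else 0.0))
```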
FlowQA: Grasping Flow in History for Conversational Machine Comprehension
Title | FlowQA: Grasping Flow in History for Conversational Machine Comprehension |
Authors | Hsin-Yuan Huang, Eunsol Choi, Wen-tau Yih |
Abstract | Conversational machine comprehension requires the understanding of the conversation history, such as previous question/answer pairs, the document context, and the current question. To enable traditional, single-turn models to encode the history comprehensively, we introduce Flow, a mechanism that can incorporate intermediate representations generated during the process of answering previous questions, through an alternating parallel processing structure. Compared to approaches that concatenate previous questions/answers as input, Flow integrates the latent semantics of the conversation history more deeply. Our model, FlowQA, shows superior performance on two recently proposed conversational challenges (+7.2% F1 on CoQA and +4.0% on QuAC). The effectiveness of Flow also shows in other tasks. By reducing sequential instruction understanding to conversational machine comprehension, FlowQA outperforms the best models on all three domains in SCONE, with +1.8% to +4.4% improvement in accuracy. |
Tasks | Question Answering, Reading Comprehension |
Published | 2018-10-06 |
URL | http://arxiv.org/abs/1810.06683v3 |
http://arxiv.org/pdf/1810.06683v3.pdf | |
PWC | https://paperswithcode.com/paper/flowqa-grasping-flow-in-history-for |
Repo | https://github.com/momohuang/FlowQA |
Framework | pytorch |
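A minimal PyTorch sketch of the Flow idea as described in the abstract: for each context position, a recurrent network runs along the turn axis, so intermediate reasoning states from earlier questions reach later ones in parallel across positions. Shapes and names are illustrative:

```python
# Minimal PyTorch sketch of a Flow layer: an RNN over conversation turns,
# run in parallel for every context position.
import torch
import torch.nn as nn

turns, ctx_len, d = 4, 20, 32                 # dialogue turns, context length, hidden size
reps = torch.randn(turns, ctx_len, d)         # per-turn intermediate context representations

flow_rnn = nn.GRU(d, d, batch_first=True)
# Treat each context position as a "batch" element and the turn axis as time:
per_position = reps.permute(1, 0, 2)          # (ctx_len, turns, d)
flowed, _ = flow_rnn(per_position)            # integrate history along the turn axis
flowed = flowed.permute(1, 0, 2)              # back to (turns, ctx_len, d)
print(flowed.shape)                           # each turn now carries earlier turns' states
```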
A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization
Title | A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization |
Authors | Dai Quoc Nguyen, Thanh Vu, Tu Dinh Nguyen, Dat Quoc Nguyen, Dinh Phung |
Abstract | In this paper, we introduce an embedding model, named CapsE, that explores a capsule network to model relationship triples (subject, relation, object). Our CapsE represents each triple as a 3-column matrix where each column vector represents the embedding of an element in the triple. This 3-column matrix is then fed to a convolution layer where multiple filters are applied to generate different feature maps. These feature maps are reconstructed into corresponding capsules which are then routed to another capsule to produce a continuous vector. The length of this vector is used to measure the plausibility score of the triple. Our proposed CapsE obtains better performance than previous state-of-the-art embedding models for knowledge graph completion on two benchmark datasets, WN18RR and FB15k-237, and outperforms strong search personalization baselines on SEARCH17. |
Tasks | Knowledge Graph Completion, Link Prediction |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04122v3 |
http://arxiv.org/pdf/1808.04122v3.pdf | |
PWC | https://paperswithcode.com/paper/a-capsule-network-based-embedding-model-for-1 |
Repo | https://github.com/daiquocnguyen/ConvKB |
Framework | tf |
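A simplified PyTorch sketch of the scoring pipeline: the triple is embedded as a k x 3 matrix, 1x3 convolution filters produce feature maps, and the plausibility score is the length of the resulting vector. Dynamic routing between capsules is omitted for brevity, so this shows the shape of the computation, not the full model:

```python
# Simplified PyTorch sketch of CapsE-style triple scoring (routing omitted).
import torch
import torch.nn as nn

k, n_filters = 50, 8                         # embedding dim, number of filters
emb = nn.Embedding(1000, k)                  # shared entity/relation embeddings
conv = nn.Conv2d(1, n_filters, kernel_size=(1, 3))  # each filter spans one row

def score_triple(s, r, o):
    m = torch.stack([emb(s), emb(r), emb(o)], dim=-1)  # (1, k, 3) triple matrix
    maps = torch.relu(conv(m.unsqueeze(1)))            # (1, n_filters, k, 1) feature maps
    v = maps.flatten()                                 # stand-in for the routed capsule output
    return v.norm()                                    # plausibility score = vector length

ids = torch.tensor([1]), torch.tensor([2]), torch.tensor([3])
print(score_triple(*ids))
```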
CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction
Title | CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction |
Authors | Shing Yan Loo, Ali Jahani Amiri, Syamsiah Mashohor, Sai Hong Tang, Hong Zhang |
Abstract | Reliable feature correspondence between frames is a critical step in visual odometry (VO) and visual simultaneous localization and mapping (V-SLAM) algorithms. In comparison with existing VO and V-SLAM algorithms, semi-direct visual odometry (SVO) has two main advantages that lead to state-of-the-art frame-rate camera motion estimation: direct pixel correspondence and an efficient implementation of a probabilistic mapping method. This paper improves SVO's mapping by initializing the mean and the variance of the depth at a feature location according to the depth prediction from a single-image depth prediction network. By significantly reducing the depth uncertainty of the initialized map point (i.e., a small variance centred about the depth prediction), the benefits are twofold: reliable feature correspondence between views and fast convergence to the true depth when creating new map points. We evaluate our method on two outdoor datasets: the KITTI dataset and the Oxford RobotCar dataset. The experimental results indicate that the improved SVO mapping results in increased robustness and camera tracking accuracy. |
Tasks | Depth Estimation, Motion Estimation, Simultaneous Localization and Mapping, Visual Odometry |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.01011v1 |
http://arxiv.org/pdf/1810.01011v1.pdf | |
PWC | https://paperswithcode.com/paper/cnn-svo-improving-the-mapping-in-semi-direct |
Repo | https://github.com/Yelen719/CNN-DSO |
Framework | tf |
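A minimal sketch of the mapping change: the depth filter for a new feature is initialized from the CNN's depth prediction with a small variance centred about it, then refined by standard Gaussian fusion as new observations arrive. The relative-sigma value below is an illustrative assumption, not a number from the paper:

```python
# Minimal sketch of CNN-initialized depth filtering with Gaussian fusion.
def init_depth_filter(cnn_depth, rel_sigma=0.25):
    """Mean from the CNN prediction; small variance centred about it."""
    return cnn_depth, (rel_sigma * cnn_depth) ** 2      # (mean, variance)

def fuse(mean, var, obs, obs_var):
    """Standard Gaussian product update used in depth-filter mapping."""
    new_var = 1.0 / (1.0 / var + 1.0 / obs_var)
    new_mean = new_var * (mean / var + obs / obs_var)
    return new_mean, new_var

mean, var = init_depth_filter(cnn_depth=8.0)            # network predicts ~8 m
print(fuse(mean, var, obs=7.5, obs_var=0.5))            # converges toward the true depth
```

Compared with a wide uninformative prior, the small initial variance narrows the epipolar search range, which is the source of the more reliable correspondences the abstract describes.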
Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification
Title | Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification |
Authors | Shaoning Zeng, Bob Zhang, Yanghao Zhang, Jianping Gou |
Abstract | Deep convolutional neural networks provide a powerful feature learning capability for image classification. Deep image features can be used for many image understanding tasks such as image classification and object recognition. However, robustness obtained on one dataset can hardly be reproduced in another domain, which leads to inefficient models far from the state of the art. We propose a deep collaborative weight-based classification (DeepCWC) method to resolve this problem, providing a novel option to fully take advantage of deep features in classic machine learning. It first performs L2-norm-based collaborative representation on the original images, as well as on the deep features extracted by deep CNN models. Then, two distance vectors, obtained from the pair of linear representations, are fused together via a novel collaborative weight. This collaborative weight enables the deep and classic representations to weigh each other. We observed complementarity between the two representations in a series of experiments on 10 facial and object datasets. The proposed DeepCWC produces very promising classification results, and outperforms many other benchmark methods, especially those reported on Fashion-MNIST. The code will be published in our public repository. |
Tasks | Image Classification, L2 Regularization, Object Recognition |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07589v2 |
http://arxiv.org/pdf/1802.07589v2.pdf | |
PWC | https://paperswithcode.com/paper/collaboratively-weighting-deep-and-classic |
Repo | https://github.com/zengsn/research |
Framework | none |
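A schematic numpy sketch of the two-branch idea: L2-regularized (ridge) collaborative coding is run over a dictionary of raw features and over a dictionary of deep features, each yielding a class-wise residual vector, and the two vectors are fused. The fixed convex-combination weight below is a stand-in for the paper's learned collaborative weight:

```python
# Schematic numpy sketch of collaborative representation on two feature
# types, fused at the class-residual level. The fixed weight w is a
# stand-in for the paper's collaborative weight.
import numpy as np

def cr_residuals(D, y, labels, lam=1e-2):
    """Ridge coding of y over dictionary D, then per-class reconstruction error."""
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    return np.array([np.linalg.norm(y - D[:, labels == c] @ alpha[labels == c])
                     for c in np.unique(labels)])

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(5), 10)                 # 5 classes, 10 samples each
D_raw, D_deep = rng.normal(size=(100, 50)), rng.normal(size=(200, 50))
y_raw, y_deep = rng.normal(size=100), rng.normal(size=200)  # same test image, two views

r1 = cr_residuals(D_raw, y_raw, labels)              # classic-representation residuals
r2 = cr_residuals(D_deep, y_deep, labels)            # deep-representation residuals
w = 0.5                                              # stand-in collaborative weight
print(np.argmin(w * r1 + (1 - w) * r2))              # predicted class
```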
signSGD: Compressed Optimisation for Non-Convex Problems
Title | signSGD: Compressed Optimisation for Non-Convex Problems |
Authors | Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar |
Abstract | Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. signSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative $\ell_1/\ell_2$ geometry of gradients, noise and curvature informs whether signSGD or SGD is theoretically better suited to a particular problem. On the practical side, we find that the momentum counterpart of signSGD is able to match the accuracy and convergence speed of Adam on deep Imagenet models. We extend our theory to the distributed setting, where the parameter server uses majority vote to aggregate gradient signs from each worker, enabling 1-bit compression of worker-server communication in both directions. Using a theorem by Gauss we prove that majority vote can achieve the same reduction in variance as full precision distributed SGD. Thus, there is great promise for sign-based optimisation schemes to achieve fast communication and fast convergence. Code to reproduce the experiments can be found at https://github.com/jxbz/signSGD . |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04434v3 |
http://arxiv.org/pdf/1802.04434v3.pdf | |
PWC | https://paperswithcode.com/paper/signsgd-compressed-optimisation-for-non |
Repo | https://github.com/jxbz/signSGD |
Framework | tf |
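A minimal PyTorch sketch of the two mechanisms the abstract describes: the sign-only parameter update, and majority-vote aggregation of 1-bit gradients at the parameter server:

```python
# Minimal PyTorch sketch of the signSGD update and majority-vote aggregation.
import torch

def signsgd_step(params, lr=1e-3):
    """Update each parameter with only the sign of its gradient."""
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * p.grad.sign()    # 1 bit of gradient information per coordinate

def majority_vote(worker_grads):
    """Server side: sign of the sum of worker signs (1-bit in both directions)."""
    return torch.stack([g.sign() for g in worker_grads]).sum(0).sign()

w = torch.randn(10, requires_grad=True)
loss = (w ** 2).sum(); loss.backward()
signsgd_step([w])
print(majority_vote([torch.randn(10) for _ in range(5)]))   # aggregated sign vector
```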
Neural Proximal Gradient Descent for Compressive Imaging
Title | Neural Proximal Gradient Descent for Compressive Imaging |
Authors | Morteza Mardani, Qingyun Sun, Shreyas Vasawanala, Vardan Papyan, Hatef Monajemi, John Pauly, David Donoho |
Abstract | Recovering high-resolution images from limited sensory data typically leads to a severely ill-posed inverse problem, demanding inversion algorithms that effectively capture the prior information. Learning a good inverse mapping from training data faces severe challenges, including: (i) scarcity of training data; (ii) the need for plausible reconstructions that are physically feasible; (iii) the need for fast reconstruction, especially in real-time applications. We develop a system that addresses all of these challenges, using as its basic architecture the recurrent application of the proximal gradient algorithm. We learn a proximal map that works well with real images, based on residual networks. Contraction of the resulting map is analyzed, and incoherence conditions are investigated that drive the convergence of the iterates. Extensive experiments are carried out under different settings: (a) reconstructing abdominal MRI of pediatric patients from highly undersampled Fourier-space data and (b) super-resolving natural face images. Our key findings include: 1. a recurrent ResNet with a single residual block unrolled from an iterative algorithm yields an effective proximal map which accurately reveals MR image details; 2. our architecture significantly outperforms conventional non-recurrent deep ResNets by 2 dB SNR, and is also trained much more rapidly; 3. it outperforms state-of-the-art compressed-sensing wavelet-based methods by 4 dB SNR, with 100x speedups in reconstruction time. |
Tasks | |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.03963v1 |
http://arxiv.org/pdf/1806.03963v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-proximal-gradient-descent-for |
Repo | https://github.com/MortezaMardani/Neural-PGD |
Framework | tf |
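A minimal PyTorch sketch of the unrolled recurrence for a linear inverse problem y = Ax: each iteration takes a gradient step on the data-fidelity term and then applies a learned proximal map, here a small residual network standing in for the paper's ResNet proximal (and left untrained in this sketch):

```python
# Minimal PyTorch sketch of unrolled neural proximal gradient descent.
import torch
import torch.nn as nn

n, m, steps, eta = 64, 32, 5, 0.1
A = torch.randn(m, n) / m ** 0.5                 # undersampled measurement operator
x_true = torch.randn(n)
y = A @ x_true                                   # observed measurements

# Learned proximal map (untrained stand-in; trained end-to-end in the paper).
prox = nn.Sequential(nn.Linear(n, n), nn.ReLU(), nn.Linear(n, n))

x = torch.zeros(n)
for _ in range(steps):                           # recurrent unrolled iterations
    x = x - eta * A.t() @ (A @ x - y)            # gradient step on ||Ax - y||^2
    x = x + prox(x)                              # residual proximal refinement
print(torch.norm(A @ x - y))                     # data-fidelity residual
```

Training would backpropagate a reconstruction loss through all unrolled steps, sharing the proximal network's weights across iterations, which is what makes the architecture recurrent.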
Graphene: Semantically-Linked Propositions in Open Information Extraction
Title | Graphene: Semantically-Linked Propositions in Open Information Extraction |
Authors | Matthias Cetto, Christina Niklaus, André Freitas, Siegfried Handschuh |
Abstract | We present an Open Information Extraction (IE) approach that uses a two-layered transformation stage consisting of a clausal disembedding layer and a phrasal disembedding layer, together with rhetorical relation identification. In that way, we convert sentences that present a complex linguistic structure into simplified, syntactically sound sentences, from which we can extract propositions that are represented in a two-layered hierarchy in the form of core relational tuples and accompanying contextual information which are semantically linked via rhetorical relations. In a comparative evaluation, we demonstrate that our reference implementation Graphene outperforms state-of-the-art Open IE systems in the construction of correct n-ary predicate-argument structures. Moreover, we show that existing Open IE approaches can benefit from the transformation process of our framework. |
Tasks | Open Information Extraction |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11276v1 |
http://arxiv.org/pdf/1807.11276v1.pdf | |
PWC | https://paperswithcode.com/paper/graphene-semantically-linked-propositions-in |
Repo | https://github.com/Lambda-3/Graphene |
Framework | none |
Deep Enhanced Representation for Implicit Discourse Relation Recognition
Title | Deep Enhanced Representation for Implicit Discourse Relation Recognition |
Authors | Hongxiao Bai, Hai Zhao |
Abstract | Implicit discourse relation recognition is a challenging task, as predicting the relation without explicit connectives in discourse parsing requires understanding of text spans and cannot be easily derived from surface features of the input sentence pairs. Thus, properly representing the text is crucial to this task. In this paper, we propose a model augmented with text representations at different granularities, including character, subword, word, sentence, and sentence-pair levels. The proposed deeper model is evaluated on the benchmark treebank and, to the best of our knowledge, achieves state-of-the-art accuracy of greater than 48% in 11-way classification and an $F_1$ score greater than 50% in 4-way classification for the first time. |
Tasks | |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.05154v1 |
http://arxiv.org/pdf/1807.05154v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-enhanced-representation-for-implicit |
Repo | https://github.com/diccooo/Deep_Enhanced_Repr_for_IDRR |
Framework | pytorch |
NeVAE: A Deep Generative Model for Molecular Graphs
Title | NeVAE: A Deep Generative Model for Molecular Graphs |
Authors | Bidisha Samanta, Abir De, Gourhari Jana, Pratim Kumar Chattaraj, Niloy Ganguly, Manuel Gomez-Rodriguez |
Abstract | Deep generative models have been praised for their ability to learn smooth latent representations of images, text, and audio, which can then be used to generate new, plausible data. However, current generative models are unable to work with molecular graphs due to their unique characteristics: their underlying structure is not Euclidean or grid-like, they remain isomorphic under permutation of the node labels, and they vary in their number of nodes and edges. In this paper, we first propose a novel variational autoencoder for molecular graphs, whose encoder and decoder are specially designed to account for the above properties by means of several technical innovations. Moreover, in contrast with the state of the art, our decoder is able to provide the spatial coordinates of the atoms of the molecules it generates. Then, we develop a gradient-based algorithm to optimize the decoder of our model so that it learns to generate molecules that maximize the value of a certain property of interest and, given a molecule of interest, is able to optimize the spatial configuration of its atoms for greater stability. Experiments reveal that our variational autoencoder can discover plausible, diverse and novel molecules more effectively than several state-of-the-art models. Moreover, for several properties of interest, our optimized decoder is able to identify molecules with property values 121% higher than those identified by several state-of-the-art methods based on Bayesian optimization and reinforcement learning. |
Tasks | |
Published | 2018-02-14 |
URL | https://arxiv.org/abs/1802.05283v4 |
https://arxiv.org/pdf/1802.05283v4.pdf | |
PWC | https://paperswithcode.com/paper/nevae-a-deep-generative-model-for-molecular |
Repo | https://github.com/Networks-Learning/nevae |
Framework | tf |