Paper Group AWR 299
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. Deep Graph Infomax. Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking. Bootstrapping Generators from Noisy Data. Understanding Back-Translation at Scale. CTAP: Complementary Temporal Action Proposal Generation. FlowQA: Grasping Flow in History for Conversational Machine Comprehension. A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization. CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction. Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification. signSGD: Compressed Optimisation for Non-Convex Problems. Neural Proximal Gradient Descent for Compressive Imaging. Graphene: Semantically-Linked Propositions in Open Information Extraction. Deep Enhanced Representation for Implicit Discourse Relation Recognition. NeVAE: A Deep Generative Model for Molecular Graphs.
UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
Title | UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction |
Authors | Leland McInnes, John Healy, James Melville |
Abstract | UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning. |
Tasks | Dimensionality Reduction |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.03426v2 |
http://arxiv.org/pdf/1802.03426v2.pdf | |
PWC | https://paperswithcode.com/paper/umap-uniform-manifold-approximation-and |
Repo | https://github.com/NCBI-Hackathons/DiseaseCluster |
Framework | none |
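For readers who want to try the method, the reference implementation `umap-learn` exposes a scikit-learn-style interface. A minimal usage sketch, assuming the `umap-learn` package is installed and using toy random data:

```python
# Minimal usage sketch of the reference implementation (umap-learn),
# which follows a scikit-learn-style fit/transform API.
import numpy as np
import umap  # pip install umap-learn

X = np.random.rand(500, 64)          # toy data: 500 points in 64 dimensions

reducer = umap.UMAP(
    n_neighbors=15,   # size of the local neighborhood used to build the fuzzy graph
    min_dist=0.1,     # minimum spacing of points in the embedding
    n_components=2,   # unlike t-SNE, higher embedding dimensions are also cheap
)
embedding = reducer.fit_transform(X)  # shape (500, 2)
```

Because the abstract notes there is no computational restriction on embedding dimension, `n_components` can be raised well above 2 when UMAP is used as a general-purpose preprocessing step rather than for visualization.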
Deep Graph Infomax
Title | Deep Graph Infomax |
Authors | Petar Veličković, William Fedus, William L. Hamilton, Pietro Liò, Yoshua Bengio, R Devon Hjelm |
Abstract | We present Deep Graph Infomax (DGI), a general approach for learning node representations within graph-structured data in an unsupervised manner. DGI relies on maximizing mutual information between patch representations and corresponding high-level summaries of graphs—both derived using established graph convolutional network architectures. The learnt patch representations summarize subgraphs centered around nodes of interest, and can thus be reused for downstream node-wise learning tasks. In contrast to most prior approaches to unsupervised learning with GCNs, DGI does not rely on random walk objectives, and is readily applicable to both transductive and inductive learning setups. We demonstrate competitive performance on a variety of node classification benchmarks, which at times even exceeds the performance of supervised learning. |
Tasks | Node Classification |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10341v2 |
http://arxiv.org/pdf/1809.10341v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-graph-infomax |
Repo | https://github.com/PetarV-/DGI |
Framework | pytorch |
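A minimal PyTorch sketch of the DGI objective on a dense toy adjacency matrix: a one-layer GCN encoder produces patch representations, the corruption function shuffles node feature rows, and a bilinear discriminator separates real from corrupted patches against the graph summary. All names and sizes here are illustrative, not the paper's code:

```python
# Minimal PyTorch sketch of the DGI mutual-information objective.
import torch
import torch.nn as nn

n, f, d = 100, 32, 64                      # nodes, input features, hidden size
A = torch.eye(n)                           # toy (self-loop-only) normalized adjacency
X = torch.randn(n, f)                      # node features

gcn = nn.Linear(f, d)                      # one-layer GCN: H = PReLU(A X W)
act = nn.PReLU()
W = nn.Parameter(torch.empty(d, d)); nn.init.xavier_uniform_(W)

def encode(x):                             # patch (node) representations
    return act(A @ gcn(x))

H_pos = encode(X)                          # real graph
H_neg = encode(X[torch.randperm(n)])       # corruption: shuffle feature rows
s = torch.sigmoid(H_pos.mean(dim=0))       # graph-level summary (readout)

logits_pos = H_pos @ W @ s                 # bilinear discriminator scores
logits_neg = H_neg @ W @ s
loss = nn.functional.binary_cross_entropy_with_logits(
    torch.cat([logits_pos, logits_neg]),
    torch.cat([torch.ones(n), torch.zeros(n)]),  # real patches = 1, corrupted = 0
)
loss.backward()                            # maximizing MI = minimizing this BCE
```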
Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking
Title | Joint Representation and Truncated Inference Learning for Correlation Filter based Tracking |
Authors | Yingjie Yao, Xiaohe Wu, Lei Zhang, Shiguang Shan, Wangmeng Zuo |
Abstract | Correlation filter (CF) based trackers generally include two modules, i.e., feature representation and online model adaptation. In existing off-line deep learning models for CF trackers, the model adaptation is usually either abandoned or given a closed-form solution to make it feasible to learn the deep representation in an end-to-end manner. However, such solutions fail to exploit the advances in CF models, and cannot achieve competitive accuracy in comparison with state-of-the-art CF trackers. In this paper, we investigate the joint learning of deep representation and model adaptation, where an updater network is introduced for better tracking on the future frame by taking the current frame representation, the tracking result, and the last CF tracker as input. By modeling the representor as a convolutional neural network (CNN), we truncate the alternating direction method of multipliers (ADMM) and interpret it as a deep updater network, resulting in our model for learning representation and truncated inference (RTINet). Experiments demonstrate that our RTINet tracker achieves favorable tracking accuracy against state-of-the-art trackers, and its rapid version can run at a real-time speed of 24 fps. The code and pre-trained models will be publicly available at https://github.com/tourmaline612/RTINet. |
Tasks | |
Published | 2018-07-29 |
URL | http://arxiv.org/abs/1807.11071v1 |
http://arxiv.org/pdf/1807.11071v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-representation-and-truncated-inference |
Repo | https://github.com/tourmaline612/RTINet |
Framework | none |
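For background, the closed-form model adaptation the abstract refers to is typified by Fourier-domain correlation filter solutions. Below is a minimal MOSSE-style sketch of such a closed-form CF update in numpy; this is generic CF background under a ridge-regression formulation, not RTINet's learned ADMM updater itself:

```python
# Minimal numpy sketch of a closed-form correlation filter update in the
# Fourier domain (MOSSE-style); generic CF background, not RTINet itself.
import numpy as np

def train_cf(patch, target, lam=1e-2):
    """Solve the ridge-regularized correlation objective element-wise in Fourier space."""
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(target)
    return (np.conj(X) * Y) / (np.conj(X) * X + lam)    # closed-form filter

def track(W, patch):
    """Correlate the filter with a new patch; the response peak is the target location."""
    response = np.real(np.fft.ifft2(W * np.fft.fft2(patch)))
    return np.unravel_index(response.argmax(), response.shape)

patch = np.random.rand(64, 64)                      # toy image patch
target = np.zeros((64, 64)); target[32, 32] = 1.0   # desired response peak
W = train_cf(patch, target)
print(track(W, patch))                              # peak recovered near (32, 32)
```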
Bootstrapping Generators from Noisy Data
Title | Bootstrapping Generators from Noisy Data |
Authors | Laura Perez-Beltrachini, Mirella Lapata |
Abstract | A core step in statistical data-to-text generation concerns learning correspondences between structured data representations (e.g., facts in a database) and associated texts. In this paper we aim to bootstrap generators from large scale datasets where the data (e.g., DBPedia facts) and related texts (e.g., Wikipedia abstracts) are loosely aligned. We tackle this challenging task by introducing a special-purpose content selection mechanism. We use multi-instance learning to automatically discover correspondences between data and text pairs and show how these can be used to enhance the content signal while training an encoder-decoder architecture. Experimental results demonstrate that models trained with content-specific objectives improve upon a vanilla encoder-decoder which solely relies on soft attention. |
Tasks | Data-to-Text Generation, Text Generation |
Published | 2018-04-17 |
URL | https://arxiv.org/abs/1804.06385v4 |
https://arxiv.org/pdf/1804.06385v4.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-generators-from-noisy-data |
Repo | https://github.com/EdinburghNLP/wikigen |
Framework | none |
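A schematic sketch of the multi-instance learning idea: each loosely aligned (facts, abstract) pair is treated as a bag, every fact is scored against every word, and scores are aggregated with a max, which is one standard MIL assumption. The embeddings and the selection threshold below are random stand-ins, not the paper's model:

```python
# Schematic numpy sketch of MIL-style fact/word alignment scoring with
# max-aggregation. Embeddings here are random stand-ins for learned ones.
import numpy as np

rng = np.random.default_rng(0)
fact_emb = rng.normal(size=(4, 50))    # 4 DBPedia-style facts, 50-d embeddings
word_emb = rng.normal(size=(30, 50))   # 30 words of a Wikipedia abstract

sim = fact_emb @ word_emb.T            # (4, 30) fact-word similarity scores
support = sim.max(axis=1)              # MIL max-aggregation: best word per fact
aligned = support > support.mean()     # keep facts the text actually verbalizes
print(aligned)                         # content-selection signal for the generator
```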
Understanding Back-Translation at Scale
Title | Understanding Back-Translation at Scale |
Authors | Sergey Edunov, Myle Ott, Michael Auli, David Grangier |
Abstract | An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences. This work broadens the understanding of back-translation and investigates a number of methods to generate synthetic source sentences. We find that in all but resource-poor settings, back-translations obtained via sampling or noised beam outputs are most effective. Our analysis shows that sampled or noisy synthetic data gives a much stronger training signal than data generated by beam or greedy search. We also study how synthetic data compares to genuine bitext and examine various domain effects. Finally, we scale to hundreds of millions of monolingual sentences and achieve a new state of the art of 35 BLEU on the WMT’14 English-German test set. |
Tasks | Machine Translation |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09381v2 |
http://arxiv.org/pdf/1808.09381v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-back-translation-at-scale |
Repo | https://github.com/pytorch/fairseq |
Framework | pytorch |
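The paper's central contrast, sampling versus beam or greedy outputs of the backward model, can be illustrated with a toy next-token distribution; the distribution below is a stand-in for a real target-to-source translation model:

```python
# Toy numpy sketch contrasting greedy decoding with sampling when
# generating synthetic source sentences for back-translation.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "a", "this", "that", "one"]
p = np.array([0.45, 0.25, 0.15, 0.10, 0.05])   # model's next-token distribution

greedy = [vocab[p.argmax()] for _ in range(5)]           # always picks the mode
sampled = [vocab[rng.choice(len(vocab), p=p)] for _ in range(5)]

print(greedy)    # ['the', 'the', 'the', 'the', 'the']: low-diversity signal
print(sampled)   # varied tokens: the richer training signal the paper observes
```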
CTAP: Complementary Temporal Action Proposal Generation
Title | CTAP: Complementary Temporal Action Proposal Generation |
Authors | Jiyang Gao, Kan Chen, Ram Nevatia |
Abstract | Temporal action proposal generation is an important task: akin to object proposals, temporal action proposals are intended to capture “clips” or temporal intervals in videos that are likely to contain an action. Previous methods can be divided into two groups: sliding window ranking and actionness score grouping. Sliding windows uniformly cover all segments in videos, but their temporal boundaries are imprecise; grouping-based methods may have more precise boundaries but may omit some proposals when the quality of the actionness scores is low. Based on the complementary characteristics of these two methods, we propose a novel Complementary Temporal Action Proposal (CTAP) generator. Specifically, we apply a Proposal-level Actionness Trustworthiness Estimator (PATE) to the sliding-window proposals to generate probabilities indicating whether the actions can be correctly detected by actionness scores; the windows with high scores are collected. The collected sliding windows and actionness proposals are then processed by a temporal convolutional neural network for proposal ranking and boundary adjustment. CTAP outperforms state-of-the-art methods on average recall (AR) by a large margin on the THUMOS-14 and ActivityNet 1.3 datasets. We further apply CTAP as a proposal generation method in an existing action detector, and show consistent, significant improvements. |
Tasks | Temporal Action Proposal Generation |
Published | 2018-07-12 |
URL | http://arxiv.org/abs/1807.04821v2 |
http://arxiv.org/pdf/1807.04821v2.pdf | |
PWC | https://paperswithcode.com/paper/ctap-complementary-temporal-action-proposal |
Repo | https://github.com/jiyanggao/CTAP |
Framework | tf |
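A schematic sketch of the complementary fusion step: sliding-window proposals are scored by a PATE-style trust estimator, the high-scoring windows are collected, and they are pooled with the actionness proposals before ranking. The trust function here is a toy stand-in for the learned network:

```python
# Schematic sketch of complementary proposal fusion: PATE-gated sliding
# windows pooled with actionness proposals. pate_score is a toy stand-in
# for the learned trustworthiness estimator.
def fuse_proposals(windows, actionness_props, pate_score, thresh=0.5):
    """windows / actionness_props: lists of (start, end) intervals in seconds."""
    collected = [w for w in windows if pate_score(w) > thresh]
    return collected + actionness_props   # complementary set sent to the ranker

windows = [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0)]
actionness_props = [(1.2, 3.7)]
# Toy trust score: pretend actionness grouping is only reliable late in the video.
print(fuse_proposals(windows, actionness_props,
                     lambda w: 1.0 if w[0] < 3 else 0.0))
```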
FlowQA: Grasping Flow in History for Conversational Machine Comprehension
Title | FlowQA: Grasping Flow in History for Conversational Machine Comprehension |
Authors | Hsin-Yuan Huang, Eunsol Choi, Wen-tau Yih |
Abstract | Conversational machine comprehension requires the understanding of the conversation history, such as previous question/answer pairs, the document context, and the current question. To enable traditional, single-turn models to encode the history comprehensively, we introduce Flow, a mechanism that can incorporate intermediate representations generated during the process of answering previous questions, through an alternating parallel processing structure. Compared to approaches that concatenate previous questions/answers as input, Flow integrates the latent semantics of the conversation history more deeply. Our model, FlowQA, shows superior performance on two recently proposed conversational challenges (+7.2% F1 on CoQA and +4.0% on QuAC). The effectiveness of Flow also shows in other tasks. By reducing sequential instruction understanding to conversational machine comprehension, FlowQA outperforms the best models on all three domains in SCONE, with +1.8% to +4.4% improvement in accuracy. |
Tasks | Question Answering, Reading Comprehension |
Published | 2018-10-06 |
URL | http://arxiv.org/abs/1810.06683v3 |
http://arxiv.org/pdf/1810.06683v3.pdf | |
PWC | https://paperswithcode.com/paper/flowqa-grasping-flow-in-history-for |
Repo | https://github.com/momohuang/FlowQA |
Framework | pytorch |
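A minimal PyTorch sketch of the Flow idea as described in the abstract: for each context position, a recurrent network runs along the turn axis, so intermediate reasoning states from earlier questions reach later ones in parallel across positions. Shapes and names are illustrative:

```python
# Minimal PyTorch sketch of a Flow layer: an RNN over conversation turns,
# run in parallel for every context position.
import torch
import torch.nn as nn

turns, ctx_len, d = 4, 20, 32                 # dialogue turns, context length, hidden size
reps = torch.randn(turns, ctx_len, d)         # per-turn intermediate context representations

flow_rnn = nn.GRU(d, d, batch_first=True)
# Treat each context position as a "batch" element and the turn axis as time:
per_position = reps.permute(1, 0, 2)          # (ctx_len, turns, d)
flowed, _ = flow_rnn(per_position)            # integrate history along the turn axis
flowed = flowed.permute(1, 0, 2)              # back to (turns, ctx_len, d)
print(flowed.shape)                           # each turn now carries earlier turns' states
```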
A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization
Title | A Capsule Network-based Embedding Model for Knowledge Graph Completion and Search Personalization |
Authors | Dai Quoc Nguyen, Thanh Vu, Tu Dinh Nguyen, Dat Quoc Nguyen, Dinh Phung |
Abstract | In this paper, we introduce an embedding model, named CapsE, that explores a capsule network to model relationship triples (subject, relation, object). Our CapsE represents each triple as a 3-column matrix where each column vector represents the embedding of an element in the triple. This 3-column matrix is then fed to a convolution layer where multiple filters are applied to generate different feature maps. These feature maps are reconstructed into corresponding capsules which are then routed to another capsule to produce a continuous vector. The length of this vector is used to measure the plausibility score of the triple. Our proposed CapsE obtains better performance than previous state-of-the-art embedding models for knowledge graph completion on two benchmark datasets, WN18RR and FB15k-237, and outperforms strong search personalization baselines on SEARCH17. |
Tasks | Knowledge Graph Completion, Link Prediction |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04122v3 |
http://arxiv.org/pdf/1808.04122v3.pdf | |
PWC | https://paperswithcode.com/paper/a-capsule-network-based-embedding-model-for-1 |
Repo | https://github.com/daiquocnguyen/ConvKB |
Framework | tf |
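A simplified PyTorch sketch of the scoring pipeline: the triple is embedded as a k x 3 matrix, 1x3 convolution filters produce feature maps, and the plausibility score is the length of the resulting vector. Dynamic routing between capsules is omitted for brevity, so this shows the shape of the computation, not the full model:

```python
# Simplified PyTorch sketch of CapsE-style triple scoring (routing omitted).
import torch
import torch.nn as nn

k, n_filters = 50, 8                         # embedding dim, number of filters
emb = nn.Embedding(1000, k)                  # shared entity/relation embeddings
conv = nn.Conv2d(1, n_filters, kernel_size=(1, 3))  # each filter spans one row

def score_triple(s, r, o):
    m = torch.stack([emb(s), emb(r), emb(o)], dim=-1)  # (1, k, 3) triple matrix
    maps = torch.relu(conv(m.unsqueeze(1)))            # (1, n_filters, k, 1) feature maps
    v = maps.flatten()                                 # stand-in for the routed capsule output
    return v.norm()                                    # plausibility score = vector length

ids = torch.tensor([1]), torch.tensor([2]), torch.tensor([3])
print(score_triple(*ids))
```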
CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction
Title | CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction |
Authors | Shing Yan Loo, Ali Jahani Amiri, Syamsiah Mashohor, Sai Hong Tang, Hong Zhang |
Abstract | Reliable feature correspondence between frames is a critical step in visual odometry (VO) and visual simultaneous localization and mapping (V-SLAM) algorithms. In comparison with existing VO and V-SLAM algorithms, semi-direct visual odometry (SVO) has two main advantages that lead to state-of-the-art frame-rate camera motion estimation: direct pixel correspondence and an efficient implementation of a probabilistic mapping method. This paper improves SVO's mapping by initializing the mean and the variance of the depth at a feature location according to the depth prediction from a single-image depth prediction network. By significantly reducing the depth uncertainty of the initialized map point (i.e., a small variance centred about the depth prediction), the benefits are twofold: reliable feature correspondence between views and fast convergence to the true depth when creating new map points. We evaluate our method on two outdoor datasets: the KITTI dataset and the Oxford RobotCar dataset. The experimental results indicate that the improved SVO mapping results in increased robustness and camera tracking accuracy. |
Tasks | Depth Estimation, Motion Estimation, Simultaneous Localization and Mapping, Visual Odometry |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.01011v1 |
http://arxiv.org/pdf/1810.01011v1.pdf | |
PWC | https://paperswithcode.com/paper/cnn-svo-improving-the-mapping-in-semi-direct |
Repo | https://github.com/Yelen719/CNN-DSO |
Framework | tf |
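A minimal sketch of the mapping change: the depth filter for a new feature is initialized from the CNN's depth prediction with a small variance centred about it, then refined by standard Gaussian fusion as new observations arrive. The relative-sigma value below is an illustrative assumption, not a number from the paper:

```python
# Minimal sketch of CNN-initialized depth filtering with Gaussian fusion.
def init_depth_filter(cnn_depth, rel_sigma=0.25):
    """Mean from the CNN prediction; small variance centred about it."""
    return cnn_depth, (rel_sigma * cnn_depth) ** 2      # (mean, variance)

def fuse(mean, var, obs, obs_var):
    """Standard Gaussian product update used in depth-filter mapping."""
    new_var = 1.0 / (1.0 / var + 1.0 / obs_var)
    new_mean = new_var * (mean / var + obs / obs_var)
    return new_mean, new_var

mean, var = init_depth_filter(cnn_depth=8.0)            # network predicts ~8 m
print(fuse(mean, var, obs=7.5, obs_var=0.5))            # converges toward the true depth
```

Compared with a wide uninformative prior, the small initial variance narrows the epipolar search range, which is the source of the more reliable correspondences the abstract describes.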
Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification
Title | Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification |
Authors | Shaoning Zeng, Bob Zhang, Yanghao Zhang, Jianping Gou |
Abstract | Deep convolutional neural networks provide a powerful feature learning capability for image classification. Deep image features can be used for many image understanding tasks such as image classification and object recognition. However, robustness obtained on one dataset can hardly be reproduced in another domain, which leads to inefficient models far from the state of the art. We propose a deep collaborative weight-based classification (DeepCWC) method to resolve this problem, providing a novel option to fully take advantage of deep features in classic machine learning. It first performs L2-norm-based collaborative representation on the original images, as well as on the deep features extracted by deep CNN models. Then, two distance vectors, obtained from the pair of linear representations, are fused together via a novel collaborative weight. This collaborative weight enables the deep and classic representations to weigh each other. We observed complementarity between the two representations in a series of experiments on 10 facial and object datasets. The proposed DeepCWC produces very promising classification results, and outperforms many other benchmark methods, especially those reported on Fashion-MNIST. The code will be published in our public repository. |
Tasks | Image Classification, L2 Regularization, Object Recognition |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07589v2 |
http://arxiv.org/pdf/1802.07589v2.pdf | |
PWC | https://paperswithcode.com/paper/collaboratively-weighting-deep-and-classic |
Repo | https://github.com/zengsn/research |
Framework | none |
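A schematic numpy sketch of the two-branch idea: L2-regularized (ridge) collaborative coding is run over a dictionary of raw features and over a dictionary of deep features, each yielding a class-wise residual vector, and the two vectors are fused. The fixed convex-combination weight below is a stand-in for the paper's learned collaborative weight:

```python
# Schematic numpy sketch of collaborative representation on two feature
# types, fused at the class-residual level. The fixed weight w is a
# stand-in for the paper's collaborative weight.
import numpy as np

def cr_residuals(D, y, labels, lam=1e-2):
    """Ridge coding of y over dictionary D, then per-class reconstruction error."""
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    return np.array([np.linalg.norm(y - D[:, labels == c] @ alpha[labels == c])
                     for c in np.unique(labels)])

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(5), 10)                 # 5 classes, 10 samples each
D_raw, D_deep = rng.normal(size=(100, 50)), rng.normal(size=(200, 50))
y_raw, y_deep = rng.normal(size=100), rng.normal(size=200)  # same test image, two views

r1 = cr_residuals(D_raw, y_raw, labels)              # classic-representation residuals
r2 = cr_residuals(D_deep, y_deep, labels)            # deep-representation residuals
w = 0.5                                              # stand-in collaborative weight
print(np.argmin(w * r1 + (1 - w) * r2))              # predicted class
```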
signSGD: Compressed Optimisation for Non-Convex Problems
Title | signSGD: Compressed Optimisation for Non-Convex Problems |
Authors | Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar |
Abstract | Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. signSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative $\ell_1/\ell_2$ geometry of gradients, noise and curvature informs whether signSGD or SGD is theoretically better suited to a particular problem. On the practical side, we find that the momentum counterpart of signSGD is able to match the accuracy and convergence speed of Adam on deep Imagenet models. We extend our theory to the distributed setting, where the parameter server uses majority vote to aggregate gradient signs from each worker, enabling 1-bit compression of worker-server communication in both directions. Using a theorem by Gauss we prove that majority vote can achieve the same reduction in variance as full precision distributed SGD. Thus, there is great promise for sign-based optimisation schemes to achieve fast communication and fast convergence. Code to reproduce the experiments can be found at https://github.com/jxbz/signSGD . |
Tasks | |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04434v3 |
http://arxiv.org/pdf/1802.04434v3.pdf | |
PWC | https://paperswithcode.com/paper/signsgd-compressed-optimisation-for-non |
Repo | https://github.com/jxbz/signSGD |
Framework | tf |
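A minimal PyTorch sketch of the two mechanisms the abstract describes: the sign-only parameter update, and majority-vote aggregation of 1-bit gradients at the parameter server:

```python
# Minimal PyTorch sketch of the signSGD update and majority-vote aggregation.
import torch

def signsgd_step(params, lr=1e-3):
    """Update each parameter with only the sign of its gradient."""
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * p.grad.sign()    # 1 bit of gradient information per coordinate

def majority_vote(worker_grads):
    """Server side: sign of the sum of worker signs (1-bit in both directions)."""
    return torch.stack([g.sign() for g in worker_grads]).sum(0).sign()

w = torch.randn(10, requires_grad=True)
loss = (w ** 2).sum(); loss.backward()
signsgd_step([w])
print(majority_vote([torch.randn(10) for _ in range(5)]))   # aggregated sign vector
```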
Neural Proximal Gradient Descent for Compressive Imaging
Title | Neural Proximal Gradient Descent for Compressive Imaging |
Authors | Morteza Mardani, Qingyun Sun, Shreyas Vasawanala, Vardan Papyan, Hatef Monajemi, John Pauly, David Donoho |
Abstract | Recovering high-resolution images from limited sensory data typically leads to a severely ill-posed inverse problem, demanding inversion algorithms that effectively capture the prior information. Learning a good inverse mapping from training data faces severe challenges, including: (i) scarcity of training data; (ii) the need for plausible reconstructions that are physically feasible; (iii) the need for fast reconstruction, especially in real-time applications. We develop a system that addresses all of these challenges, using as its basic architecture the recurrent application of the proximal gradient algorithm. We learn a proximal map that works well with real images, based on residual networks. Contraction of the resulting map is analyzed, and incoherence conditions are investigated that drive the convergence of the iterates. Extensive experiments are carried out under different settings: (a) reconstructing abdominal MRI of pediatric patients from highly undersampled Fourier-space data and (b) super-resolving natural face images. Our key findings include: 1. a recurrent ResNet with a single residual block unrolled from an iterative algorithm yields an effective proximal map which accurately reveals MR image details; 2. our architecture significantly outperforms conventional non-recurrent deep ResNets by 2 dB SNR, and is also trained much more rapidly; 3. it outperforms state-of-the-art compressed-sensing wavelet-based methods by 4 dB SNR, with 100x speedups in reconstruction time. |
Tasks | |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.03963v1 |
http://arxiv.org/pdf/1806.03963v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-proximal-gradient-descent-for |
Repo | https://github.com/MortezaMardani/Neural-PGD |
Framework | tf |
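A minimal PyTorch sketch of the unrolled recurrence for a linear inverse problem y = Ax: each iteration takes a gradient step on the data-fidelity term and then applies a learned proximal map, here a small residual network standing in for the paper's ResNet proximal (and left untrained in this sketch):

```python
# Minimal PyTorch sketch of unrolled neural proximal gradient descent.
import torch
import torch.nn as nn

n, m, steps, eta = 64, 32, 5, 0.1
A = torch.randn(m, n) / m ** 0.5                 # undersampled measurement operator
x_true = torch.randn(n)
y = A @ x_true                                   # observed measurements

# Learned proximal map (untrained stand-in; trained end-to-end in the paper).
prox = nn.Sequential(nn.Linear(n, n), nn.ReLU(), nn.Linear(n, n))

x = torch.zeros(n)
for _ in range(steps):                           # recurrent unrolled iterations
    x = x - eta * A.t() @ (A @ x - y)            # gradient step on ||Ax - y||^2
    x = x + prox(x)                              # residual proximal refinement
print(torch.norm(A @ x - y))                     # data-fidelity residual
```

Training would backpropagate a reconstruction loss through all unrolled steps, sharing the proximal network's weights across iterations, which is what makes the architecture recurrent.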
Graphene: Semantically-Linked Propositions in Open Information Extraction
Title | Graphene: Semantically-Linked Propositions in Open Information Extraction |
Authors | Matthias Cetto, Christina Niklaus, André Freitas, Siegfried Handschuh |
Abstract | We present an Open Information Extraction (IE) approach that uses a two-layered transformation stage consisting of a clausal disembedding layer and a phrasal disembedding layer, together with rhetorical relation identification. In that way, we convert sentences that present a complex linguistic structure into simplified, syntactically sound sentences, from which we can extract propositions that are represented in a two-layered hierarchy in the form of core relational tuples and accompanying contextual information which are semantically linked via rhetorical relations. In a comparative evaluation, we demonstrate that our reference implementation Graphene outperforms state-of-the-art Open IE systems in the construction of correct n-ary predicate-argument structures. Moreover, we show that existing Open IE approaches can benefit from the transformation process of our framework. |
Tasks | Open Information Extraction |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11276v1 |
http://arxiv.org/pdf/1807.11276v1.pdf | |
PWC | https://paperswithcode.com/paper/graphene-semantically-linked-propositions-in |
Repo | https://github.com/Lambda-3/Graphene |
Framework | none |
Deep Enhanced Representation for Implicit Discourse Relation Recognition
Title | Deep Enhanced Representation for Implicit Discourse Relation Recognition |
Authors | Hongxiao Bai, Hai Zhao |
Abstract | Implicit discourse relation recognition is a challenging task, as predicting the relation without explicit connectives in discourse parsing requires understanding of text spans and cannot be easily derived from surface features of the input sentence pairs. Thus, properly representing the text is crucial to this task. In this paper, we propose a model augmented with text representations at different granularities, including character, subword, word, sentence, and sentence-pair levels. The proposed deeper model is evaluated on the benchmark treebank and, to the best of our knowledge, achieves state-of-the-art accuracy of greater than 48% in 11-way classification and an $F_1$ score greater than 50% in 4-way classification for the first time. |
Tasks | |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.05154v1 |
http://arxiv.org/pdf/1807.05154v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-enhanced-representation-for-implicit |
Repo | https://github.com/diccooo/Deep_Enhanced_Repr_for_IDRR |
Framework | pytorch |
NeVAE: A Deep Generative Model for Molecular Graphs
Title | NeVAE: A Deep Generative Model for Molecular Graphs |
Authors | Bidisha Samanta, Abir De, Gourhari Jana, Pratim Kumar Chattaraj, Niloy Ganguly, Manuel Gomez-Rodriguez |
Abstract | Deep generative models have been praised for their ability to learn smooth latent representations of images, text, and audio, which can then be used to generate new, plausible data. However, current generative models are unable to work with molecular graphs due to their unique characteristics: their underlying structure is not Euclidean or grid-like, they remain isomorphic under permutation of the node labels, and they vary in their number of nodes and edges. In this paper, we first propose a novel variational autoencoder for molecular graphs, whose encoder and decoder are specially designed to account for the above properties by means of several technical innovations. Moreover, in contrast with the state of the art, our decoder is able to provide the spatial coordinates of the atoms of the molecules it generates. Then, we develop a gradient-based algorithm to optimize the decoder of our model so that it learns to generate molecules that maximize the value of a certain property of interest and, given a molecule of interest, is able to optimize the spatial configuration of its atoms for greater stability. Experiments reveal that our variational autoencoder can discover plausible, diverse and novel molecules more effectively than several state-of-the-art models. Moreover, for several properties of interest, our optimized decoder is able to identify molecules with property values 121% higher than those identified by several state-of-the-art methods based on Bayesian optimization and reinforcement learning. |
Tasks | |
Published | 2018-02-14 |
URL | https://arxiv.org/abs/1802.05283v4 |
https://arxiv.org/pdf/1802.05283v4.pdf | |
PWC | https://paperswithcode.com/paper/nevae-a-deep-generative-model-for-molecular |
Repo | https://github.com/Networks-Learning/nevae |
Framework | tf |