February 1, 2020

3179 words 15 mins read

Paper Group AWR 185


Neural Language Model Based Training Data Augmentation for Weakly Supervised Early Rumor Detection

Title Neural Language Model Based Training Data Augmentation for Weakly Supervised Early Rumor Detection
Authors Sooji Han, Jie Gao, Fabio Ciravegna
Abstract The scarcity and class imbalance of training data are known issues in current rumor detection tasks. We propose a straightforward and general-purpose data augmentation technique which is beneficial to early rumor detection relying on event propagation patterns. The key idea is to exploit massive unlabeled event data sets on social media to augment limited labeled rumor source tweets. This work is based on rumor spreading patterns revealed by recent rumor studies and semantic relatedness between labeled and unlabeled data. A state-of-the-art neural language model (NLM) and large credibility-focused Twitter corpora are employed to learn context-sensitive representations of rumor tweets. Six different real-world events based on three publicly available rumor datasets are employed in our experiments to provide a comparative evaluation of the effectiveness of the method. The results show that our method can expand the size of an existing rumor data set by nearly 200% and the corresponding social context (i.e., conversational threads) by 100% with reasonable quality. Preliminary experiments with a state-of-the-art deep learning-based rumor detection model show that augmented data can alleviate over-fitting and class imbalance caused by limited training data and can help to train complex neural networks (NNs). With augmented data, the performance of rumor detection can be improved by 12.1% in terms of F-score. Our experiments also indicate that augmented training data can help to generalize rumor detection models on unseen rumors.
Tasks Data Augmentation, Language Modelling
Published 2019-07-16
URL https://arxiv.org/abs/1907.07033v1
PDF https://arxiv.org/pdf/1907.07033v1.pdf
PWC https://paperswithcode.com/paper/neural-language-model-based-training-data
Repo https://github.com/soojihan/Multitask4Veracity
Framework none
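
To make the augmentation step concrete, here is a minimal sketch of the semantic-relatedness selection, assuming tweet embeddings from the NLM have already been computed; the 0.8 threshold and the nearest-labeled-neighbor rule are illustrative assumptions, and the full pipeline also exploits propagation patterns:

```python
import numpy as np

def augment_by_similarity(labeled_emb, unlabeled_emb, threshold=0.8):
    """Select unlabeled tweets semantically close to labeled rumor tweets.

    labeled_emb:   (n, d) NLM embeddings of labeled rumor source tweets
    unlabeled_emb: (m, d) embeddings of candidate unlabeled tweets
    Returns indices into unlabeled_emb to add as weakly labeled examples.
    The threshold is a hypothetical value, not the paper's setting.
    """
    a = labeled_emb / np.linalg.norm(labeled_emb, axis=1, keepdims=True)
    b = unlabeled_emb / np.linalg.norm(unlabeled_emb, axis=1, keepdims=True)
    sim = b @ a.T  # (m, n) pairwise cosine similarities
    return np.where(sim.max(axis=1) >= threshold)[0]
```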

Effective Rotation-invariant Point CNN with Spherical Harmonics kernels

Title Effective Rotation-invariant Point CNN with Spherical Harmonics kernels
Authors Adrien Poulenard, Marie-Julie Rakotosaona, Yann Ponty, Maks Ovsjanikov
Abstract We present a novel rotation invariant architecture operating directly on point cloud data. We demonstrate how rotation invariance can be injected into a recently proposed point-based PCNN architecture, at all layers of the network, achieving invariance to both global shape transformations, and to local rotations on the level of patches or parts, useful when dealing with non-rigid objects. We achieve this by employing a spherical-harmonics-based kernel at different layers of the network, which is guaranteed to be invariant to rigid motions. We also introduce a more efficient pooling operation for PCNN using space-partitioning data-structures. This results in a flexible, simple and efficient architecture that achieves accurate results on challenging shape analysis tasks including classification and segmentation, without requiring data augmentation, which is typically employed by non-invariant approaches.
Tasks Data Augmentation
Published 2019-06-27
URL https://arxiv.org/abs/1906.11555v2
PDF https://arxiv.org/pdf/1906.11555v2.pdf
PWC https://paperswithcode.com/paper/effective-rotation-invariant-point-cnn-with
Repo https://github.com/adrienPoulenard/SPHnet
Framework tf
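
The rotation invariance the paper relies on can be illustrated with the spherical harmonics power spectrum: the per-degree energy of a function's SH coefficients is unchanged by rotation. The sketch below computes such a descriptor for a raw point set; it is a toy illustration of the principle, not the paper's layer-wise kernel:

```python
import numpy as np
from scipy.special import sph_harm

def sh_power_spectrum(points, l_max=4):
    """Rotation-invariant descriptor: project unit directions of the
    centered points onto the spherical harmonics Y_l^m and keep the
    per-degree energy sum_m |f_lm|^2, invariant under rotations."""
    p = points - points.mean(axis=0)
    r = np.linalg.norm(p, axis=1, keepdims=True)
    u = p / np.clip(r, 1e-9, None)
    theta = np.arctan2(u[:, 1], u[:, 0])          # azimuthal angle
    phi = np.arccos(np.clip(u[:, 2], -1.0, 1.0))  # polar angle
    spec = []
    for l in range(l_max + 1):
        f_lm = [np.mean(sph_harm(m, l, theta, phi)) for m in range(-l, l + 1)]
        spec.append(sum(abs(c) ** 2 for c in f_lm))
    return np.array(spec)
```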

Compressive Transformers for Long-Range Sequence Modelling

Title Compressive Transformers for Long-Range Sequence Modelling
Authors Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
Abstract We present the Compressive Transformer, an attentive sequence model which compresses past memories for long-range sequence learning. We find the Compressive Transformer obtains state-of-the-art language modelling results in the WikiText-103 and Enwik8 benchmarks, achieving 17.1 ppl and 0.97 bpc respectively. We also find it can model high-frequency speech effectively and can be used as a memory mechanism for RL, demonstrated on an object matching task. To promote the domain of long-range sequence learning, we propose a new open-vocabulary language modelling benchmark derived from books, PG-19.
Tasks Language Modelling
Published 2019-11-13
URL https://arxiv.org/abs/1911.05507v1
PDF https://arxiv.org/pdf/1911.05507v1.pdf
PWC https://paperswithcode.com/paper/compressive-transformers-for-long-range-1
Repo https://github.com/deepmind/pg19
Framework none
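
The central mechanism can be sketched in a few lines: instead of discarding activations that fall out of a fixed-size memory, the oldest memories are mapped into a coarser compressed memory. The schedule below uses average pooling as the compression function under assumed sizes; the paper studies several compression functions, and the attention and training details are omitted here:

```python
import torch.nn.functional as F

class CompressiveMemory:
    """Toy memory schedule: evicted activations are compressed, not dropped.

    mem_len and compression rate c are illustrative; compression here is
    plain average pooling over the sequence dimension."""
    def __init__(self, mem_len=512, c=3):
        self.mem_len, self.c = mem_len, c
        self.mem, self.cmem = [], []  # lists of (seq, d) tensors, newest last

    def update(self, h):
        self.mem.append(h)
        total = sum(m.size(0) for m in self.mem)
        while total > self.mem_len:
            old = self.mem.pop(0)
            total -= old.size(0)
            # (seq, d) -> (seq // c, d): pool along the sequence axis
            comp = F.avg_pool1d(old.t().unsqueeze(0),
                                kernel_size=self.c, stride=self.c)
            self.cmem.append(comp.squeeze(0).t())
```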

Weakly Supervised Domain Detection

Title Weakly Supervised Domain Detection
Authors Yumo Xu, Mirella Lapata
Abstract In this paper we introduce domain detection as a new natural language processing task. We argue that the ability to detect textual segments that are domain-heavy, i.e., sentences or phrases that are representative of and provide evidence for a given domain, could enhance the robustness and portability of various text classification applications. We propose an encoder-detector framework for domain detection and bootstrap classifiers with multiple instance learning (MIL). The model is hierarchically organized and suited to multilabel classification. We demonstrate that despite learning with minimal supervision, our model can be applied to text spans of different granularities, languages, and genres. We also showcase the potential of domain detection for text summarization.
Tasks Multiple Instance Learning, Text Classification, Text Summarization
Published 2019-07-26
URL https://arxiv.org/abs/1907.11499v1
PDF https://arxiv.org/pdf/1907.11499v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-domain-detection
Repo https://github.com/yumoxu/detnet
Framework none
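
A minimal MIL sketch of the encoder-detector idea: sentences are scored per domain, and an attention-weighted aggregation produces the document-level multilabel prediction that the weak document labels supervise. Dimensions and the aggregation choice are assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class MILDomainDetector(nn.Module):
    def __init__(self, sent_dim=256, n_domains=7):
        super().__init__()
        self.detector = nn.Linear(sent_dim, n_domains)  # sentence-level scores
        self.attn = nn.Linear(sent_dim, 1)              # sentence importance

    def forward(self, sents):                  # sents: (n_sents, sent_dim)
        scores = torch.sigmoid(self.detector(sents))  # per-sentence domains
        w = torch.softmax(self.attn(sents), dim=0)    # (n_sents, 1)
        doc = (w * scores).sum(dim=0)                 # (n_domains,)
        return doc, scores  # train doc with BCE; scores localize evidence
```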

Domain Robustness in Neural Machine Translation

Title Domain Robustness in Neural Machine Translation
Authors Mathias Müller, Annette Rios, Rico Sennrich
Abstract Translating text that diverges from the training domain is a key challenge for neural machine translation (NMT). Domain robustness - the generalization of models to unseen test domains - is low compared to statistical machine translation. In this paper, we investigate the performance of NMT on out-of-domain test sets, and ways to improve it. We observe that hallucination (translations that are fluent but unrelated to the source) is common in out-of-domain settings, and we empirically compare methods that improve adequacy (reconstruction), out-of-domain translation (subword regularization), or robustness against adversarial examples (defensive distillation), as well as noisy channel models. In experiments on German to English OPUS data, and German to Romansh, a low-resource scenario, we find that several methods improve domain robustness, with reconstruction standing out as a method that not only improves automatic scores, but also shows improvements in a manual assessment of adequacy, albeit at some loss in fluency. However, out-of-domain performance is still relatively low and domain robustness remains an open problem.
Tasks Machine Translation
Published 2019-11-08
URL https://arxiv.org/abs/1911.03109v1
PDF https://arxiv.org/pdf/1911.03109v1.pdf
PWC https://paperswithcode.com/paper/domain-robustness-in-neural-machine
Repo https://github.com/ZurichNLP/domain-robustness
Framework none
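
Of the methods compared, the noisy channel model is easy to state compactly: candidate translations are rescored by combining the direct model with a reverse ("channel") model and a language model. The sketch below shows the reranking rule; the interpolation weights are illustrative and would be tuned on held-out data:

```python
def noisy_channel_score(log_p_fwd, log_p_rev, log_p_lm, lam1=1.0, lam2=0.3):
    """score(y) = log p(y|x) + lam1 * log p(x|y) + lam2 * log p(y)"""
    return log_p_fwd + lam1 * log_p_rev + lam2 * log_p_lm

# hypothetical (log p(y|x), log p(x|y), log p(y)) triples for two candidates
candidates = [(-1.2, -1.5, -2.0), (-1.0, -3.5, -2.5)]
best = max(candidates, key=lambda s: noisy_channel_score(*s))
```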

Evaluating Protein Transfer Learning with TAPE

Title Evaluating Protein Transfer Learning with TAPE
Authors Roshan Rao, Nicholas Bhattacharya, Neil Thomas, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, Yun S. Song
Abstract Protein modeling is an increasingly popular area of machine learning research. Semi-supervised learning has emerged as an important paradigm in protein modeling due to the high cost of acquiring supervised protein labels, but the current literature is fragmented when it comes to datasets and standardized evaluation techniques. To facilitate progress in this field, we introduce the Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. We curate tasks into specific training, validation, and test splits to ensure that each task tests biologically relevant generalization that transfers to real-life scenarios. We benchmark a range of approaches to semi-supervised protein representation learning, which span recent work as well as canonical sequence learning techniques. We find that self-supervised pretraining is helpful for almost all models on all tasks, more than doubling performance in some cases. Despite this increase, in several cases features learned by self-supervised pretraining still lag behind features extracted by state-of-the-art non-neural techniques. This gap in performance suggests a huge opportunity for innovative architecture design and improved modeling paradigms that better capture the signal in biological sequences. TAPE will help the machine learning community focus effort on scientifically relevant problems. Toward this end, all data and code used to run these experiments are available at https://github.com/songlab-cal/tape.
Tasks Representation Learning, Transfer Learning
Published 2019-06-19
URL https://arxiv.org/abs/1906.08230v1
PDF https://arxiv.org/pdf/1906.08230v1.pdf
PWC https://paperswithcode.com/paper/evaluating-protein-transfer-learning-with
Repo https://github.com/songlab-cal/tape-neurips2019
Framework tf
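
One self-supervised objective of the kind benchmarked in TAPE is masked-token prediction over amino acid sequences. A BERT-style masking sketch (the 15% rate follows BERT; the index choices here are arbitrary):

```python
import torch

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK_ID = len(AMINO_ACIDS)  # extra vocabulary slot for the mask token

def mask_tokens(seq_ids, mask_prob=0.15):
    """Mask positions for self-supervised pretraining; unmasked positions
    get label -100 so cross-entropy ignores them."""
    labels = seq_ids.clone()
    masked = torch.rand(seq_ids.shape) < mask_prob
    labels[~masked] = -100
    return seq_ids.masked_fill(masked, MASK_ID), labels
```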

PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures

Title PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures
Authors Mathieu Carrière, Frédéric Chazal, Yuichi Ike, Théo Lacombe, Martin Royer, Yuhei Umeda
Abstract Persistence diagrams, the most common descriptors of Topological Data Analysis, encode topological properties of data and have already proved pivotal in many different applications of data science. However, since the (metric) space of persistence diagrams is not a Hilbert space, they end up being difficult inputs for most Machine Learning techniques. To address this concern, several vectorization methods have been put forward that embed persistence diagrams into either a finite-dimensional Euclidean space or an (implicit) infinite-dimensional Hilbert space with kernels. In this work, we focus on persistence diagrams built on top of graphs. Relying on extended persistence theory and the so-called heat kernel signature, we show how graphs can be encoded by (extended) persistence diagrams in a provably stable way. We then propose a general and versatile framework for learning vectorizations of persistence diagrams, which encompasses most of the vectorization techniques used in the literature. We finally showcase the experimental strength of our setup by achieving competitive scores on classification tasks on real-life graph datasets.
Tasks Graph Classification, Topological Data Analysis
Published 2019-04-20
URL https://arxiv.org/abs/1904.09378v4
PDF https://arxiv.org/pdf/1904.09378v4.pdf
PWC https://paperswithcode.com/paper/190409378
Repo https://github.com/MathieuCarriere/perslay
Framework tf
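
The learnable vectorization has a compact general form: each diagram point p is mapped by a learnable point transformation phi(p), weighted by a learnable w(p), and aggregated with a permutation-invariant operation. A simplified PyTorch rendering (PersLay itself offers several choices of phi, w, and the aggregation; these dimensions are placeholders):

```python
import torch
import torch.nn as nn

class PersLaySketch(nn.Module):
    """Vectorize a persistence diagram as sum_p w(p) * phi(p)."""
    def __init__(self, out_dim=32):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(2, out_dim), nn.ReLU())
        self.weight = nn.Sequential(nn.Linear(2, 1), nn.Softplus())

    def forward(self, diagram):  # diagram: (n_points, 2) birth-death pairs
        return (self.weight(diagram) * self.phi(diagram)).sum(dim=0)
```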

A Two-Stream Siamese Neural Network for Vehicle Re-Identification by Using Non-Overlapping Cameras

Title A Two-Stream Siamese Neural Network for Vehicle Re-Identification by Using Non-Overlapping Cameras
Authors Icaro O. de Oliveira, Keiko V. O. Fonseca, Rodrigo Minetto
Abstract We describe in this paper a Two-Stream Siamese Neural Network for vehicle re-identification. The proposed network is fed simultaneously with small, coarse patches of the vehicle's shape (96 x 96 pixels) in one stream, and fine features extracted from license plate patches that are easily readable by humans (96 x 48 pixels) in the other. Then, we combine the strengths of both streams by merging the Siamese distance descriptors with a sequence of fully connected layers, as an attempt to tackle a major problem in the field: false alarms caused by a huge number of car designs and models with nearly the same appearance, or by similar license plate strings. In our experiments, with 2 hours of videos containing 2982 vehicles, extracted from two low-cost cameras in the same roadway, 546 ft away, we achieved an F-measure and accuracy of 92.6% and 98.7%, respectively. We show that the proposed network, available at https://github.com/icarofua/siamese-two-stream, outperforms other One-Stream architectures, even if they use higher resolution image features.
Tasks Vehicle Re-Identification
Published 2019-02-04
URL https://arxiv.org/abs/1902.01496v4
PDF https://arxiv.org/pdf/1902.01496v4.pdf
PWC https://paperswithcode.com/paper/a-two-stream-siamese-neural-network-for
Repo https://github.com/icarofua/siamese-two-stream
Framework tf
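
The fusion idea can be sketched as follows: each stream is a Siamese network producing an elementwise distance descriptor for the pair, and the two descriptors are merged by fully connected layers into a same/different decision. The backbones and dimensions below are placeholders:

```python
import torch
import torch.nn as nn

class TwoStreamSiamese(nn.Module):
    def __init__(self, shape_net, plate_net, feat_dim=128):
        super().__init__()
        self.shape_net, self.plate_net = shape_net, plate_net  # CNN embedders
        self.head = nn.Sequential(nn.Linear(2 * feat_dim, 64),
                                  nn.ReLU(), nn.Linear(64, 2))

    def forward(self, shape_a, shape_b, plate_a, plate_b):
        # Siamese distance descriptor per stream (shared weights per stream)
        d_shape = torch.abs(self.shape_net(shape_a) - self.shape_net(shape_b))
        d_plate = torch.abs(self.plate_net(plate_a) - self.plate_net(plate_b))
        return self.head(torch.cat([d_shape, d_plate], dim=1))
```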

Differentiable Linearized ADMM

Title Differentiable Linearized ADMM
Authors Xingyu Xie, Jianlong Wu, Zhisheng Zhong, Guangcan Liu, Zhouchen Lin
Abstract Recently, a number of learning-based optimization methods that combine data-driven architectures with the classical optimization algorithms have been proposed and explored, showing superior empirical performance in solving various ill-posed inverse problems, but there is still a scarcity of rigorous analysis of the convergence behavior of learning-based optimization. In particular, most existing analyses are specific to unconstrained problems and cannot be applied to the more general cases where some variables of interest are subject to certain constraints. In this paper, we propose Differentiable Linearized ADMM (D-LADMM) for solving problems with linear constraints. Specifically, D-LADMM is a K-layer LADMM-inspired deep neural network, which is obtained by first introducing some learnable weights in the classical Linearized ADMM algorithm and then generalizing the proximal operator to some learnable activation function. Notably, we rigorously prove that there exists a set of learnable parameters for D-LADMM to generate globally converged solutions, and we show that those desired parameters can be attained by training D-LADMM in a proper way. To the best of our knowledge, we are the first to provide a convergence analysis for learning-based optimization methods on constrained problems.
Tasks
Published 2019-05-15
URL https://arxiv.org/abs/1905.06179v1
PDF https://arxiv.org/pdf/1905.06179v1.pdf
PWC https://paperswithcode.com/paper/differentiable-linearized-admm
Repo https://github.com/zzs1994/D-LADMM
Framework pytorch
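
A toy cousin of the idea, not the paper's exact parameterization: unroll K linearized-ADMM-style updates for min ||x||_1 s.t. Ax = b, make the per-layer step size and threshold learnable, and keep soft-thresholding as the proximal activation (the paper generalizes this activation further):

```python
import torch
import torch.nn as nn

class DLADMMSketch(nn.Module):
    def __init__(self, A, K=10):
        super().__init__()
        self.A = A                                          # (m, n), fixed
        self.step = nn.Parameter(torch.full((K,), 0.1))     # learnable steps
        self.thresh = nn.Parameter(torch.full((K,), 0.05))  # learnable prox

    def forward(self, b):                         # b: (m,)
        m, n = self.A.shape
        x, lam = torch.zeros(n), torch.zeros(m)   # primal / scaled dual
        for k in range(self.step.numel()):
            grad = self.A.t() @ (self.A @ x - b + lam)  # linearized term
            x = x - self.step[k] * grad
            x = torch.sign(x) * torch.clamp(x.abs() - self.thresh[k], min=0)
            lam = lam + self.A @ x - b            # dual ascent
        return x
```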

Efficient Learning for Deep Quantum Neural Networks

Title Efficient Learning for Deep Quantum Neural Networks
Authors Kerstin Beer, Dmytro Bondarenko, Terry Farrelly, Tobias J. Osborne, Robert Salzmann, Ramona Wolf
Abstract Neural networks enjoy widespread success in both research and industry and, with the imminent advent of quantum technology, it is now a crucial challenge to design quantum neural networks for fully quantum learning tasks. Here we propose the use of quantum neurons as a building block for quantum feed-forward neural networks capable of universal quantum computation. We describe the efficient training of these networks using the fidelity as a cost function and provide both classical and efficient quantum implementations. Our method allows for fast optimisation with reduced memory requirements: the number of qudits required scales with only the width, allowing the optimisation of deep networks. We benchmark our proposal for the quantum task of learning an unknown unitary and find remarkable generalisation behaviour and a striking robustness to noisy training data.
Tasks
Published 2019-02-27
URL http://arxiv.org/abs/1902.10445v1
PDF http://arxiv.org/pdf/1902.10445v1.pdf
PWC https://paperswithcode.com/paper/efficient-learning-for-deep-quantum-neural
Repo https://github.com/R8monaW/DeepQNN
Framework none
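
The fidelity cost function has a simple pure-state form: C = (1/N) sum_x |<phi_target^x | psi_out^x>|^2, which equals 1 exactly when every output state matches its target. A classical simulation sketch:

```python
import numpy as np

def fidelity_cost(outputs, targets):
    """Average fidelity between output and target pure states
    (complex unit vectors); training maximizes this quantity."""
    return np.mean([abs(np.vdot(t, o)) ** 2 for o, t in zip(outputs, targets)])

psi = np.array([1, 0, 0, 0], dtype=complex)               # |00>
phi = np.array([1, 1, 0, 0], dtype=complex) / np.sqrt(2)  # (|00>+|01>)/sqrt(2)
print(fidelity_cost([psi], [phi]))  # 0.5
```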

Tell Me What They’re Holding: Weakly-supervised Object Detection with Transferable Knowledge from Human-object Interaction

Title Tell Me What They’re Holding: Weakly-supervised Object Detection with Transferable Knowledge from Human-object Interaction
Authors Daesik Kim, Gyujeong Lee, Jisoo Jeong, Nojun Kwak
Abstract In this work, we introduce a novel weakly supervised object detection (WSOD) paradigm to detect objects belonging to rare classes with few examples, using transferable knowledge from human-object interactions (HOI). While WSOD shows lower performance than full supervision, we mainly focus on HOI as the main context which can strongly supervise complex semantics in images. Therefore, we propose a novel module called RRPN (relational region proposal network) which outputs an object-localizing attention map using only human poses and action verbs. In the source domain, we fully train an object detector and the RRPN with full supervision of HOI. With the localization knowledge transferred from the trained RRPN, a new object detector can learn unseen objects with weak verbal supervision of HOI, without bounding box annotations, in the target domain. Because the RRPN is designed as an add-on module, we can apply it not only to object detection but also to other domains such as semantic segmentation. The experimental results on the HICO-DET dataset show the possibility that the proposed method can be a cheap alternative for the current supervised object detection paradigm. Moreover, qualitative results demonstrate that our model can properly localize unseen objects on the HICO-DET and V-COCO datasets.
Tasks Human-Object Interaction Detection, Object Detection, Semantic Segmentation, Weakly Supervised Object Detection
Published 2019-11-19
URL https://arxiv.org/abs/1911.08141v1
PDF https://arxiv.org/pdf/1911.08141v1.pdf
PWC https://paperswithcode.com/paper/tell-me-what-theyre-holding-weakly-supervised
Repo https://github.com/vt-vl-lab/reading_group
Framework none
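
The RRPN's role can be caricatured in a few lines: pose features and a verb embedding form a context vector that scores region proposals, yielding an object-localizing attention map without box supervision. Everything below (dimensions, vocabulary size, the scoring head) is a hypothetical simplification:

```python
import torch
import torch.nn as nn

class RRPNSketch(nn.Module):
    def __init__(self, pose_dim=34, n_verbs=117, emb_dim=64, hidden=128):
        super().__init__()
        self.verb_emb = nn.Embedding(n_verbs, emb_dim)
        self.ctx = nn.Sequential(nn.Linear(pose_dim + emb_dim, hidden),
                                 nn.ReLU())
        self.score = nn.Linear(hidden + 4, 1)  # context + proposal box coords

    def forward(self, pose, verb, boxes):
        # pose: (B, pose_dim); verb: (B,) long; boxes: (B, n_props, 4)
        h = self.ctx(torch.cat([pose, self.verb_emb(verb)], dim=-1))
        h = h.unsqueeze(1).expand(-1, boxes.size(1), -1)
        logits = self.score(torch.cat([h, boxes], dim=-1)).squeeze(-1)
        return torch.softmax(logits, dim=1)    # attention over proposals
```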

Studying Cultural Differences in Emoji Usage across the East and the West

Title Studying Cultural Differences in Emoji Usage across the East and the West
Authors Sharath Chandra Guntuku, Mingyang Li, Louis Tay, Lyle H. Ungar
Abstract Global acceptance of Emojis suggests a cross-cultural, normative use of Emojis. Meanwhile, nuances in Emoji use across cultures may also exist due to linguistic differences in expressing emotions and diversity in conceptualizing topics. Indeed, literature in cross-cultural psychology has found both normative and culture-specific ways in which emotions are expressed. In this paper, using social media, we compare the Emoji usage based on frequency, context, and topic associations across countries in the East (China and Japan) and the West (United States, United Kingdom, and Canada). Across the East and the West, our study examines a) similarities and differences in the usage of different categories of Emojis such as People, Food & Drink, Travel & Places etc., b) potential mapping of Emoji use differences with previously identified cultural differences in users’ expression about diverse concepts such as death, money, emotions, and family, and c) relative correspondence of validated psycho-linguistic categories with Ekman’s emotions. The analysis of Emoji use in the East and the West reveals recognizable normative and culture-specific patterns. This research reveals the ways in which Emojis can be used for cross-cultural communication.
Tasks
Published 2019-04-04
URL http://arxiv.org/abs/1904.02671v1
PDF http://arxiv.org/pdf/1904.02671v1.pdf
PWC https://paperswithcode.com/paper/studying-cultural-differences-in-emoji-usage
Repo https://github.com/tslmy/ICWSM2019
Framework none

Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction

Title Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction
Authors Chen-Hsuan Lin, Oliver Wang, Bryan C. Russell, Eli Shechtman, Vladimir G. Kim, Matthew Fisher, Simon Lucey
Abstract In this paper, we address the problem of 3D object mesh reconstruction from RGB videos. Our approach combines the best of multi-view geometric and data-driven methods for 3D reconstruction by optimizing object meshes for multi-view photometric consistency while constraining mesh deformations with a shape prior. We pose this as a piecewise image alignment problem for each mesh face projection. Our approach allows us to update shape parameters from the photometric error without any depth or mask information. Moreover, we show how to avoid a degeneracy of zero photometric gradients via rasterizing from a virtual viewpoint. We demonstrate 3D object mesh reconstruction results from both synthetic and real-world videos with our photometric mesh optimization, which is unachievable with either naïve mesh generation networks or traditional pipelines of surface reconstruction without heavy manual post-processing.
Tasks 3D Object Reconstruction, 3D Reconstruction, Object Reconstruction
Published 2019-03-20
URL http://arxiv.org/abs/1903.08642v1
PDF http://arxiv.org/pdf/1903.08642v1.pdf
PWC https://paperswithcode.com/paper/photometric-mesh-optimization-for-video
Repo https://github.com/chenhsuanlin/photometric-mesh-optim
Framework pytorch
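
The core loss is photometric: sample both frames at the projections of the same mesh surface points and penalize the intensity difference, so gradients flow back to the mesh (and shape-prior code) through the projection. A stripped-down sketch that assumes the projection and visibility handling are done elsewhere:

```python
import torch.nn.functional as F

def photometric_loss(img_a, img_b, uv_a, uv_b):
    """img_*: (1, C, H, W) video frames; uv_*: (1, n_pts, 1, 2) projected
    coordinates of the same surface points in each view, in [-1, 1].
    Differentiable in uv_*, hence in the mesh vertices behind them."""
    sample_a = F.grid_sample(img_a, uv_a, align_corners=True)  # (1, C, n, 1)
    sample_b = F.grid_sample(img_b, uv_b, align_corners=True)
    return (sample_a - sample_b).abs().mean()
```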

What Would You Expect? Anticipating Egocentric Actions with Rolling-Unrolling LSTMs and Modality Attention

Title What Would You Expect? Anticipating Egocentric Actions with Rolling-Unrolling LSTMs and Modality Attention
Authors Antonino Furnari, Giovanni Maria Farinella
Abstract Egocentric action anticipation consists in understanding which objects the camera wearer will interact with in the near future and which actions they will perform. We tackle the problem proposing an architecture able to anticipate actions at multiple temporal scales using two LSTMs to 1) summarize the past, and 2) formulate predictions about the future. The input video is processed considering three complementary modalities: appearance (RGB), motion (optical flow) and objects (object-based features). Modality-specific predictions are fused using a novel Modality ATTention (MATT) mechanism which learns to weigh modalities in an adaptive fashion. Extensive evaluations on two large-scale benchmark datasets show that our method outperforms prior art by up to +7% on the challenging EPIC-Kitchens dataset including more than 2500 actions, and generalizes to EGTEA Gaze+. Our approach is also shown to generalize to the tasks of early action recognition and action recognition. Our method is ranked first in the public leaderboard of the EPIC-Kitchens egocentric action anticipation challenge 2019. Please see our web pages for code and examples: http://iplab.dmi.unict.it/rulstm - https://github.com/fpv-iplab/rulstm.
Tasks Optical Flow Estimation, Temporal Action Localization
Published 2019-05-22
URL https://arxiv.org/abs/1905.09035v2
PDF https://arxiv.org/pdf/1905.09035v2.pdf
PWC https://paperswithcode.com/paper/what-would-you-expect-anticipating-egocentric
Repo https://github.com/antoninofurnari/rulstm
Framework pytorch
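
The MATT fusion step is easy to sketch: attention weights over the modalities are computed from the concatenated modality features and used to mix the per-modality predictions. Feature and class dimensions below are placeholders:

```python
import torch
import torch.nn as nn

class MATTSketch(nn.Module):
    def __init__(self, n_modalities=3, feat_dim=1024, n_classes=2513):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(n_modalities * feat_dim, 256),
                                  nn.ReLU(), nn.Linear(256, n_modalities))

    def forward(self, feats, scores):
        # feats: list of (B, feat_dim); scores: list of (B, n_classes)
        w = torch.softmax(self.attn(torch.cat(feats, dim=1)), dim=1)  # (B, M)
        return (w.unsqueeze(-1) * torch.stack(scores, dim=1)).sum(dim=1)
```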

‘Project & Excite’ Modules for Segmentation of Volumetric Medical Scans

Title ‘Project & Excite’ Modules for Segmentation of Volumetric Medical Scans
Authors Anne-Marie Rickmann, Abhijit Guha Roy, Ignacio Sarasua, Nassir Navab, Christian Wachinger
Abstract Fully Convolutional Neural Networks (F-CNNs) achieve state-of-the-art performance for image segmentation in medical imaging. Recently, squeeze and excitation (SE) modules and variations thereof have been introduced to recalibrate feature maps channel- and spatial-wise, which can boost performance while only minimally increasing model complexity. So far, the development of SE has focused on 2D images. In this paper, we propose ‘Project & Excite’ (PE) modules that build upon the ideas of SE and extend them to operate on 3D volumetric images. ‘Project & Excite’ does not perform global average pooling but squeezes feature maps along different slices of a tensor separately to retain more spatial information, which is subsequently used in the excitation step. We demonstrate that PE modules can be easily integrated into 3D U-Net, boosting performance by 5% Dice points while increasing model complexity by only 2%. We evaluate the PE module on two challenging tasks: whole-brain segmentation of MRI scans and whole-body segmentation of CT scans. Code: https://github.com/ai-med/squeeze_and_excitation
Tasks Brain Segmentation, Semantic Segmentation
Published 2019-06-11
URL https://arxiv.org/abs/1906.04649v2
PDF https://arxiv.org/pdf/1906.04649v2.pdf
PWC https://paperswithcode.com/paper/project-excite-modules-for-segmentation-of
Repo https://github.com/ai-med/squeeze_and_excitation
Framework pytorch
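
The module itself is compact. A sketch for 5D inputs (B, C, D, H, W): average-project the volume onto each spatial axis, recombine the projections by broadcasting, then excite channels with 1x1x1 convolutions; the reduction ratio is illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectExciteSketch(nn.Module):
    def __init__(self, channels, r=2):
        super().__init__()
        self.fc1 = nn.Conv3d(channels, channels // r, kernel_size=1)
        self.fc2 = nn.Conv3d(channels // r, channels, kernel_size=1)

    def forward(self, x):  # x: (B, C, D, H, W)
        p_d = x.mean(dim=(3, 4), keepdim=True)  # (B, C, D, 1, 1)
        p_h = x.mean(dim=(2, 4), keepdim=True)  # (B, C, 1, H, 1)
        p_w = x.mean(dim=(2, 3), keepdim=True)  # (B, C, 1, 1, W)
        z = p_d + p_h + p_w                     # broadcasts to (B, C, D, H, W)
        return x * torch.sigmoid(self.fc2(F.relu(self.fc1(z))))
```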