February 1, 2020

3246 words 16 mins read

Paper Group AWR 303

Revealing quantum chaos with machine learning. Competitive Gradient Descent. A review of domain adaptation without target labels. Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization without Accessing Target Domain Data. Current Limitations in Cyberbullying Detection: on Evaluation Criteria, Reproducibility, and Data Scarcity, and more.

Revealing quantum chaos with machine learning

Title Revealing quantum chaos with machine learning
Authors Y. A. Kharkov, V. E. Sotskov, A. A. Karazeev, E. O. Kiktenko, A. K. Fedorov
Abstract Understanding properties of quantum matter is an outstanding challenge in science. In this paper, we demonstrate how machine-learning methods can be successfully applied for the classification of various regimes in single-particle and many-body systems. We realize neural network algorithms that perform a classification between regular and chaotic behavior in quantum billiard models with remarkably high accuracy. We use a variational autoencoder for auto-supervised classification of regular/chaotic wave functions, and demonstrate that variational autoencoders can also serve as a tool for detecting anomalous quantum states, such as quantum scars. By taking this method further, we show that machine learning techniques allow us to pin down the transition from integrability to many-body quantum chaos in Heisenberg XXZ spin chains. For both cases, we confirm the existence of universal W shapes that characterize the transition. Our results pave the way for exploring the power of machine learning tools for revealing exotic phenomena in quantum many-body systems.
Tasks
Published 2019-02-25
URL https://arxiv.org/abs/1902.09216v2
PDF https://arxiv.org/pdf/1902.09216v2.pdf
PWC https://paperswithcode.com/paper/revealing-quantum-chaos-with-machine-learning
Repo https://github.com/yourball/QML_chaos
Framework pytorch
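
A quick way to see the anomaly-detection idea from this abstract in code: train a VAE on wave functions and flag the states it reconstructs poorly (e.g. quantum scars). This is only a minimal sketch, not the authors' architecture; the layer sizes, input dimension, and scoring rule are placeholders.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Tiny VAE; input is a flattened wave-function amplitude grid."""
    def __init__(self, n_in=1024, n_latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_in, 256), nn.ReLU())
        self.mu = nn.Linear(256, n_latent)
        self.logvar = nn.Linear(256, n_latent)
        self.dec = nn.Sequential(nn.Linear(n_latent, 256), nn.ReLU(),
                                 nn.Linear(256, n_in))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def anomaly_score(model, x):
    # States a trained VAE reconstructs poorly (high error) are
    # candidates for anomalous states such as quantum scars.
    with torch.no_grad():
        recon, _, _ = model(x)
        return ((recon - x) ** 2).mean(dim=-1)
```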

Competitive Gradient Descent

Title Competitive Gradient Descent
Authors Florian Schäfer, Anima Anandkumar
Abstract We introduce a new algorithm for the numerical computation of Nash equilibria of competitive two-player games. Our method is a natural generalization of gradient descent to the two-player setting where the update is given by the Nash equilibrium of a regularized bilinear local approximation of the underlying game. It avoids oscillatory and divergent behaviors seen in alternating gradient descent. Using numerical experiments and rigorous analysis, we provide a detailed comparison to methods based on \emph{optimism} and \emph{consensus} and show that our method avoids making any unnecessary changes to the gradient dynamics while achieving exponential (local) convergence for (locally) convex-concave zero-sum games. Convergence and stability properties of our method are robust to strong interactions between the players, without adapting the stepsize, which is not the case with previous methods. In our numerical experiments on non-convex-concave problems, existing methods are prone to divergence and instability due to their sensitivity to interactions among the players, whereas we never observe divergence of our algorithm. The ability to choose larger stepsizes furthermore allows our algorithm to achieve faster convergence, as measured by the number of model evaluations.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.12103v2
PDF https://arxiv.org/pdf/1905.12103v2.pdf
PWC https://paperswithcode.com/paper/competitive-gradient-descent
Repo https://github.com/GopiKishan14/Reproducibility_Challenge_NeurIPS_2019
Framework pytorch
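
The update has a closed form; specialized to a zero-sum bilinear game f(x, y) = x^T A y, each player's step is the Nash equilibrium of the regularized bilinear local game. A minimal NumPy sketch (the step size and test matrix are arbitrary choices, not the paper's settings):

```python
import numpy as np

def cgd_step(x, y, A, eta=0.5):
    """One competitive-gradient-descent step for the zero-sum bilinear
    game f(x, y) = x^T A y (x minimizes, y maximizes). For this game:
    grad_x f = A y, grad_y f = A^T x, D^2_xy f = A, D^2_yx f = A^T."""
    Ix, Iy = np.eye(len(x)), np.eye(len(y))
    gx, gy = A @ y, A.T @ x
    dx = np.linalg.solve(Ix + eta**2 * A @ A.T, gx + eta * A @ gy)
    dy = np.linalg.solve(Iy + eta**2 * A.T @ A, gy - eta * A.T @ gx)
    return x - eta * dx, y + eta * dy

# Plain simultaneous gradient descent spirals outward on this game;
# the preconditioned CGD step contracts toward the Nash equilibrium at 0.
rng = np.random.default_rng(0)
A = np.eye(2)
x, y = rng.normal(size=2), rng.normal(size=2)
for _ in range(100):
    x, y = cgd_step(x, y, A)
print(np.linalg.norm(x), np.linalg.norm(y))  # both near 0
```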

A review of domain adaptation without target labels

Title A review of domain adaptation without target labels
Authors Wouter M. Kouw, Marco Loog
Abstract Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based, and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting, and representing features such that a source classifier performs well on the target domain, while inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.
Tasks Domain Adaptation, Domain Generalization, Unsupervised Domain Adaptation
Published 2019-01-16
URL https://arxiv.org/abs/1901.05335v2
PDF https://arxiv.org/pdf/1901.05335v2.pdf
PWC https://paperswithcode.com/paper/a-review-of-single-source-unsupervised-domain
Repo https://github.com/wmkouw/libTLDA
Framework none
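
To make the sample-based category concrete: one common instance estimates importance weights with a domain discriminator and trains a weighted source classifier. A minimal scikit-learn sketch on toy data (an illustration of the idea, not code from the libTLDA repository):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Source: labeled; target: unlabeled and covariate-shifted.
Xs = rng.normal(0.0, 1.0, (500, 2)); ys = (Xs[:, 0] > 0).astype(int)
Xt = rng.normal(0.5, 1.0, (500, 2))

# 1. Train a domain discriminator: source = 0 vs target = 1.
disc = LogisticRegression().fit(np.vstack([Xs, Xt]),
                                np.r_[np.zeros(500), np.ones(500)])
# 2. Importance weight of each source point: p(target|x) / p(source|x).
p = disc.predict_proba(Xs)
w = p[:, 1] / p[:, 0]

# 3. Fit the source classifier with those weights, so training
#    emphasizes source points that resemble the target domain.
clf = LogisticRegression().fit(Xs, ys, sample_weight=w)
print(clf.predict(Xt[:5]))
```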

Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization without Accessing Target Domain Data

Title Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization without Accessing Target Domain Data
Authors Xiangyu Yue, Yang Zhang, Sicheng Zhao, Alberto Sangiovanni-Vincentelli, Kurt Keutzer, Boqing Gong
Abstract We propose to harness the potential of simulation for the semantic segmentation of real-world self-driving scenes in a domain generalization fashion. The segmentation network is trained without any data of target domains and tested on the unseen target domains. To this end, we propose a new approach of domain randomization and pyramid consistency to learn a model with high generalizability. First, we propose to randomize the synthetic images with the styles of real images in terms of visual appearances using auxiliary datasets, in order to effectively learn domain-invariant representations. Second, we further enforce pyramid consistency across different “stylized” images and within an image, in order to learn domain-invariant and scale-invariant features, respectively. Extensive experiments are conducted on the generalization from GTA and SYNTHIA to Cityscapes, BDDS and Mapillary; and our method achieves superior results over the state-of-the-art techniques. Remarkably, our generalization results are on par with or even better than those obtained by state-of-the-art simulation-to-real domain adaptation methods, which access the target domain data at training time.
Tasks Domain Adaptation, Domain Generalization, Semantic Segmentation
Published 2019-09-02
URL https://arxiv.org/abs/1909.00889v1
PDF https://arxiv.org/pdf/1909.00889v1.pdf
PWC https://paperswithcode.com/paper/domain-randomization-and-pyramid-consistency
Repo https://github.com/xyyue/DRPC
Framework pytorch
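
A rough sketch of the pyramid-consistency idea: pool the segmentation outputs for several stylized copies of the same image at multiple pyramid levels, and penalize deviation from the cross-style mean. The pooling levels and the L1 penalty below are assumptions; consult the paper for the exact loss:

```python
import torch
import torch.nn.functional as F

def pyramid_consistency(prob_maps, levels=(1, 2, 4)):
    """prob_maps: (S, C, H, W) softmax outputs for S stylized copies of
    the SAME image. At each pyramid level, average-pool the maps and
    penalize deviation from the cross-style mean, encouraging
    style-invariant (and, via the pyramid, scale-aware) predictions."""
    loss = 0.0
    for k in levels:
        pooled = F.adaptive_avg_pool2d(prob_maps, k)  # (S, C, k, k)
        mean = pooled.mean(dim=0, keepdim=True)       # cross-style mean
        loss = loss + (pooled - mean).abs().mean()
    return loss / len(levels)

# usage: probs = seg_net(stylized_batch).softmax(dim=1)
#        total = seg_loss + lambda_pc * pyramid_consistency(probs)
```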

Current Limitations in Cyberbullying Detection: on Evaluation Criteria, Reproducibility, and Data Scarcity

Title Current Limitations in Cyberbullying Detection: on Evaluation Criteria, Reproducibility, and Data Scarcity
Authors Chris Emmery, Ben Verhoeven, Guy De Pauw, Gilles Jacobs, Cynthia Van Hee, Els Lefever, Bart Desmet, Véronique Hoste, Walter Daelemans
Abstract The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data. Nevertheless, while computational power and affordability of resources continue to increase, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability. In this paper, we further illustrate these issues, as we (i) evaluate many publicly available resources for this task and demonstrate difficulties with data collection. These predominantly yield small datasets that fail to capture the required complex social dynamics and impede direct comparison of progress. We (ii) conduct an extensive set of experiments that indicate a general lack of cross-domain generalization of classifiers trained on these sources, and openly provide this framework to replicate and extend our evaluation criteria. Finally, we (iii) present an effective crowdsourcing method: simulating real-life bullying scenarios in a lab setting generates plausible data that can be effectively used to enrich real data. This largely circumvents the restrictions on data that can be collected, and increases classifier performance. We believe these contributions can aid in improving the empirical practices of future research in the field.
Tasks Domain Generalization
Published 2019-10-25
URL https://arxiv.org/abs/1910.11922v1
PDF https://arxiv.org/pdf/1910.11922v1.pdf
PWC https://paperswithcode.com/paper/current-limitations-in-cyberbullying
Repo https://github.com/sweta20/Detecting-Cyberbullying-Across-SMPs
Framework tf
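
The cross-domain experiments in (ii) follow a train-on-one, test-on-the-others protocol that is easy to replicate; a minimal scikit-learn sketch (the corpora dictionary and the TF-IDF/logistic-regression pipeline are placeholders, not the paper's exact setup):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

def cross_domain_eval(corpora):
    """corpora: {name: (texts, 0/1 labels)}, e.g. loaded from the
    public datasets the paper surveys. Trains on each corpus and
    reports positive-class F1 on every other corpus."""
    for train_name, (Xtr, ytr) in corpora.items():
        model = make_pipeline(TfidfVectorizer(min_df=2),
                              LogisticRegression(max_iter=1000))
        model.fit(Xtr, ytr)
        for test_name, (Xte, yte) in corpora.items():
            if test_name == train_name:
                continue  # in-domain scores hide the generalization gap
            print(train_name, "->", test_name,
                  f1_score(yte, model.predict(Xte), pos_label=1))
```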

Diametrical Risk Minimization: Theory and Computations

Title Diametrical Risk Minimization: Theory and Computations
Authors Matthew Norton, Johannes O. Royset
Abstract The theoretical and empirical performance of Empirical Risk Minimization (ERM) often suffers when loss functions are poorly behaved with large Lipschitz moduli and spurious sharp minimizers. We propose and analyze a counterpart to ERM called Diametrical Risk Minimization (DRM), which accounts for worst-case empirical risks within neighborhoods in parameter space. DRM has generalization bounds that are independent of Lipschitz moduli for convex as well as nonconvex problems and it can be implemented using a practical algorithm based on stochastic gradient descent. Numerical results illustrate the ability of DRM to find quality solutions with low generalization error in chaotic landscapes from benchmark neural network classification problems with corrupted labels.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.10844v2
PDF https://arxiv.org/pdf/1910.10844v2.pdf
PWC https://paperswithcode.com/paper/diametrical-risk-minimization-theory-and
Repo https://github.com/matthew-norton/Diametrical_Learning
Framework pytorch
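
A sketch of how the worst-case-in-a-neighborhood risk can be plugged into SGD: sample weight perturbations in a ball around the current parameters, evaluate the loss at each, and take the gradient at the worst sampled point. The Gaussian sampling scheme and radius are illustrative, not the paper's algorithmic details:

```python
import torch

def drm_loss(model, loss_fn, x, y, radius=0.05, n_samples=10):
    """Approximate the diametrical (worst-case) empirical risk over a
    neighborhood in parameter space, and return the loss at the worst
    sampled perturbation so SGD can descend it."""
    base = [p.detach().clone() for p in model.parameters()]
    worst, worst_eps = -float("inf"), None
    with torch.no_grad():
        for _ in range(n_samples):
            eps = [radius * torch.randn_like(p) for p in base]
            for p, b, e in zip(model.parameters(), base, eps):
                p.copy_(b + e)
            l = loss_fn(model(x), y).item()
            if l > worst:
                worst, worst_eps = l, eps
    # Evaluate (with gradients) at the worst sampled perturbation.
    for p, b, e in zip(model.parameters(), base, worst_eps):
        p.data.copy_(b + e)
    return loss_fn(model(x), y), base

# usage: loss, base = drm_loss(model, criterion, x, y)
#        loss.backward()                       # grads at the worst point
#        for p, b in zip(model.parameters(), base): p.data.copy_(b)
#        optimizer.step()                       # step from the base weights
```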

Re-balancing Variational Autoencoder Loss for Molecule Sequence Generation

Title Re-balancing Variational Autoencoder Loss for Molecule Sequence Generation
Authors Chaochao Yan, Sheng Wang, Jinyu Yang, Tingyang Xu, Junzhou Huang
Abstract Molecule generation aims to design new molecules with specific chemical properties and further to optimize the desired chemical properties. Following previous work, we encode molecules into continuous vectors in the latent space and then decode the vectors into molecules under the variational autoencoder (VAE) framework. We investigate the posterior collapse problem of current RNN-based VAEs for molecule sequence generation. For the first time, we find that underestimated reconstruction loss leads to posterior collapse, and provide both theoretical and experimental evidence. We propose an effective and efficient solution to fix the problem and avoid posterior collapse. Without bells and whistles, our method achieves SOTA reconstruction accuracy and competitive validity on the ZINC 250K dataset. When generating 10,000 unique valid SMILES from random prior sampling, JT-VAE takes 1450s while our method needs only 9s. Our implementation is at https://github.com/chaoyan1037/Re-balanced-VAE.
Tasks
Published 2019-10-01
URL https://arxiv.org/abs/1910.00698v2
PDF https://arxiv.org/pdf/1910.00698v2.pdf
PWC https://paperswithcode.com/paper/re-balancing-variational-autoencoder-loss-for
Repo https://github.com/chaoyan1037/Re-balanced-VAE
Framework pytorch
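
The re-balancing idea can be illustrated as a re-weighted ELBO in which the reconstruction term is scaled up so the KL term cannot collapse the posterior. The weight below is a placeholder; the paper derives how the terms should actually be balanced:

```python
import torch
import torch.nn.functional as F

def rebalanced_vae_loss(logits, targets, mu, logvar, recon_weight=2.0):
    """Sequence-VAE loss with the reconstruction term re-weighted upward
    to counter an underestimated reconstruction loss.
    logits: (B, T, vocab) decoder outputs over SMILES tokens,
    targets: (B, T) token ids; mu, logvar: (B, latent)."""
    # Token-level cross-entropy, summed over the sequence.
    recon = F.cross_entropy(logits.transpose(1, 2), targets,
                            reduction="none").sum(dim=1).mean()
    # Analytic KL between the Gaussian posterior and the standard prior.
    kl = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(dim=1).mean()
    return recon_weight * recon + kl
```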

Visual Natural Language Query Auto-Completion for Estimating Instance Probabilities

Title Visual Natural Language Query Auto-Completion for Estimating Instance Probabilities
Authors Samuel Sharpe, Jin Yan, Fan Wu, Iddo Drori
Abstract We present a new task of query auto-completion for estimating instance probabilities. We complete a user query prefix conditioned upon an image. Given the complete query, we fine-tune a BERT embedding for estimating probabilities of a broad set of instances. The resulting instance probabilities are used for selection while being agnostic to the segmentation or attention mechanism. Our results demonstrate that auto-completion using both language and vision performs better than using only language, and that fine-tuning a BERT embedding allows us to efficiently rank instances in the image. In the spirit of reproducible research we make our data, models, and code available.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04887v1
PDF https://arxiv.org/pdf/1910.04887v1.pdf
PWC https://paperswithcode.com/paper/visual-natural-language-query-auto-completion
Repo https://github.com/ssharpe42/VNLQAC
Framework tf
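
A minimal sketch of the ranking stage: embed the completed query with BERT and map the [CLS] vector to a distribution over instances. The instance vocabulary, [CLS] pooling, and linear head are assumptions; the auto-completion model that produces the query is a separate component:

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

# Hypothetical instance vocabulary; the paper's set is much broader.
INSTANCES = ["person", "dog", "car", "chair"]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
head = nn.Linear(bert.config.hidden_size, len(INSTANCES))  # fine-tuned jointly

def instance_probabilities(completed_query):
    """Embed the query and return a distribution over instances."""
    inputs = tokenizer(completed_query, return_tensors="pt")
    cls = bert(**inputs).last_hidden_state[:, 0]  # (1, hidden) [CLS] vector
    return head(cls).softmax(dim=-1)

print(instance_probabilities("a man walking his dog in the park"))
```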

Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling

Title Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling
Authors Filippo Maria Bianchi, Daniele Grattarola, Lorenzo Livi, Cesare Alippi
Abstract In graph neural networks (GNNs), pooling operators compute local summaries of input graphs to capture their global properties; in turn, they are fundamental operators for building deep GNNs that learn effective, hierarchical representations. In this work, we propose Node Decimation Pooling (NDP), a pooling operator for GNNs that generates coarsened versions of a graph by leveraging its topology only. During training, the GNN learns new representations for the vertices and fits them to a pyramid of coarsened graphs, which is computed in a pre-processing step. As theoretical contributions, we first demonstrate the equivalence between the MAXCUT partition and the node decimation procedure on which NDP is based. Then, we propose a procedure to sparsify the coarsened graphs to reduce the computational complexity in the GNN; we also demonstrate that it is possible to drop many edges without significantly altering the graph spectra of the coarsened graphs. Experimental results show that NDP achieves a significantly lower computational cost than state-of-the-art graph pooling operators while, at the same time, reaching competitive accuracy on a variety of graph classification tasks.
Tasks Graph Classification, Representation Learning
Published 2019-10-24
URL https://arxiv.org/abs/1910.11436v1
PDF https://arxiv.org/pdf/1910.11436v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-representation-learning-in-graph
Repo https://github.com/danielegrattarola/decimation-pooling
Framework tf
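
The decimation step has a compact linear-algebra form: approximate the MAXCUT partition with the sign of the Laplacian's largest eigenvector, keep one side of the cut, and Kron-reduce the Laplacian onto the kept nodes. A NumPy sketch of this idea (dense matrices, no sparsification step, so not the authors' implementation):

```python
import numpy as np

def decimate(A):
    """One NDP-style coarsening step. A: dense symmetric adjacency matrix
    of a connected graph.
    1) Approximate the MAXCUT partition with the sign of the Laplacian's
       largest eigenvector; keep one side.
    2) Kron-reduce the Laplacian onto the kept nodes."""
    L = np.diag(A.sum(1)) - A
    _, eigvecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    keep = eigvecs[:, -1] >= 0              # one side of the (relaxed) cut
    drop = ~keep
    L_red = (L[np.ix_(keep, keep)]
             - L[np.ix_(keep, drop)]
               @ np.linalg.solve(L[np.ix_(drop, drop)], L[np.ix_(drop, keep)]))
    A_red = np.diag(np.diag(L_red)) - L_red  # back to an adjacency-like matrix
    return A_red, keep
```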

Pose from Shape: Deep Pose Estimation for Arbitrary 3D Objects

Title Pose from Shape: Deep Pose Estimation for Arbitrary 3D Objects
Authors Yang Xiao, Xuchong Qiu, Pierre-Alain Langlois, Mathieu Aubry, Renaud Marlet
Abstract Most deep pose estimation methods need to be trained for specific object instances or categories. In this work we propose a completely generic deep pose estimation approach, which does not require the network to have been trained on relevant categories, nor objects in a category to have a canonical pose. We believe this is a crucial step to design robotic systems that can interact with new objects in the wild not belonging to a predefined category. Our main insight is to dynamically condition pose estimation with a representation of the 3D shape of the target object. More precisely, we train a Convolutional Neural Network that takes as input both a test image and a 3D model, and outputs the relative 3D pose of the object in the input image with respect to the 3D model. We demonstrate that our method boosts performances for supervised category pose estimation on standard benchmarks, namely Pascal3D+, ObjectNet3D and Pix3D, on which we provide results superior to the state of the art. More importantly, we show that our network trained on everyday man-made objects from ShapeNet generalizes without any additional training to completely new types of 3D objects by providing results on the LINEMOD dataset as well as on natural entities such as animals from ImageNet.
Tasks Pose Estimation, Viewpoint Estimation
Published 2019-06-12
URL https://arxiv.org/abs/1906.05105v2
PDF https://arxiv.org/pdf/1906.05105v2.pdf
PWC https://paperswithcode.com/paper/pose-from-shape-deep-pose-estimation-for
Repo https://github.com/YoungXIAO13/PoseFromShape
Framework pytorch
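
The core design is a two-stream network: an image encoder, a shape encoder over the 3D model, and a head that regresses the relative pose from the concatenated features. A schematic PyTorch sketch with placeholder encoders and a 3-angle output (the paper's actual backbones and pose parametrization differ):

```python
import torch
import torch.nn as nn

class PoseFromShape(nn.Module):
    """Schematic of the idea: condition pose estimation on a
    representation of the target's 3D shape."""
    def __init__(self, feat=256):
        super().__init__()
        self.img_enc = nn.Sequential(                  # stand-in image CNN
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat))
        self.shape_enc = nn.Sequential(                # point-cloud encoder
            nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat))
        self.head = nn.Sequential(                     # relative-pose head
            nn.Linear(2 * feat, feat), nn.ReLU(),
            nn.Linear(feat, 3))                        # e.g. azimuth/elev/tilt

    def forward(self, image, points):
        f_img = self.img_enc(image)                        # (B, feat)
        f_shp = self.shape_enc(points).max(dim=1).values   # pool over points
        return self.head(torch.cat([f_img, f_shp], dim=-1))

model = PoseFromShape()
pose = model(torch.randn(2, 3, 64, 64), torch.randn(2, 1024, 3))
```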

How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations

Title How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations
Authors Betty van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
Abstract Bidirectional Encoder Representations from Transformers (BERT) reach state-of-the-art results in a variety of Natural Language Processing tasks. However, understanding of their internal functioning is still insufficient and unsatisfactory. In order to better understand BERT and other Transformer-based models, we present a layer-wise analysis of BERT’s hidden states. Unlike previous research, which mainly focuses on explaining Transformer models by their attention weights, we argue that hidden states contain equally valuable information. Specifically, our analysis focuses on models fine-tuned on the task of Question Answering (QA) as an example of a complex downstream task. We inspect how QA models transform token vectors in order to find the correct answer. To this end, we apply a set of general and QA-specific probing tasks that reveal the information stored in each representation layer. Our qualitative analysis of hidden state visualizations provides additional insights into BERT’s reasoning process. Our results show that the transformations within BERT go through phases that are related to traditional pipeline tasks. The system can therefore implicitly incorporate task-specific information into its token representations. Furthermore, our analysis reveals that fine-tuning has little impact on the models’ semantic abilities and that prediction errors can be recognized in the vector representations of even early layers.
Tasks Question Answering
Published 2019-09-11
URL https://arxiv.org/abs/1909.04925v1
PDF https://arxiv.org/pdf/1909.04925v1.pdf
PWC https://paperswithcode.com/paper/how-does-bert-answer-questions-a-layer-wise
Repo https://github.com/bvanaken/explain-BERT-QA
Framework none
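
Layer-wise probing of this kind is straightforward to reproduce: collect hidden states from every layer and fit a separate shallow classifier per layer. A minimal sketch with the transformers library ([CLS] pooling, the toy texts, and the logistic probe are illustrative choices, not the paper's probing tasks):

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased",
                                 output_hidden_states=True)

def layer_features(texts):
    """One [CLS] vector per layer per text: (n_layers, n_texts, hidden)."""
    feats = []
    with torch.no_grad():
        for t in texts:
            out = bert(**tokenizer(t, return_tensors="pt"))
            # hidden_states: the embedding layer plus one tensor per layer
            feats.append(torch.stack([h[0, 0] for h in out.hidden_states]))
    return torch.stack(feats, dim=1).numpy()

# Fit a separate probe on each layer; the layer-wise accuracy profile
# shows where the probed information emerges (labels are placeholders).
texts, labels = ["who wrote hamlet?", "the sky is blue."], [1, 0]
for layer, X in enumerate(layer_features(texts)):
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(layer, probe.score(X, labels))
```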

Learning Visual Dynamics Models of Rigid Objects using Relational Inductive Biases

Title Learning Visual Dynamics Models of Rigid Objects using Relational Inductive Biases
Authors Fabio Ferreira, Lin Shao, Tamim Asfour, Jeannette Bohg
Abstract Endowing robots with human-like physical reasoning abilities remains challenging. We argue that existing methods often disregard spatio-temporal relations, and that by using Graph Neural Networks (GNNs) that incorporate a relational inductive bias, we can shift the learning process towards exploiting relations. In this work, we learn action-conditional forward dynamics models of a simulated manipulation task from visual observations involving cluttered and irregularly shaped objects. We investigate two GNN approaches and empirically assess their capability to generalize to scenarios with novel and an increasing number of objects. The first, a Graph Networks (GN) based approach, considers explicitly defined edge attributes; it consistently underperforms an auto-encoder baseline that we modified to predict future states, and our results indicate that the choice of edge attributes can significantly influence the predictions. Consequently, we develop the Auto-Predictor, which does not rely on explicitly defined edge attributes; it outperforms both the baseline and the GN-based models. Overall, our results show the sensitivity of GNN-based approaches to the task representation, demonstrate the efficacy of relational inductive biases, and advocate choosing lightweight approaches that implicitly reason about relations over ones that leave these decisions to human designers.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.03749v3
PDF https://arxiv.org/pdf/1909.03749v3.pdf
PWC https://paperswithcode.com/paper/learning-visual-dynamics-models-of-rigid
Repo https://github.com/ferreirafabio/learningdynamics
Framework tf
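
A schematic of the GN-style approach described above: message passing over a fully connected object graph, with node updates conditioned on the action. Dimensions and MLPs are placeholders, not the paper's models:

```python
import torch
import torch.nn as nn

class ForwardGNN(nn.Module):
    """Minimal action-conditional forward dynamics model over object nodes."""
    def __init__(self, d_node=16, d_act=4, d_hid=64):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * d_node, d_hid), nn.ReLU(),
                                      nn.Linear(d_hid, d_hid))
        self.node_mlp = nn.Sequential(nn.Linear(d_node + d_hid + d_act, d_hid),
                                      nn.ReLU(), nn.Linear(d_hid, d_node))

    def forward(self, nodes, action):
        # nodes: (B, N, d_node); action: (B, d_act); fully connected graph.
        B, N, D = nodes.shape
        send = nodes.unsqueeze(2).expand(B, N, N, D)
        recv = nodes.unsqueeze(1).expand(B, N, N, D)
        msgs = self.edge_mlp(torch.cat([send, recv], dim=-1)).sum(dim=2)
        act = action.unsqueeze(1).expand(B, N, -1)
        # Residual update: predict the next state from node, messages, action.
        return nodes + self.node_mlp(torch.cat([nodes, msgs, act], dim=-1))

next_state = ForwardGNN()(torch.randn(2, 5, 16), torch.randn(2, 4))
```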

Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation

Title Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Authors Benet Oriol Sabat, Cristian Canton Ferrer, Xavier Giro-i-Nieto
Abstract This work addresses the challenge of hate speech detection in Internet memes and, unlike any previous work we are aware of, attempts to use visual information to detect hate speech automatically. Memes are pixel-based multimedia documents that contain photos or illustrations together with phrases which, when combined, usually adopt a funny meaning. However, hate memes are also used to spread hate through social networks, so their automatic detection would help reduce their harmful societal impact. In our experiments, we built a dataset of 5,020 memes to train and evaluate a multi-layer perceptron over the visual and language representations, whether independently or fused. Our results indicate that the model can learn to detect some of the memes, but that the task is far from being solved with this simple architecture. While previous work focuses on linguistic hate speech, our experiments indicate that the visual modality can be much more informative for hate speech detection in memes than the linguistic one. The source code and models are available at https://github.com/imatge-upc/hate-speech-detection .
Tasks Hate Speech Detection
Published 2019-10-05
URL https://arxiv.org/abs/1910.02334v1
PDF https://arxiv.org/pdf/1910.02334v1.pdf
PWC https://paperswithcode.com/paper/hate-speech-in-pixels-detection-of-offensive
Repo https://github.com/imatge-upc/hate-speech-detection
Framework pytorch
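
The fused model the abstract describes is a small MLP over concatenated visual and language embeddings. A sketch under assumed feature dimensions (e.g. a 2048-d image feature and a 768-d sentence embedding; the paper's extractors and layer sizes may differ):

```python
import torch
import torch.nn as nn

class FusionMLP(nn.Module):
    """MLP over concatenated visual and language embeddings,
    as in the fused setting."""
    def __init__(self, d_img=2048, d_txt=768, d_hid=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_img + d_txt, d_hid), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(d_hid, 1))  # single logit: hate / not hate

    def forward(self, img_feat, txt_feat):
        return self.net(torch.cat([img_feat, txt_feat], dim=-1))

# usage: logits = FusionMLP()(torch.randn(8, 2048), torch.randn(8, 768))
#        loss = nn.BCEWithLogitsLoss()(logits.squeeze(-1), labels.float())
```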

Temporal Cycle-Consistency Learning

Title Temporal Cycle-Consistency Learning
Authors Debidatta Dwibedi, Yusuf Aytar, Jonathan Tompson, Pierre Sermanet, Andrew Zisserman
Abstract We introduce a self-supervised representation learning method based on the task of temporal alignment between videos. The method trains a network using temporal cycle consistency (TCC), a differentiable cycle-consistency loss that can be used to find correspondences across time in multiple videos. The resulting per-frame embeddings can be used to align videos by simply matching frames using the nearest-neighbors in the learned embedding space. To evaluate the power of the embeddings, we densely label the Pouring and Penn Action video datasets for action phases. We show that (i) the learned embeddings enable few-shot classification of these action phases, significantly reducing the supervised training requirements; and (ii) TCC is complementary to other methods of self-supervised learning in videos, such as Shuffle and Learn and Time-Contrastive Networks. The embeddings are also used for a number of applications based on alignment (dense temporal correspondence) between video pairs, including transfer of metadata of synchronized modalities between videos (sounds, temporal semantic labels), synchronized playback of multiple videos, and anomaly detection. Project webpage: https://sites.google.com/view/temporal-cycle-consistency .
Tasks Anomaly Detection, Representation Learning, Video Alignment
Published 2019-04-16
URL http://arxiv.org/abs/1904.07846v1
PDF http://arxiv.org/pdf/1904.07846v1.pdf
PWC https://paperswithcode.com/paper/temporal-cycle-consistency-learning
Repo https://github.com/google-research/google-research/tree/master/tcc
Framework tf
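
The TCC loss is easy to state for one frame: soft-nearest-neighbor from video U into video V, cycle back into U, and penalize landing away from the starting index. A sketch of the cycle-back regression variant (the temperature and the omitted variance term are simplifications of the paper's loss):

```python
import torch
import torch.nn.functional as F

def cycle_back_regression(U, V, i, temperature=0.1):
    """Differentiable cycle-consistency penalty for frame i of video U
    against video V. U: (N, d), V: (M, d) per-frame embeddings."""
    # Soft nearest neighbor of U[i] in V.
    alpha = F.softmax(-torch.cdist(U[i:i + 1], V) / temperature, dim=-1)
    v_tilde = alpha @ V                                            # (1, d)
    # Cycle back: soft nearest neighbor of v_tilde in U.
    beta = F.softmax(-torch.cdist(v_tilde, U) / temperature, dim=-1)
    # Penalize the expected landing index for drifting away from i.
    idx = torch.arange(U.shape[0], dtype=U.dtype)
    mu = (beta * idx).sum()
    return (mu - i) ** 2

U = torch.randn(20, 8, requires_grad=True)   # embeddings of video 1
V = torch.randn(25, 8)                       # embeddings of video 2
loss = sum(cycle_back_regression(U, V, i) for i in range(20)) / 20
loss.backward()                              # trains the shared encoder
```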

Strategies for Pre-training Graph Neural Networks

Title Strategies for Pre-training Graph Neural Networks
Authors Weihua Hu, Bowen Liu, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay Pande, Jure Leskovec
Abstract Many applications of machine learning require a model to make accurate predictions on test examples that are distributionally different from training ones, while task-specific labels are scarce during training. An effective approach to this challenge is to pre-train a model on related tasks where data is abundant, and then fine-tune it on a downstream task of interest. While pre-training has been effective in many language and vision domains, it remains an open question how to effectively use pre-training on graph datasets. In this paper, we develop a new strategy and self-supervised methods for pre-training Graph Neural Networks (GNNs). The key to the success of our strategy is to pre-train an expressive GNN at the level of individual nodes as well as entire graphs, so that the GNN can learn useful local and global representations simultaneously. We systematically study pre-training on multiple graph classification datasets. We find that naive strategies, which pre-train GNNs at the level of either entire graphs or individual nodes, give limited improvement and can even lead to negative transfer on many downstream tasks. In contrast, our strategy avoids negative transfer and improves generalization significantly across downstream tasks, leading to up to 9.4% absolute improvements in ROC-AUC over non-pre-trained models and achieving state-of-the-art performance for molecular property prediction and protein function prediction.
Tasks Graph Classification, Molecular Property Prediction, Protein Function Prediction, Representation Learning
Published 2019-05-29
URL https://arxiv.org/abs/1905.12265v3
PDF https://arxiv.org/pdf/1905.12265v3.pdf
PWC https://paperswithcode.com/paper/pre-training-graph-neural-networks
Repo https://github.com/jacquesboitreaud/DeepFRED
Framework pytorch
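
The strategy's key ingredient, combining a node-level self-supervised loss with a graph-level supervised loss on the same encoder, can be sketched as follows. The heads, dimensions, and masked-attribute task below are schematic stand-ins for the paper's concrete pre-training tasks:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedPretrainLoss(nn.Module):
    """Combine a node-level loss (masked-attribute prediction) with a
    graph-level supervised loss so the shared GNN encoder learns local
    and global representations simultaneously."""
    def __init__(self, d=64, n_attr=10, n_graph_tasks=5):
        super().__init__()
        self.node_head = nn.Linear(d, n_attr)           # masked attributes
        self.graph_head = nn.Linear(d, n_graph_tasks)   # graph-level labels

    def forward(self, h, masked_idx, true_attrs, graph_labels):
        # h: (n_nodes, d) node embeddings from any GNN encoder.
        node_loss = F.cross_entropy(self.node_head(h[masked_idx]), true_attrs)
        h_graph = h.mean(dim=0, keepdim=True)           # mean pooling, one graph
        graph_loss = F.binary_cross_entropy_with_logits(
            self.graph_head(h_graph), graph_labels)
        return node_loss + graph_loss

loss_fn = CombinedPretrainLoss()
h = torch.randn(30, 64)                                 # stand-in embeddings
loss = loss_fn(h, torch.tensor([0, 3]), torch.tensor([1, 7]),
               torch.randint(0, 2, (1, 5)).float())
```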