Paper Group AWR 303
Revealing quantum chaos with machine learning
Title | Revealing quantum chaos with machine learning |
Authors | Y. A. Kharkov, V. E. Sotskov, A. A. Karazeev, E. O. Kiktenko, A. K. Fedorov |
Abstract | Understanding properties of quantum matter is an outstanding challenge in science. In this paper, we demonstrate how machine-learning methods can be successfully applied for the classification of various regimes in single-particle and many-body systems. We realize neural network algorithms that perform a classification between regular and chaotic behavior in quantum billiard models with remarkably high accuracy. We use the variational autoencoder for autosupervised classification of regular/chaotic wave functions, as well as demonstrating that variational autoencoders could be used as a tool for detection of anomalous quantum states, such as quantum scars. By taking this method further, we show that machine learning techniques allow us to pin down the transition from integrability to many-body quantum chaos in Heisenberg XXZ spin chains. For both cases, we confirm the existence of universal W shapes that characterize the transition. Our results pave the way for exploring the power of machine learning tools for revealing exotic phenomena in quantum many-body systems. |
Tasks | |
Published | 2019-02-25 |
URL | https://arxiv.org/abs/1902.09216v2 |
https://arxiv.org/pdf/1902.09216v2.pdf | |
PWC | https://paperswithcode.com/paper/revealing-quantum-chaos-with-machine-learning |
Repo | https://github.com/yourball/QML_chaos |
Framework | pytorch |
Competitive Gradient Descent
Title | Competitive Gradient Descent |
Authors | Florian Schäfer, Anima Anandkumar |
Abstract | We introduce a new algorithm for the numerical computation of Nash equilibria of competitive two-player games. Our method is a natural generalization of gradient descent to the two-player setting where the update is given by the Nash equilibrium of a regularized bilinear local approximation of the underlying game. It avoids oscillatory and divergent behaviors seen in alternating gradient descent. Using numerical experiments and rigorous analysis, we provide a detailed comparison to methods based on optimism and consensus and show that our method avoids making any unnecessary changes to the gradient dynamics while achieving exponential (local) convergence for (locally) convex-concave zero sum games. Convergence and stability properties of our method are robust to strong interactions between the players, without adapting the stepsize, which is not the case with previous methods. In our numerical experiments on non-convex-concave problems, existing methods are prone to divergence and instability due to their sensitivity to interactions among the players, whereas we never observe divergence of our algorithm. The ability to choose larger stepsizes furthermore allows our algorithm to achieve faster convergence, as measured by the number of model evaluations. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12103v2 |
https://arxiv.org/pdf/1905.12103v2.pdf | |
PWC | https://paperswithcode.com/paper/competitive-gradient-descent |
Repo | https://github.com/GopiKishan14/Reproducibility_Challenge_NeurIPS_2019 |
Framework | pytorch |
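The update described in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, assuming the scalar zero-sum game f(x, y) = alpha * x * y (x minimizes, y maximizes) and the closed-form zero-sum update as commonly stated for competitive gradient descent; the function names, step size, and starting point are illustrative assumptions and are not taken from the linked repository. Simultaneous gradient descent-ascent is included only as a contrast.

```python
import math


def cgd_step(x, y, eta, alpha):
    """One competitive-gradient-descent step on f(x, y) = alpha * x * y.

    Assumed zero-sum form of the update (x minimizes f, y maximizes f):
      dx = -eta * (1 + eta^2 * Dxy * Dyx)^-1 * (grad_x + eta * Dxy * grad_y)
      dy = +eta * (1 + eta^2 * Dyx * Dxy)^-1 * (grad_y - eta * Dyx * grad_x)
    """
    grad_x = alpha * y  # df/dx
    grad_y = alpha * x  # df/dy
    d_xy = alpha        # mixed second derivative d^2 f / (dx dy)
    denom = 1.0 + (eta * d_xy) ** 2
    dx = -eta * (grad_x + eta * d_xy * grad_y) / denom
    dy = eta * (grad_y - eta * d_xy * grad_x) / denom
    return x + dx, y + dy


def gda_step(x, y, eta, alpha):
    """Simultaneous gradient descent-ascent on the same game, for contrast."""
    return x - eta * alpha * y, y + eta * alpha * x


def run(step, steps=100, eta=0.2, alpha=1.0):
    """Iterate `step` from (1, 1) and return the final distance to the equilibrium (0, 0)."""
    x, y = 1.0, 1.0
    for _ in range(steps):
        x, y = step(x, y, eta, alpha)
    return math.hypot(x, y)
```

On this bilinear game each CGD step shrinks the distance to the equilibrium by a factor of 1 / sqrt(1 + (eta * alpha)^2), whereas each descent-ascent step grows it by the reciprocal factor, so `run(cgd_step)` decreases and `run(gda_step)` increases relative to the starting distance.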
A review of domain adaptation without target labels
Title | A review of domain adaptation without target labels |
Authors | Wouter M. Kouw, Marco Loog |
Abstract | Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting and representing features such that a source classifier performs well on the target domain, and inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research. |
Tasks | Domain Adaptation, Domain Generalization, Unsupervised Domain Adaptation |
Published | 2019-01-16 |
URL | https://arxiv.org/abs/1901.05335v2 |
https://arxiv.org/pdf/1901.05335v2.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-single-source-unsupervised-domain |
Repo | https://github.com/wmkouw/libTLDA |
Framework | none |
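The sample-based family mentioned in the abstract above can be sketched with importance weighting, where each source observation is weighted by the ratio of target to source density. This is a minimal illustration, assuming both densities are known one-dimensional Gaussians (in practice the ratio has to be estimated); the function names and default parameters are hypothetical and are not the API of the linked library.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, source=(0.0, 1.0), target=(1.0, 1.0)):
    """Weight p_target(x) / p_source(x); the (mean, std) pairs are assumed known."""
    return gaussian_pdf(x, *target) / gaussian_pdf(x, *source)


def weighted_risk(xs, losses, source=(0.0, 1.0), target=(1.0, 1.0)):
    """Importance-weighted average of per-sample source losses,
    used as an estimate of the risk on the target domain."""
    weights = [importance_weight(x, source, target) for x in xs]
    return sum(w * loss for w, loss in zip(weights, losses)) / len(xs)
```

With the default unit-variance Gaussians the weight simplifies to exp(x - 0.5), so source samples lying closer to the target mean count more in the weighted risk.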
Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization without Accessing Target Domain Data
Title | Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization without Accessing Target Domain Data |
Authors | Xiangyu Yue, Yang Zhang, Sicheng Zhao, Alberto Sangiovanni-Vincentelli, Kurt Keutzer, Boqing Gong |
Abstract | We propose to harness the potential of simulation for the semantic segmentation of real-world self-driving scenes in a domain generalization fashion. The segmentation network is trained without any data of target domains and tested on the unseen target domains. To this end, we propose a new approach of domain randomization and pyramid consistency to learn a model with high generalizability. First, we propose to randomize the synthetic images with the styles of real images in terms of visual appearances using auxiliary datasets, in order to effectively learn domain-invariant representations. Second, we further enforce pyramid consistency across different “stylized” images and within an image, in order to learn domain-invariant and scale-invariant features, respectively. Extensive experiments are conducted on the generalization from GTA and SYNTHIA to Cityscapes, BDDS and Mapillary; and our method achieves superior results over the state-of-the-art techniques. Remarkably, our generalization results are on par with or even better than those obtained by state-of-the-art simulation-to-real domain adaptation methods, which access the target domain data at training time. |
Tasks | Domain Adaptation, Domain Generalization, Semantic Segmentation |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00889v1 |
https://arxiv.org/pdf/1909.00889v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-randomization-and-pyramid-consistency |
Repo | https://github.com/xyyue/DRPC |
Framework | pytorch |
Current Limitations in Cyberbullying Detection: on Evaluation Criteria, Reproducibility, and Data Scarcity
Title | Current Limitations in Cyberbullying Detection: on Evaluation Criteria, Reproducibility, and Data Scarcity |
Authors | Chris Emmery, Ben Verhoeven, Guy De Pauw, Gilles Jacobs, Cynthia Van Hee, Els Lefever, Bart Desmet, Véronique Hoste, Walter Daelemans |
Abstract | The detection of online cyberbullying has seen an increase in societal importance, popularity in research, and available open data. Nevertheless, while computational power and affordability of resources continue to increase, the access restrictions on high-quality data limit the applicability of state-of-the-art techniques. Consequently, much of the recent research uses small, heterogeneous datasets, without a thorough evaluation of applicability. In this paper, we further illustrate these issues, as we (i) evaluate many publicly available resources for this task and demonstrate difficulties with data collection. These predominantly yield small datasets that fail to capture the required complex social dynamics and impede direct comparison of progress. We (ii) conduct an extensive set of experiments that indicate a general lack of cross-domain generalization of classifiers trained on these sources, and openly provide this framework to replicate and extend our evaluation criteria. Finally, we (iii) present an effective crowdsourcing method: simulating real-life bullying scenarios in a lab setting generates plausible data that can be effectively used to enrich real data. This largely circumvents the restrictions on data that can be collected, and increases classifier performance. We believe these contributions can aid in improving the empirical practices of future research in the field. |
Tasks | Domain Generalization |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11922v1 |
https://arxiv.org/pdf/1910.11922v1.pdf | |
PWC | https://paperswithcode.com/paper/current-limitations-in-cyberbullying |
Repo | https://github.com/sweta20/Detecting-Cyberbullying-Across-SMPs |
Framework | tf |
Diametrical Risk Minimization: Theory and Computations
Title | Diametrical Risk Minimization: Theory and Computations |
Authors | Matthew Norton, Johannes O. Royset |
Abstract | The theoretical and empirical performance of Empirical Risk Minimization (ERM) often suffers when loss functions are poorly behaved with large Lipschitz moduli and spurious sharp minimizers. We propose and analyze a counterpart to ERM called Diametrical Risk Minimization (DRM), which accounts for worst-case empirical risks within neighborhoods in parameter space. DRM has generalization bounds that are independent of Lipschitz moduli for convex as well as nonconvex problems and it can be implemented using a practical algorithm based on stochastic gradient descent. Numerical results illustrate the ability of DRM to find quality solutions with low generalization error in chaotic landscapes from benchmark neural network classification problems with corrupted labels. |
Tasks | |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.10844v2 |
https://arxiv.org/pdf/1910.10844v2.pdf | |
PWC | https://paperswithcode.com/paper/diametrical-risk-minimization-theory-and |
Repo | https://github.com/matthew-norton/Diametrical_Learning |
Framework | pytorch |
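The idea of accounting for worst-case empirical risk within a neighborhood in parameter space can be sketched in one dimension. This is a minimal illustration, assuming a squared-error loss and a grid search over the neighborhood; the grid search, the function names, and the toy data are assumptions made for clarity and do not reproduce the stochastic-gradient algorithm referred to in the abstract.

```python
def empirical_risk(w, data):
    """Mean squared error of the scalar parameter w against the data points."""
    return sum((w - x) ** 2 for x in data) / len(data)


def diametrical_risk(w, data, gamma, num_points=101):
    """Largest empirical risk over parameters within distance gamma of w.

    The maximum is approximated on an evenly spaced grid of perturbations
    in [-gamma, gamma]; this grid is an illustrative assumption.
    """
    worst = float("-inf")
    for i in range(num_points):
        v = -gamma + 2.0 * gamma * i / (num_points - 1)
        worst = max(worst, empirical_risk(w + v, data))
    return worst
```

For the data `[0.0, 2.0]` the empirical risk at `w = 1.0` is 1.0, while the diametrical risk with `gamma = 0.5` is 1.25, reached at the edge of the neighborhood. A sharp minimizer would show a much larger gap between the two quantities than a flat one.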
Re-balancing Variational Autoencoder Loss for Molecule Sequence Generation
Title | Re-balancing Variational Autoencoder Loss for Molecule Sequence Generation |
Authors | Chaochao Yan, Sheng Wang, Jinyu Yang, Tingyang Xu, Junzhou Huang |
Abstract | Molecule generation is to design new molecules with specific chemical properties and further to optimize the desired chemical properties. Following previous work, we encode molecules into continuous vectors in the latent space and then decode the vectors into molecules under the variational autoencoder (VAE) framework. We investigate the posterior collapse problem of current RNN-based VAEs for molecule sequence generation. For the first time, we find that underestimated reconstruction loss leads to posterior collapse, and provide both theoretical and experimental evidence. We propose an effective and efficient solution to fix the problem and avoid posterior collapse. Without bells and whistles, our method achieves SOTA reconstruction accuracy and competitive validity on the ZINC 250K dataset. When generating 10,000 unique valid SMILES from random prior sampling, it costs JT-VAE 1450s while our method only needs 9s. Our implementation is at https://github.com/chaoyan1037/Re-balanced-VAE. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00698v2 |
https://arxiv.org/pdf/1910.00698v2.pdf | |
PWC | https://paperswithcode.com/paper/re-balancing-variational-autoencoder-loss-for |
Repo | https://github.com/chaoyan1037/Re-balanced-VAE |
Framework | pytorch |
Visual Natural Language Query Auto-Completion for Estimating Instance Probabilities
Title | Visual Natural Language Query Auto-Completion for Estimating Instance Probabilities |
Authors | Samuel Sharpe, Jin Yan, Fan Wu, Iddo Drori |
Abstract | We present a new task of query auto-completion for estimating instance probabilities. We complete a user query prefix conditioned upon an image. Given the complete query, we fine-tune a BERT embedding for estimating probabilities of a broad set of instances. The resulting instance probabilities are used for selection while being agnostic to the segmentation or attention mechanism. Our results demonstrate that auto-completion using both language and vision performs better than using only language, and that fine-tuning a BERT embedding allows instances in the image to be ranked efficiently. In the spirit of reproducible research we make our data, models, and code available. |
Tasks | |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04887v1 |
https://arxiv.org/pdf/1910.04887v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-natural-language-query-auto-completion |
Repo | https://github.com/ssharpe42/VNLQAC |
Framework | tf |
Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling
Title | Hierarchical Representation Learning in Graph Neural Networks with Node Decimation Pooling |
Authors | Filippo Maria Bianchi, Daniele Grattarola, Lorenzo Livi, Cesare Alippi |
Abstract | In graph neural networks (GNNs), pooling operators compute local summaries of input graphs to capture their global properties; in turn, they are fundamental operators for building deep GNNs that learn effective, hierarchical representations. In this work, we propose the Node Decimation Pooling (NDP), a pooling operator for GNNs that generates coarsened versions of a graph by leveraging only its topology. During training, the GNN learns new representations for the vertices and fits them to a pyramid of coarsened graphs, which is computed in a pre-processing step. As theoretical contributions, we first demonstrate the equivalence between the MAXCUT partition and the node decimation procedure on which NDP is based. Then, we propose a procedure to sparsify the coarsened graphs for reducing the computational complexity in the GNN; we also demonstrate that it is possible to drop many edges without significantly altering the graph spectra of coarsened graphs. Experimental results show that NDP grants a significantly lower computational cost compared to state-of-the-art graph pooling operators, while reaching, at the same time, competitive accuracy on a variety of graph classification tasks. |
Tasks | Graph Classification, Representation Learning |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.11436v1 |
https://arxiv.org/pdf/1910.11436v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-representation-learning-in-graph |
Repo | https://github.com/danielegrattarola/decimation-pooling |
Framework | tf |
Pose from Shape: Deep Pose Estimation for Arbitrary 3D Objects
Title | Pose from Shape: Deep Pose Estimation for Arbitrary 3D Objects |
Authors | Yang Xiao, Xuchong Qiu, Pierre-Alain Langlois, Mathieu Aubry, Renaud Marlet |
Abstract | Most deep pose estimation methods need to be trained for specific object instances or categories. In this work we propose a completely generic deep pose estimation approach, which does not require the network to have been trained on relevant categories, nor objects in a category to have a canonical pose. We believe this is a crucial step to design robotic systems that can interact with new objects in the wild not belonging to a predefined category. Our main insight is to dynamically condition pose estimation with a representation of the 3D shape of the target object. More precisely, we train a Convolutional Neural Network that takes as input both a test image and a 3D model, and outputs the relative 3D pose of the object in the input image with respect to the 3D model. We demonstrate that our method boosts performances for supervised category pose estimation on standard benchmarks, namely Pascal3D+, ObjectNet3D and Pix3D, on which we provide results superior to the state of the art. More importantly, we show that our network trained on everyday man-made objects from ShapeNet generalizes without any additional training to completely new types of 3D objects by providing results on the LINEMOD dataset as well as on natural entities such as animals from ImageNet. |
Tasks | Pose Estimation, Viewpoint Estimation |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05105v2 |
https://arxiv.org/pdf/1906.05105v2.pdf | |
PWC | https://paperswithcode.com/paper/pose-from-shape-deep-pose-estimation-for |
Repo | https://github.com/YoungXIAO13/PoseFromShape |
Framework | pytorch |
How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations
Title | How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations |
Authors | Betty van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers |
Abstract | Bidirectional Encoder Representations from Transformers (BERT) reach state-of-the-art results in a variety of Natural Language Processing tasks. However, understanding of their internal functioning is still insufficient and unsatisfactory. In order to better understand BERT and other Transformer-based models, we present a layer-wise analysis of BERT’s hidden states. Unlike previous research, which mainly focuses on explaining Transformer models by their attention weights, we argue that hidden states contain equally valuable information. Specifically, our analysis focuses on models fine-tuned on the task of Question Answering (QA) as an example of a complex downstream task. We inspect how QA models transform token vectors in order to find the correct answer. To this end, we apply a set of general and QA-specific probing tasks that reveal the information stored in each representation layer. Our qualitative analysis of hidden state visualizations provides additional insights into BERT’s reasoning process. Our results show that the transformations within BERT go through phases that are related to traditional pipeline tasks. The system can therefore implicitly incorporate task-specific information into its token representations. Furthermore, our analysis reveals that fine-tuning has little impact on the models’ semantic abilities and that prediction errors can be recognized in the vector representations of even early layers. |
Tasks | Question Answering |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.04925v1 |
https://arxiv.org/pdf/1909.04925v1.pdf | |
PWC | https://paperswithcode.com/paper/how-does-bert-answer-questions-a-layer-wise |
Repo | https://github.com/bvanaken/explain-BERT-QA |
Framework | none |
Learning Visual Dynamics Models of Rigid Objects using Relational Inductive Biases
Title | Learning Visual Dynamics Models of Rigid Objects using Relational Inductive Biases |
Authors | Fabio Ferreira, Lin Shao, Tamim Asfour, Jeannette Bohg |
Abstract | Endowing robots with human-like physical reasoning abilities remains challenging. We argue that existing methods often disregard spatio-temporal relations and by using Graph Neural Networks (GNNs) that incorporate a relational inductive bias, we can shift the learning process towards exploiting relations. In this work, we learn action-conditional forward dynamics models of a simulated manipulation task from visual observations involving cluttered and irregularly shaped objects. We investigate two GNN approaches and empirically assess their capability to generalize to scenarios with novel and an increasing number of objects. The first, Graph Networks (GN) based approach, considers explicitly defined edge attributes and not only does it consistently underperform an auto-encoder baseline that we modified to predict future states, our results indicate how different edge attributes can significantly influence the predictions. Consequently, we develop the Auto-Predictor that does not rely on explicitly defined edge attributes. It outperforms the baseline and the GN-based models. Overall, our results show the sensitivity of GNN-based approaches to the task representation, the efficacy of relational inductive biases and advocate choosing lightweight approaches that implicitly reason about relations over ones that leave these decisions to human designers. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03749v3 |
https://arxiv.org/pdf/1909.03749v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-visual-dynamics-models-of-rigid |
Repo | https://github.com/ferreirafabio/learningdynamics |
Framework | tf |
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Title | Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation |
Authors | Benet Oriol Sabat, Cristian Canton Ferrer, Xavier Giro-i-Nieto |
Abstract | This work addresses the challenge of hate speech detection in Internet memes, and attempts using visual information to automatically detect hate speech, unlike any previous work to our knowledge. Memes are pixel-based multimedia documents that contain photos or illustrations together with phrases which, when combined, usually adopt a funny meaning. However, hate memes are also used to spread hate through social networks, so their automatic detection would help reduce their harmful societal impact. Our results indicate that the model can learn to detect some of the memes, but that the task is far from being solved with this simple architecture. While previous work focuses on linguistic hate speech, our experiments indicate how the visual modality can be much more informative for hate speech detection than the linguistic one in memes. In our experiments, we built a dataset of 5,020 memes to train and evaluate a multi-layer perceptron over the visual and language representations, whether independently or fused. The source code and models are available at https://github.com/imatge-upc/hate-speech-detection . |
Tasks | Hate Speech Detection |
Published | 2019-10-05 |
URL | https://arxiv.org/abs/1910.02334v1 |
https://arxiv.org/pdf/1910.02334v1.pdf | |
PWC | https://paperswithcode.com/paper/hate-speech-in-pixels-detection-of-offensive |
Repo | https://github.com/imatge-upc/hate-speech-detection |
Framework | pytorch |
Temporal Cycle-Consistency Learning
Title | Temporal Cycle-Consistency Learning |
Authors | Debidatta Dwibedi, Yusuf Aytar, Jonathan Tompson, Pierre Sermanet, Andrew Zisserman |
Abstract | We introduce a self-supervised representation learning method based on the task of temporal alignment between videos. The method trains a network using temporal cycle consistency (TCC), a differentiable cycle-consistency loss that can be used to find correspondences across time in multiple videos. The resulting per-frame embeddings can be used to align videos by simply matching frames using the nearest-neighbors in the learned embedding space. To evaluate the power of the embeddings, we densely label the Pouring and Penn Action video datasets for action phases. We show that (i) the learned embeddings enable few-shot classification of these action phases, significantly reducing the supervised training requirements; and (ii) TCC is complementary to other methods of self-supervised learning in videos, such as Shuffle and Learn and Time-Contrastive Networks. The embeddings are also used for a number of applications based on alignment (dense temporal correspondence) between video pairs, including transfer of metadata of synchronized modalities between videos (sounds, temporal semantic labels), synchronized playback of multiple videos, and anomaly detection. Project webpage: https://sites.google.com/view/temporal-cycle-consistency . |
Tasks | Anomaly Detection, Representation Learning, Video Alignment |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07846v1 |
http://arxiv.org/pdf/1904.07846v1.pdf | |
PWC | https://paperswithcode.com/paper/temporal-cycle-consistency-learning |
Repo | https://github.com/google-research/google-research/tree/master/tcc |
Framework | tf |
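The cycle-consistency idea from the abstract above can be sketched with hard nearest neighbors: a frame embedding in one video is cycle-consistent if its nearest neighbor in the other video maps back to it. This is a minimal illustration, assuming plain Python lists as per-frame embeddings; the loss described in the abstract is a differentiable relaxation of this check, which is not reproduced here, and all function names are hypothetical.

```python
def nearest(point, candidates):
    """Index of the candidate closest to `point` in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(candidates)), key=lambda j: sqdist(point, candidates[j]))


def is_cycle_consistent(i, seq_u, seq_v):
    """True if frame i of seq_u maps to its nearest frame in seq_v and back to i."""
    j = nearest(seq_u[i], seq_v)
    k = nearest(seq_v[j], seq_u)
    return k == i


def cycle_consistent_fraction(seq_u, seq_v):
    """Fraction of frames in seq_u that are cycle-consistent with seq_v."""
    hits = sum(is_cycle_consistent(i, seq_u, seq_v) for i in range(len(seq_u)))
    return hits / len(seq_u)
```

For example, with `seq_u = [[0.0], [0.4], [2.0]]` and `seq_v = [[0.0], [2.0]]`, the first and last frames of `seq_u` are cycle-consistent, while the middle frame maps to the first frame of `seq_v` and then back to index 0 rather than to itself.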
Strategies for Pre-training Graph Neural Networks
Title | Strategies for Pre-training Graph Neural Networks |
Authors | Weihua Hu, Bowen Liu, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay Pande, Jure Leskovec |
Abstract | Many applications of machine learning require a model to make accurate predictions on test examples that are distributionally different from training ones, while task-specific labels are scarce during training. An effective approach to this challenge is to pre-train a model on related tasks where data is abundant, and then fine-tune it on a downstream task of interest. While pre-training has been effective in many language and vision domains, it remains an open question how to effectively use pre-training on graph datasets. In this paper, we develop a new strategy and self-supervised methods for pre-training Graph Neural Networks (GNNs). The key to the success of our strategy is to pre-train an expressive GNN at the level of individual nodes as well as entire graphs so that the GNN can learn useful local and global representations simultaneously. We systematically study pre-training on multiple graph classification datasets. We find that naive strategies, which pre-train GNNs at the level of either entire graphs or individual nodes, give limited improvement and can even lead to negative transfer on many downstream tasks. In contrast, our strategy avoids negative transfer and improves generalization significantly across downstream tasks, leading up to 9.4% absolute improvements in ROC-AUC over non-pre-trained models and achieving state-of-the-art performance for molecular property prediction and protein function prediction. |
Tasks | Graph Classification, Molecular Property Prediction, Protein Function Prediction, Representation Learning |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12265v3 |
https://arxiv.org/pdf/1905.12265v3.pdf | |
PWC | https://paperswithcode.com/paper/pre-training-graph-neural-networks |
Repo | https://github.com/jacquesboitreaud/DeepFRED |
Framework | pytorch |