February 1, 2020


Paper Group AWR 265

Continuous Meta-Learning without Tasks. Exact Combinatorial Optimization with Graph Convolutional Neural Networks. Addressing Model Vulnerability to Distributional Shifts over Image Transformation Sets. Metric Learning for Dynamic Text Classification. Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling. Rethinking Zero …

Continuous Meta-Learning without Tasks


Title	Continuous Meta-Learning without Tasks
Authors	James Harrison, Apoorva Sharma, Chelsea Finn, Marco Pavone
Abstract	Meta-learning is a promising strategy for learning to efficiently learn within new tasks, using data gathered from a distribution of tasks. However, the meta-learning literature thus far has focused on the task segmented setting, where at train-time, offline data is assumed to be split according to the underlying task, and at test-time, the algorithms are optimized to learn in a single task. In this work, we enable the application of generic meta-learning algorithms to settings where this task segmentation is unavailable, such as continual online learning with a time-varying task. We present meta-learning via online changepoint analysis (MOCA), an approach which augments a meta-learning algorithm with a differentiable Bayesian changepoint detection scheme. The framework allows both training and testing directly on time series data without segmenting it into discrete tasks. We demonstrate the utility of this approach on a nonlinear meta-regression benchmark as well as two meta-image-classification benchmarks.
Tasks	Image Classification, Meta-Learning, Time Series
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08866v1
PDF	https://arxiv.org/pdf/1912.08866v1.pdf
PWC	https://paperswithcode.com/paper/continuous-meta-learning-without-tasks-1
Repo	https://github.com/StanfordASL/moca
Framework	pytorch
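
To make the changepoint idea in the abstract concrete, here is a minimal sketch of a classic Bayesian online changepoint recursion over run lengths. It is not MOCA itself: the constant hazard rate, the Gaussian model with known noise variance, and the name `run_length_filter` are all assumptions chosen for illustration, and the differentiable, meta-learned components described in the paper are left out.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def run_length_filter(xs, hazard=0.1, prior_mean=0.0, prior_var=10.0, noise_var=1.0):
    """Return, for each observation, the belief over the current run length.

    belief[r] is the probability that the current task has lasted r steps.
    All hyperparameters here are illustrative assumptions.
    """
    belief = [1.0]                 # before any data the run length is 0
    means = [prior_mean]           # posterior mean of the task mean, per run length
    variances = [prior_var]        # posterior variance of the task mean, per run length
    history = []
    for x in xs:
        # Predictive likelihood of x under each run-length hypothesis.
        like = [gaussian_pdf(x, m, v + noise_var) for m, v in zip(means, variances)]
        # Either the run continues (length grows by one) or a changepoint resets it.
        grow = [b * l * (1.0 - hazard) for b, l in zip(belief, like)]
        reset = sum(b * l * hazard for b, l in zip(belief, like))
        unnormalized = [reset] + grow
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        # Conjugate Gaussian update of each hypothesis; run length 0 restarts from the prior.
        new_means, new_vars = [prior_mean], [prior_var]
        for m, v in zip(means, variances):
            post_var = 1.0 / (1.0 / v + 1.0 / noise_var)
            new_means.append(post_var * (m / v + x / noise_var))
            new_vars.append(post_var)
        means, variances = new_means, new_vars
        history.append(belief)
    return history

# Toy stream (assumed): the underlying mean jumps from 0 to 8 after ten steps.
stream = [0.0] * 10 + [8.0] * 5
beliefs = run_length_filter(stream)
most_likely_run = max(range(len(beliefs[-1])), key=beliefs[-1].__getitem__)
```

On this toy stream the most probable run length at the end is the number of observations seen since the jump, which is the signal a meta-learner could use to decide how much past data to condition on.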

Exact Combinatorial Optimization with Graph Convolutional Neural Networks


Title	Exact Combinatorial Optimization with Graph Convolutional Neural Networks
Authors	Maxime Gasse, Didier Chételat, Nicola Ferroni, Laurent Charlin, Andrea Lodi
Abstract	Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose a new graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs. We train our model via imitation learning from the strong branching expert rule, and demonstrate on a series of hard problems that our approach produces policies that improve upon state-of-the-art machine-learning methods for branching and generalize to instances significantly larger than seen during training. Moreover, we improve for the first time over expert-designed branching rules implemented in a state-of-the-art solver on large problems. Code for reproducing all the experiments can be found at https://github.com/ds4dm/learn2branch.
Tasks	Combinatorial Optimization, Imitation Learning
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01629v3
PDF	https://arxiv.org/pdf/1906.01629v3.pdf
PWC	https://paperswithcode.com/paper/exact-combinatorial-optimization-with-graph
Repo	https://github.com/ds4dm/learn2branch
Framework	tf
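
As a rough illustration of the variable-constraint bipartite graph mentioned in the abstract, the sketch below turns a small constraint matrix into an edge list. The function name `milp_bipartite_edges` and the toy matrix are hypothetical and are not taken from the learn2branch repository; the node and edge features that a real model would need are omitted.

```python
from typing import List, Tuple

def milp_bipartite_edges(A: List[List[float]]) -> List[Tuple[int, int, float]]:
    """Return (constraint_index, variable_index, coefficient) for each nonzero A[i][j].

    Constraints form one side of the bipartite graph and variables the other;
    an edge exists exactly when a variable appears in a constraint.
    """
    edges = []
    for i, row in enumerate(A):
        for j, coeff in enumerate(row):
            if coeff != 0:
                edges.append((i, j, coeff))
    return edges

# Toy constraint matrix (assumed example):
#   c0:  x0 + 2*x1        <= 4
#   c1:       3*x1 - x2   <= 5
A = [[1.0, 2.0, 0.0],
     [0.0, 3.0, -1.0]]
edges = milp_bipartite_edges(A)
```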

Addressing Model Vulnerability to Distributional Shifts over Image Transformation Sets


Title	Addressing Model Vulnerability to Distributional Shifts over Image Transformation Sets
Authors	Riccardo Volpi, Vittorio Murino
Abstract	We are concerned with the vulnerability of computer vision models to distributional shifts. We formulate a combinatorial optimization problem that allows evaluating the regions in the image space where a given model is more vulnerable, in terms of image transformations applied to the input, and tackle it with standard search algorithms. We further embed this idea in a training procedure, where we define new data augmentation rules according to the image transformations that the current model is most vulnerable to, over iterations. An empirical evaluation on classification and semantic segmentation problems suggests that the devised algorithm allows training models that are more robust against content-preserving image manipulations and, in general, against distributional shifts.
Tasks	Combinatorial Optimization, Data Augmentation, Semantic Segmentation
Published	2019-03-28
URL	https://arxiv.org/abs/1903.11900v2
PDF	https://arxiv.org/pdf/1903.11900v2.pdf
PWC	https://paperswithcode.com/paper/model-vulnerability-to-distributional-shifts
Repo	https://github.com/ricvolpi/domain-shift-robustness
Framework	tf

Metric Learning for Dynamic Text Classification


Title	Metric Learning for Dynamic Text Classification
Authors	Jeremy Wohlwend, Ethan R. Elenberg, Samuel Altschul, Shawn Henry, Tao Lei
Abstract	Traditional text classifiers are limited to predicting over a fixed set of labels. However, in many real-world applications the label set is frequently changing. For example, in intent classification, new intents may be added over time while others are removed. We propose to address the problem of dynamic text classification by replacing the traditional, fixed-size output layer with a learned, semantically meaningful metric space. Here the distances between textual inputs are optimized to perform nearest-neighbor classification across overlapping label sets. Changing the label set does not involve removing parameters, but rather simply adding or removing support points in the metric space. Then the learned metric can be fine-tuned with only a few additional training examples. We demonstrate that this simple strategy is robust to changes in the label space. Furthermore, our results show that learning a non-Euclidean metric can improve performance in the low data regime, suggesting that further work on metric spaces may benefit low-resource research.
Tasks	Intent Classification, Metric Learning, Text Classification
Published	2019-11-04
URL	https://arxiv.org/abs/1911.01026v1
PDF	https://arxiv.org/pdf/1911.01026v1.pdf
PWC	https://paperswithcode.com/paper/metric-learning-for-dynamic-text-1
Repo	https://github.com/asappresearch/dynamic-classification
Framework	none
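
The abstract's point that changing the label set only means adding or removing support points can be sketched with a plain nearest-neighbor classifier. This is an illustration under assumptions: the two-dimensional vectors stand in for learned text embeddings, Euclidean distance stands in for the learned metric (the paper reports that a non-Euclidean metric can help), and `nearest_label` is a hypothetical name.

```python
import math
from typing import Dict, List, Optional, Sequence

def nearest_label(query: Sequence[float],
                  support: Dict[str, List[Sequence[float]]]) -> Optional[str]:
    """Return the label of the support point closest to the query embedding."""
    best_label, best_dist = None, math.inf
    for label, points in support.items():
        for point in points:
            dist = math.dist(query, point)  # Euclidean distance as a stand-in metric
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label

# Toy support set (assumed embeddings); no output layer has to be resized.
support = {"greeting": [(0.0, 0.0)], "refund": [(5.0, 5.0)]}
before = nearest_label((4.0, 4.9), support)

support["cancel"] = [(4.0, 5.0)]   # a new intent is added as a support point
after_add = nearest_label((4.0, 4.9), support)

del support["refund"]              # an old intent is removed the same way
after_remove = nearest_label((5.0, 5.0), support)
```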

Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling


Title	Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling
Authors	Xiaochuang Han, Jacob Eisenstein
Abstract	Contextualized word embeddings such as ELMo and BERT provide a foundation for strong performance across a wide range of natural language processing tasks by pretraining on large corpora of unlabeled text. However, the applicability of this approach is unknown when the target domain varies substantially from the pretraining corpus. We are specifically interested in the scenario in which labeled data is available in only a canonical source domain such as newstext, and the target domain is distinct from both the labeled and pretraining texts. To address this scenario, we propose domain-adaptive fine-tuning, in which the contextualized embeddings are adapted by masked language modeling on text from the target domain. We test this approach on sequence labeling in two challenging domains: Early Modern English and Twitter. Both domains differ substantially from existing pretraining corpora, and domain-adaptive fine-tuning yields substantial improvements over strong BERT baselines, with particularly impressive results on out-of-vocabulary words. We conclude that domain-adaptive fine-tuning offers a simple and effective approach for the unsupervised adaptation of sequence labeling to difficult new domains.
Tasks	Domain Adaptation, Language Modelling, Part-Of-Speech Tagging, Unsupervised Domain Adaptation, Word Embeddings
Published	2019-04-04
URL	https://arxiv.org/abs/1904.02817v2
PDF	https://arxiv.org/pdf/1904.02817v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-domain-adaptation-of
Repo	https://github.com/xhan77/AdaptaBERT
Framework	pytorch

Rethinking Zero-Shot Learning: A Conditional Visual Classification Perspective


Title	Rethinking Zero-Shot Learning: A Conditional Visual Classification Perspective
Authors	Kai Li, Martin Renqiang Min, Yun Fu
Abstract	Zero-shot learning (ZSL) aims to recognize instances of unseen classes solely based on the semantic descriptions of the classes. Existing algorithms usually formulate it as a semantic-visual correspondence problem, by learning mappings from one feature space to the other. Despite being reasonable, previous approaches essentially discard the highly precious discriminative power of visual features in an implicit way, and thus produce undesirable results. We instead reformulate ZSL as a conditioned visual classification problem, i.e., classifying visual features based on the classifiers learned from the semantic descriptions. With this reformulation, we develop algorithms targeting various ZSL settings: for the conventional setting, we propose to train a deep neural network that directly generates visual feature classifiers from the semantic attributes with an episode-based training scheme; for the generalized setting, we concatenate the learned highly discriminative classifiers for seen classes and the generated classifiers for unseen classes to classify visual features of all classes; for the transductive setting, we exploit unlabeled data to effectively calibrate the classifier generator using a novel learning-without-forgetting self-training mechanism and guide the process by a robust generalized cross-entropy loss. Extensive experiments show that our proposed algorithms significantly outperform state-of-the-art methods by large margins on most benchmark datasets in all the ZSL settings. Our code is available at https://github.com/kailigo/cvcZSL.
Tasks	Zero-Shot Learning
Published	2019-09-13
URL	https://arxiv.org/abs/1909.05995v2
PDF	https://arxiv.org/pdf/1909.05995v2.pdf
PWC	https://paperswithcode.com/paper/rethinking-zero-shot-learning-a-conditional
Repo	https://github.com/kailigo/cvcZSL
Framework	pytorch

A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers


Title	A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers
Authors	Rafael Gomes Mantovani, André Luis Debiaso Rossi, Edesio Alcobaça, Joaquin Vanschoren, André Carlos Ponce de Leon Ferreira de Carvalho
Abstract	For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify exactly when it is better to use default values and when to tune hyperparameters for each new dataset. Besides, an in-depth analysis is performed to understand what they take into account for their decisions, providing useful insights. An extensive analysis of different categories of meta-features, meta-learners, and setups across 156 datasets is performed. Results show that it is possible to accurately predict when tuning will significantly improve the performance of the induced models. The proposed system reduces the time spent on optimization processes, without reducing the predictive performance of the induced models (when compared with the ones obtained using tuned hyperparameters). We also explain the decision-making process of the meta-learners in terms of linear separability-based hypotheses. Although this analysis is focused on the tuning of Support Vector Machines, it can also be applied to other algorithms, as shown in experiments performed with decision trees.
Tasks	Decision Making, Meta-Learning, Recommendation Systems
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01684v2
PDF	https://arxiv.org/pdf/1906.01684v2.pdf
PWC	https://paperswithcode.com/paper/a-meta-learning-recommender-system-for
Repo	https://github.com/rgmantovani/mtlSuite
Framework	none

$360^o$ Surface Regression with a Hyper-Sphere Loss


Title	$360^o$ Surface Regression with a Hyper-Sphere Loss
Authors	Antonis Karakottas, Nikolaos Zioulis, Stamatis Samaras, Dimitrios Ataloglou, Vasileios Gkitsas, Dimitrios Zarpalas, Petros Daras
Abstract	Omnidirectional vision is becoming increasingly relevant as more efficient $360^o$ image acquisition is now possible. However, the lack of annotated $360^o$ datasets has hindered the application of deep learning techniques on spherical content. This is further exaggerated on tasks where ground truth acquisition is difficult, such as monocular surface estimation. While recent research approaches on the 2D domain overcome this challenge by relying on generating normals from depth cues using RGB-D sensors, this is very difficult to apply on the spherical domain. In this work, we address the unavailability of sufficient $360^o$ ground truth normal data, by leveraging existing 3D datasets and remodelling them via rendering. We present a dataset of $360^o$ images of indoor spaces with their corresponding ground truth surface normal, and train a deep convolutional neural network (CNN) on the task of monocular 360 surface estimation. We achieve this by minimizing a novel angular loss function defined on the hyper-sphere using simple quaternion algebra. We put an effort to appropriately compare with other state of the art methods trained on planar datasets and finally, present the practical applicability of our trained model on a spherical image re-lighting task using completely unseen data by qualitatively showing the promising generalization ability of our dataset and model. The dataset is available at: vcl3d.github.io/HyperSphereSurfaceRegression.
Tasks	Surface Normals Estimation
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07043v1
PDF	https://arxiv.org/pdf/1909.07043v1.pdf
PWC	https://paperswithcode.com/paper/360o-surface-regression-with-a-hyper-sphere
Repo	https://github.com/VCL3D/SphericalViewSynthesis
Framework	pytorch
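
The abstract mentions an angular loss built with quaternion algebra. One simple way to obtain the angle between two unit surface normals from a quaternion product is sketched below. Whether this matches the paper's exact loss is an assumption; the function names are hypothetical, and a training loss would average this quantity over pixels in a tensor library rather than compute it per pair.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

def angular_error(n1, n2):
    """Angle in radians between two unit normals, via pure quaternions.

    For pure quaternions (0, a) and (0, b) the product is (-a.b, a x b),
    so atan2(|a x b|, a.b) recovers the angle and stays stable near 0 and pi.
    """
    w, x, y, z = quat_mul((0.0, *n1), (0.0, *n2))
    return math.atan2(math.sqrt(x * x + y * y + z * z), -w)

right_angle = angular_error((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```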

Deep Reinforcement Learning Control of Quantum Cartpoles


Title	Deep Reinforcement Learning Control of Quantum Cartpoles
Authors	Zhikang T. Wang, Yuto Ashida, Masahito Ueda
Abstract	We generalize a standard benchmark of reinforcement learning, the classical cartpole balancing problem, to the quantum regime by stabilizing a particle in an unstable potential through measurement and feedback. We use the state-of-the-art deep reinforcement learning to stabilize the quantum cartpole and find that our deep learning approach performs comparably to or better than other strategies in standard control theory. Our approach also applies to measurement-feedback cooling of quantum oscillators, showing the applicability of deep learning to general continuous-space quantum control.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09200v2
PDF	https://arxiv.org/pdf/1910.09200v2.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-control-of
Repo	https://github.com/Z-T-WANG/DeepReinforcementLearningControlOfQuantumCartpoles
Framework	none

Reinforcement Knowledge Graph Reasoning for Explainable Recommendation


Title	Reinforcement Knowledge Graph Reasoning for Explainable Recommendation
Authors	Yikun Xian, Zuohui Fu, S. Muthukrishnan, Gerard de Melo, Yongfeng Zhang
Abstract	Recent advances in personalized recommendation have sparked great interest in the exploitation of rich structured information provided by knowledge graphs. Unlike most existing approaches that only focus on leveraging knowledge graphs for more accurate recommendation, we perform explicit reasoning with knowledge for decision making so that the recommendations are generated and supported by an interpretable causal inference procedure. To this end, we propose a method called Policy-Guided Path Reasoning (PGPR), which couples recommendation and interpretability by providing actual paths in a knowledge graph. Our contributions include four aspects. We first highlight the significance of incorporating knowledge graphs into recommendation to formally define and interpret the reasoning process. Second, we propose a reinforcement learning (RL) approach featuring an innovative soft reward strategy, user-conditional action pruning and a multi-hop scoring function. Third, we design a policy-guided graph search algorithm to efficiently and effectively sample reasoning paths for recommendation. Finally, we extensively evaluate our method on several large-scale real-world benchmark datasets, obtaining favorable results compared with state-of-the-art methods.
Tasks	Causal Inference, Decision Making, Knowledge Graphs
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05237v1
PDF	https://arxiv.org/pdf/1906.05237v1.pdf
PWC	https://paperswithcode.com/paper/reinforcement-knowledge-graph-reasoning-for
Repo	https://github.com/orcax/PGPR
Framework	pytorch

Constructing Artificial Data for Fine-tuning for Low-Resource Biomedical Text Tagging with Applications in PICO Annotation


Title	Constructing Artificial Data for Fine-tuning for Low-Resource Biomedical Text Tagging with Applications in PICO Annotation
Authors	Gaurav Singh, Zahra Sabet, John Shawe-Taylor, James Thomas
Abstract	Biomedical text tagging systems are plagued by the dearth of labeled training data. There have been recent attempts at using pre-trained encoders to deal with this issue. A pre-trained encoder provides a representation of the input text, which is then fed to task-specific layers for classification. The entire network is fine-tuned on the labeled data from the target task. Unfortunately, a low-resource biomedical task often has too few labeled instances for satisfactory fine-tuning. Also, if the label space is large, it contains few or no labeled instances for the majority of the labels. Most biomedical tagging systems treat labels as indexes, ignoring the fact that these labels are often concepts expressed in natural language, e.g. ‘Appearance of lesion on brain imaging’. To address these issues, we propose constructing extra labeled instances using label-text (i.e. label’s name) as input for the corresponding label-index (i.e. label’s index). In fact, we propose a number of strategies for manufacturing multiple artificial labeled instances from a single label. The network is then fine-tuned on a combination of real and these newly constructed artificial labeled instances. We evaluate the proposed approach on an important low-resource biomedical task called PICO annotation, which requires tagging raw text describing clinical trials with labels corresponding to different aspects of the trial, i.e. PICO (Population, Intervention/Control, Outcome) characteristics of the trial. Our empirical results show that the proposed method achieves a new state-of-the-art performance for PICO annotation with very significant improvements over competitive baselines.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09255v3
PDF	https://arxiv.org/pdf/1910.09255v3.pdf
PWC	https://paperswithcode.com/paper/constructing-artificial-data-for-fine-tuning
Repo	https://github.com/gauravsc/pico-tagging
Framework	pytorch

A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet


Title	A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet
Authors	David Ditter, Timo Gerkmann
Abstract	In this work, we investigate if the learned encoder of the end-to-end convolutional time domain audio separation network (Conv-TasNet) is the key to its recent success, or if the encoder can just as well be replaced by a deterministic hand-crafted filterbank. Motivated by the resemblance of the trained encoder of Conv-TasNet to auditory filterbanks, we propose to employ a deterministic gammatone filterbank. In contrast to a common gammatone filterbank, our filters are restricted to 2 ms length to allow for low-latency processing. Inspired by the encoder learned by Conv-TasNet, in addition to the logarithmically spaced filters, the proposed filterbank holds multiple gammatone filters at the same center frequency with varying phase shifts. We show that replacing the learned encoder with our proposed multi-phase gammatone filterbank (MP-GTF) even leads to a scale-invariant source-to-noise ratio (SI-SNR) improvement of 0.7 dB. Furthermore, in contrast to using the learned encoder we show that the number of filters can be reduced from 512 to 128 without loss of performance.
Tasks	Speech Separation
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11615v2
PDF	https://arxiv.org/pdf/1910.11615v2.pdf
PWC	https://paperswithcode.com/paper/a-multi-phase-gammatone-filterbank-for-speech
Repo	https://github.com/sp-uhh/mp-gtf
Framework	none
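
The abstract reports its result as a scale-invariant source-to-noise ratio (SI-SNR) improvement. For readers unfamiliar with the metric, here is a minimal sketch of the commonly used definition: both signals are made zero-mean, the estimate is projected onto the target, and the energy of that projection is compared with the energy of the residual. The function name `si_snr` and the toy signals are assumptions, and the sketch does not guard against a perfect estimate, where the residual energy is zero.

```python
import math

def si_snr(estimate, target):
    """Scale-invariant source-to-noise ratio in dB between two equal-length signals."""
    n = len(target)
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]   # remove DC offset
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target to isolate the "signal" part.
    scale = sum(e * t for e, t in zip(est, tgt)) / sum(t * t for t in tgt)
    projection = [scale * t for t in tgt]
    residual = [e - p for e, p in zip(est, projection)]
    signal_energy = sum(p * p for p in projection)
    noise_energy = sum(r * r for r in residual)
    return 10.0 * math.log10(signal_energy / noise_energy)

# Toy signals (assumed): a slightly perturbed copy of the target.
target = [1.0, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 0.8, -1.2]
score = si_snr(estimate, target)
rescaled_score = si_snr([2.0 * x for x in estimate], target)
```

Rescaling the estimate leaves the score unchanged, which is the property that gives the metric its name.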

Understanding Attention and Generalization in Graph Neural Networks


Title	Understanding Attention and Generalization in Graph Neural Networks
Authors	Boris Knyazev, Graham W. Taylor, Mohamed R. Amer
Abstract	We aim to better understand attention over nodes in graph neural networks (GNNs) and identify factors influencing its effectiveness. We particularly focus on the ability of attention GNNs to generalize to larger, more complex or noisy graphs. Motivated by insights from the work on Graph Isomorphism Networks, we design simple graph reasoning tasks that allow us to study attention in a controlled environment. We find that under typical conditions the effect of attention is negligible or even harmful, but under certain conditions it provides an exceptional gain in performance of more than 60% in some of our classification tasks. Satisfying these conditions in practice is challenging and often requires optimal initialization or supervised training of attention. We propose an alternative recipe and train attention in a weakly-supervised fashion that approaches the performance of supervised models, and, compared to unsupervised models, improves results on several synthetic as well as real datasets. Source code and datasets are available at https://github.com/bknyaz/graph_attention_pool.
Tasks	Graph Classification
Published	2019-05-08
URL	https://arxiv.org/abs/1905.02850v3
PDF	https://arxiv.org/pdf/1905.02850v3.pdf
PWC	https://paperswithcode.com/paper/understanding-attention-in-graph-neural
Repo	https://github.com/bknyaz/graph_attention_pool
Framework	pytorch

Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples


Title	Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples
Authors	Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Utku Evci, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle
Abstract	Few-shot classification refers to learning a classifier for new classes given only a few examples. While a plethora of models have emerged to tackle it, we find the procedure and datasets that are used to assess their progress lacking. To address this limitation, we propose Meta-Dataset: a new benchmark for training and evaluating models that is large-scale, consists of diverse datasets, and presents more realistic tasks. We experiment with popular baselines and meta-learners on Meta-Dataset, along with a competitive method that we propose. We analyze performance as a function of various characteristics of test tasks and examine the models’ ability to leverage diverse training sources for improving their generalization. We also propose a new set of baselines for quantifying the benefit of meta-learning in Meta-Dataset. Our extensive experimentation has uncovered important research challenges and we hope to inspire work in these directions.
Tasks	Meta-Learning
Published	2019-03-07
URL	https://arxiv.org/abs/1903.03096v3
PDF	https://arxiv.org/pdf/1903.03096v3.pdf
PWC	https://paperswithcode.com/paper/meta-dataset-a-dataset-of-datasets-for
Repo	https://github.com/cambridge-mlg/cnaps
Framework	pytorch

Anomaly Detection in Video Sequence with Appearance-Motion Correspondence


Title	Anomaly Detection in Video Sequence with Appearance-Motion Correspondence
Authors	Trong Nguyen Nguyen, Jean Meunier
Abstract	Anomaly detection in surveillance videos is currently a challenge because of the diversity of possible events. We propose a deep convolutional neural network (CNN) that addresses this problem by learning a correspondence between common object appearances (e.g. pedestrian, background, tree, etc.) and their associated motions. Our model is designed as a combination of a reconstruction network and an image translation model that share the same encoder. The former sub-network determines the most significant structures that appear in video frames and the latter one attempts to associate motion templates to such structures. The training stage is performed using only videos of normal events and the model is then capable of estimating frame-level scores for an unknown input. The experiments on 6 benchmark datasets demonstrate the competitive performance of the proposed approach with respect to state-of-the-art methods.
Tasks	Anomaly Detection, Anomaly Detection In Surveillance Videos
Published	2019-08-17
URL	https://arxiv.org/abs/1908.06351v1
PDF	https://arxiv.org/pdf/1908.06351v1.pdf
PWC	https://paperswithcode.com/paper/anomaly-detection-in-video-sequence-with
Repo	https://github.com/nguyetn89/Anomaly_detection_ICCV2019
Framework	tf