Paper Group AWR 265
Continuous Meta-Learning without Tasks. Exact Combinatorial Optimization with Graph Convolutional Neural Networks. Addressing Model Vulnerability to Distributional Shifts over Image Transformation Sets. Metric Learning for Dynamic Text Classification. Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling. Rethinking Zero-Shot Learning: A Conditional Visual Classification Perspective. A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers. $360^\circ$ Surface Regression with a Hyper-Sphere Loss. Deep Reinforcement Learning Control of Quantum Cartpoles. Reinforcement Knowledge Graph Reasoning for Explainable Recommendation. Constructing Artificial Data for Fine-tuning for Low-Resource Biomedical Text Tagging with Applications in PICO Annotation. A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet. Understanding Attention and Generalization in Graph Neural Networks. Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. Anomaly Detection in Video Sequence with Appearance-Motion Correspondence.
Continuous Meta-Learning without Tasks
Title | Continuous Meta-Learning without Tasks |
Authors | James Harrison, Apoorva Sharma, Chelsea Finn, Marco Pavone |
Abstract | Meta-learning is a promising strategy for learning to efficiently learn within new tasks, using data gathered from a distribution of tasks. However, the meta-learning literature thus far has focused on the task-segmented setting, where at train-time, offline data is assumed to be split according to the underlying task, and at test-time, the algorithms are optimized to learn in a single task. In this work, we enable the application of generic meta-learning algorithms to settings where this task segmentation is unavailable, such as continual online learning with a time-varying task. We present meta-learning via online changepoint analysis (MOCA), an approach which augments a meta-learning algorithm with a differentiable Bayesian changepoint detection scheme. The framework allows both training and testing directly on time series data without segmenting it into discrete tasks. We demonstrate the utility of this approach on a nonlinear meta-regression benchmark as well as two meta-image-classification benchmarks. |
Tasks | Image Classification, Meta-Learning, Time Series |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08866v1 |
PDF | https://arxiv.org/pdf/1912.08866v1.pdf |
PWC | https://paperswithcode.com/paper/continuous-meta-learning-without-tasks-1 |
Repo | https://github.com/StanfordASL/moca |
Framework | pytorch |
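The changepoint machinery MOCA builds on is the classic Bayesian online changepoint detection recursion. Below is a minimal numpy sketch of that recursion for a scalar Gaussian stream, assuming a fixed hazard rate and a conjugate Gaussian model in place of MOCA's meta-learned predictive; it illustrates the idea rather than reproducing the paper's implementation.

```python
import numpy as np

def bocpd_gaussian(x, hazard=0.05, mu0=0.0, var0=1.0, obs_var=1.0):
    """Run-length filtering for a scalar Gaussian stream with unknown mean."""
    log_r = np.array([0.0])                      # log P(run length); starts at r=0
    mu, var = np.array([mu0]), np.array([var0])  # posterior mean/var per run length
    history = []
    for xt in x:
        # predictive probability of x_t under each run length's posterior
        pred_var = var + obs_var
        log_pred = -0.5 * (np.log(2 * np.pi * pred_var) + (xt - mu) ** 2 / pred_var)
        # run continues (no changepoint) vs. run resets (changepoint)
        log_grow = log_r + log_pred + np.log(1.0 - hazard)
        log_cp = np.logaddexp.reduce(log_r + log_pred + np.log(hazard))
        log_r = np.concatenate(([log_cp], log_grow))
        log_r -= np.logaddexp.reduce(log_r)      # normalise the posterior
        # conjugate Gaussian update for every surviving run length
        new_var = 1.0 / (1.0 / var + 1.0 / obs_var)
        new_mu = new_var * (mu / var + xt / obs_var)
        mu, var = np.concatenate(([mu0], new_mu)), np.concatenate(([var0], new_var))
        history.append(np.exp(log_r))
    return history

# A stream whose mean jumps halfway through: posterior mass should
# collapse back toward run length 0 near the jump.
stream = np.concatenate([np.random.normal(0, 1, 50), np.random.normal(4, 1, 50)])
posteriors = bocpd_gaussian(stream)
print(posteriors[60].argmax())  # most likely run length shortly after the change
```

MOCA swaps the fixed Gaussian model for the meta-learner's posterior predictive and backpropagates through this recursion during training.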
Exact Combinatorial Optimization with Graph Convolutional Neural Networks
Title | Exact Combinatorial Optimization with Graph Convolutional Neural Networks |
Authors | Maxime Gasse, Didier Chételat, Nicola Ferroni, Laurent Charlin, Andrea Lodi |
Abstract | Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose a new graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs. We train our model via imitation learning from the strong branching expert rule, and demonstrate on a series of hard problems that our approach produces policies that improve upon state-of-the-art machine-learning methods for branching and generalize to instances significantly larger than seen during training. Moreover, we improve for the first time over expert-designed branching rules implemented in a state-of-the-art solver on large problems. Code for reproducing all the experiments can be found at https://github.com/ds4dm/learn2branch. |
Tasks | Combinatorial Optimization, Imitation Learning |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01629v3 |
PDF | https://arxiv.org/pdf/1906.01629v3.pdf |
PWC | https://paperswithcode.com/paper/exact-combinatorial-optimization-with-graph |
Repo | https://github.com/ds4dm/learn2branch |
Framework | tf |
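To make the bipartite representation concrete, here is a rough numpy sketch of one round of the two "half-convolutions" over the variable-constraint graph, with random matrices standing in for learned weights; the actual model adds normalisation and much richer input features.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def bipartite_half_convolutions(V, C, A, d_out=16):
    """One variable->constraint pass followed by one constraint->variable pass.

    V: (n_vars, d) variable features, C: (n_cons, d) constraint features,
    A: (n_cons, n_vars) MILP coefficient matrix used as bipartite adjacency.
    """
    d = V.shape[1]
    W_msg_v, W_self_c = rng.normal(size=(d, d_out)), rng.normal(size=(d, d_out))
    W_msg_c, W_self_v = rng.normal(size=(d_out, d_out)), rng.normal(size=(d, d_out))
    C2 = relu(A @ V @ W_msg_v + C @ W_self_c)    # constraints gather from variables
    V2 = relu(A.T @ C2 @ W_msg_c + V @ W_self_v) # variables gather from constraints
    return V2, C2

n_vars, n_cons, d = 5, 3, 8
V = rng.normal(size=(n_vars, d))
C = rng.normal(size=(n_cons, d))
A = rng.integers(0, 2, size=(n_cons, n_vars)).astype(float)
V2, _ = bipartite_half_convolutions(V, C, A)
scores = V2.sum(axis=1)            # toy per-variable branching scores
print("branch on variable", int(scores.argmax()))
```

In the trained policy, the final per-variable scores imitate the strong-branching expert's ranking.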
Addressing Model Vulnerability to Distributional Shifts over Image Transformation Sets
Title | Addressing Model Vulnerability to Distributional Shifts over Image Transformation Sets |
Authors | Riccardo Volpi, Vittorio Murino |
Abstract | We are concerned with the vulnerability of computer vision models to distributional shifts. We formulate a combinatorial optimization problem that allows evaluating the regions in the image space where a given model is more vulnerable, in terms of image transformations applied to the input, and tackle it with standard search algorithms. We further embed this idea in a training procedure, where, over iterations, we define new data augmentation rules according to the image transformations that the current model is most vulnerable to. An empirical evaluation on classification and semantic segmentation problems suggests that the devised algorithm allows training models that are more robust against content-preserving image manipulations and, in general, against distributional shifts. |
Tasks | Combinatorial Optimization, Data Augmentation, Semantic Segmentation |
Published | 2019-03-28 |
URL | https://arxiv.org/abs/1903.11900v2 |
PDF | https://arxiv.org/pdf/1903.11900v2.pdf |
PWC | https://paperswithcode.com/paper/model-vulnerability-to-distributional-shifts |
Repo | https://github.com/ricvolpi/domain-shift-robustness |
Framework | tf |
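A toy version of the worst-case search over a transformation set, with a hypothetical loss function standing in for the model's loss; the paper searches far larger transformation sets with standard search algorithms rather than brute force, and the found transformations become new augmentation rules for the next training iteration.

```python
import itertools
import numpy as np

# Toy transformation set: each op maps an image array in [0, 1] to another.
TRANSFORMS = {
    "identity": lambda im: im,
    "brighten": lambda im: np.clip(im * 1.4, 0, 1),
    "darken":   lambda im: np.clip(im * 0.6, 0, 1),
    "invert":   lambda im: 1.0 - im,
}

def worst_case_chain(image, label, loss_fn, chain_len=2):
    """Exhaustively search chains of transformations for the highest loss."""
    worst, worst_loss = None, -np.inf
    for chain in itertools.product(TRANSFORMS, repeat=chain_len):
        out = image
        for name in chain:
            out = TRANSFORMS[name](out)
        loss = loss_fn(out, label)
        if loss > worst_loss:
            worst, worst_loss = chain, loss
    return worst, worst_loss

image = np.random.rand(8, 8)
loss = lambda im, y: float(abs(im.mean() - y))  # hypothetical model loss
print(worst_case_chain(image, label=0.5, loss_fn=loss))
```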
Metric Learning for Dynamic Text Classification
Title | Metric Learning for Dynamic Text Classification |
Authors | Jeremy Wohlwend, Ethan R. Elenberg, Samuel Altschul, Shawn Henry, Tao Lei |
Abstract | Traditional text classifiers are limited to predicting over a fixed set of labels. However, in many real-world applications the label set is frequently changing. For example, in intent classification, new intents may be added over time while others are removed. We propose to address the problem of dynamic text classification by replacing the traditional, fixed-size output layer with a learned, semantically meaningful metric space. Here the distances between textual inputs are optimized to perform nearest-neighbor classification across overlapping label sets. Changing the label set does not involve removing parameters, but rather simply adding or removing support points in the metric space. Then the learned metric can be fine-tuned with only a few additional training examples. We demonstrate that this simple strategy is robust to changes in the label space. Furthermore, our results show that learning a non-Euclidean metric can improve performance in the low data regime, suggesting that further work on metric spaces may benefit low-resource research. |
Tasks | Intent Classification, Metric Learning, Text Classification |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01026v1 |
PDF | https://arxiv.org/pdf/1911.01026v1.pdf |
PWC | https://paperswithcode.com/paper/metric-learning-for-dynamic-text-1 |
Repo | https://github.com/asappresearch/dynamic-classification |
Framework | none |
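The mechanics of the proposal can be sketched in a few lines: classification is nearest-prototype search in an embedding space, and label-set changes only edit support points, never output-layer parameters. The encoder below is a random projection standing in for the learned metric encoder.

```python
import numpy as np

class SupportPointClassifier:
    """Nearest-class-mean classification in an embedding space."""

    def __init__(self, encode):
        self.encode = encode
        self.prototypes = {}  # label -> class mean embedding

    def add_label(self, label, examples):
        vecs = np.stack([self.encode(x) for x in examples])
        self.prototypes[label] = vecs.mean(axis=0)

    def remove_label(self, label):
        self.prototypes.pop(label, None)

    def predict(self, text):
        z = self.encode(text)
        return min(self.prototypes,
                   key=lambda lab: np.linalg.norm(z - self.prototypes[lab]))

rng = np.random.default_rng(0)
proj = rng.normal(size=(26, 4))  # stand-in for the learned text encoder
encode = lambda s: np.array(
    [s.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]) @ proj

clf = SupportPointClassifier(encode)
clf.add_label("greeting", ["hello there", "hi, good morning"])
clf.add_label("refund", ["I want my money back", "refund my order"])
print(clf.predict("hey hello"))
clf.remove_label("refund")       # label set shrinks with no retraining
```

The paper's contribution is in learning the metric (including non-Euclidean, hyperbolic variants) so that this simple prototype scheme stays accurate as labels change.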
Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling
Title | Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling |
Authors | Xiaochuang Han, Jacob Eisenstein |
Abstract | Contextualized word embeddings such as ELMo and BERT provide a foundation for strong performance across a wide range of natural language processing tasks by pretraining on large corpora of unlabeled text. However, the applicability of this approach is unknown when the target domain varies substantially from the pretraining corpus. We are specifically interested in the scenario in which labeled data is available in only a canonical source domain such as news text, and the target domain is distinct from both the labeled and pretraining texts. To address this scenario, we propose domain-adaptive fine-tuning, in which the contextualized embeddings are adapted by masked language modeling on text from the target domain. We test this approach on sequence labeling in two challenging domains: Early Modern English and Twitter. Both domains differ substantially from existing pretraining corpora, and domain-adaptive fine-tuning yields substantial improvements over strong BERT baselines, with particularly impressive results on out-of-vocabulary words. We conclude that domain-adaptive fine-tuning offers a simple and effective approach for the unsupervised adaptation of sequence labeling to difficult new domains. |
Tasks | Domain Adaptation, Language Modelling, Part-Of-Speech Tagging, Unsupervised Domain Adaptation, Word Embeddings |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1904.02817v2 |
PDF | https://arxiv.org/pdf/1904.02817v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptation-of |
Repo | https://github.com/xhan77/AdaptaBERT |
Framework | pytorch |
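A minimal sketch of domain-adaptive fine-tuning using the Hugging Face transformers API (not the paper's original codebase): continue masked language modelling on unlabeled target-domain text, then reuse the adapted encoder for the sequence-labeling head. The example texts are placeholders.

```python
import torch
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

# Unlabeled target-domain text (placeholders for Early Modern English / Twitter).
target_texts = ["thou art come to the abbey", "lol new phone who dis"]
batch = tokenizer(target_texts, padding=True, truncation=True, return_tensors="pt")
features = [{"input_ids": ids} for ids in batch["input_ids"]]
masked = collator(features)   # randomly masks 15% of tokens, builds MLM labels

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(input_ids=masked["input_ids"], labels=masked["labels"]).loss
loss.backward()
optimizer.step()
# After (many) such steps, the adapted encoder replaces the vanilla one
# under a sequence-labeling head trained on source-domain labels.
```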
Rethinking Zero-Shot Learning: A Conditional Visual Classification Perspective
Title | Rethinking Zero-Shot Learning: A Conditional Visual Classification Perspective |
Authors | Kai Li, Martin Renqiang Min, Yun Fu |
Abstract | Zero-shot learning (ZSL) aims to recognize instances of unseen classes solely based on the semantic descriptions of the classes. Existing algorithms usually formulate it as a semantic-visual correspondence problem, by learning mappings from one feature space to the other. Despite being reasonable, previous approaches essentially discard the highly precious discriminative power of visual features in an implicit way, and thus produce undesirable results. We instead reformulate ZSL as a conditioned visual classification problem, i.e., classifying visual features based on the classifiers learned from the semantic descriptions. With this reformulation, we develop algorithms targeting various ZSL settings: For the conventional setting, we propose to train a deep neural network that directly generates visual feature classifiers from the semantic attributes with an episode-based training scheme; For the generalized setting, we concatenate the learned highly discriminative classifiers for seen classes and the generated classifiers for unseen classes to classify visual features of all classes; For the transductive setting, we exploit unlabeled data to effectively calibrate the classifier generator using a novel learning-without-forgetting self-training mechanism and guide the process by a robust generalized cross-entropy loss. Extensive experiments show that our proposed algorithms significantly outperform state-of-the-art methods by large margins on most benchmark datasets in all the ZSL settings. Our code is available at https://github.com/kailigo/cvcZSL. |
Tasks | Zero-Shot Learning |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.05995v2 |
PDF | https://arxiv.org/pdf/1909.05995v2.pdf |
PWC | https://paperswithcode.com/paper/rethinking-zero-shot-learning-a-conditional |
Repo | https://github.com/kailigo/cvcZSL |
Framework | pytorch |
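The conditional-visual-classification view can be sketched as a network that maps class attributes to classifier weights and scores visual features directly. The dimensions below match common ZSL setups (85-d attributes, 2048-d CNN features) but are otherwise illustrative, and the paper's episode-based training scheme is omitted.

```python
import torch
import torch.nn as nn

class ClassifierGenerator(nn.Module):
    """Maps class semantic attributes to visual-feature classifier weights."""

    def __init__(self, attr_dim, feat_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(attr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, attributes, features):
        W = self.net(attributes)                # (n_classes, feat_dim)
        W = nn.functional.normalize(W, dim=-1)  # cosine-style classifiers
        return features @ W.t()                 # (batch, n_classes) logits

gen = ClassifierGenerator(attr_dim=85, feat_dim=2048)
attrs = torch.randn(10, 85)    # semantic descriptions of 10 (unseen) classes
feats = torch.randn(4, 2048)   # visual features from a frozen CNN
logits = gen(attrs, feats)     # classify visual features directly
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (4,)))
```

For the generalized setting, the generated unseen-class weight rows would simply be concatenated with the learned seen-class classifiers.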
A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers
Title | A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers |
Authors | Rafael Gomes Mantovani, André Luis Debiaso Rossi, Edesio Alcobaça, Joaquin Vanschoren, André Carlos Ponce de Leon Ferreira de Carvalho |
Abstract | For many machine learning algorithms, predictive performance is critically affected by the hyperparameter values used to train them. However, tuning these hyperparameters can come at a high computational cost, especially on larger datasets, while the tuned settings do not always significantly outperform the default values. This paper proposes a recommender system based on meta-learning to identify exactly when it is better to use default values and when to tune hyperparameters for each new dataset. An extensive analysis of different categories of meta-features, meta-learners, and setups across 156 datasets is performed, together with an in-depth analysis of what the meta-learners take into account in their decisions, providing useful insights. Results show that it is possible to accurately predict when tuning will significantly improve the performance of the induced models. The proposed system reduces the time spent on optimization processes without reducing the predictive performance of the induced models (when compared with the ones obtained using tuned hyperparameters). We also explain the decision-making process of the meta-learners in terms of linear separability-based hypotheses. Although this analysis is focused on the tuning of Support Vector Machines, it can also be applied to other algorithms, as shown in experiments performed with decision trees. |
Tasks | Decision Making, Meta-Learning, Recommendation Systems |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01684v2 |
PDF | https://arxiv.org/pdf/1906.01684v2.pdf |
PWC | https://paperswithcode.com/paper/a-meta-learning-recommender-system-for |
Repo | https://github.com/rgmantovani/mtlSuite |
Framework | none |
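The recommender's core loop reduces to training a meta-learner on dataset meta-features to predict a binary "tuning helps" label. A sketch with scikit-learn, using synthetic stand-ins for the meta-features and labels extracted from the 156 datasets:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Stand-ins for simple/statistical/landmarking meta-features per dataset.
meta_features = rng.normal(size=(156, 20))
# Stand-in label: did tuning significantly beat SVM defaults on this dataset?
tuning_helped = (meta_features[:, 0] + 0.3 * rng.normal(size=156)) > 0

meta_learner = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(meta_learner, meta_features, tuning_helped, cv=5).mean())

meta_learner.fit(meta_features, tuning_helped)
new_dataset = rng.normal(size=(1, 20))  # meta-features of an unseen dataset
if meta_learner.predict(new_dataset)[0]:
    print("recommend: run hyperparameter tuning")
else:
    print("recommend: keep SVM defaults")
```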
$360^\circ$ Surface Regression with a Hyper-Sphere Loss
Title | $360^\circ$ Surface Regression with a Hyper-Sphere Loss |
Authors | Antonis Karakottas, Nikolaos Zioulis, Stamatis Samaras, Dimitrios Ataloglou, Vasileios Gkitsas, Dimitrios Zarpalas, Petros Daras |
Abstract | Omnidirectional vision is becoming increasingly relevant as more efficient $360^\circ$ image acquisition is now possible. However, the lack of annotated $360^\circ$ datasets has hindered the application of deep learning techniques on spherical content. This is further exacerbated on tasks where ground truth acquisition is difficult, such as monocular surface estimation. While recent research approaches in the 2D domain overcome this challenge by relying on generating normals from depth cues using RGB-D sensors, this is very difficult to apply on the spherical domain. In this work, we address the unavailability of sufficient $360^\circ$ ground truth normal data by leveraging existing 3D datasets and remodelling them via rendering. We present a dataset of $360^\circ$ images of indoor spaces with their corresponding ground truth surface normals, and train a deep convolutional neural network (CNN) on the task of monocular $360^\circ$ surface estimation. We achieve this by minimizing a novel angular loss function defined on the hyper-sphere using simple quaternion algebra. We make an effort to compare appropriately with other state-of-the-art methods trained on planar datasets and, finally, present the practical applicability of our trained model on a spherical image re-lighting task using completely unseen data, qualitatively showing the promising generalization ability of our dataset and model. The dataset is available at: vcl3d.github.io/HyperSphereSurfaceRegression. |
Tasks | Surface Normals Estimation |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07043v1 |
PDF | https://arxiv.org/pdf/1909.07043v1.pdf |
PWC | https://paperswithcode.com/paper/360o-surface-regression-with-a-hyper-sphere |
Repo | https://github.com/VCL3D/SphericalViewSynthesis |
Framework | pytorch |
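A simplified form of the angular loss: both normal maps are projected onto the unit sphere and penalised by the geodesic angle between them. The paper derives this quantity with quaternion algebra, which is omitted here.

```python
import torch

def angular_normal_loss(pred, gt, eps=1e-7):
    """Mean angular error between predicted and ground-truth surface normals.

    pred, gt: (B, 3, H, W) normal maps; both are normalised to unit length,
    then the per-pixel angle between them is averaged.
    """
    pred = torch.nn.functional.normalize(pred, dim=1)
    gt = torch.nn.functional.normalize(gt, dim=1)
    cos = (pred * gt).sum(dim=1).clamp(-1 + eps, 1 - eps)  # keep acos differentiable
    return torch.acos(cos).mean()

pred = torch.randn(2, 3, 4, 4, requires_grad=True)
gt = torch.randn(2, 3, 4, 4)
angular_normal_loss(pred, gt).backward()
```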
Deep Reinforcement Learning Control of Quantum Cartpoles
Title | Deep Reinforcement Learning Control of Quantum Cartpoles |
Authors | Zhikang T. Wang, Yuto Ashida, Masahito Ueda |
Abstract | We generalize a standard benchmark of reinforcement learning, the classical cartpole balancing problem, to the quantum regime by stabilizing a particle in an unstable potential through measurement and feedback. We use state-of-the-art deep reinforcement learning to stabilize the quantum cartpole and find that our deep learning approach performs comparably to or better than other strategies in standard control theory. Our approach also applies to measurement-feedback cooling of quantum oscillators, showing the applicability of deep learning to general continuous-space quantum control. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09200v2 |
PDF | https://arxiv.org/pdf/1910.09200v2.pdf |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-control-of |
Repo | https://github.com/Z-T-WANG/DeepReinforcementLearningControlOfQuantumCartpoles |
Framework | none |
Reinforcement Knowledge Graph Reasoning for Explainable Recommendation
Title | Reinforcement Knowledge Graph Reasoning for Explainable Recommendation |
Authors | Yikun Xian, Zuohui Fu, S. Muthukrishnan, Gerard de Melo, Yongfeng Zhang |
Abstract | Recent advances in personalized recommendation have sparked great interest in the exploitation of rich structured information provided by knowledge graphs. Unlike most existing approaches that only focus on leveraging knowledge graphs for more accurate recommendation, we perform explicit reasoning with knowledge for decision making so that the recommendations are generated and supported by an interpretable causal inference procedure. To this end, we propose a method called Policy-Guided Path Reasoning (PGPR), which couples recommendation and interpretability by providing actual paths in a knowledge graph. Our contributions include four aspects. We first highlight the significance of incorporating knowledge graphs into recommendation to formally define and interpret the reasoning process. Second, we propose a reinforcement learning (RL) approach featuring an innovative soft reward strategy, user-conditional action pruning and a multi-hop scoring function. Third, we design a policy-guided graph search algorithm to efficiently and effectively sample reasoning paths for recommendation. Finally, we extensively evaluate our method on several large-scale real-world benchmark datasets, obtaining favorable results compared with state-of-the-art methods. |
Tasks | Causal Inference, Decision Making, Knowledge Graphs |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05237v1 |
PDF | https://arxiv.org/pdf/1906.05237v1.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-knowledge-graph-reasoning-for |
Repo | https://github.com/orcax/PGPR |
Framework | pytorch |
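The path-reasoning step can be sketched as a beam search over the knowledge graph in which each hop is scored by a policy. Everything below (the toy graph, the random scorer) is a stand-in for PGPR's learned RL policy, soft rewards, and user-conditional action pruning.

```python
import random

random.seed(0)

KG = {  # head -> list of (relation, tail)
    "user:alice":  [("purchased", "item:camera"), ("mentions", "word:lens")],
    "item:camera": [("produced_by", "brand:acme"), ("described_by", "word:lens")],
    "word:lens":   [("described_by_rev", "item:tripod")],
    "brand:acme":  [("produces", "item:flash")],
}

def policy_score(path, relation, tail):
    return random.random()  # stand-in for the learned policy's log-probability

def beam_search(start, hops=2, beam=2):
    paths = [([start], 0.0)]
    for _ in range(hops):
        candidates = []
        for nodes, score in paths:
            for rel, tail in KG.get(nodes[-1], []):
                candidates.append((nodes + [rel, tail],
                                   score + policy_score(nodes, rel, tail)))
        paths = sorted(candidates, key=lambda p: -p[1])[:beam]
    return paths

for path, score in beam_search("user:alice"):
    print(" -> ".join(path), f"(score {score:.2f})")
# Each surviving path ends at a candidate item and doubles as the
# explanation for recommending it.
```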
Constructing Artificial Data for Fine-tuning for Low-Resource Biomedical Text Tagging with Applications in PICO Annotation
Title | Constructing Artificial Data for Fine-tuning for Low-Resource Biomedical Text Tagging with Applications in PICO Annotation |
Authors | Gaurav Singh, Zahra Sabet, John Shawe-Taylor, James Thomas |
Abstract | Biomedical text tagging systems are plagued by the dearth of labeled training data. There have been recent attempts at using pre-trained encoders to deal with this issue. The pre-trained encoder provides a representation of the input text which is then fed to task-specific layers for classification. The entire network is fine-tuned on the labeled data from the target task. Unfortunately, a low-resource biomedical task often has too few labeled instances for satisfactory fine-tuning. Also, if the label space is large, it contains few or no labeled instances for the majority of the labels. Most biomedical tagging systems treat labels as indexes, ignoring the fact that these labels are often concepts expressed in natural language, e.g. "Appearance of lesion on brain imaging". To address these issues, we propose constructing extra labeled instances using label text (i.e. the label's name) as input for the corresponding label index. In fact, we propose a number of strategies for manufacturing multiple artificial labeled instances from a single label. The network is then fine-tuned on a combination of real and these newly constructed artificial labeled instances. We evaluate the proposed approach on an important low-resource biomedical task called PICO annotation, which requires tagging raw text describing clinical trials with labels corresponding to different aspects of the trial, i.e. the PICO (Population, Intervention/Control, Outcome) characteristics of the trial. Our empirical results show that the proposed method achieves a new state-of-the-art performance for PICO annotation, with very significant improvements over competitive baselines. |
Tasks | |
Published | 2019-10-21 |
URL | https://arxiv.org/abs/1910.09255v3 |
PDF | https://arxiv.org/pdf/1910.09255v3.pdf |
PWC | https://paperswithcode.com/paper/constructing-artificial-data-for-fine-tuning |
Repo | https://github.com/gauravsc/pico-tagging |
Framework | pytorch |
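The data-construction trick itself is tiny: pair each label's name (and simple templated variants of it) with the label's index, then mix these artificial instances into the real training set. The two strategies below are simplified examples of the several proposed in the paper.

```python
# Label space: index -> label expressed in natural language.
label_space = {
    0: "Appearance of lesion on brain imaging",
    1: "Reduction in systolic blood pressure",
}

def artificial_instances(labels):
    data = []
    for idx, name in labels.items():
        data.append((name, idx))                                  # label text as-is
        data.append((f"The trial reports {name.lower()}.", idx))  # templated variant
    return data

real_data = [("MRI showed a new frontal lobe lesion", 0)]
train_set = real_data + artificial_instances(label_space)
for text, y in train_set:
    print(y, "|", text)
# The pre-trained encoder + task layers are then fine-tuned on this mix
# of real and artificial instances.
```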
A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet
Title | A Multi-Phase Gammatone Filterbank for Speech Separation via TasNet |
Authors | David Ditter, Timo Gerkmann |
Abstract | In this work, we investigate whether the learned encoder of the end-to-end convolutional time domain audio separation network (Conv-TasNet) is the key to its recent success, or whether the encoder can just as well be replaced by a deterministic hand-crafted filterbank. Motivated by the resemblance of the trained encoder of Conv-TasNet to auditory filterbanks, we propose to employ a deterministic gammatone filterbank. In contrast to a common gammatone filterbank, our filters are restricted to 2 ms length to allow for low-latency processing. Inspired by the encoder learned by Conv-TasNet, in addition to the logarithmically spaced filters, the proposed filterbank holds multiple gammatone filters at the same center frequency with varying phase shifts. We show that replacing the learned encoder with our proposed multi-phase gammatone filterbank (MP-GTF) even leads to a scale-invariant source-to-noise ratio (SI-SNR) improvement of 0.7 dB. Furthermore, in contrast to using the learned encoder, we show that the number of filters can be reduced from 512 to 128 without loss of performance. |
Tasks | Speech Separation |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11615v2 |
PDF | https://arxiv.org/pdf/1910.11615v2.pdf |
PWC | https://paperswithcode.com/paper/a-multi-phase-gammatone-filterbank-for-speech |
Repo | https://github.com/sp-uhh/mp-gtf |
Framework | none |
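A sketch of the filterbank construction: 2 ms gammatone impulse responses at log-spaced centre frequencies, with several phase-shifted copies per frequency. The ERB formula is the standard Glasberg and Moore one; the paper's exact frequency spacing, phase selection, and normalisation may differ.

```python
import numpy as np

def gammatone(fc, phase, fs=8000, length_ms=2.0, order=4):
    """One gammatone impulse response, truncated to 2 ms for low latency."""
    t = np.arange(int(fs * length_ms / 1000)) / fs
    erb = 24.7 + fc / 9.265  # Glasberg & Moore equivalent rectangular bandwidth
    g = (t ** (order - 1) * np.exp(-2 * np.pi * 1.019 * erb * t)
         * np.cos(2 * np.pi * fc * t + phase))
    return g / (np.linalg.norm(g) + 1e-12)

def multi_phase_gtf(n_filters=128, n_phases=4, f_lo=100.0, f_hi=3900.0, fs=8000):
    """Log-spaced centre frequencies, several phase-shifted copies each."""
    centers = np.geomspace(f_lo, f_hi, n_filters // n_phases)
    phases = np.linspace(0, np.pi, n_phases, endpoint=False)
    return np.stack([gammatone(fc, ph, fs=fs) for fc in centers for ph in phases])

bank = multi_phase_gtf()
print(bank.shape)  # (128, 16): 128 filters of 2 ms at 8 kHz
```

This bank would then replace Conv-TasNet's learned encoder as a fixed analysis transform.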
Understanding Attention and Generalization in Graph Neural Networks
Title | Understanding Attention and Generalization in Graph Neural Networks |
Authors | Boris Knyazev, Graham W. Taylor, Mohamed R. Amer |
Abstract | We aim to better understand attention over nodes in graph neural networks (GNNs) and identify factors influencing its effectiveness. We particularly focus on the ability of attention GNNs to generalize to larger, more complex or noisy graphs. Motivated by insights from the work on Graph Isomorphism Networks, we design simple graph reasoning tasks that allow us to study attention in a controlled environment. We find that under typical conditions the effect of attention is negligible or even harmful, but under certain conditions it provides an exceptional gain in performance of more than 60% in some of our classification tasks. Satisfying these conditions in practice is challenging and often requires optimal initialization or supervised training of attention. We propose an alternative recipe and train attention in a weakly-supervised fashion that approaches the performance of supervised models, and, compared to unsupervised models, improves results on several synthetic as well as real datasets. Source code and datasets are available at https://github.com/bknyaz/graph_attention_pool. |
Tasks | Graph Classification |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.02850v3 |
PDF | https://arxiv.org/pdf/1905.02850v3.pdf |
PWC | https://paperswithcode.com/paper/understanding-attention-in-graph-neural |
Repo | https://github.com/bknyaz/graph_attention_pool |
Framework | pytorch |
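The attention under study is, at its core, global attention pooling over nodes. A minimal PyTorch module is sketched below; in the paper these attention weights are additionally supervised (fully or weakly) rather than learned end-to-end alone.

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Global attention pooling: score nodes, softmax per graph, weighted sum."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h, mask):
        # h: (B, N, dim) node features; mask: (B, N), 1 for real nodes, 0 for padding
        logits = self.score(h).squeeze(-1).masked_fill(mask == 0, -1e9)
        alpha = torch.softmax(logits, dim=-1)           # per-graph node attention
        return (alpha.unsqueeze(-1) * h).sum(dim=1), alpha

pool = AttentionPool(dim=32)
h = torch.randn(2, 10, 32)
mask = torch.ones(2, 10)
graph_emb, alpha = pool(h, mask)
# In the weakly-supervised recipe, alpha is trained against coarse signals
# about which nodes matter, instead of exact per-node labels.
```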
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples
Title | Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples |
Authors | Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Utku Evci, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle |
Abstract | Few-shot classification refers to learning a classifier for new classes given only a few examples. While a plethora of models have emerged to tackle it, we find the procedure and datasets that are used to assess their progress lacking. To address this limitation, we propose Meta-Dataset: a new benchmark for training and evaluating models that is large-scale, consists of diverse datasets, and presents more realistic tasks. We experiment with popular baselines and meta-learners on Meta-Dataset, along with a competitive method that we propose. We analyze performance as a function of various characteristics of test tasks and examine the models’ ability to leverage diverse training sources for improving their generalization. We also propose a new set of baselines for quantifying the benefit of meta-learning in Meta-Dataset. Our extensive experimentation has uncovered important research challenges and we hope to inspire work in these directions. |
Tasks | Meta-Learning |
Published | 2019-03-07 |
URL | https://arxiv.org/abs/1903.03096v3 |
PDF | https://arxiv.org/pdf/1903.03096v3.pdf |
PWC | https://paperswithcode.com/paper/meta-dataset-a-dataset-of-datasets-for |
Repo | https://github.com/cambridge-mlg/cnaps |
Framework | pytorch |
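Episode construction is the heart of such benchmarks. Below is a simplified fixed-shot sampler; Meta-Dataset's actual episodes vary the number of ways and shots and draw classes from multiple source datasets.

```python
import random

def sample_episode(class_to_examples, n_way=5, k_shot=1, n_query=5):
    """Sample one few-shot episode: support and query sets over n_way classes."""
    classes = random.sample(sorted(class_to_examples), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        examples = random.sample(class_to_examples[cls], k_shot + n_query)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query

# Toy data: 50 classes with 20 example identifiers each.
data = {f"class_{i}": [f"img_{i}_{j}" for j in range(20)] for i in range(50)}
support, query = sample_episode(data)
print(len(support), len(query))  # 5, 25
```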
Anomaly Detection in Video Sequence with Appearance-Motion Correspondence
Title | Anomaly Detection in Video Sequence with Appearance-Motion Correspondence |
Authors | Trong Nguyen Nguyen, Jean Meunier |
Abstract | Anomaly detection in surveillance videos is currently a challenge because of the diversity of possible events. We propose a deep convolutional neural network (CNN) that addresses this problem by learning a correspondence between common object appearances (e.g. pedestrian, background, tree, etc.) and their associated motions. Our model is designed as a combination of a reconstruction network and an image translation model that share the same encoder. The former sub-network determines the most significant structures that appear in video frames and the latter attempts to associate motion templates to such structures. The training stage is performed using only videos of normal events and the model is then capable of estimating frame-level scores for an unknown input. The experiments on 6 benchmark datasets demonstrate the competitive performance of the proposed approach with respect to state-of-the-art methods. |
Tasks | Anomaly Detection, Anomaly Detection In Surveillance Videos |
Published | 2019-08-17 |
URL | https://arxiv.org/abs/1908.06351v1 |
PDF | https://arxiv.org/pdf/1908.06351v1.pdf |
PWC | https://paperswithcode.com/paper/anomaly-detection-in-video-sequence-with |
Repo | https://github.com/nguyetn89/Anomaly_detection_ICCV2019 |
Framework | tf |
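The architecture can be sketched as one encoder feeding two decoders, trained jointly on frame reconstruction and appearance-to-motion translation. Layer sizes and the 2-channel motion target below are placeholders, not the paper's exact design.

```python
import torch
import torch.nn as nn

class AppearanceMotionNet(nn.Module):
    """Shared encoder with a reconstruction decoder and a motion decoder."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )

        def decoder(out_ch):
            return nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, out_ch, 4, stride=2, padding=1),
            )

        self.reconstruct = decoder(3)  # appearance branch
        self.translate = decoder(2)    # motion branch (e.g. 2-channel flow)

    def forward(self, frame):
        z = self.encoder(frame)
        return self.reconstruct(z), self.translate(z)

net = AppearanceMotionNet()
frame = torch.randn(1, 3, 64, 64)
recon, flow = net(frame)
# Trained on normal videos only; at test time, large reconstruction or
# translation error yields a high frame-level anomaly score.
```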