April 2, 2020

Paper Group ANR 236

A Sample Selection Approach for Universal Domain Adaptation. Lifting Interpretability-Performance Trade-off via Automated Feature Engineering. Compositional ADAM: An Adaptive Compositional Solver. Local Nonparametric Meta-Learning. Geometric Dataset Distances via Optimal Transport. Revisiting Meta-Learning as Supervised Learning. Learning to Recons …

A Sample Selection Approach for Universal Domain Adaptation

Title A Sample Selection Approach for Universal Domain Adaptation
Authors Omri Lifshitz, Lior Wolf
Abstract We study the problem of unsupervised domain adaptation in the universal scenario, in which only some of the classes are shared between the source and target domains. We present a scoring scheme that is effective in identifying the samples of the shared classes. The score is used to select which samples in the target domain to pseudo-label during training. Another loss term encourages diversity of labels within each batch. Taken together, our method is shown to outperform, by a sizable margin, the current state of the art on the literature benchmarks.
Tasks Domain Adaptation
Published 2020-01-14
URL https://arxiv.org/abs/2001.05071v1
PDF https://arxiv.org/pdf/2001.05071v1.pdf
PWC https://paperswithcode.com/paper/a-sample-selection-approach-for-universal
Repo
Framework

Lifting Interpretability-Performance Trade-off via Automated Feature Engineering

Title Lifting Interpretability-Performance Trade-off via Automated Feature Engineering
Authors Alicja Gosiewska, Przemyslaw Biecek
Abstract Complex black-box predictive models may have high performance, but their lack of interpretability causes problems such as lack of trust, lack of stability, and sensitivity to concept drift. On the other hand, achieving satisfactory accuracy with interpretable models requires more time-consuming work related to feature engineering. Can we train interpretable and accurate models without time-consuming feature engineering? We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models. New models are created on newly engineered features extracted with the help of a surrogate model. We support the analysis with a large-scale benchmark on several tabular data sets from the OpenML database. There are two results: 1) extracting information from complex models may improve the performance of linear models, and 2) the benchmark calls into question the common myth that complex machine learning models outperform linear models.
Tasks Automated Feature Engineering, Feature Engineering
Published 2020-02-11
URL https://arxiv.org/abs/2002.04267v1
PDF https://arxiv.org/pdf/2002.04267v1.pdf
PWC https://paperswithcode.com/paper/lifting-interpretability-performance-trade
Repo
Framework

Compositional ADAM: An Adaptive Compositional Solver

Title Compositional ADAM: An Adaptive Compositional Solver
Authors Rasul Tutunov, Minne Li, Jun Wang, Haitham Bou-Ammar
Abstract In this paper, we present C-ADAM, the first adaptive solver for compositional problems involving a non-linear functional nesting of expected values. We prove that C-ADAM converges to a stationary point in $\mathcal{O}(\delta^{-2.25})$ with $\delta$ being a precision parameter. Moreover, we demonstrate the importance of our results by bridging, for the first time, model-agnostic meta-learning (MAML) and compositional optimisation, showing the fastest known rates for deep network adaptation to date. Finally, we validate our findings in a set of experiments from portfolio optimisation and meta-learning. Our results show significant sample complexity reductions compared to both standard and compositional solvers.
Tasks Meta-Learning
Published 2020-02-10
URL https://arxiv.org/abs/2002.03755v1
PDF https://arxiv.org/pdf/2002.03755v1.pdf
PWC https://paperswithcode.com/paper/compositional-adam-an-adaptive-compositional
Repo
Framework

Local Nonparametric Meta-Learning

Title Local Nonparametric Meta-Learning
Authors Wonjoon Goo, Scott Niekum
Abstract A central goal of meta-learning is to find a learning rule that enables fast adaptation across a set of tasks, by learning the appropriate inductive bias for that set. Most meta-learning algorithms try to find a global learning rule that encodes this inductive bias. However, a global learning rule represented by a fixed-size representation is prone to meta-underfitting or -overfitting since the right representational power for a task set is difficult to choose a priori. Even when chosen correctly, we show that global, fixed-size representations often fail when confronted with certain types of out-of-distribution tasks, even when the same inductive bias is appropriate. To address these problems, we propose a novel nonparametric meta-learning algorithm that utilizes a meta-trained local learning rule, building on recent ideas in attention-based and functional gradient-based meta-learning. In several meta-regression problems, we show improved meta-generalization results using our local, nonparametric approach and achieve state-of-the-art results in the robotics benchmark, Omnipush.
Tasks Meta-Learning
Published 2020-02-09
URL https://arxiv.org/abs/2002.03272v1
PDF https://arxiv.org/pdf/2002.03272v1.pdf
PWC https://paperswithcode.com/paper/local-nonparametric-meta-learning
Repo
Framework

Geometric Dataset Distances via Optimal Transport

Title Geometric Dataset Distances via Optimal Transport
Authors David Alvarez-Melis, Nicolò Fusi
Abstract The notion of task similarity is at the core of various machine learning paradigms, such as domain adaptation and meta-learning. Current methods to quantify it are often heuristic, make strong assumptions on the label sets across the tasks, and many are architecture-dependent, relying on task-specific optimal parameters (e.g., they require training a model on each dataset). In this work we propose an alternative notion of distance between datasets that (i) is model-agnostic, (ii) does not involve training, (iii) can compare datasets even if their label sets are completely disjoint and (iv) has solid theoretical footing. This distance relies on optimal transport, which provides it with rich geometry awareness, interpretable correspondences and well-understood properties. Our results show that this novel distance provides meaningful comparison of datasets, and correlates well with transfer learning hardness across various experimental settings and datasets.
Tasks Domain Adaptation, Meta-Learning, Transfer Learning
Published 2020-02-07
URL https://arxiv.org/abs/2002.02923v1
PDF https://arxiv.org/pdf/2002.02923v1.pdf
PWC https://paperswithcode.com/paper/geometric-dataset-distances-via-optimal
Repo
Framework
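
To make the optimal-transport ingredient concrete, here is a minimal sketch of an exact transport cost between two tiny, equal-size point sets with uniform weights. It is not the distance proposed in the paper: it compares features only and ignores labels, and the function name `ot_cost_uniform` and the toy points are assumptions made for illustration. With equal sizes and uniform masses an optimal plan can be taken to be a one-to-one matching, so brute force over permutations is enough at this scale.

```python
from itertools import permutations
import math


def ot_cost_uniform(xs, ys):
    """Exact OT cost between two equal-size point sets with uniform weights.

    Each point carries mass 1/n, so an optimal transport plan can be chosen
    as a permutation; we simply try every matching (only viable for tiny n).
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles equal-size point sets")
    n = len(xs)
    best = math.inf
    for perm in permutations(range(n)):
        # Average Euclidean distance moved under this matching.
        cost = sum(math.dist(xs[i], ys[perm[i]]) for i in range(n)) / n
        best = min(best, cost)
    return best


# Hypothetical toy "datasets": three 2-D feature vectors each.
source = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
shifted = [(x + 2.0, y) for x, y in source]

same_cost = ot_cost_uniform(source, source)
shift_cost = ot_cost_uniform(source, shifted)
print(same_cost, shift_cost)
```

For realistic dataset sizes a dedicated solver would replace the brute-force loop; the point here is only that the cost is zero for identical sets and grows as one set moves away from the other.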

Revisiting Meta-Learning as Supervised Learning

Title Revisiting Meta-Learning as Supervised Learning
Authors Wei-Lun Chao, Han-Jia Ye, De-Chuan Zhan, Mark Campbell, Kilian Q. Weinberger
Abstract Recent years have witnessed an abundance of new publications and approaches on meta-learning. This community-wide enthusiasm has sparked great insights but has also created a plethora of seemingly different frameworks, which can be hard to compare and evaluate. In this paper, we aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning. By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning. This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning. For example, we obtain a better understanding of generalization properties, and we can readily transfer well-understood techniques, such as model ensemble, pre-training, joint training, data augmentation, and even nearest neighbor based methods. We provide an intuitive analogy of these methods in the context of meta-learning and show that they give rise to significant improvements in model performance on few-shot learning.
Tasks Data Augmentation, Few-Shot Learning, Meta-Learning
Published 2020-02-03
URL https://arxiv.org/abs/2002.00573v1
PDF https://arxiv.org/pdf/2002.00573v1.pdf
PWC https://paperswithcode.com/paper/revisiting-meta-learning-as-supervised
Repo
Framework

Learning to Reconstruct Confocal Microscopy Stacks from Single Light Field Images

Title Learning to Reconstruct Confocal Microscopy Stacks from Single Light Field Images
Authors Josue Page, Federico Saltarin, Yury Belyaev, Ruth Lyck, Paolo Favaro
Abstract We present a novel deep learning approach to reconstruct confocal microscopy stacks from single light field images. To perform the reconstruction, we introduce the LFMNet, a novel neural network architecture inspired by the U-Net design. It is able to reconstruct with high accuracy a $112 \times 112 \times 57.6\,\mu m^3$ volume ($1287 \times 1287 \times 64$ voxels) in 50 ms given a single light field image of $1287 \times 1287$ pixels, thus reducing the time for confocal scanning of assays at the same volumetric resolution 720-fold and the required storage 64-fold. To prove the applicability in life sciences, our approach is evaluated both quantitatively and qualitatively on mouse brain slices with fluorescently labelled blood vessels. Because of the drastic reduction in scan time and storage space, our setup and method are directly applicable to real-time in vivo 3D microscopy. We provide analysis of the optical design, of the network architecture and of our training procedure to optimally reconstruct volumes for a given target depth range. To train our network, we built a data set of 362 light field images of mouse brain blood vessels and the corresponding aligned set of 3D confocal scans, which we use as ground truth. The data set will be made available for research purposes.
Tasks
Published 2020-03-24
URL https://arxiv.org/abs/2003.11004v1
PDF https://arxiv.org/pdf/2003.11004v1.pdf
PWC https://paperswithcode.com/paper/learning-to-reconstruct-confocal-microscopy
Repo
Framework

Multi-step Joint-Modality Attention Network for Scene-Aware Dialogue System

Title Multi-step Joint-Modality Attention Network for Scene-Aware Dialogue System
Authors Yun-Wei Chu, Kuan-Yen Lin, Chao-Chun Hsu, Lun-Wei Ku
Abstract Understanding dynamic scenes and dialogue contexts in order to converse with users has been challenging for multimodal dialogue systems. The 8th Dialog System Technology Challenge (DSTC8) proposed an Audio Visual Scene-Aware Dialog (AVSD) task, which contains multiple modalities including audio, vision, and language, to evaluate how dialogue systems understand different modalities and respond to users. In this paper, we propose a multi-step joint-modality attention network (JMAN) based on a recurrent neural network (RNN) to reason on videos. Our model performs a multi-step attention mechanism and jointly considers both visual and textual representations in each reasoning process to better integrate information from the two different modalities. Compared to the baseline released by the AVSD organizers, our model achieves relative improvements of 12.1% on ROUGE-L score and 22.4% on CIDEr score.
Tasks Scene-Aware Dialogue
Published 2020-01-17
URL https://arxiv.org/abs/2001.06206v1
PDF https://arxiv.org/pdf/2001.06206v1.pdf
PWC https://paperswithcode.com/paper/multi-step-joint-modality-attention-network
Repo
Framework

AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning

Title AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning
Authors Qijing Huang, Ameer Haj-Ali, William Moses, John Xiang, Ion Stoica, Krste Asanovic, John Wawrzynek
Abstract The performance of the code a compiler generates depends on the order in which it applies the optimization passes. Choosing a good order, often referred to as the phase-ordering problem, is NP-hard. As a result, existing solutions rely on a variety of heuristics. In this paper, we evaluate a new technique to address the phase-ordering problem: deep reinforcement learning. To this end, we implement AutoPhase: a framework that takes a program and uses deep reinforcement learning to find a sequence of compilation passes that minimizes its execution time. Without loss of generality, we construct this framework in the context of the LLVM compiler toolchain and target high-level synthesis programs. We use random forests to quantify the correlation between the effectiveness of a given pass and the program's features. This helps us reduce the search space by avoiding phase orderings that are unlikely to improve the performance of a given program. We compare the performance of AutoPhase to state-of-the-art algorithms that address the phase-ordering problem. In our evaluation, we show that AutoPhase improves circuit performance by 28% when compared to using the -O3 compiler flag, and achieves competitive results compared to the state-of-the-art solutions, while requiring fewer samples. Furthermore, unlike existing state-of-the-art solutions, our deep reinforcement learning solution shows promising results in generalizing to real benchmarks and 12,874 different randomly generated programs, after training on a hundred randomly generated programs.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.00671v2
PDF https://arxiv.org/pdf/2003.00671v2.pdf
PWC https://paperswithcode.com/paper/autophase-juggling-hls-phase-orderings-in
Repo
Framework

Reachability Analysis for Feed-Forward Neural Networks using Face Lattices

Title Reachability Analysis for Feed-Forward Neural Networks using Face Lattices
Authors Xiaodong Yang, Hoang-Dung Tran, Weiming Xiang, Taylor Johnson
Abstract Deep neural networks have been widely applied as an effective approach to handle complex and practical problems. However, one of the most fundamental open problems is the lack of formal methods to analyze the safety of their behaviors. To address this challenge, we propose a parallelizable technique to compute exact reachable sets of a neural network for a given input set. Our method currently focuses on feed-forward neural networks with ReLU activation functions. One of the primary challenges for polytope-based approaches is identifying the intersection between intermediate polytopes and hyperplanes from neurons. In this regard, we present a new approach to construct the polytopes with the face lattice, a complete combinatorial structure. The correctness and performance of our methodology are evaluated by verifying the safety of ACAS Xu networks and other benchmarks. Compared to state-of-the-art methods such as Reluplex, Marabou, and NNV, our approach is significantly more efficient. Additionally, our approach is capable of constructing the complete input set given an output set, so that any input that leads to a safety violation can be tracked.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.01226v1
PDF https://arxiv.org/pdf/2003.01226v1.pdf
PWC https://paperswithcode.com/paper/reachability-analysis-for-feed-forward-neural
Repo
Framework

Gaussian Hierarchical Latent Dirichlet Allocation: Bringing Polysemy Back

Title Gaussian Hierarchical Latent Dirichlet Allocation: Bringing Polysemy Back
Authors Takahiro Yoshida, Ryohei Hisano, Takaaki Ohnishi
Abstract Topic models are widely used to discover the latent representation of a set of documents. The two canonical models are latent Dirichlet allocation and Gaussian latent Dirichlet allocation, where the former uses multinomial distributions over words, and the latter uses multivariate Gaussian distributions over pre-trained word embedding vectors as the latent topic representations. Compared with latent Dirichlet allocation, Gaussian latent Dirichlet allocation is limited in the sense that it does not capture the polysemy of a word such as "bank." In this paper, we show that Gaussian latent Dirichlet allocation could recover the ability to capture polysemy by introducing a hierarchical structure in the set of topics that the model can use to represent a given document. Our Gaussian hierarchical latent Dirichlet allocation significantly improves polysemy detection compared with Gaussian-based models and provides more parsimonious topic representations compared with hierarchical latent Dirichlet allocation. Our extensive quantitative experiments show that our model also achieves better topic coherence and held-out document predictive accuracy over a wide range of corpora and word embedding vectors.
Tasks Topic Models
Published 2020-02-25
URL https://arxiv.org/abs/2002.10855v1
PDF https://arxiv.org/pdf/2002.10855v1.pdf
PWC https://paperswithcode.com/paper/gaussian-hierarchical-latent-dirichlet
Repo
Framework

Optimal estimation of sparse topic models

Title Optimal estimation of sparse topic models
Authors Xin Bing, Florentina Bunea, Marten Wegkamp
Abstract Topic models have become popular tools for dimension reduction and exploratory analysis of text data consisting of observed frequencies of a vocabulary of $p$ words in $n$ documents, stored in a $p\times n$ matrix. The main premise is that the mean of this data matrix can be factorized into a product of two non-negative matrices: a $p\times K$ word-topic matrix $A$ and a $K\times n$ topic-document matrix $W$. This paper studies the estimation of $A$ when $A$ is possibly element-wise sparse and the number of topics $K$ is unknown. In this under-explored context, we derive a new minimax lower bound for the estimation of such $A$ and propose a new computationally efficient algorithm for its recovery. We derive a finite sample upper bound for our estimator, and show that it matches the minimax lower bound in many scenarios. Our estimate adapts to the unknown sparsity of $A$ and our analysis is valid for any finite $n$, $p$, $K$ and document lengths. Empirical results on both synthetic data and semi-synthetic data show that our proposed estimator is a strong competitor of the existing state-of-the-art algorithms for both non-sparse $A$ and sparse $A$, and has superior performance in many scenarios of interest.
Tasks Dimensionality Reduction, Topic Models
Published 2020-01-22
URL https://arxiv.org/abs/2001.07861v1
PDF https://arxiv.org/pdf/2001.07861v1.pdf
PWC https://paperswithcode.com/paper/optimal-estimation-of-sparse-topic-models
Repo
Framework
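
The factorization premise in the abstract above can be illustrated with a small simulation. This is a sketch of the data model only, not of the estimator proposed in the paper; the sizes ($p = 4$, $K = 2$, $n = 3$), the matrix entries, the document length of 200 words, and the helper names are all assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical p x K word-topic matrix A: each column is a distribution
# over the p words for one topic; the zeros make A element-wise sparse.
A = [[0.5, 0.0],
     [0.5, 0.0],
     [0.0, 0.7],
     [0.0, 0.3]]

# Hypothetical K x n topic-document matrix W: each column is a
# distribution over the K topics for one document.
W = [[1.0, 0.5, 0.2],
     [0.0, 0.5, 0.8]]

# p x n matrix of expected word frequencies (the mean of the data matrix).
Pi = matmul(A, W)


def sample_frequencies(pi_column, doc_length, rng):
    """Draw doc_length words from one column of Pi; return observed frequencies."""
    words = rng.choices(range(len(pi_column)), weights=pi_column, k=doc_length)
    return [words.count(j) / doc_length for j in range(len(pi_column))]


rng = random.Random(0)
columns = [[Pi[j][i] for j in range(len(Pi))] for i in range(len(Pi[0]))]
# One list of observed word frequencies per document (the data matrix, by column).
X = [sample_frequencies(col, 200, rng) for col in columns]
```

Each column of `Pi` sums to one because the columns of `A` and `W` do, and the observed frequencies in `X` fluctuate around `Pi` with noise that shrinks as the document length grows.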

VSEC-LDA: Boosting Topic Modeling with Embedded Vocabulary Selection

Title VSEC-LDA: Boosting Topic Modeling with Embedded Vocabulary Selection
Authors Yuzhen Ding, Baoxin Li
Abstract Topic modeling has found wide application in many problems where latent structures of the data are crucial for typical inference tasks. When applying a topic model, a relatively standard pre-processing step is to first build a vocabulary of frequent words. Such a general pre-processing step is often independent of the topic modeling stage, and thus there is no guarantee that the pre-generated vocabulary can support the inference of some optimal (or even meaningful) topic models appropriate for a given task, especially for computer vision applications involving "visual words". In this paper, we propose a new approach to topic modeling, termed Vocabulary-Selection-Embedded Correspondence-LDA (VSEC-LDA), which learns the latent model while simultaneously selecting the most relevant words. The selection of words is driven by an entropy-based metric that measures the relative contribution of the words to the underlying model, and is done dynamically while the model is learned. We present three variants of VSEC-LDA and evaluate the proposed approach with experiments on both synthetic and real databases from different applications. The results demonstrate the effectiveness of built-in vocabulary selection and its importance in improving the performance of topic modeling.
Tasks Topic Models
Published 2020-01-15
URL https://arxiv.org/abs/2001.05578v1
PDF https://arxiv.org/pdf/2001.05578v1.pdf
PWC https://paperswithcode.com/paper/vsec-lda-boosting-topic-modeling-with
Repo
Framework

Compositional properties of emergent languages in deep learning

Title Compositional properties of emergent languages in deep learning
Authors Bence Keresztury, Elia Bruni
Abstract Recent findings in multi-agent deep learning systems point towards the emergence of compositional languages. These claims are often made without exact analysis or testing of the language. In this work, we analyze the emergent language resulting from two different cooperative multi-agent games with more exact measures for compositionality. Our findings suggest that solutions found by deep learning models often lack the ability to reason on an abstract level and therefore fail to generalize the learned knowledge to examples outside the training distribution. Strategies for testing compositional capacities and emergence of human-level concepts are discussed.
Tasks
Published 2020-01-23
URL https://arxiv.org/abs/2001.08618v1
PDF https://arxiv.org/pdf/2001.08618v1.pdf
PWC https://paperswithcode.com/paper/compositional-properties-of-emergent
Repo
Framework

Pedestrian orientation dynamics from high-fidelity measurements

Title Pedestrian orientation dynamics from high-fidelity measurements
Authors Joris Willems, Alessandro Corbetta, Vlado Menkovski, Federico Toschi
Abstract We investigate, in real-life conditions and with very high accuracy, the dynamics of body rotation, or yawing, of walking pedestrians, a highly complex task due to the wide variety in shapes, postures and walking gestures. We propose a novel measurement method based on a deep neural architecture that we train on the basis of generic physical properties of the motion of pedestrians. Specifically, we leverage the strong statistical correlation between individual velocity and body orientation: the velocity direction is typically orthogonal with respect to the shoulder line. We make the reasonable assumption that this approximation, although instantaneously slightly imperfect, is correct on average. This enables us to use velocity data as training labels for a highly accurate point estimator of individual orientation, which we can train with no dedicated annotation labor. We discuss the measurement accuracy and show the error scaling, both on synthetic and real-life data: we show that our method is capable of estimating orientation with an error as low as 7.5 degrees. This tool opens up new possibilities in the studies of human crowd dynamics where orientation is key. By analyzing the dynamics of body rotation in real-life conditions, we show that the instantaneous velocity direction can be described by the combination of orientation and a random delay, where randomness is provided by an Ornstein-Uhlenbeck process centered on an average delay of 100 ms. Quantifying these dynamics was only possible thanks to a tool as precise as the one proposed.
Tasks
Published 2020-01-14
URL https://arxiv.org/abs/2001.04646v1
PDF https://arxiv.org/pdf/2001.04646v1.pdf
PWC https://paperswithcode.com/paper/pedestrian-orientation-dynamics-from-high
Repo
Framework
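
As a rough illustration of the random delay mentioned in the abstract above, the following sketch simulates an Ornstein-Uhlenbeck process centered on 0.1 s with an Euler-Maruyama scheme. Only the 100 ms mean comes from the abstract; the reversion rate `theta`, the noise scale `sigma`, the time step, and the function name `simulate_ou` are assumptions chosen for illustration and are not taken from the paper.

```python
import math
import random


def simulate_ou(mean, theta, sigma, dt, steps, seed=0):
    """Euler-Maruyama simulation of dX = theta * (mean - X) dt + sigma dW."""
    rng = random.Random(seed)
    x = mean  # start at the long-run mean
    path = [x]
    for _ in range(steps):
        drift = theta * (mean - x) * dt              # pull back towards the mean
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += drift + noise
        path.append(x)
    return path


# Delay in seconds, centered on 0.1 s (100 ms); theta and sigma are hypothetical.
delays = simulate_ou(mean=0.1, theta=5.0, sigma=0.05, dt=0.01, steps=20000)
average_delay = sum(delays) / len(delays)
print(f"average simulated delay: {average_delay:.4f} s")
```

The simulated delay wanders around 100 ms and is pulled back at a rate set by `theta`, which is the mean-reverting behaviour that distinguishes an Ornstein-Uhlenbeck process from a plain random walk.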