April 1, 2020

3366 words 16 mins read

Paper Group ANR 454

Bi-Decoder Augmented Network for Neural Machine Translation. A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation. Latent Patient Network Learning for Automatic Diagnosis. Regret Bounds for Noise-Free Bayesian Optimization. Deep Unsupervised Common Representation Learning for LiDAR and Camera Data using Double Siamese Networks. D …

Bi-Decoder Augmented Network for Neural Machine Translation

Title Bi-Decoder Augmented Network for Neural Machine Translation
Authors Boyuan Pan, Yazheng Yang, Zhou Zhao, Yueting Zhuang, Deng Cai
Abstract Neural Machine Translation (NMT) has become a popular technology in recent years, and the encoder-decoder framework is the mainstream among existing methods. The quality of the semantic representations produced by the encoder is crucial and can significantly affect the performance of the model. However, existing unidirectional source-to-target architectures can hardly produce a language-independent representation of the text because they rely heavily on the specific relations of the given language pairs. To alleviate this problem, in this paper we propose a novel Bi-Decoder Augmented Network (BiDAN) for the neural machine translation task. Besides the original decoder, which generates the target language sequence, we add an auxiliary decoder to generate back the source language sequence at training time. Since each decoder transforms the representation of the input text into its corresponding language, jointly training with two target ends gives the shared encoder the potential to produce a language-independent semantic space. We conduct extensive experiments on several NMT benchmark datasets and the results demonstrate the effectiveness of our proposed approach.
Tasks Machine Translation
Published 2020-01-14
URL https://arxiv.org/abs/2001.04586v1
PDF https://arxiv.org/pdf/2001.04586v1.pdf
PWC https://paperswithcode.com/paper/bi-decoder-augmented-network-for-neural
Repo
Framework
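
The joint objective described in the abstract, a primary target-language loss plus an auxiliary back-decoding loss through a shared encoder, can be sketched in a few lines. The single linear maps, toy dimensions, mean-squared losses, and the weighting `lam` are illustrative assumptions, not BiDAN's actual recurrent sequence architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shared encoder and two decoders as single linear maps.
W_enc = rng.normal(size=(4, 3))      # source features -> shared semantic space
W_dec_tgt = rng.normal(size=(3, 5))  # shared space -> target language
W_dec_src = rng.normal(size=(3, 4))  # shared space -> back to source language

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def bidan_loss(x_src, y_tgt, lam=0.5):
    """Joint objective: primary translation loss plus an auxiliary loss
    for decoding the source back from the shared representation."""
    h = x_src @ W_enc
    return mse(h @ W_dec_tgt, y_tgt) + lam * mse(h @ W_dec_src, x_src)

x = rng.normal(size=(2, 4))
y = rng.normal(size=(2, 5))
loss = bidan_loss(x, y)
```

Setting `lam=0` recovers a plain source-to-target objective; the auxiliary term only ever adds a non-negative penalty.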

A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation

Title A Balanced and Uncertainty-aware Approach for Partial Domain Adaptation
Authors Jian Liang, Yunbo Wang, Dapeng Hu, Ran He, Jiashi Feng
Abstract This work addresses the unsupervised domain adaptation problem, especially the partial scenario where the class labels in the target domain are only a subset of those in the source domain. Such a partial transfer setting is realistic but challenging, and existing methods typically suffer from two key problems: negative transfer and uncertainty propagation. In this paper, we build on domain adversarial learning and propose a novel domain adaptation method, BA$^3$US, with two new techniques termed Balanced Adversarial Alignment (BAA) and Adaptive Uncertainty Suppression (AUS). On one hand, negative transfer causes target samples to be misclassified into classes only present in the source domain. To address this issue, BAA pursues a balance between the label distributions across domains in a simple manner. Specifically, it randomly leverages a few source samples to augment the smaller target domain during domain alignment so that the classes in the two domains are symmetric. On the other hand, a source sample is considered uncertain if an incorrect class has a relatively high prediction score. Such uncertainty is easily propagated to the unlabeled target data around it during alignment, which severely deteriorates the adaptation performance. Thus, AUS emphasizes uncertain samples and exploits an adaptive weighted complement entropy objective so that incorrect classes receive uniformly low prediction scores. Experimental results on multiple benchmarks demonstrate that BA$^3$US surpasses state-of-the-art methods for partial domain adaptation tasks.
Tasks Domain Adaptation, Partial Domain Adaptation, Unsupervised Domain Adaptation
Published 2020-03-05
URL https://arxiv.org/abs/2003.02541v1
PDF https://arxiv.org/pdf/2003.02541v1.pdf
PWC https://paperswithcode.com/paper/a-balanced-and-uncertainty-aware-approach-for
Repo
Framework
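
BAA's augmentation step, padding the smaller target batch with randomly drawn source samples before adversarial alignment, can be sketched directly. Batch sizes and the `n_augment` parameter here are illustrative assumptions; the paper schedules the augmentation within its adversarial training loop.

```python
import numpy as np

rng = np.random.default_rng(1)

def balanced_adversarial_batch(target_batch, source_batch, n_augment, rng):
    """Pad the smaller target batch with randomly drawn source samples
    before domain-adversarial alignment, so the discriminator sees a
    more symmetric class/label distribution (the BAA idea)."""
    idx = rng.choice(len(source_batch), size=n_augment, replace=False)
    return np.concatenate([target_batch, source_batch[idx]], axis=0)

source = rng.normal(size=(32, 8))   # larger source-domain batch
target = rng.normal(size=(12, 8))   # smaller (partial) target batch
mixed = balanced_adversarial_batch(target, source, n_augment=8, rng=rng)
print(mixed.shape)  # (20, 8)
```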

Latent Patient Network Learning for Automatic Diagnosis

Title Latent Patient Network Learning for Automatic Diagnosis
Authors Luca Cosmo, Anees Kazi, Seyed-Ahmad Ahmadi, Nassir Navab, Michael Bronstein
Abstract Recently, Graph Convolutional Networks (GCNs) have proven to be a powerful machine learning tool for Computer Aided Diagnosis (CADx) and disease prediction. A key component in these models is the construction of a population graph, where the graph adjacency matrix represents pair-wise patient similarities. Until now, the similarity metric has been defined manually, usually based on meta-features like demographics or clinical scores. The definition of the metric, however, needs careful tuning, as GCNs are very sensitive to the graph structure. In this paper, we demonstrate for the first time in the CADx domain that it is possible to learn a single, optimal graph for the GCN’s downstream task of disease classification. To this end, we propose a novel, end-to-end trainable graph learning architecture for dynamic and localized graph pruning. Unlike commonly employed spectral GCN approaches, our GCN is spatial and inductive, and can thus generalize to previously unseen patients. We demonstrate significant classification improvements with our learned graph on two CADx problems in medicine. We further explain and visualize this result using an artificial dataset, underlining the importance of graph learning for more accurate and robust inference with GCNs in medical applications.
Tasks Disease Prediction
Published 2020-03-27
URL https://arxiv.org/abs/2003.13620v1
PDF https://arxiv.org/pdf/2003.13620v1.pdf
PWC https://paperswithcode.com/paper/latent-patient-network-learning-for-automatic
Repo
Framework
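
The idea of learning the population graph, rather than hand-tuning a similarity metric, can be sketched with a differentiable soft-threshold adjacency. The weighted distance, threshold `theta`, and `steepness` are hypothetical stand-ins for the paper's trainable pruning architecture: in a real model they would be optimized jointly with the GCN.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def learned_adjacency(features, w, theta, steepness=5.0):
    """Edge weights from a learned, feature-weighted distance passed
    through a soft threshold, so graph pruning stays differentiable."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt(np.sum((w * diff) ** 2, axis=-1) + 1e-12)
    adj = sigmoid(steepness * (theta - dist))  # ~1 when dist < theta
    np.fill_diagonal(adj, 0.0)                 # no self-loops
    return adj

X = rng.normal(size=(5, 3))          # 5 patients, 3 meta-features each
A = learned_adjacency(X, w=np.ones(3), theta=1.5)
```

Because the adjacency is a smooth function of `w` and `theta`, gradients from the downstream classification loss can flow back into the graph itself.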

Regret Bounds for Noise-Free Bayesian Optimization

Title Regret Bounds for Noise-Free Bayesian Optimization
Authors Sattar Vakili, Victor Picheny, Nicolas Durrande
Abstract Bayesian optimisation is a powerful method for non-convex black-box optimisation in low-data regimes. However, the question of establishing tight upper bounds for common algorithms in the noiseless setting remains largely open. In this paper, we establish the tightest known bounds for two algorithms, namely GP-UCB and Thompson sampling, under the assumption that the objective function is smooth in the sense of having a bounded norm in a Matérn RKHS. Importantly, unlike several related works, we do not assume perfect knowledge of the kernel of the Gaussian process emulator used within the Bayesian optimisation loop. This allows us to provide results for practical algorithms that sequentially estimate the Gaussian process kernel parameters from the available data.
Tasks Bayesian Optimisation
Published 2020-02-12
URL https://arxiv.org/abs/2002.05096v1
PDF https://arxiv.org/pdf/2002.05096v1.pdf
PWC https://paperswithcode.com/paper/regret-bounds-for-noise-free-bayesian
Repo
Framework
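
As a concrete illustration of one of the analysed algorithms, here is a minimal noise-free GP-UCB loop on a 1D toy function. The RBF kernel, fixed length-scale, grid search, and confidence weight of 2 are illustrative assumptions; the paper's analysis concerns the Matérn family and adaptively estimated kernel parameters.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

def gp_posterior(X, y, Xs, jitter=1e-8):
    """Noise-free GP posterior on a grid; jitter is only for numerics."""
    L = np.linalg.cholesky(rbf(X, X) + jitter * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    v = np.linalg.solve(L, Ks)
    mu = Ks.T @ alpha
    var = np.clip(1.0 - np.sum(v**2, axis=0), 0.0, None)  # k(x, x) = 1
    return mu, np.sqrt(var)

def f(x):                      # toy smooth objective to maximise
    return -(x - 0.6) ** 2

grid = np.linspace(0.0, 1.0, 201)
X = np.array([0.1, 0.9])
y = f(X)
for _ in range(8):             # GP-UCB: evaluate the argmax of mu + beta*sigma
    mu, sd = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(mu + 2.0 * sd)]
    X, y = np.append(X, x_next), np.append(y, f(x_next))
best = X[np.argmax(y)]
```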

Deep Unsupervised Common Representation Learning for LiDAR and Camera Data using Double Siamese Networks

Title Deep Unsupervised Common Representation Learning for LiDAR and Camera Data using Double Siamese Networks
Authors Andreas Bühler, Niclas Vödisch, Mathias Bürki, Lukas Schaupp
Abstract Domain gaps between sensor modalities pose a challenge for the design of autonomous robots. Taking a step towards closing this gap, we propose two unsupervised training frameworks for finding a common representation of LiDAR and camera data. The first method utilizes a double Siamese training structure to ensure consistency in the results. The second method uses a Canny edge image to guide the networks towards a desired representation. All networks are trained in an unsupervised manner, leaving room for scalability. The results are evaluated using common computer vision applications, and the limitations of the proposed approaches are outlined.
Tasks Representation Learning
Published 2020-01-03
URL https://arxiv.org/abs/2001.00762v1
PDF https://arxiv.org/pdf/2001.00762v1.pdf
PWC https://paperswithcode.com/paper/deep-unsupervised-common-representation
Repo
Framework
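
A minimal sketch of the cross-modal idea: embeddings of the same scene from the camera branch and the LiDAR branch should coincide in a common representation space. The single linear layers below are assumptions; the paper's double Siamese wiring and Canny-edge guidance are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy per-modality embedding nets: one linear layer each.
W_cam = rng.normal(size=(6, 3))     # camera features -> shared space
W_lidar = rng.normal(size=(6, 3))   # LiDAR features  -> shared space

def consistency_loss(cam_feat, lidar_feat):
    """Cross-modal consistency: penalize the distance between the two
    branches' embeddings of the same scene in the shared space."""
    z_cam = cam_feat @ W_cam
    z_lidar = lidar_feat @ W_lidar
    return float(np.mean((z_cam - z_lidar) ** 2))

cam = rng.normal(size=(8, 6))       # 8 scenes, camera features
lidar = rng.normal(size=(8, 6))     # same 8 scenes, LiDAR features
loss = consistency_loss(cam, lidar)
```

Minimizing this loss over both branches (with safeguards against the trivial all-zero solution, which the Siamese structure is meant to provide) pushes the two modalities toward a common representation.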

Design optimisation of a multi-mode wave energy converter

Title Design optimisation of a multi-mode wave energy converter
Authors Nataliia Y. Sergiienko, Mehdi Neshat, Leandro S. P. da Silva, Bradley Alexander, Markus Wagner
Abstract A wave energy converter (WEC) similar to the CETO system developed by Carnegie Clean Energy is considered for design optimisation. This WEC is able to absorb power from the heave, surge and pitch motion modes, making the optimisation problem nontrivial. The WEC dynamics are simulated using a spectral-domain model that takes into account hydrodynamic forces, viscous drag, and power take-off forces. The design parameters for optimisation include the buoy radius, buoy height, tether inclination angles, and control variables (damping and stiffness). The WEC design is optimised for the wave climate at the Albany test site in Western Australia, considering unidirectional irregular waves. Two objective functions are considered: (i) maximisation of the annual average power output, and (ii) minimisation of the levelised cost of energy (LCoE) for a given sea site. The LCoE calculation is approximated as a ratio of the produced energy to the significant mass of the system, which includes the mass of the buoy and anchor system. Six heuristic optimisation methods are applied in order to evaluate and compare the performance of the best-known evolutionary algorithms, a swarm intelligence technique and a numerical optimisation approach. The results demonstrate that if we are interested in maximising energy production without taking into account the cost of manufacturing such a system, the buoy should be built as large as possible (20 m radius and 30 m height). However, if we want a system that produces cheap energy, the radius of the buoy should be approximately 11-14 m while the height should be as low as possible. These results coincide with the overall design that Carnegie Clean Energy has selected for its CETO 6 multi-moored unit. It should be noted, however, that this study was not informed by Carnegie, so it can be seen as an independent validation of their design choices.
Tasks
Published 2020-01-24
URL https://arxiv.org/abs/2001.08966v1
PDF https://arxiv.org/pdf/2001.08966v1.pdf
PWC https://paperswithcode.com/paper/design-optimisation-of-a-multi-mode-wave
Repo
Framework
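
The two objective functions can be written down directly from the abstract. The units and numbers below are illustrative assumptions; the abstract's LCoE proxy is the ratio of produced energy to the "significant mass" (buoy plus anchor), so maximising that ratio stands in for minimising LCoE.

```python
def annual_average_power(power_samples_kw):
    """Objective (i): mean absorbed power over the simulated sea states."""
    return sum(power_samples_kw) / len(power_samples_kw)

def energy_per_mass(annual_energy_mwh, buoy_mass_t, anchor_mass_t):
    """Objective (ii) proxy: produced energy per unit of significant
    mass (buoy + anchor), as described in the abstract."""
    return annual_energy_mwh / (buoy_mass_t + anchor_mass_t)

print(energy_per_mass(500.0, 200.0, 50.0))  # 2.0
```

A design optimiser would evaluate these objectives on the spectral-domain simulation output for each candidate (radius, height, tether angles, control) vector.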

FAE: A Fairness-Aware Ensemble Framework

Title FAE: A Fairness-Aware Ensemble Framework
Authors Vasileios Iosifidis, Besnik Fetahu, Eirini Ntoutsi
Abstract Automated decision making based on big data and machine learning (ML) algorithms can result in discriminatory decisions against certain protected groups defined by personal attributes such as gender, race or sexual orientation. Algorithms designed to discover patterns in big data may not only pick up societal biases encoded in the training data but, even worse, reinforce those biases, resulting in more severe discrimination. Most fairness-aware machine learning approaches proposed thus far focus solely on the pre-, in- or post-processing steps of the machine learning process, that is, on the input data, the learning algorithm or the derived model, respectively. However, the fairness problem cannot be isolated to a single step of the ML process; rather, discrimination is often a result of complex interactions between big data and algorithms, and therefore a more holistic approach is required. The proposed FAE (Fairness-Aware Ensemble) framework combines fairness-related interventions at both the pre- and post-processing steps of the data analysis process. In the pre-processing step, we tackle the problems of under-representation of the protected group (group imbalance) and of class imbalance by generating balanced training samples. In the post-processing step, we tackle the problem of class overlap by shifting the decision boundary in the direction of fairness.
Tasks Decision Making
Published 2020-02-03
URL https://arxiv.org/abs/2002.00695v1
PDF https://arxiv.org/pdf/2002.00695v1.pdf
PWC https://paperswithcode.com/paper/fae-a-fairness-aware-ensemble-framework
Repo
Framework
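
FAE's two interventions, balanced sampling in pre-processing and a shifted decision boundary in post-processing, can be sketched as below. The per-cell balancing scheme (equal draws from each class-by-group cell) and the threshold interface are assumptions about the implementation, not FAE's exact sampling procedure.

```python
import numpy as np

rng = np.random.default_rng(5)

def balanced_sample(X, y, group, rng):
    """Draw equal-size samples from each (class, protected-group) cell
    so the learner sees a training set without group or class imbalance."""
    cells = {}
    for i, key in enumerate(zip(y, group)):
        cells.setdefault(key, []).append(i)
    n = min(len(v) for v in cells.values())
    idx = np.concatenate([rng.choice(v, size=n, replace=False)
                          for v in cells.values()])
    return X[idx], y[idx], group[idx]

def shifted_decision(scores, threshold):
    """Post-processing: move the decision boundary away from the default
    0.5 in the direction that improves the chosen fairness measure."""
    return (scores >= threshold).astype(int)

X = rng.normal(size=(20, 2))
y = np.array([0] * 12 + [1] * 8)    # class-imbalanced labels
g = np.array([0, 1] * 10)           # protected-group membership
Xb, yb, gb = balanced_sample(X, y, g, rng)
```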

Hierarchical Gaussian Process Priors for Bayesian Neural Network Weights

Title Hierarchical Gaussian Process Priors for Bayesian Neural Network Weights
Authors Theofanis Karaletsos, Thang D. Bui
Abstract Probabilistic neural networks are typically modeled with independent weight priors, which do not capture weight correlations in the prior and do not provide a parsimonious interface to express properties in function space. A desirable class of priors would represent weights compactly, capture correlations between weights, facilitate calibrated reasoning about uncertainty, and allow inclusion of prior knowledge about the function space such as periodicity or dependence on contexts such as inputs. To this end, this paper introduces two innovations: (i) a Gaussian process-based hierarchical model for network weights based on unit embeddings that can flexibly encode correlated weight structures, and (ii) input-dependent versions of these weight priors that can provide convenient ways to regularize the function space through the use of kernels defined on contextual inputs. We show these models provide desirable test-time uncertainty estimates on out-of-distribution data, demonstrate cases of modeling inductive biases for neural networks with kernels which help both interpolation and extrapolation from training data, and demonstrate competitive predictive performance on an active learning benchmark.
Tasks Active Learning
Published 2020-02-10
URL https://arxiv.org/abs/2002.04033v1
PDF https://arxiv.org/pdf/2002.04033v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-gaussian-process-priors-for
Repo
Framework
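
The unit-embedding construction in innovation (i) can be sketched as follows: each network unit gets an embedding, each weight is indexed by its pair of embeddings, and a GP over that pair space induces correlated weight priors. Concatenating the two embeddings and using an RBF kernel are simplifying assumptions standing in for the paper's exact hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(6)

def rbf(Z, ls=1.0):
    d2 = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / ls**2)

n_in, n_out, emb_dim = 3, 2, 2
z_in = rng.normal(size=(n_in, emb_dim))    # embeddings of input units
z_out = rng.normal(size=(n_out, emb_dim))  # embeddings of output units

# One GP "input" per weight: the concatenated embedding pair.
pairs = np.array([np.concatenate([z_in[i], z_out[j]])
                  for i in range(n_in) for j in range(n_out)])
K = rbf(pairs) + 1e-8 * np.eye(len(pairs))

# A draw from the GP prior gives a full, *correlated* weight matrix:
# weights whose unit embeddings are close come out similar.
w = rng.multivariate_normal(np.zeros(len(pairs)), K).reshape(n_in, n_out)
```

The compactness claim is visible here: the prior over all `n_in * n_out` weights is parameterized by only `(n_in + n_out) * emb_dim` embedding values plus the kernel hyperparameters.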

From abstract items to latent spaces to observed data and back: Compositional Variational Auto-Encoder

Title From abstract items to latent spaces to observed data and back: Compositional Variational Auto-Encoder
Authors Victor Berger, Michèle Sebag
Abstract Conditional generative models are now acknowledged as an essential tool in machine learning. This paper focuses on their control. While many approaches aim to disentangle the data through coordinate-wise control of their latent representations, this paper explores another direction. The proposed CompVAE handles data with a natural multi-ensemblist structure (i.e., data that can naturally be decomposed into elements). Derived from Bayesian variational principles, CompVAE learns a latent representation leveraging both observational and symbolic information. A first contribution of the approach is that this latent representation supports a compositional generative model, amenable to multi-ensemblist operations (addition or subtraction of elements in the composition). This compositional ability is enabled by the invariance and generality of the whole framework with respect to, respectively, the order and number of the elements. The second contribution of the paper is a proof of concept on synthetic 1D and 2D problems, demonstrating the efficiency of the proposed approach.
Tasks
Published 2020-01-22
URL https://arxiv.org/abs/2001.07910v1
PDF https://arxiv.org/pdf/2001.07910v1.pdf
PWC https://paperswithcode.com/paper/from-abstract-items-to-latent-spaces-to
Repo
Framework
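
The order- and cardinality-invariance the abstract mentions can be illustrated with a permutation-invariant aggregation of element latents, so that adding or removing an element is a simple latent-space operation. The mean aggregation is an assumption for illustration; CompVAE's actual construction is a learned variational model.

```python
import numpy as np

def compose_latent(element_latents):
    """Set-level latent built by a permutation-invariant aggregation
    (here, a mean) of the element latents: the result is unchanged by
    reordering and well-defined for any number of elements."""
    return np.mean(element_latents, axis=0)

a = np.array([1.0, 0.0])     # latent of element A
b = np.array([0.0, 1.0])     # latent of element B
z_ab = compose_latent(np.stack([a, b]))
z_ba = compose_latent(np.stack([b, a]))
print(np.allclose(z_ab, z_ba))  # order invariance
```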

Algorithms in Multi-Agent Systems: A Holistic Perspective from Reinforcement Learning and Game Theory

Title Algorithms in Multi-Agent Systems: A Holistic Perspective from Reinforcement Learning and Game Theory
Authors Yunlong Lu, Kai Yan
Abstract Deep reinforcement learning (RL) has achieved outstanding results in recent years, which has led to a dramatic increase in the number of methods and applications. Recent works explore learning beyond single-agent scenarios and consider multi-agent settings. However, they face many challenges and often seek help from traditional game-theoretic algorithms, which, in turn, show great application promise when combined with modern algorithms and growing computing power. In this survey, we first introduce basic concepts and algorithms in single-agent RL and multi-agent systems; then, we summarize the related algorithms from three aspects. Solution concepts from game theory inspire algorithms that try to evaluate the agents or find better solutions in multi-agent systems. Fictitious self-play has become popular and has had a great impact on multi-agent reinforcement learning algorithms. Counterfactual regret minimization is an important tool for solving games with incomplete information, and has shown great strength when combined with deep learning.
Tasks Multi-agent Reinforcement Learning
Published 2020-01-17
URL https://arxiv.org/abs/2001.06487v3
PDF https://arxiv.org/pdf/2001.06487v3.pdf
PWC https://paperswithcode.com/paper/algorithms-in-multi-agent-systems-a-holistic
Repo
Framework
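
Of the tools the survey names, counterfactual regret minimization is the easiest to illustrate in a few lines: its core update is regret matching, shown here on rock-paper-scissors against a fixed (assumed) opponent strategy. Full CFR applies this update at every information set of an extensive-form game.

```python
import numpy as np

# Rock-paper-scissors payoff matrix for the row player.
PAYOFF = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])

def strategy_from_regrets(regrets):
    """Regret matching: play actions in proportion to positive regret."""
    pos = np.maximum(regrets, 0.0)
    return pos / pos.sum() if pos.sum() > 0 else np.full(3, 1 / 3)

regrets = np.zeros(3)
avg = np.zeros(3)
opponent = np.array([0.4, 0.4, 0.2])   # fixed opponent (an assumption)
for _ in range(1000):
    strat = strategy_from_regrets(regrets)
    avg += strat
    util = PAYOFF @ opponent           # expected utility of each action
    regrets += util - strat @ util     # accumulate per-action regret
avg /= 1000
# The average strategy converges to the best response: "paper" (index 1).
```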

Volumetric landmark detection with a multi-scale shift equivariant neural network

Title Volumetric landmark detection with a multi-scale shift equivariant neural network
Authors Tianyu Ma, Ajay Gupta, Mert R. Sabuncu
Abstract Deep neural networks yield promising results in a wide range of computer vision applications, including landmark detection. A major challenge for accurate anatomical landmark detection in volumetric images such as clinical CT scans is that large-scale data often constrain the capacity of the employed neural network architecture due to GPU memory limitations, which in turn can limit the precision of the output. We propose a multi-scale, end-to-end deep learning method that achieves fast and memory-efficient landmark detection in 3D images. Our architecture consists of blocks of shift-equivariant networks, each of which performs landmark detection at a different spatial scale. These blocks are connected from coarse to fine scale, with differentiable resampling layers, so that all levels can be trained together. We also present a noise injection strategy that increases the robustness of the model and allows us to quantify uncertainty at test time. We evaluate our method for carotid artery bifurcation detection on 263 CT volumes and achieve better-than-state-of-the-art accuracy, with a mean Euclidean distance error of 2.81 mm.
Tasks
Published 2020-03-03
URL https://arxiv.org/abs/2003.01639v1
PDF https://arxiv.org/pdf/2003.01639v1.pdf
PWC https://paperswithcode.com/paper/volumetric-landmark-detection-with-a-multi
Repo
Framework
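
The memory argument behind the coarse-to-fine design can be sketched as follows: locate the landmark on a downsampled volume first, then refine inside a small full-resolution crop, so the fine scale never processes the whole volume. Here a simple intensity argmax stands in for the per-scale networks (an assumption), and block max-pooling stands in for the coarse branch.

```python
import numpy as np

def detect_coarse_to_fine(volume, factor=4, window=8):
    """Two-scale landmark detection sketch: coarse localisation on a
    block-max downsampled volume, then refinement in a local crop."""
    s = volume.shape[0] // factor  # assumes a cubic volume
    coarse = volume.reshape(s, factor, s, factor, s, factor).max(axis=(1, 3, 5))
    cz, cy, cx = np.unravel_index(np.argmax(coarse), coarse.shape)
    # Map the coarse peak back to full resolution and crop a window.
    z0, y0, x0 = (max(0, c * factor - window // 2) for c in (cz, cy, cx))
    crop = volume[z0:z0 + window, y0:y0 + window, x0:x0 + window]
    dz, dy, dx = np.unravel_index(np.argmax(crop), crop.shape)
    return tuple(int(v) for v in (z0 + dz, y0 + dy, x0 + dx))

vol = np.zeros((32, 32, 32))
vol[10, 20, 5] = 1.0  # synthetic landmark
print(detect_coarse_to_fine(vol))  # (10, 20, 5)
```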

Causal query in observational data with hidden variables

Title Causal query in observational data with hidden variables
Authors Debo Cheng, Jiuyong Li, Lin Liu, Jixue Liu, Kui Yu, Thuc Duy Le
Abstract This paper discusses the problem of causal query in observational data with hidden variables, with the aim of estimating the change in an outcome when “manipulating” a variable, given a set of plausible confounding variables that affect both the manipulated variable and the outcome. Such an “experiment on data” to estimate the causal effect of the manipulated variable is useful for validating an experiment design using historical data or for exploring confounders when studying a new relationship. However, existing data-driven methods for causal effect estimation face some major challenges, including poor scalability with high-dimensional data, low estimation accuracy due to the heuristics used by global causal structure learning algorithms, and the assumption of causal sufficiency, even though hidden variables are often inevitable in data. In this paper, we develop theorems for using local search to find a superset of the adjustment (or confounding) variables for causal effect estimation from observational data under a realistic pretreatment assumption. The theorems ensure that an unbiased estimate of the causal effect is contained in the set of causal effects computed from the superset of adjustment variables. Based on these theorems, we propose a data-driven algorithm for causal query. Experiments show that the proposed algorithm is faster and produces better causal effect estimates than an existing data-driven causal effect estimation method that allows hidden variables. The causal effects estimated by the algorithm are as good as those obtained by state-of-the-art methods using domain knowledge.
Tasks
Published 2020-01-28
URL https://arxiv.org/abs/2001.10269v3
PDF https://arxiv.org/pdf/2001.10269v3.pdf
PWC https://paperswithcode.com/paper/causal-query-in-observational-data-with
Repo
Framework
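
The role of an adjustment set can be illustrated on synthetic data: regressing the outcome on the manipulated variable alone gives a confounded estimate, while adjusting for the confounder recovers the true effect. The linear model and the synthetic coefficients are illustrative assumptions; the paper's contribution is finding such adjustment sets by local search, not this estimator.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data: confounder Z affects both treatment W and outcome Y.
n = 5000
Z = rng.normal(size=n)
W = Z + rng.normal(size=n)                   # the "manipulated" variable
Y = 2.0 * W + 3.0 * Z + rng.normal(size=n)   # true causal effect of W is 2

def adjusted_effect(W, Y, adjustment):
    """Estimate the causal effect of W on Y by linear regression over a
    candidate adjustment set."""
    X = np.column_stack([W] + list(adjustment) + [np.ones(len(W))])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef[0]

naive = adjusted_effect(W, Y, [])       # confounded: biased toward 3.5
adjusted = adjusted_effect(W, Y, [Z])   # adjusting for Z recovers ~2.0
```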

SGP-DT: Semantic Genetic Programming Based on Dynamic Targets

Title SGP-DT: Semantic Genetic Programming Based on Dynamic Targets
Authors Stefano Ruberto, Valerio Terragni, Jason H. Moore
Abstract Semantic GP is a promising approach that introduces semantic awareness during genetic evolution. This paper presents a new semantic GP approach based on dynamic targets (SGP-DT) that divides the search problem into multiple GP runs. The evolution in each run is guided by a new (dynamic) target based on the residual errors. To obtain the final solution, SGP-DT combines the solutions of each run using linear scaling. SGP-DT introduces a new methodology for producing offspring that does not rely on classic crossover. The synergy between this methodology and linear scaling yields final solutions with low approximation error and computational cost. We evaluate SGP-DT on eight well-known data sets and compare it with ε-lexicase, a state-of-the-art evolutionary technique. SGP-DT achieves small RMSE values, on average 23.19% smaller than those of ε-lexicase.
Tasks
Published 2020-01-30
URL https://arxiv.org/abs/2001.11535v1
PDF https://arxiv.org/pdf/2001.11535v1.pdf
PWC https://paperswithcode.com/paper/sgp-dt-semantic-genetic-programming-based-on
Repo
Framework
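
The outer loop of SGP-DT, each run fitting the residual of the combined model so far (the dynamic target) and contributing via linear scaling, resembles stagewise boosting and can be sketched compactly. Here a correlation-based pick from a fixed feature pool stands in for a full GP run; that substitution, the feature pool, and the number of runs are all assumptions.

```python
import numpy as np

def linear_scale(pred, target):
    """Fit a, b minimising ||a * pred + b - target|| (linear scaling)."""
    A = np.column_stack([pred, np.ones(len(pred))])
    (a, b), *_ = np.linalg.lstsq(A, target, rcond=None)
    return a * pred + b

def sgp_dt_sketch(x, y, n_runs=2):
    """Each 'run' fits the current residual (the dynamic target); its
    output is added to the combined model after linear scaling."""
    features = [x, x**2, np.sin(x)]   # stand-in for evolved expressions
    combined = np.zeros_like(y)
    for _ in range(n_runs):
        residual = y - combined       # the dynamic target for this run
        best = max(features,
                   key=lambda f: abs(np.corrcoef(f, residual)[0, 1]))
        combined = combined + linear_scale(best, residual)
    return combined

x = np.linspace(-1, 1, 100)
y = 2 * x**2 + 0.5 * x
fit = sgp_dt_sketch(x, y)
rmse = float(np.sqrt(np.mean((fit - y) ** 2)))
```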

From Learning to Meta-Learning: Reduced Training Overhead and Complexity for Communication Systems

Title From Learning to Meta-Learning: Reduced Training Overhead and Complexity for Communication Systems
Authors Osvaldo Simeone, Sangwoo Park, Joonhyuk Kang
Abstract Machine learning methods adapt the parameters of a model, constrained to lie in a given model class, by using a fixed learning procedure based on data or active observations. Adaptation is done on a per-task basis, and retraining is needed when the system configuration changes. The resulting inefficiency in terms of data and training time requirements can be mitigated, if domain knowledge is available, by selecting a suitable model class and learning procedure, collectively known as inductive bias. However, it is generally difficult to encode prior knowledge into an inductive bias, particularly with black-box model classes such as neural networks. Meta-learning provides a way to automate the selection of an inductive bias. Meta-learning leverages data or active observations from tasks that are expected to be related to future, and a priori unknown, tasks of interest. With a meta-trained inductive bias, training of a machine learning model can potentially be carried out with reduced training data and/or time complexity. This paper provides a high-level introduction to meta-learning with applications to communication systems.
Tasks Meta-Learning
Published 2020-01-05
URL https://arxiv.org/abs/2001.01227v1
PDF https://arxiv.org/pdf/2001.01227v1.pdf
PWC https://paperswithcode.com/paper/from-learning-to-meta-learning-reduced
Repo
Framework

$ε$-shotgun: $ε$-greedy Batch Bayesian Optimisation

Title $ε$-shotgun: $ε$-greedy Batch Bayesian Optimisation
Authors George De Ath, Richard M. Everson, Jonathan E. Fieldsend, Alma A. M. Rahat
Abstract Bayesian optimisation is a popular, surrogate-model-based approach for optimising expensive black-box functions. Given a surrogate model, the next location to expensively evaluate is chosen via maximisation of a cheap-to-query acquisition function. We present an $\epsilon$-greedy procedure for Bayesian optimisation in batch settings in which the black-box function can be evaluated multiple times in parallel. Our $\epsilon$-shotgun algorithm leverages the model’s prediction, uncertainty, and the approximated rate of change of the landscape to determine the spread of batch solutions to be distributed around a putative location. The initial target location is selected either in an exploitative fashion on the mean prediction or, with probability $\epsilon$, from elsewhere in the design space. This results in locations that are more densely sampled in regions where the function is changing rapidly and in locations predicted to be good (i.e., close to predicted optima), with more scattered samples in regions where the function is flatter and/or of poorer quality. We empirically evaluate the $\epsilon$-shotgun methods on a range of synthetic functions and two real-world problems, finding that they perform at least as well as state-of-the-art batch methods and in many cases exceed their performance.
Tasks Bayesian Optimisation
Published 2020-02-05
URL https://arxiv.org/abs/2002.01873v2
PDF https://arxiv.org/pdf/2002.01873v2.pdf
PWC https://paperswithcode.com/paper/-shotgun-greedy-batch-bayesian-optimisation
Repo
Framework
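
The batch-construction rule described in the abstract can be sketched as follows: pick a target location exploitatively on the posterior mean (or randomly with probability epsilon), then scatter the rest of the batch around it with a spread that shrinks where the mean changes quickly. The exact spread formula below is a simplifying assumption, not the paper's rule.

```python
import numpy as np

rng = np.random.default_rng(9)

def eps_shotgun_batch(grid, mu, sd, batch_size=5, eps=0.1):
    """epsilon-shotgun sketch: choose a centre (exploit or explore),
    then distribute the remaining batch points around it."""
    if rng.random() < eps:
        centre_idx = int(rng.integers(len(grid)))    # explore
    else:
        centre_idx = int(np.argmin(mu))              # exploit (minimisation)
    grad = np.gradient(mu, grid)[centre_idx]         # local rate of change
    spread = sd[centre_idx] / (abs(grad) + 1.0)      # tighter where steep
    centre = grid[centre_idx]
    rest = centre + spread * rng.normal(size=batch_size - 1)
    return np.concatenate([[centre], np.clip(rest, grid[0], grid[-1])])

grid = np.linspace(0.0, 1.0, 101)
mu = (grid - 0.3) ** 2           # toy surrogate posterior mean
sd = np.full_like(grid, 0.2)     # toy posterior standard deviation
batch = eps_shotgun_batch(grid, mu, sd, eps=0.0)  # eps=0: deterministic centre
```

All `batch_size` points would then be evaluated in parallel before the surrogate is refit.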