January 25, 2020


Paper Group ANR 1692


Deep Metric Learning with Density Adaptivity


Title	Deep Metric Learning with Density Adaptivity
Authors	Yehao Li, Ting Yao, Yingwei Pan, Hongyang Chao, Tao Mei
Abstract	The problem of distance metric learning is mostly considered from the perspective of learning an embedding space, where the distances between pairs of examples are in correspondence with a similarity metric. With the rise and success of Convolutional Neural Networks (CNN), deep metric learning (DML) involves training a network to learn a nonlinear transformation to the embedding space. Existing DML approaches often express the supervision through maximizing inter-class distance and minimizing intra-class variation. However, the results can suffer from overfitting, especially when the training examples of each class are embedded together tightly and the density of each class is very high. In this paper, we integrate density, i.e., the measure of data concentration in the representation, into the optimization of DML frameworks to adaptively balance inter-class similarity and intra-class variation by training the architecture in an end-to-end manner. Technically, the knowledge of density is employed as a regularizer, which can be plugged into any DML architecture with different objective functions such as contrastive loss, N-pair loss and triplet loss. Extensive experiments on three public datasets consistently demonstrate clear improvements by amending three types of embedding with the density adaptivity. More remarkably, our proposal increases Recall@1 from 67.95% to 77.62%, from 52.01% to 55.64% and from 68.20% to 70.56% on the Cars196, CUB-200-2011 and Stanford Online Products datasets, respectively.
Tasks	Metric Learning
Published	2019-09-09
URL	https://arxiv.org/abs/1909.03909v1
PDF	https://arxiv.org/pdf/1909.03909v1.pdf
PWC	https://paperswithcode.com/paper/deep-metric-learning-with-density-adaptivity
Repo
Framework
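
The abstract names contrastive loss and triplet loss as objectives that the density regularizer can be attached to. As a rough illustration only, the sketch below shows one common textbook form of each loss in plain Python. The function names, the margins, and the use of unsquared Euclidean distance in the triplet loss are assumptions made here, and the paper's density regularizer itself is not reproduced because the abstract does not give its formula.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (one common form; the margin is an assumed value).

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def contrastive_loss(a, b, same_class, margin=1.0):
    """Contrastive loss (one common form; the margin is an assumed value).

    Same-class pairs are pulled together; different-class pairs are pushed
    apart until they are at least `margin` away from each other.
    """
    d = euclidean(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

Both losses depend only on pairwise distances, which suggests why a term based on how tightly a class is packed in the embedding space can be added to either one.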

Solving machine learning optimization problems using quantum computers


Title	Solving machine learning optimization problems using quantum computers
Authors	Venkat R. Dasari, Mee Seong Im, Lubjana Beshaj
Abstract	Classical optimization algorithms in machine learning often take a long time to compute when applied to a multi-dimensional problem and require a huge amount of CPU and GPU resources. Quantum parallelism has the potential to speed up machine learning algorithms. We describe a generic mathematical model to leverage quantum parallelism to speed up machine learning algorithms. We also apply quantum machine learning and quantum parallelism to a $3$-dimensional image that varies with time.
Tasks	Quantum Machine Learning
Published	2019-11-17
URL	https://arxiv.org/abs/1911.08587v1
PDF	https://arxiv.org/pdf/1911.08587v1.pdf
PWC	https://paperswithcode.com/paper/solving-machine-learning-optimization
Repo
Framework

A cooperative game for automated learning of elasto-plasticity knowledge graphs and models with AI-guided experimentation


Title	A cooperative game for automated learning of elasto-plasticity knowledge graphs and models with AI-guided experimentation
Authors	Kun Wang, WaiChing Sun, Qiang Du
Abstract	We introduce a multi-agent meta-modeling game to generate data, knowledge, and models that make predictions on constitutive responses of elasto-plastic materials. We introduce a new concept from graph theory where a modeler agent is tasked with evaluating all the modeling options recast as a directed multigraph and finding the optimal path that links the source of the directed graph (e.g. strain history) to the target (e.g. stress) measured by an objective function. Meanwhile, the data agent, which is tasked with generating data from real or virtual experiments (e.g. molecular dynamics, discrete element simulations), interacts with the modeling agent sequentially and uses reinforcement learning to design new experiments to optimize the prediction capacity. Consequently, this treatment enables us to emulate an idealized scientific collaboration as selections of the optimal choices in a decision tree search done automatically via deep reinforcement learning.
Tasks	Knowledge Graphs
Published	2019-03-08
URL	http://arxiv.org/abs/1903.04307v1
PDF	http://arxiv.org/pdf/1903.04307v1.pdf
PWC	https://paperswithcode.com/paper/a-cooperative-game-for-automated-learning-of
Repo
Framework

Genetic Programming and Gradient Descent: A Memetic Approach to Binary Image Classification


Title	Genetic Programming and Gradient Descent: A Memetic Approach to Binary Image Classification
Authors	Benjamin Patrick Evans, Harith Al-Sahaf, Bing Xue, Mengjie Zhang
Abstract	Image classification is an essential task in computer vision, which aims to categorise a set of images into different groups based on some visual criteria. Existing methods, such as convolutional neural networks, have been successfully utilised to perform image classification. However, such methods often require human intervention to design a model. Furthermore, such models are difficult to interpret and it is challenging to analyse the patterns of different classes. This paper presents a hybrid (memetic) approach combining genetic programming (GP) and gradient-based optimisation for image classification to overcome the limitations mentioned. The performance of the proposed method is compared to a baseline version (without local search) on four binary classification image datasets to provide an insight into the usefulness of local search mechanisms for enhancing the performance of GP.
Tasks	Image Classification
Published	2019-09-28
URL	https://arxiv.org/abs/1909.13030v1
PDF	https://arxiv.org/pdf/1909.13030v1.pdf
PWC	https://paperswithcode.com/paper/genetic-programming-and-gradient-descent-a
Repo
Framework

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models


Title	Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models
Authors	Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe
Abstract	This paper investigates the use of target-speaker automatic speech recognition (TS-ASR) for simultaneous speech recognition and speaker diarization of single-channel dialogue recordings. TS-ASR is a technique to automatically extract and recognize only the speech of a target speaker given a short sample utterance of that speaker. One obvious drawback of TS-ASR is that it cannot be used when the speakers in the recordings are unknown because it requires a sample of the target speakers in advance of decoding. To remove this limitation, we propose an iterative method, in which (i) the estimation of speaker embeddings and (ii) TS-ASR based on the estimated speaker embeddings are alternately executed. We evaluated the proposed method by using very challenging dialogue recordings in which the speaker overlap ratio was over 20%. We confirmed that the proposed method significantly reduced both the word error rate (WER) and diarization error rate (DER). Our proposed method combined with i-vector speaker embeddings ultimately achieved a WER that differed by only 2.1% from that of TS-ASR given oracle speaker embeddings. Furthermore, our method can solve speaker diarization simultaneously as a by-product and achieved better DER than that of the conventional clustering-based speaker diarization method based on i-vector.
Tasks	Speaker Diarization, Speech Recognition
Published	2019-09-17
URL	https://arxiv.org/abs/1909.08103v1
PDF	https://arxiv.org/pdf/1909.08103v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-speech-recognition-and-speaker
Repo
Framework
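
The abstract reports results in terms of word error rate (WER). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for this illustration, and it is not the scoring tool used in the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words.

    Tokenization by whitespace is a simplifying assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "the cat sit on mat" against the reference "the cat sat on the mat" counts one substitution and one deletion over six reference words.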

Generative Mask Pyramid Network for CT/CBCT Metal Artifact Reduction with Joint Projection-Sinogram Correction


Title	Generative Mask Pyramid Network for CT/CBCT Metal Artifact Reduction with Joint Projection-Sinogram Correction
Authors	Haofu Liao, Wei-An Lin, Zhimin Huo, Levon Vogelsang, William J. Sehnert, S. Kevin Zhou, Jiebo Luo
Abstract	A conventional approach to computed tomography (CT) or cone beam CT (CBCT) metal artifact reduction is to replace the X-ray projection data within the metal trace with synthesized data. However, existing projection or sinogram completion methods cannot always produce anatomically consistent information to fill the metal trace, and thus, when the metallic implant is large, significant secondary artifacts are often introduced. In this work, we propose to replace metal artifact affected regions with anatomically consistent content through joint projection-sinogram correction as well as adversarial learning. To handle the metallic implants of diverse shapes and large sizes, we also propose a novel mask pyramid network that enforces the mask information across the network’s encoding layers and a mask fusion loss that reduces early saturation of adversarial training. Our experimental results show that the proposed projection-sinogram correction designs are effective and our method recovers information from the metal traces better than the state-of-the-art methods.
Tasks	Computed Tomography (CT), Metal Artifact Reduction
Published	2019-06-29
URL	https://arxiv.org/abs/1907.00294v3
PDF	https://arxiv.org/pdf/1907.00294v3.pdf
PWC	https://paperswithcode.com/paper/generative-mask-pyramid-network-forctcbct
Repo
Framework

Noise-Assisted Variational Hybrid Quantum-Classical Optimization


Title	Noise-Assisted Variational Hybrid Quantum-Classical Optimization
Authors	Laura Gentini, Alessandro Cuccoli, Stefano Pirandola, Paola Verrucchi, Leonardo Banchi
Abstract	Variational hybrid quantum-classical optimization represents one of the most promising avenues to show the advantage of today's noisy intermediate-scale quantum computers in solving hard problems, such as finding the minimum-energy state of a Hamiltonian or solving some machine-learning tasks. In these devices noise is unavoidable and impossible to error-correct, yet its role in the optimization process is not well understood, especially from the theoretical viewpoint. Here we consider a minimization problem with respect to a variational state, iteratively obtained via a parametric quantum circuit, taking into account both the role of noise and the stochastic nature of quantum measurement outcomes. We show that the accuracy of the result obtained for a fixed number of iterations is bounded by a quantity related to the Quantum Fisher Information of the variational state. Using this bound, we find the unexpected result that, in some regimes, noise can be beneficial, allowing a faster solution to the optimization problem.
Tasks
Published	2019-12-13
URL	https://arxiv.org/abs/1912.06744v1
PDF	https://arxiv.org/pdf/1912.06744v1.pdf
PWC	https://paperswithcode.com/paper/noise-assisted-variational-hybrid-quantum
Repo
Framework

Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning


Title	Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning
Authors	Hanchen Xu, Xiao Li, Xiangyu Zhang, Junbo Zhang
Abstract	In this letter, we address the problem of controlling energy storage systems (ESSs) for arbitrage in real-time electricity markets under price uncertainty. We first formulate this problem as a Markov decision process, and then develop a deep reinforcement learning based algorithm to learn a stochastic control policy that maps a set of available information processed by a recurrent neural network to ESSs’ charging/discharging actions. Finally, we verify the effectiveness of our algorithm using real-time electricity prices from PJM.
Tasks
Published	2019-04-28
URL	https://arxiv.org/abs/1904.12232v2
PDF	https://arxiv.org/pdf/1904.12232v2.pdf
PWC	https://paperswithcode.com/paper/arbitrage-of-energy-storage-in-electricity
Repo
Framework
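
The abstract formulates storage arbitrage as a Markov decision process whose actions are charging and discharging decisions. The sketch below is one hypothetical way to write a single transition of such a process. The state (state of charge), the action scaling, the reward (cash flow at the current price), and every parameter value (`capacity`, `max_energy`, `efficiency`) are assumptions made for illustration, not the paper's model.

```python
def step(soc, action, price, capacity=1.0, max_energy=0.25, efficiency=0.9):
    """One transition of a toy storage-arbitrage MDP (all parameters assumed).

    soc:    energy currently stored, between 0 and `capacity`
    action: value in [-1, 1]; positive charges, negative discharges
    price:  electricity price for this interval
    Returns (new_soc, reward), where reward is the cash flow of the interval.
    """
    action = max(-1.0, min(1.0, action))
    energy = action * max_energy  # grid-side energy requested this interval

    if energy >= 0:
        # Charging: buy energy, store it with losses, stop at full capacity.
        bought = min(energy, (capacity - soc) / efficiency)
        return soc + efficiency * bought, -price * bought

    # Discharging: sell energy, lose some on the way out, stop when empty.
    sold = min(-energy, soc * efficiency)
    return soc - sold / efficiency, price * sold
```

A learned policy would choose `action` from the available information at each interval; here any sequence of actions can be fed to `step` to accumulate the total profit.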

Population-Guided Large Margin Classifier for High-Dimension Low-Sample-Size Problems


Title	Population-Guided Large Margin Classifier for High-Dimension Low-Sample-Size Problems
Authors	Qingbo Yin, Ehsan Adeli, Liran Shen, Dinggang Shen
Abstract	Various applications in different fields, such as gene expression analysis or computer vision, involve high-dimension low-sample-size (HDLSS) data sets, which pose significant challenges for standard statistical and modern machine learning methods. In this paper, we propose a novel linear binary classifier, denoted by population-guided large margin classifier (PGLMC), which is applicable to any sort of data, including HDLSS. PGLMC is conceived with a projecting direction w given by the comprehensive consideration of local structural information of the hyperplane and the statistics of the training samples. Our proposed model has several advantages compared to widely used approaches. First, it is not sensitive to the intercept term b. Second, it operates well with imbalanced data. Third, it is relatively simple to implement based on Quadratic Programming. Fourth, it is robust to the model specification for various real applications. The theoretical properties of PGLMC are proven. We conduct a series of evaluations on two simulated and six real-world benchmark data sets, including DNA classification, digit recognition, medical image analysis, and face recognition. PGLMC outperforms the state-of-the-art classification methods in most cases, or at least obtains comparable results.
Tasks	Face Recognition
Published	2019-01-05
URL	http://arxiv.org/abs/1901.01377v1
PDF	http://arxiv.org/pdf/1901.01377v1.pdf
PWC	https://paperswithcode.com/paper/population-guided-large-margin-classifier-for
Repo
Framework

Non-native Speaker Verification for Spoken Language Assessment


Title	Non-native Speaker Verification for Spoken Language Assessment
Authors	Linlin Wang, Yu Wang, Mark J. F. Gales
Abstract	Automatic spoken language assessment systems are becoming more popular in order to handle increasing interest in second language learning. One challenge for these systems is to detect malpractice. Malpractice can take a range of forms; this paper focuses on detecting when a candidate attempts to impersonate another in a speaking test. This form of malpractice is closely related to speaker verification, but applied in the specific domain of spoken language assessment. Advanced speaker verification systems, which leverage deep-learning approaches to extract speaker representations, have been successfully applied to a range of native speaker verification tasks. These systems are explored for non-native spoken English data in this paper. The data used for speaker enrolment and verification is mainly taken from the BULATS test, which assesses English language skills for business. Performance of systems trained on relatively limited amounts of BULATS data, and standard large speaker verification corpora, is compared. Experimental results on large-scale test sets with millions of trials show that the best performance is achieved by adapting the imported model to non-native data. Breakdown of impostor trials across different first languages (L1s) and grades is analysed, which shows that inter-L1 impostors are more challenging for speaker verification systems.
Tasks	Speaker Verification
Published	2019-09-30
URL	https://arxiv.org/abs/1909.13695v1
PDF	https://arxiv.org/pdf/1909.13695v1.pdf
PWC	https://paperswithcode.com/paper/non-native-speaker-verification-for-spoken
Repo
Framework

Fair Generative Modeling via Weak Supervision


Title	Fair Generative Modeling via Weak Supervision
Authors	Aditya Grover, Kristy Choi, Rui Shu, Stefano Ermon
Abstract	Real-world datasets are often biased with respect to key demographic factors such as race and gender. Due to the latent nature of the underlying factors, detecting and mitigating bias is especially challenging for unsupervised machine learning. We present a weakly supervised algorithm for overcoming dataset bias for deep generative models. Our approach requires access to an additional small, unlabeled but unbiased dataset as the supervision signal, thus sidestepping the need for explicit labels on the underlying bias factors. Using this supplementary dataset, we detect the bias in existing datasets via a density ratio technique and learn generative models which efficiently achieve the twin goals of: 1) data efficiency by using training examples from both biased and unbiased datasets for learning, 2) unbiased data generation at test time. Empirically, we demonstrate the efficacy of our approach which reduces bias w.r.t. latent factors by 57.1% on average over baselines for comparable image generation using generative adversarial networks.
Tasks	Image Generation
Published	2019-10-26
URL	https://arxiv.org/abs/1910.12008v1
PDF	https://arxiv.org/pdf/1910.12008v1.pdf
PWC	https://paperswithcode.com/paper/fair-generative-modeling-via-weak-supervision
Repo
Framework
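
The abstract says the bias is detected "via a density ratio technique" but does not spell the technique out. One standard way to estimate a density ratio, shown here purely as an assumed illustration, is to train a probabilistic binary classifier to tell reference samples from biased samples and convert its output with Bayes' rule. The function names and the self-normalized weighting step are hypothetical choices made for this sketch.

```python
def density_ratio(p_ref_given_x, n_ref, n_bias):
    """Estimate p_ref(x) / p_bias(x) from a classifier's output.

    p_ref_given_x: classifier probability that x came from the reference set
    n_ref, n_bias: sizes of the reference and biased training sets, used to
                   correct for class imbalance (Bayes' rule on the odds)
    """
    if not 0.0 < p_ref_given_x < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    odds = p_ref_given_x / (1.0 - p_ref_given_x)
    return odds * (n_bias / n_ref)


def importance_weights(probabilities, n_ref, n_bias):
    """Self-normalized importance weights for a batch of biased samples."""
    ratios = [density_ratio(p, n_ref, n_bias) for p in probabilities]
    total = sum(ratios)
    return [r / total for r in ratios]
```

Biased samples that look more like the reference data receive larger weights, so a model trained on the reweighted data is pulled toward the reference distribution.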

Improving RNN Transducer Modeling for End-to-End Speech Recognition


Title	Improving RNN Transducer Modeling for End-to-End Speech Recognition
Authors	Jinyu Li, Rui Zhao, Hu Hu, Yifan Gong
Abstract	In the last few years, an emerging trend in automatic speech recognition research is the study of end-to-end (E2E) systems. Connectionist Temporal Classification (CTC), Attention Encoder-Decoder (AED), and RNN Transducer (RNN-T) are the three most popular methods. Among these three methods, RNN-T has the advantage of supporting online streaming, which is challenging for AED, and it doesn’t have CTC’s frame-independence assumption. In this paper, we improve the RNN-T training in two aspects. First, we optimize the training algorithm of RNN-T to reduce the memory consumption so that we can have a larger training minibatch for faster training speed. Second, we propose better model structures so that we obtain RNN-T models with very good accuracy but a small footprint. Trained with 30 thousand hours of anonymized and transcribed Microsoft production data, the best RNN-T model with even smaller model size (216 Megabytes) achieves up to 11.8% relative word error rate (WER) reduction from the baseline RNN-T model. This best RNN-T model is significantly better than the device hybrid model with similar size by achieving up to 15.0% relative WER reduction, and obtains WERs similar to those of the server hybrid model of 5120 Megabytes in size.
Tasks	End-To-End Speech Recognition, Speech Recognition
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12415v1
PDF	https://arxiv.org/pdf/1909.12415v1.pdf
PWC	https://paperswithcode.com/paper/improving-rnn-transducer-modeling-for-end-to
Repo
Framework

Model-Augmented Estimation of Conditional Mutual Information for Feature Selection


Title	Model-Augmented Estimation of Conditional Mutual Information for Feature Selection
Authors	Alan Yang, AmirEmad Ghassami, Maxim Raginsky, Negar Kiyavash, Elyse Rosenbaum
Abstract	Markov blanket feature selection, while theoretically optimal, generally is challenging to implement. This is due to the shortcomings of existing approaches to conditional independence (CI) testing, which tend to struggle either with the curse of dimensionality or computational complexity. We propose a novel two-step approach which facilitates Markov blanket feature selection in high dimensions. First, neural networks are used to map features to low-dimensional representations. In the second step, CI testing is performed by applying the $k$-NN conditional mutual information estimator to the learned feature maps. The mappings are designed to ensure that mapped samples both preserve information and share similar information about the target variable if and only if they are close in Euclidean distance. We show that these properties boost the performance of the $k$-NN estimator in the second step. The performance of the proposed method is evaluated on both synthetic and real data.
Tasks	Feature Selection
Published	2019-11-12
URL	https://arxiv.org/abs/1911.04628v2
PDF	https://arxiv.org/pdf/1911.04628v2.pdf
PWC	https://paperswithcode.com/paper/model-augmented-nearest-neighbor-estimation
Repo
Framework

Distilling Black-Box Travel Mode Choice Model for Behavioral Interpretation


Title	Distilling Black-Box Travel Mode Choice Model for Behavioral Interpretation
Authors	Xilei Zhao, Zhengze Zhou, Xiang Yan, Pascal Van Hentenryck
Abstract	Machine learning has proved to be very successful for making predictions in travel behavior modeling. However, most machine-learning models have complex model structures and offer little or no explanation as to how they arrive at these predictions. Interpretations about travel behavior models are essential for decision makers to understand travelers’ preferences and plan policy interventions accordingly. Therefore, this paper proposes to apply and extend the model distillation approach, a model-agnostic machine-learning interpretation method, to explain how a black-box travel mode choice model makes predictions for the entire population and subpopulations of interest. Model distillation aims at compressing knowledge from a complex model (teacher) into an understandable and interpretable model (student). In particular, the paper integrates model distillation with market segmentation to generate more insights by accounting for heterogeneity. Furthermore, the paper provides a comprehensive comparison of student models with the benchmark model (decision tree) and the teacher model (gradient boosting trees) to quantify the fidelity and accuracy of the students’ interpretations.
Tasks
Published	2019-10-30
URL	https://arxiv.org/abs/1910.13930v1
PDF	https://arxiv.org/pdf/1910.13930v1.pdf
PWC	https://paperswithcode.com/paper/distilling-black-box-travel-mode-choice-model
Repo
Framework
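
The abstract distinguishes the fidelity of a student model from its accuracy. A common reading, assumed here rather than taken from the paper, is that fidelity measures how often the student agrees with the teacher, while accuracy measures how often the student agrees with the observed choices. The helper below and the travel-mode labels in the usage lines are hypothetical.

```python
def agreement(predictions, targets):
    """Fraction of positions where two equal-length label sequences match."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == t for p, t in zip(predictions, targets))
    return matches / len(predictions)


# Hypothetical mode choices for five travelers.
observed = ["car", "bus", "walk", "car", "bike"]
teacher = ["car", "bus", "car", "car", "bike"]
student = ["car", "bus", "car", "car", "walk"]

fidelity = agreement(student, teacher)   # student vs. teacher predictions
accuracy = agreement(student, observed)  # student vs. observed choices
```

Computing the same two numbers separately for each market segment would give a per-subpopulation view of how well the student explains the teacher.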

Bridging the Gap Between $f$-GANs and Wasserstein GANs


Title	Bridging the Gap Between $f$-GANs and Wasserstein GANs
Authors	Jiaming Song, Stefano Ermon
Abstract	Generative adversarial networks (GANs) have enjoyed much success in learning high-dimensional distributions. Learning objectives approximately minimize an $f$-divergence ($f$-GANs) or an integral probability metric (Wasserstein GANs) between the model and the data distribution using a discriminator. Wasserstein GANs enjoy superior empirical performance, but in $f$-GANs the discriminator can be interpreted as a density ratio estimator which is necessary in some GAN applications. In this paper, we bridge the gap between $f$-GANs and Wasserstein GANs (WGANs). First, we list two constraints over variational $f$-divergence estimation objectives that preserve the optimal solution. Next, we minimize over a Lagrangian relaxation of the constrained objective, and show that it generalizes critic objectives of both $f$-GAN and WGAN. Based on this generalization, we propose a novel practical objective, named KL-Wasserstein GAN (KL-WGAN). We demonstrate empirical success of KL-WGAN on synthetic datasets and real-world image generation benchmarks, and achieve state-of-the-art FID scores on CIFAR10 image generation.
Tasks	Image Generation
Published	2019-10-22
URL	https://arxiv.org/abs/1910.09779v1
PDF	https://arxiv.org/pdf/1910.09779v1.pdf
PWC	https://paperswithcode.com/paper/bridging-the-gap-between-f-gans-and
Repo
Framework
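
For orientation, the two families of objectives that the abstract contrasts can be written in their standard textbook forms, shown below. The notation is an assumption made here: $P$ is the data distribution, $Q$ the model distribution with densities $p$ and $q$, $T$ the discriminator or critic, and $f^*$ the convex conjugate of $f$. These are the well-known starting points, not the paper's KL-WGAN objective, which the abstract does not state.

```latex
% Variational form of an f-divergence (the basis of f-GAN critics)
D_f(P \,\|\, Q)
  = \sup_{T} \; \mathbb{E}_{x \sim P}\!\left[ T(x) \right]
             - \mathbb{E}_{x \sim Q}\!\left[ f^{*}\!\left( T(x) \right) \right],
\qquad
T^{\star}(x) = f'\!\left( \frac{p(x)}{q(x)} \right)

% Kantorovich-Rubinstein dual of the Wasserstein-1 distance (the basis of WGAN critics)
W(P, Q)
  = \sup_{\|T\|_{L} \le 1} \; \mathbb{E}_{x \sim P}\!\left[ T(x) \right]
                            - \mathbb{E}_{x \sim Q}\!\left[ T(x) \right]
```

The optimal $f$-GAN critic $T^{\star}$ is a function of the density ratio $p(x)/q(x)$, which is why the discriminator can serve as a density ratio estimator, whereas the Wasserstein critic is constrained only to be 1-Lipschitz.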