April 1, 2020

Paper Group ANR 477

MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding. Clustering and Classification with Non-Existence Attributes: A Sentenced Discrepancy Measure Based Technique. Belief Base Revision for Further Improvement of Unified Answer Set Programming. Self-supervised visual feature learning with curriculum. Tensor Networks for Language Modeling. Are …

MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding


Title	MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding
Authors	Chaoqi Yang, Junwei Lu, Xiaofeng Gao, Haishan Liu, Qiong Chen, Gongshen Liu, Guihai Chen
Abstract	Online real-time bidding (RTB) is known as a complex auction game where ad platforms seek to consider various influential key performance indicators (KPIs), like revenue and return on investment (ROI). The trade-off among these competing goals needs to be balanced on a massive scale. To address the problem, we propose a multi-objective reinforcement learning algorithm, named MoTiAC, for the problem of bidding optimization with various goals. Specifically, in MoTiAC, instead of using a fixed and linear combination of multiple objectives, we compute adaptive weights over time on the basis of how well the current state agrees with the agent’s prior. In addition, we provide interesting properties of model updating and further prove that Pareto optimality could be guaranteed. We demonstrate the effectiveness of our method on a real-world commercial dataset. Experiments show that the model outperforms all state-of-the-art baselines.
Tasks
Published	2020-02-18
URL	https://arxiv.org/abs/2002.07408v1
PDF	https://arxiv.org/pdf/2002.07408v1.pdf
PWC	https://paperswithcode.com/paper/motiac-multi-objective-actor-critics-for-real
Repo
Framework

Clustering and Classification with Non-Existence Attributes: A Sentenced Discrepancy Measure Based Technique


Title	Clustering and Classification with Non-Existence Attributes: A Sentenced Discrepancy Measure Based Technique
Authors	Y. A. Joarder, Emran Hossain, Al Faisal Mahmud
Abstract	A number of real-world clustering problems suffer from incomplete data characterization, for some or all of the data instances, because of lost or absent attributes. Typical clustering approaches cannot be applied directly to such data without pre-processing by techniques like imputation or marginalization. We overcome this drawback by utilizing a Sentenced Discrepancy Measure, which we refer to as the Attribute Weighted Penalty based Discrepancy (AWPD). Using the AWPD measure, we modified the K-MEANS++ and Scalable K-MEANS++ algorithms for clustering and the k Nearest Neighbor (kNN) algorithm for classification so as to make them directly applicable to datasets with non-existence attributes. We present a detailed theoretical analysis which shows that the new AWPD based K-MEANS++, Scalable K-MEANS++ and kNN algorithms converge to a local optimum in a finite number of iterations. We report in-depth experiments on numerous benchmark datasets for various forms of Non-Existence, showing that the proposed clustering and classification techniques usually give better results than some of the renowned imputation methods that are generally used to process such insufficient data. The technique is designed to apply our method directly to datasets which have Non-Existence attributes and to establish a method for detecting unstructured Non-Existence attributes with the best accuracy rate and minimum cost.
Tasks	Imputation
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10411v1
PDF	https://arxiv.org/pdf/2002.10411v1.pdf
PWC	https://paperswithcode.com/paper/clustering-and-classification-with-non
Repo
Framework
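
The abstract above describes a penalty-based discrepancy that lets distance-based algorithms run on records with absent attributes. The listing does not give the AWPD formula, so the following is only a minimal sketch of the general idea, under assumptions of our own: the function name `penalized_distance`, the use of `None` for an absent attribute, and the fixed per-attribute `penalty` are all hypothetical and are not taken from the paper.

```python
import math


def penalized_distance(x, y, penalty=1.0):
    """Distance between two records where None marks an absent attribute.

    Hypothetical illustration (not the paper's AWPD): Euclidean distance
    over attributes observed in both records, plus a fixed penalty for
    every attribute that is absent in at least one of them.
    """
    if len(x) != len(y):
        raise ValueError("records must have the same number of attributes")
    squared = 0.0
    missing = 0
    for a, b in zip(x, y):
        if a is None or b is None:
            missing += 1  # attribute cannot be compared, so penalize it
        else:
            squared += (a - b) ** 2
    return math.sqrt(squared) + penalty * missing
```

A measure of this shape can be passed to a kNN or k-means style routine in place of the usual Euclidean distance, which is the kind of substitution the abstract refers to.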

Belief Base Revision for Further Improvement of Unified Answer Set Programming


Title	Belief Base Revision for Further Improvement of Unified Answer Set Programming
Authors	Kumar Sankar Ray, Sandip Paul, Diganta Saha
Abstract	A belief base revision is developed. The belief base is represented using Unified Answer Set Programs, which are capable of representing imprecise and uncertain information and performing nonmonotonic reasoning with it. The base revision operator is developed using the Removed Set Revision strategy. The operator is characterized with respect to the postulates that a base revision operator satisfies.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2003.04369v1
PDF	https://arxiv.org/pdf/2003.04369v1.pdf
PWC	https://paperswithcode.com/paper/belief-base-revision-for-further-improvement
Repo
Framework

Self-supervised visual feature learning with curriculum


Title	Self-supervised visual feature learning with curriculum
Authors	Vishal Keshav, Fabien Delattre
Abstract	Self-supervised learning techniques have shown their ability to learn meaningful feature representations. This is made possible by training a model on pretext tasks that only require finding correlations between inputs or parts of inputs. However, such pretext tasks need to be carefully hand selected to avoid low level signals that could make those pretext tasks trivial. Moreover, removing those shortcuts often leads to the loss of some semantically valuable information. We show that it directly impacts the speed of learning of the downstream task. In this paper we take inspiration from curriculum learning to progressively remove low level signals and show that it significantly increases the speed of convergence of the downstream task.
Tasks
Published	2020-01-16
URL	https://arxiv.org/abs/2001.05634v1
PDF	https://arxiv.org/pdf/2001.05634v1.pdf
PWC	https://paperswithcode.com/paper/self-supervised-visual-feature-learning-with-1
Repo
Framework

Tensor Networks for Language Modeling


Title	Tensor Networks for Language Modeling
Authors	Jacob Miller, Guillaume Rabusseau, John Terilla
Abstract	The tensor network formalism has enjoyed over two decades of success in modeling the behavior of complex quantum-mechanical systems, but has only recently and sporadically been leveraged in machine learning. Here we introduce a uniform matrix product state (u-MPS) model for probabilistic modeling of sequence data. We identify several distinctive features of this recurrent generative model, notably the ability to condition or marginalize sampling on characters at arbitrary locations within a sequence, with no need for approximate sampling methods. Despite the sequential architecture of u-MPS, we show that a recursive evaluation algorithm can be used to parallelize its inference and training, with a string of length n only requiring parallel time $\mathcal{O}(\log n)$ to evaluate. Experiments on a context-free language demonstrate a strong capacity to learn grammatical structure from limited data, pointing towards the potential of tensor networks for language modeling applications.
Tasks	Language Modelling, Tensor Networks
Published	2020-03-02
URL	https://arxiv.org/abs/2003.01039v1
PDF	https://arxiv.org/pdf/2003.01039v1.pdf
PWC	https://paperswithcode.com/paper/tensor-networks-for-language-modeling
Repo
Framework
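
The abstract above states that a string of length n can be evaluated in parallel time O(log n) despite the sequential architecture. One way to see why, sketched here under our own assumptions, is that evaluating a matrix product state on a string reduces to an ordered product of per-symbol matrices, and matrix multiplication is associative, so the product can be formed in pairwise rounds. The names `matmul`, `tree_product`, `score`, `cores`, `left` and `right` are hypothetical, the rounds run sequentially here rather than on parallel hardware, and how the paper converts the resulting scalar into a probability is not covered.

```python
def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def tree_product(mats):
    """Ordered product of matrices using pairwise reduction rounds.

    Each round multiplies neighbouring pairs, halving the list, so about
    log2(n) rounds are needed. The products inside one round are
    independent of each other and could be computed in parallel.
    """
    if not mats:
        raise ValueError("need at least one matrix")
    level = list(mats)
    while len(level) > 1:
        nxt = [matmul(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element carries over, order preserved
        level = nxt
    return level[0]


def score(string, cores, left, right):
    """Scalar left^T * M(s_1) * ... * M(s_n) * right for a string."""
    m = tree_product([cores[c] for c in string])
    row = [sum(left[i] * m[i][j] for i in range(len(left))) for j in range(len(right))]
    return sum(row[j] * right[j] for j in range(len(right)))
```

The same core matrix is reused at every position for a given symbol, which is what "uniform" refers to in this sketch.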

Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction


Title	Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
Authors	Taeuk Kim, Jihun Choi, Daniel Edmiston, Sang-goo Lee
Abstract	With the recent success and popularity of pre-trained language models (LMs) in natural language processing, there has been a rise in efforts to understand their inner workings. In line with such interest, we propose a novel method that assists us in investigating the extent to which pre-trained LMs capture the syntactic notion of constituency. Our method provides an effective way of extracting constituency trees from the pre-trained LMs without training. In addition, we report intriguing findings in the induced trees, including the fact that pre-trained LMs outperform other approaches in correctly demarcating adverb phrases in sentences.
Tasks
Published	2020-01-30
URL	https://arxiv.org/abs/2002.00737v1
PDF	https://arxiv.org/pdf/2002.00737v1.pdf
PWC	https://paperswithcode.com/paper/are-pre-trained-language-models-aware-of-1
Repo
Framework

A review on outlier/anomaly detection in time series data


Title	A review on outlier/anomaly detection in time series data
Authors	Ane Blázquez-García, Angel Conde, Usue Mori, Jose A. Lozano
Abstract	Recent advances in technology have brought major breakthroughs in data collection, enabling a large amount of data to be gathered over time and thus generating time series. Mining this data has become an important task for researchers and practitioners in the past few years, including the detection of outliers or anomalies that may represent errors or events of interest. This review aims to provide a structured and comprehensive state-of-the-art on outlier detection techniques in the context of time series. To this end, a taxonomy is presented based on the main aspects that characterize an outlier detection technique.
Tasks	Anomaly Detection, Outlier Detection, Time Series
Published	2020-02-11
URL	https://arxiv.org/abs/2002.04236v1
PDF	https://arxiv.org/pdf/2002.04236v1.pdf
PWC	https://paperswithcode.com/paper/a-review-on-outlieranomaly-detection-in-time
Repo
Framework

Optimal statistical inference in the presence of systematic uncertainties using neural network optimization based on binned Poisson likelihoods with nuisance parameters


Title	Optimal statistical inference in the presence of systematic uncertainties using neural network optimization based on binned Poisson likelihoods with nuisance parameters
Authors	Stefan Wunsch, Simon Jörger, Roger Wolf, Günter Quast
Abstract	Data analysis in science, e.g., high-energy particle physics, is often subject to an intractable likelihood if the observables and observations span a high-dimensional input space. Typically the problem is solved by reducing the dimensionality using feature engineering and histograms, where the latter technique allows the likelihood to be built using Poisson statistics. However, in the presence of systematic uncertainties represented by nuisance parameters in the likelihood, the optimal dimensionality reduction with a minimal loss of information about the parameters of interest is not known. This work presents a novel strategy to construct the dimensionality reduction with neural networks for feature engineering and a differential formulation of histograms so that the full workflow can be optimized with the result of the statistical inference, e.g., the variance of a parameter of interest, as objective. We discuss how this approach results in an estimate of the parameters of interest that is close to optimal, and the applicability of the technique is demonstrated with a simple example based on pseudo-experiments and a more complex example from high-energy particle physics.
Tasks	Dimensionality Reduction, Feature Engineering
Published	2020-03-16
URL	https://arxiv.org/abs/2003.07186v1
PDF	https://arxiv.org/pdf/2003.07186v1.pdf
PWC	https://paperswithcode.com/paper/optimal-statistical-inference-in-the-presence
Repo
Framework
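
The abstract above relies on a likelihood built from histogram bins using Poisson statistics. As a minimal sketch of that ingredient only, the function below computes the negative log-likelihood of observed bin counts given expected bin contents. The name `binned_poisson_nll` is hypothetical, and the nuisance parameters and neural network optimization described in the abstract are deliberately left out; in a fuller model the expected counts would depend on the parameter of interest and on the nuisance parameters.

```python
import math


def binned_poisson_nll(observed, expected):
    """Negative log-likelihood of independent Poisson counts per bin.

    For each bin with observed count n and expectation lam:
        -log P(n | lam) = lam - n * log(lam) + log(n!)
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    nll = 0.0
    for n, lam in zip(observed, expected):
        if lam <= 0:
            raise ValueError("expected bin contents must be positive")
        nll += lam - n * math.log(lam) + math.lgamma(n + 1)  # lgamma(n+1) = log(n!)
    return nll
```

For a single bin the function is minimized when the expectation equals the observed count, which is the usual maximum-likelihood behaviour for a Poisson mean.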

Outlier Detection Ensemble with Embedded Feature Selection


Title	Outlier Detection Ensemble with Embedded Feature Selection
Authors	Li Cheng, Yijie Wang, Xinwang Liu, Bin Li
Abstract	Feature selection plays an important role in improving the performance of outlier detection, especially for noisy data. Existing methods usually perform feature selection and outlier scoring separately, which would select feature subsets that may not optimally serve for outlier detection, leading to unsatisfying performance. In this paper, we propose an outlier detection ensemble framework with embedded feature selection (ODEFS), to address this issue. Specifically, for each random sub-sampling based learning component, ODEFS unifies feature selection and outlier detection into a pairwise ranking formulation to learn feature subsets that are tailored for the outlier detection method. Moreover, we adopt the thresholded self-paced learning to simultaneously optimize feature selection and example selection, which is helpful to improve the reliability of the training set. After that, we design an alternate algorithm with proved convergence to solve the resultant optimization problem. In addition, we analyze the generalization error bound of the proposed framework, which provides theoretical guarantee on the method and insightful practical guidance. Comprehensive experimental results on 12 real-world datasets from diverse domains validate the superiority of the proposed ODEFS.
Tasks	Feature Selection, Outlier Detection
Published	2020-01-15
URL	https://arxiv.org/abs/2001.05492v1
PDF	https://arxiv.org/pdf/2001.05492v1.pdf
PWC	https://paperswithcode.com/paper/outlier-detection-ensemble-with-embedded
Repo
Framework

High-dimensional mixed-frequency IV regression


Title	High-dimensional mixed-frequency IV regression
Authors	Andrii Babii
Abstract	This paper introduces a high-dimensional linear IV regression for the data sampled at mixed frequencies. We show that the high-dimensional slope parameter of a high-frequency covariate can be identified and accurately estimated leveraging on a low-frequency instrumental variable. The distinguishing feature of the model is that it allows handling high-dimensional datasets without imposing the approximate sparsity restrictions. We propose a Tikhonov-regularized estimator and derive the convergence rate of its mean-integrated squared error for time series data. The estimator has a closed-form expression that is easy to compute and demonstrates excellent performance in our Monte Carlo experiments. We estimate the real-time price elasticity of supply on the Australian electricity spot market. Our estimates suggest that the supply is relatively inelastic and that its elasticity is heterogeneous throughout the day.
Tasks	Time Series
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13478v1
PDF	https://arxiv.org/pdf/2003.13478v1.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-mixed-frequency-iv
Repo
Framework
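
The abstract above mentions a Tikhonov-regularized estimator with a closed-form expression. The paper's estimator for the mixed-frequency IV setting is not given in this listing, so the sketch below shows only the generic textbook form of Tikhonov regularization for an ordinary linear system, beta = (X^T X + alpha I)^{-1} X^T y. The function names `solve` and `tikhonov` and the parameter `alpha` are our own, and no instrumental variable enters this simplified version.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def tikhonov(x, y, alpha):
    """Generic Tikhonov (ridge) solution (X^T X + alpha I)^{-1} X^T y."""
    p = len(x[0])
    gram = [
        [sum(row[i] * row[j] for row in x) + (alpha if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]
    rhs = [sum(row[i] * target for row, target in zip(x, y)) for i in range(p)]
    return solve(gram, rhs)
```

With `alpha` set to zero this reduces to ordinary least squares when X^T X is invertible; a positive `alpha` keeps the system well conditioned even when it is not, which is the practical appeal of a closed-form regularized estimator.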

Interpreting a Penalty as the Influence of a Bayesian Prior


Title	Interpreting a Penalty as the Influence of a Bayesian Prior
Authors	Pierre Wolinski, Guillaume Charpiat, Yann Ollivier
Abstract	In machine learning, it is common to optimize the parameters of a probabilistic model, modulated by a somewhat ad hoc regularization term that penalizes some values of the parameters. Regularization terms appear naturally in Variational Inference (VI), a tractable way to approximate Bayesian posteriors: the loss to optimize contains a Kullback–Leibler divergence term between the approximate posterior and a Bayesian prior. We fully characterize which regularizers can arise this way, and provide a systematic way to compute the corresponding prior. This viewpoint also provides a prediction for useful values of the regularization factor in neural networks. We apply this framework to regularizers such as L1 or group-Lasso.
Tasks
Published	2020-02-01
URL	https://arxiv.org/abs/2002.00178v1
PDF	https://arxiv.org/pdf/2002.00178v1.pdf
PWC	https://paperswithcode.com/paper/interpreting-a-penalty-as-the-influence-of-a
Repo
Framework

Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data


Title	Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data
Authors	Kun Zhou, Berrak Sisman, Haizhou Li
Abstract	Emotional voice conversion aims to convert the spectrum and prosody to change the emotional patterns of speech, while preserving the speaker identity and linguistic content. Many studies require parallel speech data between different emotional patterns, which is not practical in real life. Moreover, they often model the conversion of fundamental frequency (F0) with a simple linear transform. As F0 is a key aspect of intonation that is hierarchical in nature, we believe that it is more adequate to model F0 in different temporal scales by using wavelet transform. We propose a CycleGAN network to find an optimal pseudo pair from non-parallel training data by learning forward and inverse mappings simultaneously using adversarial and cycle-consistency losses. We also study the use of continuous wavelet transform (CWT) to decompose F0 into ten temporal scales, which describe speech prosody at different time resolutions, for effective F0 conversion. Experimental results show that our proposed framework outperforms the baselines both in objective and subjective evaluations.
Tasks	Voice Conversion
Published	2020-02-01
URL	https://arxiv.org/abs/2002.00198v1
PDF	https://arxiv.org/pdf/2002.00198v1.pdf
PWC	https://paperswithcode.com/paper/transforming-spectrum-and-prosody-for
Repo
Framework

The Data Representativeness Criterion: Predicting the Performance of Supervised Classification Based on Data Set Similarity


Title	The Data Representativeness Criterion: Predicting the Performance of Supervised Classification Based on Data Set Similarity
Authors	Evelien Schat, Rens van de Schoot, Wouter M. Kouw, Duco Veen, Adriënne M. Mendrik
Abstract	In a broad range of fields it may be desirable to reuse a supervised classification algorithm and apply it to a new data set. However, generalization of such an algorithm and thus achieving a similar classification performance is only possible when the training data used to build the algorithm is similar to new unseen data one wishes to apply it to. It is often unknown in advance how an algorithm will perform on new unseen data, which is a crucial reason for not deploying an algorithm at all. Therefore, tools are needed to measure the similarity of data sets. In this paper, we propose the Data Representativeness Criterion (DRC) to determine how representative a training data set is of a new unseen data set. We present a proof of principle, to see whether the DRC can quantify the similarity of data sets and whether the DRC relates to the performance of a supervised classification algorithm. We compared a number of magnetic resonance imaging (MRI) data sets, ranging from subtle to severe differences in acquisition parameters. Results indicate that, based on the similarity of data sets, the DRC is able to give an indication as to when the performance of a supervised classifier decreases. The strictness of the DRC can be set by the user, depending on what one considers to be an acceptable underperformance.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12105v1
PDF	https://arxiv.org/pdf/2002.12105v1.pdf
PWC	https://paperswithcode.com/paper/the-data-representativeness-criterion
Repo
Framework

ICE-BeeM: Identifiable Conditional Energy-Based Deep Models


Title	ICE-BeeM: Identifiable Conditional Energy-Based Deep Models
Authors	Ilyes Khemakhem, Ricardo Pio Monti, Diederik P. Kingma, Aapo Hyvärinen
Abstract	Despite the growing popularity of energy-based models, their identifiability properties are not well-understood. In this paper we establish sufficient conditions under which a large family of conditional energy-based models is identifiable in function space, up to a simple transformation. Our results build on recent developments in the theory of nonlinear ICA, showing that the latent representations in certain families of deep latent-variable models are identifiable. We extend these results to a very broad family of conditional energy-based models. In this family, the energy function is simply the dot-product between two feature extractors, one for the dependent variable, and one for the conditioning variable. We show that under mild conditions, the features are unique up to scaling and permutation. Second, we propose the framework of independently modulated component analysis (IMCA), a new form of nonlinear ICA where the independence assumption is relaxed. Importantly, we show that our energy-based model can be used for the estimation of the components: the features learned are a simple and often trivial transformation of the latents.
Tasks	Latent Variable Models
Published	2020-02-26
URL	https://arxiv.org/abs/2002.11537v1
PDF	https://arxiv.org/pdf/2002.11537v1.pdf
PWC	https://paperswithcode.com/paper/ice-beem-identifiable-conditional-energy
Repo
Framework
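
The abstract above describes an energy function that is the dot-product between two feature extractors, one for the dependent variable and one for the conditioning variable. A minimal sketch of that structure follows. The function names and the toy extractors `f` and `g` are hypothetical stand-ins for the deep networks a real model would use, and the `exp(-energy)` form is the common energy-based convention assumed here, with the normalizing constant omitted.

```python
import math


def energy(x, y, f, g):
    """Energy E(x, y) = f(x) . g(y) for feature extractors f and g."""
    fx, gy = f(x), g(y)
    if len(fx) != len(gy):
        raise ValueError("feature extractors must output the same dimension")
    return sum(a * b for a, b in zip(fx, gy))


def unnormalized_density(x, y, f, g):
    """exp(-E(x, y)); dividing by a y-dependent constant would normalize it."""
    return math.exp(-energy(x, y, f, g))


# Toy extractors, chosen only for illustration.
def f(x):
    return [x, x * x]


def g(y):
    return [y, 1.0]
```

With these toy extractors, `energy(2.0, 3.0, f, g)` evaluates `2*3 + 4*1`.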

Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning


Title	Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Authors	Zhiyuan Fang, Tejas Gokhale, Pratyay Banerjee, Chitta Baral, Yezhou Yang
Abstract	Captioning is a crucial and challenging task for video understanding. In videos that involve active agents such as humans, the agent’s actions can bring about myriad changes in the scene. These changes can be observable, such as movements, manipulations, and transformations of the objects in the scene – these are reflected in conventional video captioning. However, unlike images, actions in videos are also inherently linked to social and commonsense aspects such as intentions (why the action is taking place), attributes (such as who is doing the action, on whom, where, using what etc.) and effects (how the world changes due to the action, the effect of the action on other agents). Thus for video understanding, such as when captioning videos or when answering questions about videos, one must have an understanding of these commonsense aspects. We present the first work on generating commonsense captions directly from videos, in order to describe latent aspects such as intentions, attributes, and effects. We present a new dataset “Video-to-Commonsense (V2C)” that contains 9k videos of human agents performing various actions, annotated with 3 types of commonsense descriptions. Additionally we explore the use of open-ended video-based commonsense question answering (V2C-QA) as a way to enrich our captions. We finetune our commonsense generation models on the V2C-QA task where we ask questions about the latent aspects in the video. Both the generation task and the QA task can be used to enrich video captions.
Tasks	Question Answering, Video Captioning, Video Understanding
Published	2020-03-11
URL	https://arxiv.org/abs/2003.05162v2
PDF	https://arxiv.org/pdf/2003.05162v2.pdf
PWC	https://paperswithcode.com/paper/video2commonsense-generating-commonsense
Repo
Framework