January 27, 2020

Paper Group ANR 1280

MaLTESE: Large-Scale Simulation-Driven Machine Learning for Transient Driving Cycles. Variational Auto-Decoder. Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation. Modeling Embedding Dimension Correlations via Convolutional Neural Collaborative Filtering. Designing Evaluations of Machine Learni …

MaLTESE: Large-Scale Simulation-Driven Machine Learning for Transient Driving Cycles


Title	MaLTESE: Large-Scale Simulation-Driven Machine Learning for Transient Driving Cycles
Authors	Shashi M. Aithal, Prasanna Balaprakash
Abstract	Optimal engine operation during a transient driving cycle is the key to achieving greater fuel economy, engine efficiency, and reduced emissions. In order to achieve continuously optimal engine operation, engine calibration methods use a combination of static correlations obtained from dynamometer tests for steady-state operating points and road and/or track performance data. As the parameter space of control variables, design variable constraints, and objective functions increases, the cost and duration for optimal calibration become prohibitively large. In order to reduce the number of dynamometer tests required for calibrating modern engines, a large-scale simulation-driven machine learning approach is presented in this work. A parallel, fast, robust, physics-based reduced-order engine simulator is used to obtain performance and emission characteristics of engines over a wide range of control parameters under various transient driving conditions (drive cycles). We scale the simulation up to 3,906 nodes of the Theta supercomputer at the Argonne Leadership Computing Facility to generate data required to train a machine learning model. The trained model is then used to predict various engine parameters of interest. Our results show that a deep-neural-network-based surrogate model achieves high accuracy for various engine parameters such as exhaust temperature, exhaust pressure, nitric oxide, and engine torque. Once trained, the deep-neural-network-based surrogate model is fast for inference: it requires about 16 microseconds to predict the engine performance and emissions for a single design configuration, compared with about 0.5 s per configuration with the engine simulator. Moreover, we demonstrate that transfer learning and retraining can be leveraged to incrementally retrain the surrogate model to cope with new configurations that fall outside the training data space.
Tasks	Calibration, Transfer Learning
Published	2019-09-22
URL	https://arxiv.org/abs/1909.09929v1
PDF	https://arxiv.org/pdf/1909.09929v1.pdf
PWC	https://paperswithcode.com/paper/190909929
Repo
Framework
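
To put the abstract's timing figures in perspective, here is a rough back-of-the-envelope calculation. It uses only the two per-configuration costs quoted above (about 16 microseconds for the surrogate, about 0.5 s for the simulator). The sweep size of one million configurations is a hypothetical number chosen purely for illustration.

```python
# Per-configuration costs quoted in the abstract, in microseconds.
SIMULATOR_US = 500_000   # about 0.5 s per configuration
SURROGATE_US = 16        # about 16 microseconds per configuration

# Hypothetical sweep size, chosen only for illustration (not from the paper).
N_CONFIGS = 1_000_000

# Ratio of the two per-configuration costs.
speedup = SIMULATOR_US / SURROGATE_US

# Total sequential cost of the hypothetical sweep with each approach.
simulator_hours = N_CONFIGS * SIMULATOR_US / 1e6 / 3600
surrogate_seconds = N_CONFIGS * SURROGATE_US / 1e6

print(f"speedup: about {speedup:,.0f}x")
print(f"simulator: about {simulator_hours:.1f} h, surrogate: about {surrogate_seconds:.0f} s")
```

The comparison ignores the one-time cost of generating training data and training the surrogate, which the abstract does not quantify.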

Variational Auto-Decoder


Title	Variational Auto-Decoder
Authors	Amir Zadeh, Yao-Chong Lim, Paul Pu Liang, Louis-Philippe Morency
Abstract	Learning a generative model from partial data (data with missingness) is a challenging area of machine learning research. We study a specific implementation of the Auto-Encoding Variational Bayes (AEVB) algorithm, named in this paper as a Variational Auto-Decoder (VAD). VAD is a generic framework which uses Variational Bayes and Markov Chain Monte Carlo (MCMC) methods to learn a generative model from partial data. The main distinction between VAD and Variational Auto-Encoder (VAE) is the encoder component, as VAD does not have one. Using a proposed efficient inference method from a multivariate Gaussian approximate posterior, VAD models allow inference to be performed via simple gradient ascent rather than MCMC sampling from a probabilistic decoder. This technique reduces the inference computational cost, allows for using more complex optimization techniques during latent space inference (which are shown to be crucial due to a high degree of freedom in the VAD latent space), and keeps the framework simple to implement. Through extensive experiments over several datasets and different missing ratios, we show that encoders cannot efficiently marginalize the input volatility caused by imputed missing values. We study multimodal datasets in this paper, which is a particular area of impact for VAD models.
Tasks
Published	2019-03-03
URL	https://arxiv.org/abs/1903.00840v5
PDF	https://arxiv.org/pdf/1903.00840v5.pdf
PWC	https://paperswithcode.com/paper/variational-auto-decoder-neural-generative
Repo
Framework
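
The idea of inferring a latent code by gradient ascent, with no encoder, can be sketched on a toy problem. The block below is a simplified illustration and not the paper's method: it assumes a fixed linear decoder, a standard normal prior on the latent, and a point estimate of the latent instead of a full approximate posterior. The names `W`, `decode`, and `infer_latent` are hypothetical.

```python
# Toy setting (all values are illustrative assumptions, not from the paper):
# a fixed linear "decoder" x = W z with a 2-D latent z and a 3-D output x.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

# One data point with a missing entry (None marks missingness).
x = [2.0, None, 1.0]


def decode(z):
    """Apply the linear decoder to a latent code."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in W]


def log_joint(z):
    """Gaussian log-likelihood on observed entries plus a standard normal prior on z."""
    recon = decode(z)
    fit = sum((xi - ri) ** 2 for xi, ri in zip(x, recon) if xi is not None)
    return -0.5 * fit - 0.5 * sum(zj ** 2 for zj in z)


def grad_log_joint(z):
    """Analytic gradient of log_joint; missing entries contribute nothing."""
    recon = decode(z)
    grad = [-zj for zj in z]
    for row, xi, ri in zip(W, x, recon):
        if xi is None:
            continue
        for j, w in enumerate(row):
            grad[j] += w * (xi - ri)
    return grad


def infer_latent(steps=500, lr=0.1):
    """Plain gradient ascent over z, starting from the origin; no encoder involved."""
    z = [0.0, 0.0]
    for _ in range(steps):
        g = grad_log_joint(z)
        z = [zj + lr * gj for zj, gj in zip(z, g)]
    return z


z_hat = infer_latent()
# Keep observed values and fill the missing entry from the decoder output.
x_filled = [ri if xi is None else xi for xi, ri in zip(x, decode(z_hat))]
print(z_hat, x_filled)
```

Because the missing entry is simply skipped in the objective, nothing has to be imputed before inference, which is the situation the abstract contrasts with encoder-based models.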

Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation


Title	Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation
Authors	Yun-Chun Chen, Yen-Yu Lin, Ming-Hsuan Yang, Jia-Bin Huang
Abstract	We present an approach for jointly matching and segmenting object instances of the same category within a collection of images. In contrast to existing algorithms that tackle the tasks of semantic matching and object co-segmentation in isolation, our method exploits the complementary nature of the two tasks. The key insights of our method are two-fold. First, the estimated dense correspondence fields from semantic matching provide supervision for object co-segmentation by enforcing consistency between the predicted masks from a pair of images. Second, the predicted object masks from object co-segmentation in turn allow us to reduce the adverse effects of background clutter and thereby improve semantic matching. Our model is end-to-end trainable and does not require supervision from manually annotated correspondences and object masks. We validate the efficacy of our approach on five benchmark datasets: TSS, Internet, PF-PASCAL, PF-WILLOW, and SPair-71k, and show that our algorithm performs favorably against the state-of-the-art methods on both semantic matching and object co-segmentation tasks.
Tasks
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05857v2
PDF	https://arxiv.org/pdf/1906.05857v2.pdf
PWC	https://paperswithcode.com/paper/show-match-and-segment-joint-learning-of
Repo
Framework

Modeling Embedding Dimension Correlations via Convolutional Neural Collaborative Filtering


Title	Modeling Embedding Dimension Correlations via Convolutional Neural Collaborative Filtering
Authors	Xiaoyu Du, Xiangnan He, Fajie Yuan, Jinhui Tang, Zhiguang Qin, Tat-Seng Chua
Abstract	As the core of a recommender system, collaborative filtering (CF) models the affinity between a user and an item from historical user-item interactions, such as clicks, purchases, and so on. Benefiting from their strong representation power, neural networks have recently revolutionized recommendation research, setting a new standard for CF. However, existing neural recommender models do not explicitly consider the correlations among embedding dimensions, making them less effective in modeling the interaction function between users and items. In this work, we emphasize modeling the correlations among embedding dimensions in neural networks to pursue higher effectiveness for CF. We propose a novel and general neural collaborative filtering framework, namely ConvNCF, which features two designs: 1) applying an outer product to the user embedding and item embedding to explicitly model the pairwise correlations between embedding dimensions, and 2) employing a convolutional neural network above the outer product to learn the high-order correlations among embedding dimensions. To justify our proposal, we present three instantiations of ConvNCF by using different inputs to represent a user and conduct experiments on two real-world datasets. Extensive results verify the utility of modeling embedding dimension correlations with ConvNCF, which outperforms several competitive CF methods.
Tasks	Recommendation Systems
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11171v1
PDF	https://arxiv.org/pdf/1906.11171v1.pdf
PWC	https://paperswithcode.com/paper/modeling-embedding-dimension-correlations-via
Repo
Framework
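
The first design described in the abstract, an outer product of the user and item embeddings, can be illustrated in a few lines. This is a minimal sketch with hypothetical 3-dimensional embeddings; it covers only the interaction map and omits the convolutional layers that the paper places on top of it.

```python
def outer_product(p_u, q_i):
    """Return the K x K interaction map E with E[a][b] = p_u[a] * q_i[b]."""
    return [[pa * qb for qb in q_i] for pa in p_u]


# Hypothetical embeddings, chosen only for illustration.
user_embedding = [0.5, -1.0, 2.0]
item_embedding = [1.0, 0.5, -0.5]

interaction_map = outer_product(user_embedding, item_embedding)

# The diagonal of the map is the element-wise product of the two embeddings,
# and its sum is the plain dot product between them.
diagonal = [interaction_map[k][k] for k in range(len(user_embedding))]
elementwise = [p * q for p, q in zip(user_embedding, item_embedding)]
dot = sum(elementwise)

for row in interaction_map:
    print(row)
print("diagonal:", diagonal, "dot product:", dot)
```

The off-diagonal entries are the pairwise terms between different embedding dimensions that a dot product discards, which is the information the abstract says the model is meant to exploit.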

Designing Evaluations of Machine Learning Models for Subjective Inference: The Case of Sentence Toxicity


Title	Designing Evaluations of Machine Learning Models for Subjective Inference: The Case of Sentence Toxicity
Authors	Agathe Balayn, Alessandro Bozzon
Abstract	Machine Learning (ML) is increasingly applied in real-life scenarios, raising concerns about bias in automatic decision making. We focus on bias as a notion of opinion exclusion that stems from the direct application of traditional ML pipelines to infer subjective properties. We argue that such ML systems should be evaluated with subjectivity and bias in mind. Given that standards for creating evaluation benchmarks are still lacking, we propose an initial list of specifications to define prior to creating evaluation datasets, in order to later accurately evaluate the biases. With the example of a sentence toxicity inference system, we illustrate how the specifications support the analysis of biases related to subjectivity. We highlight difficulties in instantiating these specifications and list future work for the crowdsourcing community to help the creation of appropriate evaluation datasets.
Tasks	Decision Making
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02471v1
PDF	https://arxiv.org/pdf/1911.02471v1.pdf
PWC	https://paperswithcode.com/paper/designing-evaluations-of-machine-learning
Repo
Framework

A Unified Framework for Nonmonotonic Reasoning with Vagueness and Uncertainty


Title	A Unified Framework for Nonmonotonic Reasoning with Vagueness and Uncertainty
Authors	Sandip Paul, Kumar Sankar Ray, Diganta Saha
Abstract	An answer set programming paradigm is proposed that supports nonmonotonic reasoning with vague and uncertain information. The system can represent and reason with prioritized rules and rules with exceptions. An iterative method for answer set computation is proposed. The terminating conditions are identified for a class of logic programs using the notion of difference equations. In order to obtain the difference equations, the set of rules is depicted by a signal-flow-graph-like structure.
Tasks
Published	2019-10-01
URL	https://arxiv.org/abs/1910.06902v3
PDF	https://arxiv.org/pdf/1910.06902v3.pdf
PWC	https://paperswithcode.com/paper/a-unified-framework-for-nonmonotonic
Repo
Framework

PhaseDNN - A Parallel Phase Shift Deep Neural Network for Adaptive Wideband Learning


Title	PhaseDNN - A Parallel Phase Shift Deep Neural Network for Adaptive Wideband Learning
Authors	Wei Cai, Xiaoguang Li, Lizuo Liu
Abstract	In this paper, we propose a phase shift deep neural network (PhaseDNN) which provides wideband convergence in approximating a high-dimensional function during the training of the network. The PhaseDNN exploits the fact that many DNNs achieve convergence in the low-frequency range first; thus, a series of moderately sized DNNs is constructed and trained in parallel for ranges of higher frequencies. With the help of phase shifts in the frequency domain, implemented through a simple phase factor multiplication on the training data, each DNN in the series will be trained to approximate the target function’s higher frequency content over a specific range. Due to the phase shift, each DNN achieves the same speed of convergence as in the low-frequency range. As a result, the proposed PhaseDNN system is able to convert wideband frequency learning to low-frequency learning, thus allowing uniform learning of wideband high-dimensional functions with frequency-adaptive training. Numerical results have demonstrated the capability of PhaseDNN in learning information of a target function from low to high frequency uniformly.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01389v2
PDF	https://arxiv.org/pdf/1905.01389v2.pdf
PWC	https://paperswithcode.com/paper/phasednn-a-parallel-phase-shift-deep-neural
Repo
Framework
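
The phase factor multiplication mentioned in the abstract relies on a standard property of the Fourier transform: multiplying a signal by a complex exponential shifts its spectrum. The sketch below demonstrates only that property on a discrete toy signal; the sample count, frequency bins, and amplitudes are illustrative assumptions, and no neural network is trained.

```python
import cmath

N = 64       # number of samples (illustrative)
SHIFT = 20   # frequency bin to move down to zero (illustrative)


def dft_magnitudes(signal):
    """Magnitudes of the naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * k * t / n) for t, s in enumerate(signal)))
        for k in range(n)
    ]


def dominant_bin(signal):
    """Index of the frequency bin with the largest magnitude."""
    mags = dft_magnitudes(signal)
    return max(range(len(mags)), key=mags.__getitem__)


# A toy signal whose content sits in high-frequency bins 20 and 21.
signal = [
    cmath.exp(2j * cmath.pi * 20 * t / N) + 0.5 * cmath.exp(2j * cmath.pi * 21 * t / N)
    for t in range(N)
]

# Multiplying by a phase factor shifts the whole spectrum down by SHIFT bins.
shifted = [s * cmath.exp(-2j * cmath.pi * SHIFT * t / N) for t, s in enumerate(signal)]

print("dominant bin before shift:", dominant_bin(signal))
print("dominant bin after shift:", dominant_bin(shifted))
```

After the shift, the same content sits near zero frequency, which is the range where, according to the abstract, DNN training converges first.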

Towards Understanding Neural Machine Translation with Word Importance


Title	Towards Understanding Neural Machine Translation with Word Importance
Authors	Shilin He, Zhaopeng Tu, Xing Wang, Longyue Wang, Michael R. Lyu, Shuming Shi
Abstract	Although neural machine translation (NMT) has advanced the state-of-the-art on various language pairs, the interpretability of NMT remains unsatisfactory. In this work, we propose to address this gap by focusing on understanding the input-output behavior of NMT models. Specifically, we measure the word importance by attributing the NMT output to every input word through a gradient-based method. We validate the approach on a couple of perturbation operations, language pairs, and model architectures, demonstrating its superiority in identifying input words with higher influence on translation performance. Encouragingly, the calculated importance can serve as an indicator of input words that are under-translated by NMT models. Furthermore, our analysis reveals that words of certain syntactic categories have higher importance while the categories vary across language pairs, which can inspire better design principles of NMT architectures for multi-lingual translation.
Tasks	Machine Translation
Published	2019-09-01
URL	https://arxiv.org/abs/1909.00326v2
PDF	https://arxiv.org/pdf/1909.00326v2.pdf
PWC	https://paperswithcode.com/paper/towards-understanding-neural-machine
Repo
Framework
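
The abstract says only that word importance comes from a gradient-based attribution method. As one common example of that family, the sketch below applies integrated gradients to a toy scoring function. The function `score`, the scalar per-word features, and the all-zero baseline are hypothetical stand-ins for a real NMT model and its word embeddings.

```python
def score(x):
    """Hypothetical differentiable 'model output' over three scalar word features."""
    return x[0] ** 2 + 3.0 * x[1] + x[0] * x[2]


def score_grad(x):
    """Analytic gradient of score with respect to each input feature."""
    return [2.0 * x[0] + x[2], 3.0, x[0]]


def integrated_gradients(x, baseline, steps=50):
    """Approximate the path integral of the gradient with a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(score_grad(point)):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


# Illustrative input: one scalar feature per source word.
words = ["the", "cat", "sat"]
features = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]

importance = integrated_gradients(features, baseline)
ranking = sorted(zip(words, importance), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

A useful sanity check for this family of methods is completeness: the attributions should sum to the difference between the score at the input and the score at the baseline.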

Deep Multi-Index Hashing for Person Re-Identification


Title	Deep Multi-Index Hashing for Person Re-Identification
Authors	Ming-Wei Li, Qing-Yuan Jiang, Wu-Jun Li
Abstract	Traditional person re-identification (ReID) methods typically represent person images as real-valued features, which makes ReID inefficient when the gallery set is extremely large. Recently, some hashing methods have been proposed to make ReID more efficient. However, these hashing methods generally degrade accuracy, and their efficiency is still not high enough. In this paper, we propose a novel hashing method, called deep multi-index hashing (DMIH), to improve both efficiency and accuracy for ReID. DMIH seamlessly integrates multi-index hashing and multi-branch based networks into the same framework. Furthermore, a novel block-wise multi-index hashing table construction approach and a search-aware multi-index (SAMI) loss are proposed in DMIH to improve the search efficiency. Experiments on three widely used datasets show that DMIH can outperform other state-of-the-art baselines, including both hashing methods and real-valued methods, in terms of both efficiency and accuracy.
Tasks	Person Re-Identification
Published	2019-05-27
URL	https://arxiv.org/abs/1905.10980v1
PDF	https://arxiv.org/pdf/1905.10980v1.pdf
PWC	https://paperswithcode.com/paper/deep-multi-index-hashing-for-person-re
Repo
Framework
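
For readers unfamiliar with multi-index hashing, the following sketch shows the classical lookup scheme that the term usually refers to, not the learned DMIH model itself. Binary codes are split into `m` blocks with one hash table per block; by the pigeonhole principle, any code within Hamming distance `r < m` of the query must match the query exactly in at least one block. The gallery codes and parameter values are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(ca != cb for ca, cb in zip(a, b))


def split_blocks(code, m):
    """Split a bit string into m equal blocks (length must be divisible by m)."""
    size = len(code) // m
    return [code[i * size:(i + 1) * size] for i in range(m)]


def build_tables(gallery, m):
    """One hash table per block, mapping block content to gallery indices."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(gallery):
        for table, block in zip(tables, split_blocks(code, m)):
            table[block].append(idx)
    return tables


def search(query, gallery, tables, m, radius):
    """Return gallery indices within `radius` of the query; requires radius < m."""
    if radius >= m:
        raise ValueError("exact block matching is only complete when radius < m")
    candidates = set()
    for table, block in zip(tables, split_blocks(query, m)):
        candidates.update(table.get(block, []))
    # Verify candidates with the full Hamming distance.
    return sorted(i for i in candidates if hamming(query, gallery[i]) <= radius)


# Hypothetical 8-bit gallery codes.
gallery = ["00000000", "00000011", "11110000", "10000001", "11111111"]
tables = build_tables(gallery, m=4)
matches = search("00000001", gallery, tables, m=4, radius=2)
print(matches)
```

Only the candidates returned by the block lookups are compared in full, which is where the efficiency gain over a linear scan of the gallery comes from.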

Discrete Infomax Codes for Supervised Representation Learning


Title	Discrete Infomax Codes for Supervised Representation Learning
Authors	Yoonho Lee, Wonjae Kim, Wonpyo Park, Seungjin Choi
Abstract	Learning compact discrete representations of data is a key task on its own or for facilitating subsequent processing of data. In this paper we present a model that produces Discrete InfoMax Codes (DIMCO); we learn a probabilistic encoder that yields k-way d-dimensional codes associated with input data. Our model’s learning objective is to maximize the mutual information between codes and labels with a regularization, which enforces entries of a codeword to be as independent as possible. We show that the infomax principle also justifies previous loss functions (e.g., cross-entropy) as its special cases. Our analysis also shows that using shorter codes, as DIMCO does, reduces overfitting in the context of few-shot classification. Through experiments in various domains, we observe this implicit meta-regularization effect of DIMCO. Furthermore, we show that the codes learned by DIMCO are efficient in terms of both memory and retrieval time compared to previous methods.
Tasks	Meta-Learning, Metric Learning, Representation Learning
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11656v2
PDF	https://arxiv.org/pdf/1905.11656v2.pdf
PWC	https://paperswithcode.com/paper/discrete-infomax-codes-for-meta-learning
Repo
Framework
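
The memory claim in the abstract follows from simple counting: a k-way d-dimensional code consists of d symbols, each taking one of k values. The sketch below computes the storage per code and the number of distinct codewords. The values of `K`, `D`, and the 64-dimensional float32 embedding used for comparison are hypothetical and not taken from the paper.

```python
def code_bits(k, d):
    """Bits needed for one k-way d-dimensional code: d symbols, ceil(log2 k) bits each."""
    bits_per_symbol = (k - 1).bit_length()
    return d * bits_per_symbol


def codebook_size(k, d):
    """Number of distinct codewords a k-way d-dimensional code can represent."""
    return k ** d


# Hypothetical settings, chosen only for illustration.
K, D = 16, 4
FLOAT_DIM = 64  # dimension of a float32 embedding used for comparison

discrete_bits = code_bits(K, D)
float_bits = FLOAT_DIM * 32

print(f"{K}-way {D}-dimensional code: {discrete_bits} bits, {codebook_size(K, D)} codewords")
print(f"{FLOAT_DIM}-dim float32 embedding: {float_bits} bits")
```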

Contextual Local Explanation for Black Box Classifiers


Title	Contextual Local Explanation for Black Box Classifiers
Authors	Zijian Zhang, Fan Yang, Haofan Wang, Xia Hu
Abstract	We introduce a new model-agnostic explanation technique, called CLE, which explains the prediction of any classifier. CLE gives a faithful and interpretable explanation of the prediction by approximating the model locally using an interpretable model. We demonstrate the flexibility of CLE by explaining different models for text, tabular and image classification, and its fidelity through simulated user experiments.
Tasks	Image Classification
Published	2019-10-02
URL	https://arxiv.org/abs/1910.00768v1
PDF	https://arxiv.org/pdf/1910.00768v1.pdf
PWC	https://paperswithcode.com/paper/contextual-local-explanation-for-black-box
Repo
Framework
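
The general idea of approximating a model locally with an interpretable one can be sketched in one dimension. This is a generic local-surrogate illustration and not the CLE algorithm: the black box is a hypothetical scalar function standing in for a classifier score, and the surrogate is an ordinary least-squares line fitted to evenly spaced perturbations around the instance.

```python
def black_box(x):
    """Hypothetical opaque model; only its outputs are used by the explainer."""
    return x * x


def local_linear_fit(f, x0, radius=0.1, n=21):
    """Fit y = slope * x + intercept to f on n evenly spaced points around x0."""
    xs = [x0 - radius + 2 * radius * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return slope, intercept


x0 = 3.0
slope, intercept = local_linear_fit(black_box, x0)
surrogate_at_x0 = slope * x0 + intercept
print(f"local slope: {slope:.4f}, surrogate at x0: {surrogate_at_x0:.4f}")
```

The fitted slope serves as the explanation: it states how the model output changes with the input near the instance, even though the model is nonlinear globally.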

Real-time 2019 Portuguese Parliament Election Results Dataset


Title	Real-time 2019 Portuguese Parliament Election Results Dataset
Authors	Nuno Moniz
Abstract	This paper presents a data set describing the evolution of results in the Portuguese Parliamentary Elections of October 6th, 2019. The data spans a time interval of 4 hours and 25 minutes, in intervals of 5 minutes, concerning the results of the 27 parties involved in the electoral event. The data set is tailored for predictive modelling tasks, mostly focused on numerical forecasting tasks. Regardless, it allows for other tasks such as ordinal regression or learn-to-rank.
Tasks
Published	2019-12-05
URL	https://arxiv.org/abs/1912.08922v1
PDF	https://arxiv.org/pdf/1912.08922v1.pdf
PWC	https://paperswithcode.com/paper/real-time-2019-portuguese-parliament-election
Repo
Framework
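
The time resolution stated in the abstract implies a fixed number of 5-minute steps, which is useful to know when framing a forecasting task. The sketch below derives it from the two quoted figures. Whether the data set records a snapshot at both endpoints is an assumption made here, so the snapshot count is indicative only.

```python
from datetime import timedelta

# Figures quoted in the abstract.
span = timedelta(hours=4, minutes=25)
step = timedelta(minutes=5)

# Number of whole 5-minute intervals in the span, and any leftover time.
intervals = span // step
remainder = span % step

# Assumption: one snapshot at each interval boundary, including both endpoints.
snapshots_if_endpoints_included = intervals + 1

print(f"{intervals} intervals, remainder {remainder}")
print(f"up to {snapshots_if_endpoints_included} snapshots under the endpoint assumption")
```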

On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces


Title	On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces
Authors	Satoshi Hayakawa, Taiji Suzuki
Abstract	Deep learning has been applied to various tasks in the field of machine learning and has shown superiority to other common procedures such as kernel methods. To provide a better theoretical understanding of the reasons for its success, we discuss the performance of deep learning and other methods on a nonparametric regression problem with a Gaussian noise. Whereas existing theoretical studies of deep learning have been based mainly on mathematical theories of well-known function classes such as Hölder and Besov classes, we focus on function classes with discontinuity and sparsity, which are those naturally assumed in practice. To highlight the effectiveness of deep learning, we compare deep learning with a class of linear estimators representative of a class of shallow estimators. It is shown that the minimax risk of a linear estimator on the convex hull of a target function class does not differ from that of the original target function class. This results in the suboptimality of linear methods over a simple but non-convex function class, on which deep learning can attain nearly the minimax-optimal rate. In addition to this extreme case, we consider function classes with sparse wavelet coefficients. On these function classes, deep learning also attains the minimax rate up to log factors of the sample size, and linear methods are still suboptimal if the assumed sparsity is strong. We also point out that the parameter sharing of deep neural networks can remarkably reduce the complexity of the model in our setting.
Tasks
Published	2019-05-22
URL	https://arxiv.org/abs/1905.09195v2
PDF	https://arxiv.org/pdf/1905.09195v2.pdf
PWC	https://paperswithcode.com/paper/on-the-minimax-optimality-and-superiority-of
Repo
Framework

Stochastic Feedforward Neural Networks: Universal Approximation


Title	Stochastic Feedforward Neural Networks: Universal Approximation
Authors	Thomas Merkh, Guido Montúfar
Abstract	In this chapter we take a look at the universal approximation question for stochastic feedforward neural networks. In contrast to deterministic networks, which represent mappings from a set of inputs to a set of outputs, stochastic networks represent mappings from a set of inputs to a set of probability distributions over the set of outputs. In particular, even if the sets of inputs and outputs are finite, the class of stochastic mappings in question is not finite. Moreover, while for a deterministic function the values of all output variables can be computed independently of each other given the values of the inputs, in the stochastic setting the values of the output variables may need to be correlated, which requires that their values are computed jointly. Deep belief networks are a prominent class of stochastic feedforward networks that has played a key role in the resurgence of deep learning. The representational power of these networks has been studied mainly in the generative setting, as models of probability distributions without an input, or in the discriminative setting for the special case of deterministic mappings. We study the representational power of deep sigmoid belief networks in terms of compositions of linear transformations of probability distributions, Markov kernels, that can be expressed by the layers of the network. We investigate different types of shallow and deep architectures, and the minimal number of layers and units per layer that are sufficient and necessary in order for the network to be able to approximate any given stochastic mapping from the set of inputs to the set of outputs arbitrarily well.
Tasks
Published	2019-10-22
URL	https://arxiv.org/abs/1910.09763v1
PDF	https://arxiv.org/pdf/1910.09763v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-feedforward-neural-networks
Repo
Framework

Adversarial reconstruction for Multi-modal Machine Translation


Title	Adversarial reconstruction for Multi-modal Machine Translation
Authors	Jean-Benoit Delbrouck, Stéphane Dupont
Abstract	Even with the growing interest in problems at the intersection of Computer Vision and Natural Language, grounding (i.e. identifying) the components of a structured description in an image still remains a challenging task. This contribution aims to propose a model which learns grounding by reconstructing the visual features for the Multi-modal translation task. Previous works have partially investigated standard approaches such as regression methods to approximate the reconstruction of a visual input. In this paper, we propose a different and novel approach which learns grounding by adversarial feedback. To do so, we modulate our network following the recent promising adversarial architectures and evaluate how the adversarial response from a visual reconstruction as an auxiliary task helps the model in its learning. We report the highest scores in terms of BLEU and METEOR metrics on the different datasets.
Tasks	Machine Translation
Published	2019-10-07
URL	https://arxiv.org/abs/1910.02766v1
PDF	https://arxiv.org/pdf/1910.02766v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-reconstruction-for-multi-modal
Repo
Framework