Paper Group ANR 297
Deep Speech Denoising with Vector Space Projections. Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning. Accumulating Knowledge for Lifelong Online Learning. Efficient Bayesian Inference of Sigmoidal Gaussian Cox Processes. Towards Principled Uncertainty Estimation for Deep Neural Networks. Generalizability of predictive models for intensive care unit patients. High-Dimensional Poisson DAG Model Learning Using $\ell_1$-Regularized Regression. $β$-VAEs can retain label information even at high compression. Constructing Narrative Event Evolutionary Graph for Script Event Prediction. Generating Ontologies from Templates: A Rule-Based Approach for Capturing Regularity. Taking the Scenic Route: Automatic Exploration for Videogames. Asymptotic Properties of Recursive Maximum Likelihood Estimation in Non-Linear State-Space Models. Kernel-Based Training of Generative Networks. Risk-Averse Stochastic Convex Bandit. Automatic Rule Learning for Autonomous Driving Using Semantic Memory.
Deep Speech Denoising with Vector Space Projections
Title | Deep Speech Denoising with Vector Space Projections |
Authors | Jeff Hetherly, Paul Gamble, Maria Barrios, Cory Stephenson, Karl Ni |
Abstract | We propose an algorithm to denoise speakers from a single microphone in the presence of non-stationary and dynamic noise. Our approach is inspired by the recent success of neural network models separating speakers from other speakers and singers from instrumental accompaniment. Unlike prior art, we leverage embedding spaces produced with source-contrastive estimation, a technique derived from negative sampling in natural language processing, while simultaneously obtaining a continuous inference mask. Our embedding space directly optimizes for the discrimination of speaker and noise by jointly modeling their characteristics. This space is generalizable in that it is not speaker or noise specific and is capable of denoising speech even if the model has not seen the speaker in the training set. Parameters are trained with dual objectives: one that promotes a selective bandpass filter that eliminates noise at time-frequency positions that exceed signal power, and another that proportionally splits time-frequency content between signal and noise. We compare to state-of-the-art algorithms as well as traditional sparse non-negative matrix factorization solutions. The resulting algorithm avoids severe computational burden by providing a more intuitive and easily optimized approach, while achieving competitive accuracy. |
Tasks | Denoising |
Published | 2018-04-27 |
URL | http://arxiv.org/abs/1804.10669v1 |
http://arxiv.org/pdf/1804.10669v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-speech-denoising-with-vector-space |
Repo | |
Framework | |
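The dual objectives in the abstract correspond to two classical mask targets over time-frequency bins: a binary mask that keeps bins where signal power exceeds noise power, and a ratio mask that splits content proportionally. Below is a minimal numpy sketch contrasting them on toy spectrograms, assuming known speech and noise magnitudes and additive mixing; it illustrates the mask targets only, not the paper's network or embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy magnitude spectrograms (freq bins x time frames); in practice these
# would come from an STFT of the speech and noise signals.
speech = rng.rayleigh(1.0, size=(257, 100))
noise = rng.rayleigh(0.5, size=(257, 100))
mixture = speech + noise  # simplifying assumption: magnitudes add

# Objective 1: a binary mask that keeps only T-F bins where signal power
# exceeds noise power (a selective "bandpass"-style filter).
binary_mask = (speech > noise).astype(float)

# Objective 2: a ratio mask that proportionally splits T-F content
# between signal and noise.
ratio_mask = speech / (speech + noise + 1e-8)

denoised_binary = binary_mask * mixture
denoised_ratio = ratio_mask * mixture

print("binary-mask residual:", np.abs(denoised_binary - speech).mean())
print("ratio-mask residual: ", np.abs(denoised_ratio - speech).mean())
```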
Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning
Title | Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning |
Authors | Yixing Li, Fengbo Ren |
Abstract | Convolutional neural network (CNN) has been widely used for vision-based tasks. Due to the high computational complexity and memory storage requirement, it is hard to directly deploy a full-precision CNN on embedded devices. Hardware-friendly designs are needed for resource-limited and energy-constrained embedded devices. Emerging solutions are adopted for neural network compression, e.g., binary/ternary weight networks, pruned networks and quantized networks. Among them, the Binarized Neural Network (BNN) is believed to be the most hardware-friendly framework due to its small network size and low computational complexity. No existing work has further shrunk the size of a BNN. In this work, we explore the redundancy in BNNs and build a compact BNN (CBNN) based on bit-level sensitivity analysis and bit-level data pruning. The input data is converted to a high-dimensional bit-sliced format. In the post-training stage, we analyze the impact of different bit slices on the accuracy. By pruning the redundant input bit slices and shrinking the network size, we are able to build a more compact BNN. Our results show that we can further scale down the network size of the BNN by up to 3.9x with no more than 1% accuracy drop. The actual runtime can be reduced by up to 2x and 9.9x compared with the baseline BNN and its full-precision counterpart, respectively. |
Tasks | Neural Network Compression |
Published | 2018-02-03 |
URL | http://arxiv.org/abs/1802.00904v1 |
http://arxiv.org/pdf/1802.00904v1.pdf | |
PWC | https://paperswithcode.com/paper/build-a-compact-binary-neural-network-through |
Repo | |
Framework | |
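To make the bit-sliced input format concrete, here is a small numpy sketch. The sensitivity score used here is a hypothetical reconstruction-based stand-in; the paper measures the actual accuracy drop of the trained BNN when a slice is pruned.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(32, 28, 28), dtype=np.uint8)

# Convert 8-bit inputs to a bit-sliced format: one binary plane per bit,
# from most significant (bit 7) down to least significant (bit 0).
bit_slices = np.stack([(images >> b) & 1 for b in range(7, -1, -1)], axis=-1)
print(bit_slices.shape)  # (32, 28, 28, 8)

def score_without_slice(slices, drop):
    """Hypothetical stand-in: reconstruct the input with one bit slice
    zeroed out and report how much signal energy survives. In the paper,
    sensitivity is the accuracy drop of the trained BNN instead."""
    kept = slices.copy()
    kept[..., drop] = 0
    weights = 2 ** np.arange(7, -1, -1)
    recon = (kept * weights).sum(axis=-1)
    return 1.0 - np.abs(recon - images).mean() / 255.0

# Rank slices by sensitivity; pruning starts from the least sensitive
# slices (typically the low-order bits).
scores = [score_without_slice(bit_slices, b) for b in range(8)]
order = np.argsort(scores)[::-1]  # slices whose removal hurts least, first
print("prune candidates (index 0 = MSB):", order[:4])
```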
Accumulating Knowledge for Lifelong Online Learning
Title | Accumulating Knowledge for Lifelong Online Learning |
Authors | Changjian Shui, Ihsen Hedhli, Christian Gagné |
Abstract | Lifelong learning can be viewed as a continuous transfer learning procedure over consecutive tasks, where learning a given task depends on accumulated knowledge — the so-called knowledge base. Most published work on lifelong learning processes each task in batch, implying that a data collection step is required beforehand. We propose a new framework, lifelong online learning, in which the learning procedure for each task is interactive. This is done through a computationally efficient algorithm in which the prediction for a given task combines two intermediate predictions: one using only the information from the current task and one relying on the accumulated knowledge. In this work, two challenges are tackled: making no assumption on the task generation distribution, and handling a possibly unknown number of instances for each task. We provide a theoretical analysis of this algorithm, with a cumulative error upper bound for each task. We find that under some mild conditions, the algorithm can still achieve a small cumulative error even when facing few interactions. Moreover, we provide experimental results on both synthetic and real datasets that validate the correct behavior and practical usefulness of the proposed algorithm. |
Tasks | Transfer Learning |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11479v1 |
http://arxiv.org/pdf/1810.11479v1.pdf | |
PWC | https://paperswithcode.com/paper/accumulating-knowledge-for-lifelong-online |
Repo | |
Framework | |
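As a rough illustration of combining a current-task prediction with a knowledge-base prediction, the sketch below uses an exponentially weighted mixture of the two. The linear learner, step sizes, and the `kb_predict` stand-in are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def combined_online_learner(xs, ys, kb_predict, eta=0.5):
    """Toy sketch: mix a per-task online learner with a knowledge-base
    predictor via exponentially weighted averaging."""
    d = xs.shape[1]
    w_task = np.zeros(d)              # current-task linear predictor
    weights = np.array([1.0, 1.0])    # [task expert, knowledge-base expert]
    losses = []
    for x, y in zip(xs, ys):
        p_task = w_task @ x
        p_kb = kb_predict(x)
        p = weights @ np.array([p_task, p_kb]) / weights.sum()
        losses.append((p - y) ** 2)
        # Multiplicative-weights update on each expert's squared loss.
        weights *= np.exp(-eta * np.array([(p_task - y) ** 2, (p_kb - y) ** 2]))
        weights /= weights.sum()
        # Online gradient step for the task-specific predictor.
        w_task -= 0.1 * (p_task - y) * x
    return np.mean(losses)

rng = np.random.default_rng(1)
w_true = rng.normal(size=5)
xs = rng.normal(size=(200, 5))
ys = xs @ w_true
# Assume the knowledge base encodes a related (slightly shifted) task.
print(combined_online_learner(xs, ys, lambda x: x @ (w_true + 0.1)))
```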
Efficient Bayesian Inference of Sigmoidal Gaussian Cox Processes
Title | Efficient Bayesian Inference of Sigmoidal Gaussian Cox Processes |
Authors | Christian Donner, Manfred Opper |
Abstract | We present an approximate Bayesian inference approach for estimating the intensity of an inhomogeneous Poisson process, where the intensity function is modelled using a Gaussian process (GP) prior via a sigmoid link function. Augmenting the model with a latent marked Poisson process and Pólya-Gamma random variables, we obtain a representation of the likelihood which is conjugate to the GP prior. We estimate the posterior using a variational free-form mean-field optimisation together with the framework of sparse GPs. Furthermore, as an alternative approximation we suggest a sparse Laplace method for the posterior, for which an efficient expectation-maximisation algorithm is derived to find the posterior's mode. Both algorithms compare well against exact inference obtained by a Markov chain Monte Carlo sampler and a standard variational Gauss approach solving the same model, while being one order of magnitude faster. Furthermore, the performance and speed of our method is competitive with that of another recently proposed Poisson process model based on a quadratic link function, while not being limited to GPs with squared exponential kernels and rectangular domains. |
Tasks | Bayesian Inference |
Published | 2018-08-02 |
URL | https://arxiv.org/abs/1808.00831v2 |
https://arxiv.org/pdf/1808.00831v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-bayesian-inference-of-sigmoidal |
Repo | |
Framework | |
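The augmentation is easiest to see generatively: with a sigmoidal link, the process can be sampled by thinning, and the rejected points form exactly the latent marked Poisson process the paper exploits. A minimal numpy sketch, assuming a fixed function `g` in place of an actual GP draw:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Sigmoidal link: intensity lambda(x) = lam_max * sigmoid(g(x)), where g
# would be a draw from a GP prior; here it is a fixed smooth function.
lam_max = 50.0
g = lambda x: 2.0 * np.sin(2 * np.pi * x)

# Sample the inhomogeneous Poisson process on [0, 1] by thinning: draw a
# homogeneous process with rate lam_max, keep each point with probability
# sigmoid(g(x)). The rejected points are the latent (thinned) process.
n = rng.poisson(lam_max)
candidates = rng.uniform(0.0, 1.0, size=n)
keep = rng.uniform(size=n) < sigmoid(g(candidates))
events, thinned = candidates[keep], candidates[~keep]
print(f"{events.size} events, {thinned.size} latent (thinned) points")
```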
Towards Principled Uncertainty Estimation for Deep Neural Networks
Title | Towards Principled Uncertainty Estimation for Deep Neural Networks |
Authors | Richard Harang, Ethan M. Rudd |
Abstract | When the cost of misclassifying a sample is high, it is useful to have an accurate estimate of uncertainty in the prediction for that sample. There are also multiple types of uncertainty which are best estimated in different ways, for example, uncertainty that is intrinsic to the training set may be well-handled by a Bayesian approach, while uncertainty introduced by shifts between training and query distributions may be better-addressed by density/support estimation. In this paper, we examine three types of uncertainty: model capacity uncertainty, intrinsic data uncertainty, and open set uncertainty, and review techniques that have been derived to address each one. We then introduce a unified hierarchical model, which combines methods from Bayesian inference, invertible latent density inference, and discriminative classification in a single end-to-end deep neural network topology to yield efficient per-sample uncertainty estimation in a detection context. This approach addresses all three uncertainty types and can readily accommodate prior/base rates for binary detection. We then discuss how to extend this model to a more generic multiclass recognition context. |
Tasks | Bayesian Inference |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12278v2 |
http://arxiv.org/pdf/1810.12278v2.pdf | |
PWC | https://paperswithcode.com/paper/principled-uncertainty-estimation-for-deep |
Repo | |
Framework | |
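For intuition, the standard entropy-based split of total predictive uncertainty into intrinsic (aleatoric) and model-capacity (epistemic) parts can be computed from repeated stochastic forward passes. The sketch below uses random Dirichlet draws as a stand-in for those passes; it is not the paper's hierarchical model, and open-set uncertainty would additionally require a density/support model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for T stochastic forward passes (e.g. MC dropout) over a
# 3-class problem: probs[t, k] is the t-th pass's predictive distribution.
probs = rng.dirichlet(alpha=[4.0, 2.0, 1.0], size=20)

mean_p = probs.mean(axis=0)

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

# Total predictive uncertainty = intrinsic data uncertainty (expected
# per-pass entropy) + model-capacity uncertainty (the gap, i.e. the
# mutual information between the label and the model weights).
total = entropy(mean_p)
aleatoric = entropy(probs).mean()
epistemic = total - aleatoric
print(f"total={total:.3f} aleatoric={aleatoric:.3f} epistemic={epistemic:.3f}")
```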
Generalizability of predictive models for intensive care unit patients
Title | Generalizability of predictive models for intensive care unit patients |
Authors | Alistair E. W. Johnson, Tom J. Pollard, Tristan Naumann |
Abstract | A large volume of research has considered the creation of predictive models for clinical data; however, much existing literature reports results using only a single source of data. In this work, we evaluate the performance of models trained on the publicly-available eICU Collaborative Research Database. We show that cross-validation using many distinct centers provides a reasonable estimate of model performance in new centers. We further show that a single model trained across centers transfers well to distinct hospitals, even compared to a model retrained using hospital-specific data. Our results motivate the use of multi-center datasets for model development and highlight the need for data sharing among hospitals to maximize model performance. |
Tasks | |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02275v1 |
http://arxiv.org/pdf/1812.02275v1.pdf | |
PWC | https://paperswithcode.com/paper/generalizability-of-predictive-models-for |
Repo | |
Framework | |
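A minimal sketch of the evaluation protocol, leave-one-center-out cross-validation, using scikit-learn on synthetic stand-in data (the real study uses the eICU Collaborative Research Database):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(4)

# Toy data: 5 hospitals with mildly shifted feature distributions
# (synthetic stand-ins for per-center patient features and outcomes).
X, y, hospital = [], [], []
for h in range(5):
    n, shift = 200, rng.normal(scale=0.3, size=10)
    Xh = rng.normal(size=(n, 10)) + shift
    yh = (Xh[:, 0] + 0.5 * Xh[:, 1] + rng.normal(scale=1.0, size=n)) > 0
    X.append(Xh); y.append(yh); hospital.append(np.full(n, h))
X, y, hospital = np.vstack(X), np.concatenate(y), np.concatenate(hospital)

# Leave-one-center-out cross-validation: train on all hospitals but one,
# evaluate on the held-out hospital to estimate transfer to a new center.
for train, test in LeaveOneGroupOut().split(X, y, groups=hospital):
    model = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    auc = roc_auc_score(y[test], model.predict_proba(X[test])[:, 1])
    print(f"held-out hospital {hospital[test][0]}: AUROC={auc:.3f}")
```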
High-Dimensional Poisson DAG Model Learning Using $\ell_1$-Regularized Regression
Title | High-Dimensional Poisson DAG Model Learning Using $\ell_1$-Regularized Regression |
Authors | Gunwoong Park, Sion Park |
Abstract | In this paper, we develop a new approach to learning high-dimensional Poisson directed acyclic graphical (DAG) models from only observational data, without strong assumptions such as faithfulness and strong sparsity. A key component of our method is to decouple ordering estimation from parent search, so that both problems can be efficiently addressed using $\ell_1$-regularized regression and the mean-variance relationship. We show that a sample size of $n = \Omega( d^{2} \log^{9} p)$ is sufficient for our polynomial-time Mean-variance Ratio Scoring (MRS) algorithm to recover the true directed graph, where $p$ is the number of nodes and $d$ is the maximum indegree. We verify through simulations that our algorithm is statistically consistent in the high-dimensional $p>n$ setting, and performs well compared to the state-of-the-art ODS, GES, and MMHC algorithms. We also demonstrate on multivariate real count data that our MRS algorithm is well-suited to estimating DAG models for multivariate count data in comparison to other methods used for discrete data. |
Tasks | |
Published | 2018-10-05 |
URL | https://arxiv.org/abs/1810.02501v3 |
https://arxiv.org/pdf/1810.02501v3.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-poisson-dag-model-learning |
Repo | |
Framework | |
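The ordering step rests on a simple moment identity: a Poisson root is equidispersed (variance equals mean), while a downstream node is marginally overdispersed. A toy numpy sketch of mean-variance ratio scoring for the first element of the ordering; the full MRS algorithm repeats this conditionally and uses $\ell_1$-regularized regression for parent search.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy Poisson DAG: X0 -> X1 -> X2. A child is marginally overdispersed
# (variance > mean), while the root is equidispersed (variance ~ mean).
n = 20000
x0 = rng.poisson(2.0, size=n)
x1 = rng.poisson(np.exp(0.3 * x0))
x2 = rng.poisson(np.exp(0.3 * x1))
data = np.stack([x2, x0, x1], axis=1)  # shuffle columns to hide the order

# Score each node by its variance/mean ratio; the node closest to 1 is
# the estimated root (first element of the causal ordering).
ratios = data.var(axis=0) / data.mean(axis=0)
print("variance/mean ratios:", np.round(ratios, 2))
print("estimated root (column index):", np.argmin(np.abs(ratios - 1.0)))
```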
$β$-VAEs can retain label information even at high compression
Title | $β$-VAEs can retain label information even at high compression |
Authors | Emily Fertig, Aryan Arbabi, Alexander A. Alemi |
Abstract | In this paper, we investigate the degree to which the encoding of a $\beta$-VAE captures label information across multiple architectures on Binary Static MNIST and Omniglot. Even though they are trained in a completely unsupervised manner, we demonstrate that a $\beta$-VAE can retain a large amount of label information, even when asked to learn a highly compressed representation. |
Tasks | Omniglot |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02682v1 |
http://arxiv.org/pdf/1812.02682v1.pdf | |
PWC | https://paperswithcode.com/paper/-vaes-can-retain-label-information-even-at |
Repo | |
Framework | |
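For reference, the $β$-VAE objective being tuned is the reconstruction term plus a $β$-weighted KL term; larger $β$ forces a more compressed latent code. A self-contained sketch of the per-example loss on binary data, with random tensors standing in for encoder/decoder outputs:

```python
import numpy as np

def beta_vae_loss(x, x_logits, mu, logvar, beta):
    """Per-example beta-VAE objective on binary data: Bernoulli
    reconstruction negative log-likelihood plus beta-weighted KL to N(0, I)."""
    # Bernoulli NLL from logits (numerically stable binary cross-entropy).
    nll = np.sum(np.maximum(x_logits, 0) - x_logits * x
                 + np.log1p(np.exp(-np.abs(x_logits))), axis=-1)
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)
    return nll + beta * kl

rng = np.random.default_rng(6)
x = (rng.uniform(size=(8, 784)) < 0.3).astype(float)  # binarized pixels
x_logits = rng.normal(size=(8, 784))                  # decoder output stand-in
mu, logvar = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
# Larger beta penalizes the code's information content more heavily; the
# paper probes how much label information survives as beta grows.
for beta in (0.5, 1.0, 4.0):
    print(beta, beta_vae_loss(x, x_logits, mu, logvar, beta).mean().round(2))
```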
Constructing Narrative Event Evolutionary Graph for Script Event Prediction
Title | Constructing Narrative Event Evolutionary Graph for Script Event Prediction |
Authors | Zhongyang Li, Xiao Ding, Ting Liu |
Abstract | Script event prediction requires a model to predict the subsequent event given an existing event context. Previous models based on event pairs or event chains cannot make full use of dense event connections, which may limit their capability for event prediction. To remedy this, we propose constructing an event graph to better utilize the event network information for script event prediction. In particular, we first extract narrative event chains from a large news corpus, and then construct a narrative event evolutionary graph (NEEG) based on the extracted chains. The NEEG can be seen as a knowledge base that describes event evolutionary principles and patterns. To solve the inference problem on the NEEG, we present a scaled graph neural network (SGNN) to model event interactions and learn better event representations. Instead of computing representations on the whole graph, the SGNN processes only the concerned nodes each time, which makes our model feasible for large-scale graphs. By comparing the similarity between input context event representations and candidate event representations, we can choose the most reasonable subsequent event. Experimental results on the widely used New York Times corpus demonstrate that our model significantly outperforms state-of-the-art baseline methods under the standard multiple-choice narrative cloze evaluation. |
Tasks | |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05081v2 |
http://arxiv.org/pdf/1805.05081v2.pdf | |
PWC | https://paperswithcode.com/paper/constructing-narrative-event-evolutionary |
Repo | |
Framework | |
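The final selection step is a similarity comparison between context and candidate representations. A minimal numpy sketch with random vectors standing in for SGNN embeddings:

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in embeddings: in the SGNN these come from message passing over
# the narrative event evolutionary graph; here they are random vectors.
context_events = rng.normal(size=(8, 128))   # the given event chain
candidates = rng.normal(size=(5, 128))       # multiple-choice candidates

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Score each candidate by its mean cosine similarity to the context
# events and pick the most plausible subsequent event.
sims = normalize(candidates) @ normalize(context_events).T  # (5, 8)
scores = sims.mean(axis=1)
print("predicted candidate:", int(np.argmax(scores)))
```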
Generating Ontologies from Templates: A Rule-Based Approach for Capturing Regularity
Title | Generating Ontologies from Templates: A Rule-Based Approach for Capturing Regularity |
Authors | Henrik Forssell, Christian Kindermann, Daniel P. Lupp, Uli Sattler, Evgenij Thorstensen |
Abstract | We present a second-order language that can be used to succinctly specify ontologies in a consistent and transparent manner. This language is based on ontology templates (OTTR), a framework for capturing recurring patterns of axioms in ontological modelling. The language and our results are independent of any specific DL. We define the language and its semantics, including the case of negation-as-failure, investigate reasoning over ontologies specified using our language, and show results about the decidability of useful reasoning tasks about the language itself. We also state and discuss some open problems that we believe to be of interest. |
Tasks | |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10436v1 |
http://arxiv.org/pdf/1809.10436v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-ontologies-from-templates-a-rule |
Repo | |
Framework | |
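To illustrate the template idea, the toy Python below expands a recurring axiom pattern from parameter bindings; the syntax is invented for illustration and is not OTTR's.

```python
# Minimal illustration of expanding a recurring axiom pattern from a
# template, in the spirit of ontology templates (invented syntax).
def expand_template(template, bindings):
    """Instantiate every axiom in the template with the given bindings."""
    return [axiom.format(**bindings) for axiom in template]

# A template capturing a recurring modelling pattern: a pizza class with
# a required topping (a classic ontology-tutorial example).
NAMED_PIZZA = [
    "SubClassOf({pizza} Pizza)",
    "SubClassOf({pizza} ObjectSomeValuesFrom(hasTopping {topping}))",
]

for pizza, topping in [("Margherita", "Tomato"), ("Hawaiian", "Pineapple")]:
    for axiom in expand_template(NAMED_PIZZA, {"pizza": pizza, "topping": topping}):
        print(axiom)
```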
Taking the Scenic Route: Automatic Exploration for Videogames
Title | Taking the Scenic Route: Automatic Exploration for Videogames |
Authors | Zeping Zhan, Batu Aytemiz, Adam M. Smith |
Abstract | Machine playtesting tools and game moment search engines require exposure to the diversity of a game’s state space if they are to report on or index the most interesting moments of possible play. Meanwhile, mobile app distribution services would like to quickly determine if a freshly-uploaded game is fit to be published. Having access to a semantic map of reachable states in the game would enable efficient inference in these applications. However, human gameplay data is expensive to acquire relative to the coverage of a game that it provides. We show that off-the-shelf automatic exploration strategies can explore with an effectiveness comparable to human gameplay on the same timescale. We contribute generic methods for quantifying exploration quality as a function of time and demonstrate our metric on several elementary techniques and human players on a collection of commercial games sampled from multiple game platforms (from Atari 2600 to Nintendo 64). Emphasizing the diversity of states reached and the semantic map extracted, this work makes productive contrast with the focus on finding a behavior policy or optimizing game score used in most automatic game playing research. |
Tasks | |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03125v1 |
http://arxiv.org/pdf/1812.03125v1.pdf | |
PWC | https://paperswithcode.com/paper/taking-the-scenic-route-automatic-exploration |
Repo | |
Framework | |
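One simple way to quantify exploration quality as a function of time is to track the number of distinct states reached. The toy sketch below compares a random policy against a degenerate one-button policy on a bounded grid "game"; the metric and environment are illustrative stand-ins for the paper's setup.

```python
import random

def exploration_curve(step, state0, policy, horizon):
    """Distinct states reached over time: a simple proxy for exploration
    quality (a toy stand-in for the paper's state-diversity metrics)."""
    seen, state, curve = {state0}, state0, []
    for _ in range(horizon):
        state = step(state, policy(state))
        seen.add(state)
        curve.append(len(seen))
    return curve

# Toy "game": a bounded 21x21 grid; the state is the player's position.
def step(state, action):
    x, y = state
    dx, dy = action
    return (min(20, max(0, x + dx)), min(20, max(0, y + dy)))

ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
random.seed(0)
random_policy = lambda s: random.choice(ACTIONS)
one_button_policy = lambda s: (1, 0)  # a degenerate player: only "right"

print("random:", exploration_curve(step, (0, 0), random_policy, 2000)[-1])
print("one-button:", exploration_curve(step, (0, 0), one_button_policy, 2000)[-1])
```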
Asymptotic Properties of Recursive Maximum Likelihood Estimation in Non-Linear State-Space Models
Title | Asymptotic Properties of Recursive Maximum Likelihood Estimation in Non-Linear State-Space Models |
Authors | Vladislav Z. B. Tadic, Arnaud Doucet |
Abstract | Using stochastic gradient search and the optimal filter derivative, it is possible to perform recursive (i.e., online) maximum likelihood estimation in a non-linear state-space model. As the optimal filter and its derivative are analytically intractable for such a model, they need to be approximated numerically. In [Poyiadjis, Doucet and Singh, Biometrika 2011], a recursive maximum likelihood algorithm based on a particle approximation to the optimal filter derivative has been proposed and studied through numerical simulations. Here, this algorithm and its asymptotic behavior are analyzed theoretically. We show that the algorithm accurately estimates maxima of the underlying (average) log-likelihood when the number of particles is sufficiently large. We also derive (relatively) tight bounds on the estimation error. The obtained results hold under (relatively) mild conditions and cover several classes of non-linear state-space models met in practice. |
Tasks | |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09571v2 |
http://arxiv.org/pdf/1806.09571v2.pdf | |
PWC | https://paperswithcode.com/paper/asymptotic-properties-of-recursive-maximum |
Repo | |
Framework | |
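The recursion being analyzed has the Robbins-Monro form $\theta_{n+1} = \theta_n + \gamma_n \widehat{\nabla}_\theta \log p(y_n \mid \theta_n)$, with the score supplied by the particle approximation of the filter derivative. The toy below shows only the recursion form on an i.i.d. Gaussian model where the score is available in closed form; it is not the particle-filter algorithm.

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy recursive MLE: estimate the mean of Gaussian observations online.
# In the state-space setting the per-observation score would come from a
# particle approximation of the optimal filter derivative instead.
theta_true, sigma = 3.0, 1.0
theta = 0.0
for n in range(1, 5001):
    y = rng.normal(theta_true, sigma)
    score = (y - theta) / sigma**2      # d/dtheta log N(y; theta, sigma^2)
    theta += (1.0 / n) * score          # decreasing step size gamma_n = 1/n
print(f"recursive MLE after 5000 observations: {theta:.3f}")
```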
Kernel-Based Training of Generative Networks
Title | Kernel-Based Training of Generative Networks |
Authors | Kalliopi Basioti, George V. Moustakides, Emmanouil Z. Psarakis |
Abstract | Generative adversarial networks (GANs) are designed with the help of min-max optimization problems that are solved with stochastic gradient-type algorithms which are known to be non-robust. In this work we revisit a non-adversarial method based on kernels which relies on a pure minimization problem and propose a simple stochastic gradient algorithm for the computation of its solution. Using simplified tools from Stochastic Approximation theory we demonstrate that batch versions of the algorithm or smoothing of the gradient do not improve convergence. These observations allow for the development of a training algorithm that enjoys reduced computational complexity and increased robustness while exhibiting similar synthesis characteristics as classical GANs. |
Tasks | |
Published | 2018-11-23 |
URL | http://arxiv.org/abs/1811.09568v1 |
http://arxiv.org/pdf/1811.09568v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-based-training-of-generative-networks |
Repo | |
Framework | |
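A concrete example of a kernel-based, purely minimized training criterion is the maximum mean discrepancy between real and generated samples; the sketch below computes the unbiased MMD$^2$ statistic with a Gaussian kernel (the paper's exact objective and algorithm may differ).

```python
import numpy as np

def mmd2(x, y, sigma=1.0):
    """Unbiased squared maximum mean discrepancy with a Gaussian kernel:
    a kernel-based training criterion that needs only minimization."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma**2))
    m, n = len(x), len(y)
    kxx = (k(x, x).sum() - m) / (m * (m - 1))  # drop the diagonal (k=1 there)
    kyy = (k(y, y).sum() - n) / (n * (n - 1))
    kxy = k(x, y).mean()
    return kxx + kyy - 2 * kxy

rng = np.random.default_rng(9)
real = rng.normal(0.0, 1.0, size=(500, 2))
fake_bad = rng.normal(2.0, 1.0, size=(500, 2))   # mismatched generator
fake_good = rng.normal(0.0, 1.0, size=(500, 2))  # well-matched generator
print("MMD^2 bad: ", round(mmd2(real, fake_bad), 4))
print("MMD^2 good:", round(mmd2(real, fake_good), 4))
```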
Risk-Averse Stochastic Convex Bandit
Title | Risk-Averse Stochastic Convex Bandit |
Authors | Adrian Rivera Cardoso, Huan Xu |
Abstract | Motivated by applications in clinical trials and finance, we study the problem of online convex optimization (with bandit feedback) where the decision maker is risk-averse. We provide two algorithms to solve this problem. The first one is a descent-type algorithm which is easy to implement. The second algorithm, which combines the ellipsoid method and a center point device, achieves (almost) optimal regret bounds with respect to the number of rounds. To the best of our knowledge this is the first attempt to address risk-aversion in the online convex bandit problem. |
Tasks | |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.00737v1 |
http://arxiv.org/pdf/1810.00737v1.pdf | |
PWC | https://paperswithcode.com/paper/risk-averse-stochastic-convex-bandit |
Repo | |
Framework | |
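A descent-type algorithm for bandit feedback must estimate gradients from function values alone. The sketch below uses the standard two-point spherical gradient estimator on a plain noisy quadratic; the paper's objective is risk-averse (e.g., mean plus a variance penalty) and its algorithms differ in detail.

```python
import numpy as np

rng = np.random.default_rng(10)

# Bandit convex optimization toy: only noisy function values are
# observed, so the gradient is estimated from queries alone.
d, delta, eta = 5, 0.05, 0.005
x_star = np.ones(d)

def f(x):                               # bandit feedback: noisy value only
    return np.sum((x - x_star) ** 2) + rng.normal(scale=0.01)

x = np.zeros(d)
for t in range(5000):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)              # uniform direction on the sphere
    # Two-point gradient estimate from a pair of nearby queries.
    g = (d / (2 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u
    x -= eta * g
print("final point (target is all ones):", np.round(x, 2))
```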
Automatic Rule Learning for Autonomous Driving Using Semantic Memory
Title | Automatic Rule Learning for Autonomous Driving Using Semantic Memory |
Authors | Dmitriy Korchev, Aruna Jammalamadaka, Rajan Bhattacharyya |
Abstract | This paper presents a novel approach for automatic rule learning applicable to an autonomous driving system using real driving data. |
Tasks | Autonomous Driving |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.07904v2 |
http://arxiv.org/pdf/1809.07904v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-rule-learning-for-autonomous |
Repo | |
Framework | |