April 1, 2020

2696 words 13 mins read

Paper Group NANR 129

At Your Fingertips: Automatic Piano Fingering Detection. Rigging the Lottery: Making All Tickets Winners. A Uniform Generalization Error Bound for Generative Adversarial Networks. Set Functions for Time Series. The fairness-accuracy landscape of neural classifiers. A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation. Phy …

At Your Fingertips: Automatic Piano Fingering Detection

Title At Your Fingertips: Automatic Piano Fingering Detection
Authors Anonymous
Abstract Automatic Piano Fingering is a hard task that computers can learn using data. As data collection is hard and expensive, we propose to automate this process by automatically extracting fingerings from public videos and MIDI files, using computer-vision techniques. Running this process on 90 videos results in the largest dataset for piano fingering, with more than 150K notes. We show that when running a previously proposed model for automatic piano fingering on our dataset and then fine-tuning it on manually labeled piano fingering data, we achieve state-of-the-art results. In addition to the fingering extraction method, we also introduce a novel method for transferring deep-learning computer-vision models to out-of-domain data, by fine-tuning them on out-of-domain augmentations proposed by a Generative Adversarial Network (GAN). For demonstration, we anonymously release a visualization of the output of our process for a single video at https://youtu.be/Gfs1UWQhr5Q
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=H1MOqeHYvB
PDF https://openreview.net/pdf?id=H1MOqeHYvB
PWC https://paperswithcode.com/paper/at-your-fingertips-automatic-piano-fingering
Repo
Framework

Rigging the Lottery: Making All Tickets Winners

Title Rigging the Lottery: Making All Tickets Winners
Authors Anonymous
Abstract Sparse neural networks have been shown to yield computationally efficient networks with improved inference times. There is a large body of work on training dense networks to yield sparse networks for inference (Molchanov et al., 2017; Zhu & Gupta, 2018; Louizos et al., 2017; Li et al., 2016; Guo et al., 2016). This limits the size of the largest trainable sparse model to that of the largest trainable dense model. In this paper we introduce a method to train sparse neural networks with a fixed parameter count and a fixed computational cost throughout training, without sacrificing accuracy relative to existing dense-to-sparse training methods. Our method updates the topology of the network during training by using parameter magnitudes and infrequent gradient calculations. We show that this approach requires fewer floating-point operations (FLOPs) to achieve a given level of accuracy compared to prior techniques. We demonstrate state-of-the-art sparse training results with ResNet-50, MobileNet v1 and MobileNet v2 on the ImageNet-2012 dataset. Finally, we provide some insights into why allowing the topology to change during the optimization can overcome local minima encountered when the topology remains static.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=ryg7vA4tPB
PDF https://openreview.net/pdf?id=ryg7vA4tPB
PWC https://paperswithcode.com/paper/rigging-the-lottery-making-all-tickets
Repo
Framework
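
The abstract describes updating the sparse topology during training using parameter magnitudes and infrequent gradient calculations. Below is a minimal numpy sketch of one plausible prune-and-grow step in that spirit; the specific criteria (drop the smallest-magnitude active weights, grow where the dense gradient is largest), the `update_fraction` parameter, and all names are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def prune_and_grow(weights, mask, dense_grad, update_fraction=0.1):
    """One sparse-topology update: drop some active weights, grow the same number.

    weights:    dense weight array; only entries where mask == 1 are active
    mask:       binary array with the same shape as weights
    dense_grad: gradient w.r.t. the dense weights (computed only at infrequent updates)
    """
    n_active = int(mask.sum())
    k = max(1, int(update_fraction * n_active))

    active = np.flatnonzero(mask)
    inactive = np.flatnonzero(mask == 0)

    # Drop the k active connections with the smallest weight magnitude ...
    drop = active[np.argsort(np.abs(weights.flat[active]))[:k]]
    # ... and grow k previously inactive connections where the gradient magnitude
    # is largest, so the total parameter count stays fixed throughout training.
    grow = inactive[np.argsort(-np.abs(dense_grad.flat[inactive]))[:k]]

    mask.flat[drop] = 0
    weights.flat[drop] = 0.0
    mask.flat[grow] = 1
    weights.flat[grow] = 0.0   # newly grown connections start from zero
    return weights, mask

rng = np.random.default_rng(0)
M = (rng.random((8, 8)) < 0.2).astype(float)            # ~20% dense mask
W = rng.normal(size=(8, 8)) * M
W, M = prune_and_grow(W, M, dense_grad=rng.normal(size=(8, 8)))
print(int(M.sum()))                                       # active-parameter count unchanged
```

Because the drop and grow sets have the same size and never overlap, the parameter count and per-step compute stay constant, which is the fixed-cost property the abstract emphasizes.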

A Uniform Generalization Error Bound for Generative Adversarial Networks

Title A Uniform Generalization Error Bound for Generative Adversarial Networks
Authors Anonymous
Abstract This paper focuses on the theoretical investigation of unsupervised generalization theory of generative adversarial networks (GANs). We first formulate a more reasonable definition of general error and generalization bounds for GANs. On top of that, we establish a bound on the generalization error with a fixed generator in a general weight normalization context. Then, we obtain a width-independent bound by applying $\ell_{p,q}$ and spectral norm weight normalization. To better understand the unsupervised model, GANs, we establish a generalization bound which holds uniformly with respect to the choice of generators. Hence, we can explain how the complexity of discriminators and generators contributes to the generalization error. For $\ell_{p,q}$ and spectral weight normalization, we provide explicit guidance on how to design parameters to train robust generators. Our numerical simulations also verify that our generalization bound is reasonable.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Skek-TVYvr
PDF https://openreview.net/pdf?id=Skek-TVYvr
PWC https://paperswithcode.com/paper/a-uniform-generalization-error-bound-for
Repo
Framework
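
The analysis is stated for discriminators under $\ell_{p,q}$ and spectral norm weight normalization. Purely as a reading aid (the bound itself is not reproduced here), the sketch below shows one common convention for these two matrix norms and a hard rescaling that enforces a unit norm; the function names and the specific $\ell_{p,q}$ convention are my own assumptions.

```python
import numpy as np

def lpq_norm(W, p=2.0, q=1.0):
    """One common l_{p,q} convention: p-norm of each column, then q-norm of those values."""
    col_norms = np.sum(np.abs(W) ** p, axis=0) ** (1.0 / p)
    return np.sum(col_norms ** q) ** (1.0 / q)

def spectral_norm(W):
    """Largest singular value of W."""
    return np.linalg.norm(W, ord=2)

def weight_normalize(W, norm_fn, target=1.0):
    """Rescale W so that norm_fn(W) equals target (weight normalization as a hard constraint)."""
    n = norm_fn(W)
    return W if n == 0 else W * (target / n)

W = np.random.randn(16, 16)
print(spectral_norm(weight_normalize(W, spectral_norm)))   # ~1.0
print(lpq_norm(weight_normalize(W, lpq_norm)))             # ~1.0
```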

Set Functions for Time Series

Title Set Functions for Time Series
Authors Anonymous
Abstract Despite the eminent successes of deep neural networks, many architectures are often hard to transfer to irregularly-sampled and asynchronous time series that occur in many real-world datasets, such as healthcare applications. This paper proposes a novel framework for classifying irregularly sampled time series with unaligned measurements, focusing on high scalability and data efficiency. Our method SeFT (Set Functions for Time Series) is based on recent advances in differentiable set function learning, is extremely parallelizable, and scales well to very large datasets and online monitoring scenarios. We extensively compare our method to competitors on multiple healthcare time series datasets and show that it performs competitively whilst significantly reducing runtime.
Tasks Time Series
Published 2020-01-01
URL https://openreview.net/forum?id=ByxCrerKvS
PDF https://openreview.net/pdf?id=ByxCrerKvS
PWC https://paperswithcode.com/paper/set-functions-for-time-series
Repo
Framework
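
The core idea is to treat an irregularly sampled series as an unordered set of observations and classify it with a differentiable set function. A minimal numpy sketch of that sum-decomposition (Deep Sets style) view is below; the actual SeFT architecture, attention-based aggregation, and all layer sizes here are assumptions, not the paper's exact model.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def set_classify(observations, W_phi, W_rho):
    """Classify an irregularly sampled series given as a set of observations.

    observations: array of shape (n, 3) with rows (time, value, modality_id)
    W_phi, W_rho: weights of the per-observation encoder and the set-level
                  classifier (biases omitted for brevity)
    """
    h = relu(observations @ W_phi)   # encode every observation independently
    z = h.mean(axis=0)               # permutation-invariant aggregation over the set
    logits = z @ W_rho               # set-level prediction
    return logits

# Example: 4 observations, 3 input features, 8 hidden units, 2 classes.
rng = np.random.default_rng(0)
obs = rng.normal(size=(4, 3))
print(set_classify(obs, rng.normal(size=(3, 8)), rng.normal(size=(8, 2))))
```

Because each observation is encoded independently and aggregation is a simple reduction, the forward pass parallelizes trivially over observations, which is the scalability argument in the abstract.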

The fairness-accuracy landscape of neural classifiers

Title The fairness-accuracy landscape of neural classifiers
Authors Anonymous
Abstract That machine learning algorithms can demonstrate bias is well-documented by now. This work confronts the challenge of bias mitigation in feedforward fully-connected neural nets from the lens of causal inference and multiobjective optimisation. Regarding the former, a new causal notion of fairness is introduced that is particularly suited to giving a nuanced treatment of datasets collected under unfair practices. In particular, special attention is paid to subjects whose covariates could appear with substantial probability in either value of the sensitive attribute. Next, recognising that fairness and accuracy are competing objectives, the proposed methodology uses techniques from multiobjective optimisation to ascertain the fairness-accuracy landscape of a neural net classifier. Experimental results suggest that the proposed method produces neural net classifiers that distribute evenly across the Pareto front of the fairness-accuracy space, and that it is more efficient at finding non-dominated points than an adversarial approach.
Tasks Causal Inference
Published 2020-01-01
URL https://openreview.net/forum?id=S1e3g1rtwB
PDF https://openreview.net/pdf?id=S1e3g1rtwB
PWC https://paperswithcode.com/paper/the-fairness-accuracy-landscape-of-neural
Repo
Framework
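
The evaluation is framed in terms of non-dominated points on the fairness-accuracy Pareto front. The helper below only illustrates what "non-dominated" means in that two-objective setting (both objectives treated as to-be-maximized); it is generic multiobjective bookkeeping with made-up numbers, not the paper's training method.

```python
def non_dominated(points):
    """Return the non-dominated (Pareto-optimal) points.

    points: list of (accuracy, fairness) pairs, both to be maximized.
    A point is dominated if some other point is at least as good in both
    objectives and strictly better in at least one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(p)
    return front

print(non_dominated([(0.90, 0.60), (0.85, 0.80), (0.80, 0.70), (0.95, 0.55)]))
# -> [(0.9, 0.6), (0.85, 0.8), (0.95, 0.55)]
```

A classifier population that "distributes evenly across the Pareto front" is one whose non-dominated points cover this trade-off curve without large gaps.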

A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation

Title A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
Authors Anonymous
Abstract Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning remains virtually unknown. In this paper, we present a finite-time analysis of a neural Q-learning algorithm, where the data are generated from a Markov decision process and the action-value function is approximated by a deep ReLU neural network. We prove that neural Q-learning finds the optimal policy with an $O(1/T)$ convergence rate if the neural function approximator is sufficiently overparameterized, where $T$ is the number of iterations. To the best of our knowledge, our result is the first finite-time analysis of neural Q-learning under a non-i.i.d. data assumption.
Tasks Q-Learning
Published 2020-01-01
URL https://openreview.net/forum?id=B1xxAJHFwS
PDF https://openreview.net/pdf?id=B1xxAJHFwS
PWC https://paperswithcode.com/paper/a-finite-time-analysis-of-q-learning-with
Repo
Framework
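
For concreteness, here is a bare-bones PyTorch sketch of the object being analyzed: one temporal-difference update of a Q-function approximated by a small ReLU network. The network sizes, learning rate, and the dummy transition are placeholders and do not reflect the paper's analyzed setting or assumptions.

```python
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))  # state dim 4, 2 actions
optimizer = torch.optim.SGD(q_net.parameters(), lr=1e-2)
gamma = 0.99

def q_learning_step(state, action, reward, next_state, done):
    """One temporal-difference update of the neural action-value function."""
    q_sa = q_net(state)[action]                      # Q(s, a)
    with torch.no_grad():                            # bootstrapped target, no gradient
        target = reward + gamma * q_net(next_state).max() * (1.0 - done)
    loss = (q_sa - target) ** 2                      # squared TD error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example with dummy tensors standing in for one environment transition.
s, s2 = torch.randn(4), torch.randn(4)
print(q_learning_step(s, action=1, reward=1.0, next_state=s2, done=0.0))
```

The finite-time question in the abstract is how fast iterating this kind of update drives the learned Q-function toward the optimal one when the transitions come from a Markov chain rather than i.i.d. samples.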

Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics

Title Physics-aware Difference Graph Networks for Sparsely-Observed Dynamics
Authors Sungyong Seo*, Chuizheng Meng*, Yan Liu
Abstract Sparsely available data points cause numerical error in finite differences, which hinders modeling the dynamics of physical systems. The discretization error becomes even larger when the sparse data are irregularly distributed, so that the data are defined on an unstructured grid, making it hard to build deep learning models that handle observations governed by physics on the unstructured grid. In this paper, we propose a novel architecture named Physics-aware Difference Graph Networks (PA-DGN) that exploits neighboring information to learn finite differences inspired by physics equations. PA-DGN further leverages data-driven end-to-end learning to discover underlying dynamical relations between the spatial and temporal differences in given observations. We demonstrate the superiority of PA-DGN in the approximation of directional derivatives and the prediction of graph signals on synthetic data and real-world climate observations from weather stations.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=r1gelyrtwH
PDF https://openreview.net/pdf?id=r1gelyrtwH
PWC https://paperswithcode.com/paper/physics-aware-difference-graph-networks-for
Repo
Framework
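
PA-DGN builds on finite-difference-like quantities computed from neighboring nodes of a graph. The tiny numpy example below shows the fixed-coefficient analogues those learned operators generalize: first-order differences along edges (graph directional derivatives) and a Laplacian-style second difference at each node. The toy graph and values are invented; the paper learns parameterized versions of these operators rather than using the fixed ones shown here.

```python
import numpy as np

# Signal values at 4 graph nodes and an undirected edge list.
f = np.array([1.0, 2.0, 4.0, 7.0])
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

# First-order difference along each edge: the graph analogue of a directional derivative.
edge_diff = {(i, j): f[j] - f[i] for i, j in edges}

# Second-order difference at each node: the unweighted graph Laplacian applied to f.
laplacian = np.zeros_like(f)
for i, j in edges:
    laplacian[i] += f[j] - f[i]
    laplacian[j] += f[i] - f[j]

print(edge_diff)
print(laplacian)
```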

Learning to Prove Theorems by Learning to Generate Theorems

Title Learning to Prove Theorems by Learning to Generate Theorems
Authors Anonymous
Abstract We consider the task of automated theorem proving, a key AI task. Deep learning has shown promise for training theorem provers, but there are limited human-written theorems and proofs available for supervised learning. To address this limitation, we propose to learn a neural generator that automatically synthesizes theorems and proofs for the purpose of training a theorem prover. Experiments on real-world tasks demonstrate that synthetic data from our approach significantly improves the theorem prover and advances the state of the art of automated theorem proving in Metamath.
Tasks Automated Theorem Proving
Published 2020-01-01
URL https://openreview.net/forum?id=BJxiqxSYPB
PDF https://openreview.net/pdf?id=BJxiqxSYPB
PWC https://paperswithcode.com/paper/learning-to-prove-theorems-by-learning-to
Repo
Framework

Improving Multi-Manifold GANs with a Learned Noise Prior

Title Improving Multi-Manifold GANs with a Learned Noise Prior
Authors Anonymous
Abstract Generative adversarial networks (GANs) learn to map samples from a noise distribution to a chosen data distribution. Recent work has demonstrated that GANs are consequently sensitive to, and limited by, the shape of the noise distribution. For example, a single generator struggles to map continuous noise (e.g. a uniform distribution) to discontinuous output (e.g. separate Gaussians) or complex output (e.g. intersecting parabolas). We address this problem by learning to generate from multiple models such that the generator’s output is actually the combination of several distinct networks. We contribute a novel formulation of multi-generator models where we learn a prior over the generators conditioned on the noise, parameterized by a neural network. Thus, this network not only learns the optimal rate to sample from each generator but also optimally shapes the noise received by each generator. The resulting Noise Prior GAN (NPGAN) achieves expressivity and flexibility that surpasses both single generator models and previous multi-generator models.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HJlISCEKvB
PDF https://openreview.net/pdf?id=HJlISCEKvB
PWC https://paperswithcode.com/paper/improving-multi-manifold-gans-with-a-learned
Repo
Framework
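
The key idea is a prior network that, conditioned on the noise sample, decides how to allocate it across several generators. Below is a hedged PyTorch sketch in which the prior outputs mixture weights over K generators and one generator is sampled per noise vector; the real NPGAN architecture, training objective, and the sizes used here are not taken from the paper.

```python
import torch
import torch.nn as nn

K, noise_dim, data_dim = 3, 8, 2

generators = nn.ModuleList(
    [nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(), nn.Linear(32, data_dim)) for _ in range(K)]
)
# Prior network: maps a noise vector to a categorical distribution over the K generators.
prior = nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(), nn.Linear(32, K))

def sample(n):
    z = torch.randn(n, noise_dim)
    probs = torch.softmax(prior(z), dim=-1)               # learned, noise-conditioned prior
    which = torch.multinomial(probs, num_samples=1).squeeze(-1)
    out = torch.stack([generators[int(k)](z_i) for z_i, k in zip(z, which)])
    return out, which

x, which = sample(5)
print(x.shape, which.tolist())
```

Because the prior is a function of the noise, it can both set the sampling rate of each generator and carve the noise space into regions routed to different generators, which is the flexibility the abstract highlights over a fixed uniform mixture.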

Four Things Everyone Should Know to Improve Batch Normalization

Title Four Things Everyone Should Know to Improve Batch Normalization
Authors Anonymous
Abstract A key component of most neural network architectures is the use of normalization layers, such as Batch Normalization. Despite its common use and large utility in optimizing deep architectures that are otherwise intractable, it has been challenging both to generically improve upon Batch Normalization and to understand the circumstances that lend themselves to other enhancements. In this paper, we identify four improvements to the generic form of Batch Normalization and the circumstances under which they work, yielding performance gains across all batch sizes while requiring no additional computation during training. These contributions include proposing a method for reasoning about the current example in inference normalization statistics, fixing a training vs. inference discrepancy; recognizing and validating the powerful regularization effect of Ghost Batch Normalization for small and medium batch sizes; examining the effect of weight decay regularization on the scaling and shifting parameters gamma and beta; and identifying a new normalization algorithm for very small batch sizes by combining the strengths of Batch and Group Normalization. We validate our results empirically on five datasets: CIFAR-100, SVHN, Caltech-256, Oxford Flowers102, and ImageNet.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=HJx8HANFDH
PDF https://openreview.net/pdf?id=HJx8HANFDH
PWC https://paperswithcode.com/paper/four-things-everyone-should-know-to-improve-1
Repo
Framework
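
One of the four components the abstract singles out is Ghost Batch Normalization, i.e. normalizing over small virtual sub-batches instead of the full batch. Here is a minimal numpy sketch of the training-time forward pass only; the learned scale/shift parameters and running statistics are omitted, and the ghost size is an assumed hyperparameter.

```python
import numpy as np

def ghost_batch_norm(x, ghost_size=8, eps=1e-5):
    """Normalize each virtual sub-batch ('ghost batch') independently.

    x: array of shape (batch, features); batch is assumed divisible by ghost_size.
    """
    out = np.empty_like(x)
    for start in range(0, x.shape[0], ghost_size):
        chunk = x[start:start + ghost_size]
        mean = chunk.mean(axis=0)
        var = chunk.var(axis=0)
        out[start:start + ghost_size] = (chunk - mean) / np.sqrt(var + eps)
    return out

x = np.random.randn(32, 4)
print(ghost_batch_norm(x).shape)   # (32, 4), normalized per ghost batch of 8 examples
```

Normalizing per ghost batch injects extra noise into the statistics, which is the regularization effect the abstract reports for small and medium batch sizes.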

One-way prototypical networks

Title One-way prototypical networks
Authors Anonymous
Abstract Few-shot models have become a popular topic of research in the past years. They offer the possibility to determine class membership for unseen examples using just a handful of examples for each class. Such models are trained on a wide range of classes and their respective examples, learning a decision metric in the process. Types of few-shot models include matching networks and prototypical networks. We show a new way of training prototypical few-shot models for just a single class. These models have the ability to predict the likelihood of an unseen query belonging to a group of examples without any given counterexamples. The difficulty here lies in the fact that no relative distance to other classes can be calculated via softmax. We solve this problem by introducing a “null class” centered around zero, and enforcing centering with batch normalization. Trained on the commonly used Omniglot data set, we obtain a classification accuracy of .98 on the matched test set, and of .8 on unmatched MNIST data. On the more complex MiniImageNet data set, test accuracy is .8. In addition, we propose a novel Gaussian layer for distance calculation in a prototypical network, which takes the support examples’ distribution rather than just their centroid into account. This extension shows promising results when a higher number of support examples is available.
Tasks Omniglot
Published 2020-01-01
URL https://openreview.net/forum?id=BJgWbpEtPr
PDF https://openreview.net/pdf?id=BJgWbpEtPr
PWC https://paperswithcode.com/paper/one-way-prototypical-networks
Repo
Framework
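
The one-way decision rule replaces the missing counter-class with a "null class" centered at zero. The sketch below shows that rule on already-embedded examples, using the ordinary prototypical-network softmax over negative squared distances; the embedding network, the batch-norm centering trick, and the proposed Gaussian layer are omitted, and the toy data are invented.

```python
import numpy as np

def one_way_proto_prob(support_emb, query_emb):
    """Probability that a query belongs to the single supported class rather than the null class.

    support_emb: (n_support, d) embeddings of the class's support examples
    query_emb:   (d,) embedding of the query
    """
    prototype = support_emb.mean(axis=0)            # class prototype
    null_prototype = np.zeros_like(prototype)       # 'null class' centered at zero
    d_class = np.sum((query_emb - prototype) ** 2)
    d_null = np.sum((query_emb - null_prototype) ** 2)
    # Softmax over negative squared distances to the two prototypes.
    logits = np.array([-d_class, -d_null])
    exp = np.exp(logits - logits.max())
    return exp[0] / exp.sum()

rng = np.random.default_rng(1)
support = rng.normal(loc=2.0, size=(5, 16))
print(one_way_proto_prob(support, rng.normal(loc=2.0, size=16)))  # query near the class
print(one_way_proto_prob(support, rng.normal(loc=0.0, size=16)))  # query near the null class
```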

Supervised learning with incomplete data via sparse representations

Title Supervised learning with incomplete data via sparse representations
Authors Anonymous
Abstract This paper addresses the problem of training a classifier on incomplete data and its application to a complete or incomplete test dataset. A supervised learning method is developed to train a general classifier, such as a logistic regression or a deep neural network, using only a limited number of observed entries, assuming sparse representations of data vectors on an unknown dictionary. The proposed method simultaneously learns the classifier, the dictionary and the corresponding sparse representations of each input data sample. A theoretical analysis is also provided comparing this method with the standard imputation approach, which consists of performing data completion followed by training the classifier on the reconstructions. The limitations of this last “sequential” approach are identified, and a description of how the proposed new “simultaneous” method can overcome the problem of indiscernible observations is provided. Additionally, it is shown that, if it is possible to train a classifier on incomplete observations so that its reconstructions are well separated by a hyperplane, then the same classifier also correctly separates the original (unobserved) data samples. Extensive simulation results are presented on synthetic and well-known reference datasets that demonstrate the effectiveness of the proposed method compared to traditional data imputation methods.
Tasks Imputation
Published 2020-01-01
URL https://openreview.net/forum?id=Syx_f6EFPr
PDF https://openreview.net/pdf?id=Syx_f6EFPr
PWC https://paperswithcode.com/paper/supervised-learning-with-incomplete-data-via
Repo
Framework
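
To make the "simultaneous" idea concrete, here is a schematic joint objective in the spirit of the abstract, where $P_{\Omega_i}$ projects onto the observed entries of sample $i$, $D$ is the dictionary, $a_i$ the sparse code of sample $i$, $w$ the classifier, and $\ell$ a classification loss; the paper's exact formulation, regularizers, and weights $\lambda, \gamma$ may differ:

$$\min_{D,\,\{a_i\},\,w}\;\sum_i \big\lVert P_{\Omega_i}\!\left(x_i - D a_i\right)\big\rVert_2^2 \;+\; \lambda \sum_i \lVert a_i\rVert_1 \;+\; \gamma \sum_i \ell\!\left(y_i,\; w^\top a_i\right)$$

Fitting the dictionary only on observed entries removes the separate imputation step, while the third term couples the learned sparse codes to the classifier, which is the contrast with the "sequential" impute-then-train approach the abstract analyzes.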

Robust Natural Language Representation Learning for Natural Language Inference by Projecting Superficial Words out

Title Robust Natural Language Representation Learning for Natural Language Inference by Projecting Superficial Words out
Authors Anonymous
Abstract In natural language inference, the semantics of some words do not affect the inference. Such information is considered superficial and leads to overfitting. How can we represent and discard such superficial information? In this paper, we use first-order logic (FOL), a classic meaning representation technique, to explain what information is superficial for a given sentence pair. This explanation also suggests two inductive biases according to its properties. We propose a neural network-based approach that utilizes the two inductive biases, and we obtain substantial improvements in extensive experiments.
Tasks Natural Language Inference, Representation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=HkxQzlHFPr
PDF https://openreview.net/pdf?id=HkxQzlHFPr
PWC https://paperswithcode.com/paper/robust-natural-language-representation
Repo
Framework

Learning Calibratable Policies using Programmatic Style-Consistency

Title Learning Calibratable Policies using Programmatic Style-Consistency
Authors Anonymous
Abstract We study the important and challenging problem of controllable generation of long-term sequential behaviors. Solutions to this problem would impact many applications, such as calibrating behaviors of AI agents in games or predicting player trajectories in sports. In contrast to the well-studied areas of controllable generation of images, text, and speech, there are significant challenges that are unique to or exacerbated by generating long-term behaviors: how should we specify the factors of variation to control, and how can we ensure that the generated temporal behavior faithfully demonstrates diverse styles? In this paper, we leverage large amounts of raw behavioral data to learn policies that can be calibrated to generate a diverse range of behavior styles (e.g., aggressive versus passive play in sports). Inspired by recent work on leveraging programmatic labeling functions, we present a novel framework that combines imitation learning with data programming to learn style-calibratable policies. Our primary technical contribution is a formal notion of style-consistency as a learning objective, and its integration with conventional imitation learning approaches. We evaluate our framework using demonstrations from professional basketball players and agents in the MuJoCo physics environment, and show that our learned policies can be accurately calibrated to generate interesting behavior styles in both domains.
Tasks Imitation Learning
Published 2020-01-01
URL https://openreview.net/forum?id=Byx5R0NKPr
PDF https://openreview.net/pdf?id=Byx5R0NKPr
PWC https://paperswithcode.com/paper/learning-calibratable-policies-using-1
Repo
Framework
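
The central object is a style-consistency objective tying programmatic labeling functions to the style the policy is conditioned on. The sketch below is a loose illustration of that idea: a labeling function assigns a style label to a generated trajectory, and style-consistency is the fraction of rollouts whose label matches the requested style. The labeling heuristic, threshold, and all names here are invented for illustration; the paper's formal objective and its integration into imitation learning are not reproduced.

```python
import numpy as np

def speed_label(trajectory, threshold=1.0):
    """Programmatic labeling function: label a trajectory 'aggressive' (1) if its
    average step length exceeds a threshold, else 'passive' (0)."""
    steps = np.linalg.norm(np.diff(trajectory, axis=0), axis=1)
    return int(steps.mean() > threshold)

def style_consistency(rollouts, requested_styles, label_fn=speed_label):
    """Fraction of generated trajectories whose programmatic label matches the
    style label the policy was conditioned on."""
    hits = [label_fn(traj) == style for traj, style in zip(rollouts, requested_styles)]
    return float(np.mean(hits))

rng = np.random.default_rng(0)
fast = np.cumsum(rng.normal(scale=2.0, size=(50, 2)), axis=0)   # long steps
slow = np.cumsum(rng.normal(scale=0.2, size=(50, 2)), axis=0)   # short steps
print(style_consistency([fast, slow], requested_styles=[1, 0]))
```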

Neural Video Encoding

Title Neural Video Encoding
Authors Anonymous
Abstract Deep neural networks have had unprecedented success in computer vision, natural language processing, and speech, largely due to the ability to search for suitable task algorithms via differentiable programming. In this paper, we borrow ideas from Kolmogorov complexity theory and normalizing flows to explore the possibilities of finding arbitrary algorithms that represent data, in particular algorithms which encode sequences of video image frames. Ultimately, we demonstrate neural video encoding using convolutional neural networks to transform autoregressive noise processes, and show that this method has surprising cryptographic analogues for information security.
Tasks
Published 2020-01-01
URL https://openreview.net/forum?id=Byeq_xHtwS
PDF https://openreview.net/pdf?id=Byeq_xHtwS
PWC https://paperswithcode.com/paper/neural-video-encoding
Repo
Framework