February 2, 2020

# Paper Group AWR 27

GeBioToolkit: Automatic Extraction of Gender-Balanced Multilingual Corpus of Wikipedia Biographies. Detecting and Simulating Artifacts in GAN Fake Images. Neural network models and deep learning - a primer for biologists. Scalable Dictionary Classifiers for Time Series Classification. Improving Policies via Search in Cooperative Partially Observabl …

#### GeBioToolkit: Automatic Extraction of Gender-Balanced Multilingual Corpus of Wikipedia Biographies

Title GeBioToolkit: Automatic Extraction of Gender-Balanced Multilingual Corpus of Wikipedia Biographies
Authors Marta R. Costa-jussà, Pau Li Lin, Cristina España-Bonet
Abstract We introduce GeBioToolkit, a tool for extracting multilingual parallel corpora at sentence level, with document and gender information from Wikipedia biographies. Despite the gender inequalities present in Wikipedia, the toolkit has been designed to extract a corpus balanced in gender. While our toolkit is customizable to any number of languages (and different domains), in this work we present a corpus of 2,000 sentences in English, Spanish and Catalan, which has been post-edited by native speakers to become a high-quality dataset for machine translation evaluation. While GeBioCorpus aims at being one of the first non-synthetic gender-balanced test datasets, GeBioToolkit aims at paving the path to standardize procedures to produce gender-balanced datasets.
Published 2019-12-10
URL https://arxiv.org/abs/1912.04778v1
PDF https://arxiv.org/pdf/1912.04778v1.pdf
PWC https://paperswithcode.com/paper/gebiotoolkit-automatic-extraction-of-gender
Repo https://github.com/PLXIV/Gebiotoolkit
Framework none

#### Detecting and Simulating Artifacts in GAN Fake Images

Title Detecting and Simulating Artifacts in GAN Fake Images
Authors Xu Zhang, Svebor Karaman, Shih-Fu Chang
Abstract To detect GAN generated images, conventional supervised machine learning algorithms require collection of a number of real and fake images from the targeted GAN model. However, the specific model used by the attacker is often unavailable. To address this, we propose a GAN simulator, AutoGAN, which can simulate the artifacts produced by the common pipeline shared by several popular GAN models. Additionally, we identify a unique artifact caused by the up-sampling component included in the common GAN pipeline. We show theoretically such artifacts are manifested as replications of spectra in the frequency domain and thus propose a classifier model based on the spectrum input, rather than the pixel input. By using the simulated images to train a spectrum based classifier, even without seeing the fake images produced by the targeted GAN model during training, our approach achieves state-of-the-art performances on detecting fake images generated by popular GAN models such as CycleGAN.
Published 2019-07-15
URL https://arxiv.org/abs/1907.06515v2
PDF https://arxiv.org/pdf/1907.06515v2.pdf
PWC https://paperswithcode.com/paper/detecting-and-simulating-artifacts-in-gan
Framework pytorch
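
The claim that up-sampling leaves replicated spectra can be checked on a toy case. The sketch below is not AutoGAN or the paper's classifier: it assumes a 1-D signal and zero-insertion up-sampling by a factor of 2 (both illustrative choices) and uses a naive DFT to compare the two spectra.

```python
import cmath

def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def upsample_zero_insert(signal, factor=2):
    """Insert (factor - 1) zeros after every sample (assumed up-sampling scheme)."""
    out = []
    for x in signal:
        out.append(x)
        out.extend([0.0] * (factor - 1))
    return out

signal = [1.0, 2.0, 0.5, -1.0]  # hypothetical low-resolution signal
spectrum = dft(signal)
up_spectrum = dft(upsample_zero_insert(signal))

# Check that the up-sampled spectrum is the original spectrum repeated:
# up_spectrum[k] should equal spectrum[k mod N] for every frequency bin k.
replicated = all(
    abs(up_spectrum[k] - spectrum[k % len(signal)]) < 1e-9
    for k in range(len(up_spectrum))
)
print(replicated)
```

In this toy setting the copies are exact. With an interpolation filter after the zero insertion they would typically be attenuated rather than exact, which is one plausible reading of why a classifier on spectrum input is proposed.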

#### Neural network models and deep learning - a primer for biologists

Title Neural network models and deep learning - a primer for biologists
Authors Nikolaus Kriegeskorte, Tal Golan
Abstract Originally inspired by neurobiology, deep neural network models have become a powerful tool of machine learning and artificial intelligence, where they are used to approximate functions and dynamics by learning from examples. Here we give a brief introduction to neural network models and deep learning for biologists. We introduce feedforward and recurrent networks and explain the expressive power of this modeling framework and the backpropagation algorithm for setting the parameters. Finally, we consider how deep neural networks might help us understand the brain’s computations.
Published 2019-02-13
URL http://arxiv.org/abs/1902.04704v2
PDF http://arxiv.org/pdf/1902.04704v2.pdf
PWC https://paperswithcode.com/paper/neural-network-models-and-deep-learning-a
Repo https://github.com/impredicative/urltitle
Framework pytorch

#### Scalable Dictionary Classifiers for Time Series Classification

Title Scalable Dictionary Classifiers for Time Series Classification
Authors Matthew Middlehurst, William Vickers, Anthony Bagnall
Abstract Dictionary based classifiers are a family of algorithms for time series classification (TSC), that focus on capturing the frequency of pattern occurrences in a time series. The ensemble based Bag of Symbolic Fourier Approximation Symbols (BOSS) was found to be a top performing TSC algorithm in a recent evaluation, as well as the best performing dictionary based classifier. A recent addition to the category, the Word Extraction for Time Series Classification (WEASEL), claims an improvement on this performance. Both of these algorithms however have non-trivial scalability issues, taking a considerable amount of build time and space on larger datasets. We evaluate changes to the way BOSS chooses classifiers for its ensemble, replacing its parameter search with random selection. This change allows for the easy implementation of contracting, setting a build time limit for the classifier and check-pointing, saving progress during the classifiers build. To differentiate between the two BOSS ensemble methods we refer to our randomised version as RBOSS. Additionally we test the application of common ensembling techniques to help retain accuracy from the loss of the BOSS parameter search. We achieve a significant reduction in build time without a significant change in accuracy on average when compared to BOSS by creating a size $n$ weighted ensemble selecting the best performers from $k$ randomly chosen parameter sets. Our experiments are conducted on datasets from the recently expanded UCR time series archive. We demonstrate the usability improvements to RBOSS with a case study using a large whale acoustics dataset for which BOSS proved infeasible.
Tasks Time Series, Time Series Classification
Published 2019-07-26
URL https://arxiv.org/abs/1907.11815v1
PDF https://arxiv.org/pdf/1907.11815v1.pdf
PWC https://paperswithcode.com/paper/scalable-dictionary-classifiers-for-time
Repo https://github.com/uea-machine-learning/tsml
Framework none
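
The selection scheme described in the abstract (draw $k$ random parameter sets, keep the best $n$, weight their votes) can be sketched independently of BOSS itself. The code below assumes a caller-supplied `evaluate` function that returns an accuracy estimate for one parameter set; `param_space`, `evaluate`, `predict`, and the parameter names are hypothetical and are not the tsml API.

```python
import random

def select_random_ensemble(param_space, evaluate, k, n, seed=0):
    """Sample k random parameter sets and keep the n with the highest score.

    Returns a list of (score, params) pairs, best first; the score doubles
    as the member's voting weight.
    """
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in param_space.items()}
        for _ in range(k)
    ]
    scored = [(evaluate(params), params) for params in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]

def weighted_vote(members, predict):
    """Combine member predictions; each vote counts with the member's weight."""
    totals = {}
    for weight, params in members:
        label = predict(params)
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

# Toy stand-in for a cross-validated accuracy estimate (assumption).
space = {"window": [10, 20, 40], "word_length": [8, 12, 16]}

def toy_accuracy(params):
    return 1.0 / (1 + abs(params["window"] - 20) + abs(params["word_length"] - 12))

ensemble = select_random_ensemble(space, toy_accuracy, k=25, n=5)
```

Because each candidate is evaluated independently, the loop can stop when a time budget runs out or resume from saved candidates, which is what makes contracting and check-pointing straightforward in this style of search.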

#### Improving Policies via Search in Cooperative Partially Observable Games

Title Improving Policies via Search in Cooperative Partially Observable Games
Authors Adam Lerer, Hengyuan Hu, Jakob Foerster, Noam Brown
Abstract Recent superhuman results in games have largely been achieved in a variety of zero-sum settings, such as Go and Poker, in which agents need to compete against others. However, just like humans, real-world AI systems have to coordinate and communicate with other agents in cooperative partially observable environments as well. These settings commonly require participants to both interpret the actions of others and to act in a way that is informative when being interpreted. Those abilities are typically summarized as theory of mind and are seen as crucial for social interactions. In this paper we propose two different search techniques that can be applied to improve an arbitrary agreed-upon policy in a cooperative partially observable game. The first one, single-agent search, effectively converts the problem into a single agent setting by making all but one of the agents play according to the agreed-upon policy. In contrast, in multi-agent search all agents carry out the same common-knowledge search procedure whenever doing so is computationally feasible, and fall back to playing according to the agreed-upon policy otherwise. We prove that these search procedures are theoretically guaranteed to at least maintain the original performance of the agreed-upon policy (up to a bounded approximation error). In the benchmark challenge problem of Hanabi, our search technique greatly improves the performance of every agent we tested and when applied to a policy trained using RL achieves a new state-of-the-art score of 24.61 / 25 in the game, compared to a previous-best of 24.08 / 25.
Published 2019-12-05
URL https://arxiv.org/abs/1912.02318v1
PDF https://arxiv.org/pdf/1912.02318v1.pdf
PWC https://paperswithcode.com/paper/191202318
Framework pytorch

#### Machine learning in policy evaluation: new tools for causal inference

Title Machine learning in policy evaluation: new tools for causal inference
Authors Noemi Kreif, Karla DiazOrdaz
Abstract While machine learning (ML) methods have received a lot of attention in recent years, these methods are primarily for prediction. Empirical researchers conducting policy evaluations are, on the other hand, pre-occupied with causal problems, trying to answer counterfactual questions: what would have happened in the absence of a policy? Because these counterfactuals can never be directly observed (described as the “fundamental problem of causal inference”) prediction tools from the ML literature cannot be readily used for causal inference. In the last decade, major innovations have taken place incorporating supervised ML tools into estimators for causal parameters such as the average treatment effect (ATE). This holds the promise of attenuating model misspecification issues, and increasing transparency in model selection. One particularly mature strand of the literature includes approaches that incorporate supervised ML approaches in the estimation of the ATE of a binary treatment, under the unconfoundedness and positivity assumptions (also known as exchangeability and overlap assumptions). This article reviews popular supervised machine learning algorithms, including the Super Learner. Then, some specific uses of machine learning for treatment effect estimation are introduced and illustrated, namely (1) to create balance among treated and control groups, (2) to estimate so-called nuisance models (e.g. the propensity score, or conditional expectations of the outcome) in semi-parametric estimators that target causal parameters (e.g. targeted maximum likelihood estimation or the double ML estimator), and (3) the use of machine learning for variable selection in situations with a high number of covariates.
Published 2019-03-01
URL http://arxiv.org/abs/1903.00402v1
PDF http://arxiv.org/pdf/1903.00402v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-in-policy-evaluation-new
Repo https://github.com/KDiazOrdaz/Machine-learning-in-policy-evaluation-new-tools-for-causal-inference
Framework none
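
Use (2) in the abstract, plugging ML-fitted nuisance models into a semi-parametric estimator, can be made concrete with the augmented inverse probability weighting (AIPW) estimator of the ATE. This is a sketch under stated assumptions, not the authors' code: AIPW is chosen here for brevity instead of TMLE, the propensity scores and outcome predictions are assumed to come from some already-fitted learner, and all numbers are made up.

```python
def aipw_ate(treatment, outcome, propensity, mu1, mu0):
    """AIPW estimate of the average treatment effect.

    treatment  : 0/1 indicators
    outcome    : observed outcomes
    propensity : estimated P(treatment = 1 | covariates), strictly in (0, 1)
    mu1, mu0   : predicted outcomes under treatment and under control
    """
    total = 0.0
    for t, y, e, m1, m0 in zip(treatment, outcome, propensity, mu1, mu0):
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (m1 - m0) + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return total / len(outcome)

# Hypothetical data: four units, two treated, outcome models that fit exactly.
treatment = [1, 0, 1, 0]
outcome = [3.0, 1.0, 5.0, 2.0]
propensity = [0.5, 0.5, 0.5, 0.5]
mu1 = [3.0, 3.0, 5.0, 4.0]
mu0 = [1.0, 1.0, 3.0, 2.0]

ate = aipw_ate(treatment, outcome, propensity, mu1, mu0)
```

The residual terms correct the outcome-model contrast when the outcome models are off, and they vanish when the models fit the observed outcomes exactly, as in this toy data. Positivity matters in practice: propensity values near 0 or 1 make the weights explode.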

#### Learning Temporal Attention in Dynamic Graphs with Bilinear Interactions

Title Learning Temporal Attention in Dynamic Graphs with Bilinear Interactions
Authors Boris Knyazev, Carolyn Augusta, Graham W. Taylor
Abstract Graphs evolving over time are a natural way to represent data in many domains, such as social networks, bioinformatics, physics and finance. Machine learning methods for graphs, which leverage such data for various prediction tasks, have seen a recent surge of interest and capability. In practice, ground truth edges between nodes in these graphs can be unknown or suboptimal, which hurts the quality of features propagated through the network. Building on recent progress in modeling temporal graphs and learning latent graphs, we extend two methods, Dynamic Representation (DyRep) and Neural Relational Inference (NRI), for the task of dynamic link prediction. We explore the effect of learning temporal attention edges using NRI without requiring the ground truth graph. In experiments on the Social Evolution dataset, we show semantic interpretability of learned attention, often outperforming the baseline DyRep model that uses a ground truth graph to compute attention. In addition, we consider functions acting on pairs of nodes, which are used to predict link or edge representations. We demonstrate that in all cases, our bilinear transformation is superior to feature concatenation, typically employed in prior work. Source code is available at https://github.com/uoguelph-mlrg/LDG.
Published 2019-09-23
URL https://arxiv.org/abs/1909.10367v1
PDF https://arxiv.org/pdf/1909.10367v1.pdf
PWC https://paperswithcode.com/paper/190910367
Repo https://github.com/uoguelph-mlrg/LDG
Framework pytorch
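
To make the comparison between pairwise functions concrete, here is a minimal sketch of a bilinear score $z_u^\top W z_v$ for a node pair, next to a single linear layer applied to concatenated features. The embeddings and weights are hypothetical, and the actual layers in the LDG repository may differ.

```python
def bilinear_score(z_u, weight, z_v):
    """Compute z_u^T W z_v; every feature of u interacts with every feature of v."""
    return sum(
        z_u[i] * weight[i][j] * z_v[j]
        for i in range(len(z_u))
        for j in range(len(z_v))
    )

def concat_score(z_u, z_v, w):
    """Linear score on [z_u ; z_v]; contributions of u and v are purely additive."""
    return sum(wi * xi for wi, xi in zip(w, list(z_u) + list(z_v)))

z_u = [1.0, 2.0]          # hypothetical embedding of node u
z_v = [3.0, -1.0]         # hypothetical embedding of node v
identity = [[1.0, 0.0], [0.0, 1.0]]

bilinear = bilinear_score(z_u, identity, z_v)            # dot product when W = I
additive = concat_score(z_u, z_v, [1.0, 1.0, 1.0, 1.0])
```

The bilinear form contains multiplicative cross terms between the two nodes, while a single linear layer on the concatenation cannot represent them without further nonlinear layers.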

#### ColluEagle: Collusive review spammer detection using Markov random fields

Title ColluEagle: Collusive review spammer detection using Markov random fields
Authors Zhuo Wang, Runlong Hu, Qian Chen, Pei Gao, Xiaowei Xu
Abstract Product reviews are extremely valuable for online shoppers in providing purchase decisions. Driven by immense profit incentives, fraudsters deliberately fabricate untruthful reviews to distort the reputation of online products. As online reviews become more and more important, group spamming, i.e., a team of fraudsters working collaboratively to attack a set of target products, becomes a new fashion. Previous works use review network effects, i.e. the relationships among reviewers, reviews, and products, to detect fake reviews or review spammers, but ignore time effects, which are critical in characterizing group spamming. In this paper, we propose a novel Markov random field (MRF)-based method (ColluEagle) to detect collusive review spammers, as well as review spam campaigns, considering both network effects and time effects. First we identify co-review pairs, a review phenomenon that happens between two reviewers who review a common product in a similar way, and then model reviewers and their co-review pairs as a pairwise-MRF, and use loopy belief propagation to evaluate the suspiciousness of reviewers. We further design a high quality yet easy-to-compute node prior for ColluEagle, through which the review spammer groups can also be subsequently identified. Experiments show that ColluEagle can not only detect collusive spammers with high precision, significantly outperforming state-of-the-art baselines — FraudEagle and SpEagle, but also identify highly suspicious review spammer campaigns.
Published 2019-11-05
URL https://arxiv.org/abs/1911.01690v1
PDF https://arxiv.org/pdf/1911.01690v1.pdf
PWC https://paperswithcode.com/paper/collueagle-collusive-review-spammer-detection
Repo https://github.com/zhuowangsylu/ColluEagle
Framework none
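
Loopy belief propagation on a pairwise MRF can be sketched in a few lines. The code below is a generic sum-product implementation with two states per reviewer (index 0 for honest, index 1 for spammer); the priors, the compatibility matrix, and the graph are hypothetical and do not reproduce ColluEagle's node prior or its co-review potentials.

```python
def loopy_bp(priors, edges, compat, iters=20):
    """Sum-product belief propagation on a two-state pairwise MRF.

    priors : {node: [p_state0, p_state1]}
    edges  : list of undirected (u, v) pairs, each listed once
    compat : 2x2 matrix, compat[a][b] = affinity of neighbouring states a and b
    Returns {node: [belief_state0, belief_state1]}.
    """
    neighbors = {u: [] for u in priors}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # One message per directed edge, initialised uniformly.
    msgs = {(u, v): [1.0, 1.0] for u in priors for v in neighbors[u]}
    for _ in range(iters):
        new_msgs = {}
        for (u, v) in msgs:
            out = []
            for b in range(2):              # state of the receiving node v
                total = 0.0
                for a in range(2):          # state of the sending node u
                    prod = priors[u][a] * compat[a][b]
                    for w in neighbors[u]:
                        if w != v:          # exclude the message that came from v
                            prod *= msgs[(w, u)][a]
                    total += prod
                out.append(total)
            z = sum(out)
            new_msgs[(u, v)] = [x / z for x in out]
        msgs = new_msgs

    beliefs = {}
    for u in priors:
        b = list(priors[u])
        for w in neighbors[u]:
            for a in range(2):
                b[a] *= msgs[(w, u)][a]
        z = sum(b)
        beliefs[u] = [x / z for x in b]
    return beliefs

# Hypothetical example: A has a neutral prior, its co-reviewer B looks suspicious.
priors = {"A": [0.5, 0.5], "B": [0.1, 0.9]}
compat = [[0.8, 0.2], [0.2, 0.8]]   # co-review pairs tend to share a label
beliefs = loopy_bp(priors, [("A", "B")], compat)
```

On this two-node graph the result is exact and A's spammer belief rises above its prior because of its link to B. On graphs with cycles the same updates are only approximate, which is why the procedure is called loopy.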

#### Parallel Neural Text-to-Speech

Title Parallel Neural Text-to-Speech
Authors Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao
Abstract In this work, we propose a non-autoregressive seq2seq model that converts text to spectrogram. It is fully convolutional and obtains about 46.7 times speed-up over Deep Voice 3 at synthesis while maintaining comparable speech quality using a WaveNet vocoder. Interestingly, it has even fewer attention errors than the autoregressive model on the challenging test sentences. Furthermore, we build the first fully parallel neural text-to-speech system by applying the inverse autoregressive flow (IAF) as the parallel neural vocoder. Our system can synthesize speech from text through a single feed-forward pass. We also explore a novel approach to train the IAF from scratch as a generative model for raw waveform, which avoids the need for distillation from a separately trained WaveNet.
Published 2019-05-21
URL https://arxiv.org/abs/1905.08459v2
PDF https://arxiv.org/pdf/1905.08459v2.pdf
PWC https://paperswithcode.com/paper/parallel-neural-text-to-speech
Repo https://github.com/ksw0306/WaveVAE
Framework pytorch

#### Natural and Realistic Single Image Super-Resolution with Explicit Natural Manifold Discrimination

Title Natural and Realistic Single Image Super-Resolution with Explicit Natural Manifold Discrimination
Authors Jae Woong Soh, Gu Yong Park, Junho Jo, Nam Ik Cho
Abstract Recently, many convolutional neural networks for single image super-resolution (SISR) have been proposed, which focus on reconstructing the high-resolution images in terms of objective distortion measures. However, the networks trained with objective loss functions generally fail to reconstruct the realistic fine textures and details that are essential for better perceptual quality. Recovering the realistic details remains a challenging problem, and only a few works have been proposed which aim at increasing the perceptual quality by generating enhanced textures. However, the generated fake details often make undesirable artifacts and the overall image looks somewhat unnatural. Therefore, in this paper, we present a new approach to reconstructing realistic super-resolved images with high perceptual quality, while maintaining the naturalness of the result. In particular, we focus on the domain prior properties of SISR problem. Specifically, we define the naturalness prior in the low-level domain and constrain the output image in the natural manifold, which eventually generates more natural and realistic images. Our results show better naturalness compared to the recent super-resolution algorithms including perception-oriented ones.
Published 2019-11-09
URL https://arxiv.org/abs/1911.03624v1
PDF https://arxiv.org/pdf/1911.03624v1.pdf
PWC https://paperswithcode.com/paper/natural-and-realistic-single-image-super-1
Repo https://github.com/JWSoh/NatSR
Framework tf

#### Parameter-Efficient Transfer Learning for NLP

Title Parameter-Efficient Transfer Learning for NLP
Authors Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly
Abstract Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones. The parameters of the original network remain fixed, yielding a high degree of parameter sharing. To demonstrate adapter’s effectiveness, we transfer the recently proposed BERT Transformer model to 26 diverse text classification tasks, including the GLUE benchmark. Adapters attain near state-of-the-art performance, whilst adding only a few parameters per task. On GLUE, we attain within 0.4% of the performance of full fine-tuning, adding only 3.6% parameters per task. By contrast, fine-tuning trains 100% of the parameters per task.
Published 2019-02-02
URL https://arxiv.org/abs/1902.00751v2
PDF https://arxiv.org/pdf/1902.00751v2.pdf
PWC https://paperswithcode.com/paper/parameter-efficient-transfer-learning-for-nlp
Framework tf
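
A rough sketch of what an adapter computes may help. The abstract does not give the internals, so the code assumes the bottleneck design usually associated with this paper (down-projection, nonlinearity, up-projection, skip connection); the ReLU nonlinearity, the sizes, and the function names are illustrative choices, not the released implementation.

```python
def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def adapter_forward(h, w_down, b_down, w_up, b_up):
    """Bottleneck adapter with a skip connection.

    h      : hidden vector of size d
    w_down : m x d matrix projecting down to the bottleneck (m < d)
    w_up   : d x m matrix projecting back up
    """
    z = [max(0.0, a + b) for a, b in zip(matvec(w_down, h), b_down)]  # ReLU (assumed)
    delta = [a + b for a, b in zip(matvec(w_up, z), b_up)]
    return [x + d for x, d in zip(h, delta)]                          # skip connection

def adapter_param_count(d, m):
    """Trainable parameters of one adapter: two projections plus two biases."""
    return 2 * d * m + d + m

# With zero-initialised weights the adapter is the identity function.
d, m = 3, 2
h = [1.0, -2.0, 0.5]
out = adapter_forward(
    h,
    w_down=[[0.0] * d for _ in range(m)], b_down=[0.0] * m,
    w_up=[[0.0] * m for _ in range(d)], b_up=[0.0] * d,
)
```

Starting near the identity means the frozen network's behaviour is preserved at the beginning of training, and the per-task cost grows with the bottleneck size `m` rather than with the size of the whole model.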

#### Stochastic Optimization of Sorting Networks via Continuous Relaxations

Title Stochastic Optimization of Sorting Networks via Continuous Relaxations
Authors Aditya Grover, Eric Wang, Aaron Zweig, Stefano Ermon
Abstract Sorting input objects is an important step in many machine learning pipelines. However, the sorting operator is non-differentiable with respect to its inputs, which prohibits end-to-end gradient-based optimization. In this work, we propose NeuralSort, a general-purpose continuous relaxation of the output of the sorting operator from permutation matrices to the set of unimodal row-stochastic matrices, where every row sums to one and has a distinct arg max. This relaxation permits straight-through optimization of any computational graph involving a sorting operation. Further, we use this relaxation to enable gradient-based stochastic optimization over the combinatorially large space of permutations by deriving a reparameterized gradient estimator for the Plackett-Luce family of distributions over permutations. We demonstrate the usefulness of our framework on three tasks that require learning semantic orderings of high-dimensional objects, including a fully differentiable, parameterized extension of the k-nearest neighbors algorithm.
Published 2019-03-21
URL http://arxiv.org/abs/1903.08850v2
PDF http://arxiv.org/pdf/1903.08850v2.pdf
PWC https://paperswithcode.com/paper/stochastic-optimization-of-sorting-networks-1
Repo https://github.com/ermongroup/neuralsort
Framework pytorch
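
The relaxation has a closed form in the paper: row $i$ of the relaxed matrix is a softmax over $((n + 1 - 2i)\,s - A_s \mathbf{1}) / \tau$, where $A_s$ holds the pairwise absolute differences of the scores $s$ and $\tau$ is a temperature. The pure-Python sketch below follows that formula; the scores and temperature are illustrative, and it should be read as an illustration rather than as the reference implementation in the linked repository.

```python
import math

def neural_sort(scores, tau=1.0):
    """Relaxed permutation matrix for sorting `scores` in descending order.

    Returns an n x n row-stochastic matrix; row i concentrates on the index
    of the i-th largest score as tau approaches zero.
    """
    n = len(scores)
    # (A_s 1)[j] = sum over k of |s_j - s_k|
    abs_row_sums = [sum(abs(sj - sk) for sk in scores) for sj in scores]
    relaxed = []
    for i in range(1, n + 1):
        logits = [
            ((n + 1 - 2 * i) * scores[j] - abs_row_sums[j]) / tau
            for j in range(n)
        ]
        peak = max(logits)                       # subtract the max for stability
        exps = [math.exp(l - peak) for l in logits]
        z = sum(exps)
        relaxed.append([e / z for e in exps])
    return relaxed

scores = [0.3, 2.0, -1.0]                        # hypothetical scores
p_hat = neural_sort(scores, tau=0.01)
order = [row.index(max(row)) for row in p_hat]   # hard permutation via row arg max
```

Lower temperatures push each row toward a one-hot vector, at the cost of sharper and less informative gradients; higher temperatures give smoother rows.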

#### Super-realtime facial landmark detection and shape fitting by deep regression of shape model parameters

Title Super-realtime facial landmark detection and shape fitting by deep regression of shape model parameters
Authors Marcin Kopaczka, Justus Schock, Dorit Merhof
Abstract We present a method for highly efficient landmark detection that combines deep convolutional neural networks with well established model-based fitting algorithms. Motivated by established model-based fitting methods such as active shapes, we use a PCA of the landmark positions to allow generative modeling of facial landmarks. Instead of computing the model parameters using iterative optimization, the PCA is included in a deep neural network using a novel layer type. The network predicts model parameters in a single forward pass, thereby allowing facial landmark detection at several hundreds of frames per second. Our architecture allows direct end-to-end training of a model-based landmark detection method and shows that deep neural networks can be used to reliably predict model parameters directly without the need for an iterative optimization. The method is evaluated on different datasets for facial landmark detection and medical image segmentation. PyTorch code is freely available at https://github.com/justusschock/shapenet
Tasks Facial Landmark Detection, Medical Image Segmentation, Semantic Segmentation
Published 2019-02-09
URL http://arxiv.org/abs/1902.03459v1
PDF http://arxiv.org/pdf/1902.03459v1.pdf
PWC https://paperswithcode.com/paper/super-realtime-facial-landmark-detection-and
Repo https://github.com/justusschock/shapenet
Framework pytorch
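
The core of a PCA shape model is a linear map from a few parameters to landmark coordinates. The sketch below assumes that form (mean shape plus a weighted sum of components) and omits any global pose transform; the arrays are hypothetical, and this is not the shapenet layer itself.

```python
def pca_shape(mean_shape, components, params):
    """Reconstruct flattened landmark coordinates from PCA parameters.

    mean_shape : flattened mean landmarks [x1, y1, x2, y2, ...]
    components : list of principal components, each the same length as mean_shape
    params     : one weight per component (what the network would predict)
    """
    shape = list(mean_shape)
    for weight, component in zip(params, components):
        for idx, value in enumerate(component):
            shape[idx] += weight * value
    return shape

# Three hypothetical landmarks and two hypothetical components.
mean_shape = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]
components = [
    [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],    # shifts every landmark along x
    [0.0, 1.0, 0.0, -1.0, 0.0, 0.0],   # moves two landmarks apart along y
]
landmarks = pca_shape(mean_shape, components, params=[2.0, 0.5])
```

Because the map is linear, it is differentiable with respect to the parameters, so a regression network that outputs `params` can be trained end to end on landmark error.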

#### DeepOBS: A Deep Learning Optimizer Benchmark Suite

Title DeepOBS: A Deep Learning Optimizer Benchmark Suite
Authors Frank Schneider, Lukas Balles, Philipp Hennig
Abstract Because the choice and tuning of the optimizer affects the speed, and ultimately the performance of deep learning, there is significant past and recent research in this area. Yet, perhaps surprisingly, there is no generally agreed-upon protocol for the quantitative and reproducible evaluation of optimization strategies for deep learning. We suggest routines and benchmarks for stochastic optimization, with special focus on the unique aspects of deep learning, such as stochasticity, tunability and generalization. As the primary contribution, we present DeepOBS, a Python package of deep learning optimization benchmarks. The package addresses key challenges in the quantitative assessment of stochastic optimizers, and automates most steps of benchmarking. The library includes a wide and extensible set of ready-to-use realistic optimization problems, such as training Residual Networks for image classification on ImageNet or character-level language prediction models, as well as popular classics like MNIST and CIFAR-10. The package also provides realistic baseline results for the most popular optimizers on these test problems, ensuring a fair comparison to the competition when benchmarking new optimizers, and without having to run costly experiments. It comes with output back-ends that directly produce LaTeX code for inclusion in academic publications. It supports TensorFlow and is available open source.
Published 2019-03-13
URL http://arxiv.org/abs/1903.05499v1
PDF http://arxiv.org/pdf/1903.05499v1.pdf
PWC https://paperswithcode.com/paper/deepobs-a-deep-learning-optimizer-benchmark-1
Repo https://github.com/fsschneider/deepobs
Framework tf

#### Cyanure: An Open-Source Toolbox for Empirical Risk Minimization for Python, C++, and soon more

Title Cyanure: An Open-Source Toolbox for Empirical Risk Minimization for Python, C++, and soon more
Authors Julien Mairal
Abstract Cyanure is an open-source C++ software package with a Python interface. The goal of Cyanure is to provide state-of-the-art solvers for learning linear models, based on stochastic variance-reduced stochastic optimization with acceleration mechanisms. Cyanure can handle a large variety of loss functions (logistic, square, squared hinge, multinomial logistic) and regularization functions ($\ell_2$, $\ell_1$, elastic-net, fused Lasso, multi-task group Lasso). It provides a simple Python API, which is very close to that of scikit-learn, which should be extended to other languages such as R or Matlab in the near future.