October 21, 2019


Paper Group AWR 69

Bleaching Text: Abstract Features for Cross-lingual Gender Prediction. Interoceptive robustness through environment-mediated morphological development. Deep learning with differential Gaussian process flows. CSI-Net: Unified Human Body Characterization and Pose Recognition. Aequitas: A Bias and Fairness Audit Toolkit. Creative Invention Benchmark. …

Bleaching Text: Abstract Features for Cross-lingual Gender Prediction


Title	Bleaching Text: Abstract Features for Cross-lingual Gender Prediction
Authors	Rob van der Goot, Nikola Ljubešić, Ian Matroos, Malvina Nissim, Barbara Plank
Abstract	Gender prediction has typically focused on lexical and social network features, yielding good performance, but making systems highly language-, topic-, and platform-dependent. Cross-lingual embeddings circumvent some of these limitations, but capture gender-specific style less. We propose an alternative: bleaching text, i.e., transforming lexical strings into more abstract features. This study provides evidence that such features allow for better transfer across languages. Moreover, we present a first study on the ability of humans to perform cross-lingual gender prediction. We find that human predictive power proves similar to that of our bleached models, and both perform better than lexical models.
Tasks	Gender Prediction
Published	2018-05-08
URL	http://arxiv.org/abs/1805.03122v1
PDF	http://arxiv.org/pdf/1805.03122v1.pdf
PWC	https://paperswithcode.com/paper/bleaching-text-abstract-features-for-cross
Repo	https://github.com/bplank/bleaching-text
Framework	none
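
The bleaching idea in the abstract above, replacing each lexical string with more abstract features, can be illustrated with a minimal sketch. The three features here (character-class shape, token length, vowel/consonant pattern) and the names `bleach_token` and `bleach` are assumptions chosen for illustration; the feature set actually used by the authors may differ, so consult the linked repository for their implementation.

```python
VOWELS = set("aeiouAEIOU")


def bleach_token(token):
    """Map one token to a dict of abstract (non-lexical) features."""
    shape = []
    for ch in token:
        if ch.isupper():
            code = "U"
        elif ch.islower():
            code = "L"
        elif ch.isdigit():
            code = "D"
        else:
            code = "X"
        # Collapse runs of the same character class into one symbol.
        if not shape or shape[-1] != code:
            shape.append(code)
    # V = vowel, C = other letter, O = anything that is not a letter.
    vowels = "".join(
        "V" if ch in VOWELS else "C" if ch.isalpha() else "O" for ch in token
    )
    return {"shape": "".join(shape), "length": len(token), "vowels": vowels}


def bleach(text):
    """Bleach a whitespace-tokenized text, one feature dict per token."""
    return [bleach_token(tok) for tok in text.split()]
```

Because none of these features retain the word itself, a classifier trained on them cannot memorize language-specific vocabulary, which is the property the abstract credits for better cross-lingual transfer.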

Interoceptive robustness through environment-mediated morphological development


Title	Interoceptive robustness through environment-mediated morphological development
Authors	Sam Kriegman, Nick Cheney, Francesco Corucci, Josh C. Bongard
Abstract	Typically, AI researchers and roboticists try to realize intelligent behavior in machines by tuning parameters of a predefined structure (body plan and/or neural network architecture) using evolutionary or learning algorithms. Another but not unrelated longstanding property of these systems is their brittleness to slight aberrations, as highlighted by the growing deep learning literature on adversarial examples. Here we show robustness can be achieved by evolving the geometry of soft robots, their control systems, and how their material properties develop in response to one particular interoceptive stimulus (engineering stress) during their lifetimes. By doing so we realized robots that were equally fit but more robust to extreme material defects (such as might occur during fabrication or by damage thereafter) than robots that did not develop during their lifetimes, or developed in response to a different interoceptive stimulus (pressure). This suggests that the interplay between changes in the containing systems of agents (body plan and/or neural architecture) at different temporal scales (evolutionary and developmental) along different modalities (geometry, material properties, synaptic weights) and in response to different signals (interoceptive and external perception) all dictate those agents’ abilities to evolve or learn capable and robust strategies.
Tasks
Published	2018-04-06
URL	http://arxiv.org/abs/1804.02257v2
PDF	http://arxiv.org/pdf/1804.02257v2.pdf
PWC	https://paperswithcode.com/paper/interoceptive-robustness-through-environment
Repo	https://github.com/skriegman/2018-gecco
Framework	none

Deep learning with differential Gaussian process flows


Title	Deep learning with differential Gaussian process flows
Authors	Pashupati Hegde, Markus Heinonen, Harri Lähdesmäki, Samuel Kaski
Abstract	We propose a novel deep learning paradigm of differential flows that learn stochastic differential equation transformations of inputs prior to a standard classification or regression function. The key property of differential Gaussian processes is the warping of inputs through infinitely deep, but infinitesimal, differential fields that generalise discrete layers into a dynamical system. We demonstrate state-of-the-art results that exceed the performance of deep Gaussian processes and neural networks.
Tasks	Gaussian Processes
Published	2018-10-09
URL	http://arxiv.org/abs/1810.04066v2
PDF	http://arxiv.org/pdf/1810.04066v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-with-differential-gaussian
Repo	https://github.com/hegdepashupati/differential-dgp
Framework	tf

CSI-Net: Unified Human Body Characterization and Pose Recognition


Title	CSI-Net: Unified Human Body Characterization and Pose Recognition
Authors	Fei Wang, Jinsong Han, Shiyuan Zhang, Xu He, Dong Huang
Abstract	We build CSI-Net, a unified Deep Neural Network (DNN), to learn the representation of WiFi signals. Using CSI-Net, we jointly solved two body characterization problems: biometrics estimation (including body fat, muscle, water, and bone rates) and person recognition. We also demonstrated the application of CSI-Net on two distinctive pose recognition tasks: the hand sign recognition (fine-scaled action of the hand) and falling detection (coarse-scaled motion of the body).
Tasks	Person Recognition, Temporal Action Localization
Published	2018-10-07
URL	http://arxiv.org/abs/1810.03064v2
PDF	http://arxiv.org/pdf/1810.03064v2.pdf
PWC	https://paperswithcode.com/paper/csi-net-unified-human-body-characterization
Repo	https://github.com/geekfeiw/CSI-Net
Framework	pytorch

Aequitas: A Bias and Fairness Audit Toolkit


Title	Aequitas: A Bias and Fairness Audit Toolkit
Authors	Pedro Saleiro, Benedict Kuester, Loren Hinkson, Jesse London, Abby Stevens, Ari Anisfeld, Kit T. Rodolfa, Rayid Ghani
Abstract	Recent work has raised concerns on the risk of unintended bias in AI systems being used nowadays that can affect individuals unfairly based on race, gender or religion, among other possible characteristics. While a lot of bias metrics and fairness definitions have been proposed in recent years, there is no consensus on which metric/definition should be used and there are very few available resources to operationalize them. Therefore, despite recent awareness, auditing for bias and fairness when developing and deploying AI systems is not yet a standard practice. We present Aequitas, an open source bias and fairness audit toolkit that is an intuitive and easy to use addition to the machine learning workflow, enabling users to seamlessly test models for several bias and fairness metrics in relation to multiple population sub-groups. Aequitas facilitates informed and equitable decisions around developing and deploying algorithmic decision making systems for both data scientists, machine learning researchers and policymakers.
Tasks	Decision Making
Published	2018-11-14
URL	http://arxiv.org/abs/1811.05577v2
PDF	http://arxiv.org/pdf/1811.05577v2.pdf
PWC	https://paperswithcode.com/paper/aequitas-a-bias-and-fairness-audit-toolkit
Repo	https://github.com/dssg/aequitas
Framework	none
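
To make "bias metrics in relation to multiple population sub-groups" concrete, the sketch below computes one such metric, the false positive rate, per group and then expresses each group's rate as a ratio to a reference group. It deliberately does not use the Aequitas API; the function names `false_positive_rates` and `disparities` are hypothetical, and the choice of metric is just one example among the many the abstract alludes to.

```python
from collections import defaultdict


def false_positive_rates(labels, predictions, groups):
    """Return {group: FPR} using binary labels and predictions (0 or 1)."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for y, y_hat, g in zip(labels, predictions, groups):
        if y == 0:
            negatives[g] += 1
            if y_hat == 1:
                false_pos[g] += 1
    # Groups with no negative examples have an undefined FPR and are skipped.
    return {g: false_pos[g] / n for g, n in negatives.items() if n > 0}


def disparities(rates, reference_group):
    """Express each group's rate as a ratio to the reference group's rate."""
    ref = rates[reference_group]
    if ref == 0:
        raise ValueError("reference group rate is zero; ratio is undefined")
    return {g: r / ref for g, r in rates.items()}
```

A disparity far from 1.0 for some group flags that the model's errors fall unevenly across groups; which threshold counts as unacceptable is a policy decision rather than something the code can settle.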

Creative Invention Benchmark


Title	Creative Invention Benchmark
Authors	Matthew Guzdial, Nicholas Liao, Vishwa Shah, Mark O. Riedl
Abstract	In this paper we present the Creative Invention Benchmark (CrIB), a 2000-problem benchmark for evaluating a particular facet of computational creativity. Specifically, we address combinational p-creativity, the creativity at play when someone combines existing knowledge to achieve a solution novel to that individual. We present generation strategies for the five problem categories of the benchmark and a set of initial baselines.
Tasks
Published	2018-05-09
URL	http://arxiv.org/abs/1805.03720v1
PDF	http://arxiv.org/pdf/1805.03720v1.pdf
PWC	https://paperswithcode.com/paper/creative-invention-benchmark
Repo	https://github.com/mguzdial3/CrIB
Framework	none

Learning to Sample


Title	Learning to Sample
Authors	Oren Dovrat, Itai Lang, Shai Avidan
Abstract	Processing large point clouds is a challenging task. Therefore, the data is often sampled to a size that can be processed more easily. The question is how to sample the data? A popular sampling technique is Farthest Point Sampling (FPS). However, FPS is agnostic to a downstream application (classification, retrieval, etc.). The underlying assumption seems to be that minimizing the farthest point distance, as done by FPS, is a good proxy to other objective functions. We show that it is better to learn how to sample. To do that, we propose a deep network to simplify 3D point clouds. The network, termed S-NET, takes a point cloud and produces a smaller point cloud that is optimized for a particular task. The simplified point cloud is not guaranteed to be a subset of the original point cloud. Therefore, we match it to a subset of the original points in a post-processing step. We contrast our approach with FPS by experimenting on two standard data sets and show significantly better results for a variety of applications. Our code is publicly available at: https://github.com/orendv/learning_to_sample
Tasks
Published	2018-12-04
URL	http://arxiv.org/abs/1812.01659v2
PDF	http://arxiv.org/pdf/1812.01659v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-sample
Repo	https://github.com/orendv/learning_to_sample
Framework	tf
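
For context, the Farthest Point Sampling baseline that the abstract contrasts with S-NET can be sketched in a few lines. This is a generic greedy FPS under the assumption of Euclidean distance and a caller-chosen starting index; it is not the authors' code, and the function name and `start` parameter are illustrative.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices, each farthest from the points chosen so far."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return selected
```

Nothing in this procedure looks at the downstream task, which is the limitation the abstract points to: the sampled subset covers the shape evenly whether or not even coverage helps classification or retrieval.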

Quantized Densely Connected U-Nets for Efficient Landmark Localization


Title	Quantized Densely Connected U-Nets for Efficient Landmark Localization
Authors	Zhiqiang Tang, Xi Peng, Shijie Geng, Lingfei Wu, Shaoting Zhang, Dimitris Metaxas
Abstract	In this paper, we propose quantized densely connected U-Nets for efficient visual landmark localization. The idea is that features of the same semantic meanings are globally reused across the stacked U-Nets. This dense connectivity largely improves the information flow, yielding improved localization accuracy. However, a vanilla dense design would suffer from critical efficiency issues in both training and testing. To solve this problem, we first propose order-K dense connectivity to trim off long-distance shortcuts; then, we use a memory-efficient implementation to significantly boost the training efficiency and investigate an iterative refinement that may slice the model size in half. Finally, to reduce the memory consumption and high precision operations both in training and testing, we further quantize weights, inputs, and gradients of our localization network to low bit-width numbers. We validate our approach in two tasks: human pose estimation and face alignment. The results show that our approach achieves state-of-the-art localization accuracy while using ~70% fewer parameters, ~98% less model size and ~75% less training memory compared with other benchmark localizers. The code is available at https://github.com/zhiqiangdon/CU-Net.
Tasks	Face Alignment, Pose Estimation
Published	2018-08-07
URL	http://arxiv.org/abs/1808.02194v2
PDF	http://arxiv.org/pdf/1808.02194v2.pdf
PWC	https://paperswithcode.com/paper/quantized-densely-connected-u-nets-for
Repo	https://github.com/zhiqiangdon/CU-Net
Framework	pytorch

SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning


Title	SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning
Authors	Wei Wen, Yandan Wang, Feng Yan, Cong Xu, Chunpeng Wu, Yiran Chen, Hai Li
Abstract	In Deep Learning, Stochastic Gradient Descent (SGD) is usually selected as a training method because of its efficiency; however, a problem in SGD has recently gained research interest: sharp minima in Deep Neural Networks (DNNs) have poor generalization; especially, large-batch SGD tends to converge to sharp minima. It becomes an open question whether escaping sharp minima can improve the generalization. To answer this question, we propose SmoothOut framework to smooth out sharp minima in DNNs and thereby improve generalization. In a nutshell, SmoothOut perturbs multiple copies of the DNN by noise injection and averages these copies. Injecting noises to SGD is widely used in the literature, but SmoothOut differs in lots of ways: (1) a de-noising process is applied before parameter updating; (2) noise strength is adapted to filter norm; (3) an alternative interpretation on the advantage of noise injection, from the perspective of sharpness and generalization; (4) usage of uniform noise instead of Gaussian noise. We prove that SmoothOut can eliminate sharp minima. Because training multiple DNN copies is inefficient, we further propose an unbiased stochastic SmoothOut which only introduces the overhead of noise injecting and de-noising per batch. An adaptive variant of SmoothOut, AdaSmoothOut, is also proposed to improve generalization. In a variety of experiments, SmoothOut and AdaSmoothOut consistently improve generalization in both small-batch and large-batch training on the top of state-of-the-art solutions.
Tasks
Published	2018-05-21
URL	http://arxiv.org/abs/1805.07898v3
PDF	http://arxiv.org/pdf/1805.07898v3.pdf
PWC	https://paperswithcode.com/paper/smoothout-smoothing-out-sharp-minima-to
Repo	https://github.com/wenwei202/smoothout
Framework	pytorch
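
The per-batch procedure described in the abstract above (inject uniform noise, compute the gradient, de-noise, then update) might look roughly like the following. This is a toy sketch on a plain list of parameters, assuming a fixed noise radius and a caller-supplied gradient function; `smoothout_step`, `grad_fn`, and `radius` are hypothetical names, and the filter-norm-adapted noise strength mentioned in the abstract is not modeled here.

```python
import random


def smoothout_step(w, grad_fn, lr, radius, rng):
    """One stochastic step: gradient at a perturbed point, update at the clean one."""
    # Inject uniform noise in [-radius, radius] into every parameter.
    noise = [rng.uniform(-radius, radius) for _ in w]
    perturbed = [wi + ni for wi, ni in zip(w, noise)]
    # Gradient is evaluated at the perturbed parameters.
    grad = grad_fn(perturbed)
    # De-noise: apply the update to the original, unperturbed parameters.
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def quadratic_grad(w):
    """Gradient of f(w) = sum(w_i ** 2), used only as a toy objective."""
    return [2.0 * wi for wi in w]
```

With `radius=0` the step reduces to ordinary gradient descent. With a positive radius, averaging over many noisy gradient evaluations approximates descending a locally smoothed loss surface, which is the intuition the abstract gives for flattening sharp minima.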

Optimizing groups of colluding strong attackers in mobile urban communication networks with evolutionary algorithms


Title	Optimizing groups of colluding strong attackers in mobile urban communication networks with evolutionary algorithms
Authors	D. Bucur, G. Iacca, M. Gaudesi, G. Squillero, A. Tonda
Abstract	In novel forms of the Social Internet of Things, any mobile user within communication range may help routing messages for another user in the network. The resulting message delivery rate depends both on the users’ mobility patterns and the message load in the network. This new type of configuration, however, poses new challenges to security. Among them, assessing the effect that a group of colluding malicious participants can have on the global message delivery rate in such a network is far from trivial. In this work, after modeling such a question as an optimization problem, we are able to find quite interesting results by coupling a network simulator with an evolutionary algorithm. The chosen algorithm is specifically designed to solve problems whose solutions can be decomposed into parts sharing the same structure. We demonstrate the effectiveness of the proposed approach on two medium-sized Delay-Tolerant Networks, realistically simulated in the urban contexts of two cities with very different route topology: Venice and San Francisco. In all experiments, our methodology produces attack patterns that greatly lower network performance with respect to previous studies on the subject, as the evolutionary core is able to exploit the specific weaknesses of each target configuration.
Tasks
Published	2018-10-05
URL	http://arxiv.org/abs/1810.02713v1
PDF	http://arxiv.org/pdf/1810.02713v1.pdf
PWC	https://paperswithcode.com/paper/optimizing-groups-of-colluding-strong
Repo	https://github.com/doinab/DTN-security
Framework	none

Adversarial Audio Synthesis


Title	Adversarial Audio Synthesis
Authors	Chris Donahue, Julian McAuley, Miller Puckette
Abstract	Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales. Generative adversarial networks (GANs) have seen wide success at generating images that are both locally and globally coherent, but they have seen little application to audio generation. In this paper we introduce WaveGAN, a first attempt at applying GANs to unsupervised synthesis of raw-waveform audio. WaveGAN is capable of synthesizing one second slices of audio waveforms with global coherence, suitable for sound effect generation. Our experiments demonstrate that, without labels, WaveGAN learns to produce intelligible words when trained on a small-vocabulary speech dataset, and can also synthesize audio from other domains such as drums, bird vocalizations, and piano. We compare WaveGAN to a method which applies GANs designed for image generation on image-like audio feature representations, finding both approaches to be promising.
Tasks	Audio Generation, Image Generation
Published	2018-02-12
URL	http://arxiv.org/abs/1802.04208v3
PDF	http://arxiv.org/pdf/1802.04208v3.pdf
PWC	https://paperswithcode.com/paper/adversarial-audio-synthesis
Repo	https://github.com/MurreyCode/wavegan
Framework	tf

Dynamic Weights in Multi-Objective Deep Reinforcement Learning


Title	Dynamic Weights in Multi-Objective Deep Reinforcement Learning
Authors	Axel Abels, Diederik M. Roijers, Tom Lenaerts, Ann Nowé, Denis Steckelmacher
Abstract	Many real-world decision problems are characterized by multiple conflicting objectives which must be balanced based on their relative importance. In the dynamic weights setting the relative importance changes over time and specialized algorithms that deal with such change, such as a tabular Reinforcement Learning (RL) algorithm by Natarajan and Tadepalli (2005), are required. However, this earlier work is not feasible for RL settings that necessitate the use of function approximators. We generalize across weight changes and high-dimensional inputs by proposing a multi-objective Q-network whose outputs are conditioned on the relative importance of objectives and we introduce Diverse Experience Replay (DER) to counter the inherent non-stationarity of the Dynamic Weights setting. We perform an extensive experimental evaluation and compare our methods to adapted algorithms from Deep Multi-Task/Multi-Objective Reinforcement Learning and show that our proposed network in combination with DER dominates these adapted algorithms across weight change scenarios and problem domains.
Tasks
Published	2018-09-20
URL	https://arxiv.org/abs/1809.07803v2
PDF	https://arxiv.org/pdf/1809.07803v2.pdf
PWC	https://paperswithcode.com/paper/dynamic-weights-in-multi-objective-deep
Repo	https://github.com/axelabels/DynMORL
Framework	none
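
One way to read "balanced based on their relative importance" is as a weighted sum over per-objective value estimates. The sketch below assumes that linear scalarization and picks the greedy action for a given weight vector; the abstract does not spell out the scalarization, so treat this as an assumption, and note that `greedy_action` and the hard-coded value table stand in for the output of a learned multi-objective Q-network.

```python
def greedy_action(q_values, weights):
    """Pick the action whose weighted sum of per-objective values is largest.

    q_values: list indexed by action, each entry a list of per-objective values.
    weights:  relative importance of each objective (same length as an entry).
    """
    def scalarize(q):
        return sum(w * v for w, v in zip(weights, q))

    return max(range(len(q_values)), key=lambda a: scalarize(q_values[a]))
```

The same value table yields different greedy actions as the weights shift, which is why the abstract stresses that a policy must generalize across weight changes rather than be retrained for each one.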

A Dual Approach to Scalable Verification of Deep Networks


Title	A Dual Approach to Scalable Verification of Deep Networks
Authors	Krishnamurthy Dvijotham, Robert Stanforth, Sven Gowal, Timothy Mann, Pushmeet Kohli
Abstract	This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified. Our approach is anytime, i.e., it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.
Tasks
Published	2018-03-17
URL	http://arxiv.org/abs/1803.06567v2
PDF	http://arxiv.org/pdf/1803.06567v2.pdf
PWC	https://paperswithcode.com/paper/a-dual-approach-to-scalable-verification-of
Repo	https://github.com/deepmind/deep-verify
Framework	tf

A Stable and Effective Learning Strategy for Trainable Greedy Decoding


Title	A Stable and Effective Learning Strategy for Trainable Greedy Decoding
Authors	Yun Chen, Victor O. K. Li, Kyunghyun Cho, Samuel R. Bowman
Abstract	Beam search is a widely used approximate search strategy for neural network decoders, and it generally outperforms simple greedy decoding on tasks like machine translation. However, this improvement comes at substantial computational cost. In this paper, we propose a flexible new method that allows us to reap nearly the full benefits of beam search with nearly no additional computational cost. The method revolves around a small neural network actor that is trained to observe and manipulate the hidden state of a previously-trained decoder. To train this actor network, we introduce the use of a pseudo-parallel corpus built using the output of beam search on a base model, ranked by a target quality metric like BLEU. Our method is inspired by earlier work on this problem, but requires no reinforcement learning, and can be trained reliably on a range of models. Experiments on three parallel corpora and three architectures show that the method yields substantial improvements in translation quality and speed over each base system.
Tasks	Machine Translation
Published	2018-04-21
URL	http://arxiv.org/abs/1804.07915v2
PDF	http://arxiv.org/pdf/1804.07915v2.pdf
PWC	https://paperswithcode.com/paper/a-stable-and-effective-learning-strategy-for
Repo	https://github.com/vadimkantorov/ctc
Framework	pytorch

Sparse-Group Bayesian Feature Selection Using Expectation Propagation for Signal Recovery and Network Reconstruction


Title	Sparse-Group Bayesian Feature Selection Using Expectation Propagation for Signal Recovery and Network Reconstruction
Authors	Edgar Steiger, Martin Vingron
Abstract	We present a Bayesian method for feature selection in the presence of grouping information with sparsity on the between- and within group level. Instead of using a stochastic algorithm for parameter inference, we employ expectation propagation, which is a deterministic and fast algorithm. Available methods for feature selection in the presence of grouping information have a number of short-comings: on one hand, lasso methods, while being fast, underestimate the regression coefficients and do not make good use of the grouping information, and on the other hand, Bayesian approaches, while accurate in parameter estimation, often rely on the stochastic and slow Gibbs sampling procedure to recover the parameters, rendering them infeasible e.g. for gene network reconstruction. Our approach of a Bayesian sparse-group framework with expectation propagation enables us to not only recover accurate parameter estimates in signal recovery problems, but also makes it possible to apply this Bayesian framework to large-scale network reconstruction problems. The presented method is generic but in terms of application we focus on gene regulatory networks. We show on simulated and experimental data that the method constitutes a good choice for network reconstruction regarding the number of correctly selected features, prediction on new data and reasonable computing time.
Tasks	Feature Selection
Published	2018-09-25
URL	http://arxiv.org/abs/1809.09367v1
PDF	http://arxiv.org/pdf/1809.09367v1.pdf
PWC	https://paperswithcode.com/paper/sparse-group-bayesian-feature-selection-using
Repo	https://github.com/edgarst/dogss
Framework	none