Paper Group AWR 69
Bleaching Text: Abstract Features for Cross-lingual Gender Prediction
Title | Bleaching Text: Abstract Features for Cross-lingual Gender Prediction |
Authors | Rob van der Goot, Nikola Ljubešić, Ian Matroos, Malvina Nissim, Barbara Plank |
Abstract | Gender prediction has typically focused on lexical and social network features, yielding good performance, but making systems highly language-, topic-, and platform-dependent. Cross-lingual embeddings circumvent some of these limitations, but capture gender-specific style less. We propose an alternative: bleaching text, i.e., transforming lexical strings into more abstract features. This study provides evidence that such features allow for better transfer across languages. Moreover, we present a first study on the ability of humans to perform cross-lingual gender prediction. We find that human predictive power proves similar to that of our bleached models, and both perform better than lexical models. |
Tasks | Gender Prediction |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.03122v1, http://arxiv.org/pdf/1805.03122v1.pdf |
PWC | https://paperswithcode.com/paper/bleaching-text-abstract-features-for-cross |
Repo | https://github.com/bplank/bleaching-text |
Framework | none |
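To make the idea of bleaching concrete, here is a minimal sketch that replaces each token with a few abstract features instead of its lexical form. The function names (`shape`, `vowel_consonant`, `bleach`), the character codes, and the ASCII vowel set are illustrative assumptions, not the authors' exact feature set; the linked repository holds the original implementation.

```python
def shape(token):
    """Map each character to a coarse class: U(pper), L(ower), D(igit), X (other)."""
    classes = []
    for ch in token:
        if ch.isupper():
            classes.append("U")
        elif ch.islower():
            classes.append("L")
        elif ch.isdigit():
            classes.append("D")
        else:
            classes.append("X")
    return "".join(classes)


def vowel_consonant(token, vowels="aeiou"):
    """Map each character to V(owel), C(onsonant), or O(ther).

    Assumption: only ASCII vowels count as vowels; accented vowels
    are treated as consonants in this simplified sketch.
    """
    classes = []
    for ch in token.lower():
        if ch in vowels:
            classes.append("V")
        elif ch.isalpha():
            classes.append("C")
        else:
            classes.append("O")
    return "".join(classes)


def bleach(text):
    """Replace every whitespace-separated token with abstract features only."""
    return [
        {
            "shape": shape(token),
            "vowel_consonant": vowel_consonant(token),
            "length": len(token),
        }
        for token in text.split()
    ]
```

Because none of these features contains the word itself, the same feature extractor can be applied unchanged to text in another language, which is the transfer property the abstract describes.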
Interoceptive robustness through environment-mediated morphological development
Title | Interoceptive robustness through environment-mediated morphological development |
Authors | Sam Kriegman, Nick Cheney, Francesco Corucci, Josh C. Bongard |
Abstract | Typically, AI researchers and roboticists try to realize intelligent behavior in machines by tuning parameters of a predefined structure (body plan and/or neural network architecture) using evolutionary or learning algorithms. Another but not unrelated longstanding property of these systems is their brittleness to slight aberrations, as highlighted by the growing deep learning literature on adversarial examples. Here we show robustness can be achieved by evolving the geometry of soft robots, their control systems, and how their material properties develop in response to one particular interoceptive stimulus (engineering stress) during their lifetimes. By doing so we realized robots that were equally fit but more robust to extreme material defects (such as might occur during fabrication or by damage thereafter) than robots that did not develop during their lifetimes, or developed in response to a different interoceptive stimulus (pressure). This suggests that the interplay between changes in the containing systems of agents (body plan and/or neural architecture) at different temporal scales (evolutionary and developmental) along different modalities (geometry, material properties, synaptic weights) and in response to different signals (interoceptive and external perception) all dictate those agents’ abilities to evolve or learn capable and robust strategies. |
Tasks | |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02257v2, http://arxiv.org/pdf/1804.02257v2.pdf |
PWC | https://paperswithcode.com/paper/interoceptive-robustness-through-environment |
Repo | https://github.com/skriegman/2018-gecco |
Framework | none |
Deep learning with differential Gaussian process flows
Title | Deep learning with differential Gaussian process flows |
Authors | Pashupati Hegde, Markus Heinonen, Harri Lähdesmäki, Samuel Kaski |
Abstract | We propose a novel deep learning paradigm of differential flows that learn stochastic differential equation transformations of inputs prior to a standard classification or regression function. The key property of differential Gaussian processes is the warping of inputs through infinitely deep, but infinitesimal, differential fields that generalise discrete layers into a dynamical system. We demonstrate state-of-the-art results that exceed the performance of deep Gaussian processes and neural networks. |
Tasks | Gaussian Processes |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.04066v2, http://arxiv.org/pdf/1810.04066v2.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-with-differential-gaussian |
Repo | https://github.com/hegdepashupati/differential-dgp |
Framework | tf |
CSI-Net: Unified Human Body Characterization and Pose Recognition
Title | CSI-Net: Unified Human Body Characterization and Pose Recognition |
Authors | Fei Wang, Jinsong Han, Shiyuan Zhang, Xu He, Dong Huang |
Abstract | We build CSI-Net, a unified Deep Neural Network (DNN), to learn the representation of WiFi signals. Using CSI-Net, we jointly solved two body characterization problems: biometrics estimation (including body fat, muscle, water, and bone rates) and person recognition. We also demonstrated the application of CSI-Net on two distinctive pose recognition tasks: hand sign recognition (fine-scaled action of the hand) and falling detection (coarse-scaled motion of the body). |
Tasks | Person Recognition, Temporal Action Localization |
Published | 2018-10-07 |
URL | http://arxiv.org/abs/1810.03064v2, http://arxiv.org/pdf/1810.03064v2.pdf |
PWC | https://paperswithcode.com/paper/csi-net-unified-human-body-characterization |
Repo | https://github.com/geekfeiw/CSI-Net |
Framework | pytorch |
Aequitas: A Bias and Fairness Audit Toolkit
Title | Aequitas: A Bias and Fairness Audit Toolkit |
Authors | Pedro Saleiro, Benedict Kuester, Loren Hinkson, Jesse London, Abby Stevens, Ari Anisfeld, Kit T. Rodolfa, Rayid Ghani |
Abstract | Recent work has raised concerns about the risk of unintended bias in AI systems being used nowadays that can affect individuals unfairly based on race, gender or religion, among other possible characteristics. While a lot of bias metrics and fairness definitions have been proposed in recent years, there is no consensus on which metric/definition should be used and there are very few available resources to operationalize them. Therefore, despite recent awareness, auditing for bias and fairness when developing and deploying AI systems is not yet a standard practice. We present Aequitas, an open source bias and fairness audit toolkit that is an intuitive and easy-to-use addition to the machine learning workflow, enabling users to seamlessly test models for several bias and fairness metrics in relation to multiple population sub-groups. Aequitas facilitates informed and equitable decisions around developing and deploying algorithmic decision making systems for data scientists, machine learning researchers, and policymakers. |
Tasks | Decision Making |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.05577v2, http://arxiv.org/pdf/1811.05577v2.pdf |
PWC | https://paperswithcode.com/paper/aequitas-a-bias-and-fairness-audit-toolkit |
Repo | https://github.com/dssg/aequitas |
Framework | none |
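As an illustration of what testing a model for a bias metric across population sub-groups can look like, the sketch below computes a false positive rate per group and its ratio to a reference group. It deliberately does not use the Aequitas API: the function names, the `(group, label, prediction)` record layout, and the choice of false positive rate as the metric are assumptions made for this example only.

```python
from collections import defaultdict


def false_positive_rates(records):
    """Compute the false positive rate per group.

    records: iterable of (group, label, prediction) with 0/1 labels and
    predictions (hypothetical layout). Groups without any negative
    example are omitted, because their rate is undefined.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, label, prediction in records:
        if label == 0:
            negatives[group] += 1
            if prediction == 1:
                false_positives[group] += 1
    return {g: false_positives[g] / negatives[g] for g in negatives}


def disparity(rates, reference):
    """Express each group's rate as a ratio to the reference group's rate."""
    reference_rate = rates[reference]
    if reference_rate == 0:
        raise ValueError("reference group rate is zero; ratio is undefined")
    return {g: rate / reference_rate for g, rate in rates.items()}


records = [
    ("a", 0, 1), ("a", 0, 0), ("a", 0, 0), ("a", 0, 0), ("a", 1, 1),
    ("b", 0, 1), ("b", 0, 1), ("b", 0, 0), ("b", 0, 0), ("b", 1, 0),
]
rates = false_positive_rates(records)
ratios = disparity(rates, reference="a")
```

A ratio far from 1 for some group flags a disparity worth investigating; which metric and which threshold are appropriate depends on the application, which is the lack of consensus the abstract points to.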
Creative Invention Benchmark
Title | Creative Invention Benchmark |
Authors | Matthew Guzdial, Nicholas Liao, Vishwa Shah, Mark O. Riedl |
Abstract | In this paper we present the Creative Invention Benchmark (CrIB), a 2000-problem benchmark for evaluating a particular facet of computational creativity. Specifically, we address combinational p-creativity, the creativity at play when someone combines existing knowledge to achieve a solution novel to that individual. We present generation strategies for the five problem categories of the benchmark and a set of initial baselines. |
Tasks | |
Published | 2018-05-09 |
URL | http://arxiv.org/abs/1805.03720v1, http://arxiv.org/pdf/1805.03720v1.pdf |
PWC | https://paperswithcode.com/paper/creative-invention-benchmark |
Repo | https://github.com/mguzdial3/CrIB |
Framework | none |
Learning to Sample
Title | Learning to Sample |
Authors | Oren Dovrat, Itai Lang, Shai Avidan |
Abstract | Processing large point clouds is a challenging task. Therefore, the data is often sampled to a size that can be processed more easily. The question is how to sample the data? A popular sampling technique is Farthest Point Sampling (FPS). However, FPS is agnostic to a downstream application (classification, retrieval, etc.). The underlying assumption seems to be that minimizing the farthest point distance, as done by FPS, is a good proxy to other objective functions. We show that it is better to learn how to sample. To do that, we propose a deep network to simplify 3D point clouds. The network, termed S-NET, takes a point cloud and produces a smaller point cloud that is optimized for a particular task. The simplified point cloud is not guaranteed to be a subset of the original point cloud. Therefore, we match it to a subset of the original points in a post-processing step. We contrast our approach with FPS by experimenting on two standard data sets and show significantly better results for a variety of applications. Our code is publicly available at: https://github.com/orendv/learning_to_sample |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01659v2, http://arxiv.org/pdf/1812.01659v2.pdf |
PWC | https://paperswithcode.com/paper/learning-to-sample |
Repo | https://github.com/orendv/learning_to_sample |
Framework | tf |
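For reference, the Farthest Point Sampling baseline mentioned in the abstract can be sketched in a few lines. This is a plain-Python illustration of the standard greedy procedure, not code from the paper or its repository; the function name, the `start` parameter, and the toy points are assumptions for the example.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k point indices, each farthest from those already picked.

    points: sequence of equal-length coordinate tuples.
    start: index of the first selected point (an arbitrary choice).
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    if not 0 <= start < len(points):
        raise ValueError("start must be a valid index into points")

    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start]) for p in points]

    while len(selected) < k:
        # The next sample is the point farthest from the current selection.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
sample = farthest_point_sampling(points, k=3)
```

The selection depends only on geometry, never on a downstream loss, which is the task-agnostic behaviour the paper argues a learned sampler can improve on.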
Quantized Densely Connected U-Nets for Efficient Landmark Localization
Title | Quantized Densely Connected U-Nets for Efficient Landmark Localization |
Authors | Zhiqiang Tang, Xi Peng, Shijie Geng, Lingfei Wu, Shaoting Zhang, Dimitris Metaxas |
Abstract | In this paper, we propose quantized densely connected U-Nets for efficient visual landmark localization. The idea is that features of the same semantic meanings are globally reused across the stacked U-Nets. This dense connectivity largely improves the information flow, yielding improved localization accuracy. However, a vanilla dense design would suffer from critical efficiency issues in both training and testing. To solve this problem, we first propose order-K dense connectivity to trim off long-distance shortcuts; then, we use a memory-efficient implementation to significantly boost the training efficiency and investigate an iterative refinement that may slice the model size in half. Finally, to reduce the memory consumption and high precision operations both in training and testing, we further quantize weights, inputs, and gradients of our localization network to low bit-width numbers. We validate our approach in two tasks: human pose estimation and face alignment. The results show that our approach achieves state-of-the-art localization accuracy while using ~70% fewer parameters, ~98% less model size and saving ~75% training memory compared with other benchmark localizers. The code is available at https://github.com/zhiqiangdon/CU-Net. |
Tasks | Face Alignment, Pose Estimation |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02194v2, http://arxiv.org/pdf/1808.02194v2.pdf |
PWC | https://paperswithcode.com/paper/quantized-densely-connected-u-nets-for |
Repo | https://github.com/zhiqiangdon/CU-Net |
Framework | pytorch |
SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning
Title | SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning |
Authors | Wei Wen, Yandan Wang, Feng Yan, Cong Xu, Chunpeng Wu, Yiran Chen, Hai Li |
Abstract | In Deep Learning, Stochastic Gradient Descent (SGD) is usually selected as a training method because of its efficiency; however, a problem with SGD has recently gained research interest: sharp minima in Deep Neural Networks (DNNs) generalize poorly, and large-batch SGD in particular tends to converge to sharp minima. It remains an open question whether escaping sharp minima can improve generalization. To answer this question, we propose the SmoothOut framework to smooth out sharp minima in DNNs and thereby improve generalization. In a nutshell, SmoothOut perturbs multiple copies of the DNN by noise injection and averages these copies. Injecting noise into SGD is widely used in the literature, but SmoothOut differs in several ways: (1) a de-noising process is applied before parameter updating; (2) noise strength is adapted to filter norm; (3) an alternative interpretation on the advantage of noise injection, from the perspective of sharpness and generalization; (4) usage of uniform noise instead of Gaussian noise. We prove that SmoothOut can eliminate sharp minima. Because training multiple DNN copies is inefficient, we further propose an unbiased stochastic SmoothOut which only introduces the overhead of noise injecting and de-noising per batch. An adaptive variant of SmoothOut, AdaSmoothOut, is also proposed to improve generalization. In a variety of experiments, SmoothOut and AdaSmoothOut consistently improve generalization in both small-batch and large-batch training on top of state-of-the-art solutions. |
Tasks | |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07898v3, http://arxiv.org/pdf/1805.07898v3.pdf |
PWC | https://paperswithcode.com/paper/smoothout-smoothing-out-sharp-minima-to |
Repo | https://github.com/wenwei202/smoothout |
Framework | pytorch |
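The per-batch cycle of noise injection, gradient evaluation, and de-noising described in the abstract can be sketched on a toy problem. This is one reading of the abstract under stated assumptions, not the authors' implementation: the names `smoothout_step` and `quadratic_grad`, the fixed noise radius, and the quadratic toy loss are all hypothetical, and the adaptation of noise strength to filter norm is omitted.

```python
import random


def smoothout_step(weights, grad_fn, lr, noise_radius, rng):
    """One noise-inject / gradient / de-noise update on a list of floats."""
    # 1. Inject uniform noise drawn from [-noise_radius, noise_radius].
    noise = [rng.uniform(-noise_radius, noise_radius) for _ in weights]
    perturbed = [w + n for w, n in zip(weights, noise)]
    # 2. Evaluate the gradient at the perturbed weights.
    grads = grad_fn(perturbed)
    # 3. De-noise: apply the update to the original, unperturbed weights.
    return [w - lr * g for w, g in zip(weights, grads)]


def quadratic_grad(weights):
    """Gradient of the toy loss f(w) = sum(w_i ** 2)."""
    return [2.0 * w for w in weights]


rng = random.Random(0)
w = [5.0]
for _ in range(200):
    w = smoothout_step(w, quadratic_grad, lr=0.1, noise_radius=0.1, rng=rng)
```

With `noise_radius=0.0` the step reduces to plain gradient descent, which makes the extra cost of the scheme easy to see: one noise draw and one subtraction per parameter per batch.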
Optimizing groups of colluding strong attackers in mobile urban communication networks with evolutionary algorithms
Title | Optimizing groups of colluding strong attackers in mobile urban communication networks with evolutionary algorithms |
Authors | D. Bucur, G. Iacca, M. Gaudesi, G. Squillero, A. Tonda |
Abstract | In novel forms of the Social Internet of Things, any mobile user within communication range may help route messages for another user in the network. The resulting message delivery rate depends both on the users’ mobility patterns and the message load in the network. This new type of configuration, however, poses new challenges to security; among them, assessing the effect that a group of colluding malicious participants can have on the global message delivery rate in such a network is far from trivial. In this work, after modeling such a question as an optimization problem, we are able to find quite interesting results by coupling a network simulator with an evolutionary algorithm. The chosen algorithm is specifically designed to solve problems whose solutions can be decomposed into parts sharing the same structure. We demonstrate the effectiveness of the proposed approach on two medium-sized Delay-Tolerant Networks, realistically simulated in the urban contexts of two cities with very different route topology: Venice and San Francisco. In all experiments, our methodology produces attack patterns that greatly lower network performance with respect to previous studies on the subject, as the evolutionary core is able to exploit the specific weaknesses of each target configuration. |
Tasks | |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.02713v1, http://arxiv.org/pdf/1810.02713v1.pdf |
PWC | https://paperswithcode.com/paper/optimizing-groups-of-colluding-strong |
Repo | https://github.com/doinab/DTN-security |
Framework | none |
Adversarial Audio Synthesis
Title | Adversarial Audio Synthesis |
Authors | Chris Donahue, Julian McAuley, Miller Puckette |
Abstract | Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales. Generative adversarial networks (GANs) have seen wide success at generating images that are both locally and globally coherent, but they have seen little application to audio generation. In this paper we introduce WaveGAN, a first attempt at applying GANs to unsupervised synthesis of raw-waveform audio. WaveGAN is capable of synthesizing one second slices of audio waveforms with global coherence, suitable for sound effect generation. Our experiments demonstrate that, without labels, WaveGAN learns to produce intelligible words when trained on a small-vocabulary speech dataset, and can also synthesize audio from other domains such as drums, bird vocalizations, and piano. We compare WaveGAN to a method which applies GANs designed for image generation on image-like audio feature representations, finding both approaches to be promising. |
Tasks | Audio Generation, Image Generation |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.04208v3, http://arxiv.org/pdf/1802.04208v3.pdf |
PWC | https://paperswithcode.com/paper/adversarial-audio-synthesis |
Repo | https://github.com/MurreyCode/wavegan |
Framework | tf |
Dynamic Weights in Multi-Objective Deep Reinforcement Learning
Title | Dynamic Weights in Multi-Objective Deep Reinforcement Learning |
Authors | Axel Abels, Diederik M. Roijers, Tom Lenaerts, Ann Nowé, Denis Steckelmacher |
Abstract | Many real-world decision problems are characterized by multiple conflicting objectives which must be balanced based on their relative importance. In the dynamic weights setting the relative importance changes over time and specialized algorithms that deal with such change, such as a tabular Reinforcement Learning (RL) algorithm by Natarajan and Tadepalli (2005), are required. However, this earlier work is not feasible for RL settings that necessitate the use of function approximators. We generalize across weight changes and high-dimensional inputs by proposing a multi-objective Q-network whose outputs are conditioned on the relative importance of objectives and we introduce Diverse Experience Replay (DER) to counter the inherent non-stationarity of the Dynamic Weights setting. We perform an extensive experimental evaluation and compare our methods to adapted algorithms from Deep Multi-Task/Multi-Objective Reinforcement Learning and show that our proposed network in combination with DER dominates these adapted algorithms across weight change scenarios and problem domains. |
Tasks | |
Published | 2018-09-20 |
URL | https://arxiv.org/abs/1809.07803v2, https://arxiv.org/pdf/1809.07803v2.pdf |
PWC | https://paperswithcode.com/paper/dynamic-weights-in-multi-objective-deep |
Repo | https://github.com/axelabels/DynMORL |
Framework | none |
A Dual Approach to Scalable Verification of Deep Networks
Title | A Dual Approach to Scalable Verification of Deep Networks |
Authors | Krishnamurthy Dvijotham, Robert Stanforth, Sven Gowal, Timothy Mann, Pushmeet Kohli |
Abstract | This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified. Our approach is anytime, i.e., it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks. |
Tasks | |
Published | 2018-03-17 |
URL | http://arxiv.org/abs/1803.06567v2, http://arxiv.org/pdf/1803.06567v2.pdf |
PWC | https://paperswithcode.com/paper/a-dual-approach-to-scalable-verification-of |
Repo | https://github.com/deepmind/deep-verify |
Framework | tf |
A Stable and Effective Learning Strategy for Trainable Greedy Decoding
Title | A Stable and Effective Learning Strategy for Trainable Greedy Decoding |
Authors | Yun Chen, Victor O. K. Li, Kyunghyun Cho, Samuel R. Bowman |
Abstract | Beam search is a widely used approximate search strategy for neural network decoders, and it generally outperforms simple greedy decoding on tasks like machine translation. However, this improvement comes at substantial computational cost. In this paper, we propose a flexible new method that allows us to reap nearly the full benefits of beam search with nearly no additional computational cost. The method revolves around a small neural network actor that is trained to observe and manipulate the hidden state of a previously-trained decoder. To train this actor network, we introduce the use of a pseudo-parallel corpus built using the output of beam search on a base model, ranked by a target quality metric like BLEU. Our method is inspired by earlier work on this problem, but requires no reinforcement learning, and can be trained reliably on a range of models. Experiments on three parallel corpora and three architectures show that the method yields substantial improvements in translation quality and speed over each base system. |
Tasks | Machine Translation |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07915v2, http://arxiv.org/pdf/1804.07915v2.pdf |
PWC | https://paperswithcode.com/paper/a-stable-and-effective-learning-strategy-for |
Repo | https://github.com/vadimkantorov/ctc |
Framework | pytorch |
Sparse-Group Bayesian Feature Selection Using Expectation Propagation for Signal Recovery and Network Reconstruction
Title | Sparse-Group Bayesian Feature Selection Using Expectation Propagation for Signal Recovery and Network Reconstruction |
Authors | Edgar Steiger, Martin Vingron |
Abstract | We present a Bayesian method for feature selection in the presence of grouping information with sparsity on the between- and within-group levels. Instead of using a stochastic algorithm for parameter inference, we employ expectation propagation, which is a deterministic and fast algorithm. Available methods for feature selection in the presence of grouping information have a number of shortcomings: on one hand, lasso methods, while being fast, underestimate the regression coefficients and do not make good use of the grouping information, and on the other hand, Bayesian approaches, while accurate in parameter estimation, often rely on the stochastic and slow Gibbs sampling procedure to recover the parameters, rendering them infeasible, e.g., for gene network reconstruction. Our approach of a Bayesian sparse-group framework with expectation propagation enables us to not only recover accurate parameter estimates in signal recovery problems, but also makes it possible to apply this Bayesian framework to large-scale network reconstruction problems. The presented method is generic but in terms of application we focus on gene regulatory networks. We show on simulated and experimental data that the method constitutes a good choice for network reconstruction regarding the number of correctly selected features, prediction on new data and reasonable computing time. |
Tasks | Feature Selection |
Published | 2018-09-25 |
URL | http://arxiv.org/abs/1809.09367v1, http://arxiv.org/pdf/1809.09367v1.pdf |
PWC | https://paperswithcode.com/paper/sparse-group-bayesian-feature-selection-using |
Repo | https://github.com/edgarst/dogss |
Framework | none |