April 2, 2020

3159 words 15 mins read

Paper Group ANR 258

Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors. A Differential-form Pullback Programming Language for Higher-order Reverse-mode Automatic Differentiation. sKPNSGA-II: Knee point based MOEA with self-adaptive angle for Mission Planning Problems. On the Discrepancy between Density Estimation …

Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors

Title Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors
Authors Zhaoqiang Liu, Selwyn Gomes, Avtansh Tiwari, Jonathan Scarlett
Abstract The goal of standard 1-bit compressive sensing is to accurately recover an unknown sparse vector from binary-valued measurements, each indicating the sign of a linear function of the vector. Motivated by recent advances in compressive sensing with generative models, where a generative modeling assumption replaces the usual sparsity assumption, we study the problem of 1-bit compressive sensing with generative models. We first consider noiseless 1-bit measurements, and provide sample complexity bounds for approximate recovery under i.i.d. Gaussian measurements and a Lipschitz continuous generative prior, as well as a near-matching algorithm-independent lower bound. Moreover, we demonstrate that the Binary $\epsilon$-Stable Embedding property, which characterizes the robustness of the reconstruction to measurement errors and noise, also holds for 1-bit compressive sensing with Lipschitz continuous generative models with sufficiently many Gaussian measurements. In addition, we apply our results to neural network generative models, and provide a proof-of-concept numerical experiment demonstrating significant improvements over sparsity-based approaches.
Tasks Compressive Sensing
Published 2020-02-05
URL https://arxiv.org/abs/2002.01697v2
PDF https://arxiv.org/pdf/2002.01697v2.pdf
PWC https://paperswithcode.com/paper/sample-complexity-bounds-for-1-bit
Repo
Framework
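The measurement model here is simple enough to sketch: each observation is the sign of one Gaussian linear functional of G(z). Below is a minimal NumPy sketch, assuming a toy two-layer ReLU network as the Lipschitz generative model and naive random search over the latent space; the search is a stand-in for illustration, not the paper's recovery algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def G(z, W1, W2):
    """A toy Lipschitz generative model: a two-layer ReLU network."""
    return W2 @ np.maximum(W1 @ z, 0.0)

k, d, m = 10, 100, 400            # latent dim, ambient dim, number of measurements
W1 = rng.normal(size=(50, k)) / np.sqrt(k)
W2 = rng.normal(size=(d, 50)) / np.sqrt(50)

z_star = rng.normal(size=k)       # unknown latent code
x_star = G(z_star, W1, W2)        # signal lying on the range of G

A = rng.normal(size=(m, d))       # i.i.d. Gaussian measurement matrix
y = np.sign(A @ x_star)           # noiseless 1-bit measurements

# Naive recovery by random latent search: keep the z whose 1-bit measurements
# agree most with y. Since sign measurements carry no magnitude information,
# recovery quality is judged by direction (cosine similarity).
best_z, best_agree = None, -1
for _ in range(2000):
    z = rng.normal(size=k)
    agree = np.sum(np.sign(A @ G(z, W1, W2)) == y)
    if agree > best_agree:
        best_z, best_agree = z, agree

x_hat = G(best_z, W1, W2)
cos = x_hat @ x_star / (np.linalg.norm(x_hat) * np.linalg.norm(x_star))
print(f"agreement {best_agree}/{m}, cosine similarity {cos:.3f}")
```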

A Differential-form Pullback Programming Language for Higher-order Reverse-mode Automatic Differentiation

Title A Differential-form Pullback Programming Language for Higher-order Reverse-mode Automatic Differentiation
Authors Carol Mak, Luke Ong
Abstract Building on the observation that reverse-mode automatic differentiation (AD) – a generalisation of backpropagation – can naturally be expressed as pullbacks of differential 1-forms, we design a simple higher-order programming language with a first-class differential operator, and present a reduction strategy which exactly simulates reverse-mode AD. We justify our reduction strategy by interpreting our language in any differential $\lambda$-category that satisfies the Hahn-Banach Separation Theorem, and show that the reduction strategy precisely captures reverse-mode AD in a truly higher-order setting.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08241v1
PDF https://arxiv.org/pdf/2002.08241v1.pdf
PWC https://paperswithcode.com/paper/a-differential-form-pullback-programming
Repo
Framework
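The identification the abstract builds on, that reverse-mode AD computes pullbacks of differential 1-forms, can be sketched concretely: each primitive carries a pullback mapping an output covector to an input covector via the transposed Jacobian, and composing programs composes the pullbacks in reverse. A minimal Python sketch under that reading (this is not the paper's language or reduction strategy):

```python
import numpy as np

# Each primitive returns its value together with a pullback f^* that maps an
# output covector (a differential 1-form) to an input covector.

def lin(W):
    def f(x):
        return W @ x, lambda ybar: W.T @ ybar         # pullback of a linear map
    return f

def sin_prim(x):
    return np.sin(x), lambda ybar: np.cos(x) * ybar   # elementwise chain rule

def sum_prim(x):
    return np.sum(x), lambda ybar: ybar * np.ones_like(x)

def compose(*prims):
    """Run the primitives forward, collecting pullbacks; the gradient is the
    pullback of the output 1-form (the scalar covector 1.0) through the chain,
    applied in reverse order."""
    def f(x):
        pullbacks = []
        for p in prims:
            x, pb = p(x)
            pullbacks.append(pb)
        def grad(ybar=1.0):
            for pb in reversed(pullbacks):
                ybar = pb(ybar)
            return ybar
        return x, grad
    return f

W = np.arange(6.0).reshape(2, 3)
f = compose(lin(W), sin_prim, sum_prim)
x = np.array([0.1, 0.2, 0.3])
y, grad = f(x)
print(y, grad())    # gradient of sum(sin(W @ x)) with respect to x
```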

sKPNSGA-II: Knee point based MOEA with self-adaptive angle for Mission Planning Problems

Title sKPNSGA-II: Knee point based MOEA with self-adaptive angle for Mission Planning Problems
Authors Cristian Ramirez-Atencia, Sanaz Mostaghim, David Camacho
Abstract Real-world, complex problems usually have many objective functions that must be optimized at once. Over the last decades, Multi-Objective Evolutionary Algorithms (MOEAs) have been designed to solve this kind of problem. Nevertheless, some problems have many objectives, which leads to a large number of non-dominated solutions obtained by the optimization algorithms. This large set of non-dominated solutions hinders the selection of the most appropriate solution by the decision maker. This paper presents a new algorithm designed to obtain the most significant solutions from the Pareto Optimal Frontier (POF). The approach is based on cone-domination applied to MOEAs, which can find the knee-point solutions. In order to obtain the best cone angle, we propose a hypervolume-distribution metric, which is used to self-adapt the angle during the evolutionary process. The new algorithm has been applied to a real-world application, the Unmanned Air Vehicle (UAV) Mission Planning Problem. The experimental results show a significant improvement in algorithm performance in terms of hypervolume, number of solutions, and the number of generations required to converge.
Tasks
Published 2020-02-20
URL https://arxiv.org/abs/2002.08867v1
PDF https://arxiv.org/pdf/2002.08867v1.pdf
PWC https://paperswithcode.com/paper/skpnsga-ii-knee-point-based-moea-with-self
Repo
Framework
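Cone-domination, the mechanism the algorithm adapts, can be sketched for two objectives. One standard formulation (assumed here, not quoted from this paper) checks ordinary Pareto dominance on linearly transformed objectives, which widens the dominated region and filters the front down to knee-like solutions:

```python
import numpy as np

def cone_dominates(f_a, f_b, angle_deg=120.0):
    """Cone-domination for a two-objective minimisation problem, in one
    standard formulation: check ordinary Pareto dominance on the transformed
    objectives f1' = f1 + a*f2, f2' = a*f1 + f2. The dominated cone has angle
    90 + 2*arctan(a) degrees; plain Pareto dominance is the a = 0 case."""
    a = np.tan(np.radians((angle_deg - 90.0) / 2.0))
    T = np.array([[1.0, a], [a, 1.0]])
    ga, gb = T @ np.asarray(f_a), T @ np.asarray(f_b)
    return bool(np.all(ga <= gb) and np.any(ga < gb))

# On a convex front, widening the cone prunes the extremes and keeps the knee.
front = [(0.0, 1.0), (0.05, 0.6), (0.2, 0.2), (0.6, 0.05), (1.0, 0.0)]
knees = [p for p in front
         if not any(cone_dominates(q, p) for q in front if q != p)]
print(knees)    # the extreme points (0,1) and (1,0) are cone-dominated
```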

On the Discrepancy between Density Estimation and Sequence Generation

Title On the Discrepancy between Density Estimation and Sequence Generation
Authors Jason Lee, Dustin Tran, Orhan Firat, Kyunghyun Cho
Abstract Many sequence-to-sequence generation tasks, including machine translation and text-to-speech, can be posed as estimating the density of the output y given the input x: p(y|x). Given this interpretation, it is natural to evaluate sequence-to-sequence models using conditional log-likelihood on a test set. However, the goal of sequence-to-sequence generation (or structured prediction) is to find the best output y^ given an input x, and each task has its own downstream metric R that scores a model output by comparing against a set of references y*: R(y^, y* | x). While we hope that a model that excels in density estimation also performs well on the downstream metric, the exact correlation has not been studied for sequence generation tasks. In this paper, by comparing several density estimators on five machine translation tasks, we find that the correlation between rankings of models based on log-likelihood and BLEU varies significantly depending on the range of the model families being compared. First, log-likelihood is highly correlated with BLEU when we consider models within the same family (e.g. autoregressive models, or latent variable models with the same parameterization of the prior). However, we observe no correlation between rankings of models across different families: (1) among non-autoregressive latent variable models, a flexible prior distribution is better at density estimation but gives worse generation quality than a simple prior, and (2) autoregressive models offer the best translation performance overall, while latent variable models with a normalizing flow prior give the highest held-out log-likelihood across all datasets. Therefore, we recommend using a simple prior for the latent variable non-autoregressive model when fast generation speed is desired.
Tasks Density Estimation, Latent Variable Models, Machine Translation, Structured Prediction
Published 2020-02-17
URL https://arxiv.org/abs/2002.07233v1
PDF https://arxiv.org/pdf/2002.07233v1.pdf
PWC https://paperswithcode.com/paper/on-the-discrepancy-between-density-estimation
Repo
Framework
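The abstract's central quantity is the rank correlation between two model orderings, one by held-out log-likelihood and one by BLEU. A minimal sketch with made-up scores (the numbers below are illustrative, not the paper's results):

```python
from scipy.stats import spearmanr

# Hypothetical held-out log-likelihoods and BLEU scores for models within one
# family versus across families, illustrating the rank-correlation comparison.
within_family = {             # e.g. autoregressive models of growing capacity
    "ar-small":  (-2.10, 24.1),
    "ar-base":   (-1.95, 26.3),
    "ar-big":    (-1.88, 27.5),
}
across_families = {           # e.g. AR vs latent-variable models
    "ar-big":    (-1.88, 27.5),
    "lv-simple": (-1.99, 25.9),
    "lv-flow":   (-1.70, 23.2),   # best likelihood, worst BLEU
}

for name, scores in [("within", within_family), ("across", across_families)]:
    ll, bleu = zip(*scores.values())
    rho, _ = spearmanr(ll, bleu)
    print(f"{name}-family comparison: Spearman rho = {rho:+.2f}")
```

Within the family the two rankings agree perfectly (rho = +1.00); across families the toy numbers give a negative correlation, mirroring the qualitative pattern the abstract reports.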

Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges

Title Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges
Authors Triet H. M. Le, Hao Chen, M. Ali Babar
Abstract Deep Learning (DL) techniques for Natural Language Processing have been evolving remarkably fast. Recently, the DL advances in language modeling, machine translation and paragraph understanding have become so prominent that the potential of DL in Software Engineering cannot be overlooked, especially in the field of program learning. To facilitate further research and applications of DL in this field, we provide a comprehensive review to categorize and investigate existing DL methods for source code modeling and generation. To address the limitations of traditional source code models, we formulate common program learning tasks under an encoder-decoder framework. We then introduce recent DL mechanisms suitable for solving such problems. Finally, we present state-of-the-art practices and discuss their challenges, with recommendations for practitioners and researchers.
Tasks Language Modelling, Machine Translation
Published 2020-02-13
URL https://arxiv.org/abs/2002.05442v1
PDF https://arxiv.org/pdf/2002.05442v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-source-code-modeling-and
Repo
Framework
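The encoder-decoder framing the survey uses for program learning tasks fits in a few lines. Here is a minimal GRU-based seq2seq sketch in PyTorch with hypothetical vocabulary sizes; real code models add attention, copy mechanisms, and structural (AST or graph) encoders on top of this skeleton.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder: encode source tokens (e.g. a natural-language
    description or buggy code) into a summary state, then decode target tokens
    (e.g. generated code) conditioned on it."""
    def __init__(self, src_vocab=5000, tgt_vocab=5000, dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src_tokens, tgt_tokens):
        _, h = self.encoder(self.src_emb(src_tokens))    # h: encoder summary
        dec_out, _ = self.decoder(self.tgt_emb(tgt_tokens), h)
        return self.out(dec_out)                          # next-token logits

model = Seq2Seq()
src = torch.randint(0, 5000, (2, 17))    # a batch of token-ID sequences
tgt = torch.randint(0, 5000, (2, 9))
logits = model(src, tgt)
print(logits.shape)                       # torch.Size([2, 9, 5000])
```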

Learning Coupled Policies for Simultaneous Machine Translation

Title Learning Coupled Policies for Simultaneous Machine Translation
Authors Philip Arthur, Trevor Cohn, Gholamreza Haffari
Abstract In simultaneous machine translation, the system needs to incrementally generate the output translation before the input sentence ends. This is a coupled decision process consisting of a programmer and an interpreter. The programmer’s policy decides when to WRITE the next output or READ the next input, and the interpreter’s policy decides which word to write. We present an imitation learning (IL) approach to efficiently learn effective coupled programmer-interpreter policies. To enable IL, we present an algorithmic oracle that produces oracle READ/WRITE actions for training bilingual sentence pairs using the notion of word alignments. We attribute the effectiveness of the learned coupled policies to (i) scheduled sampling addressing the coupled exposure bias, and (ii) the quality of the oracle actions, which capture enough information from the partial input before writing the output. Experiments show our method outperforms strong baselines in terms of translation quality and delay, when translating from German/Arabic/Czech/Bulgarian/Romanian to English.
Tasks Imitation Learning, Machine Translation
Published 2020-02-11
URL https://arxiv.org/abs/2002.04306v1
PDF https://arxiv.org/pdf/2002.04306v1.pdf
PWC https://paperswithcode.com/paper/learning-coupled-policies-for-simultaneous
Repo
Framework
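The coupled decision process is easy to sketch as a decoding loop: the programmer policy picks READ or WRITE, and the interpreter policy picks the word to write given the partial input. In the sketch below, toy rule-based policies (a wait-k-style rule and a dictionary lookup) stand in for the paper's learned models.

```python
# A minimal programmer-interpreter decoding loop with toy stand-in policies.

def programmer_policy(n_read, n_written, src_len, k=2):
    # wait-k style rule: stay k source words ahead of the output
    if n_read < src_len and n_read - n_written < k:
        return "READ"
    return "WRITE"

def interpreter_policy(read_so_far, n_written, lexicon):
    # toy word-for-word translation of the next uncovered source word
    return lexicon.get(read_so_far[n_written], "<unk>")

def translate(src, lexicon):
    read, out, i = [], [], 0
    while len(out) < len(src):
        if programmer_policy(i, len(out), len(src)) == "READ":
            read.append(src[i])
            i += 1
        else:
            out.append(interpreter_policy(read, len(out), lexicon))
    return out

lexicon = {"das": "the", "haus": "house", "ist": "is", "klein": "small"}
print(translate(["das", "haus", "ist", "klein"], lexicon))
# ['the', 'house', 'is', 'small']
```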

Importance-Driven Deep Learning System Testing

Title Importance-Driven Deep Learning System Testing
Authors Simos Gerasimou, Hasan Ferit Eniser, Alper Sen, Alper Cakan
Abstract Deep Learning (DL) systems are key enablers for engineering intelligent applications due to their ability to solve complex tasks such as image recognition and machine translation. Nevertheless, using DL systems in safety- and security-critical applications requires providing testing evidence for their dependable operation. Recent research in this direction focuses on adapting testing criteria from traditional software engineering as a means of increasing confidence in their correct behaviour. However, such criteria are inadequate in capturing the intrinsic properties exhibited by these systems. We bridge this gap by introducing DeepImportance, a systematic testing methodology accompanied by an Importance-Driven (IDC) test adequacy criterion for DL systems. Applying IDC makes it possible to establish a layer-wise functional understanding of the importance of DL system components and to use this information to assess the semantic diversity of a test set. Our empirical evaluation on several DL systems, across multiple DL datasets and with state-of-the-art adversarial generation techniques, demonstrates the usefulness and effectiveness of DeepImportance and its ability to support the engineering of more robust DL systems.
Tasks Machine Translation
Published 2020-02-09
URL https://arxiv.org/abs/2002.03433v1
PDF https://arxiv.org/pdf/2002.03433v1.pdf
PWC https://paperswithcode.com/paper/importance-driven-deep-learning-system
Repo
Framework
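A loose sketch of an importance-driven adequacy criterion in the spirit of the abstract (not the paper's exact IDC definition): rank a layer's neurons by an importance score, cluster the important neurons' training-time activations, and score a test set by the fraction of cluster combinations it reaches. All scores and shapes below are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train_acts = rng.normal(size=(1000, 32))   # layer activations on training data
test_acts = rng.normal(size=(50, 32))      # layer activations on a test set
importance = np.abs(train_acts).mean(0)    # stand-in neuron-importance score
top = np.argsort(importance)[-3:]          # the 3 most important neurons

# One clustering of activation values per important neuron
kms = [KMeans(n_clusters=4, n_init=10, random_state=0).fit(train_acts[:, [j]])
       for j in top]

def combination(acts_row):
    """Which activation cluster each important neuron falls into."""
    return tuple(int(km.predict(acts_row[[j]].reshape(1, 1))[0])
                 for km, j in zip(kms, top))

covered = {combination(row) for row in test_acts}
print(f"importance-driven coverage: {len(covered)}/{4 ** len(top)} combinations")
```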

FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Title FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA
Authors Shehzeen Hussain, Mojan Javaheripi, Paarth Neekhara, Ryan Kastner, Farinaz Koushanfar
Abstract Autoregressive convolutional neural networks (CNNs) have been widely exploited for sequence generation tasks such as audio synthesis, language modeling and neural machine translation. WaveNet is a deep autoregressive CNN composed of several stacked layers of dilated convolution that is used for sequence generation. While WaveNet produces state-of-the-art audio generation results, the naive inference implementation is quite slow; it takes a few minutes to generate just one second of audio on a high-end GPU. In this work, we develop FastWave, the first accelerator platform for autoregressive convolutional neural networks, and address the associated design challenges. We design the Fast-Wavenet inference model in Vivado HLS and perform a wide range of optimizations including fixed-point implementation, array partitioning and pipelining. Our model uses a fully parameterized parallel architecture for fast matrix-vector multiplication that enables per-layer customized latency fine-tuning for further throughput improvement. Our experiments comparatively assess the trade-off between throughput and resource utilization for various optimizations. Our best WaveNet design on the Xilinx XCVU13P FPGA, which uses only on-chip memory, achieves 66× faster generation than a CPU implementation and 11× faster generation than a GPU implementation.
Tasks Audio Generation, Language Modelling, Machine Translation
Published 2020-02-09
URL https://arxiv.org/abs/2002.04971v1
PDF https://arxiv.org/pdf/2002.04971v1.pdf
PWC https://paperswithcode.com/paper/fastwave-accelerating-autoregressive
Repo
Framework
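The inference pattern being accelerated is worth sketching: Fast-WaveNet-style generation keeps a short per-layer queue of past activations, so each new sample costs one small matrix-vector product per dilated layer, which is exactly the operation the FPGA design parallelises. A NumPy sketch with random stand-in weights (a real WaveNet also has gated activations, skip connections, and samples from an output distribution):

```python
import numpy as np

rng = np.random.default_rng(0)
channels, dilations = 8, [1, 2, 4, 8]
W_cur = [rng.normal(size=(channels, channels)) * 0.1 for _ in dilations]
W_past = [rng.normal(size=(channels, channels)) * 0.1 for _ in dilations]
queues = [np.zeros((d, channels)) for d in dilations]   # per-layer history buffers

x = rng.normal(size=channels)    # seed input; real WaveNet feeds back samples
samples = []
for t in range(16):
    h = x
    for i, d in enumerate(dilations):
        past = queues[i][0]                     # activation from time t - d
        queues[i] = np.roll(queues[i], -1, axis=0)
        queues[i][-1] = h                       # push the current activation
        h = np.tanh(W_cur[i] @ h + W_past[i] @ past)   # kernel-size-2 dilated conv
    samples.append(h[0])
    x = h                                       # autoregressive feedback
print(np.round(samples[:5], 3))
```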

Learning Light Field Angular Super-Resolution via a Geometry-Aware Network

Title Learning Light Field Angular Super-Resolution via a Geometry-Aware Network
Authors Jing Jin, Junhui Hou, Hui Yuan, Sam Kwong
Abstract The acquisition of light field images with high angular resolution is costly. Although many methods have been proposed to improve the angular resolution of a sparsely-sampled light field, they always focus on light fields with a small baseline, as captured by consumer light field cameras. By making full use of the intrinsic geometry information of light fields, in this paper we propose an end-to-end learning-based approach aimed at angularly super-resolving a sparsely-sampled light field with a large baseline. Our model consists of two learnable modules and a physically-based module. Specifically, it includes a depth estimation module for explicitly modeling the scene geometry, a physically-based warping module for novel view synthesis, and a light field blending module specifically designed for light field reconstruction. Moreover, we introduce a novel loss function to promote the preservation of the light field parallax structure. Experimental results on various light field datasets, including large-baseline light field images, demonstrate the significant superiority of our method over state-of-the-art ones: our method improves the PSNR over the second-best method by up to 2 dB on average, while reducing the execution time by a factor of 48. In addition, our method preserves the light field parallax structure better.
Tasks Depth Estimation, Super-Resolution
Published 2020-02-26
URL https://arxiv.org/abs/2002.11263v1
PDF https://arxiv.org/pdf/2002.11263v1.pdf
PWC https://paperswithcode.com/paper/learning-light-field-angular-super-resolution
Repo
Framework
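The physically-based warping module has a compact form for Lambertian scenes: a pixel in a source sub-aperture view at angular position u_s appears in a novel view at u_n shifted by disparity × (u_n − u_s). A NumPy sketch using forward splatting with rounding; in the paper's pipeline, learned modules would supply the disparity map and blend several warped views.

```python
import numpy as np

def warp_view(img, disparity, du):
    """Warp a source view to a novel angular position du steps away.
    img: (H, W) intensities; disparity: (H, W) in pixels per angular step."""
    H, W = img.shape
    out = np.zeros_like(img)
    ys, xs = np.mgrid[0:H, 0:W]
    xt = np.clip(np.round(xs + disparity * du).astype(int), 0, W - 1)
    out[ys, xt] = img[ys, xs]    # forward splat; later writes win on collisions
    return out

img = np.tile(np.linspace(0, 1, 8), (4, 1))      # toy 4x8 view
disp = np.full((4, 8), 2.0)                      # constant 2 px/view disparity
print(np.round(warp_view(img, disp, du=1), 2))   # content shifted right by 2 px
```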

Rhythm, Chord and Melody Generation for Lead Sheets using Recurrent Neural Networks

Title Rhythm, Chord and Melody Generation for Lead Sheets using Recurrent Neural Networks
Authors Cedric De Boom, Stephanie Van Laere, Tim Verbelen, Bart Dhoedt
Abstract Music that is generated by recurrent neural networks often lacks a sense of direction and coherence. We therefore propose a two-stage LSTM-based model for lead sheet generation, in which the harmonic and rhythmic templates of the song are produced first, after which, in a second stage, a sequence of melody notes is generated conditioned on these templates. A subjective listening test shows that our approach outperforms the baselines and increases perceived musical coherence.
Tasks
Published 2020-02-21
URL https://arxiv.org/abs/2002.10266v1
PDF https://arxiv.org/pdf/2002.10266v1.pdf
PWC https://paperswithcode.com/paper/rhythm-chord-and-melody-generation-for-lead
Repo
Framework
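The two-stage structure can be sketched independently of the LSTMs: stage one emits the harmonic and rhythmic template, stage two fills in melody notes conditioned on it. In the sketch below, random choices stand in for the trained networks, and the chord symbols and MIDI pitches are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
chords = ["C", "Am", "F", "G"]
durations = [0.5, 1.0, 2.0]                  # note lengths in beats
chord_tones = {"C": [60, 64, 67], "Am": [57, 60, 64],
               "F": [53, 57, 60], "G": [55, 59, 62]}

# Stage 1: harmonic and rhythmic template (one chord plus a rhythm per bar)
template = [(str(rng.choice(chords)), rng.choice(durations, size=4).tolist())
            for _ in range(4)]

# Stage 2: melody conditioned on the template (here: prefer chord tones,
# occasionally jumping up an octave)
lead_sheet = []
for chord, rhythm in template:
    notes = [int(rng.choice(chord_tones[chord])) + int(rng.choice([0, 12]))
             for _ in rhythm]
    lead_sheet.append((chord, list(zip(notes, rhythm))))

for bar in lead_sheet:
    print(bar)    # (chord, [(midi_pitch, duration), ...]) per bar
```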

A Provably Robust Multiple Rotation Averaging Scheme for SO(2)

Title A Provably Robust Multiple Rotation Averaging Scheme for SO(2)
Authors Tyler Maunu, Gilad Lerman
Abstract We give adversarial robustness results for synchronization on the rotation group over $\mathbb{R}^2$, $\mathrm{SO}(2)$. In particular, we consider an adversarial corruption setting, where an adversary can choose which measurements to corrupt as well as what to corrupt them to. In this setting, we first show that some common nonconvex formulations, which are categorized as “multiple rotation averaging”, may fail. We then discuss a new fast algorithm, called Trimmed Averaging Synchronization, which achieves exact recovery and linear convergence up to an outlier fraction of $1/4$.
Tasks
Published 2020-02-13
URL https://arxiv.org/abs/2002.05299v1
PDF https://arxiv.org/pdf/2002.05299v1.pdf
PWC https://paperswithcode.com/paper/a-provably-robust-multiple-rotation-averaging
Repo
Framework
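A sketch in the spirit of the trimmed scheme (the exact Trimmed Averaging Synchronization updates are in the paper; this is an assumed variant for illustration): estimate absolute angles from pairwise measurements r_ij ≈ θ_j − θ_i, a fraction of which an adversary has replaced, by repeatedly averaging the neighbour-implied angles after discarding the candidates farthest from the current estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n, corrupt_frac, trim_frac = 20, 0.2, 0.25
theta_true = rng.uniform(0, 2 * np.pi, n)
rel = {}                                     # measurements of theta_j - theta_i
for i in range(n):
    for j in range(i + 1, n):
        clean = theta_true[j] - theta_true[i]
        rel[(i, j)] = rng.uniform(0, 2 * np.pi) if rng.random() < corrupt_frac else clean

theta = rng.uniform(0, 2 * np.pi, n)
theta[0] = theta_true[0]                     # pin one node to fix the global gauge
for _ in range(50):
    for i in range(1, n):
        # each neighbour j implies a candidate theta_i = theta_j - r_ij
        r = np.array([rel[(i, j)] if i < j else -rel[(j, i)]
                      for j in range(n) if j != i])
        others = np.array([theta[j] for j in range(n) if j != i])
        cands = others - r
        # trim the candidates farthest (on the circle) from the current estimate
        resid = np.abs(np.angle(np.exp(1j * (cands - theta[i]))))
        keep = cands[np.argsort(resid)[: int(len(cands) * (1 - trim_frac))]]
        theta[i] = np.angle(np.mean(np.exp(1j * keep)))   # circular mean

err = np.abs(np.angle(np.exp(1j * (theta - theta_true))))
print(f"max angular error: {err.max():.3f} rad")
```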

A sequential resource investment planning framework using reinforcement learning and simulation-based optimization: A case study on microgrid storage expansion

Title A sequential resource investment planning framework using reinforcement learning and simulation-based optimization: A case study on microgrid storage expansion
Authors S. Tsianikas, N. Yousefi, J. Zhou, M. Rodgers, D. W. Coit
Abstract A model and expansion plan have been developed to optimally determine microgrid designs as they evolve to dynamically react to changing conditions and to exploit energy storage capabilities. Given the highly electrified future ahead of us, the role of energy storage is crucial wherever distributed generation is abundant, such as in microgrid settings. With the variety of storage options recently becoming more economical, determining which type of storage technology to invest in, along with the appropriate timing and capacity, becomes a critical research question. In problems like this one, where investment timing is of high priority, developing analytical and systematic frameworks for rigorously considering these issues is indispensable. From a business perspective, these strategic frameworks aim to optimize the investment planning process by leveraging novel approaches and by capturing problem details that traditional approaches cannot. Reinforcement learning algorithms have recently proven successful in problems where sequential decision-making is inherent. In the operations planning area, these algorithms are already used, but mostly in short-term problems with well-defined constraints and low levels of uncertainty modeling. In this work, by contrast, we expand and tailor these techniques to long-term investment planning by utilizing model-free approaches, such as the Q-learning algorithm, combined with simulation-based models. We find that specific types of energy storage units, including the vanadium-redox battery, can be expected to be at the core of future microgrid applications and therefore require further attention. Another key finding is that the optimal storage capacity threshold for a system depends heavily on the price movements of the available storage units in the market.
Tasks Decision Making, Q-Learning
Published 2020-01-10
URL https://arxiv.org/abs/2001.03507v1
PDF https://arxiv.org/pdf/2001.03507v1.pdf
PWC https://paperswithcode.com/paper/a-sequential-resource-investment-planning
Repo
Framework
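The model-free backbone the authors mention, tabular Q-learning, can be sketched on a toy capacity-expansion MDP: the state is the storage capacity held, the action is how much to add each year, and the reward trades a hypothetical revenue curve against a declining purchase price. All numbers below are illustrative, not from the paper's case study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_caps = 11                              # state: MWh of storage held (0..10)
actions = np.array([0, 1, 2])            # action: MWh purchased this year
horizon, episodes = 10, 5000
alpha, gamma, eps = 0.1, 0.95, 0.1

def reward(cap, add, year):
    price = 5.0 * 0.95 ** year           # storage gets cheaper over time
    revenue = 2.0 * np.sqrt(cap + add)   # diminishing returns to capacity
    return revenue - price * add + rng.normal(0, 0.1)   # noisy simulation

Q = np.zeros((horizon, n_caps, len(actions)))
for _ in range(episodes):
    cap = 0
    for t in range(horizon):
        # epsilon-greedy action selection
        a = rng.integers(len(actions)) if rng.random() < eps else int(np.argmax(Q[t, cap]))
        r = reward(cap, actions[a], t)
        nxt = min(cap + actions[a], n_caps - 1)
        boot = gamma * Q[t + 1, nxt].max() if t + 1 < horizon else 0.0
        Q[t, cap, a] += alpha * (r + boot - Q[t, cap, a])   # Q-learning update
        cap = nxt

cap, plan = 0, []                        # greedy schedule under the learned Q
for t in range(horizon):
    a = int(np.argmax(Q[t, cap]))
    plan.append(int(actions[a]))
    cap = min(cap + actions[a], n_caps - 1)
print("yearly capacity additions:", plan)
```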

Hardware Architecture Proposal for TEDA algorithm to Data Streaming Anomaly Detection

Title Hardware Architecture Proposal for TEDA algorithm to Data Streaming Anomaly Detection
Authors Lucileide M. D. da Silva, Maria G. F. Coutinho, Carlos E. B. Santos, Mailson R. Santos, Luiz Affonso Guedes, M. Dolores Ruiz, Marcelo A. C. Fernandes
Abstract The amount of real-time data available today, such as time series and streaming data, continues to grow. Being able to analyze this data the moment it arrives can bring immense added value, but it also requires substantial computational effort and new acceleration techniques. As a possible solution to this problem, this paper proposes a hardware architecture for the Typicality and Eccentricity Data Analytics (TEDA) algorithm, implemented on Field Programmable Gate Arrays (FPGAs) for use in data streaming anomaly detection. TEDA is based on a new approach to outlier detection in the data stream context. To validate the proposal, occupation and throughput results for the proposed hardware are presented, along with bit-accurate simulation results. The design targets the Xilinx Virtex-6 xc6vlx240t-1ff1156 FPGA.
Tasks Anomaly Detection, Outlier Detection, Time Series
Published 2020-03-08
URL https://arxiv.org/abs/2003.03837v1
PDF https://arxiv.org/pdf/2003.03837v1.pdf
PWC https://paperswithcode.com/paper/hardware-architecture-proposal-for-teda
Repo
Framework
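The TEDA recursions that make a streaming hardware implementation attractive fit in a few lines. The sketch below uses one common formulation from Angelov's typicality-and-eccentricity framework (assumed here, not taken from this paper): mean and spread update recursively per sample, the eccentricity of the current sample is computed in O(1), and a Chebyshev-style threshold flags outliers.

```python
import numpy as np

def teda_stream(samples, m=3.0):
    """Flag anomalies in a stream via recursive eccentricity (TEDA-style)."""
    mu, var = None, 0.0
    flags = []
    for k, x in enumerate(samples, start=1):
        x = np.atleast_1d(x).astype(float)
        if k == 1:
            mu = x.copy()
            flags.append(False)
            continue
        mu = (k - 1) / k * mu + x / k                        # recursive mean
        var = (k - 1) / k * var + np.sum((x - mu) ** 2) / (k - 1)
        ecc = 1.0 / k + np.sum((mu - x) ** 2) / (k * var)    # eccentricity
        zeta = ecc / 2.0                                     # normalised
        flags.append(bool(zeta > (m * m + 1) / (2 * k)))     # m-sigma-style test
    return flags

rng = np.random.default_rng(0)
stream = rng.normal(0, 1, 200).tolist() + [8.0]              # spike at the end
print(teda_stream(stream)[-1])    # True: the spike is flagged as anomalous
```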

HMANet: Hybrid Multiple Attention Network for Semantic Segmentation in Aerial Images

Title HMANet: Hybrid Multiple Attention Network for Semantic Segmentation in Aerial Images
Authors Ruigang Niu
Abstract Semantic segmentation of very high resolution (VHR) aerial images is one of the most challenging tasks in remote sensing image understanding. Most current approaches are based on deep convolutional neural networks (DCNNs), owing to their remarkable ability to learn feature representations. In particular, attention-based methods can effectively capture long-range dependencies and reconstruct the feature maps for better representation. However, limited to the perspectives of spatial and channel attention, and burdened by the heavy computational complexity of the self-attention mechanism, such methods struggle to model effective semantic interdependencies between every pixel pair. In this work, we propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet), which adaptively captures global correlations from the perspectives of space, channel and category in a more effective and efficient manner. Concretely, a class augmented attention (CAA) module embedded with a class channel attention (CCA) module is used to compute category-based correlation and recalibrate class-level information. Additionally, we introduce a simple yet effective region shuffle attention (RSA) module that reduces feature redundancy and improves the efficiency of the self-attention mechanism via region-wise representations. Extensive experimental results on the ISPRS Vaihingen and Potsdam benchmarks demonstrate the effectiveness and efficiency of our HMANet compared with other state-of-the-art methods.
Tasks Semantic Segmentation
Published 2020-01-09
URL https://arxiv.org/abs/2001.02870v1
PDF https://arxiv.org/pdf/2001.02870v1.pdf
PWC https://paperswithcode.com/paper/hmanet-hybrid-multiple-attention-network-for
Repo
Framework
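The efficiency idea behind attending over regions rather than pixels can be sketched loosely (a stand-in for the paper's region shuffle attention, not its exact module): pool pixels into regions, run self-attention over the much shorter region sequence, and broadcast the result back, cutting the quadratic cost of pixel-level self-attention.

```python
import torch
import torch.nn as nn

class RegionAttention(nn.Module):
    """Self-attention over pooled regions instead of individual pixels."""
    def __init__(self, channels, region=8):
        super().__init__()
        self.region = region
        self.pool = nn.AvgPool2d(region)                 # pixels -> regions
        self.attn = nn.MultiheadAttention(channels, num_heads=4, batch_first=True)
        self.up = nn.Upsample(scale_factor=region, mode="nearest")

    def forward(self, x):                                # x: (B, C, H, W)
        b, c, h, w = x.shape
        r = self.pool(x).flatten(2).transpose(1, 2)      # (B, N_regions, C)
        r, _ = self.attn(r, r, r)                        # region-level attention
        r = r.transpose(1, 2).reshape(b, c, h // self.region, w // self.region)
        return x + self.up(r)                            # residual recalibration

x = torch.randn(2, 64, 32, 32)
print(RegionAttention(64)(x).shape)                      # torch.Size([2, 64, 32, 32])
```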

Learning Algebraic Multigrid Using Graph Neural Networks

Title Learning Algebraic Multigrid Using Graph Neural Networks
Authors Ilay Luz, Meirav Galun, Haggai Maron, Ronen Basri, Irad Yavneh
Abstract Efficient numerical solvers for sparse linear systems are crucial in science and engineering. One of the fastest methods for solving large-scale sparse linear systems is algebraic multigrid (AMG). The main challenge in the construction of AMG algorithms is the selection of the prolongation operator – a problem-dependent sparse matrix which governs the multiscale hierarchy of the solver and is critical to its efficiency. Over many years, numerous methods have been developed for this task, and yet there is no known single right answer except in very special cases. Here we propose a framework for learning AMG prolongation operators for linear systems with sparse symmetric positive (semi-) definite matrices. We train a single graph neural network to learn a mapping from an entire class of such matrices to prolongation operators, using an efficient unsupervised loss function. Experiments on a broad class of problems demonstrate improved convergence rates compared to classical AMG, demonstrating the potential utility of neural networks for developing sparse system solvers.
Tasks
Published 2020-03-12
URL https://arxiv.org/abs/2003.05744v1
PDF https://arxiv.org/pdf/2003.05744v1.pdf
PWC https://paperswithcode.com/paper/learning-algebraic-multigrid-using-graph
Repo
Framework
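The prolongation operator's role is concrete in a two-grid cycle: smooth, restrict the residual with P^T, solve the Galerkin coarse system P^T A P, and interpolate the correction back with P. The paper trains a GNN to emit P; in the sketch below, a piecewise-constant aggregation stands in for the learned operator.

```python
import numpy as np

n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # 1-D Poisson matrix (SPD)
b = np.ones(n)

# Stand-in prolongation: aggregate pairs of fine nodes into one coarse node.
# This is the matrix a learned method would replace with a GNN's output.
P = np.zeros((n, n // 2))
for c in range(n // 2):
    P[2 * c, c] = P[2 * c + 1, c] = 1.0

A_c = P.T @ A @ P                                        # Galerkin coarse operator

def jacobi(x, A, b, sweeps=2, omega=2 / 3):
    """Weighted Jacobi smoothing."""
    D = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / D
    return x

x = np.zeros(n)
for _ in range(20):                                      # two-grid V-cycles
    x = jacobi(x, A, b)                                  # pre-smoothing
    r = b - A @ x
    x = x + P @ np.linalg.solve(A_c, P.T @ r)            # coarse-grid correction
    x = jacobi(x, A, b)                                  # post-smoothing
print(f"residual norm after 20 cycles: {np.linalg.norm(b - A @ x):.2e}")
```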