April 2, 2020

3404 words 16 mins read

Paper Group ANR 249

Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks. NCVis: Noise Contrastive Approach for Scalable Visualization. Matrix-LSTM: a Differentiable Recurrent Surface for Asynchronous Event-Based Data. MAC Protocol Design Optimization Using Deep Learning. The Rumour Mill: Making the Spread of Misinformation Explicit an …

Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks

Title Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks
Authors Shindong Lee, BongGu Ko, Keonnyeong Lee, In-Chul Yoo, Dongsuk Yook
Abstract Voice conversion (VC) refers to transforming the speaker characteristics of an utterance without altering its linguistic content. Many voice conversion methods require parallel training data, which is expensive to acquire. Recently, the cycle-consistent adversarial network (CycleGAN), which does not require parallel training data, has been applied to voice conversion, achieving state-of-the-art performance. CycleGAN-based voice conversion, however, can be used only for a pair of speakers, i.e., one-to-one voice conversion between two speakers. In this paper, we extend the CycleGAN by conditioning the network on speakers. As a result, the proposed method can perform many-to-many voice conversion among multiple speakers using a single generative adversarial network (GAN). Compared to building a separate CycleGAN for each pair of speakers, the proposed method significantly reduces computational and storage costs without compromising the sound quality of the converted voice. Experimental results on the VCC2018 corpus confirm the efficiency of the proposed method.
Tasks Voice Conversion
Published 2020-02-15
URL https://arxiv.org/abs/2002.06328v1
PDF https://arxiv.org/pdf/2002.06328v1.pdf
PWC https://paperswithcode.com/paper/many-to-many-voice-conversion-using
Repo
Framework
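
Since the paper's key move is conditioning a single CycleGAN on speaker identities, here is a minimal PyTorch sketch of a speaker-conditioned generator with a cycle-consistency loss. All layer sizes, the embedding scheme, and the feature dimension (e.g., 36 MCEPs) are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch: one generator conditioned on (source, target) speaker embeddings.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, n_speakers, feat_dim=36, emb_dim=8, hidden=128):
        super().__init__()
        self.spk_emb = nn.Embedding(n_speakers, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(feat_dim + 2 * emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, x, src_id, tgt_id):
        # x: (batch, frames, feat_dim) acoustic features, e.g. MCEPs
        cond = torch.cat([self.spk_emb(src_id), self.spk_emb(tgt_id)], dim=-1)
        cond = cond.unsqueeze(1).expand(-1, x.size(1), -1)
        return self.net(torch.cat([x, cond], dim=-1))

G = ConditionalGenerator(n_speakers=4)
x = torch.randn(2, 100, 36)                    # two utterances, 100 frames each
src = torch.tensor([0, 1]); tgt = torch.tensor([2, 3])
fake = G(x, src, tgt)                          # convert toward target speakers
cycle = G(fake, tgt, src)                      # convert back for the cycle loss
cycle_loss = torch.mean(torch.abs(cycle - x))  # L1 cycle-consistency term
```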

NCVis: Noise Contrastive Approach for Scalable Visualization

Title NCVis: Noise Contrastive Approach for Scalable Visualization
Authors Aleksandr Artemenkov, Maxim Panov
Abstract Modern methods for data visualization via dimensionality reduction, such as t-SNE, usually have performance issues that prohibit their application to large amounts of high-dimensional data. In this work, we propose NCVis, a high-performance dimensionality reduction method built on the sound statistical basis of noise contrastive estimation. We show that NCVis outperforms state-of-the-art techniques in terms of speed while preserving the representation quality of other methods. In particular, the proposed approach successfully processes a dataset of more than 1 million news headlines in several minutes and presents the underlying structure in a human-readable way. Moreover, it provides results consistent with classical methods like t-SNE on more straightforward datasets such as images of hand-written digits. We believe that broader usage of such software can significantly simplify large-scale data analysis and lower the entry barrier to this area.
Tasks Dimensionality Reduction
Published 2020-01-30
URL https://arxiv.org/abs/2001.11411v1
PDF https://arxiv.org/pdf/2001.11411v1.pdf
PWC https://paperswithcode.com/paper/ncvis-noise-contrastive-approach-for-scalable
Repo
Framework
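
To make the noise-contrastive idea concrete, here is a toy NumPy sketch of an NCE-style embedding update: true kNN-graph edges are attracted and randomly drawn "noise" pairs are repelled. The Student-t similarity kernel, the logit choice, and the step sizes are assumptions; NCVis's exact objective and sampling scheme may differ.

```python
# Toy noise-contrastive embedding update over a precomputed kNN graph.
import numpy as np

rng = np.random.default_rng(0)
n, n_noise, lr = 1000, 5, 0.5
Y = rng.normal(scale=1e-2, size=(n, 2))     # 2-D embedding being optimized

def sim(a, b):
    # Student-t style similarity in embedding space (an assumption)
    return 1.0 / (1.0 + np.sum((a - b) ** 2))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_step(i, j):
    # Treat the kNN edge (i, j) as the "data" class and random pairs as
    # noise, then take one stochastic step on the logistic NCE objective.
    w = sim(Y[i], Y[j])
    p = sigmoid(np.log(w))                            # P(pair is "data")
    Y[i] += lr * (1 - p) * 2 * w * (Y[j] - Y[i])      # attract true neighbor
    for k in rng.integers(0, n, size=n_noise):
        w = sim(Y[i], Y[k])
        p = sigmoid(np.log(w))
        Y[i] += lr * p * 2 * w * (Y[i] - Y[k])        # repel noise pairs

# one epoch over a precomputed kNN edge list:
# for i, j in knn_edges: nce_step(i, j)
```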

Matrix-LSTM: a Differentiable Recurrent Surface for Asynchronous Event-Based Data

Title Matrix-LSTM: a Differentiable Recurrent Surface for Asynchronous Event-Based Data
Authors Marco Cannici, Marco Ciccone, Andrea Romanoni, Matteo Matteucci
Abstract Dynamic Vision Sensors (DVSs) asynchronously stream events at pixels that undergo brightness changes. Unlike classic vision devices, they produce a sparse representation of the scene. Therefore, to apply standard computer vision algorithms, events need to be integrated into a frame or event surface. This is usually attained through hand-crafted grids that reconstruct the frame using ad-hoc heuristics. In this paper, we propose Matrix-LSTM, a grid of Long Short-Term Memory (LSTM) cells that learns task-dependent event surfaces end-to-end. Compared to existing reconstruction approaches, our learned event surface shows good flexibility and expressiveness, improving on the baselines for optical flow estimation on the MVSEC benchmark and on the state of the art in event-based object classification on the N-Cars dataset.
Tasks Object Classification, Optical Flow Estimation
Published 2020-01-10
URL https://arxiv.org/abs/2001.03455v1
PDF https://arxiv.org/pdf/2001.03455v1.pdf
PWC https://paperswithcode.com/paper/matrix-lstm-a-differentiable-recurrent
Repo
Framework
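
The following is a toy PyTorch sketch of the Matrix-LSTM idea: a single shared LSTM cell integrates each pixel's asynchronous event stream, and the final hidden states form a dense, learnable event surface that a downstream CNN can consume. The per-event features and sizes are assumptions, not the paper's exact formulation.

```python
# Toy per-pixel LSTM integration of asynchronous events into a surface.
import torch
import torch.nn as nn

class MatrixLSTMSurface(nn.Module):
    def __init__(self, feat_dim=3, channels=8):
        super().__init__()
        # one cell whose weights are shared across all pixel locations
        self.cell = nn.LSTMCell(feat_dim, channels)
        self.channels = channels

    def forward(self, events, H, W):
        # events: time-ordered list of (x, y, feat) with feat a (feat_dim,)
        # tensor, e.g. (timestamp, polarity, time-since-last-event)
        h = torch.zeros(H * W, self.channels)
        c = torch.zeros(H * W, self.channels)
        for x, y, feat in events:
            idx = y * W + x
            hi, ci = self.cell(feat.unsqueeze(0), (h[idx:idx+1], c[idx:idx+1]))
            h = h.clone(); c = c.clone()      # keep autograd happy
            h[idx], c[idx] = hi[0], ci[0]
        # final hidden states form the (channels, H, W) learned event surface
        return h.t().reshape(self.channels, H, W)
```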

MAC Protocol Design Optimization Using Deep Learning

Title MAC Protocol Design Optimization Using Deep Learning
Authors Hannaneh Barahouei Pasandi, Tamer Nadeem
Abstract Deep learning (DL)-based solutions have recently been developed for communication protocol design. Such learning-based solutions can avoid the manual effort of tuning individual protocol parameters. While these solutions look promising, they are hard to interpret due to the black-box nature of the underlying ML techniques. To this end, we propose a novel deep reinforcement learning (DRL)-based framework to systematically design and evaluate networking protocols. While other ML-based methods mainly focus on tuning individual protocol parameters (e.g., adjusting the contention window), our main contribution is to decouple a protocol into a set of parametric modules, each representing a main protocol functionality; these modules serve as DRL inputs, allowing the design of the generated protocols to be better understood and analyzed in a systematic fashion. As a case study, we introduce and evaluate DeepMAC, a framework in which a MAC protocol is decoupled into a set of blocks across popular flavors of 802.11 WLANs (e.g., 802.11a/b/g/n/ac). We are interested in which blocks DeepMAC selects across different networking scenarios and whether DeepMAC is able to adapt to network dynamics.
Tasks
Published 2020-02-06
URL https://arxiv.org/abs/2002.02075v1
PDF https://arxiv.org/pdf/2002.02075v1.pdf
PWC https://paperswithcode.com/paper/mac-protocol-design-optimization-using-deep
Repo
Framework
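
As a rough illustration of decoupling a MAC protocol into parametric blocks that a learning agent can search over, here is a Python sketch where the action space is the cartesian product of block settings and a trivial epsilon-greedy bandit stands in for the DRL agent. Block names, value ranges, and the reward hook are invented for illustration; DeepMAC's actual blocks and agent are specified in the paper.

```python
# Sketch: MAC functionality exposed as parametric blocks for an agent.
import itertools
import random

MAC_BLOCKS = {
    "carrier_sense":     [False, True],
    "backoff":           ["none", "exponential"],
    "ack":               [False, True],
    "rts_cts":           [False, True],
    "contention_window": [15, 31, 63, 127],
}

# The discrete action space is the cartesian product of block settings.
ACTIONS = list(itertools.product(*MAC_BLOCKS.values()))

def search(env_reward, episodes=100, eps=0.1):
    # Epsilon-greedy bandit over protocol configurations; a real DRL agent
    # (e.g., DQN) would replace this and condition on observed network state.
    q = {a: 0.0 for a in ACTIONS}
    n = {a: 0 for a in ACTIONS}
    for _ in range(episodes):
        a = random.choice(ACTIONS) if random.random() < eps else max(q, key=q.get)
        r = env_reward(dict(zip(MAC_BLOCKS, a)))   # e.g., measured throughput
        n[a] += 1
        q[a] += (r - q[a]) / n[a]
    return max(q, key=q.get)                        # best block configuration
```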

The Rumour Mill: Making the Spread of Misinformation Explicit and Tangible

Title The Rumour Mill: Making the Spread of Misinformation Explicit and Tangible
Authors Nanna Inie, Jeanette Falk Olesen, Leon Derczynski
Abstract Misinformation spread presents a technological and social threat to society. With the advance of AI-based language models, automatically generated texts have become difficult to identify and easy to create at scale. We present “The Rumour Mill”, a playful art piece, designed as a commentary on the spread of rumours and automatically-generated misinformation. The mill is a tabletop interactive machine, which invites a user to experience the process of creating believable text by interacting with different tangible controls on the mill. The user manipulates visible parameters to adjust the genre and type of an automatically generated text rumour. The Rumour Mill is a physical demonstration of the state of current technology and its ability to generate and manipulate natural language text, and of the act of starting and spreading rumours.
Tasks Rumour Detection
Published 2020-02-11
URL https://arxiv.org/abs/2002.04494v2
PDF https://arxiv.org/pdf/2002.04494v2.pdf
PWC https://paperswithcode.com/paper/the-rumour-mill-making-misinformation-spread
Repo
Framework

Gradient Estimation for Federated Learning over Massive MIMO Communication Systems

Title Gradient Estimation for Federated Learning over Massive MIMO Communication Systems
Authors Yo-Seb Jeon, Mohammad Mohammadi Amiri, Jun Li, H. Vincent Poor
Abstract Federated learning is a communication-efficient and privacy-preserving solution for training a global model through the collaboration of multiple devices, each with its own local training dataset. In this paper, we consider federated learning over massive multiple-input multiple-output (MIMO) communication systems in which wireless devices train a global model with the aid of a central server equipped with a massive antenna array. One major challenge is to design a reception technique at the central server that accurately estimates the local gradient vectors sent from the wireless devices. To overcome this challenge, we propose a novel gradient-estimation algorithm that exploits the sparsity of the local gradient vectors. Inspired by the orthogonal matching pursuit algorithm in compressive sensing, the proposed algorithm iteratively finds the devices with non-zero gradient values while estimating the transmitted signal based on the linear minimum-mean-square-error (LMMSE) method. The stopping criterion of the proposed algorithm is designed by deriving an analytical threshold for the estimation error of the transmitted signal. We also analyze the computational complexity reduction of the proposed algorithm over a simple LMMSE method. Simulation results demonstrate that the proposed algorithm performs very close to centralized learning while providing a better performance-complexity tradeoff than linear beamforming methods.
Tasks Compressive Sensing
Published 2020-03-18
URL https://arxiv.org/abs/2003.08059v1
PDF https://arxiv.org/pdf/2003.08059v1.pdf
PWC https://paperswithcode.com/paper/gradient-estimation-for-federated-learning
Repo
Framework
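
A condensed NumPy sketch of the OMP-flavored recovery loop with an LMMSE estimate on the growing support, in the spirit of the proposed gradient estimator. The real-valued channel model, unit-power signal prior, and the simplified residual-energy stopping rule are all assumptions; the paper derives its threshold analytically and works with the actual massive MIMO signal model.

```python
# OMP-style sparse recovery with an LMMSE estimate per iteration.
import numpy as np

rng = np.random.default_rng(1)

def omp_lmmse(Y, H, sigma2, max_support):
    # Y: (n_antennas,) received signal, H: (n_antennas, n_devices) channel,
    # sigma2: noise variance. Recovers a sparse device-signal vector.
    n_ant, n_dev = H.shape
    support, residual = [], Y.copy()
    x_hat = np.zeros(n_dev)
    for _ in range(max_support):
        # pick the device whose channel best explains the current residual
        scores = np.abs(H.T @ residual)
        scores[support] = -np.inf
        support.append(int(np.argmax(scores)))
        Hs = H[:, support]
        # LMMSE estimate on the current support (unit signal power assumed)
        x_s = np.linalg.solve(Hs.T @ Hs + sigma2 * np.eye(len(support)), Hs.T @ Y)
        residual = Y - Hs @ x_s
        x_hat = np.zeros(n_dev); x_hat[support] = x_s
        if np.linalg.norm(residual) ** 2 < n_ant * sigma2:  # simple threshold
            break
    return x_hat

# toy usage: 3 of 40 devices transmit non-zero gradient values
H = rng.standard_normal((64, 40)); x = np.zeros(40); x[[3, 17, 29]] = 1.0
y = H @ x + 0.05 * rng.standard_normal(64)
x_hat = omp_lmmse(y, H, sigma2=0.0025, max_support=10)
```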

Data-Driven Deep Learning to Design Pilot and Channel Estimator For Massive MIMO

Title Data-Driven Deep Learning to Design Pilot and Channel Estimator For Massive MIMO
Authors Xisuo Ma, Zhen Gao
Abstract In this paper, we propose a data-driven deep learning (DL) approach to jointly design the pilot signals and the channel estimator for wideband massive multiple-input multiple-output (MIMO) systems. By exploiting the angular-domain compressibility of massive MIMO channels, the conceived DL framework can reliably reconstruct high-dimensional channels from under-determined measurements. Specifically, we design an end-to-end deep neural network (DNN) composed of a dimensionality reduction network and a reconstruction network, which mimic the pilot signals and the channel estimator, respectively, and can be acquired by data-driven deep learning. For the dimensionality reduction network, we design a fully-connected layer that compresses the high-dimensional massive MIMO channel vector at the input into low-dimensional received measurements, where the layer weights serve as the pilot signals. For the reconstruction network, we design a fully-connected layer followed by multiple cascaded convolutional layers, which reconstruct the high-dimensional channel at the output. Defining the mean square error between input and output as the loss function, we use the Adam algorithm to train this end-to-end DNN on extensive channel samples. In this way, both the pilot signals and the channel estimator are obtained simultaneously. Simulation results demonstrate the superiority of the proposed solution over state-of-the-art compressive sensing approaches.
Tasks Compressive Sensing, Dimensionality Reduction
Published 2020-03-12
URL https://arxiv.org/abs/2003.05875v1
PDF https://arxiv.org/pdf/2003.05875v1.pdf
PWC https://paperswithcode.com/paper/data-driven-deep-learning-to-design-pilot-and
Repo
Framework
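
The architecture described above compresses the channel with one fully-connected layer, whose weights play the role of the pilot matrix, and reconstructs it with a small network trained under an input-output MSE loss using Adam. A condensed PyTorch sketch under assumed dimensions:

```python
# Joint pilot/estimator design as one end-to-end autoencoder-style network.
import torch
import torch.nn as nn

chan_dim, meas_dim = 256, 32   # high-dim channel vs. low-dim measurements

class PilotAndEstimator(nn.Module):
    def __init__(self):
        super().__init__()
        self.pilots = nn.Linear(chan_dim, meas_dim, bias=False)  # learned pilots
        self.expand = nn.Linear(meas_dim, chan_dim)
        self.refine = nn.Sequential(                 # cascaded conv refinement
            nn.Conv1d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 1, 3, padding=1),
        )

    def forward(self, h):
        y = self.pilots(h)                           # "received measurements"
        h0 = self.expand(y).unsqueeze(1)
        return self.refine(h0).squeeze(1)

model = PilotAndEstimator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
h = torch.randn(64, chan_dim)                        # stand-in channel samples
opt.zero_grad()
loss = nn.functional.mse_loss(model(h), h)           # input-output MSE
loss.backward(); opt.step()
```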

Recovering compressed images for automatic crack segmentation using generative models

Title Recovering compressed images for automatic crack segmentation using generative models
Authors Yong Huang, Haoyu Zhang, Hui Li, Stephen Wu
Abstract In a structural health monitoring (SHM) system that uses digital cameras to monitor cracks on structural surfaces, techniques for reliable and effective data compression are essential to ensure stable and energy-efficient transmission of crack images from wireless devices, e.g., drones and robots with high-definition cameras installed. Compressive sensing (CS) is a signal processing technique that allows accurate recovery of a signal from far fewer samples than the Nyquist sampling theorem requires. The conventional CS method is based on the principle that, through a regularized optimization, the sparsity of the original signals in some domain can be exploited to obtain exact reconstruction with high probability. However, the strong assumption that the signals are highly sparse in an invertible space is difficult to satisfy for real crack images. In this paper, we present a new CS approach that replaces the sparsity regularization with a generative model able to effectively capture a low-dimensional representation of the targeted images. We develop a recovery framework for automatic crack segmentation of compressed crack images based on this new CS method and demonstrate its remarkable performance: it takes advantage of the strong capability of generative models to capture the features required for the crack segmentation task even when the backgrounds of the generated images are not well reconstructed. The superior performance of our recovery framework is illustrated by comparison with three existing CS algorithms. Furthermore, we show that our framework extends to other common problems in automatic crack segmentation, such as defect recovery from motion blurring and occlusion.
Tasks Compressive Sensing
Published 2020-03-06
URL https://arxiv.org/abs/2003.03028v1
PDF https://arxiv.org/pdf/2003.03028v1.pdf
PWC https://paperswithcode.com/paper/recovering-compressed-images-for-automatic
Repo
Framework
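
The core replacement of the sparsity term is a search over the generator's latent space for an image consistent with the compressed measurements. A minimal PyTorch sketch, assuming G is some pretrained generator mapping a latent vector to a flattened image; the paper's generator, measurement model, and optimizer details may differ.

```python
# CS recovery with a generative prior: optimize the latent, not the image.
import torch

def recover(G, A, y, z_dim, steps=500, lr=0.05):
    # A: (m, n) measurement matrix, y: (m,) compressed image, G: z -> image (n,)
    z = torch.zeros(z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.sum((A @ G(z) - y) ** 2)   # measurement consistency
        loss.backward()
        opt.step()
    return G(z).detach()                        # recovered image for segmentation
```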

Convolutional Sparse Support Estimator Network (CSEN) From energy efficient support estimation to learning-aided Compressive Sensing

Title Convolutional Sparse Support Estimator Network (CSEN) From energy efficient support estimation to learning-aided Compressive Sensing
Authors Mehmet Yamac, Mete Ahishali, Serkan Kiranyaz, Moncef Gabbouj
Abstract Support estimation (SE) of a sparse signal refers to finding the location indices of the non-zero elements in a sparse representation. Most traditional approaches to the SE problem are iterative algorithms based on greedy methods or optimization techniques. Indeed, the vast majority of them use sparse signal recovery techniques to obtain support sets instead of directly mapping the non-zero locations from denser measurements (e.g., compressively sensed measurements). This study proposes a novel approach for learning such a mapping from a training set. To accomplish this objective, we design Convolutional Support Estimator Networks (CSENs), each with a compact configuration. The proposed CSEN can be a crucial tool in the following scenarios: (i) real-time and low-cost support estimation on mobile and low-power edge devices for anomaly localization, simultaneous face recognition, etc.; (ii) CSEN’s output can be used directly as “prior information” that improves the performance of sparse signal recovery algorithms. Results on benchmark datasets show that state-of-the-art performance levels can be achieved by the proposed approach with significantly reduced computational complexity.
Tasks Compressive Sensing, Face Recognition
Published 2020-03-02
URL https://arxiv.org/abs/2003.00768v1
PDF https://arxiv.org/pdf/2003.00768v1.pdf
PWC https://paperswithcode.com/paper/convolutional-sparse-support-estimator
Repo
Framework
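
A minimal sketch of the CSEN idea: a coarse proxy of the signal (e.g., a fixed back-projection of the measurements reshaped onto the image grid) is mapped by a compact CNN directly to non-zero-location probabilities. Layer sizes and the proxy choice are assumptions rather than the paper's exact configuration.

```python
# Compact convolutional support estimator: proxy in, support map out.
import torch
import torch.nn as nn

class CSEN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),  # support probs
        )

    def forward(self, proxy):
        # proxy: (batch, 1, H, W), e.g. pinv(Phi) @ y reshaped onto the grid
        return self.net(proxy)

# Training target is the binary support mask of the sparse ground truth:
# loss = nn.functional.binary_cross_entropy(csen(proxy), (x_true != 0).float())
```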

Conditional Sampling from Invertible Generative Models with Applications to Inverse Problems

Title Conditional Sampling from Invertible Generative Models with Applications to Inverse Problems
Authors Erik M. Lindgren, Jay Whang, Alexandros G. Dimakis
Abstract We consider uncertainty-aware compressive sensing when the prior distribution is defined by an invertible generative model. In this problem, we receive a set of low-dimensional measurements and want to generate samples of high-dimensional objects conditioned on these measurements. We first show that the conditional sampling problem is hard in general, and thus consider approximations to it. We develop a variational approach to conditional sampling that composes a new generative model with the given generative model. This allows us to exploit the sampling ability of the given generative model to quickly generate samples from the conditional distribution.
Tasks Compressive Sensing
Published 2020-02-26
URL https://arxiv.org/abs/2002.11743v1
PDF https://arxiv.org/pdf/2002.11743v1.pdf
PWC https://paperswithcode.com/paper/conditional-sampling-from-invertible
Repo
Framework
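
A hedged PyTorch sketch of the variational idea: fit a simple sampler over the invertible model's latent space so that generated samples agree with the measurements, with an entropy term keeping the approximate posterior from collapsing. Using a diagonal Gaussian as the composed model is a deliberate simplification; the paper composes a richer generative model with the given one.

```python
# Variational conditional sampler over the latent space of a generator G.
import torch

def fit_conditional_sampler(G, A, y, z_dim, sigma2=0.1, steps=1000, lr=1e-2):
    # G: invertible generator, batch of latents (B, z_dim) -> images (B, n)
    # A: (m, n) measurement matrix, y: (m,) observed measurements
    mu = torch.zeros(z_dim, requires_grad=True)
    log_std = torch.zeros(z_dim, requires_grad=True)
    opt = torch.optim.Adam([mu, log_std], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        eps = torch.randn(64, z_dim)                 # reparameterized samples
        z = mu + eps * log_std.exp()
        x = G(z)
        recon = ((x @ A.T - y) ** 2).sum(dim=1).mean() / (2 * sigma2)
        entropy = log_std.sum()                      # Gaussian entropy (+const)
        (recon - entropy).backward()
        opt.step()
    return mu.detach(), log_std.detach()  # then: z ~ N(mu, std^2), x = G(z)
```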

Reducing the Representation Error of GAN Image Priors Using the Deep Decoder

Title Reducing the Representation Error of GAN Image Priors Using the Deep Decoder
Authors Max Daniels, Paul Hand, Reinhard Heckel
Abstract Generative models, such as GANs, learn an explicit low-dimensional representation of a particular class of images, and so they may be used as natural image priors for solving inverse problems such as image restoration and compressive sensing. GAN priors have demonstrated impressive performance on these tasks, but they can exhibit substantial representation error for both in-distribution and out-of-distribution images because of the mismatch between the learned, approximate image distribution and the data-generating distribution. In this paper, we demonstrate a method for reducing the representation error of GAN priors by modeling images as the linear combination of a GAN prior and a Deep Decoder. The Deep Decoder is an underparameterized and, most importantly, unlearned natural signal model similar to the Deep Image Prior. No knowledge of the specific inverse problem is needed when training the GAN underlying our method. For compressive sensing and image super-resolution, our hybrid model exhibits consistently higher PSNRs than both the GAN prior and the Deep Decoder separately, on both in-distribution and out-of-distribution images. This model provides a method for extensibly and cheaply leveraging the benefits of both learned and unlearned image recovery priors in inverse problems.
Tasks Compressive Sensing, Image Restoration
Published 2020-01-23
URL https://arxiv.org/abs/2001.08747v1
PDF https://arxiv.org/pdf/2001.08747v1.pdf
PWC https://paperswithcode.com/paper/reducing-the-representation-error-of-gan
Repo
Framework
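
The hybrid prior is simple to state in code: model the image as G(z) plus an untrained Deep Decoder output and optimize both the latent and the decoder weights against the measurements. This sketch assumes a pretrained generator G and a deep_decoder callable whose trainable tensors are passed in; their exact interfaces are placeholders.

```python
# Hybrid recovery: GAN prior plus unlearned Deep Decoder correction.
import torch

def recover_hybrid(G, deep_decoder, dd_params, A, y, z_dim, steps=500, lr=0.02):
    # G: z -> image (n,); deep_decoder: () -> image (n,) from a fixed seed,
    # parameterized by the tensors in dd_params (requires_grad=True).
    z = torch.randn(z_dim, requires_grad=True)
    opt = torch.optim.Adam([z] + list(dd_params), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = G(z) + deep_decoder()             # linear combination of two priors
        loss = torch.sum((A @ x - y) ** 2)    # measurement consistency
        loss.backward()
        opt.step()
    return (G(z) + deep_decoder()).detach()
```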

Compressive sensing based privacy for fall detection

Title Compressive sensing based privacy for fall detection
Authors Ronak Gupta, Prashant Anand, Santanu Chaudhury, Brejesh Lall, Sanjay Singh
Abstract Fall detection holds immense importance in the field of healthcare, where timely detection allows for instant medical assistance. In this context, we propose a 3D ConvNet architecture consisting of 3D Inception modules for fall detection. The proposed architecture is a custom version of the Inflated 3D (I3D) architecture that takes compressed measurements of the video sequence, obtained from a compressive sensing framework, as spatio-temporal input, rather than the raw video sequence used by the original I3D convolutional neural network. This design is adopted because privacy is a major concern for patients being monitored through RGB cameras. The proposed framework for fall detection is flexible with respect to a wide variety of measurement matrices. Ten action classes, randomly selected from Kinetics-400 and containing no fall examples, are employed to train our 3D ConvNet after compressive sensing with different types of sensing matrices applied to the original video clips. Our results show that the 3D ConvNet’s performance remains unchanged across different sensing matrices. Moreover, the performance obtained with the Kinetics-pretrained 3D ConvNet on compressively sensed fall videos from benchmark datasets is better than that of state-of-the-art techniques.
Tasks Compressive Sensing
Published 2020-01-10
URL https://arxiv.org/abs/2001.03463v1
PDF https://arxiv.org/pdf/2001.03463v1.pdf
PWC https://paperswithcode.com/paper/compressive-sensing-based-privacy-for-fall
Repo
Framework
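
The privacy mechanism boils down to replacing each frame with its compressed measurements before any network sees it. A NumPy sketch with a random Gaussian sensing matrix (the paper also evaluates other matrix types; sizes here are illustrative):

```python
# Compress each frame with a sensing matrix before feeding the 3D ConvNet.
import numpy as np

rng = np.random.default_rng(0)

def compress_clip(clip, m):
    # clip: (T, H, W) grayscale video, m: measurements per frame
    T, H, W = clip.shape
    Phi = rng.standard_normal((m, H * W)) / np.sqrt(m)  # random sensing matrix
    Y = clip.reshape(T, -1) @ Phi.T                     # (T, m) measurements
    # reshape measurements into a coarse "frame" for the 3D ConvNet input
    side = int(np.sqrt(m))
    return Y[:, : side * side].reshape(T, side, side)

clip = rng.random((16, 112, 112))            # a 16-frame stand-in clip
measurements = compress_clip(clip, m=1024)   # fed to the I3D-style network
```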

Exploiting Event Cameras by Using a Network Grafting Algorithm

Title Exploiting Event Cameras by Using a Network Grafting Algorithm
Authors Yuhuang Hu, Tobi Delbruck, Shih-Chii Liu
Abstract Novel vision sensors such as event cameras provide information that is not available from conventional intensity cameras. An obstacle to using these sensors with current powerful deep neural networks is the lack of large labeled training datasets. This paper proposes a Network Grafting Algorithm (NGA), in which a new front-end network driven by unconventional visual inputs replaces the front-end network of a pretrained deep network that processes intensity frames. The self-supervised training uses only synchronously recorded intensity frames and novel sensor data to maximize feature similarity between the pretrained network and the grafted network. We show that the enhanced grafted network reaches average precision (AP$_{50}$) scores comparable to those of the pretrained network on an object detection task using an event camera dataset, with no increase in inference costs. The grafted front end has only 5–8% of the total parameters and can be trained in a few hours on a single GPU, equivalent to 5% of the time needed to train the entire object detector from labeled data. NGA allows these new vision sensors to capitalize on previously pretrained powerful deep models, saving on training cost.
Tasks Object Detection
Published 2020-03-24
URL https://arxiv.org/abs/2003.10959v1
PDF https://arxiv.org/pdf/2003.10959v1.pdf
PWC https://paperswithcode.com/paper/exploiting-event-cameras-by-using-a-network
Repo
Framework
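
A minimal PyTorch sketch of the grafting loop: the new event-driven front end is trained so its features imitate those of the pretrained frame-based front end on synchronously recorded pairs, while the back end is reused untouched. The MSE feature-similarity loss and module interfaces are assumptions about the general recipe, not NGA's exact losses.

```python
# Graft a new front end onto a pretrained network via feature imitation.
import torch
import torch.nn as nn

def graft(front_new, front_pretrained, loader, epochs=10, lr=1e-4):
    # loader yields (event_tensor, frame_tensor) recorded at the same time
    front_pretrained.eval()
    opt = torch.optim.Adam(front_new.parameters(), lr=lr)
    for _ in range(epochs):
        for events, frames in loader:
            with torch.no_grad():
                target = front_pretrained(frames)        # teacher features
            pred = front_new(events)                     # grafted front end
            loss = nn.functional.mse_loss(pred, target)  # feature similarity
            opt.zero_grad(); loss.backward(); opt.step()
    # at inference: back_end(front_new(events)), back end unchanged
```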

Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs

Title Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs
Authors Tankut Can, Kamesh Krishnamurthy, David J. Schwab
Abstract Recurrent neural networks (RNNs) are powerful dynamical models for data with complex temporal structure. However, training RNNs has traditionally proved challenging due to exploding or vanishing gradients. RNN models such as LSTMs and GRUs (and their variants) significantly mitigate these issues by introducing various types of gating units into the architecture. While these gates empirically improve performance, how their addition influences the dynamics and trainability of GRUs and LSTMs is not well understood. Here, we take the perspective of studying randomly initialized LSTMs and GRUs as dynamical systems and ask how the salient dynamical properties are shaped by the gates. We leverage tools from random matrix theory and mean-field theory to study the state-to-state Jacobians of GRUs and LSTMs. We show that the update gate in the GRU and the forget gate in the LSTM can lead to an accumulation of slow modes in the dynamics. Moreover, the GRU update gate can poise the system at a marginally stable point. The reset gate in the GRU and the output and input gates in the LSTM control the spectral radius of the Jacobian, and the GRU reset gate also modulates the complexity of the landscape of fixed points. Furthermore, for the GRU we obtain a phase diagram describing the statistical properties of fixed points. Finally, we provide a preliminary comparison of training performance across the various dynamical regimes, to be investigated in more depth elsewhere. The techniques introduced here can be generalized to other RNN architectures to elucidate how various architectural choices influence the dynamics and potentially discover novel architectures.
Tasks
Published 2020-01-31
URL https://arxiv.org/abs/2002.00025v1
PDF https://arxiv.org/pdf/2002.00025v1.pdf
PWC https://paperswithcode.com/paper/gating-creates-slow-modes-and-controls-phase
Repo
Framework
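
The central object in the analysis is the state-to-state Jacobian of the gated update. For intuition, here is a small PyTorch snippet that computes this Jacobian for a randomly initialized GRU at a random state and reports its spectral radius (the hidden size and zero-input setting are illustrative):

```python
# Spectral radius of a randomly initialized GRU's state-to-state Jacobian.
import torch

cell = torch.nn.GRUCell(input_size=1, hidden_size=128)
x = torch.zeros(1, 1)                         # zero input: autonomous dynamics
h = torch.randn(1, 128)

def step(h_flat):
    return cell(x, h_flat.unsqueeze(0)).squeeze(0)

J = torch.autograd.functional.jacobian(step, h.squeeze(0))  # (128, 128)
radius = torch.linalg.eigvals(J).abs().max()
print(f"spectral radius of state-to-state Jacobian: {radius.item():.3f}")
```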

Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation

Title Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation
Authors Fajie Yuan, Xiangnan He, Alexandros Karatzoglou, Liguang Zhang
Abstract Inductive transfer learning has had a big impact on computer vision and NLP but has not been used in the area of recommender systems. Even though there is a large body of research on generating recommendations by modeling user-item interaction sequences, few works attempt to represent and transfer these models for downstream tasks where only limited data exists. In this paper, we delve into the task of effectively learning a single user representation that can be applied to a diversity of tasks, from cross-domain recommendations to user profile predictions. Fine-tuning a large pretrained network and adapting it to downstream tasks is an effective way to solve such problems. However, fine-tuning is parameter-inefficient, since an entire model must be retrained for every new task. To overcome this issue, we develop a parameter-efficient transfer learning architecture, termed PeterRec, which can be configured on the fly for various downstream tasks. Specifically, PeterRec allows the pretrained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks, which are small yet as expressive as learning the entire network. We perform extensive ablation experiments to show the effectiveness of the learned user representation on five downstream tasks. Moreover, we show that PeterRec performs efficient transfer learning in multiple domains, where it achieves comparable or sometimes better performance than fine-tuning all model parameters.
Tasks Recommendation Systems, Transfer Learning
Published 2020-01-13
URL https://arxiv.org/abs/2001.04253v3
PDF https://arxiv.org/pdf/2001.04253v3.pdf
PWC https://paperswithcode.com/paper/parameter-efficient-transfer-from-sequential
Repo
Framework
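
To illustrate the parameter-efficient recipe, here is a hedged PyTorch sketch in PeterRec's spirit: freeze the pretrained backbone and train only small injected networks per downstream task. The residual-adapter design below is an assumption modeled on common adapter modules, not the paper's exact re-learned networks.

```python
# Freeze the backbone; fine-tune only small injected adapter networks.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, dim, bottleneck=8):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))  # residual, near-identity

class AdaptedBlock(nn.Module):
    def __init__(self, pretrained_block, dim):
        super().__init__()
        self.block = pretrained_block
        for p in self.block.parameters():
            p.requires_grad = False               # backbone stays unaltered
        self.adapter = Adapter(dim)               # only this is re-learned

    def forward(self, h):
        return self.adapter(self.block(h))

# Fine-tune with: torch.optim.Adam(
#     [p for p in model.parameters() if p.requires_grad], lr=1e-3)
```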