January 27, 2020

3074 words 15 mins read

Paper Group ANR 1256

A perspective on multi-agent communication for information fusion

Title A perspective on multi-agent communication for information fusion
Authors Homagni Saha, Vijay Venkataraman, Alberto Speranzon, Soumik Sarkar
Abstract Collaborative decision making in multi-agent systems typically requires a predefined communication protocol among agents. Usually, agent-level observations are locally processed and information is exchanged using the predefined protocol, enabling the team to perform more efficiently than each agent operating in isolation. In this work, we consider the situation where agents with complementary sensing modalities must cooperate to achieve a common goal/task by learning an efficient communication protocol. We frame the problem within an actor-critic scheme, where the agents learn optimal policies in a centralized fashion while taking actions in a distributed manner. We provide an interpretation of the emergent communication between the agents. We observe that the information exchanged is not just an encoding of the raw sensor data but is, rather, a specific set of directive actions that depend on the overall task. Simulation results demonstrate the interpretability of the learnt communication in a variety of tasks.
Tasks Decision Making
Published 2019-11-09
URL https://arxiv.org/abs/1911.03743v1
PDF https://arxiv.org/pdf/1911.03743v1.pdf
PWC https://paperswithcode.com/paper/a-perspective-on-multi-agent-communication
Repo
Framework
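A minimal sketch (not the authors' implementation) of the centralized-training, decentralized-execution setup the abstract describes: each actor consumes its local observation plus the incoming message and emits both an action and an outgoing message, while a critic that sees the joint state is used only during training. All layer sizes and dimensions are assumptions.

```python
import torch
import torch.nn as nn

class CommActor(nn.Module):
    """Actor that emits a learned message vector alongside its action."""
    def __init__(self, obs_dim, msg_dim, n_actions):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim + msg_dim, 64), nn.ReLU())
        self.action_head = nn.Linear(64, n_actions)   # local action logits
        self.message_head = nn.Linear(64, msg_dim)    # message sent to the teammate

    def forward(self, obs, msg_in):
        h = self.encoder(torch.cat([obs, msg_in], dim=-1))
        return self.action_head(h), torch.tanh(self.message_head(h))

class CentralCritic(nn.Module):
    """Sees both agents' observations and actions; used only during training."""
    def __init__(self, joint_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(joint_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, joint):
        return self.net(joint)
```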

Acoustic Scene Classification Based on a Large-margin Factorized CNN

Title Acoustic Scene Classification Based on a Large-margin Factorized CNN
Authors Janghoon Cho, Sungrack Yun, Hyoungwoo Park, Jungyun Eum, Kyuwoong Hwang
Abstract In this paper, we present an acoustic scene classification framework based on a large-margin factorized convolutional neural network (CNN). We adopt the factorized CNN to learn the patterns in the time-frequency domain by factorizing the 2D kernel into two separate 1D kernels. The factorized kernels make it possible to learn the main components of two patterns: the long-term ambient and short-term event sounds, which are the key patterns for acoustic scene classification. In training our model, we use a loss function based on triplet sampling such that the distance between samples of the same audio scene recorded in different environments is minimized, while the distance between samples of different audio scenes is simultaneously maximized. With this loss function, the samples from the same audio scene are clustered independently of the environment, and thus we obtain a classifier with better generalization ability in unseen environments. We evaluated our audio scene classification framework using the dataset of the DCASE 2019 challenge Task 1A. Experimental results show that the proposed algorithm improves the performance of the baseline network and reduces the number of parameters to one third. Furthermore, the performance gain is higher on unseen data, which shows that the proposed algorithm has better generalization ability.
Tasks Acoustic Scene Classification, Scene Classification
Published 2019-10-14
URL https://arxiv.org/abs/1910.06784v1
PDF https://arxiv.org/pdf/1910.06784v1.pdf
PWC https://paperswithcode.com/paper/acoustic-scene-classification-based-on-a
Repo
Framework
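A hedged sketch of the two ingredients the abstract names: a 2D kernel factorized into a 1D time kernel and a 1D frequency kernel, and a triplet margin loss that pulls together embeddings of the same scene recorded in different environments. Kernel size, channel counts and margin are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class FactorizedConvBlock(nn.Module):
    def __init__(self, c_in, c_out, k=9):
        super().__init__()
        # (1 x k) kernel spans time, (k x 1) kernel spans frequency
        self.time_conv = nn.Conv2d(c_in, c_out, kernel_size=(1, k), padding=(0, k // 2))
        self.freq_conv = nn.Conv2d(c_out, c_out, kernel_size=(k, 1), padding=(k // 2, 0))
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, x):                      # x: (batch, c_in, freq, time)
        return torch.relu(self.bn(self.freq_conv(self.time_conv(x))))

triplet_loss = nn.TripletMarginLoss(margin=1.0)
# anchor/positive: same scene from different environments; negative: a different scene
# loss = triplet_loss(emb_anchor, emb_positive, emb_negative)
```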

Residual Attention Graph Convolutional Network for Geometric 3D Scene Classification

Title Residual Attention Graph Convolutional Network for Geometric 3D Scene Classification
Authors Albert Mosella-Montoro, Javier Ruiz-Hidalgo
Abstract Geometric 3D scene classification is a very challenging task. Current methodologies extract the geometric information using only the depth channel provided by an RGB-D sensor. These methodologies can introduce errors due to the missing local geometric context in the depth channel. This work proposes a novel Residual Attention Graph Convolutional Network that exploits the intrinsic geometric context inside a 3D space without using any kind of point features, allowing the use of organized or unorganized 3D data. Experiments are carried out on the NYU Depth v1 and SUN-RGBD datasets to study different configurations and to demonstrate the effectiveness of the proposed method. Experimental results show that the proposed method outperforms the current state of the art in geometric 3D scene classification tasks.
Tasks Scene Classification
Published 2019-09-30
URL https://arxiv.org/abs/1909.13470v1
PDF https://arxiv.org/pdf/1909.13470v1.pdf
PWC https://paperswithcode.com/paper/residual-attention-graph-convolutional
Repo
Framework
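The abstract does not spell out the layer, so the following is only a generic residual graph-attention convolution over 3D points, meant to illustrate the kind of operation involved (attention weights computed from relative positions, attention-weighted neighbor aggregation, residual connection); it is not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class ResidualGraphAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(3, dim), nn.Tanh())  # scores from relative xyz
        self.proj = nn.Linear(dim, dim)

    def forward(self, feats, pos, neighbors):
        # feats: (N, dim) point features, pos: (N, 3) coordinates,
        # neighbors: (N, K) long tensor of neighbor indices
        rel = pos[neighbors] - pos[:, None, :]          # (N, K, 3) relative positions
        weights = torch.softmax(self.attn(rel), dim=1)  # (N, K, dim) attention weights
        agg = (weights * feats[neighbors]).sum(dim=1)   # attention-weighted aggregation
        return feats + self.proj(agg)                   # residual connection
```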

Receptive-field-regularized CNN variants for acoustic scene classification

Title Receptive-field-regularized CNN variants for acoustic scene classification
Authors Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer
Abstract Acoustic scene classification and related tasks have been dominated by Convolutional Neural Networks (CNNs). Top-performing CNNs use mainly audio spectrograms as input and borrow their architectural design primarily from computer vision. A recent study has shown that restricting the receptive field (RF) of CNNs in appropriate ways is crucial for their performance, robustness and generalization in audio tasks. One side effect of restricting the RF of CNNs is that more frequency information is lost. In this paper, we first perform a systematic investigation of different RF configurations for various CNN architectures on the DCASE 2019 Task 1.A dataset. Second, we introduce Frequency-Aware CNNs to compensate for the lack of frequency information caused by the restricted RF, and experimentally determine if and in what RF ranges they yield additional improvement. The results of these investigations are several well-performing submissions to different tasks in the DCASE 2019 Challenge.
Tasks Acoustic Scene Classification, Scene Classification
Published 2019-09-05
URL https://arxiv.org/abs/1909.02859v1
PDF https://arxiv.org/pdf/1909.02859v1.pdf
PWC https://paperswithcode.com/paper/receptive-field-regularized-cnn-variants-for
Repo
Framework
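One plausible reading of "Frequency-Aware CNNs", sketched under that assumption rather than taken from the authors' code: append a normalized frequency-coordinate channel to the input spectrogram so that filters with a restricted receptive field still know which frequency band they cover.

```python
import torch

def add_frequency_channel(spec):
    """spec: (batch, channels, freq_bins, time_frames) log-mel spectrogram."""
    b, _, f, t = spec.shape
    freq_coord = torch.linspace(-1.0, 1.0, f, device=spec.device)
    freq_coord = freq_coord.view(1, 1, f, 1).expand(b, 1, f, t)
    return torch.cat([spec, freq_coord], dim=1)  # one extra, frequency-aware channel
```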

Quantum-Inspired Hamiltonian Monte Carlo for Bayesian Sampling

Title Quantum-Inspired Hamiltonian Monte Carlo for Bayesian Sampling
Authors Ziming Liu, Zheng Zhang
Abstract Hamiltonian Monte Carlo (HMC) is an efficient Bayesian sampling method that can make distant proposals in the parameter space by simulating a Hamiltonian dynamical system. Despite its popularity in machine learning and data science, HMC is inefficient at sampling from spiky and multimodal distributions. Motivated by the energy-time uncertainty relation from quantum mechanics, we propose a Quantum-Inspired Hamiltonian Monte Carlo algorithm (QHMC). This algorithm allows a particle to have a random mass with a probability distribution rather than a fixed mass. We prove the convergence property of QHMC in the spatial domain and in the time sequence. We further show why such a random mass can improve the performance when we sample a broad class of distributions. In order to handle the big training data sets in large-scale machine learning, we develop a stochastic gradient version of QHMC using the Nosé-Hoover thermostat, called QSGNHT, and we also provide theoretical justifications about its steady-state distributions. Finally, in the experiments, we demonstrate the effectiveness of QHMC and QSGNHT on synthetic examples, bridge regression, image denoising and neural network pruning. The proposed QHMC and QSGNHT can indeed achieve much more stable and accurate sampling results on the test cases.
Tasks Denoising, Image Denoising, Network Pruning
Published 2019-12-04
URL https://arxiv.org/abs/1912.01937v1
PDF https://arxiv.org/pdf/1912.01937v1.pdf
PWC https://paperswithcode.com/paper/quantum-inspired-hamiltonian-monte-carlo-for
Repo
Framework
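A minimal sketch of the quantum-inspired idea as described in the abstract: standard HMC leapfrog proposals, except that the particle mass is redrawn from a distribution at every proposal instead of being fixed. The log-normal mass distribution, step size and trajectory length here are illustrative choices, not the paper's.

```python
import numpy as np

def qhmc_step(q, grad_log_p, log_p, step=0.05, n_leapfrog=20, rng=np.random):
    m = np.exp(rng.normal(0.0, 1.0))          # random mass for this proposal
    p = rng.normal(0.0, np.sqrt(m), size=q.shape)
    q_new, p_new = q.copy(), p.copy()
    for _ in range(n_leapfrog):               # leapfrog integration
        p_new += 0.5 * step * grad_log_p(q_new)
        q_new += step * p_new / m
        p_new += 0.5 * step * grad_log_p(q_new)
    # Metropolis accept/reject with kinetic energy p^2 / (2m)
    h_old = -log_p(q) + 0.5 * np.sum(p ** 2) / m
    h_new = -log_p(q_new) + 0.5 * np.sum(p_new ** 2) / m
    return q_new if np.log(rng.uniform()) < h_old - h_new else q
```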

Emotion Recognition Using Wearables: A Systematic Literature Review - Work in progress

Title Emotion Recognition Using Wearables: A Systematic Literature Review - Work in progress
Authors Stanisław Saganowski, Anna Dutkowiak, Adam Dziadek, Maciej Dzieżyc, Joanna Komoszyńska, Weronika Michalska, Adam Polak, Michał Ujma, Przemysław Kazienko
Abstract Wearables like smartwatches or wrist bands equipped with pervasive sensors enable us to monitor our physiological signals. In this study, we address the question of whether they can help us recognize our emotions in everyday life, in the context of ubiquitous computing. Through a systematic literature review, we identified the crucial research steps and discussed the main limitations and problems in the domain.
Tasks Emotion Recognition
Published 2019-12-22
URL https://arxiv.org/abs/1912.10528v2
PDF https://arxiv.org/pdf/1912.10528v2.pdf
PWC https://paperswithcode.com/paper/emotion-recognition-using-wearables-a
Repo
Framework

Improving Semantic Parsing for Task Oriented Dialog

Title Improving Semantic Parsing for Task Oriented Dialog
Authors Arash Einolghozati, Panupong Pasupat, Sonal Gupta, Rushin Shah, Mrinal Mohit, Mike Lewis, Luke Zettlemoyer
Abstract Semantic parsing using hierarchical representations has recently been proposed for task-oriented dialog with promising results [Gupta et al. 2018]. In this paper, we present three different improvements to the model: contextualized embeddings, ensembling, and pairwise re-ranking based on a language model. We taxonomize the errors possible for the hierarchical representation, such as wrong top intent, missing spans or split spans, and show that the three approaches correct different kinds of errors. The best model combines the three techniques and gives 6.4% better exact match accuracy than the state of the art, with an error reduction of 33%, resulting in a new state-of-the-art result on the Task Oriented Parsing (TOP) dataset.
Tasks Language Modelling, Semantic Parsing
Published 2019-02-15
URL http://arxiv.org/abs/1902.06000v1
PDF http://arxiv.org/pdf/1902.06000v1.pdf
PWC https://paperswithcode.com/paper/improving-semantic-parsing-for-task-oriented
Repo
Framework
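The pairwise re-ranking scheme itself is not detailed in the abstract; as a rough illustration only, here is the simplest way to combine n-best parser scores with a language-model score over the linearized parse. The interpolation weight and the `lm_score` callable are hypothetical.

```python
def rerank(candidates, lm_score, alpha=0.5):
    """candidates: list of (linearized_parse, parser_score); lm_score: callable."""
    return max(candidates,
               key=lambda c: alpha * c[1] + (1 - alpha) * lm_score(c[0]))
```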

FLARe: Forecasting by Learning Anticipated Representations

Title FLARe: Forecasting by Learning Anticipated Representations
Authors Surya Teja Devarakonda, Joie Yeahuay Wu, Yi Ren Fung, Madalina Fiterau
Abstract Computational models that forecast the progression of Alzheimer’s disease at the patient level are extremely useful tools for identifying high-risk cohorts for early intervention and treatment planning. The state-of-the-art work in this area proposes models that forecast by using latent representations extracted from the longitudinal data across multiple modalities, including volumetric information extracted from medical scans and demographic information. These models incorporate the time horizon, which is the amount of time between the last recorded visit and the future visit, by directly concatenating a representation of it to the data latent representation. In this paper, we present a model which generates a sequence of latent representations of the patient status across the time horizon, providing more informative modeling of the temporal relationships between the patient’s history and future visits. Our proposed model outperforms the baseline in terms of forecasting accuracy and F1 score with the added benefit of robustly handling missing visits.
Tasks
Published 2019-04-17
URL https://arxiv.org/abs/1904.08930v2
PDF https://arxiv.org/pdf/1904.08930v2.pdf
PWC https://paperswithcode.com/paper/flare-forecasting-by-learning-anticipated
Repo
Framework
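A hedged sketch of the architectural idea only, assuming a GRU-style recurrence: instead of concatenating the time horizon onto a single latent, the patient latent is unrolled step by step across the horizon before decoding the forecast. Dimensions and the choice of recurrence are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class LatentUnroller(nn.Module):
    def __init__(self, latent_dim):
        super().__init__()
        self.cell = nn.GRUCell(latent_dim, latent_dim)

    def forward(self, z0, horizon_steps):
        # z0: (batch, latent_dim) latent summarizing the patient's history
        z = z0
        for _ in range(horizon_steps):   # one recurrent step per unit of time horizon
            z = self.cell(z0, z)         # next latent, conditioned on the history latent
        return z                         # latent at the forecast target time
```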

Capsule and convolutional neural network-based SAR ship classification in Sentinel-1 data

Title Capsule and convolutional neural network-based SAR ship classification in Sentinel-1 data
Authors Leonardo De Laurentiis, Andrea Pomente, Fabio Del Frate, Giovanni Schiavon
Abstract Synthetic Aperture Radar (SAR) constitutes a fundamental asset for wide-area monitoring with high-resolution requirements. The first SAR sensors gave rise to coarse coastal and maritime monitoring applications, including oil spill, ship and ice floe detection. With the upgrade to very high-resolution sensors in recent years, through relatively new SAR missions such as Sentinel-1, a great deal of data with stronger information content has been released, enabling more refined studies of general target features and thus permitting complex classifications such as ship classification, which has become increasingly relevant given the growing need for coastal surveillance in the commercial and military segments. In the last decade, several works on this topic have been presented, generally based on radiometric feature processing; in very recent years, a significant amount of research has focused on emerging deep learning techniques, in particular Convolutional Neural Networks (CNNs). Recently, Capsule Neural Networks (CapsNets) have been presented, demonstrating a notable improvement in capturing the properties of given entities and in exploiting spatial information, in particular the spatial dependence between features, which is severely lacking in CNNs. In fact, the pooling operations of CNNs have been criticized for losing spatial relations, so special capsules, along with a new iterative routing-by-agreement mechanism, have been proposed. In this work, a comparison between the potential of Capsule and convolutional networks in the ship classification application domain is shown by leveraging OpenSARShip, a Sentinel-1 SAR ship chip dataset; in particular, a performance comparison between capsule and various convolutional architectures is carried out, demonstrating the better performance of CapsNets in classifying ships within a small dataset.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.05401v1
PDF https://arxiv.org/pdf/1910.05401v1.pdf
PWC https://paperswithcode.com/paper/capsule-and-convolutional-neural-network
Repo
Framework
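For context on the capsule ingredient the abstract refers to, this is the standard "squash" non-linearity from the original capsule-network formulation, not code from this paper: it preserves a capsule vector's direction while compressing its length into [0, 1) so the length can act as an existence probability.

```python
import torch

def squash(s, dim=-1, eps=1e-8):
    # ||s||^2 / (1 + ||s||^2) * s / ||s||
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)
```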

One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation

Title One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation
Authors Matthew Shunshi Zhang, Bradly Stadie
Abstract Recent advances in the sparse neural network literature have made it possible to prune many large feed-forward and convolutional networks with only a small quantity of data. Yet, these same techniques often falter when applied to the problem of recovering sparse recurrent networks. These failures are quantitative: when pruned with recent techniques, RNNs typically obtain worse performance than they do under a simple random pruning scheme. The failures are also qualitative: the distribution of active weights in a pruned LSTM or GRU network tends to be concentrated in specific neurons and gates, and not well dispersed across the entire architecture. We seek to rectify both the quantitative and qualitative issues with recurrent network pruning by introducing a new recurrent pruning objective derived from the spectrum of the recurrent Jacobian. Our objective is data efficient (requiring only 64 data points to prune the network), easy to implement, and produces 95% sparse GRUs that significantly improve on existing baselines. We evaluate on sequential MNIST, Billion Words, and Wikitext.
Tasks Network Pruning
Published 2019-11-30
URL https://arxiv.org/abs/1912.00120v1
PDF https://arxiv.org/pdf/1912.00120v1.pdf
PWC https://paperswithcode.com/paper/one-shot-pruning-of-recurrent-neural-networks-1
Repo
Framework
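A sketch of the quantity the pruning objective is derived from, not the pruning score itself: the Jacobian of one recurrent step with respect to the previous hidden state, and its singular-value spectrum, computed here for a GRU cell with made-up sizes.

```python
import torch
import torch.nn as nn

cell = nn.GRUCell(input_size=32, hidden_size=64)
x = torch.randn(32)     # a single input frame
h = torch.randn(64)     # previous hidden state

# One recurrent step as a function of the previous hidden state only.
step = lambda h_prev: cell(x.unsqueeze(0), h_prev.unsqueeze(0)).squeeze(0)

J = torch.autograd.functional.jacobian(step, h)   # (64, 64) recurrent Jacobian
spectrum = torch.linalg.svdvals(J)                # singular values of the Jacobian
```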

LOGAN: Latent Optimisation for Generative Adversarial Networks

Title LOGAN: Latent Optimisation for Generative Adversarial Networks
Authors Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap
Abstract Training generative adversarial networks requires balancing of delicate adversarial dynamics. Even with careful tuning, training may diverge or end up in a bad equilibrium with dropped modes. In this work, we introduce a new form of latent optimisation inspired by the CS-GAN and show that it improves adversarial dynamics by enhancing interactions between the discriminator and the generator. We develop supporting theoretical analysis from the perspectives of differentiable games and stochastic approximation. Our experiments demonstrate that latent optimisation can significantly improve GAN training, obtaining state-of-the-art performance for the ImageNet (128 x 128) dataset. Our model achieves an Inception Score (IS) of 148 and a Fréchet Inception Distance (FID) of 3.4, an improvement of 17% and 32% in IS and FID respectively, compared with the baseline BigGAN-deep model with the same architecture and number of parameters.
Tasks Conditional Image Generation, Image Generation
Published 2019-12-02
URL https://arxiv.org/abs/1912.00953v1
PDF https://arxiv.org/pdf/1912.00953v1.pdf
PWC https://paperswithcode.com/paper/logan-latent-optimisation-for-generative-1
Repo
Framework
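A sketch of the basic gradient-descent form of latent optimisation (the CS-GAN-style step the abstract builds on): before the usual discriminator and generator updates, the latent is nudged in the direction that raises the discriminator's score of the generated batch. LOGAN's natural-gradient refinement is omitted, and the step size here is an arbitrary placeholder.

```python
import torch

def optimise_latent(G, D, z, step_size=0.9):
    z = z.clone().requires_grad_(True)
    score = D(G(z)).sum()                     # discriminator score of the fake batch
    grad_z, = torch.autograd.grad(score, z)   # gradient of the score w.r.t. the latent
    return (z + step_size * grad_z).detach()  # improved latent used for both updates
```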

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks

Title DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks
Authors Ao Ren, Tao Zhang, Yuhao Wang, Sheng Lin, Peiyan Dong, Yen-kuang Chen, Yuan Xie, Yanzhi Wang
Abstract The rapidly growing parameter volume of deep neural networks (DNNs) hinders artificial intelligence applications on resource-constrained devices, such as mobile and wearable devices. Neural network pruning, as one of the mainstream model compression techniques, is under extensive study as a way to reduce the number of parameters and computations. In contrast to irregular pruning, which incurs high index storage and decoding overhead, structured pruning techniques have been proposed as promising solutions. However, prior studies on structured pruning tackle the problem mainly from the perspective of facilitating hardware implementation, without analyzing the characteristics of sparse neural networks. This neglect causes an inefficient trade-off between regularity and pruning ratio, so the potential of structurally pruning neural networks is not sufficiently exploited. In this work, we examine the structural characteristics of irregularly pruned weight matrices, such as the diverse redundancy of different rows, the sensitivity of different rows to pruning, and the positional characteristics of retained weights. Leveraging the gained insights as guidance, we first propose the novel block-max weight masking (BMWM) method, which can effectively retain the salient weights while imposing high regularity on the weight matrix. As a further optimization, we propose density-adaptive regular-block (DARB) pruning, which outperforms prior structured pruning work in pruning ratio and decoding efficiency. Our experimental results show that DARB can achieve a 13$\times$ to 25$\times$ pruning ratio, a 2.8$\times$ to 4.3$\times$ improvement over state-of-the-art counterparts on multiple neural network models and tasks. Moreover, DARB achieves 14.3$\times$ higher decoding efficiency than block pruning at a higher pruning ratio.
Tasks Model Compression, Network Pruning
Published 2019-11-19
URL https://arxiv.org/abs/1911.08020v2
PDF https://arxiv.org/pdf/1911.08020v2.pdf
PWC https://paperswithcode.com/paper/darb-a-density-aware-regular-block-pruning
Repo
Framework
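One reading of block-max weight masking (BMWM), offered as a hedged illustration rather than the paper's implementation: each row of a weight matrix is split into fixed-size blocks and only the largest-magnitude weight in each block survives. The density-adaptive block sizing that gives DARB its name is not shown.

```python
import numpy as np

def block_max_mask(W, block=4):
    """Keep only the largest-magnitude weight in each fixed-size block of every row."""
    rows, cols = W.shape
    assert cols % block == 0
    blocks = np.abs(W).reshape(rows, cols // block, block)
    keep = blocks.argmax(axis=-1)                        # survivor index per block
    mask = np.zeros_like(blocks)
    np.put_along_axis(mask, keep[..., None], 1.0, axis=-1)
    return W * mask.reshape(rows, cols)
```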

Can we trust deep learning models diagnosis? The impact of domain shift in chest radiograph classification

Title Can we trust deep learning models diagnosis? The impact of domain shift in chest radiograph classification
Authors Eduardo H. P. Pooch, Pedro L. Ballester, Rodrigo C. Barros
Abstract While deep learning models become more widespread, their ability to handle unseen data and generalize to any scenario is yet to be challenged. In medical imaging, there is high heterogeneity among the distributions of images, depending on the equipment that generates them and its parametrization. This heterogeneity triggers a common issue in machine learning called domain shift, the difference between the distribution of the training data and the distribution of the data on which a model is deployed. A large domain shift tends to result in poor model performance. In this work, we evaluate the extent of domain shift on three of the largest datasets of chest radiographs. We show how training and testing with different datasets (e.g. training on ChestX-ray14 and testing on CheXpert) drastically affects model performance, posing a big question over the reliability of deep learning models.
Tasks
Published 2019-09-03
URL https://arxiv.org/abs/1909.01940v1
PDF https://arxiv.org/pdf/1909.01940v1.pdf
PWC https://paperswithcode.com/paper/can-we-trust-deep-learning-models-diagnosis
Repo
Framework

Cross-Channel Intragroup Sparsity Neural Network

Title Cross-Channel Intragroup Sparsity Neural Network
Authors Zhilin Yu, Chao Wang, Qing Wu, Yong Zhao, Xundong Wu
Abstract Modern deep neural network models generally build upon heavy over-parameterization for their exceptional performance. Network pruning is one approach often employed to obtain less demanding models for deployment. Fine-grained pruning, while it can achieve a good model compression ratio, introduces irregularity into the computation data flow and often does not improve model inference efficiency. Coarse-grained pruning, while it allows good inference speed by removing network weights in whole groups, for example an entire filter, can lead to significant deterioration in model performance. In this study, we introduce the cross-channel intragroup (CCI) sparsity structure, which avoids the inference inefficiency of fine-grained pruning while maintaining outstanding model performance.
Tasks Model Compression, Network Pruning
Published 2019-10-26
URL https://arxiv.org/abs/1910.11971v1
PDF https://arxiv.org/pdf/1910.11971v1.pdf
PWC https://paperswithcode.com/paper/cross-channel-intragroup-sparsity-neural
Repo
Framework

Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion

Title Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion
Authors Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose M. Alvarez, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz
Abstract We introduce DeepInversion, a new method for synthesizing images from the image distribution used to train a deep neural network. We ‘invert’ a trained network (teacher) to synthesize class-conditional input images starting from random noise, without using any additional information about the training dataset. Keeping the teacher fixed, our method optimizes the input while regularizing the distribution of intermediate feature maps using information stored in the batch normalization layers of the teacher. Further, we improve the diversity of synthesized images using Adaptive DeepInversion, which maximizes the Jensen-Shannon divergence between the teacher and student network logits. The resulting synthesized images from networks trained on the CIFAR-10 and ImageNet datasets demonstrate high fidelity and degree of realism, and help enable a new breed of data-free applications - ones that do not require any real images or labeled data. We demonstrate the applicability of our proposed method to three tasks of immense practical importance – (i) data-free network pruning, (ii) data-free knowledge transfer, and (iii) data-free continual learning.
Tasks Continual Learning, Network Pruning, Transfer Learning
Published 2019-12-18
URL https://arxiv.org/abs/1912.08795v1
PDF https://arxiv.org/pdf/1912.08795v1.pdf
PWC https://paperswithcode.com/paper/dreaming-to-distill-data-free-knowledge
Repo
Framework
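A simplified sketch of the two core loss terms described in the abstract, not the authors' released implementation: the synthesized batch is pushed toward a target class while its feature statistics at every BatchNorm layer are pulled toward that layer's running mean and variance. The Adaptive DeepInversion (Jensen-Shannon) term is omitted, and `teacher` and `targets` in the usage comment are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def register_bn_matching(model, losses):
    """Hooks that append a BN-statistic matching term to `losses` on each forward pass."""
    def hook(module, inputs, _output):
        x = inputs[0]
        mean = x.mean(dim=(0, 2, 3))
        var = x.var(dim=(0, 2, 3), unbiased=False)
        losses.append(F.mse_loss(mean, module.running_mean) +
                      F.mse_loss(var, module.running_var))
    return [m.register_forward_hook(hook) for m in model.modules()
            if isinstance(m, nn.BatchNorm2d)]

# usage sketch (clear `losses` before every forward pass):
# x = torch.randn(batch, 3, 224, 224, requires_grad=True)
# losses = []; handles = register_bn_matching(teacher, losses)
# loss = F.cross_entropy(teacher(x), targets) + sum(losses); loss.backward()
```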