Paper Group ANR 514
Smoke Sky – Exploring New Frontiers of Unmanned Aerial Systems for Wildland Fire Science and Applications. A New Clustering Method Based on Morphological Operations. Play and Prune: Adaptive Filter Pruning for Deep Model Compression. A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent. Vispi: Automatic Visual …
Smoke Sky – Exploring New Frontiers of Unmanned Aerial Systems for Wildland Fire Science and Applications
Title | Smoke Sky – Exploring New Frontiers of Unmanned Aerial Systems for Wildland Fire Science and Applications |
Authors | E. Natasha Stavros, Ali Agha, Allen Sirota, Marco Quadrelli, Kamak Ebadi, Kyongsik Yun |
Abstract | Wildfire has had increasing impacts on society as the climate changes and the wildland urban interface grows. As such, there is a demand for innovative solutions to help manage fire. Managing wildfire can include proactive fire management, such as prescribed burning within constrained areas, or advancements for reactive fire management (e.g., fire suppression). Because of the growing societal impact, the JPL BlueSky program sought to assess the current state of fire management and technology and determine areas with a high return on investment. To accomplish this, we met with the national interagency Unmanned Aerial System (UAS) Advisory Group (UASAG) and with leading technology transfer experts for fire science and management applications. We provide an overview of the current state as well as an analysis of the impact, maturity and feasibility of integrating different technologies that can be developed by JPL. Based on the findings, the highest return-on-investment technologies for fire management are, first, single micro-aerial vehicle (MAV) autonomy, autonomous sensing over fire, and the associated data and information system for active fire local environment mapping. Once this is completed for a single MAV, expanding the work to many MAVs in a swarm would require further investment in distributed MAV autonomy and MAV swarm mechanics, but could greatly expand the breadth of application over large fires. Key to investing in these technologies will be developing collaborations with the key influencers and champions of UAS technology in fire management. |
Tasks | |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.08288v1 |
https://arxiv.org/pdf/1911.08288v1.pdf | |
PWC | https://paperswithcode.com/paper/smoke-sky-exploring-new-frontiers-of-unmanned |
Repo | |
Framework | |
A New Clustering Method Based on Morphological Operations
Title | A New Clustering Method Based on Morphological Operations |
Authors | Zhenzhou Wang |
Abstract | With the booming development of data science, many clustering methods have been proposed. All clustering methods have inherent merits and deficiencies; therefore, each is only capable of clustering some specific types of data robustly. In addition, the accuracies of clustering methods rely heavily on the characteristics of the data. In this paper, we propose a new clustering method based on morphological operations. Morphological dilation is used to connect the data points based on their adjacency and form different connected domains. The iterative dilation process stops when the number of connected domains equals the number of clusters or when the maximum number of iterations is reached. Morphological dilation is then used to label the connected domains. Finally, each data point is assigned the label of the connected domain that contains its nearest point, as measured by Euclidean distance. We evaluate and compare the proposed method with state-of-the-art clustering methods on different types of data. Experimental results show that the proposed method is more robust and generic for clustering two-dimensional or three-dimensional data. |
Tasks | |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10548v1 |
https://arxiv.org/pdf/1905.10548v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-clustering-method-based-on |
Repo | |
Framework | |
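The dilate-until-merged procedure the paper above describes can be sketched in a few lines. The toy below is a hypothetical pure-Python version, not the author's implementation: the integer-grid snapping, 4-connectivity, and `max_iter` default are all illustrative assumptions.

```python
# Hypothetical sketch of dilation-based clustering (not the paper's code):
# snap points to grid cells, dilate until the desired number of connected
# domains remains, then label each point by the domain of its cell.

def neighbors(cell):
    x, y = cell
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def components(cells):
    """Return the 4-connected components of a set of grid cells."""
    seen, comps = set(), []
    for start in cells:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            c = stack.pop()
            if c in seen:
                continue
            seen.add(c)
            comp.add(c)
            stack.extend(n for n in neighbors(c) if n in cells)
        comps.append(comp)
    return comps

def morph_cluster(points, n_clusters, max_iter=50):
    cells = {(int(round(x)), int(round(y))) for x, y in points}
    for _ in range(max_iter):
        if len(components(cells)) <= n_clusters:
            break
        # Morphological dilation: grow every occupied cell by one step.
        cells = cells | {n for c in cells for n in neighbors(c)}
    label = {c: k for k, comp in enumerate(components(cells)) for c in comp}
    return [label[(int(round(x)), int(round(y)))] for x, y in points]
```

Well-separated groups already form the right number of domains and stop before any dilation; nearby groups are merged by successive dilations, which is the mechanism the abstract describes.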
Play and Prune: Adaptive Filter Pruning for Deep Model Compression
Title | Play and Prune: Adaptive Filter Pruning for Deep Model Compression |
Authors | Pravendra Singh, Vinay Kumar Verma, Piyush Rai, Vinay P. Namboodiri |
Abstract | While convolutional neural networks (CNN) have achieved impressive performance on various classification/recognition tasks, they typically consist of a massive number of parameters. This results in significant memory requirements as well as computational overheads. Consequently, there is a growing need for filter-level pruning approaches for compressing CNN based models that not only reduce the total number of parameters but also reduce the overall computation. We present a new min-max framework for filter-level pruning of CNNs. Our framework, called Play and Prune (PP), jointly prunes and fine-tunes CNN model parameters, with an adaptive pruning rate, while maintaining the model’s predictive performance. Our framework consists of two modules: (1) an adaptive filter pruning (AFP) module, which minimizes the number of filters in the model; and (2) a pruning rate controller (PRC) module, which maximizes the accuracy during pruning. Moreover, unlike most previous approaches, our approach allows directly specifying the desired error tolerance instead of the pruning level. Our compressed models can be deployed at run-time, without requiring any special libraries or hardware. Our approach reduces the number of parameters of VGG-16 by an impressive factor of 17.5X and the number of FLOPs by 6.43X, with no loss of accuracy, significantly outperforming other state-of-the-art filter pruning methods. |
Tasks | Model Compression |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.04446v1 |
https://arxiv.org/pdf/1905.04446v1.pdf | |
PWC | https://paperswithcode.com/paper/play-and-prune-adaptive-filter-pruning-for |
Repo | |
Framework | |
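As a concrete, heavily simplified illustration of filter-level pruning, the sketch below ranks convolutional filters by L1 norm and drops the weakest. The paper's AFP/PRC modules adapt the pruning rate and recover accuracy during training, which this toy does not attempt; the function name and the 50% rate are illustrative placeholders.

```python
# Illustrative magnitude-based filter pruning (a simplification of the
# paper's adaptive min-max scheme; the fraction to prune is a placeholder).
import numpy as np

def prune_filters(conv_weights, prune_frac=0.5):
    """Keep the conv filters with the largest L1 norms.

    conv_weights: array of shape (out_channels, in_channels, k, k).
    Returns (kept_weights, kept_indices).
    """
    norms = np.abs(conv_weights).reshape(conv_weights.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(conv_weights.shape[0] * (1 - prune_frac))))
    kept = np.sort(np.argsort(norms)[-n_keep:])  # surviving filter indices
    return conv_weights[kept], kept

# Four filters, two of which carry all the weight.
w = np.zeros((4, 3, 3, 3))
w[0] += 1.0
w[2] += 0.5
pruned, kept = prune_filters(w, prune_frac=0.5)
```

Dropping whole filters (rather than individual weights) is what makes this style of pruning reduce both parameters and FLOPs on stock hardware, as the abstract emphasizes.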
A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent
Title | A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent |
Authors | Eduard Gorbunov, Filip Hanzely, Peter Richtárik |
Abstract | In this paper we introduce a unified analysis of a large family of variants of proximal stochastic gradient descent (SGD) which have so far required different intuitions and convergence analyses, have different applications, and have been developed separately in various communities. We show that our framework includes methods with and without the following tricks, and their combinations: variance reduction, importance sampling, mini-batch sampling, quantization, and coordinate sub-sampling. As a by-product, we obtain the first unified theory of SGD and randomized coordinate descent (RCD) methods, the first unified theory of variance-reduced and non-variance-reduced SGD methods, and the first unified theory of quantized and non-quantized methods. A key to our approach is a parametric assumption on the iterates and stochastic gradients. In a single theorem we establish a linear convergence result under this assumption and strong quasi-convexity of the loss function. Whenever we recover an existing method as a special case, our theorem gives the best known complexity result. Our approach can be used to motivate the development of new useful methods, and offers pre-proved convergence guarantees. To illustrate the strength of our approach, we develop five new variants of SGD, and through numerical experiments demonstrate some of their properties. |
Tasks | Quantization |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11261v1 |
https://arxiv.org/pdf/1905.11261v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-theory-of-sgd-variance-reduction |
Repo | |
Framework | |
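The "one update rule, many gradient estimators" viewpoint can be evoked on a toy strongly convex quadratic: the same step x ← x − γg accepts either the full gradient or an unbiased coordinate-sub-sampled estimator. Everything below (step size, matrix, estimators) is an illustrative assumption; the paper's framework and analysis are far more general.

```python
# Toy illustration of the unified viewpoint: one update rule, x <- x - gamma*g,
# fed by interchangeable unbiased gradient estimators (full gradient vs.
# coordinate sub-sampling here). Not the paper's analysis, just its flavor.
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 4.0])          # f(x) = 0.5 * x^T A x, strongly convex

def coord_grad(x, i):
    """Unbiased coordinate-sub-sampled estimator: E_i[g] = A @ x."""
    g = np.zeros(3)
    g[i] = 3.0 * A[i, i] * x[i]       # scale by 1/p_i = 3 for unbiasedness
    return g

def run(estimator, steps=400, gamma=0.05):
    x = np.ones(3)
    for _ in range(steps):
        x = x - gamma * estimator(x)
    return float(np.linalg.norm(x))   # distance to the minimizer x* = 0

full = run(lambda x: A @ x)                            # plain gradient descent
coord = run(lambda x: coord_grad(x, rng.integers(3)))  # randomized coordinate step
```

Both estimators drive the iterate linearly toward the minimizer, which is the kind of single-theorem conclusion the unified analysis delivers across all its special cases.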
Vispi: Automatic Visual Perception and Interpretation of Chest X-rays
Title | Vispi: Automatic Visual Perception and Interpretation of Chest X-rays |
Authors | Xin Li, Rui Cao, Dongxiao Zhu |
Abstract | Medical imaging contains the essential information for rendering diagnostic and treatment decisions. Inspecting (visual perception) and interpreting an image to generate a report are tedious clinical routines for a radiologist, and automation is expected to greatly reduce the workload. Despite the rapid development of natural image captioning, computer-aided medical image visual perception and interpretation remain a challenging task, largely due to the lack of high-quality annotated image-report pairs and of tailor-made generative models for sufficient extraction and exploitation of localized semantic features, particularly those associated with abnormalities. To tackle these challenges, we present Vispi, an automatic medical image interpretation system, which first annotates an image by classifying and localizing common thoracic diseases with visual support, followed by report generation from an attentive LSTM model. Analyzing an open IU X-ray dataset, we demonstrate the superior performance of Vispi in disease classification, localization and report generation using the automatic evaluation metrics ROUGE and CIDEr. |
Tasks | Image Captioning |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05190v1 |
https://arxiv.org/pdf/1906.05190v1.pdf | |
PWC | https://paperswithcode.com/paper/vispi-automatic-visual-perception-and |
Repo | |
Framework | |
Optimal Attack against Autoregressive Models by Manipulating the Environment
Title | Optimal Attack against Autoregressive Models by Manipulating the Environment |
Authors | Yiding Chen, Xiaojin Zhu |
Abstract | We describe an optimal adversarial attack formulation against autoregressive time series forecast using Linear Quadratic Regulator (LQR). In this threat model, the environment evolves according to a dynamical system; an autoregressive model observes the current environment state and predicts its future values; an attacker has the ability to modify the environment state in order to manipulate future autoregressive forecasts. The attacker’s goal is to force autoregressive forecasts into tracking a target trajectory while minimizing its attack expenditure. In the white-box setting where the attacker knows the environment and forecast models, we present the optimal attack using LQR for linear models, and Model Predictive Control (MPC) for nonlinear models. In the black-box setting, we combine system identification and MPC. Experiments demonstrate the effectiveness of our attacks. |
Tasks | Adversarial Attack, Time Series |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00202v3 |
https://arxiv.org/pdf/1902.00202v3.pdf | |
PWC | https://paperswithcode.com/paper/optimal-adversarial-attack-on-autoregressive |
Repo | |
Framework | |
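A drastically simplified single-step version of the attack idea above: for an AR(1) forecaster ŷ = a·s, the attacker's per-step perturbation has a closed form. The paper solves the full multi-step problem with LQR/MPC; this greedy variant and its parameter names are purely illustrative.

```python
# One-step greedy environment manipulation against an AR(1) forecaster
# (illustrative only; the paper's LQR/MPC attack plans over a horizon).

def greedy_attack(states, a, target, lam):
    """For each observed state s, pick the perturbation d minimizing
    (a * (s + d) - target)**2 + lam * d**2 (forecast miss + attack cost)."""
    # Setting the derivative to zero: 2a(a(s+d) - target) + 2*lam*d = 0,
    # which gives d = a * (target - a*s) / (a*a + lam).
    return [a * (target - a * s) / (a * a + lam) for s in states]

# With no cost penalty (lam = 0) the forecast lands exactly on the target.
deltas = greedy_attack([1.0, 2.0], a=0.5, target=1.0, lam=0.0)
```

Raising `lam` models an expenditure-conscious attacker: the perturbation shrinks toward zero, trading forecast tracking against attack cost, which is the same trade-off the LQR objective encodes over a whole trajectory.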
Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning
Title | Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning |
Authors | Tao Jin, Siyu Huang, Yingming Li, Zhongfei Zhang |
Abstract | This paper addresses the challenging task of video captioning, which aims to generate descriptions for video data. Recently, attention-based encoder-decoder structures have been widely used in video captioning. In the existing literature, the attention weights are often built from the information of an individual modality, while the associations between multiple modalities are neglected. Motivated by this observation, we propose a video captioning model with High-Order Cross-Modal Attention (HOCA), where the attention weights are calculated from a high-order correlation tensor to sufficiently capture the frame-level cross-modal interactions between modalities. Furthermore, we introduce Low-Rank HOCA, which adopts tensor decomposition to reduce the extremely large space requirement of HOCA, leading to a practical and efficient implementation in real-world applications. Experimental results on two benchmark datasets, MSVD and MSR-VTT, show that Low-Rank HOCA establishes a new state-of-the-art. |
Tasks | Video Captioning |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00212v1 |
https://arxiv.org/pdf/1911.00212v1.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-hoca-efficient-high-order-cross-1 |
Repo | |
Framework | |
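The memory argument behind Low-Rank HOCA can be seen in a tiny sketch: a bilinear cross-modal score imgᵀ·W·txt with W = U·Vᵀ never needs the full d_img × d_txt matrix. The dimensions and the plain bilinear form below are illustrative assumptions; the paper factorizes higher-order correlation tensors over video frames.

```python
# Low-rank factorized bilinear score (illustrative of the space saving only;
# the paper decomposes higher-order cross-modal correlation tensors).
import numpy as np

rng = np.random.default_rng(0)
d_img, d_txt, rank = 8, 6, 3
U = rng.normal(size=(d_img, rank))
V = rng.normal(size=(d_txt, rank))

def lowrank_score(img_feat, txt_feat):
    # img^T (U V^T) txt, computed without forming the d_img x d_txt matrix.
    return float((U.T @ img_feat) @ (V.T @ txt_feat))

img = rng.normal(size=d_img)
txt = rng.normal(size=d_txt)
full = float(img @ (U @ V.T) @ txt)   # same score via the explicit matrix
```

The factored form stores rank·(d_img + d_txt) numbers instead of d_img·d_txt, and the gap grows much faster for the order-3 and higher tensors the paper targets.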
BERT-based Ranking for Biomedical Entity Normalization
Title | BERT-based Ranking for Biomedical Entity Normalization |
Authors | Zongcheng Ji, Qiang Wei, Hua Xu |
Abstract | Developing high-performance entity normalization algorithms that can alleviate the term variation problem is of great interest to the biomedical community. Although deep learning-based methods have been successfully applied to biomedical entity normalization, they often depend on traditional context-independent word embeddings. Bidirectional Encoder Representations from Transformers (BERT), BERT for Biomedical Text Mining (BioBERT) and BERT for Clinical Text Mining (ClinicalBERT) were recently introduced to pre-train contextualized word representation models using bidirectional Transformers, advancing the state-of-the-art for many natural language processing tasks. In this study, we proposed an entity normalization architecture by fine-tuning the pre-trained BERT / BioBERT / ClinicalBERT models and conducted extensive experiments to evaluate the effectiveness of the pre-trained models for biomedical entity normalization using three different types of datasets. Our experimental results show that the best fine-tuned models consistently outperformed previous methods and advanced the state-of-the-art for biomedical entity normalization, with up to 1.17% increase in accuracy. |
Tasks | Word Embeddings |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03548v1 |
https://arxiv.org/pdf/1908.03548v1.pdf | |
PWC | https://paperswithcode.com/paper/bert-based-ranking-for-biomedical-entity |
Repo | |
Framework | |
Biomimetic Ultra-Broadband Perfect Absorbers Optimised with Reinforcement Learning
Title | Biomimetic Ultra-Broadband Perfect Absorbers Optimised with Reinforcement Learning |
Authors | Trevon Badloe, Inki Kim, Junsuk Rho |
Abstract | By learning the optimal policy with a double deep Q-learning network (DDQN), we design ultra-broadband, biomimetic, perfect absorbers with various materials, based on the structure of a moth's eye. All absorbers achieve over 90% average absorption from 400 to 1,600 nm. By training a DDQN with moth-eye structures made of chromium, we transfer the learned knowledge to other, similar materials to quickly and efficiently find the optimal parameters from around 1 billion possible options. The knowledge learned from previous optimisations helps the network find the best solution for a new material in fewer steps, dramatically increasing the efficiency of finding designs with ultra-broadband absorption. |
Tasks | Q-Learning |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12465v1 |
https://arxiv.org/pdf/1910.12465v1.pdf | |
PWC | https://paperswithcode.com/paper/biomimetic-ultra-broadband-perfect-absorbers |
Repo | |
Framework | |
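The value-update rule underneath the paper's DDQN can be shown in its simplest tabular form. The chain-world toy below has nothing to do with absorber design, and its hyperparameters are arbitrary; only the Q-learning update line is the standard ingredient the paper builds on (with a deep network and a double estimator replacing the table).

```python
# Tabular Q-learning on a 5-state chain: the simplest form of the value
# update behind the paper's DDQN. Environment and hyperparameters are toys.
import random

random.seed(0)
n_states, n_actions = 5, 2            # action 0 = left, action 1 = right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(2000):                 # episodes
    s = 0
    while s != n_states - 1:          # terminal (rewarding) state on the right
        if random.random() < eps:     # epsilon-greedy exploration
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda act: Q[s][act])
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, the greedy policy at every non-terminal state is "go right".
greedy = [max(range(n_actions), key=lambda act: Q[s][act]) for s in range(n_states - 1)]
```

A DDQN swaps the table for a neural network and uses one network to select the argmax action and a second to evaluate it, which curbs the overestimation bias of plain deep Q-learning.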
Learning to Correspond Dynamical Systems
Title | Learning to Correspond Dynamical Systems |
Authors | Nam Hee Kim, Zhaoming Xie, Michiel van de Panne |
Abstract | Many dynamical systems exhibit similar structure, as often captured by hand-designed simplified models that can be used for analysis and control. We develop a method for learning to correspond pairs of dynamical systems via a learned latent dynamical system. Given trajectory data from two dynamical systems, we learn a shared latent state space and a shared latent dynamics model, along with an encoder-decoder pair for each of the original systems. With the learned correspondences in place, we can use a simulation of one system to produce an imagined motion of its counterpart. We can also simulate in the learned latent dynamics and synthesize the motions of both corresponding systems, as a form of bisimulation. We demonstrate the approach using pairs of controlled bipedal walkers, as well as by pairing a walker with a controlled pendulum. |
Tasks | |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03015v2 |
https://arxiv.org/pdf/1912.03015v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-correspond-dynamical-systems |
Repo | |
Framework | |
Context-Aware Zero-Shot Learning for Object Recognition
Title | Context-Aware Zero-Shot Learning for Object Recognition |
Authors | Eloi Zablocki, Patrick Bordes, Benjamin Piwowarski, Laure Soulier, Patrick Gallinari |
Abstract | Zero-Shot Learning (ZSL) aims at classifying unlabeled objects by leveraging auxiliary knowledge, such as semantic representations. A limitation of previous approaches is that only intrinsic properties of objects, e.g. their visual appearance, are taken into account while their context, e.g. the surrounding objects in the image, is ignored. Following the intuitive principle that objects tend to be found in certain contexts but not others, we propose a new and challenging approach, context-aware ZSL, that leverages semantic representations in a new way to model the conditional likelihood of an object to appear in a given context. Finally, through extensive experiments conducted on Visual Genome, we show that contextual information can substantially improve the standard ZSL approach and is robust to unbalanced classes. |
Tasks | Object Recognition, Zero-Shot Learning |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.12638v2 |
http://arxiv.org/pdf/1904.12638v2.pdf | |
PWC | https://paperswithcode.com/paper/190412638 |
Repo | |
Framework | |
CAG: A Real-time Low-cost Enhanced-robustness High-transferability Content-aware Adversarial Attack Generator
Title | CAG: A Real-time Low-cost Enhanced-robustness High-transferability Content-aware Adversarial Attack Generator |
Authors | Huy Phan, Yi Xie, Siyu Liao, Jie Chen, Bo Yuan |
Abstract | Deep neural networks (DNNs) are vulnerable to adversarial attack despite their tremendous success in many AI fields. An adversarial attack causes an intended misclassification by adding imperceptible perturbations to legitimate inputs, and researchers have developed numerous types of adversarial attack methods. However, from the perspective of practical deployment, these methods suffer from several drawbacks such as long attack generating time, high memory cost, insufficient robustness and low transferability. We propose a Content-aware Adversarial Attack Generator (CAG) to achieve real-time, low-cost, enhanced-robustness and high-transferability adversarial attack. First, as a type of generative-model-based attack, CAG shows a significant speedup (at least 500 times) in generating adversarial examples compared to state-of-the-art attacks such as PGD and C&W. CAG needs only a single generative model to perform a targeted attack on any target class. Because CAG encodes the label information into a trainable embedding layer, it differs from prior generative-model-based adversarial attacks that use $n$ different copies of generative models for $n$ different target classes. As a result, CAG significantly reduces the memory cost required for generating adversarial examples. CAG can generate adversarial perturbations that focus on the critical areas of the input by integrating class activation map information into the training process, hence improving the robustness of the CAG attack against state-of-the-art adversarial defenses. In addition, CAG exhibits high transferability across different DNN classifier models in the black-box attack scenario by introducing random dropout in the process of generating perturbations. Extensive experiments on different datasets and DNN models have verified the real-time, low-cost, enhanced-robustness, and high-transferability benefits of CAG. |
Tasks | Adversarial Attack |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07742v1 |
https://arxiv.org/pdf/1912.07742v1.pdf | |
PWC | https://paperswithcode.com/paper/cag-a-real-time-low-cost-enhanced-robustness |
Repo | |
Framework | |
Approximation of functions by neural networks
Title | Approximation of functions by neural networks |
Authors | Andreas Thom |
Abstract | We study the approximation of measurable functions on the hypercube by functions arising from affine neural networks. Our main achievement is an approximation of any measurable function $f \colon W_n \to [-1,1]$ up to a prescribed precision $\varepsilon>0$ by a bounded number of neurons, depending only on $\varepsilon$ and not on the function $f$ or $n \in \mathbb N$. |
Tasks | |
Published | 2019-01-29 |
URL | http://arxiv.org/abs/1901.10267v1 |
http://arxiv.org/pdf/1901.10267v1.pdf | |
PWC | https://paperswithcode.com/paper/approximation-of-functions-by-neural-networks |
Repo | |
Framework | |
Do We Need Neural Models to Explain Human Judgments of Acceptability?
Title | Do We Need Neural Models to Explain Human Judgments of Acceptability? |
Authors | Wang Jing, M. A. Kelly, David Reitter |
Abstract | Native speakers can judge whether a sentence is an acceptable instance of their language. Acceptability provides a means of evaluating whether computational language models are processing language in a human-like manner. We test the ability of computational language models, simple language features, and word embeddings to predict native English speakers’ judgments of acceptability on English-language essays written by non-native speakers. We find that much of the sentence acceptability variance can be captured by a combination of features including misspellings, word order, and word similarity (Pearson’s r = 0.494). While predictive neural models fit acceptability judgments well (r = 0.527), we find that a 4-gram model with statistical smoothing is just as good (r = 0.528). Thanks to incorporating a count of misspellings, our 4-gram model surpasses both the previous unsupervised state-of-the-art (Lau et al., 2015; r = 0.472) and the average non-expert native speaker (r = 0.46). Our results demonstrate that acceptability is well captured by n-gram statistics and simple language features. |
Tasks | Word Embeddings |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08663v2 |
https://arxiv.org/pdf/1909.08663v2.pdf | |
PWC | https://paperswithcode.com/paper/do-we-need-neural-models-to-explain-human |
Repo | |
Framework | |
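To make the n-gram baseline concrete, here is a hypothetical miniature: a bigram model with add-one smoothing scoring sentences by length-normalized log-probability. The paper uses a 4-gram model with proper statistical smoothing plus features such as misspelling counts; the tiny corpus, smoothing choice, and normalization below are illustrative only.

```python
# Miniature bigram acceptability scorer (a stand-in for the paper's smoothed
# 4-gram model; corpus and scoring details are illustrative).
import math
from collections import Counter

def train_bigram(corpus):
    bigrams, unigrams, vocab = Counter(), Counter(), set()
    for sent in corpus:
        words = ["<s>"] + sent.split() + ["</s>"]
        bigrams.update(zip(words, words[1:]))
        unigrams.update(words)
        vocab.update(words)
    return bigrams, unigrams, len(vocab)

def acceptability(sentence, bigrams, unigrams, vocab_size):
    words = ["<s>"] + sentence.split() + ["</s>"]
    lp = 0.0
    for prev, cur in zip(words, words[1:]):
        # Add-one (Laplace) smoothing so unseen bigrams get nonzero probability.
        lp += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return lp / len(words)            # crude length normalization

bg, ug, v = train_bigram(["the cat sat", "the dog sat", "the cat ran"])
good = acceptability("the cat sat", bg, ug, v)
bad = acceptability("sat the cat", bg, ug, v)
```

Even at this scale, a scrambled word order scores lower than a fluent one, which is the core signal the paper shows n-gram statistics can provide for acceptability prediction.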
Regression Under Human Assistance
Title | Regression Under Human Assistance |
Authors | Abir De, Paramita Koley, Niloy Ganguly, Manuel Gomez-Rodriguez |
Abstract | Decisions are increasingly taken by both humans and machine learning models. However, machine learning models are currently trained for full automation; they are not aware that some of the decisions may still be taken by humans. In this paper, we take a first step towards the development of machine learning models that are optimized to operate under different automation levels. More specifically, we first introduce the problem of ridge regression under human assistance and show that it is NP-hard. Then, we derive an alternative representation of the corresponding objective function as a difference of nondecreasing submodular functions. Building on this representation, we further show that the objective is nondecreasing and satisfies $\xi$-submodularity, a recently introduced notion of approximate submodularity. These properties allow a simple and efficient greedy algorithm to enjoy approximation guarantees for solving the problem. Experiments on synthetic and real-world data from two important applications, medical diagnosis and content moderation, demonstrate that the greedy algorithm beats several competitive baselines. |
Tasks | |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.02963v3 |
https://arxiv.org/pdf/1909.02963v3.pdf | |
PWC | https://paperswithcode.com/paper/regression-under-human-assistance |
Repo | |
Framework | |
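The problem setup lends itself to a small sketch: greedily choose which training samples the machine (a ridge regressor) should handle, leaving the rest to a simulated human with a known per-sample error. This is an illustrative greedy loop in the spirit of the analysis above, not the authors' algorithm or guarantees; the data, budget, and human error model are all made up.

```python
# Illustrative greedy outsourcing for ridge regression under human assistance
# (toy data and error model; not the paper's algorithm or its guarantees).
import numpy as np

rng = np.random.default_rng(1)
n, d = 20, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
human_err = rng.uniform(0.0, 0.5, size=n)   # simulated per-sample human error

def ridge_fit(idx, lam=0.1):
    Xs, ys = X[idx], y[idx]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(d), Xs.T @ ys)

def total_error(machine_idx):
    """Machine squared error on its samples plus human error on the rest."""
    rest = [i for i in range(n) if i not in machine_idx]
    if not machine_idx:
        return float(human_err.sum())
    idx = sorted(machine_idx)
    w = ridge_fit(idx)
    return float(((X[idx] @ w - y[idx]) ** 2).sum()) + float(human_err[rest].sum())

chosen = set()
for _ in range(10):   # machine budget: 10 of the 20 samples
    best = min((i for i in range(n) if i not in chosen),
               key=lambda i: total_error(chosen | {i}))
    chosen.add(best)
```

Because the machine fits the near-linear samples well, shifting part of the workload away from the noisier simulated human lowers the combined error, which is the kind of trade-off the greedy algorithm in the paper optimizes with provable approximation guarantees.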