January 27, 2020


Paper Group ANR 1323

Aesthetics of Neural Network Art. Road-network-based Rapid Geolocalization. Convolutional Neural Networks for Speech Controlled Prosthetic Hands. Generalisation in fully-connected neural networks for time series forecasting. General Method for Prime-point Cyclic Convolution over the Real Field. Automated Model Selection with Bayesian Quadrature. Ca …

Aesthetics of Neural Network Art


Title	Aesthetics of Neural Network Art
Authors	Aaron Hertzmann
Abstract	This paper proposes a way to understand neural network artworks as juxtapositions of natural image cues. It is hypothesized that images with unusual combinations of realistic visual cues are interesting, and that neural models trained to model natural images are well-suited to creating interesting images. Art using neural models produces new images similar to natural images, but with weird and intriguing variations. This analysis is applied to neural art based on Generative Adversarial Networks, image stylization, Deep Dreams, and Perception Engines.
Tasks	Image Stylization
Published	2019-03-13
URL	http://arxiv.org/abs/1903.05696v2
PDF	http://arxiv.org/pdf/1903.05696v2.pdf
PWC	https://paperswithcode.com/paper/aesthetics-of-neural-network-art
Repo
Framework

Road-network-based Rapid Geolocalization


Title	Road-network-based Rapid Geolocalization
Authors	Yongfei Li, Dongfang Yang, Shicheng Wang, Hao He
Abstract	It has always been a research hotspot to use geographic information to assist the navigation of unmanned aerial vehicles. In this paper, a road-network-based localization method is proposed. We match roads in the measurement images to the reference road vector map, and realize successful localization on areas as large as a whole city. The road network matching problem is treated as a point cloud registration problem under two-dimensional projective transformation, and solved under a hypothesise-and-test framework. To deal with the projective point cloud registration problem, a global projective invariant feature is proposed, which consists of two road intersections augmented with the information of their tangents. We call it two road intersections tuple. We deduce the closed-form solution for determining the alignment transformation from a pair of matching two road intersections tuples. In addition, we propose the necessary conditions for the tuples to match. This can reduce the candidate matching tuples, thus accelerating the search to a great extent. We test all the candidate matching tuples under a hypothesise-and-test framework to search for the best match. The experiments show that our method can localize the target area over an area of 400 within 1 second on a single cpu.
Tasks	Point Cloud Registration
Published	2019-06-25
URL	https://arxiv.org/abs/1906.12174v1
PDF	https://arxiv.org/pdf/1906.12174v1.pdf
PWC	https://paperswithcode.com/paper/road-network-based-rapid-geolocalization
Repo
Framework

Convolutional Neural Networks for Speech Controlled Prosthetic Hands


Title	Convolutional Neural Networks for Speech Controlled Prosthetic Hands
Authors	Mohsen Jafarzadeh, Yonas Tadesse
Abstract	Speech recognition is one of the key topics in artificial intelligence, as it is one of the most common forms of communication in humans. Researchers have developed many speech-controlled prosthetic hands in the past decades, utilizing conventional speech recognition systems that use a combination of neural network and hidden Markov model. Recent advancements in general-purpose graphics processing units (GPGPUs) enable intelligent devices to run deep neural networks in real-time. Thus, state-of-the-art speech recognition systems have rapidly shifted from the paradigm of composite subsystems optimization to the paradigm of end-to-end optimization. However, a low-power embedded GPGPU cannot run these speech recognition systems in real-time. In this paper, we show the development of deep convolutional neural networks (CNN) for speech control of prosthetic hands that run in real-time on a NVIDIA Jetson TX2 developer kit. First, the device captures and converts speech into 2D features (like spectrogram). The CNN receives the 2D features and classifies the hand gestures. Finally, the hand gesture classes are sent to the prosthetic hand motion control system. The whole system is written in Python with Keras, a deep learning library that has a TensorFlow backend. Our experiments on the CNN demonstrate the 91% accuracy and 2ms running time of hand gestures (text output) from speech commands, which can be used to control the prosthetic hands in real-time.
Tasks	Speech Recognition
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01918v1
PDF	https://arxiv.org/pdf/1910.01918v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-for-speech
Repo
Framework
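
The abstract above says the device converts speech into 2D features such as a spectrogram before the CNN classifies them. As a rough illustration of that first step only, here is a minimal pure-Python magnitude spectrogram. The function name, the frame length, the hop size, and the Hann window are all assumptions chosen for this sketch; the paper's actual feature extraction and Keras model are not reproduced here.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes (a 2D feature map).

    Hypothetical helper for illustration: frame_len, hop and the Hann window
    are assumed values, not settings taken from the paper.
    """
    # Symmetric Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        # Naive DFT over the non-negative frequency bins only.
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# A pure tone that completes 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
features = spectrogram(tone)
print(len(features), "frames x", len(features[0]), "frequency bins")
```

A real-time system would use an FFT rather than this quadratic-time DFT; the naive loop is kept here only so the sketch has no dependencies.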

Generalisation in fully-connected neural networks for time series forecasting


Title	Generalisation in fully-connected neural networks for time series forecasting
Authors	Anastasia Borovykh, Cornelis W. Oosterlee, Sander M. Bohte
Abstract	In this paper we study the generalization capabilities of fully-connected neural networks trained in the context of time series forecasting. Time series do not satisfy the typical assumption in statistical learning theory of the data being i.i.d. samples from some data-generating distribution. We use the input and weight Hessians, that is the smoothness of the learned function with respect to the input and the width of the minimum in weight space, to quantify a network’s ability to generalize to unseen data. While such generalization metrics have been studied extensively in the i.i.d. setting of for example image recognition, here we empirically validate their use in the task of time series forecasting. Furthermore we discuss how one can control the generalization capability of the network by means of the training process using the learning rate, batch size and the number of training iterations as controls. Using these hyperparameters one can efficiently control the complexity of the output function without imposing explicit constraints.
Tasks	Time Series, Time Series Forecasting
Published	2019-02-14
URL	https://arxiv.org/abs/1902.05312v2
PDF	https://arxiv.org/pdf/1902.05312v2.pdf
PWC	https://paperswithcode.com/paper/generalisation-in-fully-connected-neural
Repo
Framework

General Method for Prime-point Cyclic Convolution over the Real Field


Title	General Method for Prime-point Cyclic Convolution over the Real Field
Authors	Qi Cai, Tsung-Ching Lin, Yuanxin Wu, Wenxian Yu, Trieu-Kien Truong
Abstract	A general and fast method is conceived for computing the cyclic convolution of n points, where n is a prime number. This method fully exploits the internal structure of the cyclic matrix, and hence leads to significant reduction of the multiplication complexity in terms of CPU time by 50%, as compared with Winograd’s algorithm. In this paper, we only consider the real and complex fields due to their most important applications, but in general, the idea behind this method can be extended to any finite field of interest. Clearly, it is well-known that the discrete Fourier transform (DFT) can be expressed in terms of cyclic convolution, so it can be utilized to compute the DFT when the block length is a prime.
Tasks
Published	2019-05-09
URL	https://arxiv.org/abs/1905.03398v1
PDF	https://arxiv.org/pdf/1905.03398v1.pdf
PWC	https://paperswithcode.com/paper/190503398
Repo
Framework
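
For readers unfamiliar with the operation the abstract above refers to, the following is a minimal sketch of a direct n-point cyclic convolution written as a product with a circulant ("cyclic") matrix. It is the textbook O(n^2) definition, not the fast method the paper proposes and not Winograd's algorithm; the function names are hypothetical.

```python
def circulant(h):
    """Build the n x n cyclic matrix C with C[k][j] = h[(k - j) mod n]."""
    n = len(h)
    return [[h[(k - j) % n] for j in range(n)] for k in range(n)]


def cyclic_convolution(x, h):
    """Direct cyclic convolution: y[k] = sum_j x[j] * h[(k - j) mod n]."""
    if len(x) != len(h):
        raise ValueError("both inputs must have the same length")
    matrix = circulant(h)
    n = len(x)
    return [sum(matrix[k][j] * x[j] for j in range(n)) for k in range(n)]


# n = 5 is prime, the case the paper targets.
x = [1, 2, 3, 4, 5]
h = [5, 4, 3, 2, 1]
print(cyclic_convolution(x, h))
```

Every row of the circulant matrix is a rotation of the same n values, which is the kind of internal structure a fast algorithm can exploit to reduce the number of multiplications.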

Automated Model Selection with Bayesian Quadrature


Title	Automated Model Selection with Bayesian Quadrature
Authors	Henry Chai, Jean-Francois Ton, Roman Garnett, Michael A. Osborne
Abstract	We present a novel technique for tailoring Bayesian quadrature (BQ) to model selection. The state-of-the-art for comparing the evidence of multiple models relies on Monte Carlo methods, which converge slowly and are unreliable for computationally expensive models. Previous research has shown that BQ offers sample efficiency superior to Monte Carlo in computing the evidence of an individual model. However, applying BQ directly to model comparison may waste computation producing an overly-accurate estimate for the evidence of a clearly poor model. We propose an automated and efficient algorithm for computing the most-relevant quantity for model selection: the posterior probability of a model. Our technique maximizes the mutual information between this quantity and observations of the models’ likelihoods, yielding efficient acquisition of samples across disparate model spaces when likelihood observations are limited. Our method produces more-accurate model posterior estimates using fewer model likelihood evaluations than standard Bayesian quadrature and Monte Carlo estimators, as we demonstrate on synthetic and real-world examples.
Tasks	Model Selection
Published	2019-02-26
URL	http://arxiv.org/abs/1902.09724v3
PDF	http://arxiv.org/pdf/1902.09724v3.pdf
PWC	https://paperswithcode.com/paper/automated-model-selection-with-bayesian
Repo
Framework
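
The quantity the abstract above singles out, the posterior probability of a model, follows from Bayes' rule once each model's evidence is known: p(M_i | D) is proportional to Z_i * p(M_i). The sketch below shows only that final normalization step, assuming the log evidences have already been estimated by some method (Bayesian quadrature, Monte Carlo, or otherwise). It does not implement the paper's acquisition strategy, and the function name is hypothetical.

```python
import math


def model_posteriors(log_evidences, priors=None):
    """Normalize per-model log evidences into posterior model probabilities.

    Hypothetical helper: assumes the log evidences are already available.
    Works in log space with a max-shift so tiny evidences do not underflow.
    """
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n  # assumed uniform prior over models
    log_joint = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    shift = max(log_joint)
    weights = [math.exp(v - shift) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Three candidate models with very small evidences.
print(model_posteriors([-1050.0, -1052.0, -1060.0]))
```

Because the posterior depends only on evidence ratios, a clearly poor model needs only a coarse evidence estimate, which is the inefficiency in naive per-model estimation that the abstract points out.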

Can Sentiment Analysis Reveal Structure in a Plotless Novel?


Title	Can Sentiment Analysis Reveal Structure in a Plotless Novel?
Authors	Kathrine Elkins, Jon Chun
Abstract	Modernist novels are thought to break with traditional plot structure. In this paper, we test this theory by applying Sentiment Analysis to one of the most famous modernist novels, To the Lighthouse by Virginia Woolf. We first assess Sentiment Analysis in light of the critique that it cannot adequately account for literary language: we use a unique statistical comparison to demonstrate that even simple lexical approaches to Sentiment Analysis are surprisingly effective. We then use the Syuzhet.R package to explore similarities and differences across modeling methods. This comparative approach, when paired with literary close reading, can offer interpretive clues. To our knowledge, we are the first to undertake a hybrid model that fully leverages the strengths of both computational analysis and close reading. This hybrid model raises new questions for the literary critic, such as how to interpret relative versus absolute emotional valence and how to take into account subjective identification. Our finding is that while To the Lighthouse does not replicate a plot centered around a traditional hero, it does reveal an underlying emotional structure distributed between characters - what we term a distributed heroine model. This finding is innovative in the field of modernist and narrative studies and demonstrates that a hybrid method can yield significant discoveries.
Tasks	Sentiment Analysis
Published	2019-08-31
URL	https://arxiv.org/abs/1910.01441v1
PDF	https://arxiv.org/pdf/1910.01441v1.pdf
PWC	https://paperswithcode.com/paper/can-sentiment-analysis-reveal-structure-in-a
Repo
Framework
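
To make the "simple lexical approaches" mentioned in the abstract above concrete, here is a minimal sketch of lexicon-based sentence scoring followed by a trailing moving average that produces a smoothed emotional arc. The tiny lexicon, the function names, and the window size are all invented for illustration. This is not the Syuzhet.R package, which is an R library with its own lexicons and smoothing methods.

```python
import re

# Toy lexicon invented for this sketch; real lexicons contain thousands of entries.
LEXICON = {"happy": 1, "joy": 1, "love": 1, "bright": 1,
           "sad": -1, "loss": -1, "dark": -1, "fear": -1}


def sentence_score(sentence):
    """Sum the valence of every lexicon word found in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def sentiment_arc(sentences, window=2):
    """Smooth raw sentence scores with a trailing moving average."""
    scores = [sentence_score(s) for s in sentences]
    arc = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1):i + 1]
        arc.append(sum(chunk) / len(chunk))
    return arc


text = ["A happy bright day.", "Dark fear followed.", "The table was set."]
print(sentiment_arc(text))
```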

Action Selection for MDPs: Anytime AO* vs. UCT


Title	Action Selection for MDPs: Anytime AO* vs. UCT
Authors	Blai Bonet, Hector Geffner
Abstract	In the presence of non-admissible heuristics, A* and other best-first algorithms can be converted into anytime optimal algorithms over OR graphs, by simply continuing the search after the first solution is found. The same trick, however, does not work for best-first algorithms over AND/OR graphs, that must be able to expand leaf nodes of the explicit graph that are not necessarily part of the best partial solution. Anytime optimal variants of AO* must thus address an exploration-exploitation tradeoff: they cannot just “exploit”, they must keep exploring as well. In this work, we develop one such variant of AO* and apply it to finite-horizon MDPs. This Anytime AO* algorithm eventually delivers an optimal policy while using non-admissible random heuristics that can be sampled, as when the heuristic is the cost of a base policy that can be sampled with rollouts. We then test Anytime AO* for action selection over large infinite-horizon MDPs that cannot be solved with existing off-line heuristic search and dynamic programming algorithms, and compare it with UCT.
Tasks
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12104v1
PDF	https://arxiv.org/pdf/1909.12104v1.pdf
PWC	https://paperswithcode.com/paper/action-selection-for-mdps-anytime-ao-vs-uct
Repo
Framework
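
The exploration-exploitation tradeoff described in the abstract above is what UCT handles at each tree node with a UCB1-style rule. Below is a minimal sketch of that selection rule in its common reward-maximizing form. The function name and the default exploration constant are assumptions; this is neither the Anytime AO* algorithm from the paper nor a full UCT implementation.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Pick an action index by the UCB1 rule.

    counts[a] is how often action a was tried, values[a] its mean reward.
    Hypothetical helper; c is an assumed exploration constant.
    """
    # Try every action once before applying the formula.
    for action, n in enumerate(counts):
        if n == 0:
            return action
    total = sum(counts)
    scores = [values[a] + c * math.sqrt(math.log(total) / counts[a])
              for a in range(len(counts))]
    return max(range(len(counts)), key=scores.__getitem__)


# A rarely tried action wins despite a lower mean, due to its exploration bonus.
print(ucb1_select([1000, 1], [0.6, 0.5]))
```

For cost-based MDPs the same idea applies with the signs flipped: subtract the bonus and take the minimum.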

High Dimensional Restrictive Federated Model Selection with multi-objective Bayesian Optimization over shifted distributions


Title	High Dimensional Restrictive Federated Model Selection with multi-objective Bayesian Optimization over shifted distributions
Authors	Xudong Sun, Andrea Bommert, Florian Pfisterer, Jörg Rahnenführer, Michel Lang, Bernd Bischl
Abstract	A novel machine learning optimization process coined Restrictive Federated Model Selection (RFMS) is proposed for scenarios in which, for example, data from healthcare units cannot leave the site where it is held and it is forbidden to run training algorithms on remote data sites, due to either technical or privacy and trust concerns. To carry out clinical research under this scenario, an analyst can train a machine learning model only on the local data site, but it is still possible to execute a statistical query at a certain cost, by sending a machine learning model to some of the remote data sites and getting the performance measures as feedback, possibly because prediction is usually much cheaper than training. Compared to federated learning, which optimizes the model parameters directly by carrying out training across all data sites, RFMS trains model parameters only on one local data site but optimizes hyper-parameters jointly across the other data sites, since hyper-parameters play an important role in machine learning performance. The aim is to get a Pareto optimal model with respect to both local and remote unseen prediction losses, which could generalize well across data sites. In this work, we specifically consider high dimensional data with shifted distributions over data sites. As an initial investigation, Bayesian Optimization, especially multi-objective Bayesian Optimization, is used to guide an adaptive hyper-parameter optimization process to select models under the RFMS scenario. Empirical results show that solely using the local data site to tune hyper-parameters generalizes poorly across data sites, compared to methods that utilize both the local and remote performances. Furthermore, in terms of dominated hypervolumes, multi-objective Bayesian Optimization algorithms show increased performance across multiple data sites compared with the other candidates.
Tasks	Model Selection
Published	2019-02-24
URL	https://arxiv.org/abs/1902.08999v2
PDF	https://arxiv.org/pdf/1902.08999v2.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-restrictive-federated-model
Repo
Framework
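
The abstract above aims for a model that is Pareto optimal with respect to local and remote prediction losses. As a small illustration of what that means, the sketch below filters a set of candidate models, each summarized by a hypothetical (local_loss, remote_loss) pair, down to the non-dominated ones. The function names and loss values are invented; the multi-objective Bayesian Optimization used to propose candidates is not shown.

```python
def dominates(q, p):
    """True if q is no worse than p in every loss and strictly better in one."""
    return (all(qk <= pk for qk, pk in zip(q, p))
            and any(qk < pk for qk, pk in zip(q, p)))


def pareto_front(points):
    """Return the points that no other point dominates (all losses minimized)."""
    return [p for i, p in enumerate(points)
            if not any(dominates(q, p)
                       for j, q in enumerate(points) if j != i)]


# Hypothetical (local_loss, remote_loss) pairs for four hyper-parameter settings.
candidates = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1), (0.6, 0.6)]
print(pareto_front(candidates))
```

Tuning on the local loss alone would pick the first candidate, which has the worst remote loss of the four; the front keeps the trade-off visible instead.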

Multigrid Neural Memory


Title	Multigrid Neural Memory
Authors	Tri Huynh, Michael Maire, Matthew R. Walter
Abstract	We introduce a radical new approach to endowing neural networks with access to long-term and large-scale memory. Architecting networks with internal multigrid structure and connectivity, while distributing memory cells alongside computation throughout this topology, we observe that coherent memory subsystems emerge as a result of training. Our design both drastically differs from and is far simpler than prior efforts, such as the recently proposed Differentiable Neural Computer (DNC), which uses intricately crafted controllers to connect neural networks to external memory banks. Our hierarchical spatial organization, parameterized convolutionally, permits efficient instantiation of large-capacity memories. Our multigrid topology provides short internal routing pathways, allowing convolutional networks to efficiently approximate the behavior of fully connected networks. Such networks have an implicit capacity for internal attention; augmented with memory, they learn to read and write specific memory locations in a dynamic data-dependent manner. We demonstrate these capabilities on synthetic exploration and mapping tasks, where our network is able to self-organize and retain long-term memory for trajectories of thousands of time steps, outperforming the DNC. On tasks without any notion of spatial geometry: sorting, associative recall, and question answering, our design functions as a truly generic memory and yields excellent results.
Tasks	Question Answering
Published	2019-06-13
URL	https://arxiv.org/abs/1906.05948v3
PDF	https://arxiv.org/pdf/1906.05948v3.pdf
PWC	https://paperswithcode.com/paper/multigrid-neural-memory
Repo
Framework

Quantization Loss Re-Learning Method


Title	Quantization Loss Re-Learning Method
Authors	Kunping Li
Abstract	In order to quantize the gate parameters of the LSTM (Long Short-Term Memory) neural network model with almost no degradation in recognition performance, a new quantization method named the Quantization Loss Re-Learning Method is proposed in this paper. The method applies lossy quantization to the gate parameters during training iterations, and the weight parameters learn to offset the loss from gate parameter quantization by adjusting the gradient in back propagation during weight parameter optimization. We demonstrate the effectiveness of this method through theoretical derivation and experiments. The gate parameters are quantized to three values (0, 0.5, and 1), and on the Named Entity Recognition dataset, the F1 score of the model with the new quantization method applied to the gate parameters decreases by only 0.7% compared to the baseline model.
Tasks	Named Entity Recognition, Quantization
Published	2019-05-30
URL	https://arxiv.org/abs/1905.13568v1
PDF	https://arxiv.org/pdf/1905.13568v1.pdf
PWC	https://paperswithcode.com/paper/quantization-loss-re-learning-method
Repo
Framework
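
The abstract above reports gate parameters quantized to the three values 0, 0.5, and 1. A minimal sketch of one way to do that is shown below: a sigmoid gate activation is rounded to the nearest of the three levels. The nearest-level rule and the function names are assumptions, since the abstract does not state the thresholds used, and the gradient adjustment that lets the weights re-learn the quantization loss is not implemented here.

```python
import math

LEVELS = (0.0, 0.5, 1.0)  # the three gate values named in the abstract


def sigmoid(z):
    """Standard logistic function used for LSTM gates."""
    return 1.0 / (1.0 + math.exp(-z))


def quantize_gate(gate):
    """Round a gate activation in [0, 1] to the nearest level.

    Assumed rule for illustration; ties resolve to the lower level.
    """
    return min(LEVELS, key=lambda level: abs(gate - level))


pre_activations = [-3.0, -0.2, 0.0, 0.4, 3.0]
print([quantize_gate(sigmoid(z)) for z in pre_activations])
```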

Decentralized Flood Forecasting Using Deep Neural Networks


Title	Decentralized Flood Forecasting Using Deep Neural Networks
Authors	Muhammed Sit, Ibrahim Demir
Abstract	Predicting floods for any location at times of extreme storms is a longstanding problem of the utmost importance in emergency management. Conventional methods that aim to predict water levels in streams use advanced hydrological models, yet they still fail to give accurate forecasts everywhere. This study aims to explore the performance of artificial deep neural networks on flood prediction. While providing models that can be used in forecasting stream stage, this paper presents a dataset that focuses on the connectivity of data points on river networks. It also shows that neural networks can be very helpful in time-series forecasting, as in flood events, and can support improving existing models through data assimilation.
Tasks	Time Series, Time Series Forecasting
Published	2019-02-06
URL	https://arxiv.org/abs/1902.02308v2
PDF	https://arxiv.org/pdf/1902.02308v2.pdf
PWC	https://paperswithcode.com/paper/decentralized-flood-forecasting-using-deep
Repo
Framework

Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention


Title	Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention
Authors	Yanzeng Li, Bowen Yu, Mengge Xue, Tingwen Liu
Abstract	Most Chinese pre-trained encoders take a character as a basic unit and learn representations according to character’s external contexts, ignoring the semantics expressed in the word, which is the smallest meaningful unit in Chinese. Hence, we propose a novel word aligned attention to incorporate word segmentation information, which is complementary to various Chinese pre-trained language models. Specifically, we devise a mixed-pooling strategy to align the character level attention to the word level, and propose an effective fusion method to solve the potential issue of segmentation error propagation. As a result, word and character information are explicitly integrated at the fine-tuning procedure. Experimental results on various Chinese NLP benchmarks demonstrate that our model could bring another significant gain over several pre-trained models.
Tasks
Published	2019-11-07
URL	https://arxiv.org/abs/1911.02821v1
PDF	https://arxiv.org/pdf/1911.02821v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-pre-trained-chinese-character
Repo
Framework

A Deep Reinforcement Learning Approach for Global Routing


Title	A Deep Reinforcement Learning Approach for Global Routing
Authors	Haiguang Liao, Wentai Zhang, Xuliang Dong, Barnabas Poczos, Kenji Shimada, Levent Burak Kara
Abstract	Global routing has been a historically challenging problem in electronic circuit design, where the challenge is to connect a large and arbitrary number of circuit components with wires without violating the design rules for the printed circuit boards or integrated circuits. Similar routing problems also exist in the design of complex hydraulic systems, pipe systems and logistic networks. Existing solutions typically consist of greedy algorithms and hard-coded heuristics. As such, existing approaches suffer from a lack of model flexibility and from non-optimal solutions. As an alternative approach, this work presents a deep reinforcement learning method for solving the global routing problem in a simulated environment. At the heart of the proposed method is deep reinforcement learning that enables an agent to produce an optimal policy for routing based on the variety of problems it is presented with, leveraging the conjoint optimization mechanism of deep reinforcement learning. The conjoint optimization mechanism is explained and demonstrated in detail; the best network structure and the parameters of the learned model are explored. Based on the fine-tuned model, routing solutions and rewards are presented and analyzed. The results indicate that the approach can outperform the benchmark sequential A* method, suggesting promising potential for deep reinforcement learning in global routing and in other routing or path planning problems in general. Another major contribution of this work is the development of a global routing problem set generator that can produce parameterized global routing problem sets with different sizes and constraints, enabling evaluation of different routing algorithms and the generation of training datasets for future data-driven routing approaches.
Tasks
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08809v1
PDF	https://arxiv.org/pdf/1906.08809v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-reinforcement-learning-approach-for-1
Repo
Framework

Long Short-Term Network Based Unobtrusive Perceived Workload Monitoring with Consumer Grade Smartwatches in the Wild


Title	Long Short-Term Network Based Unobtrusive Perceived Workload Monitoring with Consumer Grade Smartwatches in the Wild
Authors	Deniz Ekiz, Yekta Said Can, Cem Ersoy
Abstract	Continuous high perceived workload has a negative impact on the individual’s well-being. Prior works focused on detecting workload with medical-grade wearable systems in restricted settings, and the effect of applying deep learning techniques to perceived workload detection in the wild has not been investigated. We present an unobtrusive, comfortable, pervasive and affordable continuous workload monitoring system, based on a Long Short-Term Memory Network and a smartwatch application, that monitors the perceived workload of individuals in the wild. We make use of modern consumer-grade smartwatches. We have recorded physiological data from daily life, together with perceived workload questionnaires, from subjects in their real-life environments over a month. The model was trained and evaluated with daily-life physiological data coming from different days, which makes it robust to daily changes in heart rate variability, which we use with accelerometer features to assess low and high workload. Our system has the capability of removing motion-related artifacts and detecting perceived workload by using traditional and deep classifiers. We discuss the problems related to in-the-wild applications with consumer-grade smartwatches. We show that the Long Short-Term Memory Network outperforms traditional classifiers in discriminating low and high workload with smartwatches in the wild.
Tasks	Heart Rate Variability
Published	2019-11-30
URL	https://arxiv.org/abs/1912.00019v1
PDF	https://arxiv.org/pdf/1912.00019v1.pdf
PWC	https://paperswithcode.com/paper/long-short-term-network-based-unobtrusive
Repo
Framework