January 29, 2020

3369 words 16 mins read

Paper Group ANR 512

Joint Block Low Rank and Sparse Matrix Recovery in Array Self-Calibration Off-Grid DoA Estimation. Sketch-Driven Regular Expression Generation from Natural Language and Examples. Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem. Non-Autoregressive Transformer Automatic Speech Recognition. …

Joint Block Low Rank and Sparse Matrix Recovery in Array Self-Calibration Off-Grid DoA Estimation

Title Joint Block Low Rank and Sparse Matrix Recovery in Array Self-Calibration Off-Grid DoA Estimation
Authors Cheng-Yu Hung, Mostafa Kaveh
Abstract This letter addresses the estimation of directions-of-arrival (DoA) by a sensor array using a sparse model in the presence of array calibration errors and off-grid directions. The received-signal model builds on previously used models for unknown calibration errors and a structured linear representation of the off-grid effect. A convex optimization problem is formulated with an objective function that promotes two-layer joint block-sparsity, together with its second-order cone programming (SOCP) representation. The performance of the proposed method is demonstrated by numerical simulations and compared with the Cramer-Rao Bound (CRB) and several previously proposed methods.
Tasks Calibration
Published 2019-03-17
URL https://arxiv.org/abs/1903.07158v2
PDF https://arxiv.org/pdf/1903.07158v2.pdf
PWC https://paperswithcode.com/paper/joint-block-low-rank-and-sparse-matrix
Repo
Framework
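
The abstract above formulates DoA estimation with calibration errors and off-grid corrections as a block-sparse convex program. As a rough illustration only, the sketch below solves a generic two-layer group-sparse recovery with cvxpy; the dictionaries, sizes, and regularization weight are hypothetical, and this is not the authors' exact formulation or SOCP reformulation.

```python
# Hypothetical illustration of joint block-sparse recovery over a DoA grid.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n_grid = 20, 60                      # sensors and grid points (hypothetical sizes)
A = rng.standard_normal((m, n_grid)) + 1j * rng.standard_normal((m, n_grid))   # on-grid dictionary
B = rng.standard_normal((m, n_grid)) + 1j * rng.standard_normal((m, n_grid))   # off-grid (derivative) dictionary
y = A[:, [10, 40]] @ np.array([1.0, 0.8]) + 0.01 * rng.standard_normal(m)      # two synthetic sources

x = cp.Variable(n_grid, complex=True)   # on-grid gains
d = cp.Variable(n_grid, complex=True)   # off-grid corrections
lam = 0.5
# One block per grid point couples [x_k, d_k]; summing the block norms promotes
# joint block sparsity, and the whole problem admits an SOCP representation.
block_norms = sum(cp.norm(cp.hstack([x[k], d[k]])) for k in range(n_grid))
prob = cp.Problem(cp.Minimize(cp.sum_squares(A @ x + B @ d - y) + lam * block_norms))
prob.solve()
print("active grid points:", np.flatnonzero(np.abs(x.value) > 1e-3))
```

Each grid point contributes one block, so driving a block's norm to zero removes both its on-grid coefficient and its off-grid correction at once, which is the joint block-sparsity the objective promotes.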

Sketch-Driven Regular Expression Generation from Natural Language and Examples

Title Sketch-Driven Regular Expression Generation from Natural Language and Examples
Authors Xi Ye, Qiaochu Chen, Xinyu Wang, Isil Dillig, Greg Durrett
Abstract Recent systems for converting natural language descriptions into regular expressions have achieved some success, but typically deal with short, formulaic text and can only produce simple regular expressions, limiting their applicability. Real-world regular expressions are complex, hard to describe with brief sentences, and sometimes require examples to fully convey the user’s intent. We present a framework for regular expression synthesis in this setting where both natural language and examples are available. First, a semantic parser (either grammar-based or neural) maps the natural language description into an intermediate sketch, which is an incomplete regular expression containing holes to denote missing components. Then a program synthesizer enumerates the regular expression space defined by the sketch and finds a regular expression that is consistent with the given string examples. Our semantic parser can be trained from supervised or heuristically-derived sketches and additionally fine-tuned with weak supervision based on correctness of the synthesized regex. We conduct experiments on two public large-scale datasets (Kushman and Barzilay, 2013; Locascio et al., 2016) and a real-world dataset we collected from StackOverflow. Our system achieves state-of-the-art performance on the public datasets and successfully solves 57% of the real-world dataset, which existing neural systems completely fail on.
Tasks
Published 2019-08-16
URL https://arxiv.org/abs/1908.05848v1
PDF https://arxiv.org/pdf/1908.05848v1.pdf
PWC https://paperswithcode.com/paper/sketch-driven-regular-expression-generation
Repo
Framework
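
The abstract above splits the problem into parsing natural language into a sketch with holes and then enumerating completions that agree with string examples. A minimal sketch of the second, enumerative step is given below; the hole pool, the format-string representation of sketches, and the example sketch are illustrative assumptions, not the paper's grammar or synthesizer.

```python
# Enumerate fillings of a sketch's holes and keep the first regex consistent
# with the given positive/negative string examples.
import itertools
import re

def synthesize(sketch, hole_candidates, positives, negatives):
    """sketch: format string with {} holes, e.g. '{}+@{}+\\.com' (illustrative)."""
    n_holes = sketch.count("{}")
    for filling in itertools.product(hole_candidates, repeat=n_holes):
        candidate = sketch.format(*filling)
        try:
            pattern = re.compile(candidate)
        except re.error:
            continue
        if all(pattern.fullmatch(s) for s in positives) and \
           not any(pattern.fullmatch(s) for s in negatives):
            return candidate
    return None

# Hypothetical sketch, as if produced by a semantic parser for "emails ending in .com".
sketch = r"{}+@{}+\.com"
pool = [r"[a-z]", r"[a-z0-9]", r"\d"]
print(synthesize(sketch, pool, positives=["ab@cd.com"], negatives=["ab@cd.org"]))
```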

Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem

Title Leveraging human Domain Knowledge to model an empirical Reward function for a Reinforcement Learning problem
Authors Dattaraj Rao
Abstract Traditional Reinforcement Learning (RL) problems depend on an exhaustive simulation environment that models the real-world physics of the problem and trains the RL agent by observing this environment. In this paper, we present a novel approach to creating an environment by modeling the reward function based on empirical rules extracted from human domain knowledge of the system under study. Using this empirical reward function, we build an environment and train the agent. We first create an environment that emulates the effect of setting cabin temperature through a thermostat. In RL problems this is typically done by creating an exhaustive model of the system with a detailed thermodynamic study. Instead, we propose an empirical approach that models the reward function from human domain knowledge. We document some rules of thumb that we usually exercise as humans while setting the thermostat temperature and model these into our reward function. This modeling of empirical human domain rules into a reward function for RL is the unique aspect of this paper. This is a continuous action space problem, and we solve for maximizing the reward function using the deep deterministic policy gradient (DDPG) method. We create a policy network that predicts the optimal temperature setpoint given the external temperature and humidity.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.07116v1
PDF https://arxiv.org/pdf/1909.07116v1.pdf
PWC https://paperswithcode.com/paper/leveraging-human-domain-knowledge-to-model-an
Repo
Framework
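
The core idea above is to replace a physics simulator with a reward assembled from human rules of thumb. Below is a minimal sketch of such an empirical reward; the specific rules, coefficients, and the preferred 22 C setpoint are illustrative assumptions, not the rules documented in the paper.

```python
# Toy empirical reward built from rule-of-thumb terms (hypothetical values).
def empirical_reward(setpoint_c, outside_c, humidity, preferred_c=22.0):
    comfort = -abs(setpoint_c - preferred_c)               # rule: stay near the preferred temperature
    humidity_adjust = -0.05 * max(humidity - 60.0, 0)      # rule: penalize muggy conditions
    energy_penalty = -0.1 * abs(setpoint_c - outside_c)    # rule: large indoor/outdoor gaps cost energy
    return comfort + humidity_adjust + energy_penalty

# A DDPG-style agent would pick a continuous setpoint to maximize this reward.
for sp in [18.0, 22.0, 26.0]:
    print(sp, round(empirical_reward(sp, outside_c=30.0, humidity=70.0), 2))
```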

Non-Autoregressive Transformer Automatic Speech Recognition

Title Non-Autoregressive Transformer Automatic Speech Recognition
Authors Nanxin Chen, Shinji Watanabe, Jesús Villalba, Najim Dehak
Abstract Very deep transformers have recently begun to outperform traditional bi-directional long short-term memory networks by a large margin. However, for production use, inference computation cost and latency remain serious concerns in real scenarios. In this paper, we study a non-autoregressive transformer structure for speech recognition that was originally introduced in machine translation. During training, input tokens fed to the decoder are randomly replaced by a special mask token, and the network is required to predict those masked tokens by taking both context and input speech into consideration. During inference, we start from all mask tokens and the network gradually predicts all tokens based on partial results. We show that this framework can support different decoding strategies, including traditional left-to-right decoding. As an example, a new decoding strategy is proposed that starts from the easiest predictions and proceeds to the more difficult ones. Preliminary results on the Aishell and CSJ benchmarks show that it is possible to train such a non-autoregressive network for ASR. On Aishell in particular, the proposed method outperforms the Kaldi nnet3 and chain-model setups and comes quite close to the performance of the state-of-the-art end-to-end model.
Tasks Machine Translation, Speech Recognition
Published 2019-11-10
URL https://arxiv.org/abs/1911.04908v1
PDF https://arxiv.org/pdf/1911.04908v1.pdf
PWC https://paperswithcode.com/paper/non-autoregressive-transformer-automatic
Repo
Framework
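
The decoding procedure described above can be sketched in a few lines. The snippet below is a minimal illustration of mask-predict style iterative decoding, assuming a hypothetical model(encoder_out, tokens) call that returns per-position vocabulary logits; the re-masking schedule is an arbitrary choice, and this is not the authors' implementation.

```python
# Start from all <mask> tokens; at each iteration keep the most confident
# predictions and re-mask the rest, with fewer masks each round.
import torch

def mask_predict_decode(model, encoder_out, length, mask_id, iterations=4):
    tokens = torch.full((1, length), mask_id, dtype=torch.long)
    confidences = torch.zeros(1, length)
    for it in range(iterations):
        logits = model(encoder_out, tokens)          # assumed interface: (1, length, vocab)
        probs, preds = logits.softmax(-1).max(-1)
        tokens, confidences = preds, probs
        n_mask = int(length * (1 - (it + 1) / iterations))   # re-mask the least confident positions
        if n_mask > 0:
            worst = confidences.topk(n_mask, largest=False).indices
            tokens[0, worst[0]] = mask_id
    return tokens

# toy usage with a stand-in "model" that returns random logits
dummy = lambda enc, tok: torch.randn(1, tok.shape[1], 100)
print(mask_predict_decode(dummy, encoder_out=None, length=6, mask_id=99))
```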

Optimizing Deep Learning Inference on Embedded Systems Through Adaptive Model Selection

Title Optimizing Deep Learning Inference on Embedded Systems Through Adaptive Model Selection
Authors Vicent Sanz Marco, Ben Taylor, Zheng Wang, Yehia Elkhatib
Abstract Deep neural networks (DNNs) are becoming a key enabling technology for many application domains. However, on-device inference on battery-powered, resource-constrained embedded systems is often infeasible due to the prohibitively long inference time and resource requirements of many DNNs. Offloading computation into the cloud is often unacceptable due to privacy concerns, high latency, or the lack of connectivity. While compression algorithms often succeed in reducing inference times, they come at the cost of reduced accuracy. This paper presents a new, alternative approach to enable efficient execution of DNNs on embedded devices. Our approach dynamically determines which DNN to use for a given input, by considering the desired accuracy and inference time. It employs machine learning to develop a low-cost predictive model that quickly selects a pre-trained DNN to use for a given input and optimization constraint. We achieve this by first training a predictive model offline, and then using the learned model to select a DNN model for new, unseen inputs. We apply our approach to two representative DNN domains: image classification and machine translation. We evaluate our approach on a Jetson TX2 embedded deep learning platform and consider a range of influential DNN models including convolutional and recurrent neural networks. For image classification, we achieve a 1.8x reduction in inference time with a 7.52% improvement in accuracy, over the most-capable single DNN model. For machine translation, we achieve a 1.34x reduction in inference time over the most-capable single model, with little impact on the quality of translation.
Tasks Image Classification, Machine Translation, Model Selection
Published 2019-11-09
URL https://arxiv.org/abs/1911.04946v1
PDF https://arxiv.org/pdf/1911.04946v1.pdf
PWC https://paperswithcode.com/paper/optimizing-deep-learning-inference-on
Repo
Framework
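
The approach above hinges on a cheap, offline-trained premodel that picks a DNN per input. The sketch below illustrates that dispatch step with a nearest-neighbour classifier; the features, model names, and training labels are hypothetical stand-ins, not the paper's premodel or feature set.

```python
# Offline: learn a mapping from cheap input features to the smallest adequate DNN.
from sklearn.neighbors import KNeighborsClassifier

train_features = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]        # e.g. contrast, edge density (toy)
best_model_label = ["mobilenet", "resnet152", "resnet50"]     # label = cheapest model that was accurate enough
selector = KNeighborsClassifier(n_neighbors=1).fit(train_features, best_model_label)

MODELS = {"mobilenet": "fast / lower accuracy", "resnet50": "medium", "resnet152": "slow / higher accuracy"}

def infer(image_features):
    # Online: dispatch each input to the DNN the premodel predicts is sufficient.
    choice = selector.predict([image_features])[0]
    print("dispatching to", choice, "-", MODELS[choice])
    # the chosen DNN would run here

infer([0.85, 0.15])
```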

Self-Attention and Ingredient-Attention Based Model for Recipe Retrieval from Image Queries

Title Self-Attention and Ingredient-Attention Based Model for Recipe Retrieval from Image Queries
Authors Matthias Fontanellaz, Stergios Christodoulidis, Stavroula Mougiakakou
Abstract Direct computer vision-based nutrient content estimation is a demanding task, due to deformation and occlusion of ingredients, as well as high intra-class and low inter-class variability between meal classes. In order to tackle these issues, we propose a system for recipe retrieval from images. The recipe information can subsequently be used to estimate the nutrient content of the meal. In this study, we utilize the multi-modal Recipe1M dataset, which contains over 1 million recipes accompanied by over 13 million images. The proposed model can operate as a first step in an automatic pipeline for the estimation of nutrition content by providing hints related to ingredients and instructions. Through self-attention, our model can directly process raw recipe text, making the upstream instruction sentence embedding process redundant and thus reducing training time, while providing desirable retrieval results. Furthermore, we propose the use of an ingredient attention mechanism, in order to gain insight into which instructions, parts of instructions or single instruction words are of importance for processing a single ingredient within a certain recipe. Attention-based recipe text encoding contributes to solving the issue of high intra-class/low inter-class variability by focusing on preparation steps specific to the meal. The experimental results demonstrate the potential of such a system for recipe retrieval from images. A comparison with respect to two baseline methods is also presented.
Tasks Sentence Embedding
Published 2019-11-05
URL https://arxiv.org/abs/1911.01770v1
PDF https://arxiv.org/pdf/1911.01770v1.pdf
PWC https://paperswithcode.com/paper/self-attention-and-ingredient-attention-based
Repo
Framework
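
The ingredient-attention mechanism described above can be illustrated with a small dot-product attention step: each ingredient embedding attends over the instruction-word embeddings, and the attention weights indicate which instruction words matter for that ingredient. The shapes and random tensors below are toy assumptions, not the paper's model.

```python
# Toy ingredient attention: ingredients attend over instruction-word embeddings.
import torch

def ingredient_attention(ingredient_emb, instruction_embs):
    # ingredient_emb: (n_ingredients, d); instruction_embs: (n_words, d)
    scores = ingredient_emb @ instruction_embs.T      # (n_ingredients, n_words)
    weights = scores.softmax(dim=-1)                  # per-ingredient attention over instruction words
    context = weights @ instruction_embs              # (n_ingredients, d) attended summaries
    return context, weights

ing = torch.randn(3, 16)      # e.g. flour, sugar, butter (toy embeddings)
instr = torch.randn(20, 16)   # word embeddings of the instruction text (toy)
ctx, attn = ingredient_attention(ing, instr)
print(ctx.shape, attn.shape)  # torch.Size([3, 16]) torch.Size([3, 20])
```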

Theory of neuromorphic computing by waves: machine learning by rogue waves, dispersive shocks, and solitons

Title Theory of neuromorphic computing by waves: machine learning by rogue waves, dispersive shocks, and solitons
Authors Giulia Marcucci, Davide Pierangeli, Claudio Conti
Abstract We study artificial neural networks with nonlinear waves as a computing reservoir. We discuss universality and the conditions to learn a dataset in terms of output channels and nonlinearity. A feed-forward three-layer model, with an encoding input layer, a wave layer, and a decoding readout, behaves as a conventional neural network in approximating mathematical functions, real-world datasets, and universal Boolean gates. The rank of the transmission matrix has a fundamental role in assessing the learning abilities of the wave. For a given set of training points, a threshold nonlinearity for universal interpolation exists. When considering the nonlinear Schroedinger equation, the use of highly nonlinear regimes implies that solitons, rogue waves, and shock waves play a leading role in training and computing. Our results may enable the realization of novel machine learning devices by using diverse physical systems, such as nonlinear optics, hydrodynamics, polaritonics, and Bose-Einstein condensates. The application of these concepts to photonics opens the way to a large class of accelerators and new computational paradigms. In complex wave systems, such as multimodal fibers, integrated optical circuits, random and topological devices, and metasurfaces, nonlinear waves can be employed to perform computation and solve complex combinatorial optimization problems.
Tasks Combinatorial Optimization
Published 2019-12-15
URL https://arxiv.org/abs/1912.07044v1
PDF https://arxiv.org/pdf/1912.07044v1.pdf
PWC https://paperswithcode.com/paper/theory-of-neuromorphic-computing-by-waves
Repo
Framework
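
The three-layer model described above trains only the decoding readout on top of a fixed nonlinear wave transform. The sketch below captures that structure with a toy sinusoidal nonlinearity standing in for nonlinear Schroedinger propagation; the sizes and target function are illustrative.

```python
# Reservoir-style training: fixed random encoding, fixed nonlinear "wave" layer,
# and a linear readout fitted by least squares.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))            # training inputs
y = np.sin(3 * X[:, 0]) * X[:, 1]                # target function to learn (toy)

W_in = rng.standard_normal((2, 64))              # fixed random encoding layer
def wave_layer(U):                               # toy stand-in for wave propagation
    return np.sin(U @ W_in) ** 2

H = wave_layer(X)                                # "transmission" through the wave layer
readout, *_ = np.linalg.lstsq(H, y, rcond=None)  # train only the decoding readout
print("train MSE:", np.mean((H @ readout - y) ** 2))
```

The rank of the matrix H here plays the role of the transmission-matrix rank discussed in the abstract: a low-rank transform limits what the linear readout can fit.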

How to Do Simultaneous Translation Better with Consecutive Neural Machine Translation?

Title How to Do Simultaneous Translation Better with Consecutive Neural Machine Translation?
Authors Yun Chen, Liangyou Li, Xin Jiang, Xiao Chen, Qun Liu
Abstract Despite the success of neural machine translation (NMT), simultaneous neural machine translation (SNMT), the task of translating in real time before a full sentence has been observed, remains challenging due to the syntactic structure difference and simultaneity requirements. In this paper, we propose a general framework to improve simultaneous translation with a pretrained consecutive neural machine translation (CNMT) model. Our framework contains two parts: prefix translation that utilizes a pretrained CNMT model to better translate source prefixes and a stopping criterion that determines when to stop the prefix translation. Experiments on three translation corpora and two language pairs show the efficacy of the proposed framework on balancing the quality and latency in simultaneous translation.
Tasks Machine Translation
Published 2019-11-08
URL https://arxiv.org/abs/1911.03154v1
PDF https://arxiv.org/pdf/1911.03154v1.pdf
PWC https://paperswithcode.com/paper/how-to-do-simultaneous-translation-better
Repo
Framework
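
The two-part framework above, prefix translation plus a stopping criterion, can be sketched as a streaming loop. In the snippet below, cnmt_translate and is_stable are assumed callables standing in for the pretrained CNMT model and the paper's stopping criterion; the toy usage at the end is purely illustrative.

```python
# Stream source words, translate the current prefix with a consecutive NMT model,
# and emit target tokens only while the stopping criterion judges them stable.
def simultaneous_translate(source_stream, cnmt_translate, is_stable):
    committed = []                        # target tokens already emitted
    prefix = []
    for word in source_stream:            # source words arrive one at a time
        prefix.append(word)
        hypothesis = cnmt_translate(prefix)          # full translation of the prefix
        while len(committed) < len(hypothesis) and is_stable(hypothesis, len(committed), prefix):
            committed.append(hypothesis[len(committed)])
            yield committed[-1]
    for tok in cnmt_translate(prefix)[len(committed):]:   # source finished: flush the rest
        yield tok

# toy usage: the "translation" reverses each word; emit a token once one more source word has arrived
toy_cnmt = lambda prefix: [w[::-1] for w in prefix]
stable = lambda hyp, emitted, prefix: emitted < len(prefix) - 1
print(list(simultaneous_translate("how are you today".split(), toy_cnmt, stable)))
```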

Hope Speech Detection: A Computational Analysis of the Voice of Peace

Title Hope Speech Detection: A Computational Analysis of the Voice of Peace
Authors Shriphani Palakodety, Ashiqur R. KhudaBukhsh, Jaime G. Carbonell
Abstract The recent Pulwama terror attack (February 14, 2019, Pulwama, Kashmir) triggered a chain of escalating events between India and Pakistan, adding another episode to their 70-year-old dispute over Kashmir. The present era of ubiquitous social media has never seen nuclear powers closer to war. In this paper, we analyze this evolving international crisis via a substantial corpus constructed from comments on YouTube videos (921,235 English comments posted by 392,460 users, out of 2.04 million overall comments by 791,289 users on 2,890 videos). Our main contributions in the paper are three-fold. First, we present the observation that polyglot word-embeddings reveal precise and accurate language clusters, and subsequently construct a document language-identification technique with negligible annotation requirements. We demonstrate its viability and utility across a variety of data sets involving several low-resource languages. Second, we present an analysis of temporal trends of pro-peace and pro-war intent, observing that when tensions between the two nations were at their peak, pro-peace intent in the corpus was at its highest point. Finally, in the context of heated discussions in a politically tense situation where two nations are at the brink of a full-fledged war, we argue the importance of automatic identification of user-generated web content that can diffuse hostility, and address this prediction task, dubbed “hope-speech detection”.
Tasks Language Identification, Word Embeddings
Published 2019-09-11
URL https://arxiv.org/abs/1909.12940v4
PDF https://arxiv.org/pdf/1909.12940v4.pdf
PWC https://paperswithcode.com/paper/kashmir-a-computational-analysis-of-the-voice
Repo
Framework
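
The first contribution above, language identification from clusters in polyglot word-embeddings, can be illustrated with a tiny clustering example. The vocabulary and 2-D vectors below are toy stand-ins for embeddings trained on the mixed-language comment corpus; this is not the paper's pipeline.

```python
# Cluster word vectors from a polyglot embedding, then label a comment by the
# majority cluster of its words.
import numpy as np
from sklearn.cluster import KMeans

vocab = ["peace", "war", "aman", "jung"]                       # toy mixed-language vocabulary
vectors = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])   # hypothetical embeddings
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
word2cluster = dict(zip(vocab, clusters))

def identify_language(comment):
    votes = [word2cluster[w] for w in comment.lower().split() if w in word2cluster]
    return max(set(votes), key=votes.count) if votes else None

print(identify_language("peace not war"))
```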

Learning Global Pairwise Interactions with Bayesian Neural Networks

Title Learning Global Pairwise Interactions with Bayesian Neural Networks
Authors Tianyu Cui, Pekka Marttinen, Samuel Kaski
Abstract Estimating global pairwise interaction effects, i.e., the difference between the joint effect and the sum of marginal effects of two input features, with uncertainty properly quantified, is centrally important in science applications. We propose a non-parametric probabilistic method for detecting interaction effects of unknown form. First, the relationship between the features and the output is modelled using a Bayesian neural network, capable of representing complex interactions and principled uncertainty. Second, interaction effects and their uncertainty are estimated from the trained model. For the second step, we propose an intuitive global interaction measure: Bayesian Group Expected Hessian (GEH), which aggregates information of local interactions as captured by the Hessian. GEH provides a natural trade-off between type I and type II error and, moreover, comes with theoretical guarantees ensuring that the estimated interaction effects and their uncertainty can be improved by training a more accurate BNN. The method empirically outperforms available non-probabilistic alternatives on simulated and real-world data. Finally, we demonstrate its ability to detect interpretable interactions between higher-level features (at deeper layers of the neural network).
Tasks
Published 2019-01-24
URL https://arxiv.org/abs/1901.08361v3
PDF https://arxiv.org/pdf/1901.08361v3.pdf
PWC https://paperswithcode.com/paper/recovering-pairwise-interactions-using-neural
Repo
Framework
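
The interaction measure above aggregates local second-order information from a trained network. The sketch below shows the basic ingredient, the mixed second derivative of the output with respect to a feature pair, averaged over data points, using a plain (non-Bayesian) network; it is a simplification and not the paper's Bayesian Group Expected Hessian estimator or its uncertainty quantification.

```python
# Average the |mixed second derivative| of the network output over data points
# as a crude pairwise-interaction score.
import torch

net = torch.nn.Sequential(torch.nn.Linear(3, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))

def expected_hessian_interaction(net, X, i, j):
    scores = []
    for x in X:
        h = torch.autograd.functional.hessian(lambda z: net(z).squeeze(), x)
        scores.append(h[i, j].abs())
    return torch.stack(scores).mean()

X = torch.randn(32, 3)
print(float(expected_hessian_interaction(net, X, 0, 1)))
```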

Multi-scale fully convolutional neural networks for histopathology image segmentation: from nuclear aberrations to the global tissue architecture

Title Multi-scale fully convolutional neural networks for histopathology image segmentation: from nuclear aberrations to the global tissue architecture
Authors Rüdiger Schmitz, Frederic Madesta, Maximilian Nielsen, René Werner, Thomas Rösch
Abstract Histopathologic diagnosis is dependent on simultaneous information from a broad range of scales, ranging from nuclear aberrations ($\approx \mathcal{O}(0.1 \mu m)$) over cellular structures ($\approx \mathcal{O}(10\mu m)$) to the global tissue architecture ($\gtrapprox \mathcal{O}(1 mm)$). Bearing in mind which information is employed by human pathologists, we introduce and examine different strategies for the integration of multiple and widely separate spatial scales into common U-Net-based architectures. Based on this, we present a family of new, end-to-end trainable, multi-scale multi-encoder fully-convolutional neural networks for human modus operandi-inspired computer vision in histopathology.
Tasks Semantic Segmentation
Published 2019-09-24
URL https://arxiv.org/abs/1909.10726v1
PDF https://arxiv.org/pdf/1909.10726v1.pdf
PWC https://paperswithcode.com/paper/multi-scale-fully-convolutional-neural
Repo
Framework
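
The abstract above integrates widely separated spatial scales through multiple encoders. The sketch below shows the basic pattern with two toy convolutional encoders, one for the high-resolution patch and one for a downsampled wide-context view, fused before a decoder; channel sizes and layers are illustrative, not the paper's architecture.

```python
# Two-scale encoder: fine detail and wide context are encoded separately,
# resampled to a common resolution, and fused for a U-Net-style decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoScaleEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fine = nn.Conv2d(3, 8, 3, padding=1)      # nuclear/cellular detail
        self.coarse = nn.Conv2d(3, 8, 3, padding=1)    # global tissue context
        self.fuse = nn.Conv2d(16, 8, 1)

    def forward(self, patch, context):
        f = F.relu(self.fine(patch))
        c = F.relu(self.coarse(context))
        c = F.interpolate(c, size=f.shape[-2:], mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([f, c], dim=1))     # fused features for the decoder

feats = TwoScaleEncoder()(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 32, 32))
print(feats.shape)  # torch.Size([1, 8, 64, 64])
```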

Machine Learning for Generalizable Prediction of Flood Susceptibility

Title Machine Learning for Generalizable Prediction of Flood Susceptibility
Authors Chelsea Sidrane, Dylan J Fitzpatrick, Andrew Annex, Diane O’Donoghue, Yarin Gal, Piotr Biliński
Abstract Flooding is a destructive and dangerous hazard and climate change appears to be increasing the frequency of catastrophic flooding events around the world. Physics-based flood models are costly to calibrate and are rarely generalizable across different river basins, as model outputs are sensitive to site-specific parameters and human-regulated infrastructure. In contrast, statistical models implicitly account for such factors through the data on which they are trained. Such models trained primarily from remotely-sensed Earth observation data could reduce the need for extensive in-situ measurements. In this work, we develop generalizable, multi-basin models of river flooding susceptibility using geographically-distributed data from the USGS stream gauge network. Machine learning models are trained in a supervised framework to predict two measures of flood susceptibility from a mix of river basin attributes, impervious surface cover information derived from satellite imagery, and historical records of rainfall and stream height. We report prediction performance of multiple models using precision-recall curves, and compare with performance of naive baselines. This work on multi-basin flood prediction represents a step in the direction of making flood prediction accessible to all at-risk communities.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06521v1
PDF https://arxiv.org/pdf/1910.06521v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-generalizable-prediction
Repo
Framework
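
The supervised setup above maps basin attributes and hydrologic history to flood susceptibility and reports precision-recall performance. The sketch below reproduces that evaluation pattern on synthetic data; the feature names and the random-forest choice are illustrative assumptions, not the paper's models or the USGS data.

```python
# Supervised flood-susceptibility classification evaluated with a precision-recall curve.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_curve, auc
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))          # e.g. basin area, imperviousness, rainfall, stream stage (synthetic)
y = (X[:, 2] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 1).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
prec, rec, _ = precision_recall_curve(y_te, clf.predict_proba(X_te)[:, 1])
print("PR-AUC:", round(auc(rec, prec), 3))
```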

Multi-Channel Neural Network for Assessing Neonatal Pain from Videos

Title Multi-Channel Neural Network for Assessing Neonatal Pain from Videos
Authors Md Sirajus Salekin, Ghada Zamzmi, Dmitry Goldgof, Rangachar Kasturi, Thao Ho, Yu Sun
Abstract Neonates do not have the ability to either articulate pain or communicate it non-verbally by pointing. The current clinical standard for assessing neonatal pain is intermittent and highly subjective. This discontinuity and subjectivity can lead to inconsistent assessment and, therefore, inadequate treatment. In this paper, we propose a multi-channel deep learning framework for assessing neonatal pain from videos. The proposed framework integrates information from two pain indicators or channels, namely facial expression and body movement, using a convolutional neural network (CNN). It also integrates temporal information using a recurrent neural network (LSTM). The experimental results demonstrate the efficiency and superiority of the proposed temporal and multi-channel framework compared to existing similar methods.
Tasks
Published 2019-08-25
URL https://arxiv.org/abs/1908.09254v1
PDF https://arxiv.org/pdf/1908.09254v1.pdf
PWC https://paperswithcode.com/paper/multi-channel-neural-network-for-assessing
Repo
Framework
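
The two-channel CNN plus LSTM framework above can be sketched with toy dimensions: per-frame features from a face channel and a body-movement channel are concatenated and passed through an LSTM that scores the video. This is an illustrative layout, not the authors' network.

```python
# Toy multi-channel model: two per-frame CNN encoders, feature concatenation,
# and an LSTM over time producing one pain score per video.
import torch
import torch.nn as nn

class MultiChannelPain(nn.Module):
    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        self.face_cnn = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, feat_dim))
        self.body_cnn = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, feat_dim))
        self.lstm = nn.LSTM(2 * feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, face_frames, body_frames):       # (B, T, 3, H, W) each
        B, T = face_frames.shape[:2]
        f = self.face_cnn(face_frames.flatten(0, 1)).view(B, T, -1)
        b = self.body_cnn(body_frames.flatten(0, 1)).view(B, T, -1)
        out, _ = self.lstm(torch.cat([f, b], dim=-1))
        return torch.sigmoid(self.head(out[:, -1]))     # pain probability per video

model = MultiChannelPain()
print(model(torch.randn(2, 5, 3, 32, 32), torch.randn(2, 5, 3, 32, 32)).shape)  # (2, 1)
```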

RLCache: Automated Cache Management Using Reinforcement Learning

Title RLCache: Automated Cache Management Using Reinforcement Learning
Authors Sami Alabed
Abstract This study investigates the use of reinforcement learning to guide the decisions of a general-purpose cache manager. Cache managers directly impact the overall performance of computer systems. They govern decisions about which objects should be cached, how long they should be cached for, and which objects to evict from the cache when it is full. These three decisions impact both the cache hit rate and the size of the storage needed to achieve that hit rate. An optimal cache manager will avoid unnecessary operations, maximise the cache hit rate, which results in fewer round trips to a slower backend storage system, and minimise the size of storage needed to achieve a high hit rate. This project investigates using reinforcement learning in cache management by designing three separate agents, one for each of the cache manager's tasks. Furthermore, the project investigates two advanced reinforcement learning architectures for multi-decision problems: a single multi-task agent and a multi-agent setup. We also introduce a framework to simplify the modelling of computer systems problems as reinforcement learning tasks. The framework abstracts the delayed observation of experiences and reward assignment in computer systems while providing a flexible way to scale to multiple agents. Simulation results based on an established database benchmark system show that reinforcement learning agents can achieve a higher cache hit rate than heuristic-driven algorithms while minimising the space needed. They are also able to adapt to a changing workload and dynamically adjust their caching strategy accordingly. The proposed cache manager model is generic and applicable to other types of caches, such as file system caches. This project is the first, to our knowledge, to model cache manager decisions as a multi-task control problem.
Tasks
Published 2019-09-30
URL https://arxiv.org/abs/1909.13839v1
PDF https://arxiv.org/pdf/1909.13839v1.pdf
PWC https://paperswithcode.com/paper/rlcache-automated-cache-management-using
Repo
Framework
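
The study above frames admission, time-to-live, and eviction as learning problems with delayed feedback. The sketch below reduces this to a single toy admission agent with tabular updates, just to show the state/action/delayed-reward framing; it is not the project's multi-task or multi-agent architecture.

```python
# Toy cache-admission agent: epsilon-greedy action choice over a tabular Q,
# updated once the delayed reward (later hit or wasted space) is observed.
import random
from collections import defaultdict

q = defaultdict(float)           # Q[(state, action)]
alpha, epsilon = 0.1, 0.1

def act(state):
    if random.random() < epsilon:
        return random.choice([0, 1])                  # 0 = don't cache, 1 = cache
    return max((0, 1), key=lambda a: q[(state, a)])

def update(state, action, delayed_reward):
    # The reward arrives only after we observe whether the object was re-requested,
    # e.g. +1 for a later hit on an admitted object, a small penalty for occupying space.
    q[(state, action)] += alpha * (delayed_reward - q[(state, action)])
```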

Feature Fusion Use Unsupervised Prior Knowledge to Let Small Object Represent

Title Feature Fusion Use Unsupervised Prior Knowledge to Let Small Object Represent
Authors Tian Liu, Lichun Wang, Shaofan Wang
Abstract Fusing low level and high level features is a widely used strategy to provide details that might be missing during convolution and pooling. Different from previous works, we propose a new fusion mechanism called FillIn, which takes advantage of prior knowledge described with superpixel segmentation. According to this prior knowledge, FillIn chooses small regions on the low level feature map to fill into the high level feature map. With the proposed fusion mechanism, the low level features have equal channels for some tiny regions as the high level features, which gives the low level features relatively independent power to decide the final semantic label. We demonstrate the effectiveness of our model on PASCAL VOC 2012, where it achieves a competitive test result based on the DeepLabv3+ backbone, and visualizations of predictions show that our fusion lets small objects be represented and that low level features have potential for segmenting small objects.
Tasks
Published 2019-12-17
URL https://arxiv.org/abs/1912.08059v1
PDF https://arxiv.org/pdf/1912.08059v1.pdf
PWC https://paperswithcode.com/paper/feature-fusion-use-unsupervised-prior
Repo
Framework
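
The FillIn mechanism described above selects small superpixel regions from the low level feature map and fills them into the high level feature map. The sketch below shows that masking-and-filling step on toy arrays; the segmentation and the notion of which regions count as small objects are hypothetical stand-ins for the unsupervised prior knowledge used in the paper.

```python
# Toy FillIn-style fusion: superpixel regions flagged as small objects keep their
# low level features; everywhere else the high level features are used.
import numpy as np

H = W = 8
low  = np.random.rand(4, H, W)             # low level features (C, H, W)
high = np.random.rand(4, H, W)             # high level features upsampled to the same size
superpixels = np.zeros((H, W), dtype=int)  # toy segmentation with two regions
superpixels[:4, :4] = 1
small_object_regions = {1}                 # prior knowledge: region 1 is a small object

mask = np.isin(superpixels, list(small_object_regions))
fused = np.where(mask[None], low, high)    # fill low level features into the chosen regions
print(fused.shape)
```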