May 7, 2019

Paper Group ANR 124

Using LSTM recurrent neural networks for monitoring the LHC superconducting magnets

Title Using LSTM recurrent neural networks for monitoring the LHC superconducting magnets
Authors Maciej Wielgosz, Andrzej Skoczeń, Matej Mertik
Abstract The superconducting LHC magnets are coupled with an electronic monitoring system which records and analyses voltage time series reflecting their performance. The currently used system is based on a range of preprogrammed triggers which launch protection procedures when a misbehavior of the magnets is detected. All the procedures used in the protection equipment were designed and implemented according to known working scenarios of the system and are updated and monitored by human operators. This paper proposes a novel approach to monitoring and fault protection of the Large Hadron Collider (LHC) superconducting magnets which employs state-of-the-art Deep Learning algorithms. Consequently, the authors of the paper decided to examine the performance of LSTM recurrent neural networks for modeling of voltage time series of the magnets. In order to address this challenging task, different network architectures and hyper-parameters were used to achieve the best possible performance of the solution. The regression results were measured in terms of RMSE for different numbers of future steps and history lengths taken into account for the prediction. The best result of RMSE=0.00104 was obtained for a network of 128 LSTM cells within the internal layer and a 16-step history buffer.
Tasks Time Series
Published 2016-11-18
URL http://arxiv.org/abs/1611.06241v2
PDF http://arxiv.org/pdf/1611.06241v2.pdf
PWC https://paperswithcode.com/paper/using-lstm-recurrent-neural-networks-for
Repo
Framework
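
The evaluation described in the abstract above (RMSE as a function of history length and number of future steps) can be illustrated with a minimal sketch. The synthetic sine signal, the helper names make_windows and rmse, and the persistence baseline are assumptions made for illustration only; they are not the authors' data, code, or LSTM model.

```python
import math


def make_windows(series, history, horizon):
    """Split a series into (history window, future window) pairs."""
    pairs = []
    for start in range(len(series) - history - horizon + 1):
        past = series[start:start + history]
        future = series[start + history:start + history + horizon]
        pairs.append((past, future))
    return pairs


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


# Synthetic voltage-like signal (an assumption, not LHC data).
signal = [math.sin(0.1 * t) for t in range(200)]

# 16-step history buffer, one future step.
windows = make_windows(signal, history=16, horizon=1)

# Naive persistence baseline: predict that the next value equals the last seen one.
predictions = [past[-1] for past, _ in windows]
targets = [future[0] for _, future in windows]
baseline_rmse = rmse(predictions, targets)
```

A trained regressor would replace the persistence baseline; comparing its RMSE against baseline_rmse for several history and horizon settings mirrors the kind of sweep the abstract reports.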

What to do about non-standard (or non-canonical) language in NLP

Title What to do about non-standard (or non-canonical) language in NLP
Authors Barbara Plank
Abstract Real world data differs radically from the benchmark corpora we use in natural language processing (NLP). As soon as we apply our technologies to the real world, performance drops. The reason for this problem is obvious: NLP models are trained on samples from a limited set of canonical varieties that are considered standard, most prominently English newswire. However, there are many dimensions, e.g., socio-demographics, language, genre, sentence type, etc. on which texts can differ from the standard. The solution is not obvious: we cannot control for all factors, and it is not clear how to best go beyond the current practice of training on homogeneous data from a single domain and language. In this paper, I review the notion of canonicity, and how it shapes our community’s approach to language. I argue for leveraging what I call fortuitous data, i.e., non-obvious data that is hitherto neglected, hidden in plain sight, or raw data that needs to be refined. If we embrace the variety of this heterogeneous data by combining it with proper algorithms, we will not only produce more robust models, but will also enable adaptive language technology capable of addressing natural language variation.
Tasks
Published 2016-08-28
URL http://arxiv.org/abs/1608.07836v1
PDF http://arxiv.org/pdf/1608.07836v1.pdf
PWC https://paperswithcode.com/paper/what-to-do-about-non-standard-or-non
Repo
Framework

Churn analysis using deep convolutional neural networks and autoencoders

Title Churn analysis using deep convolutional neural networks and autoencoders
Authors Artit Wangperawong, Cyrille Brun, Olav Laudy, Rujikorn Pavasuthipaisit
Abstract Customer temporal behavioral data was represented as images in order to perform churn prediction by leveraging deep learning architectures prominent in image classification. Supervised learning was performed on labeled data of over 6 million customers using deep convolutional neural networks, which achieved an AUC of 0.743 on the test dataset using no more than 12 temporal features for each customer. Unsupervised learning was conducted using autoencoders to better understand the reasons for customer churn. Images that maximally activate the hidden units of an autoencoder trained with churned customers reveal ample opportunities for action to be taken to prevent churn among strong data, no voice users.
Tasks Image Classification
Published 2016-04-18
URL http://arxiv.org/abs/1604.05377v1
PDF http://arxiv.org/pdf/1604.05377v1.pdf
PWC https://paperswithcode.com/paper/churn-analysis-using-deep-convolutional
Repo
Framework

Prediction of Seasonal Temperature Using Soft Computing Techniques: Application in Benevento (Southern Italy) Area

Title Prediction of Seasonal Temperature Using Soft Computing Techniques: Application in Benevento (Southern Italy) Area
Authors Salvatore Rampone, Alessio Valente
Abstract In this work two soft computing methods, Artificial Neural Networks and Genetic Programming, are proposed in order to forecast the mean temperature that will occur in future seasons. The area in which the soft computing techniques were applied is that of the surroundings of the town of Benevento, in the south of Italy, having geographic coordinates (lat. 41°07′50″N; long. 14°47′13″E). This area is not affected by maritime influences as well as by winds coming from the west. The methods are fed by data recorded in the meteorological stations of Benevento and Castelvenere, located in the hilly area, which characterizes the territory surrounding this city, at 144 m a.s.l. Both the applied methods show low error rates, while the Genetic Programming offers an explicit rule representation (a formula) explaining the prevision. Keywords: Seasonal Temperature Forecasting; Soft Computing; Artificial Neural Networks; Genetic Programming; Southern Italy.
Tasks
Published 2016-11-15
URL http://arxiv.org/abs/1611.04767v1
PDF http://arxiv.org/pdf/1611.04767v1.pdf
PWC https://paperswithcode.com/paper/prediction-of-seasonal-temperature-using-soft
Repo
Framework

lpopt: A Rule Optimization Tool for Answer Set Programming

Title lpopt: A Rule Optimization Tool for Answer Set Programming
Authors Manuel Bichler, Michael Morak, Stefan Woltran
Abstract State-of-the-art answer set programming (ASP) solvers rely on a program called a grounder to convert non-ground programs containing variables into variable-free, propositional programs. The size of this grounding depends heavily on the size of the non-ground rules, and thus, reducing the size of such rules is a promising approach to improve solving performance. To this end, in this paper we announce lpopt, a tool that decomposes large logic programming rules into smaller rules that are easier to handle for current solvers. The tool is specifically tailored to handle the standard syntax of the ASP language (ASP-Core) and makes it easier for users to write efficient and intuitive ASP programs, which would otherwise often require significant hand-tuning by expert ASP engineers. It is based on an idea proposed by Morak and Woltran (2012) that we extend significantly in order to handle the full ASP syntax, including complex constructs like aggregates, weak constraints, and arithmetic expressions. We present the algorithm, the theoretical foundations on how to treat these constructs, as well as an experimental evaluation showing the viability of our approach.
Tasks
Published 2016-08-19
URL http://arxiv.org/abs/1608.05675v2
PDF http://arxiv.org/pdf/1608.05675v2.pdf
PWC https://paperswithcode.com/paper/lpopt-a-rule-optimization-tool-for-answer-set
Repo
Framework

Driving CDCL Search

Title Driving CDCL Search
Authors Carmine Dodaro, Philip Gasteiger, Nicola Leone, Benjamin Musitsch, Francesco Ricca, Konstantin Schekotihin
Abstract The CDCL algorithm is the leading solution adopted by state-of-the-art solvers for SAT, SMT, ASP, and others. Experiments show that the performance of CDCL solvers can be significantly boosted by embedding domain-specific heuristics, especially on large real-world problems. However, a proper integration of such criteria in off-the-shelf CDCL implementations is not obvious. In this paper, we distill the key ingredients that drive the search of CDCL solvers, and propose a general framework for designing and implementing new heuristics. We implemented our strategy in an ASP solver, and we experimented on two industrial domains. On hard problem instances, state-of-the-art implementations fail to find any solution in acceptable time, whereas our implementation is very successful and finds all solutions.
Tasks
Published 2016-11-16
URL http://arxiv.org/abs/1611.05190v1
PDF http://arxiv.org/pdf/1611.05190v1.pdf
PWC https://paperswithcode.com/paper/driving-cdcl-search
Repo
Framework

Spontaneous Subtle Expression Detection and Recognition based on Facial Strain

Title Spontaneous Subtle Expression Detection and Recognition based on Facial Strain
Authors Sze-Teng Liong, John See, Raphael Chung-Wei Phan, Yee-Hui Oh, Anh Cat Le Ngo, KokSheik Wong, Su-Wei Tan
Abstract Optical strain is an extension of optical flow that is capable of quantifying subtle changes on faces and representing the minute facial motion intensities at the pixel level. This is computationally essential for the relatively new field of spontaneous micro-expression, where subtle expressions can be technically challenging to pinpoint. In this paper, we present a novel method for detecting and recognizing micro-expressions by utilizing facial optical strain magnitudes to construct optical strain features and optical strain weighted features. The two sets of features are then concatenated to form the resultant feature histogram. Experiments were performed on the CASME II and SMIC databases. We demonstrate, on both databases, the usefulness of optical strain information and, more importantly, that our best approaches are able to outperform the original baseline results for both detection and recognition tasks. A comparison of the proposed method with other existing spatio-temporal feature extraction approaches is also presented.
Tasks Optical Flow Estimation
Published 2016-06-09
URL http://arxiv.org/abs/1606.02792v1
PDF http://arxiv.org/pdf/1606.02792v1.pdf
PWC https://paperswithcode.com/paper/spontaneous-subtle-expression-detection-and
Repo
Framework

Identifying Unknown Unknowns in the Open World: Representations and Policies for Guided Exploration

Title Identifying Unknown Unknowns in the Open World: Representations and Policies for Guided Exploration
Authors Himabindu Lakkaraju, Ece Kamar, Rich Caruana, Eric Horvitz
Abstract Predictive models deployed in the real world may assign incorrect labels to instances with high confidence. Such errors or unknown unknowns are rooted in model incompleteness, and typically arise because of the mismatch between training data and the cases encountered at test time. As the models are blind to such errors, input from an oracle is needed to identify these failures. In this paper, we formulate and address the problem of informed discovery of unknown unknowns of any given predictive model where unknown unknowns occur due to systematic biases in the training data. We propose a model-agnostic methodology which uses feedback from an oracle to both identify unknown unknowns and to intelligently guide the discovery. We employ a two-phase approach which first organizes the data into multiple partitions based on the feature similarity of instances and the confidence scores assigned by the predictive model, and then utilizes an explore-exploit strategy for discovering unknown unknowns across these partitions. We demonstrate the efficacy of our framework by varying the underlying causes of unknown unknowns across various applications. To the best of our knowledge, this paper presents the first algorithmic approach to the problem of discovering unknown unknowns of predictive models.
Tasks
Published 2016-10-28
URL http://arxiv.org/abs/1610.09064v3
PDF http://arxiv.org/pdf/1610.09064v3.pdf
PWC https://paperswithcode.com/paper/identifying-unknown-unknowns-in-the-open
Repo
Framework
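
The second phase described in the abstract above, an explore-exploit strategy that spends a limited budget of oracle queries across partitions, can be sketched with a standard bandit rule. UCB1 is swapped in here as an assumption; the abstract does not name the specific policy. The function name ucb1_queries, the hidden per-partition rates, and the simulated oracle are all hypothetical.

```python
import math
import random


def ucb1_queries(partition_rates, budget, seed=0):
    """Spend `budget` oracle queries across partitions using UCB1.

    partition_rates: hidden probability that a queried instance from each
    partition turns out to be an unknown unknown (simulated oracle).
    Returns (queries per partition, discoveries per partition).
    """
    rng = random.Random(seed)
    counts = [0] * len(partition_rates)
    hits = [0] * len(partition_rates)
    for step in range(1, budget + 1):
        untried = [i for i, c in enumerate(counts) if c == 0]
        if untried:
            # Explore: query every partition at least once.
            arm = untried[0]
        else:
            # Exploit with an optimism bonus that shrinks as a partition is sampled.
            arm = max(
                range(len(counts)),
                key=lambda i: hits[i] / counts[i]
                + math.sqrt(2 * math.log(step) / counts[i]),
            )
        found = rng.random() < partition_rates[arm]  # simulated oracle answer
        counts[arm] += 1
        hits[arm] += int(found)
    return counts, hits


counts, hits = ucb1_queries([0.05, 0.40, 0.10], budget=60)
```

Partitions with a higher observed discovery rate attract more of the budget over time, while the bonus term keeps rarely queried partitions from being ignored entirely.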

Weakly supervised spoken term discovery using cross-lingual side information

Title Weakly supervised spoken term discovery using cross-lingual side information
Authors Sameer Bansal, Herman Kamper, Sharon Goldwater, Adam Lopez
Abstract Recent work on unsupervised term discovery (UTD) aims to identify and cluster repeated word-like units from audio alone. These systems are promising for some very low-resource languages where transcribed audio is unavailable, or where no written form of the language exists. However, in some cases it may still be feasible (e.g., through crowdsourcing) to obtain (possibly noisy) text translations of the audio. If so, this information could be used as a source of side information to improve UTD. Here, we present a simple method for rescoring the output of a UTD system using text translations, and test it on a corpus of Spanish audio with English translations. We show that it greatly improves the average precision of the results over a wide range of system configurations and data preprocessing methods.
Tasks
Published 2016-09-21
URL http://arxiv.org/abs/1609.06530v1
PDF http://arxiv.org/pdf/1609.06530v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-spoken-term-discovery-using
Repo
Framework

Data-driven root-cause analysis for distributed system anomalies

Title Data-driven root-cause analysis for distributed system anomalies
Authors Chao Liu, Kin Gwn Lore, Soumik Sarkar
Abstract Modern distributed cyber-physical systems encounter a large variety of anomalies and in many cases, they are vulnerable to catastrophic fault propagation scenarios due to strong connectivity among the sub-systems. In this regard, root-cause analysis becomes highly intractable due to complex fault propagation mechanisms in combination with diverse operating modes. This paper presents a new data-driven framework for root-cause analysis for addressing such issues. The framework is based on a spatiotemporal feature extraction scheme for distributed cyber-physical systems built on the concept of symbolic dynamics for discovering and representing causal interactions among subsystems of a complex system. We present two approaches for root-cause analysis, namely the sequential state switching ($S^3$, based on the free energy concept of a Restricted Boltzmann Machine, RBM) and artificial anomaly association ($A^3$, a multi-class classification framework using deep neural networks, DNN). Synthetic data from cases with failed pattern(s) and anomalous node are simulated to validate the proposed approaches, which are then compared with the performance of vector autoregressive (VAR) model-based root-cause analysis. A real dataset based on the Tennessee Eastman process (TEP) is also used for validation. The results show that: (1) $S^3$ and $A^3$ approaches can obtain high accuracy in root-cause analysis and successfully handle multiple nominal operation modes, and (2) the proposed tool-chain is shown to be scalable while maintaining high accuracy.
Tasks
Published 2016-05-20
URL http://arxiv.org/abs/1605.06421v2
PDF http://arxiv.org/pdf/1605.06421v2.pdf
PWC https://paperswithcode.com/paper/data-driven-root-cause-analysis-for
Repo
Framework

Semi-parametric Order-based Generalized Multivariate Regression

Title Semi-parametric Order-based Generalized Multivariate Regression
Authors Milad Kharratzadeh, Mark Coates
Abstract In this paper, we consider a generalized multivariate regression problem where the responses are monotonic functions of linear transformations of predictors. We propose a semi-parametric algorithm based on the ordering of the responses which is invariant to the functional form of the transformation function. We prove that our algorithm, which maximizes the rank correlation of responses and linear transformations of predictors, is a consistent estimator of the true coefficient matrix. We also identify the rate of convergence and show that the squared estimation error decays with a rate of $o(1/\sqrt{n})$. We then propose a greedy algorithm to maximize the highly non-smooth objective function of our model and examine its performance through extensive simulations. Finally, we compare our algorithm with traditional multivariate regression algorithms over synthetic and real data.
Tasks
Published 2016-02-19
URL http://arxiv.org/abs/1602.06276v1
PDF http://arxiv.org/pdf/1602.06276v1.pdf
PWC https://paperswithcode.com/paper/semi-parametric-order-based-generalized
Repo
Framework
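
The key property claimed in the abstract above, that an order-based objective is invariant to the unknown monotonic transformation, can be checked on a toy example. Kendall's tau is used here as one possible rank correlation (an assumption; the abstract does not say which one the authors use), and the data, the exponential link, and the coefficient vectors are made up for illustration.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                score += 1
            elif sign < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def project(rows, beta):
    """Linear transformation of each predictor row by coefficient vector beta."""
    return [sum(b * v for b, v in zip(beta, row)) for row in rows]


# Hypothetical predictors and coefficients.
predictors = [[0.2, 1.0], [1.5, -0.3], [0.7, 0.7], [-1.0, 2.0], [2.2, 0.1]]
true_beta = [1.0, 0.5]

# Responses pass through a monotone link that the estimator never sees.
responses = [math.exp(z) for z in project(predictors, true_beta)]

tau_true = kendall_tau(project(predictors, true_beta), responses)
tau_wrong = kendall_tau(project(predictors, [-1.0, 0.5]), responses)
```

Because exp preserves ordering, the true coefficients reach the maximal rank correlation without the link function ever being estimated, while the wrong coefficients score lower. Searching over coefficients to maximize such a score is the non-smooth optimization the abstract refers to.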

Machine Translation Evaluation Resources and Methods: A Survey

Title Machine Translation Evaluation Resources and Methods: A Survey
Authors Lifeng Han
Abstract We introduce the Machine Translation (MT) evaluation survey that contains both manual and automatic evaluation methods. The traditional human evaluation criteria mainly include the intelligibility, fidelity, fluency, adequacy, comprehension, and informativeness. The advanced human assessments include task-oriented measures, post-editing, segment ranking, and extended criteria, etc. We classify the automatic evaluation methods into two categories, including lexical similarity scenario and linguistic features application. The lexical similarity methods contain edit distance, precision, recall, F-measure, and word order. The linguistic features can be divided into syntactic features and semantic features respectively. The syntactic features include part of speech tag, phrase types and sentence structures, and the semantic features include named entity, synonyms, textual entailment, paraphrase, semantic roles, and language models. The deep learning models for evaluation are very newly proposed. Subsequently, we also introduce the evaluation methods for MT evaluation including different correlation scores, and the recent quality estimation (QE) tasks for MT. This paper differs from the existing works [GALEprogram2009, EuroMatrixProject2007] in several respects, by introducing some recent development of MT evaluation measures, the different classifications from manual to automatic evaluation measures, the introduction of recent QE tasks of MT, and the concise construction of the content. We hope this work will be helpful for MT researchers to easily pick up some metrics that are best suited for their specific MT model development, and help MT evaluation researchers to get a general clue of how MT evaluation research developed. Furthermore, hopefully, this work can also shine some light on other evaluation tasks, except for translation, of NLP fields.
Tasks Machine Translation, Natural Language Inference
Published 2016-05-15
URL http://arxiv.org/abs/1605.04515v8
PDF http://arxiv.org/pdf/1605.04515v8.pdf
PWC https://paperswithcode.com/paper/machine-translation-evaluation-resources-and
Repo
Framework
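
Several of the lexical similarity measures listed in the abstract above (edit distance, precision, recall, F-measure) are simple enough to sketch directly. The version below works on whitespace tokens with clipped unigram counts; the function names and the example sentences are illustrative assumptions, not the definitions of any particular published metric.

```python
from collections import Counter


def edit_distance(hyp, ref):
    """Levenshtein distance between two sequences (tokens or characters)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def precision_recall_f(hyp, ref):
    """Unigram precision, recall and balanced F-measure with clipped counts."""
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    precision = overlap / len(hyp) if hyp else 0.0
    recall = overlap / len(ref) if ref else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


hypothesis = "the cat sat on mat".split()
reference = "the cat sat on the mat".split()

distance = edit_distance(hypothesis, reference)
precision, recall, f_measure = precision_recall_f(hypothesis, reference)
```

Here the hypothesis is missing one token, so precision stays perfect while recall and the edit distance register the omission. Word-order measures and the linguistic-feature metrics in the survey need more machinery than this sketch covers.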

Dense Prediction on Sequences with Time-Dilated Convolutions for Speech Recognition

Title Dense Prediction on Sequences with Time-Dilated Convolutions for Speech Recognition
Authors Tom Sercu, Vaibhava Goel
Abstract In computer vision, pixelwise dense prediction is the task of predicting a label for each pixel in the image. Convolutional neural networks achieve good performance on this task, while being computationally efficient. In this paper we carry these ideas over to the problem of assigning a sequence of labels to a set of speech frames, a task commonly known as framewise classification. We show that the dense prediction view of framewise classification offers several advantages and insights, including computational efficiency and the ability to apply batch normalization. When doing dense prediction we pay specific attention to strided pooling in time and introduce an asymmetric dilated convolution, called time-dilated convolution, that allows for efficient and elegant implementation of pooling in time. We show results using time-dilated convolutions in a very deep VGG-style CNN with batch normalization on the Hub5 Switchboard-2000 benchmark task. With a big n-gram language model, we achieve 7.7% WER, which is the best single-model, single-pass performance reported so far.
Tasks Language Modelling, Speech Recognition
Published 2016-11-28
URL http://arxiv.org/abs/1611.09288v2
PDF http://arxiv.org/pdf/1611.09288v2.pdf
PWC https://paperswithcode.com/paper/dense-prediction-on-sequences-with-time
Repo
Framework
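
The basic mechanism behind a convolution that is dilated along the time axis can be shown in one dimension: the kernel taps are spaced `dilation` frames apart, so the receptive field grows without subsampling the output. This is a generic sketch of dilation only, under the assumption of a single feature channel and valid-mode boundaries; it is not the authors' network or their treatment of pooling, and the name dilated_conv1d is hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D correlation whose kernel taps are `dilation` frames apart."""
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")
    span = (len(kernel) - 1) * dilation + 1  # receptive field in frames
    return [
        sum(w * signal[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(signal) - span + 1)
    ]


frames = [float(t) for t in range(10)]  # toy per-frame feature values
kernel = [1.0, 0.0, -1.0]               # simple difference filter

dense = dilated_conv1d(frames, kernel, dilation=1)    # compares frames 2 apart
dilated = dilated_conv1d(frames, kernel, dilation=2)  # compares frames 4 apart
```

With dilation 2 the same three weights cover five frames, and every valid frame position still gets an output, which is the property dense framewise prediction relies on.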

Recognition of Visually Perceived Compositional Human Actions by Multiple Spatio-Temporal Scales Recurrent Neural Networks

Title Recognition of Visually Perceived Compositional Human Actions by Multiple Spatio-Temporal Scales Recurrent Neural Networks
Authors Haanvid Lee, Minju Jung, Jun Tani
Abstract The current paper proposes a novel neural network model for recognizing visually perceived human actions. The proposed multiple spatio-temporal scales recurrent neural network (MSTRNN) model is derived by introducing multiple timescale recurrent dynamics to the conventional convolutional neural network model. One of the essential characteristics of the MSTRNN is that its architecture imposes both spatial and temporal constraints simultaneously on the neural activity, which vary in multiple scales among different layers. As suggested by the principle of the upward and downward causation, it is assumed that the network can develop meaningful structures such as functional hierarchy by taking advantage of such constraints during the course of learning. To evaluate the characteristics of the model, the current study uses three types of human action video datasets consisting of different types of primitive actions and different levels of compositionality on them. The performance of the MSTRNN in testing with these datasets is compared with that of other representative deep learning models used in the field. The analysis of the internal representation obtained through learning with the datasets clarifies what sorts of functional hierarchy can be developed by extracting the essential compositionality underlying the datasets.
Tasks
Published 2016-02-05
URL http://arxiv.org/abs/1602.01921v3
PDF http://arxiv.org/pdf/1602.01921v3.pdf
PWC https://paperswithcode.com/paper/recognition-of-visually-perceived
Repo
Framework

Easy-First Dependency Parsing with Hierarchical Tree LSTMs

Title Easy-First Dependency Parsing with Hierarchical Tree LSTMs
Authors Eliyahu Kiperwasser, Yoav Goldberg
Abstract We suggest a compositional vector representation of parse trees that relies on a recursive combination of recurrent-neural network encoders. To demonstrate its effectiveness, we use the representation as the backbone of a greedy, bottom-up dependency parser, achieving state-of-the-art accuracies for English and Chinese, without relying on external word embeddings. The parser’s implementation is available for download at the first author’s webpage.
Tasks Dependency Parsing, Word Embeddings
Published 2016-03-01
URL http://arxiv.org/abs/1603.00375v2
PDF http://arxiv.org/pdf/1603.00375v2.pdf
PWC https://paperswithcode.com/paper/easy-first-dependency-parsing-with
Repo
Framework