Paper Group ANR 345
Interpretable Two-level Boolean Rule Learning for Classification. PSF : Introduction to R Package for Pattern Sequence Based Forecasting Algorithm. Entity Embedding-based Anomaly Detection for Heterogeneous Categorical Events. Right Ideals of a Ring and Sublanguages of Science. Challenges in Bayesian Adaptive Data Analysis. Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge. Effective Connectivity-Based Neural Decoding: A Causal Interaction-Driven Approach. Fuzzy clustering of distribution-valued data using adaptive L2 Wasserstein distances. An Unsupervised Method for Detection and Validation of The Optic Disc and The Fovea. Digital Stylometry: Linking Profiles Across Social Networks. Probabilistic Forecasting and Simulation of Electricity Markets via Online Dictionary Learning. TrueHappiness: Neuromorphic Emotion Recognition on TrueNorth. Network Unfolding Map by Edge Dynamics Modeling. Unified Scalable Equivalent Formulations for Schatten Quasi-Norms. Dense Bag-of-Temporal-SIFT-Words for Time Series Classification.
Interpretable Two-level Boolean Rule Learning for Classification
Title | Interpretable Two-level Boolean Rule Learning for Classification |
Authors | Guolong Su, Dennis Wei, Kush R. Varshney, Dmitry M. Malioutov |
Abstract | As a contribution to interpretable machine learning research, we develop a novel optimization framework for learning accurate and sparse two-level Boolean rules. We consider rules in both conjunctive normal form (AND-of-ORs) and disjunctive normal form (OR-of-ANDs). A principled objective function is proposed to trade off classification accuracy against interpretability, where we use Hamming loss to characterize accuracy and sparsity to characterize interpretability. We propose efficient procedures to optimize these objectives based on linear programming (LP) relaxation, block coordinate descent, and alternating minimization. Experiments show that our new algorithms provide very good tradeoffs between accuracy and interpretability. |
Tasks | Interpretable Machine Learning |
Published | 2016-06-18 |
URL | http://arxiv.org/abs/1606.05798v1 |
http://arxiv.org/pdf/1606.05798v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-two-level-boolean-rule-learning-1 |
Repo | |
Framework | |
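The accuracy/sparsity trade-off in this objective is easy to state concretely. Below is a minimal Python sketch (not the paper's LP-relaxation solver) that evaluates a fixed OR-of-ANDs rule on binary data and scores it by Hamming loss plus a per-literal sparsity penalty; the clause encoding and the weight `theta` are illustrative assumptions.

```python
import numpy as np

def dnf_predict(X, clauses):
    # Each clause is a list of feature indices to AND; the rule ORs the clauses.
    return np.array([any(all(x[j] for j in c) for c in clauses) for x in X], dtype=int)

def objective(X, y, clauses, theta=0.1):
    # Hamming loss plus theta per literal: the accuracy/sparsity trade-off.
    return np.sum(dnf_predict(X, clauses) != y) + theta * sum(len(c) for c in clauses)

X = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1]])
y = np.array([1, 1, 0])
print(objective(X, y, clauses=[[0]]))   # the one-literal rule "feature 0" fits perfectly: 0 + 0.1
```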
PSF : Introduction to R Package for Pattern Sequence Based Forecasting Algorithm
Title | PSF : Introduction to R Package for Pattern Sequence Based Forecasting Algorithm |
Authors | Neeraj Bokde, Gualberto Asencio-Cortés, Francisco Martínez-Álvarez, Kishore Kulat |
Abstract | This paper discusses an R package that implements the Pattern Sequence based Forecasting (PSF) algorithm, which was developed for univariate time series forecasting. The algorithm has been successfully applied in many different fields. The PSF algorithm consists of two major parts: clustering and prediction. The clustering part includes selection of the optimum number of clusters and labels the time series data with reference to those clusters. The prediction part includes functions such as optimum window size selection for specific patterns and prediction of future values with reference to past pattern sequences. The PSF package consists of various functions to implement the PSF algorithm, including one that automates all the others to obtain optimized prediction results. The aim of this package is to promote the PSF algorithm and to ease its implementation with minimal effort. This paper describes all the functions in the PSF package with their syntax and provides a simple usage example. Finally, the usefulness of the package is demonstrated by comparing it to auto.arima and ets, well-known time series forecasting functions available in the CRAN repository. |
Tasks | Time Series, Time Series Forecasting |
Published | 2016-06-17 |
URL | http://arxiv.org/abs/1606.05492v3 |
http://arxiv.org/pdf/1606.05492v3.pdf | |
PWC | https://paperswithcode.com/paper/psf-introduction-to-r-package-for-pattern |
Repo | |
Framework | |
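Since the package itself is in R, here is a hedged Python sketch of the two PSF stages (cluster-and-label, then match the last window's label pattern against history). The automatic selection of the cluster count and window size described in the abstract is omitted; `k` and `w` are fixed by hand.

```python
import numpy as np
from sklearn.cluster import KMeans

def psf_forecast(series, k=3, w=5, horizon=1):
    """Simplified PSF: cluster values, label the series, then average the
    values that historically followed the last window's label pattern."""
    series = np.asarray(series, dtype=float)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(series.reshape(-1, 1))
    pattern = labels[-w:]
    followers = [series[i + w : i + w + horizon]
                 for i in range(len(series) - w - horizon + 1)
                 if np.array_equal(labels[i : i + w], pattern)]
    if not followers:                       # no past match: fall back to the last value
        return np.repeat(series[-1], horizon)
    return np.mean(followers, axis=0)

print(psf_forecast(np.sin(np.arange(60.0)), k=3, w=4))
```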
Entity Embedding-based Anomaly Detection for Heterogeneous Categorical Events
Title | Entity Embedding-based Anomaly Detection for Heterogeneous Categorical Events |
Authors | Ting Chen, Lu-An Tang, Yizhou Sun, Zhengzhang Chen, Kai Zhang |
Abstract | Anomaly detection plays an important role in modern data-driven security applications, such as detecting suspicious access to a socket from a process. In many cases, such events can be described as a collection of categorical values that are considered as entities of different types, which we call heterogeneous categorical events. Due to the lack of intrinsic distance measures among entities and the exponentially large event space, most existing work relies heavily on heuristics to calculate abnormality scores for events. Unlike previous work, we propose a principled and unified probabilistic model, APE (Anomaly detection via Probabilistic pairwise interaction and Entity embedding), that directly models the likelihood of events. In this model, we embed entities into a common latent space using their observed co-occurrence in different events. More specifically, we first model the compatibility of each pair of entities according to their embeddings. Then we utilize the weighted pairwise interactions of different entity types to define the event probability. Using Noise-Contrastive Estimation with a “context-dependent” noise distribution, our model can be learned efficiently despite the large event space. Experimental results on real enterprise surveillance data show that our method detects abnormal events more accurately than other state-of-the-art anomaly detection techniques. |
Tasks | Anomaly Detection |
Published | 2016-08-26 |
URL | http://arxiv.org/abs/1608.07502v1 |
http://arxiv.org/pdf/1608.07502v1.pdf | |
PWC | https://paperswithcode.com/paper/entity-embedding-based-anomaly-detection-for |
Repo | |
Framework | |
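A sketch of the scoring side of APE, under assumed shapes: each entity type gets an embedding table, and an event's unnormalized log-likelihood is a weighted sum of pairwise dot-product compatibilities over all type pairs. The NCE training loop that would actually fit `emb` and `w` is not shown; the values below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_types, dim = 4, 16                 # e.g. user, process, socket, host
vocab = [50, 200, 30, 20]            # illustrative entity counts per type
emb = [rng.normal(size=(v, dim)) for v in vocab]   # embeddings (would be learned)
w = np.ones((n_types, n_types))      # pairwise type weights (would be learned)

def event_score(event):
    """Unnormalized log-likelihood of an event (one entity id per type):
    weighted sum of dot-product compatibilities over all type pairs."""
    return sum(w[a, b] * (emb[a][event[a]] @ emb[b][event[b]])
               for a in range(n_types) for b in range(a + 1, n_types))

print(event_score([3, 17, 5, 2]))    # low-scoring events are flagged as anomalous
```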
Right Ideals of a Ring and Sublanguages of Science
Title | Right Ideals of a Ring and Sublanguages of Science |
Authors | Javier Arias Navarro |
Abstract | Among Zellig Harris’s numerous contributions to linguistics, his theory of the sublanguages of science probably ranks among the most underrated. However, not only has this theory led to some exhaustive and meaningful applications in the study of the grammar of immunology language and its changes over time, but it also illustrates the nature of the mathematical relations between chunks or subsets of a grammar and the language as a whole. This becomes most clear when dealing with the connection between metalanguage and language, as well as when reflecting on operators. This paper tries to justify the claim that the sublanguages of science stand in a particular algebraic relation to the rest of the language in which they are embedded, namely, that of right ideals in a ring. |
Tasks | |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.01032v2 |
http://arxiv.org/pdf/1603.01032v2.pdf | |
PWC | https://paperswithcode.com/paper/right-ideals-of-a-ring-and-sublanguages-of |
Repo | |
Framework | |
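For readers without the algebra background, the standard definition the title alludes to is the following (textbook material, not the paper's own notation):

```latex
% A subset I of a ring R is a right ideal when it is an additive
% subgroup of R that absorbs multiplication by R from the right.
\[
  I \text{ is a right ideal of } R \iff
  (I, +) \le (R, +) \quad\text{and}\quad
  \forall a \in I,\ \forall r \in R:\ ar \in I.
\]
```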
Challenges in Bayesian Adaptive Data Analysis
Title | Challenges in Bayesian Adaptive Data Analysis |
Authors | Sam Elder |
Abstract | Traditional statistical analysis requires that the analysis process and data are independent. By contrast, the new field of adaptive data analysis hopes to understand and provide algorithms and accuracy guarantees for research as it is commonly performed in practice, as an iterative process of interacting repeatedly with the same data set, such as repeated tests against a holdout set. Previous work has defined a model with a rather strong lower bound on sample complexity in terms of the number of queries, $n \sim \sqrt{q}$, arguing that adaptive data analysis is much harder than static data analysis, where $n \sim \log q$ is possible. Instead, we argue that those strong lower bounds point to a limitation of the previous model in that it must consider wildly asymmetric scenarios which do not hold in typical applications. To better understand other difficulties of adaptivity, we propose a new Bayesian version of the problem that mandates symmetry. Since the other lower bound techniques are ruled out, we can more effectively see difficulties that might otherwise be overshadowed. As a first contribution to this model, we produce a new problem using error-correcting codes on which a large family of methods, including all previously proposed algorithms, require roughly $n \sim \sqrt[4]{q}$. These early results illustrate new difficulties in adaptive data analysis regarding slightly correlated queries on problems with concentrated uncertainty. |
Tasks | |
Published | 2016-04-08 |
URL | http://arxiv.org/abs/1604.02492v5 |
http://arxiv.org/pdf/1604.02492v5.pdf | |
PWC | https://paperswithcode.com/paper/challenges-in-bayesian-adaptive-data-analysis |
Repo | |
Framework | |
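The gap between static and adaptive analysis is easy to reproduce numerically. The snippet below is the standard frequentist demonstration, not the paper's Bayesian construction: after answering $d$ innocuous mean queries exactly, a single query assembled from those answers is already biased, even though every coordinate's true mean is zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 400
X = rng.choice([-1.0, 1.0], size=(n, d))   # every coordinate has true mean 0

answers = X.mean(axis=0)                   # exact answers to d non-adaptive mean queries
q = np.sign(answers)                       # one new query built from those answers
adaptive_answer = (X @ q).mean() / d       # empirical answer, ~ sqrt(2 / (pi * n))
print(adaptive_answer, "vs. true value 0.0")
```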
Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge
Title | Open-Vocabulary Semantic Parsing with both Distributional Statistics and Formal Knowledge |
Authors | Matt Gardner, Jayant Krishnamurthy |
Abstract | Traditional semantic parsers map language onto compositional, executable queries in a fixed schema. This mapping allows them to effectively leverage the information contained in large, formal knowledge bases (KBs, e.g., Freebase) to answer questions, but it is also fundamentally limiting: these semantic parsers can only assign meaning to language that falls within the KB’s manually-produced schema. Recently proposed methods for open vocabulary semantic parsing overcome this limitation by learning execution models for arbitrary language, essentially using a text corpus as a kind of knowledge base. However, all prior approaches to open vocabulary semantic parsing replace a formal KB with textual information, making no use of the KB in their models. We show how to combine the disparate representations used by these two approaches, presenting for the first time a semantic parser that (1) produces compositional, executable representations of language, (2) can successfully leverage the information contained in both a formal KB and a large corpus, and (3) is not limited to the schema of the underlying KB. We demonstrate significantly improved performance over state-of-the-art baselines on an open-domain natural language question answering task. |
Tasks | Question Answering, Semantic Parsing |
Published | 2016-07-12 |
URL | http://arxiv.org/abs/1607.03542v2 |
http://arxiv.org/pdf/1607.03542v2.pdf | |
PWC | https://paperswithcode.com/paper/open-vocabulary-semantic-parsing-with-both |
Repo | |
Framework | |
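A heavily simplified sketch of the combination in contribution (2): score an open-vocabulary predicate against an entity using both formal KB indicator features and a distributional embedding. The feature names, shapes, and logistic form are assumptions for illustration; the paper's actual parser and training procedure are far richer.

```python
import numpy as np

def predicate_prob(kb_features, entity_embedding, predicate_weights):
    # Concatenate formal-KB features (e.g. one-hot schema types) with a
    # corpus-learned embedding, then score with a learned predicate vector.
    x = np.concatenate([kb_features, entity_embedding])
    return 1.0 / (1.0 + np.exp(-(predicate_weights @ x)))

kb = np.array([1.0, 0.0, 1.0])            # hypothetical schema indicator features
embed = np.array([0.2, -0.5, 0.1, 0.9])   # hypothetical distributional embedding
wvec = 0.3 * np.ones(7)                   # hypothetical learned predicate weights
print(predicate_prob(kb, embed, wvec))
```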
Effective Connectivity-Based Neural Decoding: A Causal Interaction-Driven Approach
Title | Effective Connectivity-Based Neural Decoding: A Causal Interaction-Driven Approach |
Authors | Saba Emrani, Hamid Krim |
Abstract | We propose a geometric, model-free causality measure based on multivariate delay embedding that can efficiently detect linear and nonlinear causal interactions between time series with no prior information. We then exploit the proposed causal interaction measure in real MEG data analysis. The results are used to construct effective connectivity maps of brain activity to decode different categories of visual stimuli. Moreover, we discovered that the MEG-based effective connectivity maps exhibit more geometric patterns in response to structured images, as disclosed by analyzing the evolution of topological structures of the underlying networks using persistent homology. Extensive simulations and experiments have been carried out to substantiate the capabilities of the proposed approach. |
Tasks | Time Series |
Published | 2016-07-24 |
URL | http://arxiv.org/abs/1607.07078v1 |
http://arxiv.org/pdf/1607.07078v1.pdf | |
PWC | https://paperswithcode.com/paper/effective-connectivity-based-neural-decoding |
Repo | |
Framework | |
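Delay embedding is the geometric workhorse here. A minimal sketch of constructing the embedded point cloud; the causality measure built on top of it (comparing the geometry of cross-mapped neighborhoods) is not reproduced.

```python
import numpy as np

def delay_embed(x, dim=3, tau=2):
    """Delay embedding of a scalar series: row t is the point
    (x[t], x[t+tau], ..., x[t+(dim-1)*tau]) in R^dim."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

cloud = delay_embed(np.sin(np.linspace(0, 20, 300)), dim=3, tau=5)
print(cloud.shape)                          # (290, 3)
```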
Fuzzy clustering of distribution-valued data using adaptive L2 Wasserstein distances
Title | Fuzzy clustering of distribution-valued data using adaptive L2 Wasserstein distances |
Authors | Antonio Irpino, Francisco De Carvalho, Rosanna Verde |
Abstract | Distributional (or distribution-valued) data are a new type of data arising from several sources and are considered realizations of distributional variables. A new set of fuzzy c-means algorithms for data described by distributional variables is proposed. The algorithms use the $L_2$ Wasserstein distance between distributions as the dissimilarity measure. Besides extending the fuzzy c-means algorithm to distributional data, and considering a decomposition of the squared $L_2$ Wasserstein distance, we propose a set of algorithms using different automatic ways to compute the weights associated with the variables as well as with their components, globally or cluster-wise. The relevance weights are computed in the clustering process by introducing product-to-one constraints. The relevance weights induce adaptive distances expressing the importance of each variable or of each component in the clustering process, also acting as a variable selection method in clustering. We have tested the proposed algorithms on artificial and real-world data. Results confirm that the proposed methods better capture the cluster structure of the data than the standard fuzzy c-means with non-adaptive distances. |
Tasks | |
Published | 2016-05-02 |
URL | http://arxiv.org/abs/1605.00513v1 |
http://arxiv.org/pdf/1605.00513v1.pdf | |
PWC | https://paperswithcode.com/paper/fuzzy-clustering-of-distribution-valued-data |
Repo | |
Framework | |
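For one-dimensional distributions, the $L_2$ Wasserstein distance has a convenient closed form as the $L_2$ distance between quantile functions, which is what makes the decomposition mentioned in the abstract tractable. A sketch of that distance follows; the fuzzy c-means loop and the adaptive relevance weights are omitted.

```python
import numpy as np

def wasserstein2_sq(sample_a, sample_b, n_quantiles=200):
    """Squared L2 Wasserstein distance between two 1-D distributions via
    the identity W2^2 = integral_0^1 (Qa(t) - Qb(t))^2 dt, approximated
    on an even quantile grid."""
    t = (np.arange(n_quantiles) + 0.5) / n_quantiles
    qa = np.quantile(sample_a, t)
    qb = np.quantile(sample_b, t)
    return float(np.mean((qa - qb) ** 2))

rng = np.random.default_rng(0)
# For N(0,1) vs N(1,1), W2^2 = (mean shift)^2 + (std shift)^2 = 1.
print(wasserstein2_sq(rng.normal(0, 1, 5000), rng.normal(1, 1, 5000)))
```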
An Unsupervised Method for Detection and Validation of The Optic Disc and The Fovea
Title | An Unsupervised Method for Detection and Validation of The Optic Disc and The Fovea |
Authors | Mrinal Haloi, Samarendra Dandapat, Rohit Sinha |
Abstract | In this work, we present a novel method for detecting retinal image features, the optic disc and the fovea, from colour fundus photographs of dilated eyes for a Computer-aided Diagnosis (CAD) system. A saliency-map-based method was used to detect the optic disc, followed by an unsupervised probabilistic Latent Semantic Analysis for detection validation. The validation concept is based on the distinct vessel structures in the optic disc. Using the clinical information on the standard location of the fovea with respect to the optic disc, the macula region is estimated. Detection accuracy of 100% is achieved for the optic disc and the macula on the MESSIDOR and DIARETDB1 datasets, and 98.8% on the STARE dataset. |
Tasks | |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06608v1 |
http://arxiv.org/pdf/1601.06608v1.pdf | |
PWC | https://paperswithcode.com/paper/an-unsupervised-method-for-detection-and |
Repo | |
Framework | |
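The macula-estimation step leans on a clinical prior: the fovea lies roughly 2.5 optic disc diameters temporal to the disc centre and slightly inferior to it. A sketch of that step alone, with the exact offsets flagged as assumptions (the temporal direction depends on eye laterality and image orientation):

```python
def estimate_fovea(od_cx, od_cy, od_diameter, temporal_sign=1):
    """Estimate the fovea from the detected optic disc via the clinical
    prior: ~2.5 disc diameters temporal to the disc centre, slightly
    inferior.  temporal_sign encodes eye laterality / image orientation;
    the 0.3 vertical offset is an illustrative assumption."""
    return od_cx + temporal_sign * 2.5 * od_diameter, od_cy + 0.3 * od_diameter

print(estimate_fovea(od_cx=300, od_cy=240, od_diameter=80))   # (500.0, 264.0)
```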
Digital Stylometry: Linking Profiles Across Social Networks
Title | Digital Stylometry: Linking Profiles Across Social Networks |
Authors | Soroush Vosoughi, Helen Zhou, Deb Roy |
Abstract | There is an ever-growing number of users with accounts on multiple social media and networking sites. Consequently, there is increasing interest in matching user accounts and profiles across different social networks in order to create aggregate profiles of users. In this paper, we present models for Digital Stylometry, which is a method for matching users through stylometry-inspired techniques. We experimented with linguistic, temporal, and combined temporal-linguistic models for matching user accounts, using standard and novel techniques. Using publicly available data, our best model, a combined temporal-linguistic one, was able to correctly match the accounts of 31% of 5,612 distinct users across Twitter and Facebook. |
Tasks | |
Published | 2016-05-17 |
URL | http://arxiv.org/abs/1605.05166v1 |
http://arxiv.org/pdf/1605.05166v1.pdf | |
PWC | https://paperswithcode.com/paper/digital-stylometry-linking-profiles-across |
Repo | |
Framework | |
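A bare-bones sketch of the matching step, assuming each account is already summarized as a feature vector (for example, word-usage and posting-time histograms): normalize and take nearest neighbors by cosine similarity across networks. The paper's temporal-linguistic models that would produce those features are not shown.

```python
import numpy as np

def match_accounts(feats_a, feats_b):
    """Nearest-neighbor account matching across two networks by cosine
    similarity of per-account feature vectors (one row per account)."""
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    return (a @ b.T).argmax(axis=1)   # for each account on A, its best match on B
```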
Probabilistic Forecasting and Simulation of Electricity Markets via Online Dictionary Learning
Title | Probabilistic Forecasting and Simulation of Electricity Markets via Online Dictionary Learning |
Authors | Weisi Deng, Yuting Ji, Lang Tong |
Abstract | The problem of probabilistic forecasting and online simulation of real-time electricity market with stochastic generation and demand is considered. By exploiting the parametric structure of the direct current optimal power flow, a new technique based on online dictionary learning (ODL) is proposed. The ODL approach incorporates real-time measurements and historical traces to produce forecasts of joint and marginal probability distributions of future locational marginal prices, power flows, and dispatch levels, conditional on the system state at the time of forecasting. Compared with standard Monte Carlo simulation techniques, the ODL approach offers several orders of magnitude improvement in computation time, making it feasible for online forecasting of market operations. Numerical simulations on large and moderate size power systems illustrate its performance and complexity features and its potential as a tool for system operators. |
Tasks | Dictionary Learning |
Published | 2016-06-25 |
URL | http://arxiv.org/abs/1606.07855v1 |
http://arxiv.org/pdf/1606.07855v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-forecasting-and-simulation-of |
Repo | |
Framework | |
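The ODL machinery itself is available off the shelf. The sketch below shows the generic online update with scikit-learn's `MiniBatchDictionaryLearning`, using random data as a stand-in for market traces; what the paper actually adds, exploiting the parametric structure of DC optimal power flow to produce conditional probabilistic forecasts, is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
history = rng.normal(size=(500, 24))       # stand-in for 500 daily 24-hour price traces

dl = MiniBatchDictionaryLearning(n_components=12, batch_size=16, random_state=0)
for batch in np.array_split(history, 10):  # stream the data in, updating the atoms online
    dl.partial_fit(batch)

code = dl.transform(history[-1:])          # sparse code of the latest observation
reconstruction = code @ dl.components_     # its representation in the learned dictionary
```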
TrueHappiness: Neuromorphic Emotion Recognition on TrueNorth
Title | TrueHappiness: Neuromorphic Emotion Recognition on TrueNorth |
Authors | Peter U. Diehl, Bruno U. Pedroni, Andrew Cassidy, Paul Merolla, Emre Neftci, Guido Zarrella |
Abstract | We present an approach to constructing a neuromorphic device that responds to language input by producing neuron spikes in proportion to the strength of the appropriate positive or negative emotional response. Specifically, we perform a fine-grained sentiment analysis task with implementations on two different systems: one using conventional spiking neural network (SNN) simulators and the other using IBM’s Neurosynaptic System TrueNorth. Input words are projected into a high-dimensional semantic space and processed through a fully-connected neural network (FCNN) containing rectified linear units trained via backpropagation. After training, this FCNN is converted to an SNN by substituting the ReLUs with integrate-and-fire neurons. We show that there is practically no performance loss due to conversion to a spiking network on a sentiment analysis test set; i.e., correlations between predictions and human annotations differ by less than 0.02 between the original DNN and its spiking equivalent. Additionally, we show that the SNN generated with this technique can be mapped to existing neuromorphic hardware – in our case, the TrueNorth chip. Mapping to the chip involves 4-bit synaptic weight discretization and adjustment of the neuron thresholds. The resulting end-to-end system can take a user input, i.e. a word in a vocabulary of over 300,000 words, and estimate its sentiment on TrueNorth with a power consumption of approximately 50 μW. |
Tasks | Emotion Recognition, Sentiment Analysis |
Published | 2016-01-16 |
URL | http://arxiv.org/abs/1601.04183v1 |
http://arxiv.org/pdf/1601.04183v1.pdf | |
PWC | https://paperswithcode.com/paper/truehappiness-neuromorphic-emotion |
Repo | |
Framework | |
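The ReLU-to-spiking conversion rests on a simple equivalence: a non-leaky integrate-and-fire neuron driven by a constant input fires at a rate proportional to the rectified input. A small simulation of that fact (the threshold and step count are illustrative):

```python
def if_rate(drive, threshold=1.0, steps=2000):
    """Firing rate of a non-leaky integrate-and-fire neuron under constant
    drive; approximates max(drive, 0) / threshold, the ReLU it replaces."""
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += drive
        if v >= threshold:
            v -= threshold
            spikes += 1
    return spikes / steps

for d in (-0.3, 0.2, 0.7):
    print(d, max(d, 0.0), if_rate(d))      # ReLU output vs. measured spike rate
```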
Network Unfolding Map by Edge Dynamics Modeling
Title | Network Unfolding Map by Edge Dynamics Modeling |
Authors | Filipe Alves Neto Verri, Paulo Roberto Urio, Liang Zhao |
Abstract | The emergence of collective dynamics in neural networks is a mechanism of the animal and human brain for information processing. In this paper, we develop a computational technique using distributed processing elements in a complex network, which are called particles, to solve semi-supervised learning problems. Three actions govern the particles’ dynamics: generation, walking, and absorption. Labeled vertices generate new particles that compete against rival particles for edge domination. Active particles randomly walk in the network until they are absorbed by either a rival vertex or an edge currently dominated by rival particles. The result of the model evolution consists of sets of edges arranged by label dominance. Each set tends to form a connected subnetwork that represents a data class. Although the intrinsic dynamics of the model are stochastic, we prove there exists a deterministic version with largely reduced computational complexity; specifically, with linear growth. Furthermore, the edge domination process corresponds to an unfolding map in such a way that edges “stretch” and “shrink” according to the vertex-edge dynamics. Consequently, the unfolding effect summarizes the relevant relationships between vertices and the uncovered data classes. The proposed model captures important details of connectivity patterns over the vertex-edge dynamics evolution, in contrast to previous approaches which focused on only vertex or only edge dynamics. Computer simulations reveal that the new model can identify nonlinear features in both real and artificial data, including boundaries between distinct classes and overlapping structures of data. |
Tasks | |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.01182v2 |
http://arxiv.org/pdf/1603.01182v2.pdf | |
PWC | https://paperswithcode.com/paper/network-unfolding-map-by-edge-dynamics |
Repo | |
Framework | |
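A toy rendition of the edge-domination idea, with several of the paper's ingredients (particle competition, absorption, the deterministic variant) deliberately left out: labeled particles random-walk the graph, and each edge is credited to the label whose particles crossed it most often.

```python
import numpy as np

def particle_label_edges(adj, seeds, steps=2000, seed=0):
    """Toy edge-domination dynamics.  adj maps vertex -> neighbor list;
    seeds maps labeled vertex -> label id (0..n_labels-1)."""
    rng = np.random.default_rng(seed)
    n_labels = len(set(seeds.values()))
    counts = {}                                       # edge -> per-label visit counts
    for v0, lab in seeds.items():
        v = v0
        for _ in range(steps):
            u = rng.choice(adj[v])                    # take one random-walk step
            key = (min(v, u), max(v, u))
            counts.setdefault(key, np.zeros(n_labels))[lab] += 1
            v = u
    return {e: int(c.argmax()) for e, c in counts.items()}

adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}          # a small path graph
print(particle_label_edges(adj, {0: 0, 3: 1}))        # ends claim their nearby edges
```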
Unified Scalable Equivalent Formulations for Schatten Quasi-Norms
Title | Unified Scalable Equivalent Formulations for Schatten Quasi-Norms |
Authors | Fanhua Shang, Yuanyuan Liu, James Cheng |
Abstract | The Schatten quasi-norm can be used to bridge the gap between the nuclear norm and the rank function, and is a tighter approximation to the matrix rank. However, most existing Schatten quasi-norm minimization (SQNM) algorithms, as well as nuclear norm minimization algorithms, are too slow or even impractical for large-scale problems, due to the SVD or EVD of the whole matrix in each iteration. In this paper, we rigorously prove that for any $p, p_1, p_2 > 0$ satisfying $1/p = 1/p_1 + 1/p_2$, the Schatten-$p$ quasi-norm of any matrix is equivalent to minimizing the product of the Schatten-$p_1$ norm (or quasi-norm) and the Schatten-$p_2$ norm (or quasi-norm) of its two factor matrices. We then present and prove the equivalence between the product formula of the Schatten quasi-norm and its weighted-sum formula for the two cases $p_1 = p_2$ and $p_1 \neq p_2$. In particular, when $p > 1/2$, there is an equivalence between the Schatten-$p$ quasi-norm of any matrix and the Schatten-$2p$ norms of its two factor matrices, of which the widely used equivalent formulation of the nuclear norm is a special case. That is, various SQNM problems with $p > 1/2$ can be transformed into problems involving only smooth, convex norms of two factor matrices, which can lead to simpler and more efficient algorithms than conventional methods. We further extend the theoretical results from two factor matrices to three and more factor matrices, from which we can see that for any $0 < p < 1$, the Schatten-$p$ quasi-norm of any matrix is the minimization of the mean of the Schatten-$(p_3+1)p$ norms of all factor matrices, where $p_3$ denotes the largest integer not exceeding $1/p$. In other words, for any $0 < p < 1$, the SQNM problem can be transformed into an optimization problem involving only the smooth, convex norms of multiple factor matrices. |
Tasks | |
Published | 2016-06-02 |
URL | http://arxiv.org/abs/1606.00668v2 |
http://arxiv.org/pdf/1606.00668v2.pdf | |
PWC | https://paperswithcode.com/paper/unified-scalable-equivalent-formulations-for |
Repo | |
Framework | |
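The $p = 1$ (nuclear norm) special case of the equivalence is easy to verify numerically: for the balanced factorization taken from the SVD, the product formula and the weighted-sum formula both recover the nuclear norm. A quick check (general $p$ and the multi-factor extension are not exercised):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuclear = s.sum()                          # Schatten-1 norm from singular values

# Balanced factorization X = A @ B with A = U sqrt(S), B = sqrt(S) Vt.
A = U * np.sqrt(s)
B = np.sqrt(s)[:, None] * Vt
prod = np.linalg.norm(A, "fro") * np.linalg.norm(B, "fro")
wsum = (np.linalg.norm(A, "fro") ** 2 + np.linalg.norm(B, "fro") ** 2) / 2
print(nuclear, prod, wsum)                 # all three values agree
```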
Dense Bag-of-Temporal-SIFT-Words for Time Series Classification
Title | Dense Bag-of-Temporal-SIFT-Words for Time Series Classification |
Authors | Adeline Bailly, Simon Malinowski, Romain Tavenard, Thomas Guyet, Laetitia Chapel |
Abstract | Time series classification is an application of particular interest given the increase in data to monitor. Classical techniques for time series classification rely on point-to-point distances. Recently, Bag-of-Words approaches have been used in this context: words are quantized versions of simple features extracted from sliding windows. The SIFT framework has proved efficient for image classification. In this paper, we design a time series classification scheme that builds on the SIFT framework, adapted to time series, to feed a Bag-of-Words. We then refine our method by studying the impact of normalizing the Bag-of-Words, as well as of densely extracting point descriptors. The proposed adjustments achieve better performance. The evaluation shows that our method outperforms classical techniques in terms of classification. |
Tasks | Image Classification, Time Series, Time Series Classification |
Published | 2016-01-08 |
URL | http://arxiv.org/abs/1601.01799v2 |
http://arxiv.org/pdf/1601.01799v2.pdf | |
PWC | https://paperswithcode.com/paper/dense-bag-of-temporal-sift-words-for-time |
Repo | |
Framework | |
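A toy end-to-end rendition of the pipeline, with the full SIFT descriptors simplified to window gradients (an assumption; the paper uses genuine temporal SIFT features): dense sliding windows, k-means quantization into words, then normalized per-series histograms ready for any classifier. It assumes each series is longer than `win` and that enough descriptors exist for `n_words` clusters.

```python
import numpy as np
from sklearn.cluster import KMeans

def bag_of_temporal_words(series_list, win=16, step=4, n_words=32, seed=0):
    """Toy BoTSW: dense windows -> gradient-profile descriptors ->
    k-means codebook -> normalized per-series word histograms."""
    descs, owners = [], []
    for idx, s in enumerate(series_list):
        s = np.asarray(s, dtype=float)
        for start in range(0, len(s) - win + 1, step):
            descs.append(np.gradient(s[start : start + win]))
            owners.append(idx)
    words = KMeans(n_clusters=n_words, n_init=10, random_state=seed).fit_predict(descs)
    hists = np.zeros((len(series_list), n_words))
    for idx, wd in zip(owners, words):
        hists[idx, wd] += 1
    return hists / hists.sum(axis=1, keepdims=True)   # features for any classifier
```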