Paper Group ANR 559
Representation of linguistic form and function in recurrent neural networks
Title | Representation of linguistic form and function in recurrent neural networks |
Authors | Ákos Kádár, Grzegorz Chrupała, Afra Alishahi |
Abstract | We present novel methods for analyzing the activation patterns of RNNs from a linguistic point of view and explore the types of linguistic structure they learn. As a case study, we use a multi-task gated recurrent network architecture consisting of two parallel pathways with shared word embeddings, trained on predicting the representations of the visual scene corresponding to an input sentence and on predicting the next word in the same sentence. Based on our proposed method for estimating the contribution of individual input tokens to the networks' final prediction, we show that the image prediction pathway: a) is sensitive to the information structure of the sentence; b) pays selective attention to lexical categories and grammatical functions that carry semantic information; and c) learns to treat the same input token differently depending on its grammatical function in the sentence. In contrast, the language model is comparatively more sensitive to words with a syntactic function. Furthermore, we propose methods to explore the function of individual hidden units in RNNs and show that the two pathways of the architecture in our case study contain specialized units tuned to patterns informative for the task, some of which can carry activations to later time steps to encode long-term dependencies. |
Tasks | Language Modelling, Word Embeddings |
Published | 2016-02-29 |
URL | http://arxiv.org/abs/1602.08952v2 |
PDF | http://arxiv.org/pdf/1602.08952v2.pdf |
PWC | https://paperswithcode.com/paper/representation-of-linguistic-form-and |
Repo | |
Framework | |
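Kádár et al.'s token-contribution analysis lends itself to a compact illustration. Below is a minimal sketch of the omission-style measure the abstract describes — score each token by how much the sentence representation changes when that token is left out. `token_vec` and `encode` are hypothetical stand-ins, not the paper's trained pathways; any sentence-vector model could be plugged in:

```python
# Contribution of token i = distance between the full-sentence representation
# and the representation with token i omitted.
import numpy as np

def token_vec(tok):
    # Placeholder: a deterministic random vector per token.
    rng = np.random.default_rng(sum(tok.encode()))
    return rng.standard_normal(64)

def encode(tokens):
    # Placeholder encoder: mean-pooled token vectors.
    return np.mean([token_vec(t) for t in tokens], axis=0)

def omission_scores(tokens):
    full = encode(tokens)
    scores = []
    for i in range(len(tokens)):
        reduced = encode(tokens[:i] + tokens[i + 1:])
        # Cosine distance between full and token-omitted representations.
        cos = np.dot(full, reduced) / (np.linalg.norm(full) * np.linalg.norm(reduced))
        scores.append(1.0 - cos)
    return scores

sentence = "a brown dog plays in the park".split()
for tok, s in zip(sentence, omission_scores(sentence)):
    print(f"{tok:>6s}  {s:.3f}")
```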
Computer Aided Restoration of Handwritten Character Strokes
Title | Computer Aided Restoration of Handwritten Character Strokes |
Authors | Barak Sober, David Levin |
Abstract | This work suggests a new variational approach to the task of computer-aided restoration of incomplete characters residing in a highly noisy document. We model character strokes as the movement of a pen with a varying radius. Following this model, a cubic spline representation is used to perform gradient descent steps while maintaining interpolation at some initial (manually sampled) points. The proposed algorithm was used to restore approximately 1000 ancient Hebrew characters (dating to ca. the 8th-7th century BCE), some of which are presented herein and show that the algorithm yields plausible results when applied to deteriorated documents. |
Tasks | |
Published | 2016-02-23 |
URL | http://arxiv.org/abs/1602.07038v2 |
PDF | http://arxiv.org/pdf/1602.07038v2.pdf |
PWC | https://paperswithcode.com/paper/computer-aided-restoration-of-handwritten |
Repo | |
Framework | |
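The stroke model is concrete enough to sketch. Below, as an illustration only, the pen centerline is a parametric cubic spline interpolating a few manually sampled points, with the varying pen radius interpolated alongside; the paper's variational gradient-descent refinement against the noisy document image is not reproduced here, and the sample points are invented:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Manually sampled (x, y) points along a deteriorated stroke, plus pen radii.
pts = np.array([[0.0, 0.0], [1.0, 1.5], [2.0, 1.2], [3.0, 2.5]])
radii = np.array([0.30, 0.25, 0.28, 0.22])

# Chord-length parameterization keeps the parametric spline well behaved.
t = np.concatenate([[0.0], np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))])
cx, cy = CubicSpline(t, pts[:, 0]), CubicSpline(t, pts[:, 1])
cr = CubicSpline(t, radii)

ts = np.linspace(t[0], t[-1], 200)
centerline = np.stack([cx(ts), cy(ts)], axis=1)   # restored stroke path
widths = 2.0 * cr(ts)                             # varying stroke width
print(centerline.shape, widths.min(), widths.max())
```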
Sparse additive Gaussian process with soft interactions
Title | Sparse additive Gaussian process with soft interactions |
Authors | Garret Vo, Debdeep Pati |
Abstract | Additive nonparametric regression models provide an attractive tool for variable selection in high dimensions when the relationship between the response and predictors is complex. They offer greater flexibility than parametric non-linear regression models, and better interpretability and scalability than fully non-parametric regression models. However, achieving sparsity simultaneously in the number of nonparametric components and in the variables within each component poses a stiff computational challenge. In this article, we develop a novel Bayesian additive regression model using a combination of hard and soft shrinkage to separately control the number of additive components and the variables within each component. An efficient algorithm is developed to select the important variables and estimate the interaction network. Excellent performance is obtained in simulated and real data examples. |
Tasks | |
Published | 2016-07-09 |
URL | http://arxiv.org/abs/1607.02670v1 |
PDF | http://arxiv.org/pdf/1607.02670v1.pdf |
PWC | https://paperswithcode.com/paper/sparse-additive-gaussian-process-with-soft |
Repo | |
Framework | |
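The additive structure can be illustrated with a small kernel sketch. The following is a minimal stand-in for the model class — a sum of one-dimensional RBF kernels, one per additive component, fitted in closed form as a GP posterior mean — without the paper's hard/soft shrinkage priors or posterior computation; weights and lengthscales are fixed by assumption:

```python
import numpy as np

def additive_rbf(X1, X2, lengthscale=1.0, weights=None):
    """Sum of per-dimension RBF kernels: one additive component per input."""
    d = X1.shape[1]
    w = np.ones(d) if weights is None else weights
    K = np.zeros((X1.shape[0], X2.shape[0]))
    for j in range(d):
        sq = (X1[:, j:j + 1] - X2[:, j]) ** 2
        K += w[j] * np.exp(-sq / (2.0 * lengthscale ** 2))
    return K

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(80, 5))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.05 * rng.standard_normal(80)

noise = 0.05
K = additive_rbf(X, X)
alpha = np.linalg.solve(K + noise * np.eye(len(X)), y)  # GP posterior mean weights

Xtest = rng.uniform(-2, 2, size=(5, 5))
pred = additive_rbf(Xtest, X) @ alpha
print(pred)
```

Shrinking a component's weight `w[j]` to zero removes that variable from the fit, which is the lever the paper's shrinkage scheme operates on.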
Process Discovery using Inductive Miner and Decomposition
Title | Process Discovery using Inductive Miner and Decomposition |
Authors | Raji Ghawi |
Abstract | This report presents a submission to the Process Discovery Contest. The contest is dedicated to the assessment of tools and techniques that discover business process models from event logs. The objective is to compare the efficiency of techniques to discover process models that provide a proper balance between “overfitting” and “underfitting”. In the context of the Process Discovery Contest, process discovery is turned into a classification task with a training set and a test set, where a process model needs to decide whether traces are fitting or not. In this report, we first show how we use two discovery techniques, namely Inductive Miner and Decomposition, to discover process models from the training set using the ProM tool. Second, we show how we use replay results to 1) check the rediscoverability of the models and 2) classify unseen traces (in the test logs) as fitting or not. Then, we discuss the classification results on the validation logs, the complexity of the discovered models, and their impact on the selection of models for submission. The report ends with pictures of the submitted process models. |
Tasks | |
Published | 2016-10-25 |
URL | http://arxiv.org/abs/1610.07989v1 |
PDF | http://arxiv.org/pdf/1610.07989v1.pdf |
PWC | https://paperswithcode.com/paper/process-discovery-using-inductive-miner-and |
Repo | |
Framework | |
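For readers who want to reproduce the general workflow outside ProM, here is a hedged sketch using pm4py's simplified Python interface (not the ProM toolchain the report actually used); `training_log.xes` and `test_log.xes` are hypothetical file names, and the function and result-key names assume a recent pm4py release:

```python
import pm4py

train_log = pm4py.read_xes("training_log.xes")  # hypothetical paths
test_log = pm4py.read_xes("test_log.xes")

# Inductive Miner: discover a Petri net from the training traces.
net, im, fm = pm4py.discover_petri_net_inductive(train_log)

# Token-based replay on the test traces; a trace is classified as
# "fitting" when it replays perfectly.
replay = pm4py.conformance_diagnostics_token_based_replay(test_log, net, im, fm)
labels = ["fitting" if r["trace_is_fit"] else "non-fitting" for r in replay]
print(labels[:10])
```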
Adapting ELM to Time Series Classification: A Novel Diversified Top-k Shapelets Extraction Method
Title | Adapting ELM to Time Series Classification: A Novel Diversified Top-k Shapelets Extraction Method |
Authors | Qiuyan Yan, Qifa Sun, Xinming Yan |
Abstract | ELM (Extreme Learning Machine) is a single-hidden-layer feed-forward network in which the weights between the input and hidden layers are initialized randomly. ELM is efficient because it computes the weights between the hidden and output layers analytically. However, ELM still fails to output a semantic classification outcome. To address this limitation, in this paper we propose a diversified top-k shapelets transform framework, where shapelets are the subsequences that best represent and interpret each class. The most challenging problems, as we identify them, are how to extract the best k shapelets from the original candidate set and how to determine the k value automatically. Specifically, we first define similar shapelets and diversified top-k shapelets to construct a diversity shapelets graph. Then, a novel diversity-graph-based top-k shapelets extraction algorithm, named DivTopkshapelets, is proposed to search for the top-k diversified shapelets. Finally, we propose a shapelets-transformed ELM algorithm, named DivShapELM, which automatically determines the k value and is further utilized for time series classification. Experimental results on public data sets demonstrate that the proposed approach significantly outperforms the traditional ELM algorithm in terms of effectiveness and efficiency. |
Tasks | Time Series, Time Series Classification |
Published | 2016-06-20 |
URL | http://arxiv.org/abs/1606.05934v1 |
PDF | http://arxiv.org/pdf/1606.05934v1.pdf |
PWC | https://paperswithcode.com/paper/adapting-elm-to-time-series-classification-a |
Repo | |
Framework | |
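Two of the building blocks above are easy to sketch: the subsequence distance used to score shapelet candidates, and a greedy diversified top-k selection that stands in for the paper's diversity-graph algorithm. The scores below are random placeholders for the information gain a real implementation would compute:

```python
import numpy as np

def subsequence_distance(shapelet, series):
    """Min Euclidean distance from a shapelet to any window of the series."""
    m = len(shapelet)
    return min(np.linalg.norm(series[i:i + m] - shapelet)
               for i in range(len(series) - m + 1))

def diversified_top_k(candidates, scores, k, sim_threshold):
    """Greedily pick high-scoring shapelets, skipping near-duplicates."""
    order = np.argsort(scores)[::-1]
    chosen = []
    for idx in order:
        cand = candidates[idx]
        if all(np.linalg.norm(cand - c) > sim_threshold for c in chosen):
            chosen.append(cand)
        if len(chosen) == k:
            break
    return chosen

rng = np.random.default_rng(1)
series = rng.standard_normal(200)
candidates = [series[i:i + 20].copy() for i in range(0, 180, 5)]
print(f"distance of first candidate to its own series: "
      f"{subsequence_distance(candidates[0], series):.3f}")

scores = rng.random(len(candidates))  # stand-in for information gain
top = diversified_top_k(candidates, scores, k=5, sim_threshold=2.0)
print(len(top), "diversified shapelets selected")
```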
Unsupervised Learning For Effective User Engagement on Social Media
Title | Unsupervised Learning For Effective User Engagement on Social Media |
Authors | Thai Pham, Camelia Simoiu |
Abstract | In this paper, we investigate the effectiveness of unsupervised feature learning techniques in predicting user engagement on social media. Specifically, we compare two methods for predicting the number of feedbacks (i.e., comments) that a blog post is likely to receive. We compare Principal Component Analysis (PCA) and a sparse Autoencoder against a baseline method in which the data are only centered and scaled, on each of two models: Linear Regression and Regression Tree. We find that unsupervised learning techniques significantly improve the prediction accuracy of both models. For the Linear Regression model, the sparse Autoencoder achieves the best result, improving the root mean squared error (RMSE) on the test set by 42% over the baseline method. For the Regression Tree model, PCA achieves the best result, improving RMSE by 15% over the baseline. |
Tasks | |
Published | 2016-11-11 |
URL | http://arxiv.org/abs/1611.03894v1 |
PDF | http://arxiv.org/pdf/1611.03894v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-learning-for-effective-user |
Repo | |
Framework | |
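The experimental setup is straightforward to mirror. A minimal sketch with synthetic stand-in data, comparing the centered-and-scaled baseline against a PCA pipeline under linear regression, scored by test RMSE; the sparse-autoencoder variant would slot into the same pipeline:

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=500, n_features=50, noise=10.0, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

baseline = make_pipeline(StandardScaler(), LinearRegression())
pca_model = make_pipeline(StandardScaler(), PCA(n_components=20), LinearRegression())

for name, model in [("baseline", baseline), ("PCA", pca_model)]:
    model.fit(Xtr, ytr)
    rmse = mean_squared_error(yte, model.predict(Xte)) ** 0.5
    print(f"{name:>8s} RMSE: {rmse:.2f}")
```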
Predicting online extremism, content adopters, and interaction reciprocity
Title | Predicting online extremism, content adopters, and interaction reciprocity |
Authors | Emilio Ferrara, Wen-Qiang Wang, Onur Varol, Alessandro Flammini, Aram Galstyan |
Abstract | We present a machine learning framework that leverages a mixture of metadata, network, and temporal features to detect extremist users, and predict content adopters and interaction reciprocity in social media. We exploit a unique dataset containing millions of tweets generated by more than 25 thousand users who have been manually identified, reported, and suspended by Twitter due to their involvement with extremist campaigns. We also leverage millions of tweets generated by a random sample of 25 thousand regular users who were exposed to, or consumed, extremist content. We carry out three forecasting tasks: (i) detecting extremist users, (ii) estimating whether regular users will adopt extremist content, and (iii) predicting whether users will reciprocate contacts initiated by extremists. All forecasting tasks are set up in two scenarios: a post hoc (time-independent) prediction task on aggregated data, and a simulated real-time prediction task. The performance of our framework is extremely promising, yielding up to 93% AUC for extremist user detection, up to 80% AUC for content adoption prediction, and up to 72% AUC for interaction reciprocity forecasting across the different scenarios. We conclude with a thorough feature analysis that helps determine which emerging signals provide predictive power in the different scenarios. |
Tasks | |
Published | 2016-05-02 |
URL | http://arxiv.org/abs/1605.00659v1 |
PDF | http://arxiv.org/pdf/1605.00659v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-online-extremism-content-adopters |
Repo | |
Framework | |
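As a hedged illustration of how one of the three tasks (extremist-user detection) could be set up as plain supervised learning: the features and labels below are synthetic stand-ins for the paper's metadata/network/temporal features, and the random forest is my choice of classifier, not necessarily the paper's; AUC is the metric the abstract reports:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_users = 2000
X = rng.standard_normal((n_users, 20))  # stand-in user-level features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n_users) > 1).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, ytr)
auc = roc_auc_score(yte, clf.predict_proba(Xte)[:, 1])
print(f"extremist-user detection AUC: {auc:.2f}")
```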
A new correlation clustering method for cancer mutation analysis
Title | A new correlation clustering method for cancer mutation analysis |
Authors | Jack P. Hou, Amin Emad, Gregory J. Puleo, Jian Ma, Olgica Milenkovic |
Abstract | Cancer genomes exhibit a large number of different alterations that affect many genes in a diverse manner. It is widely believed that these alterations follow combinatorial patterns that have a strong connection with the underlying molecular interaction networks and functional pathways. A better understanding of the generative mechanisms behind the mutation rules and their influence on gene communities is of great importance for driver mutation discovery and for the identification of network modules related to cancer development and progression. We developed a new method for cancer mutation pattern analysis based on a constrained form of correlation clustering. Correlation clustering is an agnostic learning method that can be used for general community detection problems in which the number of communities or their structure is not known beforehand. The resulting algorithm, named $C^3$, leverages mutual exclusivity of mutations, patient coverage, and driver network concentration principles; it accepts as its input a user-determined combination of heterogeneous patient data, such as that available from TCGA (including mutation, copy number, and gene expression information), and creates a large number of clusters containing mutually exclusive mutated genes in a particular type of cancer. The cluster sizes may be required to obey some useful soft size constraints, without impacting the computational complexity of the algorithm. To test $C^3$, we performed a detailed analysis on TCGA breast cancer and glioblastoma data and showed that our algorithm outperforms the state-of-the-art CoMEt method in terms of discovering mutually exclusive gene modules and identifying driver genes. Our $C^3$ method represents a unique tool for efficient and reliable identification of mutation patterns and driver pathways in large-scale cancer genomics studies. |
Tasks | Community Detection |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06476v1 |
PDF | http://arxiv.org/pdf/1601.06476v1.pdf |
PWC | https://paperswithcode.com/paper/a-new-correlation-clustering-method-for |
Repo | |
Framework | |
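The correlation-clustering core can be illustrated with the classic randomized pivot (KwikCluster) algorithm over a signed similarity matrix, where a positive entry records "same cluster" evidence (e.g., mutual exclusivity of mutations) and a negative entry records the opposite. This is a simplification that omits $C^3$'s soft size constraints and multi-omic weighting:

```python
import numpy as np

def pivot_correlation_clustering(S, rng):
    """KwikCluster: S[i, j] > 0 means i and j prefer the same cluster."""
    remaining = list(range(S.shape[0]))
    clusters = []
    while remaining:
        pivot = remaining[rng.integers(len(remaining))]
        # The pivot grabs every remaining node with positive similarity to it.
        cluster = [j for j in remaining if j == pivot or S[pivot, j] > 0]
        clusters.append(cluster)
        remaining = [j for j in remaining if j not in cluster]
    return clusters

rng = np.random.default_rng(0)
n = 12
S = -np.ones((n, n))
for block in (range(0, 4), range(4, 8), range(8, 12)):  # planted gene modules
    for i in block:
        for j in block:
            S[i, j] = 1.0
print(pivot_correlation_clustering(S, rng))
```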
Early Detection of Combustion Instabilities using Deep Convolutional Selective Autoencoders on Hi-speed Flame Video
Title | Early Detection of Combustion Instabilities using Deep Convolutional Selective Autoencoders on Hi-speed Flame Video |
Authors | Adedotun Akintayo, Kin Gwn Lore, Soumalya Sarkar, Soumik Sarkar |
Abstract | This paper proposes an end-to-end convolutional selective autoencoder approach for early detection of combustion instabilities using rapidly arriving flame image frames. The instabilities arising in combustion processes cause significant deterioration and safety issues in various human-engineered systems such as land- and air-based gas turbine engines. These instabilities are characterized by self-sustaining, large-amplitude pressure oscillations and periodic, coherent vortex structure shedding at varying spatial scales. However, such instability is extremely difficult to detect before a combustion process becomes completely unstable due to its sudden (bifurcation-type) nature. In this context, an autoencoder is trained to selectively mask stable flame frames and pass through unstable flame image frames. In that process, the model learns to identify and extract rich descriptive and explanatory flame shape features. With such a training scheme, the selective autoencoder is shown to be able to detect subtle instability features as a combustion process transitions from a stable to an unstable regime. As a consequence, the deep learning tool-chain can serve as an early detection framework for combustion instabilities that will have a transformative impact on the safety and performance of modern engines. |
Tasks | |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07839v1 |
PDF | http://arxiv.org/pdf/1603.07839v1.pdf |
PWC | https://paperswithcode.com/paper/early-detection-of-combustion-instabilities |
Repo | |
Framework | |
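The "selective" training scheme reduces to a choice of regression target. A minimal PyTorch sketch under stated assumptions — the architecture, sizes, and synthetic frames are illustrative, not the paper's: unstable frames are reconstructed as-is while stable frames are mapped to an all-zero mask, so only instability structure survives reconstruction:

```python
import torch
import torch.nn as nn

class SelectiveAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = SelectiveAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

frames = torch.rand(16, 1, 64, 64)             # stand-in flame frames
unstable = torch.randint(0, 2, (16,)).bool()   # stand-in frame labels
targets = frames * unstable.view(-1, 1, 1, 1)  # mask stable frames to zero

for _ in range(3):                             # a few illustrative steps
    opt.zero_grad()
    loss = loss_fn(model(frames), targets)
    loss.backward()
    opt.step()
print(f"final loss: {loss.item():.4f}")
```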
Local Discriminant Hyperalignment for multi-subject fMRI data alignment
Title | Local Discriminant Hyperalignment for multi-subject fMRI data alignment |
Authors | Muhammad Yousefnezhad, Daoqiang Zhang |
Abstract | Multivariate Pattern (MVP) classification can map different cognitive states to brain tasks. One of the main challenges in MVP analysis is validating the generated results across subjects. Analyzing multi-subject fMRI data requires accurate functional alignment between the neuronal activities of different subjects, which can greatly improve the performance and robustness of the final results. Hyperalignment (HA) is one of the most effective functional alignment methods and can be mathematically formulated via Canonical Correlation Analysis (CCA). Since HA mostly uses unsupervised CCA techniques, its solution may not be optimal for MVP analysis. By incorporating the idea of Local Discriminant Analysis (LDA) into CCA, this paper proposes Local Discriminant Hyperalignment (LDHA), a novel supervised HA method that provides better functional alignment for MVP analysis. The locality is defined based on the stimulus categories in the training set: the correlation between all stimuli in the same category is maximized, while the correlation between distinct categories of stimuli is pushed toward zero. Experimental studies on multi-subject MVP analysis confirm that the LDHA method achieves superior performance compared to other state-of-the-art HA algorithms. |
Tasks | Multi-Subject Fmri Data Alignment |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08366v1 |
PDF | http://arxiv.org/pdf/1611.08366v1.pdf |
PWC | https://paperswithcode.com/paper/local-discriminant-hyperalignment-for-multi |
Repo | |
Framework | |
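The unsupervised CCA step underlying hyperalignment is easy to demonstrate; the supervised, category-aware locality that defines LDHA is not reproduced in this sketch, and the two "subjects" are synthetic views of a shared latent signal:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
shared = rng.standard_normal((100, 10))  # latent stimulus signal
subj1 = shared @ rng.standard_normal((10, 40)) + 0.1 * rng.standard_normal((100, 40))
subj2 = shared @ rng.standard_normal((10, 40)) + 0.1 * rng.standard_normal((100, 40))

# Project both subjects into a shared space where corresponding
# time points are maximally correlated.
cca = CCA(n_components=10)
z1, z2 = cca.fit_transform(subj1, subj2)

corrs = [np.corrcoef(z1[:, k], z2[:, k])[0, 1] for k in range(10)]
print(np.round(corrs, 2))
```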
Segmentation Rectification for Video Cutout via One-Class Structured Learning
Title | Segmentation Rectification for Video Cutout via One-Class Structured Learning |
Authors | Junyan Wang, Sai-kit Yeung, Jue Wang, Kun Zhou |
Abstract | Recent works on interactive video object cutout mainly focus on designing dynamic foreground-background (FB) classifiers for segmentation propagation. However, research on optimally removing errors from the FB classification is sparse, and the errors often accumulate rapidly, causing significant errors in the propagated frames. In this work, we take the initial steps toward addressing this problem, and we call this new task *segmentation rectification*. Our key observation is that the possibly asymmetrically distributed false positive and false negative errors are handled equally in conventional methods. We instead propose to optimally remove these two types of errors separately. To this end, we propose a novel bilayer Markov Random Field (MRF) model for this new task. We also adopt the well-established structured learning framework to learn the optimal model from data. Additionally, we propose a novel one-class structured SVM (OSSVM) that greatly speeds up the structured learning process. Our method naturally extends to RGB-D videos as well. Comprehensive experiments on both RGB and RGB-D data demonstrate that our simple and effective method significantly outperforms the segmentation propagation methods adopted in state-of-the-art video cutout systems, and the results also suggest the potential usefulness of our method in image cutout systems. |
Tasks | |
Published | 2016-02-16 |
URL | http://arxiv.org/abs/1602.04906v1 |
PDF | http://arxiv.org/pdf/1602.04906v1.pdf |
PWC | https://paperswithcode.com/paper/segmentation-rectification-for-video-cutout |
Repo | |
Framework | |
Fast Graph-Based Object Segmentation for RGB-D Images
Title | Fast Graph-Based Object Segmentation for RGB-D Images |
Authors | Giorgio Toscana, Stefano Rosa |
Abstract | Object segmentation is an important capability for robotic systems, in particular for grasping. We present a graph-based approach for the segmentation of simple objects from RGB-D images. We are interested in segmenting objects with a large variety in appearance, from textureless to strongly textured, for the task of robotic grasping. The algorithm does not rely on image features or machine learning. We propose a modified Canny edge detector that extracts robust edges by using depth information, and two simple cost functions for combining color and depth cues. The cost functions are used to build an undirected graph, which is partitioned using the concept of internal and external differences between graph regions. The partitioning is fast, with O(N log N) complexity. We also discuss ways to deal with missing depth information. We test the approach on different publicly available RGB-D object datasets, such as the Rutgers APC RGB-D dataset and the RGB-D Object Dataset, and compare the results with other existing methods. |
Tasks | Robotic Grasping, Semantic Segmentation |
Published | 2016-05-12 |
URL | http://arxiv.org/abs/1605.03746v1 |
PDF | http://arxiv.org/pdf/1605.03746v1.pdf |
PWC | https://paperswithcode.com/paper/fast-graph-based-object-segmentation-for-rgb |
Repo | |
Framework | |
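The partitioning step can be sketched compactly: 4-neighbor edges weighted by a simple combination of color and depth differences, merged Felzenszwalb-style by comparing each edge against the components' internal differences plus a size-dependent tolerance k/|C|. The paper's modified Canny stage and exact cost functions are simplified away, and the weights alpha/beta are assumptions:

```python
import numpy as np

def segment_rgbd(color, depth, k=2.0, alpha=1.0, beta=1.0):
    h, w = depth.shape
    idx = np.arange(h * w).reshape(h, w)
    edges = []
    for s0, s1 in [((slice(None), slice(None, -1)), (slice(None), slice(1, None))),   # horizontal
                   ((slice(None, -1), slice(None)), (slice(1, None), slice(None)))]:  # vertical
        dc = np.linalg.norm(color[s0] - color[s1], axis=-1)   # color cue
        dd = np.abs(depth[s0] - depth[s1])                    # depth cue
        wgt = alpha * dc + beta * dd
        edges += list(zip(wgt.ravel(), idx[s0].ravel(), idx[s1].ravel()))
    edges.sort(key=lambda e: e[0])                            # O(N log N) step

    parent = list(range(h * w))
    size = [1] * (h * w)
    internal = [0.0] * (h * w)  # max internal edge weight per component

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for wgt, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Merge when the external difference (this edge) is no larger than
        # either internal difference plus the size-dependent tolerance.
        if wgt <= min(internal[ra] + k / size[ra], internal[rb] + k / size[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], wgt)
    return np.array([find(i) for i in range(h * w)]).reshape(h, w)

rng = np.random.default_rng(0)
color = np.where(np.arange(40)[None, :, None] < 20, 0.2, 0.8) + 0.02 * rng.random((40, 40, 3))
depth = np.where(np.arange(40)[None, :] < 20, 1.0, 2.0) + 0.01 * rng.random((40, 40))
labels = segment_rgbd(color, depth, k=2.0)
print(len(np.unique(labels)), "segments")
```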
Restoring STM images via Sparse Coding: noise and artifact removal
Title | Restoring STM images via Sparse Coding: noise and artifact removal |
Authors | João P. Oliveira, Ana Bragança, José Bioucas-Dias, Mário Figueiredo, Luís Alcácer, Jorge Morgado, Quirina Ferreira |
Abstract | In this article, we present a denoising algorithm to improve the interpretation and quality of scanning tunneling microscopy (STM) images. Given the high level of self-similarity of STM images, we propose a denoising algorithm that reformulates the estimation problem as a sparse regression, often termed sparse coding. We introduce modifications to the algorithm to cope with the existence of artifacts, mainly dropouts, which appear in a structured way as consecutive line segments along the scanning direction. The resulting algorithm treats the artifacts as missing data, and the estimated values outperform those of algorithms that substitute the outliers via local filtering. We provide code implementations for both Matlab and Gwyddion. |
Tasks | Denoising |
Published | 2016-10-11 |
URL | http://arxiv.org/abs/1610.03437v1 |
PDF | http://arxiv.org/pdf/1610.03437v1.pdf |
PWC | https://paperswithcode.com/paper/restoring-stm-images-via-sparse-coding-noise |
Repo | |
Framework | |
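A hedged sketch of patch-based sparse-coding denoising with scikit-learn's dictionary learning, standing in for the authors' Matlab/Gwyddion implementations; the paper's structured treatment of dropout lines as missing data is not reproduced, the image is synthetic, and the parameter names assume scikit-learn ≥ 1.1:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d

rng = np.random.default_rng(0)
clean = np.outer(np.sin(np.linspace(0, 6, 64)), np.cos(np.linspace(0, 6, 64)))
noisy = clean + 0.2 * rng.standard_normal(clean.shape)  # stand-in STM image

# Sparse-code overlapping 8x8 patches against a learned dictionary.
patches = extract_patches_2d(noisy, (8, 8))
flat = patches.reshape(len(patches), -1)
mean = flat.mean(axis=1, keepdims=True)
flat = flat - mean

dico = MiniBatchDictionaryLearning(n_components=50, alpha=1.0, max_iter=50,
                                   random_state=0)
codes = dico.fit_transform(flat)                         # sparse codes
recon = (codes @ dico.components_ + mean).reshape(patches.shape)
denoised = reconstruct_from_patches_2d(recon, noisy.shape)
print(f"noisy MSE: {np.mean((noisy - clean) ** 2):.4f}, "
      f"denoised MSE: {np.mean((denoised - clean) ** 2):.4f}")
```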
Network-Efficient Distributed Word2vec Training System for Large Vocabularies
Title | Network-Efficient Distributed Word2vec Training System for Large Vocabularies |
Authors | Erik Ordentlich, Lee Yang, Andy Feng, Peter Cnudde, Mihajlo Grbovic, Nemanja Djuric, Vladan Radosavljevic, Gavin Owens |
Abstract | Word2vec is a popular family of algorithms for unsupervised training of dense vector representations of words on large text corpora. The resulting vectors have been shown to capture semantic relationships among their corresponding words, and have shown promise in reducing a number of natural language processing (NLP) tasks to mathematical operations on these vectors. While applications of word2vec have heretofore centered on vocabularies with a few million words, where the vocabulary is the set of words for which vectors are simultaneously trained, novel applications are emerging in areas outside of NLP with vocabularies comprising several hundred million words. Existing word2vec training systems are impractical for training such large vocabularies, as they either require that the vectors of all vocabulary words be stored in the memory of a single server or suffer unacceptable training latency due to massive network data transfer. In this paper, we present a novel distributed, parallel training system that enables unprecedented practical training of vectors for vocabularies with several hundred million words on a shared cluster of commodity servers, using far less network traffic than existing solutions. We evaluate the proposed system on a benchmark dataset, showing that the quality of vectors does not degrade relative to non-distributed training. Finally, the system has been deployed for several quarters to match queries to ads in Gemini, the sponsored search advertising platform at Yahoo, resulting in significant improvement of business metrics. |
Tasks | |
Published | 2016-06-27 |
URL | http://arxiv.org/abs/1606.08495v1 |
PDF | http://arxiv.org/pdf/1606.08495v1.pdf |
PWC | https://paperswithcode.com/paper/network-efficient-distributed-word2vec |
Repo | |
Framework | |
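The network saving in such systems comes from partitioning vectors by dimension rather than by word. A toy sketch of that idea under my own simplifying assumptions (shard count and sizes are illustrative): each "shard" holds a disjoint slice of every vector, computes its partial dot product locally, and only scalars are aggregated, so full vectors never cross the network:

```python
import numpy as np

dim, n_shards, vocab = 300, 4, 1000
rng = np.random.default_rng(0)
vectors = rng.standard_normal((vocab, dim))

# Column-wise partition: shard s owns one slice of dimensions of every vector.
slices = np.array_split(np.arange(dim), n_shards)
shards = [vectors[:, s] for s in slices]

def distributed_dot(u, v):
    """Dot product of words u and v as a sum of per-shard partial sums."""
    return sum(float(shard[u] @ shard[v]) for shard in shards)

u, v = 3, 7
assert np.isclose(distributed_dot(u, v), vectors[u] @ vectors[v])
print(distributed_dot(u, v))
```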
Flexible Models for Microclustering with Application to Entity Resolution
Title | Flexible Models for Microclustering with Application to Entity Resolution |
Authors | Giacomo Zanella, Brenda Betancourt, Hanna Wallach, Jeffrey Miller, Abbas Zaidi, Rebecca C. Steorts |
Abstract | Most generative models for clustering implicitly assume that the number of data points in each cluster grows linearly with the total number of data points. Finite mixture models, Dirichlet process mixture models, and Pitman–Yor process mixture models make this assumption, as do all other infinitely exchangeable clustering models. However, for some applications, this assumption is inappropriate. For example, when performing entity resolution, the size of each cluster should be unrelated to the size of the data set, and each cluster should contain a negligible fraction of the total number of data points. These applications require models that yield clusters whose sizes grow sublinearly with the size of the data set. We address this requirement by defining the microclustering property and introducing a new class of models that can exhibit this property. We compare models within this class to two commonly used clustering models using four entity-resolution data sets. |
Tasks | Entity Resolution |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1610.09780v1 |
PDF | http://arxiv.org/pdf/1610.09780v1.pdf |
PWC | https://paperswithcode.com/paper/flexible-models-for-microclustering-with |
Repo | |
Framework | |
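The microclustering property is easiest to appreciate against its opposite. A small simulation of the Chinese restaurant process (the Dirichlet process's exchangeable clustering scheme) shows the largest cluster claiming a roughly constant fraction of the data — i.e., linear growth — which is exactly the behavior the paper's models are designed to avoid for entity resolution:

```python
import numpy as np

def crp_cluster_sizes(n, alpha, rng):
    """Sequentially seat n points: join a cluster w.p. prop. to its size,
    or open a new cluster w.p. prop. to alpha."""
    sizes = []
    for _ in range(n):
        probs = np.array(sizes + [alpha], dtype=float)
        choice = rng.choice(len(probs), p=probs / probs.sum())
        if choice == len(sizes):
            sizes.append(1)
        else:
            sizes[choice] += 1
    return sizes

rng = np.random.default_rng(0)
for n in (1_000, 10_000, 100_000):
    sizes = crp_cluster_sizes(n, alpha=1.0, rng=rng)
    print(f"n={n:>7d}  largest cluster / n = {max(sizes) / n:.3f}")
```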