October 20, 2019

3210 words 16 mins read

Paper Group AWR 186

Alarm-Based Prescriptive Process Monitoring. Segmentation of Scanning Tunneling Microscopy Images Using Variational Methods and Empirical Wavelets. Efficient Video Object Segmentation via Network Modulation. Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference. Conditional WaveGAN. Role action embeddings: sc …

Alarm-Based Prescriptive Process Monitoring

Title Alarm-Based Prescriptive Process Monitoring
Authors Irene Teinemaa, Niek Tax, Massimiliano de Leoni, Marlon Dumas, Fabrizio Maria Maggi
Abstract Predictive process monitoring is concerned with the analysis of events produced during the execution of a process in order to predict the future state of ongoing cases thereof. Existing techniques in this field are able to predict, at each step of a case, the likelihood that the case will end up in an undesired outcome. These techniques, however, do not take into account what process workers may do with the generated predictions in order to decrease the likelihood of undesired outcomes. This paper proposes a framework for prescriptive process monitoring, which extends predictive process monitoring approaches with the concepts of alarms, interventions, compensations, and mitigation effects. The framework incorporates a parameterized cost model to assess the cost-benefit tradeoffs of applying prescriptive process monitoring in a given setting. The paper also outlines an approach to optimize the generation of alarms given a dataset and a set of cost model parameters. The proposed approach is empirically evaluated using a range of real-life event logs.
Tasks
Published 2018-03-23
URL http://arxiv.org/abs/1803.08706v2
PDF http://arxiv.org/pdf/1803.08706v2.pdf
PWC https://paperswithcode.com/paper/alarm-based-prescriptive-process-monitoring
Repo https://github.com/samadeusfp/prescriptiveProcessMonitoring
Framework none
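
The core trade-off in the abstract above can be illustrated with a toy cost model: raise an alarm only when the expected cost of intervening is lower than the expected cost of doing nothing. The sketch below is not the authors' framework; the parameter names (`c_out`, `c_in`, `eff`) and their default values are illustrative placeholders for the cost of an undesired outcome, the cost of an intervention, and its mitigation effectiveness.

```python
def expected_cost(p_undesired, alarm, c_out=100.0, c_in=10.0, eff=0.7):
    """Expected cost of a case given the predicted probability of an undesired
    outcome, whether an alarm (intervention) is raised, the cost of the undesired
    outcome (c_out), the intervention cost (c_in), and its mitigation effect (eff)."""
    if alarm:
        # Intervening always costs c_in and reduces the chance of the
        # undesired outcome by the factor eff.
        return c_in + (1.0 - eff) * p_undesired * c_out
    return p_undesired * c_out

def should_alarm(p_undesired, **cost_params):
    """Raise an alarm iff intervening has lower expected cost than doing nothing."""
    return (expected_cost(p_undesired, True, **cost_params)
            < expected_cost(p_undesired, False, **cost_params))

# With the defaults, alarms fire once the predicted probability exceeds
# roughly c_in / (eff * c_out) ~= 0.14.
for p in (0.05, 0.15, 0.5):
    print(p, should_alarm(p))
```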

Segmentation of Scanning Tunneling Microscopy Images Using Variational Methods and Empirical Wavelets

Title Segmentation of Scanning Tunneling Microscopy Images Using Variational Methods and Empirical Wavelets
Authors Kevin Bui, Jacob Fauman, David Kes, Leticia Torres Mandiola, Adina Ciomaga, Ricardo Salazar, Andrea L. Bertozzi, Jerome Gilles, Andrew I. Guttentag, Paul S. Weiss
Abstract In the fields of nanoscience and nanotechnology, it is important to be able to functionalize surfaces chemically for a wide variety of applications. Scanning tunneling microscopes (STMs) are important instruments in this area used to measure the surface structure and chemistry with better than molecular resolution. Self-assembly is frequently used to create monolayers that redefine the surface chemistry in just a single-molecule-thick layer. Indeed, STM images reveal rich information about the structure of self-assembled monolayers since they convey chemical and physical properties of the studied material. In order to assist in and to enhance the analysis of STM and other images, we propose and demonstrate an image-processing framework that produces two image segmentations: one is based on intensities (apparent heights in STM images) and the other is based on textural patterns. The proposed framework begins with a cartoon+texture decomposition, which separates an image into its cartoon and texture components. Afterward, the cartoon image is segmented by a modified multiphase version of the local Chan-Vese model, while the texture image is segmented by a combination of 2D empirical wavelet transform and a clustering algorithm. Overall, our proposed framework contains several new features, specifically in presenting a new application of cartoon+texture decomposition and of the empirical wavelet transforms and in developing a specialized framework to segment STM images and other data. To demonstrate the potential of our approach, we apply it to actual STM images of cyanide monolayers on Au{111} and present their corresponding segmentation results.
Tasks
Published 2018-04-24
URL http://arxiv.org/abs/1804.08890v1
PDF http://arxiv.org/pdf/1804.08890v1.pdf
PWC https://paperswithcode.com/paper/segmentation-of-scanning-tunneling-microscopy
Repo https://github.com/kbui1993/Microscopy-Codes
Framework none
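
As a rough illustration of the two-branch pipeline described above, the sketch below substitutes off-the-shelf components for the paper's building blocks: total-variation denoising stands in for the cartoon+texture decomposition, scikit-image's two-phase Chan-Vese for the modified multiphase local Chan-Vese model, and a Gabor filter bank with k-means for the 2D empirical wavelet transform plus clustering. It shows the structure of the framework only, not the authors' method, and uses a standard test image in place of an STM image.

```python
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_chambolle
from skimage.segmentation import chan_vese
from skimage.filters import gabor
from sklearn.cluster import KMeans

image = img_as_float(data.camera())              # placeholder for an STM image

# 1) cartoon + texture split (TV denoising as a stand-in for the decomposition)
cartoon = denoise_tv_chambolle(image, weight=0.1)
texture = image - cartoon

# 2) intensity-based segmentation of the cartoon component
intensity_seg = chan_vese(cartoon, mu=0.25)

# 3) texture-based segmentation: filter-bank responses clustered per pixel
responses = [np.abs(gabor(texture, frequency=f, theta=t)[0])
             for f in (0.1, 0.2, 0.4) for t in (0, np.pi / 4, np.pi / 2)]
features = np.stack([r.ravel() for r in responses], axis=1)
texture_seg = KMeans(n_clusters=3, n_init=10).fit_predict(features).reshape(image.shape)

print(intensity_seg.shape, texture_seg.shape)
```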

Efficient Video Object Segmentation via Network Modulation

Title Efficient Video Object Segmentation via Network Modulation
Authors Linjie Yang, Yanran Wang, Xuehan Xiong, Jianchao Yang, Aggelos K. Katsaggelos
Abstract Video object segmentation aims to segment a specific object throughout a video sequence, given only an annotated first frame. Recent deep learning based approaches find it effective to fine-tune a general-purpose segmentation model on the annotated frame using hundreds of iterations of gradient descent. Despite the high accuracy these methods achieve, the fine-tuning process is inefficient and fails to meet the requirements of real-world applications. We propose a novel approach that uses a single forward pass to adapt the segmentation model to the appearance of a specific object. Specifically, a second meta neural network named modulator is learned to manipulate the intermediate layers of the segmentation network given limited visual and spatial information of the target object. The experiments show that our approach is 70 times faster than fine-tuning approaches while achieving similar accuracy.
Tasks Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published 2018-02-04
URL http://arxiv.org/abs/1802.01218v1
PDF http://arxiv.org/pdf/1802.01218v1.pdf
PWC https://paperswithcode.com/paper/efficient-video-object-segmentation-via
Repo https://github.com/linjieyangsc/video_seg
Framework tf
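
The single-forward-pass adaptation idea can be sketched in PyTorch as a small modulator network that maps the annotated first frame to channel-wise parameters applied to an intermediate feature map of the segmentation network. The layer sizes, the use of a shift term alongside the scale, and the toy inputs below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Modulator(nn.Module):
    """Maps the annotated first frame (RGB + mask) to per-channel scale and shift."""
    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 2 * channels),
        )

    def forward(self, first_frame_rgb_mask):
        scale, shift = self.net(first_frame_rgb_mask).chunk(2, dim=1)
        return scale.unsqueeze(-1).unsqueeze(-1), shift.unsqueeze(-1).unsqueeze(-1)

# Inside the segmentation network, an intermediate feature map is modulated as
#   features = scale * features + shift   (one forward pass, no fine-tuning).
frame_and_mask = torch.randn(1, 4, 128, 128)   # RGB + binary mask of the target object
features = torch.randn(1, 64, 32, 32)          # some intermediate feature map
scale, shift = Modulator(64)(frame_and_mask)
modulated = scale * features + shift
print(modulated.shape)
```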

Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference

Title Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference
Authors Mike Wu, Noah Goodman, Stefano Ermon
Abstract Stochastic optimization techniques are standard in variational inference algorithms. These methods estimate gradients by approximating expectations with independent Monte Carlo samples. In this paper, we explore a technique that uses correlated, but more representative, samples to reduce estimator variance. Specifically, we show how to generate antithetic samples that match sample moments with the true moments of an underlying importance distribution. Combining a differentiable antithetic sampler with modern stochastic variational inference, we showcase the effectiveness of this approach for learning a deep generative model.
Tasks Stochastic Optimization
Published 2018-10-05
URL https://arxiv.org/abs/1810.02555v2
PDF https://arxiv.org/pdf/1810.02555v2.pdf
PWC https://paperswithcode.com/paper/differentiable-antithetic-sampling-for
Repo https://github.com/mhw32/antithetic-vae-public
Framework pytorch
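
A simplified illustration of the idea: in a reparameterized Gaussian posterior, each noise draw can be paired with its negation so that the sample mean of the noise is exactly zero while gradients still flow to the variational parameters. The paper's sampler goes further and matches higher moments of the underlying distribution; the sketch below only shows the classic antithetic-pair version.

```python
import torch

def antithetic_rsample(mu, log_sigma, num_pairs=8):
    eps = torch.randn(num_pairs, *mu.shape)
    eps = torch.cat([eps, -eps], dim=0)       # antithetic pairs: noise mean is exactly zero
    return mu + log_sigma.exp() * eps         # reparameterization trick keeps gradients

mu = torch.zeros(3, requires_grad=True)
log_sigma = torch.zeros(3, requires_grad=True)
z = antithetic_rsample(mu, log_sigma)
print(z.mean(dim=0))                          # ~0 up to floating-point error
z.pow(2).sum().backward()                     # gradients reach mu and log_sigma
```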

Conditional WaveGAN

Title Conditional WaveGAN
Authors Chae Young Lee, Anoop Toffy, Gue Jun Jung, Woo-Jin Han
Abstract Generative models have been used successfully for image synthesis in recent years, but little progress has been made for other modalities such as audio and text. Recent works focus on generating audio from a generative model in an unsupervised setting. We explore the possibility of using generative models conditioned on class labels. Concatenation-based conditioning and conditional scaling were explored in this work with various hyper-parameter tuning methods. In this paper, we introduce Conditional WaveGANs (cWaveGAN). Find our implementation at https://github.com/acheketa/cwavegan
Tasks Audio Generation
Published 2018-09-27
URL http://arxiv.org/abs/1809.10636v1
PDF http://arxiv.org/pdf/1809.10636v1.pdf
PWC https://paperswithcode.com/paper/conditional-wavegan
Repo https://github.com/acheketa/cwavegan
Framework tf
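
Concatenation-based conditioning, one of the two schemes mentioned in the abstract, amounts to embedding the class label and concatenating it with the latent noise vector before the generator. The sketch below uses a placeholder fully-connected generator body rather than the WaveGAN upsampling stack; all sizes are illustrative.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, num_classes=10, latent_dim=100, label_dim=16, out_len=16384):
        super().__init__()
        self.embed = nn.Embedding(num_classes, label_dim)
        self.body = nn.Sequential(              # placeholder for the WaveGAN upsampling stack
            nn.Linear(latent_dim + label_dim, 256), nn.ReLU(),
            nn.Linear(256, out_len), nn.Tanh(),
        )

    def forward(self, z, labels):
        cond = self.embed(labels)               # (batch, label_dim)
        return self.body(torch.cat([z, cond], dim=1))

gen = ConditionalGenerator()
waveform = gen(torch.randn(4, 100), torch.tensor([0, 3, 3, 7]))
print(waveform.shape)                           # (4, 16384) audio samples in [-1, 1]
```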

Role action embeddings: scalable representation of network positions

Title Role action embeddings: scalable representation of network positions
Authors George Berry
Abstract We consider the question of embedding nodes with similar local neighborhoods together in embedding space, commonly referred to as “role embeddings.” We propose RAE, an unsupervised framework that learns role embeddings. It combines a within-node loss function and a graph neural network (GNN) architecture to place nodes with similar local neighborhoods close in embedding space. We also propose a faster way of generating negative examples called neighbor shuffling, which quickly creates negative examples directly within batches. These techniques can be easily combined with existing GNN methods to create unsupervised role embeddings at scale. We then explore role action embeddings, which summarize the non-structural features in a node’s neighborhood, leading to better performance on node classification tasks. We find that the model architecture proposed here provides strong performance on both graph and node classification tasks, in some cases competitive with semi-supervised methods.
Tasks Node Classification
Published 2018-11-19
URL http://arxiv.org/abs/1811.08019v2
PDF http://arxiv.org/pdf/1811.08019v2.pdf
PWC https://paperswithcode.com/paper/role-action-embeddings-scalable
Repo https://github.com/georgeberry/role-action-embeddings
Framework pytorch
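
Neighbor shuffling, as described above, creates negative examples directly within a batch by permuting which aggregated-neighborhood vector is paired with which node, so no extra sampling pass over the graph is needed. The margin loss below is an illustrative stand-in for the RAE objective, assuming node and neighborhood embeddings of the same dimension.

```python
import torch
import torch.nn.functional as F

def within_batch_loss(node_vecs, neigh_vecs):
    """node_vecs, neigh_vecs: (batch, dim) embeddings of each node and of its
    aggregated local neighborhood (e.g. from a GNN encoder)."""
    pos = F.cosine_similarity(node_vecs, neigh_vecs)          # true node/neighborhood pairs
    perm = torch.randperm(neigh_vecs.size(0))
    neg = F.cosine_similarity(node_vecs, neigh_vecs[perm])    # neighbor-shuffled negatives
    # margin loss: pull true pairs together, push shuffled pairs apart
    return F.relu(1.0 - pos + neg).mean()

node_vecs = torch.randn(32, 64, requires_grad=True)
neigh_vecs = torch.randn(32, 64, requires_grad=True)
loss = within_batch_loss(node_vecs, neigh_vecs)
loss.backward()
print(float(loss))
```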

Paraphrase Thought: Sentence Embedding Module Imitating Human Language Recognition

Title Paraphrase Thought: Sentence Embedding Module Imitating Human Language Recognition
Authors Myeongjun Jang, Pilsung Kang
Abstract Sentence embedding is an important research topic in natural language processing. It is essential to generate a good embedding vector that fully reflects the semantic meaning of a sentence in order to achieve an enhanced performance for various natural language processing tasks, such as machine translation and document classification. Thus far, various sentence embedding models have been proposed, and their feasibility has been demonstrated through good performances on tasks following embedding, such as sentiment analysis and sentence classification. However, because the performances of sentence classification and sentiment analysis can be enhanced by using a simple sentence representation method, it is not sufficient to claim that these models fully reflect the meanings of sentences based on good performances for such tasks. In this paper, inspired by human language recognition, we propose the following concept of semantic coherence, which should be satisfied for a good sentence embedding method: similar sentences should be located close to each other in the embedding space. Then, we propose the Paraphrase-Thought (P-thought) model to pursue semantic coherence as much as possible. Experimental results on two paraphrase identification datasets (MS COCO and STS benchmark) show that the P-thought models outperform the benchmarked sentence embedding methods.
Tasks Document Classification, Machine Translation, Paraphrase Identification, Sentence Classification, Sentence Embedding, Sentiment Analysis
Published 2018-08-16
URL http://arxiv.org/abs/1808.05505v3
PDF http://arxiv.org/pdf/1808.05505v3.pdf
PWC https://paperswithcode.com/paper/paraphrase-thought-sentence-embedding-module
Repo https://github.com/MJ-Jang/Paraphrase-Thought
Framework tf
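
The semantic coherence criterion can be made concrete as a simple check: paraphrase pairs should sit closer in embedding space than unrelated pairs. In the sketch below, `embed` is a deliberately crude bag-of-words placeholder standing in for any sentence encoder; only the evaluation structure is the point.

```python
import numpy as np

def embed(sentence, dim=256):
    """Toy hashed bag-of-words embedding; replace with any sentence encoder."""
    vec = np.zeros(dim)
    for word in sentence.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def coherence_gap(paraphrase_pairs, random_pairs):
    """Mean cosine similarity of paraphrase pairs minus that of random pairs;
    larger is better under the semantic-coherence criterion."""
    sim = lambda a, b: float(embed(a) @ embed(b))
    return (np.mean([sim(a, b) for a, b in paraphrase_pairs])
            - np.mean([sim(a, b) for a, b in random_pairs]))

print(coherence_gap(
    [("a man is playing a guitar", "a man plays the guitar")],
    [("a man is playing a guitar", "the stock market fell today")]))
```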

Learning a Code: Machine Learning for Approximate Non-Linear Coded Computation

Title Learning a Code: Machine Learning for Approximate Non-Linear Coded Computation
Authors Jack Kosaian, K. V. Rashmi, Shivaram Venkataraman
Abstract Machine learning algorithms are typically run on large scale, distributed compute infrastructure that routinely faces a number of unavailabilities such as failures and temporary slowdowns. Adding redundant computations using coding-theoretic tools called “codes” is an emerging technique to alleviate the adverse effects of such unavailabilities. A code consists of an encoding function that proactively introduces redundant computation and a decoding function that reconstructs unavailable outputs using the available ones. Past work focuses on using codes to provide resilience for linear computations and specific iterative optimization algorithms. However, computations performed for a variety of applications, including inference on state-of-the-art machine learning algorithms such as neural networks, typically fall outside this realm. In this paper, we propose taking a learning-based approach to designing codes that can handle non-linear computations. We present carefully designed neural network architectures and a training methodology for learning encoding and decoding functions that produce approximate reconstructions of unavailable computation results. We present extensive experimental results demonstrating the effectiveness of the proposed approach: we show that our learned codes can accurately reconstruct 64-98% of the unavailable predictions from neural-network based image classifiers on the MNIST, Fashion-MNIST, and CIFAR-10 datasets. To the best of our knowledge, this work proposes the first learning-based approach for designing codes, and also presents the first coding-theoretic solution that can provide resilience for any non-linear (differentiable) computation. Our results show that learning can be an effective technique for designing codes, and that learned codes are a highly promising approach for bringing the benefits of coding to non-linear computations.
Tasks
Published 2018-06-04
URL http://arxiv.org/abs/1806.01259v1
PDF http://arxiv.org/pdf/1806.01259v1.pdf
PWC https://paperswithcode.com/paper/learning-a-code-machine-learning-for
Repo https://github.com/Thesys-lab/learned-cc
Framework pytorch
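
A toy version of the learning-based coding idea for k = 2 inputs and one parity unit: an encoder network produces a redundant parity input, the deployed base function runs on all three inputs, and a decoder reconstructs a missing output from the two that remain available. The linear base function and tiny architectures below are illustrative; the paper targets non-linear computations such as neural-network inference.

```python
import torch
import torch.nn as nn

d = 8
F_base = nn.Linear(d, d)    # stand-in for the deployed computation (non-linear in the paper)
encoder = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))
decoder = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(200):
    x1, x2 = torch.randn(64, d), torch.randn(64, d)
    parity = encoder(torch.cat([x1, x2], dim=1))          # redundant "coded" input
    y1, y2, y_par = F_base(x1), F_base(x2), F_base(parity)
    # simulate x1's worker being unavailable: reconstruct its output from the rest
    y1_hat = decoder(torch.cat([y2, y_par], dim=1))
    loss = (y1_hat - y1.detach()).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))    # reconstruction error of the unavailable output after training
```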

An Analysis of Neural Language Modeling at Multiple Scales

Title An Analysis of Neural Language Modeling at Multiple Scales
Authors Stephen Merity, Nitish Shirish Keskar, Richard Socher
Abstract Many of the leading approaches in language modeling introduce novel, complex and specialized architectures. We take existing state-of-the-art word level language models based on LSTMs and QRNNs and extend them to both larger vocabularies as well as character-level granularity. When properly tuned, LSTMs and QRNNs achieve state-of-the-art results on character-level (Penn Treebank, enwik8) and word-level (WikiText-103) datasets, respectively. Results are obtained in only 12 hours (WikiText-103) to 2 days (enwik8) using a single modern GPU.
Tasks Language Modelling
Published 2018-03-22
URL http://arxiv.org/abs/1803.08240v1
PDF http://arxiv.org/pdf/1803.08240v1.pdf
PWC https://paperswithcode.com/paper/an-analysis-of-neural-language-modeling-at
Repo https://github.com/arvieFrydenlund/awd-lstm-lm
Framework pytorch
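
For reference, the quantities reported on these benchmarks relate to the training loss as follows: word-level models report perplexity (the exponential of the mean cross-entropy in nats), while character-level models report bits per character (the mean cross-entropy converted to bits). The numeric losses below are made-up placeholders, not results from the paper.

```python
import math

def perplexity(mean_nll_nats):
    """Word-level metric: exp of the mean negative log-likelihood in nats."""
    return math.exp(mean_nll_nats)

def bits_per_char(mean_nll_nats):
    """Character-level metric: mean negative log-likelihood converted to bits."""
    return mean_nll_nats / math.log(2)

print(perplexity(4.1))        # a word-level loss of 4.1 nats  -> perplexity ~ 60.3
print(bits_per_char(0.83))    # a char-level loss of 0.83 nats -> ~ 1.20 bpc
```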

Investigating the Effects of Word Substitution Errors on Sentence Embeddings

Title Investigating the Effects of Word Substitution Errors on Sentence Embeddings
Authors Rohit Voleti, Julie M. Liss, Visar Berisha
Abstract A key initial step in several natural language processing (NLP) tasks involves embedding phrases of text to vectors of real numbers that preserve semantic meaning. To that end, several methods have been recently proposed with impressive results on semantic similarity tasks. However, all of these approaches assume that perfect transcripts are available when generating the embeddings. While this is a reasonable assumption for analysis of written text, it is limiting for analysis of transcribed text. In this paper we investigate the effects of word substitution errors, such as those introduced by automatic speech recognition (ASR), on several state-of-the-art sentence embedding methods. To do this, we propose a new simulator that allows the experimenter to induce ASR-plausible word substitution errors in a corpus at a desired word error rate. We use this simulator to evaluate the robustness of several sentence embedding methods. Our results show that pre-trained neural sentence encoders are both robust to ASR errors and perform well on textual similarity tasks after errors are introduced. Meanwhile, unweighted averages of word vectors perform well with perfect transcriptions, but their performance degrades rapidly on textual similarity tasks for text with word substitution errors.
Tasks Semantic Similarity, Semantic Textual Similarity, Sentence Embedding, Sentence Embeddings, Speech Recognition
Published 2018-11-16
URL http://arxiv.org/abs/1811.07021v2
PDF http://arxiv.org/pdf/1811.07021v2.pdf
PWC https://paperswithcode.com/paper/investigating-the-effects-of-word
Repo https://github.com/rvoleti89/icaasp-2019-word-subs
Framework none
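
The simulator described above can be sketched as a function that substitutes words at a target word error rate. Real ASR confusions are phonetically plausible; the tiny hand-written confusion table below is only a stand-in for such a confusion model, so the realized error rate is lower than the target whenever a word has no listed confusion.

```python
import random

CONFUSIONS = {"their": ["there", "they're"], "two": ["to", "too"], "see": ["sea"]}

def induce_substitutions(sentence, target_wer=0.2, seed=0):
    """Corrupt a transcript by substituting words at roughly the target word error rate."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if rng.random() < target_wer and word.lower() in CONFUSIONS:
            out.append(rng.choice(CONFUSIONS[word.lower()]))   # ASR-plausible substitution
        else:
            out.append(word)
    return " ".join(out)

print(induce_substitutions("their plan was to see the two ships", target_wer=0.5))
```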

Automatic Gradient Boosting

Title Automatic Gradient Boosting
Authors Janek Thomas, Stefan Coors, Bernd Bischl
Abstract Automatic machine learning performs predictive modeling with high performing machine learning tools without human interference. This is achieved by making machine learning applications parameter-free, i.e. only a dataset is provided while the complete model selection and model building process is handled internally through (often meta) optimization. Projects like Auto-WEKA and auto-sklearn aim to solve the Combined Algorithm Selection and Hyperparameter optimization (CASH) problem, resulting in huge configuration spaces. However, for most real-world applications, optimizing over only a few key learning algorithms can not only be sufficient but also potentially beneficial. The latter becomes apparent when one considers that models have to be validated, explained, deployed and maintained. Here, less complex models are often preferred for validation or efficiency reasons, or are even a strict requirement. Automatic gradient boosting takes this idea one step further, using only gradient boosting as a single learning algorithm in combination with model-based hyperparameter tuning, threshold optimization and encoding of categorical features. We introduce this general framework as well as a concrete implementation called autoxgboost. It is compared to current AutoML projects on 16 datasets and, despite its simplicity, achieves comparable results on about half of the datasets and performs best on two.
Tasks AutoML, Hyperparameter Optimization, Model Selection
Published 2018-07-10
URL http://arxiv.org/abs/1807.03873v2
PDF http://arxiv.org/pdf/1807.03873v2.pdf
PWC https://paperswithcode.com/paper/automatic-gradient-boosting
Repo https://github.com/ja-thomas/autoxgboost
Framework none
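
The overall recipe — a single gradient boosting learner, automated hyperparameter search, and decision-threshold tuning — can be sketched as follows. autoxgboost itself is an R package built on xgboost and model-based (Bayesian) optimization; this Python stand-in uses scikit-learn's gradient boosting with plain random search and shows the structure only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# hyperparameter search over a single learner (random search as a stand-in
# for model-based optimization)
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"n_estimators": [50, 100, 200], "learning_rate": [0.03, 0.1, 0.3], "max_depth": [2, 3, 4]},
    n_iter=8, cv=3, random_state=0).fit(X_tr, y_tr)

# threshold optimization: pick the probability cutoff that maximizes validation F1
probs = search.best_estimator_.predict_proba(X_val)[:, 1]
best_t = max(np.linspace(0.1, 0.9, 17), key=lambda t: f1_score(y_val, probs >= t))
print(search.best_params_, best_t)
```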

Efficient Solvers for Sparse Subspace Clustering

Title Efficient Solvers for Sparse Subspace Clustering
Authors Farhad Pourkamali-Anaraki, James Folberth, Stephen Becker
Abstract Sparse subspace clustering (SSC) clusters $n$ points that lie near a union of low-dimensional subspaces. The SSC model expresses each point as a linear or affine combination of the other points, using either $\ell_1$ or $\ell_0$ regularization. Using $\ell_1$ regularization results in a convex problem but requires $O(n^2)$ storage, and is typically solved by the alternating direction method of multipliers, which takes $O(n^3)$ flops. The $\ell_0$ model is non-convex but only needs memory linear in $n$; it is solved via orthogonal matching pursuit, which cannot handle the case of affine subspaces. This paper shows that a proximal gradient framework can solve SSC, covering both $\ell_1$ and $\ell_0$ models, and both linear and affine constraints. For both $\ell_1$ and $\ell_0$, algorithms to compute the proximity operator in the presence of affine constraints have not been presented in the SSC literature, so we derive an exact and efficient algorithm that solves the $\ell_1$ case with just $O(n^2)$ flops. In the $\ell_0$ case, our algorithm retains the low-memory overhead, and is the first algorithm to solve the SSC-$\ell_0$ model with affine constraints. Experiments show our algorithms do not rely on sensitive regularization parameters, and they are less sensitive to sparsity misspecification and high noise.
Tasks
Published 2018-04-17
URL https://arxiv.org/abs/1804.06291v2
PDF https://arxiv.org/pdf/1804.06291v2.pdf
PWC https://paperswithcode.com/paper/efficient-solvers-for-sparse-subspace
Repo https://github.com/stephenbeckr/SSC
Framework none
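
The proximal gradient structure for one column of the SSC-$\ell_1$ problem can be sketched as below, without the affine constraint: a gradient step on the quadratic self-expression term followed by soft-thresholding, with the no-self-expression constraint enforced by zeroing the diagonal entry. The paper's contribution includes an exact prox for the affine-constrained case, which is omitted here; the step size and regularization weight are illustrative defaults.

```python
import numpy as np

def ssc_l1_column(X, i, lam=0.1, step=None, iters=500):
    """Minimize 0.5*||x_i - X c||^2 + lam*||c||_1 with c_i = 0 by proximal gradient."""
    n = X.shape[1]
    x = X[:, i]
    c = np.zeros(n)
    if step is None:
        step = 1.0 / np.linalg.norm(X, 2) ** 2          # 1 / Lipschitz constant of the gradient
    for _ in range(iters):
        grad = X.T @ (X @ c - x)
        c = c - step * grad
        c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)   # soft-thresholding prox
        c[i] = 0.0                                       # no self-expression
    return c

X = np.random.default_rng(0).normal(size=(20, 50))
print(np.count_nonzero(ssc_l1_column(X, 0)))             # sparse self-expression coefficients
```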

Surprising Negative Results for Generative Adversarial Tree Search

Title Surprising Negative Results for Generative Adversarial Tree Search
Authors Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C Lipton, Animashree Anandkumar
Abstract While many recent advances in deep reinforcement learning (RL) rely on model-free methods, model-based approaches remain an alluring prospect for their potential to exploit unsupervised data to learn an environment model. In this work, we provide an extensive study on the design of deep generative models for RL environments and propose a sample efficient and robust method to learn the model of Atari environments. We deploy this model and propose generative adversarial tree search (GATS), a deep RL algorithm that learns the environment model and implements Monte Carlo tree search (MCTS) on the learned model for planning. While MCTS on the learned model is computationally expensive, similar to AlphaGo, GATS follows depth-limited MCTS. GATS employs a deep Q network (DQN) and learns a Q-function to assign values to the leaves of the tree in MCTS. We theoretically analyze GATS vis-a-vis the bias-variance trade-off and show GATS is able to mitigate the worst-case error in the Q-estimate. While we were expecting GATS to enjoy better sample complexity and faster convergence to better policies, surprisingly, GATS fails to outperform DQN. We provide a study in which we show why depth-limited MCTS fails to perform desirably.
Tasks Atari Games
Published 2018-06-15
URL https://arxiv.org/abs/1806.05780v4
PDF https://arxiv.org/pdf/1806.05780v4.pdf
PWC https://paperswithcode.com/paper/surprising-negative-results-for-generative
Repo https://github.com/bclyang/updated-atari-env
Framework none
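
The planning step can be illustrated with a depth-limited lookahead over a learned environment model, where a learned Q-function evaluates the leaves. GATS uses MCTS over a GAN-based model of Atari frames together with a DQN; the exhaustive lookahead over toy linear models below only shows how model predictions, rewards, and leaf Q-values are combined.

```python
import torch
import torch.nn as nn

n_actions, state_dim, gamma = 3, 4, 0.99
model = nn.Linear(state_dim + n_actions, state_dim + 1)   # learned dynamics: next state + reward
q_net = nn.Linear(state_dim, n_actions)                   # DQN-style Q-function for leaf values

def one_hot(a):
    v = torch.zeros(n_actions)
    v[a] = 1.0
    return v

def plan_value(state, depth):
    if depth == 0:
        return float(q_net(state).max())                  # leaf value from the Q-network
    return max(action_value(state, a, depth - 1) for a in range(n_actions))

def action_value(state, a, depth):
    out = model(torch.cat([state, one_hot(a)]))           # roll the learned model forward
    next_state, reward = out[:-1], out[-1]
    return float(reward) + gamma * plan_value(next_state, depth)

state = torch.randn(state_dim)
best_action = max(range(n_actions), key=lambda a: action_value(state, a, depth=2))
print(best_action)
```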

Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization

Title Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization
Authors Guokan Shang, Wensi Ding, Zekun Zhang, Antoine Jean-Pierre Tixier, Polykarpos Meladianos, Michalis Vazirgiannis, Jean-Pierre Lorré
Abstract We introduce a novel graph-based framework for abstractive meeting speech summarization that is fully unsupervised and does not rely on any annotations. Our work combines the strengths of multiple recent approaches while addressing their weaknesses. Moreover, we leverage recent advances in word embeddings and graph degeneracy applied to NLP to take exterior semantic knowledge into account, and to design custom diversity and informativeness measures. Experiments on the AMI and ICSI corpus show that our system improves on the state-of-the-art. Code and data are publicly available, and our system can be interactively tested.
Tasks Abstractive Text Summarization, Dialogue Understanding, Meeting Summarization, Sentence Compression, Word Embeddings
Published 2018-05-14
URL http://arxiv.org/abs/1805.05271v2
PDF http://arxiv.org/pdf/1805.05271v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-abstractive-meeting
Repo https://github.com/bearblog/CoreRank
Framework none
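
The budgeted submodular maximization step can be sketched as a greedy selection by marginal gain per unit cost until a word budget is exhausted. The coverage objective below is a simple word-weight placeholder; the paper's objective combines custom informativeness and diversity measures built from word embeddings and graph degeneracy.

```python
def summarize(sentences, scores, budget=20):
    """Greedy budgeted selection: sentences is a list of strings, scores maps
    word -> importance weight, budget is a maximum summary length in words."""
    chosen, covered, used = [], set(), 0
    remaining = list(range(len(sentences)))
    while remaining:
        def gain_per_cost(i):
            words = set(sentences[i].lower().split())
            gain = sum(scores.get(w, 0.0) for w in words - covered)   # marginal coverage gain
            return gain / max(len(sentences[i].split()), 1)
        best = max(remaining, key=gain_per_cost)
        cost = len(sentences[best].split())
        if used + cost <= budget and gain_per_cost(best) > 0:
            chosen.append(best)
            covered |= set(sentences[best].lower().split())
            used += cost
        remaining.remove(best)
    return [sentences[i] for i in sorted(chosen)]

print(summarize(
    ["the team agreed on a plastic remote control",
     "the remote control should be plastic",
     "someone mentioned the weather"],
    {"remote": 2.0, "control": 2.0, "plastic": 1.5, "team": 1.0}, budget=12))
```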

Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks

Title Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks
Authors Martin Trapp, Robert Peharz, Carl E. Rasmussen, Franz Pernkopf
Abstract While Gaussian processes (GPs) are the method of choice for regression tasks, they also come with practical difficulties, as inference cost scales cubically in time and quadratically in memory. In this paper, we introduce a natural and expressive way to tackle these problems, by incorporating GPs in sum-product networks (SPNs), a recently proposed tractable probabilistic model allowing exact and efficient inference. In particular, by using GPs as leaves of an SPN we obtain a novel flexible prior over functions, which implicitly represents an exponentially large mixture of local GPs. Exact and efficient posterior inference in this model can be done in a natural interplay of the inference mechanisms in GPs and SPNs. Thereby, each GP is – similarly as in a mixture of experts approach – responsible only for a subset of data points, which effectively reduces inference cost in a divide-and-conquer fashion. We show that integrating GPs into the SPN framework leads to a promising probabilistic regression model which is: (1) computationally and memory efficient, (2) allows efficient and exact posterior inference, (3) is flexible enough to mix different kernel functions, and (4) naturally accounts for non-stationarities in time series. In a variety of experiments, we show that the SPN-GP model can learn input-dependent parameters and hyper-parameters and is on par with or outperforms traditional GPs as well as state-of-the-art approximations on real-world data.
Tasks Gaussian Processes, Time Series
Published 2018-09-12
URL http://arxiv.org/abs/1809.04400v1
PDF http://arxiv.org/pdf/1809.04400v1.pdf
PWC https://paperswithcode.com/paper/learning-deep-mixtures-of-gaussian-process
Repo https://github.com/eugene/spngp
Framework none
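
A very rough illustration of the mixture-of-local-GP-experts intuition: split the input space, fit a separate GP on each part, and take predictions from the responsible expert. SPN-GP organizes such splits and mixtures inside a sum-product network with exact posterior inference; none of that machinery appears in the toy sketch below, and the data and split point are made up.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.where(X[:, 0] < 5, np.sin(X[:, 0]), 0.2 * X[:, 0])   # non-stationary toy target

split = 5.0
experts = {}
for name, mask in [("left", X[:, 0] < split), ("right", X[:, 0] >= split)]:
    # each local expert only sees (and pays inference cost for) its own points
    experts[name] = GaussianProcessRegressor(kernel=RBF(), alpha=1e-2).fit(X[mask], y[mask])

X_test = np.linspace(0, 10, 5).reshape(-1, 1)
pred = np.where(X_test[:, 0] < split,
                experts["left"].predict(X_test),
                experts["right"].predict(X_test))
print(pred)
```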