October 20, 2019

3210 words 16 mins read

Paper Group AWR 186

Alarm-Based Prescriptive Process Monitoring. Segmentation of Scanning Tunneling Microscopy Images Using Variational Methods and Empirical Wavelets. Efficient Video Object Segmentation via Network Modulation. Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference. Conditional WaveGAN. Role action embeddings: sc …

Alarm-Based Prescriptive Process Monitoring

Title Alarm-Based Prescriptive Process Monitoring
Authors Irene Teinemaa, Niek Tax, Massimiliano de Leoni, Marlon Dumas, Fabrizio Maria Maggi
Abstract Predictive process monitoring is concerned with the analysis of events produced during the execution of a process in order to predict the future state of ongoing cases thereof. Existing techniques in this field are able to predict, at each step of a case, the likelihood that the case will end up in an undesired outcome. These techniques, however, do not take into account what process workers may do with the generated predictions in order to decrease the likelihood of undesired outcomes. This paper proposes a framework for prescriptive process monitoring, which extends predictive process monitoring approaches with the concepts of alarms, interventions, compensations, and mitigation effects. The framework incorporates a parameterized cost model to assess the cost-benefit tradeoffs of applying prescriptive process monitoring in a given setting. The paper also outlines an approach to optimize the generation of alarms given a dataset and a set of cost model parameters. The proposed approach is empirically evaluated using a range of real-life event logs.
Tasks
Published 2018-03-23
URL http://arxiv.org/abs/1803.08706v2
PDF http://arxiv.org/pdf/1803.08706v2.pdf
PWC https://paperswithcode.com/paper/alarm-based-prescriptive-process-monitoring
Repo https://github.com/samadeusfp/prescriptiveProcessMonitoring
Framework none
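
The core trade-off in the abstract above can be illustrated with a toy cost model: raise an alarm only when the expected cost of intervening is lower than the expected cost of doing nothing. The sketch below is not the authors' framework; the parameter names (`c_out`, `c_in`, `eff`) and their default values are illustrative placeholders for the cost of an undesired outcome, the cost of an intervention, and its mitigation effectiveness.

```python
def expected_cost(p_undesired, alarm, c_out=100.0, c_in=10.0, eff=0.7):
    """Expected cost of a case given the predicted probability of an undesired
    outcome, whether an alarm (intervention) is raised, the cost of the undesired
    outcome (c_out), the intervention cost (c_in), and its mitigation effect (eff)."""
    if alarm:
        # Intervening always costs c_in and reduces the chance of the
        # undesired outcome by the factor eff.
        return c_in + (1.0 - eff) * p_undesired * c_out
    return p_undesired * c_out

def should_alarm(p_undesired, **cost_params):
    """Raise an alarm iff intervening has lower expected cost than doing nothing."""
    return (expected_cost(p_undesired, True, **cost_params)
            < expected_cost(p_undesired, False, **cost_params))

# With the defaults, alarms fire once the predicted probability exceeds
# roughly c_in / (eff * c_out) ~= 0.14.
for p in (0.05, 0.15, 0.5):
    print(p, should_alarm(p))
```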

Segmentation of Scanning Tunneling Microscopy Images Using Variational Methods and Empirical Wavelets

Title Segmentation of Scanning Tunneling Microscopy Images Using Variational Methods and Empirical Wavelets
Authors Kevin Bui, Jacob Fauman, David Kes, Leticia Torres Mandiola, Adina Ciomaga, Ricardo Salazar, Andrea L. Bertozzi, Jerome Gilles, Andrew I. Guttentag, Paul S. Weiss
Abstract In the fields of nanoscience and nanotechnology, it is important to be able to functionalize surfaces chemically for a wide variety of applications. Scanning tunneling microscopes (STMs) are important instruments in this area used to measure the surface structure and chemistry with better than molecular resolution. Self-assembly is frequently used to create monolayers that redefine the surface chemistry in just a single-molecule-thick layer. Indeed, STM images reveal rich information about the structure of self-assembled monolayers since they convey chemical and physical properties of the studied material. In order to assist in and to enhance the analysis of STM and other images, we propose and demonstrate an image-processing framework that produces two image segmentations: one is based on intensities (apparent heights in STM images) and the other is based on textural patterns. The proposed framework begins with a cartoon+texture decomposition, which separates an image into its cartoon and texture components. Afterward, the cartoon image is segmented by a modified multiphase version of the local Chan-Vese model, while the texture image is segmented by a combination of 2D empirical wavelet transform and a clustering algorithm. Overall, our proposed framework contains several new features, specifically in presenting a new application of cartoon+texture decomposition and of the empirical wavelet transforms and in developing a specialized framework to segment STM images and other data. To demonstrate the potential of our approach, we apply it to actual STM images of cyanide monolayers on Au{111} and present their corresponding segmentation results.
Tasks
Published 2018-04-24
URL http://arxiv.org/abs/1804.08890v1
PDF http://arxiv.org/pdf/1804.08890v1.pdf
PWC https://paperswithcode.com/paper/segmentation-of-scanning-tunneling-microscopy
Repo https://github.com/kbui1993/Microscopy-Codes
Framework none
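
As a rough illustration of the two-branch pipeline described above, the sketch below substitutes off-the-shelf components for the paper's building blocks: total-variation denoising stands in for the cartoon+texture decomposition, scikit-image's two-phase Chan-Vese for the modified multiphase local Chan-Vese model, and a Gabor filter bank with k-means for the 2D empirical wavelet transform plus clustering. It shows the structure of the framework only, not the authors' method, and uses a standard test image in place of an STM image.

```python
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_chambolle
from skimage.segmentation import chan_vese
from skimage.filters import gabor
from sklearn.cluster import KMeans

image = img_as_float(data.camera())              # placeholder for an STM image

# 1) cartoon + texture split (TV denoising as a stand-in for the decomposition)
cartoon = denoise_tv_chambolle(image, weight=0.1)
texture = image - cartoon

# 2) intensity-based segmentation of the cartoon component
intensity_seg = chan_vese(cartoon, mu=0.25)

# 3) texture-based segmentation: filter-bank responses clustered per pixel
responses = [np.abs(gabor(texture, frequency=f, theta=t)[0])
             for f in (0.1, 0.2, 0.4) for t in (0, np.pi / 4, np.pi / 2)]
features = np.stack([r.ravel() for r in responses], axis=1)
texture_seg = KMeans(n_clusters=3, n_init=10).fit_predict(features).reshape(image.shape)

print(intensity_seg.shape, texture_seg.shape)
```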

Efficient Video Object Segmentation via Network Modulation

Title Efficient Video Object Segmentation via Network Modulation
Authors Linjie Yang, Yanran Wang, Xuehan Xiong, Jianchao Yang, Aggelos K. Katsaggelos
Abstract Video object segmentation aims to segment a specific object throughout a video sequence, given only an annotated first frame. Recent deep learning based approaches find it effective to fine-tune a general-purpose segmentation model on the annotated frame using hundreds of iterations of gradient descent. Despite the high accuracy these methods achieve, the fine-tuning process is inefficient and fails to meet the requirements of real-world applications. We propose a novel approach that uses a single forward pass to adapt the segmentation model to the appearance of a specific object. Specifically, a second meta neural network named modulator is learned to manipulate the intermediate layers of the segmentation network given limited visual and spatial information of the target object. The experiments show that our approach is 70 times faster than fine-tuning approaches while achieving similar accuracy.
Tasks Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published 2018-02-04
URL http://arxiv.org/abs/1802.01218v1
PDF http://arxiv.org/pdf/1802.01218v1.pdf
PWC https://paperswithcode.com/paper/efficient-video-object-segmentation-via
Repo https://github.com/linjieyangsc/video_seg
Framework tf
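
The single-forward-pass adaptation idea can be sketched in PyTorch as a small modulator network that maps the annotated first frame to channel-wise parameters applied to an intermediate feature map of the segmentation network. The layer sizes, the use of a shift term alongside the scale, and the toy inputs below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Modulator(nn.Module):
    """Maps the annotated first frame (RGB + mask) to per-channel scale and shift."""
    def __init__(self, channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 2 * channels),
        )

    def forward(self, first_frame_rgb_mask):
        scale, shift = self.net(first_frame_rgb_mask).chunk(2, dim=1)
        return scale.unsqueeze(-1).unsqueeze(-1), shift.unsqueeze(-1).unsqueeze(-1)

# Inside the segmentation network, an intermediate feature map is modulated as
#   features = scale * features + shift   (one forward pass, no fine-tuning).
frame_and_mask = torch.randn(1, 4, 128, 128)   # RGB + binary mask of the target object
features = torch.randn(1, 64, 32, 32)          # some intermediate feature map
scale, shift = Modulator(64)(frame_and_mask)
modulated = scale * features + shift
print(modulated.shape)
```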

Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference

Title Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference
Authors Mike Wu, Noah Goodman, Stefano Ermon
Abstract Stochastic optimization techniques are standard in variational inference algorithms. These methods estimate gradients by approximating expectations with independent Monte Carlo samples. In this paper, we explore a technique that uses correlated, but more representative, samples to reduce estimator variance. Specifically, we show how to generate antithetic samples that match sample moments with the true moments of an underlying importance distribution. Combining a differentiable antithetic sampler with modern stochastic variational inference, we showcase the effectiveness of this approach for learning a deep generative model.
Tasks Stochastic Optimization
Published 2018-10-05
URL https://arxiv.org/abs/1810.02555v2
PDF https://arxiv.org/pdf/1810.02555v2.pdf
PWC https://paperswithcode.com/paper/differentiable-antithetic-sampling-for
Repo https://github.com/mhw32/antithetic-vae-public
Framework pytorch
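
A simplified illustration of the idea: in a reparameterized Gaussian posterior, each noise draw can be paired with its negation so that the sample mean of the noise is exactly zero while gradients still flow to the variational parameters. The paper's sampler goes further and matches higher moments of the underlying distribution; the sketch below only shows the classic antithetic-pair version.

```python
import torch

def antithetic_rsample(mu, log_sigma, num_pairs=8):
    eps = torch.randn(num_pairs, *mu.shape)
    eps = torch.cat([eps, -eps], dim=0)       # antithetic pairs: noise mean is exactly zero
    return mu + log_sigma.exp() * eps         # reparameterization trick keeps gradients

mu = torch.zeros(3, requires_grad=True)
log_sigma = torch.zeros(3, requires_grad=True)
z = antithetic_rsample(mu, log_sigma)
print(z.mean(dim=0))                          # ~0 up to floating-point error
z.pow(2).sum().backward()                     # gradients reach mu and log_sigma
```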

Conditional WaveGAN

Title Conditional WaveGAN
Authors Chae Young Lee, Anoop Toffy, Gue Jun Jung, Woo-Jin Han
Abstract Generative models have been used successfully for image synthesis in recent years, but little progress has been made for other modalities such as audio and text. Recent works focus on generating audio from a generative model in an unsupervised setting. We explore the possibility of using generative models conditioned on class labels. Concatenation-based conditioning and conditional scaling were explored in this work with various hyper-parameter tuning methods. In this paper, we introduce Conditional WaveGANs (cWaveGAN). Find our implementation at https://github.com/acheketa/cwavegan
Tasks Audio Generation
Published 2018-09-27
URL http://arxiv.org/abs/1809.10636v1
PDF http://arxiv.org/pdf/1809.10636v1.pdf
PWC https://paperswithcode.com/paper/conditional-wavegan
Repo https://github.com/acheketa/cwavegan
Framework tf
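
Concatenation-based conditioning, one of the two schemes mentioned in the abstract, amounts to embedding the class label and concatenating it with the latent noise vector before the generator. The sketch below uses a placeholder fully-connected generator body rather than the WaveGAN upsampling stack; all sizes are illustrative.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, num_classes=10, latent_dim=100, label_dim=16, out_len=16384):
        super().__init__()
        self.embed = nn.Embedding(num_classes, label_dim)
        self.body = nn.Sequential(              # placeholder for the WaveGAN upsampling stack
            nn.Linear(latent_dim + label_dim, 256), nn.ReLU(),
            nn.Linear(256, out_len), nn.Tanh(),
        )

    def forward(self, z, labels):
        cond = self.embed(labels)               # (batch, label_dim)
        return self.body(torch.cat([z, cond], dim=1))

gen = ConditionalGenerator()
waveform = gen(torch.randn(4, 100), torch.tensor([0, 3, 3, 7]))
print(waveform.shape)                           # (4, 16384) audio samples in [-1, 1]
```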

Role action embeddings: scalable representation of network positions

Title Role action embeddings: scalable representation of network positions
Authors George Berry
Abstract We consider the question of embedding nodes with similar local neighborhoods together in embedding space, commonly referred to as “role embeddings.” We propose RAE, an unsupervised framework that learns role embeddings. It combines a within-node loss function and a graph neural network (GNN) architecture to place nodes with similar local neighborhoods close in embedding space. We also propose a faster way of generating negative examples called neighbor shuffling, which quickly creates negative examples directly within batches. These techniques can be easily combined with existing GNN methods to create unsupervised role embeddings at scale. We then explore role action embeddings, which summarize the non-structural features in a node’s neighborhood, leading to better performance on node classification tasks. We find that the model architecture proposed here provides strong performance on both graph and node classification tasks, in some cases competitive with semi-supervised methods.
Tasks Node Classification
Published 2018-11-19
URL http://arxiv.org/abs/1811.08019v2
PDF http://arxiv.org/pdf/1811.08019v2.pdf
PWC https://paperswithcode.com/paper/role-action-embeddings-scalable
Repo https://github.com/georgeberry/role-action-embeddings
Framework pytorch
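
Neighbor shuffling, as described above, creates negative examples directly within a batch by permuting which aggregated-neighborhood vector is paired with which node, so no extra sampling pass over the graph is needed. The margin loss below is an illustrative stand-in for the RAE objective, assuming node and neighborhood embeddings of the same dimension.

```python
import torch
import torch.nn.functional as F

def within_batch_loss(node_vecs, neigh_vecs):
    """node_vecs, neigh_vecs: (batch, dim) embeddings of each node and of its
    aggregated local neighborhood (e.g. from a GNN encoder)."""
    pos = F.cosine_similarity(node_vecs, neigh_vecs)          # true node/neighborhood pairs
    perm = torch.randperm(neigh_vecs.size(0))
    neg = F.cosine_similarity(node_vecs, neigh_vecs[perm])    # neighbor-shuffled negatives
    # margin loss: pull true pairs together, push shuffled pairs apart
    return F.relu(1.0 - pos + neg).mean()

node_vecs = torch.randn(32, 64, requires_grad=True)
neigh_vecs = torch.randn(32, 64, requires_grad=True)
loss = within_batch_loss(node_vecs, neigh_vecs)
loss.backward()
print(float(loss))
```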

Paraphrase Thought: Sentence Embedding Module Imitating Human Language Recognition

Title Paraphrase Thought: Sentence Embedding Module Imitating Human Language Recognition
Authors Myeongjun Jang, Pilsung Kang
Abstract Sentence embedding is an important research topic in natural language processing. It is essential to generate a good embedding vector that fully reflects the semantic meaning of a sentence in order to achieve an enhanced performance for various natural language processing tasks, such as machine translation and document classification. Thus far, various sentence embedding models have been proposed, and their feasibility has been demonstrated through good performances on tasks following embedding, such as sentiment analysis and sentence classification. However, because the performances of sentence classification and sentiment analysis can be enhanced by using a simple sentence representation method, it is not sufficient to claim that these models fully reflect the meanings of sentences based on good performances for such tasks. In this paper, inspired by human language recognition, we propose the following concept of semantic coherence, which should be satisfied for a good sentence embedding method: similar sentences should be located close to each other in the embedding space. Then, we propose the Paraphrase-Thought (P-thought) model to pursue semantic coherence as much as possible. Experimental results on two paraphrase identification datasets (MS COCO and STS benchmark) show that the P-thought models outperform the benchmarked sentence embedding methods.
Tasks Document Classification, Machine Translation, Paraphrase Identification, Sentence Classification, Sentence Embedding, Sentiment Analysis
Published 2018-08-16
URL http://arxiv.org/abs/1808.05505v3
PDF http://arxiv.org/pdf/1808.05505v3.pdf
PWC https://paperswithcode.com/paper/paraphrase-thought-sentence-embedding-module
Repo https://github.com/MJ-Jang/Paraphrase-Thought
Framework tf
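
The semantic coherence criterion can be made concrete as a simple check: paraphrase pairs should sit closer in embedding space than unrelated pairs. In the sketch below, `embed` is a deliberately crude bag-of-words placeholder standing in for any sentence encoder; only the evaluation structure is the point.

```python
import numpy as np

def embed(sentence, dim=256):
    """Toy hashed bag-of-words embedding; replace with any sentence encoder."""
    vec = np.zeros(dim)
    for word in sentence.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

def coherence_gap(paraphrase_pairs, random_pairs):
    """Mean cosine similarity of paraphrase pairs minus that of random pairs;
    larger is better under the semantic-coherence criterion."""
    sim = lambda a, b: float(embed(a) @ embed(b))
    return (np.mean([sim(a, b) for a, b in paraphrase_pairs])
            - np.mean([sim(a, b) for a, b in random_pairs]))

print(coherence_gap(
    [("a man is playing a guitar", "a man plays the guitar")],
    [("a man is playing a guitar", "the stock market fell today")]))
```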

Learning a Code: Machine Learning for Approximate Non-Linear Coded Computation

Title Learning a Code: Machine Learning for Approximate Non-Linear Coded Computation
Authors Jack Kosaian, K. V. Rashmi, Shivaram Venkataraman
Abstract Machine learning algorithms are typically run on large scale, distributed compute infrastructure that routinely faces a number of unavailabilities such as failures and temporary slowdowns. Adding redundant computations using coding-theoretic tools called “codes” is an emerging technique to alleviate the adverse effects of such unavailabilities. A code consists of an encoding function that proactively introduces redundant computation and a decoding function that reconstructs unavailable outputs using the available ones. Past work focuses on using codes to provide resilience for linear computations and specific iterative optimization algorithms. However, computations performed for a variety of applications, including inference on state-of-the-art machine learning algorithms such as neural networks, typically fall outside this realm. In this paper, we propose taking a learning-based approach to designing codes that can handle non-linear computations. We present carefully designed neural network architectures and a training methodology for learning encoding and decoding functions that produce approximate reconstructions of unavailable computation results. We present extensive experimental results demonstrating the effectiveness of the proposed approach: we show that our learned codes can accurately reconstruct 64-98% of the unavailable predictions from neural-network based image classifiers on the MNIST, Fashion-MNIST, and CIFAR-10 datasets. To the best of our knowledge, this work proposes the first learning-based approach for designing codes, and also presents the first coding-theoretic solution that can provide resilience for any non-linear (differentiable) computation. Our results show that learning can be an effective technique for designing codes, and that learned codes are a highly promising approach for bringing the benefits of coding to non-linear computations.
Tasks
Published 2018-06-04
URL http://arxiv.org/abs/1806.01259v1
PDF http://arxiv.org/pdf/1806.01259v1.pdf
PWC https://paperswithcode.com/paper/learning-a-code-machine-learning-for
Repo https://github.com/Thesys-lab/learned-cc
Framework pytorch
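
A toy version of the learning-based coding idea for k = 2 inputs and one parity unit: an encoder network produces a redundant parity input, the deployed base function runs on all three inputs, and a decoder reconstructs a missing output from the two that remain available. The linear base function and tiny architectures below are illustrative; the paper targets non-linear computations such as neural-network inference.

```python
import torch
import torch.nn as nn

d = 8
F_base = nn.Linear(d, d)    # stand-in for the deployed computation (non-linear in the paper)
encoder = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))
decoder = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(200):
    x1, x2 = torch.randn(64, d), torch.randn(64, d)
    parity = encoder(torch.cat([x1, x2], dim=1))          # redundant "coded" input
    y1, y2, y_par = F_base(x1), F_base(x2), F_base(parity)
    # simulate x1's worker being unavailable: reconstruct its output from the rest
    y1_hat = decoder(torch.cat([y2, y_par], dim=1))
    loss = (y1_hat - y1.detach()).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))    # reconstruction error of the unavailable output after training
```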

An Analysis of Neural Language Modeling at Multiple Scales

Title An Analysis of Neural Language Modeling at Multiple Scales
Authors Stephen Merity, Nitish Shirish Keskar, Richard Socher
Abstract Many of the leading approaches in language modeling introduce novel, complex and specialized architectures. We take existing state-of-the-art word level language models based on LSTMs and QRNNs and extend them to both larger vocabularies as well as character-level granularity. When properly tuned, LSTMs and QRNNs achieve state-of-the-art results on character-level (Penn Treebank, enwik8) and word-level (WikiText-103) datasets, respectively. Results are obtained in only 12 hours (WikiText-103) to 2 days (enwik8) using a single modern GPU.
Tasks Language Modelling
Published 2018-03-22
URL http://arxiv.org/abs/1803.08240v1
PDF http://arxiv.org/pdf/1803.08240v1.pdf
PWC https://paperswithcode.com/paper/an-analysis-of-neural-language-modeling-at
Repo https://github.com/arvieFrydenlund/awd-lstm-lm
Framework pytorch
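
For reference, the quantities reported on these benchmarks relate to the training loss as follows: word-level models report perplexity (the exponential of the mean cross-entropy in nats), while character-level models report bits per character (the mean cross-entropy converted to bits). The numeric losses below are made-up placeholders, not results from the paper.

```python
import math

def perplexity(mean_nll_nats):
    """Word-level metric: exp of the mean negative log-likelihood in nats."""
    return math.exp(mean_nll_nats)

def bits_per_char(mean_nll_nats):
    """Character-level metric: mean negative log-likelihood converted to bits."""
    return mean_nll_nats / math.log(2)

print(perplexity(4.1))        # a word-level loss of 4.1 nats  -> perplexity ~ 60.3
print(bits_per_char(0.83))    # a char-level loss of 0.83 nats -> ~ 1.20 bpc
```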

Investigating the Effects of Word Substitution Errors on Sentence Embeddings

Title Investigating the Effects of Word Substitution Errors on Sentence Embeddings
Authors Rohit Voleti, Julie M. Liss, Visar Berisha
Abstract A key initial step in several natural language processing (NLP) tasks involves embedding phrases of text to vectors of real numbers that preserve semantic meaning. To that end, several methods have been recently proposed with impressive results on semantic similarity tasks. However, all of these approaches assume that perfect transcripts are available when generating the embeddings. While this is a reasonable assumption for analysis of written text, it is limiting for analysis of transcribed text. In this paper we investigate the effects of word substitution errors, such as those introduced by automatic speech recognition (ASR), on several state-of-the-art sentence embedding methods. To do this, we propose a new simulator that allows the experimenter to induce ASR-plausible word substitution errors in a corpus at a desired word error rate. We use this simulator to evaluate the robustness of several sentence embedding methods. Our results show that pre-trained neural sentence encoders are both robust to ASR errors and perform well on textual similarity tasks after errors are introduced. Meanwhile, unweighted averages of word vectors perform well with perfect transcriptions, but their performance degrades rapidly on textual similarity tasks for text with word substitution errors.
Tasks Semantic Similarity, Semantic Textual Similarity, Sentence Embedding, Sentence Embeddings, Speech Recognition
Published 2018-11-16
URL http://arxiv.org/abs/1811.07021v2
PDF http://arxiv.org/pdf/1811.07021v2.pdf
PWC https://paperswithcode.com/paper/investigating-the-effects-of-word
Repo https://github.com/rvoleti89/icaasp-2019-word-subs
Framework none
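
The simulator described above can be sketched as a function that substitutes words at a target word error rate. Real ASR confusions are phonetically plausible; the tiny hand-written confusion table below is only a stand-in for such a confusion model, so the realized error rate is lower than the target whenever a word has no listed confusion.

```python
import random

CONFUSIONS = {"their": ["there", "they're"], "two": ["to", "too"], "see": ["sea"]}

def induce_substitutions(sentence, target_wer=0.2, seed=0):
    """Corrupt a transcript by substituting words at roughly the target word error rate."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if rng.random() < target_wer and word.lower() in CONFUSIONS:
            out.append(rng.choice(CONFUSIONS[word.lower()]))   # ASR-plausible substitution
        else:
            out.append(word)
    return " ".join(out)

print(induce_substitutions("their plan was to see the two ships", target_wer=0.5))
```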

Automatic Gradient Boosting

Title Automatic Gradient Boosting
Authors Janek Thomas, Stefan Coors, Bernd Bischl
Abstract Automatic machine learning performs predictive modeling with high performing machine learning tools without human interference. This is achieved by making machine learning applications parameter-free, i.e. only a dataset is provided while the complete model selection and model building process is handled internally through (often meta) optimization. Projects like Auto-WEKA and auto-sklearn aim to solve the Combined Algorithm Selection and Hyperparameter optimization (CASH) problem, resulting in huge configuration spaces. However, for most real-world applications, optimizing over only a few key learning algorithms can not only be sufficient but also potentially beneficial. The latter becomes apparent when one considers that models have to be validated, explained, deployed and maintained. Here, less complex models are often preferred for validation or efficiency reasons, or are even a strict requirement. Automatic gradient boosting takes this idea one step further, using only gradient boosting as a single learning algorithm in combination with model-based hyperparameter tuning, threshold optimization and encoding of categorical features. We introduce this general framework as well as a concrete implementation called autoxgboost. It is compared to current AutoML projects on 16 datasets and, despite its simplicity, achieves comparable results on about half of the datasets and performs best on two.
Tasks AutoML, Hyperparameter Optimization, Model Selection
Published 2018-07-10
URL http://arxiv.org/abs/1807.03873v2
PDF http://arxiv.org/pdf/1807.03873v2.pdf
PWC https://paperswithcode.com/paper/automatic-gradient-boosting
Repo https://github.com/ja-thomas/autoxgboost
Framework none
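
The overall recipe — a single gradient boosting learner, automated hyperparameter search, and decision-threshold tuning — can be sketched as follows. autoxgboost itself is an R package built on xgboost and model-based (Bayesian) optimization; this Python stand-in uses scikit-learn's gradient boosting with plain random search and shows the structure only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# hyperparameter search over a single learner (random search as a stand-in
# for model-based optimization)
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    {"n_estimators": [50, 100, 200], "learning_rate": [0.03, 0.1, 0.3], "max_depth": [2, 3, 4]},
    n_iter=8, cv=3, random_state=0).fit(X_tr, y_tr)

# threshold optimization: pick the probability cutoff that maximizes validation F1
probs = search.best_estimator_.predict_proba(X_val)[:, 1]
best_t = max(np.linspace(0.1, 0.9, 17), key=lambda t: f1_score(y_val, probs >= t))
print(search.best_params_, best_t)
```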

Efficient Solvers for Sparse Subspace Clustering

Title Efficient Solvers for Sparse Subspace Clustering
Authors Farhad Pourkamali-Anaraki, James Folberth, Stephen Becker
Abstract Sparse subspace clustering (SSC) clusters $n$ points that lie near a union of low-dimensional subspaces. The SSC model expresses each point as a linear or affine combination of the other points, using either $\ell_1$ or $\ell_0$ regularization. Using $\ell_1$ regularization results in a convex problem but requires $O(n^2)$ storage, and is typically solved by the alternating direction method of multipliers, which takes $O(n^3)$ flops. The $\ell_0$ model is non-convex but only needs memory linear in $n$; it is solved via orthogonal matching pursuit, which cannot handle the case of affine subspaces. This paper shows that a proximal gradient framework can solve SSC, covering both $\ell_1$ and $\ell_0$ models, and both linear and affine constraints. For both $\ell_1$ and $\ell_0$, algorithms to compute the proximity operator in the presence of affine constraints have not been presented in the SSC literature, so we derive an exact and efficient algorithm that solves the $\ell_1$ case with just $O(n^2)$ flops. In the $\ell_0$ case, our algorithm retains the low-memory overhead, and is the first algorithm to solve the SSC-$\ell_0$ model with affine constraints. Experiments show our algorithms do not rely on sensitive regularization parameters, and they are less sensitive to sparsity misspecification and high noise.
Tasks
Published 2018-04-17
URL https://arxiv.org/abs/1804.06291v2
PDF https://arxiv.org/pdf/1804.06291v2.pdf
PWC https://paperswithcode.com/paper/efficient-solvers-for-sparse-subspace
Repo https://github.com/stephenbeckr/SSC
Framework none
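
The proximal gradient structure for one column of the SSC-$\ell_1$ problem can be sketched as below, without the affine constraint: a gradient step on the quadratic self-expression term followed by soft-thresholding, with the no-self-expression constraint enforced by zeroing the diagonal entry. The paper's contribution includes an exact prox for the affine-constrained case, which is omitted here; the step size and regularization weight are illustrative defaults.

```python
import numpy as np

def ssc_l1_column(X, i, lam=0.1, step=None, iters=500):
    """Minimize 0.5*||x_i - X c||^2 + lam*||c||_1 with c_i = 0 by proximal gradient."""
    n = X.shape[1]
    x = X[:, i]
    c = np.zeros(n)
    if step is None:
        step = 1.0 / np.linalg.norm(X, 2) ** 2          # 1 / Lipschitz constant of the gradient
    for _ in range(iters):
        grad = X.T @ (X @ c - x)
        c = c - step * grad
        c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)   # soft-thresholding prox
        c[i] = 0.0                                       # no self-expression
    return c

X = np.random.default_rng(0).normal(size=(20, 50))
print(np.count_nonzero(ssc_l1_column(X, 0)))             # sparse self-expression coefficients
```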

Surprising Negative Results for Generative Adversarial Tree Search

Title Surprising Negative Results for Generative Adversarial Tree Search
Authors Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C Lipton, Animashree Anandkumar
Abstract While many recent advances in deep reinforcement learning (RL) rely on model-free methods, model-based approaches remain an alluring prospect for their potential to exploit unsupervised data to learn an environment model. In this work, we provide an extensive study on the design of deep generative models for RL environments and propose a sample efficient and robust method to learn the model of Atari environments. We deploy this model and propose generative adversarial tree search (GATS), a deep RL algorithm that learns the environment model and implements Monte Carlo tree search (MCTS) on the learned model for planning. While MCTS on the learned model is computationally expensive, similar to AlphaGo, GATS follows depth-limited MCTS. GATS employs a deep Q network (DQN) and learns a Q-function to assign values to the leaves of the tree in MCTS. We theoretically analyze GATS vis-a-vis the bias-variance trade-off and show GATS is able to mitigate the worst-case error in the Q-estimate. While we were expecting GATS to enjoy better sample complexity and faster convergence to better policies, surprisingly, GATS fails to outperform DQN. We provide a study in which we show why depth-limited MCTS fails to perform desirably.
Tasks Atari Games
Published 2018-06-15
URL https://arxiv.org/abs/1806.05780v4
PDF https://arxiv.org/pdf/1806.05780v4.pdf
PWC https://paperswithcode.com/paper/surprising-negative-results-for-generative
Repo https://github.com/bclyang/updated-atari-env
Framework none
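
The planning step can be illustrated with a depth-limited lookahead over a learned environment model, where a learned Q-function evaluates the leaves. GATS uses MCTS over a GAN-based model of Atari frames together with a DQN; the exhaustive lookahead over toy linear models below only shows how model predictions, rewards, and leaf Q-values are combined.

```python
import torch
import torch.nn as nn

n_actions, state_dim, gamma = 3, 4, 0.99
model = nn.Linear(state_dim + n_actions, state_dim + 1)   # learned dynamics: next state + reward
q_net = nn.Linear(state_dim, n_actions)                   # DQN-style Q-function for leaf values

def one_hot(a):
    v = torch.zeros(n_actions)
    v[a] = 1.0
    return v

def plan_value(state, depth):
    if depth == 0:
        return float(q_net(state).max())                  # leaf value from the Q-network
    return max(action_value(state, a, depth - 1) for a in range(n_actions))

def action_value(state, a, depth):
    out = model(torch.cat([state, one_hot(a)]))           # roll the learned model forward
    next_state, reward = out[:-1], out[-1]
    return float(reward) + gamma * plan_value(next_state, depth)

state = torch.randn(state_dim)
best_action = max(range(n_actions), key=lambda a: action_value(state, a, depth=2))
print(best_action)
```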

Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization

Title Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization
Authors Guokan Shang, Wensi Ding, Zekun Zhang, Antoine Jean-Pierre Tixier, Polykarpos Meladianos, Michalis Vazirgiannis, Jean-Pierre Lorré
Abstract We introduce a novel graph-based framework for abstractive meeting speech summarization that is fully unsupervised and does not rely on any annotations. Our work combines the strengths of multiple recent approaches while addressing their weaknesses. Moreover, we leverage recent advances in word embeddings and graph degeneracy applied to NLP to take exterior semantic knowledge into account, and to design custom diversity and informativeness measures. Experiments on the AMI and ICSI corpus show that our system improves on the state-of-the-art. Code and data are publicly available, and our system can be interactively tested.
Tasks Abstractive Text Summarization, Dialogue Understanding, Meeting Summarization, Sentence Compression, Word Embeddings
Published 2018-05-14
URL http://arxiv.org/abs/1805.05271v2
PDF http://arxiv.org/pdf/1805.05271v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-abstractive-meeting
Repo https://github.com/bearblog/CoreRank
Framework none
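
The budgeted submodular maximization step can be sketched as a greedy selection by marginal gain per unit cost until a word budget is exhausted. The coverage objective below is a simple word-weight placeholder; the paper's objective combines custom informativeness and diversity measures built from word embeddings and graph degeneracy.

```python
def summarize(sentences, scores, budget=20):
    """Greedy budgeted selection: sentences is a list of strings, scores maps
    word -> importance weight, budget is a maximum summary length in words."""
    chosen, covered, used = [], set(), 0
    remaining = list(range(len(sentences)))
    while remaining:
        def gain_per_cost(i):
            words = set(sentences[i].lower().split())
            gain = sum(scores.get(w, 0.0) for w in words - covered)   # marginal coverage gain
            return gain / max(len(sentences[i].split()), 1)
        best = max(remaining, key=gain_per_cost)
        cost = len(sentences[best].split())
        if used + cost <= budget and gain_per_cost(best) > 0:
            chosen.append(best)
            covered |= set(sentences[best].lower().split())
            used += cost
        remaining.remove(best)
    return [sentences[i] for i in sorted(chosen)]

print(summarize(
    ["the team agreed on a plastic remote control",
     "the remote control should be plastic",
     "someone mentioned the weather"],
    {"remote": 2.0, "control": 2.0, "plastic": 1.5, "team": 1.0}, budget=12))
```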

Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks

Title Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks
Authors Martin Trapp, Robert Peharz, Carl E. Rasmussen, Franz Pernkopf
Abstract While Gaussian processes (GPs) are the method of choice for regression tasks, they also come with practical difficulties, as inference cost scales cubically in time and quadratically in memory. In this paper, we introduce a natural and expressive way to tackle these problems, by incorporating GPs in sum-product networks (SPNs), a recently proposed tractable probabilistic model allowing exact and efficient inference. In particular, by using GPs as leaves of an SPN we obtain a novel flexible prior over functions, which implicitly represents an exponentially large mixture of local GPs. Exact and efficient posterior inference in this model can be done in a natural interplay of the inference mechanisms in GPs and SPNs. Thereby, each GP is – similarly as in a mixture of experts approach – responsible only for a subset of data points, which effectively reduces inference cost in a divide-and-conquer fashion. We show that integrating GPs into the SPN framework leads to a promising probabilistic regression model which is: (1) computationally and memory efficient, (2) allows efficient and exact posterior inference, (3) is flexible enough to mix different kernel functions, and (4) naturally accounts for non-stationarities in time series. In a variety of experiments, we show that the SPN-GP model can learn input-dependent parameters and hyper-parameters and is on par with or outperforms traditional GPs as well as state-of-the-art approximations on real-world data.
Tasks Gaussian Processes, Time Series
Published 2018-09-12
URL http://arxiv.org/abs/1809.04400v1
PDF http://arxiv.org/pdf/1809.04400v1.pdf
PWC https://paperswithcode.com/paper/learning-deep-mixtures-of-gaussian-process
Repo https://github.com/eugene/spngp
Framework none
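
A very rough illustration of the mixture-of-local-GP-experts intuition: split the input space, fit a separate GP on each part, and take predictions from the responsible expert. SPN-GP organizes such splits and mixtures inside a sum-product network with exact posterior inference; none of that machinery appears in the toy sketch below, and the data and split point are made up.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.where(X[:, 0] < 5, np.sin(X[:, 0]), 0.2 * X[:, 0])   # non-stationary toy target

split = 5.0
experts = {}
for name, mask in [("left", X[:, 0] < split), ("right", X[:, 0] >= split)]:
    # each local expert only sees (and pays inference cost for) its own points
    experts[name] = GaussianProcessRegressor(kernel=RBF(), alpha=1e-2).fit(X[mask], y[mask])

X_test = np.linspace(0, 10, 5).reshape(-1, 1)
pred = np.where(X_test[:, 0] < split,
                experts["left"].predict(X_test),
                experts["right"].predict(X_test))
print(pred)
```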