July 27, 2019


Paper Group ANR 543

Massive Open Online Courses Temporal Profiling for Dropout Prediction. Mining Process Model Descriptions of Daily Life through Event Abstraction. Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention. Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation. Scalable Recollections for Continual Lifelong Lea …

Massive Open Online Courses Temporal Profiling for Dropout Prediction

Title Massive Open Online Courses Temporal Profiling for Dropout Prediction
Authors Tom Rolandus Hagedoorn, Gerasimos Spanakis
Abstract Massive Open Online Courses (MOOCs) are attracting the attention of people all over the world. Regardless of the platform, the numbers of registrants for online courses are impressive, but at the same time completion rates are disappointing. Understanding the mechanisms of dropping out based on the learner profile is a crucial task in MOOCs, since it allows intervening at the right moment to assist the learner in completing the course. In this paper, the dropout behavior of learners in a MOOC is thoroughly studied, first by extracting features that describe the behavior of learners within the course and then by comparing three classifiers (Logistic Regression, Random Forest and AdaBoost) on two tasks: predicting which users will have dropped out by a certain week and predicting which users will drop out on a specific week. The former proved to be considerably easier, with all three classifiers performing equally well. However, accuracy on the second task is lower, and Logistic Regression tends to perform slightly better than the other two algorithms. We found that features that reflect an active attitude of the user towards the MOOC, such as submitting assignments, posting on the forum and filling in their profile, are strong indicators of persistence.
Tasks
Published 2017-10-09
URL http://arxiv.org/abs/1710.03323v1
PDF http://arxiv.org/pdf/1710.03323v1.pdf
PWC https://paperswithcode.com/paper/massive-open-online-courses-temporal
Repo
Framework
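
The classifier comparison the abstract describes can be sketched as below. This is a minimal illustration with invented synthetic stand-ins for the paper's MOOC activity features (assignment submissions, forum posts, profile completion); the real feature extraction and data are not reproduced here.

```python
# Hedged sketch: comparing the paper's three classifiers on synthetic
# stand-ins for MOOC engagement features. Feature semantics and data
# generation are invented for illustration only.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
# Columns: assignments submitted, forum posts, profile filled (0/1)
X = np.column_stack([rng.poisson(3, n), rng.poisson(1, n), rng.integers(0, 2, n)])
# More engagement -> less likely to drop out (plus noise)
y = (X @ np.array([0.5, 0.8, 1.5]) + rng.normal(0, 1, n) < 2.5).astype(int)

for name, clf in [("LogReg", LogisticRegression(max_iter=1000)),
                  ("RandomForest", RandomForestClassifier(random_state=0)),
                  ("AdaBoost", AdaBoostClassifier(random_state=0))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {acc:.2f}")
```

On real MOOC logs the interesting comparison is the per-week task, where the abstract reports Logistic Regression edging out the ensemble methods.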

Mining Process Model Descriptions of Daily Life through Event Abstraction

Title Mining Process Model Descriptions of Daily Life through Event Abstraction
Authors Niek Tax, Natalia Sidorova, Reinder Haakma, Wil M. P. van der Aalst
Abstract Process mining techniques focus on extracting insights into processes from event logs. Process mining has the potential to provide valuable insights into (un)healthy habits and to contribute to ambient assisted living solutions when applied to data from smart home environments. However, events recorded in smart home environments are at the level of sensor triggers, at which process discovery algorithms produce overgeneralizing process models that allow for too much behavior and that are difficult for human experts to interpret. We show that abstracting the events to a higher-level interpretation enables discovery of more precise and more comprehensible models. We present a framework, based on the XES IEEE standard for event logs, for extracting features that can be used for abstraction with supervised learning methods. After training on data for which both the sensor and human activity events are known, this framework can automatically abstract sensor-level events to their interpretation at the human activity level. We demonstrate our abstraction framework on three real-life smart home event logs and show that the process models discovered after abstraction are indeed more precise.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.10202v1
PDF http://arxiv.org/pdf/1705.10202v1.pdf
PWC https://paperswithcode.com/paper/mining-process-model-descriptions-of-daily
Repo
Framework
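
The supervised abstraction step can be sketched as a plain classifier from sensor-level events to activity labels. This is not the paper's XES-based feature framework; the sensors, hours, and activity names below are invented to show the shape of the mapping.

```python
# Hedged sketch of event abstraction as supervised learning: train on a log
# annotated at both the sensor and activity level, then relabel new
# sensor-level events. All data here is invented.
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

# Training log: sensor trigger + hour of day, annotated with the activity
train = [({"sensor": "kettle", "hour": 7}, "breakfast"),
         ({"sensor": "fridge", "hour": 7}, "breakfast"),
         ({"sensor": "tv", "hour": 21}, "relaxing"),
         ({"sensor": "bed", "hour": 23}, "sleeping")]
X_raw, y = zip(*train)

vec = DictVectorizer()
X = vec.fit_transform(X_raw)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Abstract a new sensor-level event to the human activity level
new_event = {"sensor": "kettle", "hour": 8}
print(clf.predict(vec.transform([new_event]))[0])
```

Process discovery would then run on the predicted activity-level log instead of the raw sensor triggers.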

Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention

Title Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention
Authors Jinkyu Kim, John Canny
Abstract Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Here we explore the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network’s output (steering control). Our approach is two-stage. In the first stage, we use a visual attention model to train a convolutional network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network’s output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network’s behavior. We demonstrate the effectiveness of our model on three datasets totaling 16 hours of driving. We first show that training with attention does not degrade the performance of the end-to-end network. Then we show that the network causally cues on a variety of features that are used by humans while driving.
Tasks Self-Driving Cars, Steering Control
Published 2017-03-30
URL http://arxiv.org/abs/1703.10631v1
PDF http://arxiv.org/pdf/1703.10631v1.pdf
PWC https://paperswithcode.com/paper/interpretable-learning-for-self-driving-cars
Repo
Framework
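
The causal filtering step described in the abstract can be illustrated by occlusion: mask each candidate attention region and check whether the steering output actually changes. The model below is a trivial linear stand-in for the paper's CNN, and the regions are invented; only the filtering logic is the point.

```python
# Hedged sketch of causal filtering, not the paper's network: occlude each
# attention-flagged region and keep only regions whose removal shifts the
# predicted steering angle. The "model" is a stand-in linear map.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
W[:, :4] = 0.0  # left half of the image has no real influence on steering

def steering(img):
    # Stand-in for the end-to-end CNN: linear map from image to angle
    return float((img * W).sum())

img = rng.normal(size=(8, 8))
base = steering(img)

# Candidate 4x4 regions flagged by the attention model: (row, col) corners
candidates = {"left_patch": (2, 0), "right_patch": (2, 4)}
causal = {}
for name, (r, c) in candidates.items():
    masked = img.copy()
    masked[r:r + 4, c:c + 4] = 0.0   # occlude the region
    causal[name] = abs(steering(masked) - base)

print(causal)
```

Here the left patch is attention-plausible but spurious (zero effect), while the right patch truly influences the output, so filtering would keep only the latter.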

Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation

Title Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation
Authors Huadong Chen, Shujian Huang, David Chiang, Xinyu Dai, Jiajun Chen
Abstract Pairwise ranking methods are the basis of many widely used discriminative training approaches for structure prediction problems in natural language processing (NLP). Decomposing the problem of ranking hypotheses into pairwise comparisons enables simple and efficient solutions. However, neglecting the global ordering of the hypothesis list may hinder learning. We propose a listwise learning framework for structure prediction problems such as machine translation. Our framework directly models the entire translation list’s ordering to learn parameters which may better fit the given listwise samples. Furthermore, we propose top-rank enhanced loss functions, which are more sensitive to ranking errors at higher positions. Experiments on a large-scale Chinese-English translation task show that both our listwise learning framework and top-rank enhanced listwise losses lead to significant improvements in translation quality.
Tasks Machine Translation
Published 2017-07-18
URL http://arxiv.org/abs/1707.05438v1
PDF http://arxiv.org/pdf/1707.05438v1.pdf
PWC https://paperswithcode.com/paper/top-rank-enhanced-listwise-optimization-for
Repo
Framework
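
A top-rank enhanced listwise loss can be sketched as a position-weighted variant of a Plackett-Luce (ListMLE-style) objective. This is an illustrative construction, not the paper's exact loss; the weighting scheme `alpha**i` is an assumption standing in for any decaying position weight.

```python
# Hedged sketch: a listwise loss where ranking errors near the top of the
# hypothesis list are penalized more, via decaying position weights.
import numpy as np

def top_rank_listwise_loss(scores, order, alpha=0.5):
    """Negative log-likelihood of the gold ranking `order` under a
    Plackett-Luce model, with weight alpha**i on position i."""
    s = np.asarray(scores, dtype=float)[list(order)]
    loss = 0.0
    for i in range(len(s)):
        # log P(the i-th gold item is picked first among the remaining suffix)
        log_p = s[i] - np.log(np.exp(s[i:]).sum())
        loss += -(alpha ** i) * log_p
    return loss

gold = [0, 1, 2]                                           # gold ordering
good = top_rank_listwise_loss([3.0, 2.0, 1.0], gold)       # correct ranking
top_err = top_rank_listwise_loss([2.0, 3.0, 1.0], gold)    # error at the top
bot_err = top_rank_listwise_loss([3.0, 1.0, 2.0], gold)    # error lower down
print(good, bot_err, top_err)
```

The same swap of two adjacent hypotheses costs more when it happens at the top of the list, which is the sensitivity the abstract describes.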

Scalable Recollections for Continual Lifelong Learning

Title Scalable Recollections for Continual Lifelong Learning
Authors Matthew Riemer, Tim Klinger, Djallel Bouneffouf, Michele Franceschini
Abstract Given the recent success of Deep Learning applied to a variety of single tasks, it is natural to consider more human-realistic settings. Perhaps the most difficult of these settings is that of continual lifelong learning, where the model must learn online over a continuous stream of non-stationary data. A successful continual lifelong learning system must have three key capabilities: it must learn and adapt over time, it must not forget what it has learned, and it must be efficient in both training time and memory. Recent techniques have focused their efforts primarily on the first two capabilities while questions of efficiency remain largely unexplored. In this paper, we consider the problem of efficient and effective storage of experiences over very large time-frames. In particular we consider the case where typical experiences are O(n) bits and memories are limited to O(k) bits for k ≪ n. We present a novel scalable architecture and training algorithm in this challenging domain and provide an extensive evaluation of its performance. Our results show that we can achieve considerable gains on top of state-of-the-art methods such as GEM.
Tasks
Published 2017-11-17
URL http://arxiv.org/abs/1711.06761v4
PDF http://arxiv.org/pdf/1711.06761v4.pdf
PWC https://paperswithcode.com/paper/scalable-recollections-for-continual-lifelong
Repo
Framework
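
The O(k)-bits-per-experience constraint can be illustrated with a crude compression scheme: store only the sign bits of a fixed random projection of each experience. This is a stand-in for the paper's learned architecture, chosen only to make the k ≪ n budget concrete.

```python
# Hedged sketch of the storage constraint, not the paper's method: each
# experience (n floats) is stored as a k-bit sign code from a fixed random
# projection, a crude stand-in for a learned encoder/decoder.
import numpy as np

rng = np.random.default_rng(0)
n, k = 256, 32                       # experience size vs. memory budget, k << n
P = rng.normal(size=(k, n)) / np.sqrt(k)

def compress(x):
    return np.sign(P @ x)            # k-bit recollection

def recollect(code):
    return P.T @ code                # approximate reconstruction

x = rng.normal(size=n)
code = compress(x)
x_hat = recollect(code)
# The recollection is lossy but still correlated with the original experience
corr = np.corrcoef(x, x_hat)[0, 1]
print(f"stored {k} bits for a {n}-float experience, corr={corr:.2f}")
```

A learned encoder would make far better use of the same k bits; the point here is only that replay from heavily compressed recollections is structurally possible.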

Curvature-aware Manifold Learning

Title Curvature-aware Manifold Learning
Authors Yangyang Li
Abstract Traditional manifold learning algorithms assume that the embedded manifold is globally or locally isometric to Euclidean space. Under this assumption, they divide the manifold into a set of overlapping local patches that are locally isometric to linear subsets of Euclidean space. Analyzing the global or local isometry assumptions shows that the learned manifold is a flat manifold with zero Riemannian curvature tensor. In general, manifolds may not satisfy these hypotheses. A major limitation of traditional manifold learning is that it does not consider the curvature information of the manifold. To remove these limitations, we present a curvature-aware manifold learning algorithm called CAML. The purpose of our algorithm is to break the local isometry assumption and to reduce the dimension of general manifolds that are not isometric to Euclidean space. Thus, our method adds curvature information to the process of manifold learning. Experiments comparing neighborhood-preserving ratios show that CAML is more stable than other manifold learning algorithms.
Tasks
Published 2017-06-22
URL http://arxiv.org/abs/1706.07167v1
PDF http://arxiv.org/pdf/1706.07167v1.pdf
PWC https://paperswithcode.com/paper/curvature-aware-manifold-learning
Repo
Framework

Improved Twitter Sentiment Analysis Using Naive Bayes and Custom Language Model

Title Improved Twitter Sentiment Analysis Using Naive Bayes and Custom Language Model
Authors Angela Lin
Abstract In the last couple of decades, social network services like Twitter have generated large volumes of data about users and their interests, providing meaningful business intelligence so organizations can better understand and engage their customers. All businesses want to know who is promoting their products, who is complaining about them, and how these opinions bring or diminish value to a company. Companies want to be able to identify their high-value customers and quantify the value each user brings. Many businesses use social media metrics to calculate a user contribution score, which enables them to quantify the value that influential users bring on social media, so they can offer those users more differentiated services. However, the score calculation can be refined to better illustrate a user’s contribution. Using Microsoft Azure as a case study, we conducted Twitter sentiment analysis to develop a machine learning classification model that identifies tweet contents and sentiments most illustrative of positive-value user contribution. Using data mining and AI-powered cognitive tools, we analyzed factors of social influence and, specifically, promotional language in the developer community. Our predictive model combines a traditional supervised machine learning algorithm with a custom-developed natural language model for identifying promotional tweets, and identifies product-specific promotion on Twitter with a 90% accuracy rate.
Tasks Language Modelling, Sentiment Analysis, Twitter Sentiment Analysis
Published 2017-11-10
URL http://arxiv.org/abs/1711.11081v1
PDF http://arxiv.org/pdf/1711.11081v1.pdf
PWC https://paperswithcode.com/paper/improved-twitter-sentiment-analysis-using
Repo
Framework
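
The Naive Bayes component can be sketched in a few lines of scikit-learn. The tweets and labels below are invented; the paper's Azure dataset and its custom promotional-language model are not reproduced.

```python
# Hedged sketch of the Naive Bayes sentiment classifier on invented tweets;
# the paper's custom language model for promotional tweets is out of scope.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

tweets = ["love this product, highly recommend",
          "great tool, try it out",
          "terrible experience, constant crashes",
          "support never replied, very disappointed"]
labels = ["positive", "positive", "negative", "negative"]

# Bag-of-words features feeding a multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(tweets, labels)

print(model.predict(["highly recommend this great tool"]))
```

In the paper's pipeline, a second, promotion-specific language model is layered on top of a classifier like this to pick out product-specific promotional tweets.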

Similarity Preserving Representation Learning for Time Series Clustering

Title Similarity Preserving Representation Learning for Time Series Clustering
Authors Qi Lei, Jinfeng Yi, Roman Vaculin, Lingfei Wu, Inderjit S. Dhillon
Abstract A considerable amount of clustering algorithms take instance-feature matrices as their inputs. As such, they cannot directly analyze time series data due to its temporal nature, usually unequal lengths, and complex properties. This is a great pity since many of these algorithms are effective, robust, efficient, and easy to use. In this paper, we bridge this gap by proposing an efficient representation learning framework that is able to convert a set of time series with various lengths to an instance-feature matrix. In particular, we guarantee that the pairwise similarities between time series are well preserved after the transformation, thus the learned feature representation is particularly suitable for the time series clustering task. Given a set of $n$ time series, we first construct an $n\times n$ partially-observed similarity matrix by randomly sampling $\mathcal{O}(n \log n)$ pairs of time series and computing their pairwise similarities. We then propose an efficient algorithm that solves a non-convex and NP-hard problem to learn new features based on the partially-observed similarity matrix. By conducting extensive empirical studies, we show that the proposed framework is more effective, efficient, and flexible, compared to other state-of-the-art time series clustering methods.
Tasks Representation Learning, Time Series, Time Series Analysis, Time Series Clustering
Published 2017-02-12
URL https://arxiv.org/abs/1702.03584v3
PDF https://arxiv.org/pdf/1702.03584v3.pdf
PWC https://paperswithcode.com/paper/similarity-preserving-representation-learning
Repo
Framework
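
The core recipe in the abstract, sample O(n log n) pairwise similarities and learn an instance-feature matrix that preserves them, can be sketched with plain gradient descent. The paper's actual solver for the non-convex problem is not reproduced; the "time series similarities" below are stand-in inner products of random latent features.

```python
# Hedged sketch: fit low-dimensional features whose inner products match a
# partially-observed similarity matrix built from O(n log n) sampled pairs.
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 4
truth = rng.normal(size=(n, d))            # stand-in latent features
S_full = truth @ truth.T                   # true pairwise similarities

# Observe only O(n log n) randomly sampled pairs
m = int(4 * n * np.log(n))
pairs = rng.integers(0, n, size=(m, 2))
i, j = pairs[:, 0], pairs[:, 1]
obs = S_full[i, j]

def loss(F):
    return np.mean(((F[i] * F[j]).sum(axis=1) - obs) ** 2)

F = rng.normal(scale=0.1, size=(n, d))     # learned instance-feature matrix
start = loss(F)
for _ in range(500):
    err = (F[i] * F[j]).sum(axis=1) - obs
    g = np.zeros_like(F)
    np.add.at(g, i, err[:, None] * F[j])   # accumulate per-row gradients
    np.add.at(g, j, err[:, None] * F[i])
    F -= (1.0 / m) * g                     # averaged gradient step
print(f"fit loss: {start:.2f} -> {loss(F):.2f}")
```

The resulting matrix F is a fixed-length instance-feature representation, so any standard clustering algorithm (k-means, spectral, etc.) can then run on time series of unequal lengths.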

RenderMap: Exploiting the Link Between Perception and Rendering for Dense Mapping

Title RenderMap: Exploiting the Link Between Perception and Rendering for Dense Mapping
Authors Julian Ryde, Xuchu Ding
Abstract We introduce an approach for the real-time (2Hz) creation of a dense map and alignment of a moving robotic agent within that map by rendering using a Graphics Processing Unit (GPU). This is done by recasting the scan alignment part of the dense mapping process as a rendering task. Alignment errors are computed from rendering the scene, comparing with range data from the sensors, and minimized by an optimizer. The proposed approach takes advantage of the advances in rendering techniques for computer graphics and GPU hardware to accelerate the algorithm. Moreover, it allows one to exploit information not used in classic dense mapping algorithms such as Iterative Closest Point (ICP) by rendering interfaces between the free space, occupied space and the unknown. The proposed approach leverages directly the rendering capabilities of the GPU, in contrast to other GPU-based approaches that deploy the GPU as a general purpose parallel computation platform. We argue that the proposed concept is a general consequence of treating perception problems as inverse problems of rendering. Many perception problems can be recast into a form where much of the computation is replaced by render operations. This is not only efficient since rendering is fast, but also simpler to implement and will naturally benefit from future advancements in GPU speed and rendering techniques. Furthermore, this general concept can go beyond addressing perception problems and can be used for other problem domains such as path planning.
Tasks
Published 2017-02-21
URL http://arxiv.org/abs/1702.06813v1
PDF http://arxiv.org/pdf/1702.06813v1.pdf
PWC https://paperswithcode.com/paper/rendermap-exploiting-the-link-between
Repo
Framework

Generative Encoder-Decoder Models for Task-Oriented Spoken Dialog Systems with Chatting Capability

Title Generative Encoder-Decoder Models for Task-Oriented Spoken Dialog Systems with Chatting Capability
Authors Tiancheng Zhao, Allen Lu, Kyusong Lee, Maxine Eskenazi
Abstract Generative encoder-decoder models offer great promise in developing domain-general dialog systems. However, they have mainly been applied to open-domain conversations. This paper presents a practical and novel framework for building task-oriented dialog systems based on encoder-decoder models. This framework enables encoder-decoder models to accomplish slot-value independent decision-making and interact with external databases. Moreover, this paper shows the flexibility of the proposed method by interleaving chatting capability with a slot-filling system for better out-of-domain recovery. The models were trained on both real-user data from a bus information system and human-human chat data. Results show that the proposed framework achieves good performance in both offline evaluation metrics and in task success rate with human users.
Tasks Decision Making, Slot Filling
Published 2017-06-26
URL http://arxiv.org/abs/1706.08476v1
PDF http://arxiv.org/pdf/1706.08476v1.pdf
PWC https://paperswithcode.com/paper/generative-encoder-decoder-models-for-task
Repo
Framework
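
Slot-value independent decision-making can be illustrated with delexicalization, a standard trick in this line of work (the paper's entity indexing differs in detail): slot values are swapped for typed placeholders before the encoder-decoder sees them, and the decoder's templates are filled from the database afterwards. The bus-schedule values below are invented.

```python
# Hedged illustration of slot-value independent dialog via delexicalization;
# the knowledge base and slot names are invented for this example.
import re

KB = {"61C": {"departure": "10:15", "stop": "Forbes Ave"}}

def delexicalize(utterance, slots):
    # Replace concrete values with typed placeholders before encoding
    for slot, value in slots.items():
        utterance = re.sub(re.escape(value), f"<{slot}>", utterance)
    return utterance

def lexicalize(template, slots):
    # Fill the decoder's template from the external database
    for slot, value in slots.items():
        template = template.replace(f"<{slot}>", value)
    return template

slots = {"bus": "61C", **KB["61C"]}
user = delexicalize("when does the 61C leave from Forbes Ave?", slots)
print(user)

# The decoder only ever decides over placeholders, never concrete values
reply = lexicalize("the <bus> departs <stop> at <departure>", slots)
print(reply)
```

Because the model reasons over `<bus>` and `<stop>` rather than "61C" and "Forbes Ave", the same trained policy generalizes to database entries it never saw during training.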

Glitch Classification and Clustering for LIGO with Deep Transfer Learning

Title Glitch Classification and Clustering for LIGO with Deep Transfer Learning
Authors Daniel George, Hongyu Shen, E. A. Huerta
Abstract The detection of gravitational waves with LIGO and Virgo requires a detailed understanding of the response of these instruments in the presence of environmental and instrumental noise. Of particular interest is the study of anomalous non-Gaussian noise transients known as glitches, since their high occurrence rate in LIGO/Virgo data can obscure or even mimic true gravitational wave signals. Therefore, successfully identifying and excising glitches is of utmost importance to detect and characterize gravitational waves. In this article, we present the first application of Deep Learning combined with Transfer Learning for glitch classification, using real data from LIGO’s first discovery campaign labeled by Gravity Spy, showing that knowledge from pre-trained models for real-world object recognition can be transferred for classifying spectrograms of glitches. We demonstrate that this method enables the optimal use of very deep convolutional neural networks for glitch classification given small unbalanced training datasets, significantly reduces the training time, and achieves state-of-the-art accuracy above 98.8%. Once trained via transfer learning, we show that the networks can be truncated and used as feature extractors for unsupervised clustering to automatically group together new classes of glitches and anomalies. This novel capability is of critical importance to identify and remove new types of glitches which will occur as the LIGO/Virgo detectors gradually attain design sensitivity.
Tasks Object Recognition, Transfer Learning
Published 2017-11-20
URL http://arxiv.org/abs/1711.07468v2
PDF http://arxiv.org/pdf/1711.07468v2.pdf
PWC https://paperswithcode.com/paper/glitch-classification-and-clustering-for-ligo
Repo
Framework
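
The truncate-and-cluster idea from the abstract can be sketched without a real pre-trained network: a fixed feature extractor (here a random projection plus ReLU, a labeled stand-in for a truncated pre-trained CNN) feeds an unsupervised clusterer that groups spectrogram-like inputs. The two "glitch classes" below are invented.

```python
# Hedged sketch of the truncation idea only: fixed features from a stand-in
# "pre-trained" extractor, clustered without labels. Not the paper's CNN.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two invented glitch classes: spectrograms with power in different bands
a = rng.normal(size=(30, 64)); a[:, :16] += 4.0
b = rng.normal(size=(30, 64)); b[:, 48:] += 4.0
X = np.vstack([a, b])

W = rng.normal(size=(64, 8))         # stand-in for the truncated network
features = np.maximum(X @ W, 0.0)    # ReLU features from the "extractor"

# Unsupervised clustering in feature space groups the two glitch types
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels[:30].mean(), labels[30:].mean())
```

In the paper the extractor is a genuinely pre-trained, transfer-learned deep network, which is what lets clustering surface previously unseen glitch classes.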

Discriminative conditional restricted Boltzmann machine for discrete choice and latent variable modelling

Title Discriminative conditional restricted Boltzmann machine for discrete choice and latent variable modelling
Authors Melvin Wong, Bilal Farooq, Guillaume-Alexandre Bilodeau
Abstract Conventional methods of estimating latent behaviour generally use attitudinal questions, which are subjective, and these survey questions may not always be available. We hypothesize that an alternative approach can be used for latent variable estimation through undirected graphical models such as non-parametric artificial neural networks. In this study, we explore the use of generative non-parametric modelling methods to estimate latent variables from the prior choice distribution without the conventional use of measurement indicators. A restricted Boltzmann machine is used to represent latent behaviour factors by analyzing the relationship between the observed choices and explanatory variables. The algorithm is adapted for latent behaviour analysis in a discrete choice scenario, and we use a graphical approach to evaluate and understand the semantic meaning of estimated parameter vector values. We illustrate our methodology on a financial instrument choice dataset and perform statistical analysis of parameter sensitivity and stability. Our findings show that, through non-parametric statistical tests, we can extract useful latent information on the behaviour of latent constructs through machine learning methods, and that these constructs exert a strong and significant influence on the choice process. Furthermore, our modelling framework shows robustness to input variability through sampling and validation.
Tasks
Published 2017-06-01
URL http://arxiv.org/abs/1706.00505v1
PDF http://arxiv.org/pdf/1706.00505v1.pdf
PWC https://paperswithcode.com/paper/discriminative-conditional-restricted
Repo
Framework

Underestimated cost of targeted attacks on complex networks

Title Underestimated cost of targeted attacks on complex networks
Authors Xiao-Long Ren, Niels Gleinig, Dijana Tolic, Nino Antulov-Fantulin
Abstract The robustness of complex networks under targeted attacks is deeply connected to the resilience of complex systems, i.e., the ability to make appropriate responses to the attacks. In this article, we investigate state-of-the-art targeted node attack algorithms and demonstrate that they become very inefficient when the cost of the attack is taken into consideration. We make the explicit assumption that the cost of removing a node is proportional to the number of adjacent links that are removed, i.e., higher-degree nodes have higher cost. Finally, for the case when it is possible to attack links, we propose a simple and efficient edge removal strategy named Hierarchical Power Iterative Normalized cut (HPI-Ncut). The results on real and artificial networks show that the HPI-Ncut algorithm outperforms all the node removal and link removal attack algorithms when the cost of the attack is taken into consideration. In addition, we show that on sparse networks, the complexity of this hierarchical power iteration edge removal algorithm is only $O(n\log^{2+\epsilon}(n))$.
Tasks
Published 2017-10-10
URL http://arxiv.org/abs/1710.03522v1
PDF http://arxiv.org/pdf/1710.03522v1.pdf
PWC https://paperswithcode.com/paper/underestimated-cost-of-targeted-attacks-on
Repo
Framework
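
The cost model in the abstract, removing a node costs one unit per adjacent link removed, can be made concrete with a toy degree-based attack. The HPI-Ncut edge strategy itself is not reproduced; the graph below is invented, and the point is only how a link-removal budget constrains a classic highest-degree attack.

```python
# Hedged sketch of a cost-aware targeted node attack on a toy graph:
# removing a node costs its current degree (the number of links removed).
from collections import defaultdict

edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (3, 4), (5, 6)]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v); adj[v].add(u)

def attack_highest_degree(adj, budget):
    """Greedily remove highest-degree nodes until the budget runs out."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    spent, removed = 0, []
    while True:
        u = max(adj, key=lambda x: len(adj[x]), default=None)
        if u is None or spent + len(adj[u]) > budget:
            return removed, spent
        spent += len(adj[u])          # cost = adjacent links removed
        for v in adj[u]:
            adj[v].discard(u)
        del adj[u]
        removed.append(u)

removed, spent = attack_highest_degree(adj, budget=5)
print(removed, spent)
```

Hub-first strategies pay heavily under this cost model, precisely because hubs have the most adjacent links, which is what motivates comparing against cheaper edge-removal strategies like HPI-Ncut.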

Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes

Title Efficient Modeling of Latent Information in Supervised Learning using Gaussian Processes
Authors Zhenwen Dai, Mauricio A. Álvarez, Neil D. Lawrence
Abstract Often in machine learning, data are collected as a combination of multiple conditions, e.g., the voice recordings of multiple persons, each labeled with an ID. How could we build a model that captures the latent information related to these conditions and generalizes to a new one with only a few data points? We present a new model called Latent Variable Multiple Output Gaussian Processes (LVMOGP) that jointly models multiple conditions for regression and generalizes to a new condition with a few data points at test time. LVMOGP infers the posteriors of Gaussian processes together with a latent space representing the information about different conditions. We derive an efficient variational inference method for LVMOGP whose computational complexity is as low as that of sparse Gaussian processes. We show that LVMOGP significantly outperforms related Gaussian process methods on various tasks with both synthetic and real data.
Tasks Gaussian Processes
Published 2017-05-27
URL http://arxiv.org/abs/1705.09862v1
PDF http://arxiv.org/pdf/1705.09862v1.pdf
PWC https://paperswithcode.com/paper/efficient-modeling-of-latent-information-in
Repo
Framework

Kernel Implicit Variational Inference

Title Kernel Implicit Variational Inference
Authors Jiaxin Shi, Shengyang Sun, Jun Zhu
Abstract Recent progress in variational inference has paid much attention to the flexibility of variational posteriors. One promising direction is to use implicit distributions, i.e., distributions without tractable densities, as the variational posterior. However, existing methods on implicit posteriors still face challenges of noisy estimation and computational infeasibility when applied to models with high-dimensional latent variables. In this paper, we present a new approach named Kernel Implicit Variational Inference that addresses these challenges. To the best of our knowledge, this is the first time implicit variational inference has been successfully applied to Bayesian neural networks, showing promising results on both regression and classification tasks.
Tasks
Published 2017-05-29
URL http://arxiv.org/abs/1705.10119v3
PDF http://arxiv.org/pdf/1705.10119v3.pdf
PWC https://paperswithcode.com/paper/kernel-implicit-variational-inference
Repo
Framework