July 28, 2019


Paper Group ANR 358

Qualitative Assessment of Recurrent Human Motion. Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning. A rule based algorithm for detecting negative words in Persian. Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems. Applying the Wizard-of-Oz Technique to Multimodal Human-Robot Dialogue. Robust …

Qualitative Assessment of Recurrent Human Motion


Title	Qualitative Assessment of Recurrent Human Motion
Authors	Andre Ebert, Michael Till Beck, Andy Mattausch, Lenz Belzner, Claudia Linnhoff Popien
Abstract	Smartphone applications designed to track human motion in combination with wearable sensors, e.g., during physical exercise, have attracted considerable attention recently. Commonly, they provide quantitative services, such as personalized training instructions or the counting of distances. But qualitative monitoring and assessment is still missing, e.g., to detect malpositions, to prevent injuries, or to optimize training success. We address this issue by presenting a concept for qualitative as well as generic assessment of recurrent human motion by processing multi-dimensional, continuous time series tracked with motion sensors. To this end, our segmentation procedure extracts individual events of specific length, and we propose expressive features to accomplish a qualitative motion assessment by supervised classification. We verified our approach within a comprehensive study encompassing 27 athletes undertaking different body weight exercises. We are able to recognize six different exercise types with a success rate of 100% and to assess them qualitatively with an average success rate of 99.3%.
Tasks	Time Series
Published	2017-03-07
URL	http://arxiv.org/abs/1703.02363v2
PDF	http://arxiv.org/pdf/1703.02363v2.pdf
PWC	https://paperswithcode.com/paper/qualitative-assessment-of-recurrent-human
Repo
Framework

Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning


Title	Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning
Authors	Paweł Budzianowski, Stefan Ultes, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Iñigo Casanueva, Lina Rojas-Barahona, Milica Gašić
Abstract	Human conversation is inherently complex, often spanning many different topics/domains. This makes policy learning for dialogue systems very challenging. Standard flat reinforcement learning methods do not provide an efficient framework for modelling such dialogues. In this paper, we focus on the under-explored problem of multi-domain dialogue management. First, we propose a new method for hierarchical reinforcement learning using the option framework. Next, we show that the proposed architecture learns faster and arrives at a better policy than the existing flat ones do. Moreover, we show how pretrained policies can be adapted to more complex systems with an additional set of new actions. In doing that, we show that our approach has the potential to facilitate policy optimisation for more sophisticated multi-domain dialogue systems.
Tasks	Dialogue Management, Hierarchical Reinforcement Learning
Published	2017-06-19
URL	http://arxiv.org/abs/1706.06210v2
PDF	http://arxiv.org/pdf/1706.06210v2.pdf
PWC	https://paperswithcode.com/paper/sub-domain-modelling-for-dialogue-management
Repo
Framework
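
To make the "option framework" mentioned in the abstract above more concrete, here is a minimal sketch of the general idea: an option bundles an initiation condition, an intra-option policy, and a termination condition, and a higher-level policy would choose among such options (one per sub-domain). Everything in the sketch is hypothetical and for illustration only: the `Option` class, the toy hotel sub-domain, the slot names, and the `step` function are assumptions, not the authors' implementation, and nothing is learned here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, bool]  # hypothetical dialogue state: slot name -> filled?


@dataclass
class Option:
    name: str
    can_start: Callable[[State], bool]  # initiation condition
    policy: Callable[[State], str]      # intra-option policy: state -> primitive action
    is_done: Callable[[State], bool]    # termination condition


def run_option(option: Option, state: State,
               step: Callable[[State, str], State],
               max_steps: int = 10) -> Tuple[State, List[str]]:
    """Follow one option until it terminates; return final state and actions taken."""
    actions: List[str] = []
    if not option.can_start(state):
        return state, actions
    for _ in range(max_steps):
        if option.is_done(state):
            break
        action = option.policy(state)
        actions.append(action)
        state = step(state, action)
    return state, actions


# Toy sub-domain (assumed): a hotel booking that needs two slots filled.
def hotel_policy(state: State) -> str:
    return "request_area" if not state["area"] else "request_price"


def step(state: State, action: str) -> State:
    # Toy environment: the simulated user always answers the requested slot.
    slot = action.split("_", 1)[1]
    return {**state, slot: True}


hotel = Option(
    name="hotel",
    can_start=lambda s: True,
    policy=hotel_policy,
    is_done=lambda s: s["area"] and s["price"],
)

final_state, trace = run_option(hotel, {"area": False, "price": False}, step)
```

In a hierarchical setup, a master policy would pick which option to run next and each option's policy would be trained, rather than hand-written as in this toy.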

A rule based algorithm for detecting negative words in Persian


Title	A rule based algorithm for detecting negative words in Persian
Authors	Reza Takhshid, Adel Rahimi
Abstract	In this paper, we present a novel method for detecting negative words in Persian. We first used an algorithm to generate an exceptions list, which was later modified by hand. We then used the mentioned lists and a Persian polarity corpus in our rule-based algorithm to detect negative words.
Tasks
Published	2017-08-19
URL	http://arxiv.org/abs/1708.06708v1
PDF	http://arxiv.org/pdf/1708.06708v1.pdf
PWC	https://paperswithcode.com/paper/a-rule-based-algorithm-for-detecting-negative
Repo
Framework

Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems


Title	Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems
Authors	Layla El Asri, Hannes Schulz, Shikhar Sharma, Jeremie Zumer, Justin Harris, Emery Fine, Rahul Mehrotra, Kaheer Suleman
Abstract	This paper presents the Frames dataset (Frames is available at http://datasets.maluuba.com/Frames), a corpus of 1369 human-human dialogues with an average of 15 turns per dialogue. We developed this dataset to study the role of memory in goal-oriented dialogue systems. Based on Frames, we introduce a task called frame tracking, which extends state tracking to a setting where several states are tracked simultaneously. We propose a baseline model for this task. We show that Frames can also be used to study memory in dialogue management and information presentation through natural language generation.
Tasks	Dialogue Management, Goal-Oriented Dialogue Systems, Text Generation
Published	2017-03-31
URL	http://arxiv.org/abs/1704.00057v2
PDF	http://arxiv.org/pdf/1704.00057v2.pdf
PWC	https://paperswithcode.com/paper/frames-a-corpus-for-adding-memory-to-goal
Repo
Framework

Applying the Wizard-of-Oz Technique to Multimodal Human-Robot Dialogue


Title	Applying the Wizard-of-Oz Technique to Multimodal Human-Robot Dialogue
Authors	Matthew Marge, Claire Bonial, Brendan Byrne, Taylor Cassidy, A. William Evans, Susan G. Hill, Clare Voss
Abstract	Our overall program objective is to provide more natural ways for soldiers to interact and communicate with robots, much like how soldiers communicate with other soldiers today. We describe how the Wizard-of-Oz (WOz) method can be applied to multimodal human-robot dialogue in a collaborative exploration task. While the WOz method can help design robot behaviors, traditional approaches place the burden of decisions on a single wizard. In this work, we consider two wizards to stand in for robot navigation and dialogue management software components. The scenario used to elicit data is one in which a human-robot team is tasked with exploring an unknown environment: a human gives verbal instructions from a remote location and the robot follows them, clarifying possible misunderstandings as needed via dialogue. We found the division of labor between wizards to be workable, which holds promise for future software development.
Tasks	Dialogue Management, Robot Navigation
Published	2017-03-10
URL	http://arxiv.org/abs/1703.03714v1
PDF	http://arxiv.org/pdf/1703.03714v1.pdf
PWC	https://paperswithcode.com/paper/applying-the-wizard-of-oz-technique-to
Repo
Framework

Robust Hypothesis Test for Nonlinear Effect with Gaussian Processes


Title	Robust Hypothesis Test for Nonlinear Effect with Gaussian Processes
Authors	Jeremiah Zhe Liu, Brent Coull
Abstract	This work constructs a hypothesis test for detecting whether a data-generating function $h: \mathbb{R}^p \rightarrow \mathbb{R}$ belongs to a specific reproducing kernel Hilbert space $\mathcal{H}_0$, where the structure of $\mathcal{H}_0$ is only partially known. Utilizing the theory of reproducing kernels, we reduce this hypothesis to a simple one-sided score test for a scalar parameter, develop a testing procedure that is robust against the mis-specification of kernel functions, and also propose an ensemble-based estimator for the null model to guarantee test performance in small samples. To demonstrate the utility of the proposed method, we apply our test to the problem of detecting nonlinear interaction between groups of continuous features. We evaluate the finite-sample performance of our test under different data-generating functions and estimation strategies for the null model. Our results reveal interesting connections between notions in machine learning (model underfit/overfit) and those in statistical inference (i.e. Type I error/power of hypothesis test), and also highlight unexpected consequences of common model estimating strategies (e.g. estimating kernel hyperparameters using maximum likelihood estimation) on model inference.
Tasks	Gaussian Processes
Published	2017-10-03
URL	http://arxiv.org/abs/1710.01406v2
PDF	http://arxiv.org/pdf/1710.01406v2.pdf
PWC	https://paperswithcode.com/paper/robust-hypothesis-test-for-nonlinear-effect
Repo
Framework

Lyrics-to-Audio Alignment by Unsupervised Discovery of Repetitive Patterns in Vowel Acoustics


Title	Lyrics-to-Audio Alignment by Unsupervised Discovery of Repetitive Patterns in Vowel Acoustics
Authors	Sungkyun Chang, Kyogu Lee
Abstract	Most of the previous approaches to lyrics-to-audio alignment used a pre-developed automatic speech recognition (ASR) system that innately suffered from several difficulties in adapting the speech model to individual singers. A significant aspect missing in previous works is the self-learnability of repetitive vowel patterns in the singing voice, where the vowel part used is more consistent than the consonant part. Based on this, our system first learns a discriminative subspace of vowel sequences, based on weighted symmetric non-negative matrix factorization (WS-NMF), by taking the self-similarity of a standard acoustic feature as an input. Then, we make use of canonical time warping (CTW), derived from a recent computer vision technique, to find an optimal spatiotemporal transformation between the text and the acoustic sequences. Experiments with Korean and English data sets showed that deploying this method after a pre-developed, unsupervised, singing source separation achieved more promising results than other state-of-the-art unsupervised approaches and an existing ASR-based system.
Tasks	Speech Recognition
Published	2017-01-21
URL	http://arxiv.org/abs/1701.06078v2
PDF	http://arxiv.org/pdf/1701.06078v2.pdf
PWC	https://paperswithcode.com/paper/lyrics-to-audio-alignment-by-unsupervised
Repo
Framework
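
The abstract above feeds "the self-similarity of a standard acoustic feature" into WS-NMF. As a rough illustration of what such an input looks like, the sketch below builds a self-similarity matrix from a few feature frames. The use of cosine similarity and the toy two-dimensional frames are assumptions made for this example; the abstract does not state which similarity measure or feature the authors use, and the WS-NMF and CTW stages are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def self_similarity(frames):
    """Square matrix whose entry (i, j) compares frame i with frame j."""
    return [[cosine(u, v) for v in frames] for u in frames]


# Toy frames (assumed): frames 0 and 2 point in the same direction,
# frame 1 is orthogonal to both.
frames = [(1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]
ssm = self_similarity(frames)
```

Repeated patterns, such as a recurring vowel, show up in a matrix like this as high-similarity entries away from the main diagonal.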

Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System


Title	Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System
Authors	Claudia Borg, Albert Gatt
Abstract	Maltese is a morphologically rich language with a hybrid morphological system which features both concatenative and non-concatenative processes. This paper analyses the impact of this hybridity on the performance of machine learning techniques for morphological labelling and clustering. In particular, we analyse a dataset of morphologically related word clusters to evaluate the difference in results for concatenative and nonconcatenative clusters. We also describe research carried out in morphological labelling, with a particular focus on the verb category. Two evaluations were carried out, one using an unseen dataset, and another one using a gold standard dataset which was manually labelled. The gold standard dataset was split into concatenative and non-concatenative to analyse the difference in results between the two morphological systems.
Tasks	Morphological Analysis
Published	2017-03-25
URL	http://arxiv.org/abs/1703.08701v1
PDF	http://arxiv.org/pdf/1703.08701v1.pdf
PWC	https://paperswithcode.com/paper/morphological-analysis-for-the-maltese
Repo
Framework

Deep Convolutional Decision Jungle for Image Classification


Title	Deep Convolutional Decision Jungle for Image Classification
Authors	Seungryul Baek, Kwang In Kim, Tae-Kyun Kim
Abstract	We propose a novel method called deep convolutional decision jungle (CDJ) and its learning algorithm for image classification. The CDJ maintains the structure of standard convolutional neural networks (CNNs), i.e. multiple layers of multiple response maps fully connected. Each response map (or node) in both the convolutional and fully-connected layers selectively responds to class labels such that each data sample travels via a specific soft route of those activated nodes. The proposed method CDJ automatically learns features, whereas decision forests and jungles require pre-defined feature sets. Compared to CNNs, the method embeds the benefits of using data-dependent discriminative functions, which better handle multi-modal/heterogeneous data; further, the method offers more diverse sparse network responses, which in turn can be used for cost-effective learning/classification. The network is learnt by combining conventional softmax and proposed entropy losses in each layer. The entropy loss, as used in decision tree growing, measures the purity of data activation according to the class label distribution. The back-propagation rule for the proposed loss function is derived from stochastic gradient descent (SGD) optimization of CNNs. We show that our proposed method outperforms state-of-the-art methods on three public image classification benchmarks and one face verification dataset. We also demonstrate the use of auxiliary data labels, when available, which helps our method to learn more discriminative routing and representations and leads to improved classification.
Tasks	Face Verification, Image Classification
Published	2017-06-06
URL	http://arxiv.org/abs/1706.02003v2
PDF	http://arxiv.org/pdf/1706.02003v2.pdf
PWC	https://paperswithcode.com/paper/deep-convolutional-decision-jungle-for-image
Repo
Framework
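
The abstract above describes an entropy loss that, "as used in decision tree growing, measures the purity of data activation according to the class label distribution". The sketch below shows only the classical decision-tree purity measure it refers to, namely the Shannon entropy of the labels reaching a node. It is an illustrative assumption, not the paper's loss: the CDJ loss operates on soft activations inside a network, and the `label_entropy` helper and toy labels are hypothetical.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy, in bits, of the class labels reaching one node."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # p * log2(1 / p) summed over the classes that actually occur.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


pure = label_entropy(["cat", "cat", "cat", "cat"])   # one class only
mixed = label_entropy(["cat", "dog", "cat", "dog"])  # evenly split
```

A node that sees a single class has zero entropy (fully pure), while an even split between two classes gives the two-class maximum of one bit, so minimizing this quantity pushes each node to respond to fewer classes.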

Detecting Human Interventions on the Landscape: KAZE Features, Poisson Point Processes, and a Construction Dataset


Title	Detecting Human Interventions on the Landscape: KAZE Features, Poisson Point Processes, and a Construction Dataset
Authors	Edward Boyda, Colin McCormick, Dan Hammer
Abstract	We present an algorithm capable of identifying a wide variety of human-induced change on the surface of the planet by analyzing matches between local features in time-sequenced remote sensing imagery. We evaluate feature sets, match protocols, and the statistical modeling of feature matches. With application of KAZE features, k-nearest-neighbor descriptor matching, and geometric proximity and bi-directional match consistency checks, average match rates increase more than two-fold over the previous standard. In testing our platform, we developed a small, labeled benchmark dataset expressing large-scale residential, industrial, and civic construction, along with null instances, in California between the years 2010 and 2012. On the benchmark set, our algorithm makes precise, accurate change proposals on two-thirds of scenes. Further, the detection threshold can be tuned so that all or almost all proposed detections are true positives.
Tasks	Point Processes
Published	2017-03-29
URL	http://arxiv.org/abs/1703.10196v1
PDF	http://arxiv.org/pdf/1703.10196v1.pdf
PWC	https://paperswithcode.com/paper/detecting-human-interventions-on-the
Repo
Framework
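
One of the match protocols named in the abstract above is a bi-directional match consistency check. A minimal sketch of that idea follows: a pair of descriptors is kept only when each is the other's nearest neighbor. This is a simplification under stated assumptions: it uses k = 1 with Euclidean distance on toy two-dimensional descriptors, omits the geometric proximity check, and does not compute KAZE features; the function names are hypothetical.

```python
import math


def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda j: math.dist(query, candidates[j]))


def mutual_matches(desc_a, desc_b):
    """Keep (i, j) only if j is i's nearest in B and i is j's nearest in A."""
    if not desc_a or not desc_b:
        return []
    matches = []
    for i, descriptor in enumerate(desc_a):
        j = nearest(descriptor, desc_b)
        if nearest(desc_b[j], desc_a) == i:
            matches.append((i, j))
    return matches


# Toy descriptors (assumed) from an earlier and a later image of the same scene.
before = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
after = [(0.1, 0.0), (1.0, 1.1)]
matches = mutual_matches(before, after)
```

The third descriptor in `before` has no counterpart in `after`, so its one-directional match fails the reverse check and is dropped; unmatched features of this kind are the sort of signal a change detector can then model statistically.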

An Improved Fatigue Detection System Based on Behavioral Characteristics of Driver


Title	An Improved Fatigue Detection System Based on Behavioral Characteristics of Driver
Authors	Rajat Gupta, Kanishk Aman, Nalin Shiva, Yadvendra Singh
Abstract	In recent years, road accidents have increased significantly. One of the major reasons for these accidents, as reported, is driver fatigue. Due to continuous and prolonged driving, the driver becomes exhausted and drowsy, which may lead to an accident. Therefore, there is a need for a system that measures the fatigue level of the driver and alerts him or her when drowsy to avoid accidents. Thus, we propose a system which comprises a camera installed on the car dashboard. The camera detects the driver’s face, observes the alteration in its facial features, and uses these features to estimate the fatigue level. Facial features include eyes and mouth. Principal Component Analysis is then applied to reduce the features while minimizing the amount of information lost. The parameters thus obtained are processed through a Support Vector Classifier for classifying the fatigue level. After that, the classifier output is sent to the alert unit.
Tasks
Published	2017-09-17
URL	http://arxiv.org/abs/1709.05669v1
PDF	http://arxiv.org/pdf/1709.05669v1.pdf
PWC	https://paperswithcode.com/paper/an-improved-fatigue-detection-system-based-on
Repo
Framework

BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning


Title	BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
Authors	Ziming Zhang, Yuanwei Wu, Guanghui Wang
Abstract	Understanding the global optimality in deep learning (DL) has been attracting more and more attention recently. Conventional DL solvers, however, have not been developed intentionally to seek such global optimality. In this paper we propose a novel approximation algorithm, BPGrad, towards optimizing deep models globally via branch and pruning. Our BPGrad algorithm is based on the assumption of Lipschitz continuity in DL, and as a result it can adaptively determine the step size for the current gradient given the history of previous updates, wherein theoretically no smaller steps can achieve the global optimality. We prove that, by repeating such a branch-and-pruning procedure, we can locate the global optimality within finite iterations. Empirically an efficient solver based on BPGrad for DL is proposed as well, and it outperforms conventional DL solvers such as Adagrad, Adadelta, RMSProp, and Adam in the tasks of object recognition, detection, and segmentation.
Tasks	Object Recognition
Published	2017-11-19
URL	http://arxiv.org/abs/1711.06959v1
PDF	http://arxiv.org/pdf/1711.06959v1.pdf
PWC	https://paperswithcode.com/paper/bpgrad-towards-global-optimality-in-deep
Repo
Framework

Learning MSO-definable hypotheses on string


Title	Learning MSO-definable hypotheses on string
Authors	Martin Grohe, Christof Löding, Martin Ritzert
Abstract	We study the classification problems over string data for hypotheses specified by formulas of monadic second-order logic MSO. The goal is to design learning algorithms that run in time polynomial in the size of the training set, independently of or at least sublinear in the size of the whole data set. We prove negative as well as positive results. If the data set is an unprocessed string to which our algorithms have local access, then learning in sublinear time is impossible even for hypotheses definable in a small fragment of first-order logic. If we allow for a linear time pre-processing of the string data to build an index data structure, then learning of MSO-definable hypotheses is possible in time polynomial in the size of the training set, independently of the size of the whole data set.
Tasks
Published	2017-08-27
URL	http://arxiv.org/abs/1708.08081v1
PDF	http://arxiv.org/pdf/1708.08081v1.pdf
PWC	https://paperswithcode.com/paper/learning-mso-definable-hypotheses-on-string
Repo
Framework

Predicting Audience’s Laughter Using Convolutional Neural Network


Title	Predicting Audience’s Laughter Using Convolutional Neural Network
Authors	Lei Chen, Chong Min Lee
Abstract	For the purpose of automatically evaluating speakers’ humor usage, we build a presentation corpus containing humorous utterances based on TED talks. Compared to previous data resources supporting humor recognition research, ours has several advantages, including (a) both positive and negative instances coming from a homogeneous data set, (b) containing a large number of speakers, and (c) being open. Focusing on using lexical cues for humor recognition, we systematically compare a newly emerging text classification method based on Convolutional Neural Networks (CNNs) with a well-established conventional method using linguistic knowledge. The advantages of the CNN method are both getting higher detection accuracies and being able to learn essential features automatically.
Tasks	Text Classification
Published	2017-02-08
URL	http://arxiv.org/abs/1702.02584v2
PDF	http://arxiv.org/pdf/1702.02584v2.pdf
PWC	https://paperswithcode.com/paper/predicting-audiences-laughter-using
Repo
Framework

Sequence-to-Sequence ASR Optimization via Reinforcement Learning


Title	Sequence-to-Sequence ASR Optimization via Reinforcement Learning
Authors	Andros Tjandra, Sakriani Sakti, Satoshi Nakamura
Abstract	Despite the success of sequence-to-sequence approaches in automatic speech recognition (ASR) systems, the models still suffer from several problems, mainly due to the mismatch between the training and inference conditions. In the sequence-to-sequence architecture, the model is trained to predict the grapheme of the current time-step given the input of speech signal and the ground-truth grapheme history of the previous time-steps. However, it remains unclear how well the model approximates real-world speech during inference. Thus, generating the whole transcription from scratch based on previous predictions is complicated and errors can propagate over time. Furthermore, the model is optimized to maximize the likelihood of training data instead of error rate evaluation metrics that actually quantify recognition quality. This paper presents an alternative strategy for training sequence-to-sequence ASR models by adopting the idea of reinforcement learning (RL). Unlike the standard training scheme with maximum likelihood estimation, our proposed approach utilizes the policy gradient algorithm. We can (1) sample the whole transcription based on the model’s prediction in the training process and (2) directly optimize the model with negative Levenshtein distance as the reward. Experimental results demonstrate that we significantly improved the performance compared to a model trained only with maximum likelihood estimation.
Tasks	Speech Recognition
Published	2017-10-30
URL	http://arxiv.org/abs/1710.10774v2
PDF	http://arxiv.org/pdf/1710.10774v2.pdf
PWC	https://paperswithcode.com/paper/sequence-to-sequence-asr-optimization-via
Repo
Framework
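
The abstract above uses the negative Levenshtein distance between a sampled transcription and the reference as the reinforcement learning reward. The sketch below shows how such a reward could be computed at the character (grapheme) level. The function names are hypothetical, and the policy gradient update that would consume this reward is not shown.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def reward(reference, sampled):
    """Negative edit distance: 0 for a perfect transcription, lower for worse ones."""
    return -levenshtein(reference, sampled)


perfect = reward("hello", "hello")
one_error = reward("hello", "helo")
```

Because the reward is computed on a whole sampled transcription, it directly reflects the error-rate style metric used at evaluation time, which is the mismatch with likelihood training that the abstract points out.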