Paper Group ANR 358
Qualitative Assessment of Recurrent Human Motion
Title | Qualitative Assessment of Recurrent Human Motion |
Authors | Andre Ebert, Michael Till Beck, Andy Mattausch, Lenz Belzner, Claudia Linnhoff Popien |
Abstract | Smartphone applications that track human motion in combination with wearable sensors, e.g., during physical exercise, have attracted considerable attention recently. Commonly, they provide quantitative services, such as personalized training instructions or the counting of distances. But qualitative monitoring and assessment is still missing, e.g., to detect malpositions, to prevent injuries, or to optimize training success. We address this issue by presenting a concept for qualitative as well as generic assessment of recurrent human motion by processing multi-dimensional, continuous time series tracked with motion sensors. To this end, our segmentation procedure extracts individual events of specific length, and we propose expressive features to accomplish a qualitative motion assessment by supervised classification. We verified our approach within a comprehensive study encompassing 27 athletes undertaking different body weight exercises. We are able to recognize six different exercise types with a success rate of 100% and to assess them qualitatively with an average success rate of 99.3%. |
Tasks | Time Series |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02363v2 |
http://arxiv.org/pdf/1703.02363v2.pdf | |
PWC | https://paperswithcode.com/paper/qualitative-assessment-of-recurrent-human |
Repo | |
Framework | |
Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning
Title | Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning |
Authors | Paweł Budzianowski, Stefan Ultes, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Iñigo Casanueva, Lina Rojas-Barahona, Milica Gašić |
Abstract | Human conversation is inherently complex, often spanning many different topics/domains. This makes policy learning for dialogue systems very challenging. Standard flat reinforcement learning methods do not provide an efficient framework for modelling such dialogues. In this paper, we focus on the under-explored problem of multi-domain dialogue management. First, we propose a new method for hierarchical reinforcement learning using the option framework. Next, we show that the proposed architecture learns faster and arrives at a better policy than the existing flat ones do. Moreover, we show how pretrained policies can be adapted to more complex systems with an additional set of new actions. In doing that, we show that our approach has the potential to facilitate policy optimisation for more sophisticated multi-domain dialogue systems. |
Tasks | Dialogue Management, Hierarchical Reinforcement Learning |
Published | 2017-06-19 |
URL | http://arxiv.org/abs/1706.06210v2 |
http://arxiv.org/pdf/1706.06210v2.pdf | |
PWC | https://paperswithcode.com/paper/sub-domain-modelling-for-dialogue-management |
Repo | |
Framework | |
A rule based algorithm for detecting negative words in Persian
Title | A rule based algorithm for detecting negative words in Persian |
Authors | Reza Takhshid, Adel Rahimi |
Abstract | In this paper, we present a novel method for detecting negative words in Persian. We first used an algorithm to generate an exceptions list, which was later modified by hand. We then used the resulting lists and a Persian polarity corpus in our rule-based algorithm to detect negative words. |
Tasks | |
Published | 2017-08-19 |
URL | http://arxiv.org/abs/1708.06708v1 |
http://arxiv.org/pdf/1708.06708v1.pdf | |
PWC | https://paperswithcode.com/paper/a-rule-based-algorithm-for-detecting-negative |
Repo | |
Framework | |
Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems
Title | Frames: A Corpus for Adding Memory to Goal-Oriented Dialogue Systems |
Authors | Layla El Asri, Hannes Schulz, Shikhar Sharma, Jeremie Zumer, Justin Harris, Emery Fine, Rahul Mehrotra, Kaheer Suleman |
Abstract | This paper presents the Frames dataset (Frames is available at http://datasets.maluuba.com/Frames), a corpus of 1369 human-human dialogues with an average of 15 turns per dialogue. We developed this dataset to study the role of memory in goal-oriented dialogue systems. Based on Frames, we introduce a task called frame tracking, which extends state tracking to a setting where several states are tracked simultaneously. We propose a baseline model for this task. We show that Frames can also be used to study memory in dialogue management and information presentation through natural language generation. |
Tasks | Dialogue Management, Goal-Oriented Dialogue Systems, Text Generation |
Published | 2017-03-31 |
URL | http://arxiv.org/abs/1704.00057v2 |
http://arxiv.org/pdf/1704.00057v2.pdf | |
PWC | https://paperswithcode.com/paper/frames-a-corpus-for-adding-memory-to-goal |
Repo | |
Framework | |
Applying the Wizard-of-Oz Technique to Multimodal Human-Robot Dialogue
Title | Applying the Wizard-of-Oz Technique to Multimodal Human-Robot Dialogue |
Authors | Matthew Marge, Claire Bonial, Brendan Byrne, Taylor Cassidy, A. William Evans, Susan G. Hill, Clare Voss |
Abstract | Our overall program objective is to provide more natural ways for soldiers to interact and communicate with robots, much like how soldiers communicate with other soldiers today. We describe how the Wizard-of-Oz (WOz) method can be applied to multimodal human-robot dialogue in a collaborative exploration task. While the WOz method can help design robot behaviors, traditional approaches place the burden of decisions on a single wizard. In this work, we consider two wizards to stand in for robot navigation and dialogue management software components. The scenario used to elicit data is one in which a human-robot team is tasked with exploring an unknown environment: a human gives verbal instructions from a remote location and the robot follows them, clarifying possible misunderstandings as needed via dialogue. We found the division of labor between wizards to be workable, which holds promise for future software development. |
Tasks | Dialogue Management, Robot Navigation |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.03714v1 |
http://arxiv.org/pdf/1703.03714v1.pdf | |
PWC | https://paperswithcode.com/paper/applying-the-wizard-of-oz-technique-to |
Repo | |
Framework | |
Robust Hypothesis Test for Nonlinear Effect with Gaussian Processes
Title | Robust Hypothesis Test for Nonlinear Effect with Gaussian Processes |
Authors | Jeremiah Zhe Liu, Brent Coull |
Abstract | This work constructs a hypothesis test for detecting whether a data-generating function $h: R^p \rightarrow R$ belongs to a specific reproducing kernel Hilbert space $\mathcal{H}_0$, where the structure of $\mathcal{H}_0$ is only partially known. Utilizing the theory of reproducing kernels, we reduce this hypothesis to a simple one-sided score test for a scalar parameter, develop a testing procedure that is robust against the mis-specification of kernel functions, and also propose an ensemble-based estimator for the null model to guarantee test performance in small samples. To demonstrate the utility of the proposed method, we apply our test to the problem of detecting nonlinear interaction between groups of continuous features. We evaluate the finite-sample performance of our test under different data-generating functions and estimation strategies for the null model. Our results reveal interesting connections between notions in machine learning (model underfit/overfit) and those in statistical inference (i.e. Type I error/power of hypothesis test), and also highlight unexpected consequences of common model estimating strategies (e.g. estimating kernel hyperparameters using maximum likelihood estimation) on model inference. |
Tasks | Gaussian Processes |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01406v2 |
http://arxiv.org/pdf/1710.01406v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-hypothesis-test-for-nonlinear-effect |
Repo | |
Framework | |
Lyrics-to-Audio Alignment by Unsupervised Discovery of Repetitive Patterns in Vowel Acoustics
Title | Lyrics-to-Audio Alignment by Unsupervised Discovery of Repetitive Patterns in Vowel Acoustics |
Authors | Sungkyun Chang, Kyogu Lee |
Abstract | Most of the previous approaches to lyrics-to-audio alignment used a pre-developed automatic speech recognition (ASR) system that innately suffered from several difficulties in adapting the speech model to individual singers. A significant aspect missing in previous works is the self-learnability of repetitive vowel patterns in the singing voice, where the vowel part used is more consistent than the consonant part. Based on this, our system first learns a discriminative subspace of vowel sequences, based on weighted symmetric non-negative matrix factorization (WS-NMF), by taking the self-similarity of a standard acoustic feature as an input. Then, we make use of canonical time warping (CTW), derived from a recent computer vision technique, to find an optimal spatiotemporal transformation between the text and the acoustic sequences. Experiments with Korean and English data sets showed that deploying this method after a pre-developed, unsupervised, singing source separation achieved more promising results than other state-of-the-art unsupervised approaches and an existing ASR-based system. |
Tasks | Speech Recognition |
Published | 2017-01-21 |
URL | http://arxiv.org/abs/1701.06078v2 |
http://arxiv.org/pdf/1701.06078v2.pdf | |
PWC | https://paperswithcode.com/paper/lyrics-to-audio-alignment-by-unsupervised |
Repo | |
Framework | |
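The abstract above feeds the self-similarity of an acoustic feature sequence into WS-NMF. As a minimal sketch of that input only, the snippet below builds a self-similarity matrix over per-frame feature vectors. The use of cosine similarity and the function names `cosine` and `self_similarity` are assumptions made for illustration; the abstract does not state which similarity measure or feature the authors use, and the WS-NMF and CTW steps are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero frames
    return dot / (norm_u * norm_v)


def self_similarity(frames):
    """Return the N x N matrix S with S[i][j] = similarity(frame i, frame j)."""
    return [[cosine(u, v) for v in frames] for u in frames]


# Hypothetical toy "frames": frames 0 and 2 point in the same direction,
# so they are maximally similar; frame 1 is orthogonal to both.
frames = [(1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]
S = self_similarity(frames)
```

Repeated sections of a song would show up in such a matrix as off-diagonal stripes of high similarity, which is the kind of repetitive structure the abstract refers to.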
Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System
Title | Morphological Analysis for the Maltese Language: The Challenges of a Hybrid System |
Authors | Claudia Borg, Albert Gatt |
Abstract | Maltese is a morphologically rich language with a hybrid morphological system which features both concatenative and non-concatenative processes. This paper analyses the impact of this hybridity on the performance of machine learning techniques for morphological labelling and clustering. In particular, we analyse a dataset of morphologically related word clusters to evaluate the difference in results for concatenative and nonconcatenative clusters. We also describe research carried out in morphological labelling, with a particular focus on the verb category. Two evaluations were carried out, one using an unseen dataset, and another one using a gold standard dataset which was manually labelled. The gold standard dataset was split into concatenative and non-concatenative to analyse the difference in results between the two morphological systems. |
Tasks | Morphological Analysis |
Published | 2017-03-25 |
URL | http://arxiv.org/abs/1703.08701v1 |
http://arxiv.org/pdf/1703.08701v1.pdf | |
PWC | https://paperswithcode.com/paper/morphological-analysis-for-the-maltese |
Repo | |
Framework | |
Deep Convolutional Decision Jungle for Image Classification
Title | Deep Convolutional Decision Jungle for Image Classification |
Authors | Seungryul Baek, Kwang In Kim, Tae-Kyun Kim |
Abstract | We propose a novel method called deep convolutional decision jungle (CDJ) and its learning algorithm for image classification. The CDJ maintains the structure of standard convolutional neural networks (CNNs), i.e. multiple layers of multiple response maps fully connected. Each response map (or node) in both the convolutional and fully-connected layers selectively responds to class labels such that each data sample travels via a specific soft route of those activated nodes. The proposed method CDJ automatically learns features, whereas decision forests and jungles require pre-defined feature sets. Compared to CNNs, the method embeds the benefits of using data-dependent discriminative functions, which better handle multi-modal/heterogeneous data; further, the method offers more diverse sparse network responses, which in turn can be used for cost-effective learning/classification. The network is learnt by combining conventional softmax and proposed entropy losses in each layer. The entropy loss, as used in decision tree growing, measures the purity of data activation according to the class label distribution. The back-propagation rule for the proposed loss function is derived from stochastic gradient descent (SGD) optimization of CNNs. We show that our proposed method outperforms state-of-the-art methods on three public image classification benchmarks and one face verification dataset. We also demonstrate the use of auxiliary data labels, when available, which helps our method to learn more discriminative routing and representations and leads to improved classification. |
Tasks | Face Verification, Image Classification |
Published | 2017-06-06 |
URL | http://arxiv.org/abs/1706.02003v2 |
http://arxiv.org/pdf/1706.02003v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-decision-jungle-for-image |
Repo | |
Framework | |
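The abstract above describes an entropy loss that, as in decision tree growing, measures purity with respect to the class label distribution. The sketch below shows only the standard Shannon entropy used as a purity measure in decision trees; it is not the paper's layer-wise loss, and the function name `label_entropy` is a hypothetical helper introduced for illustration.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical class distribution.

    Zero means the set is pure (a single class); higher values mean
    the labels reaching this node are more mixed.
    """
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical label sets reaching two different nodes.
pure_node = ["cat", "cat", "cat", "cat"]
mixed_node = ["cat", "dog", "cat", "dog"]
pure_h = label_entropy(pure_node)
mixed_h = label_entropy(mixed_node)
```

A node that only responds to samples of one class has zero entropy, while an evenly mixed two-class node has one bit; a loss built on this quantity therefore rewards class-selective responses.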
Detecting Human Interventions on the Landscape: KAZE Features, Poisson Point Processes, and a Construction Dataset
Title | Detecting Human Interventions on the Landscape: KAZE Features, Poisson Point Processes, and a Construction Dataset |
Authors | Edward Boyda, Colin McCormick, Dan Hammer |
Abstract | We present an algorithm capable of identifying a wide variety of human-induced change on the surface of the planet by analyzing matches between local features in time-sequenced remote sensing imagery. We evaluate feature sets, match protocols, and the statistical modeling of feature matches. With application of KAZE features, k-nearest-neighbor descriptor matching, and geometric proximity and bi-directional match consistency checks, average match rates increase more than two-fold over the previous standard. In testing our platform, we developed a small, labeled benchmark dataset expressing large-scale residential, industrial, and civic construction, along with null instances, in California between the years 2010 and 2012. On the benchmark set, our algorithm makes precise, accurate change proposals on two-thirds of scenes. Further, the detection threshold can be tuned so that all or almost all proposed detections are true positives. |
Tasks | Point Processes |
Published | 2017-03-29 |
URL | http://arxiv.org/abs/1703.10196v1 |
http://arxiv.org/pdf/1703.10196v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-human-interventions-on-the |
Repo | |
Framework | |
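The abstract above combines nearest-neighbor descriptor matching with a bi-directional match consistency check. A minimal sketch of that check follows, assuming plain Euclidean distance and a single nearest neighbor (k = 1); the abstract's k-nearest-neighbor protocol, KAZE feature extraction, and geometric proximity test are not reproduced, and `nearest` and `mutual_matches` are hypothetical names.

```python
import math


def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query` (Euclidean)."""
    return min(range(len(candidates)),
               key=lambda j: math.dist(query, candidates[j]))


def mutual_matches(desc_a, desc_b):
    """Keep (i, j) only if j is i's nearest neighbor AND i is j's."""
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Hypothetical 2-D descriptors from two images of the same scene.
desc_a = [(0.0, 0.0), (10.0, 10.0)]
desc_b = [(0.1, 0.0), (9.0, 10.0), (0.5, 0.5)]
matches = mutual_matches(desc_a, desc_b)
```

In this toy input the third descriptor of `desc_b` has a nearest neighbor in `desc_a`, but that neighbor prefers a closer descriptor, so the pair is discarded. Filtering one-sided matches this way trades some recall for fewer spurious correspondences.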
An Improved Fatigue Detection System Based on Behavioral Characteristics of Driver
Title | An Improved Fatigue Detection System Based on Behavioral Characteristics of Driver |
Authors | Rajat Gupta, Kanishk Aman, Nalin Shiva, Yadvendra Singh |
Abstract | In recent years, road accidents have increased significantly. One of the major reasons for these accidents, as reported, is driver fatigue. Due to continuous and prolonged driving, the driver gets exhausted and drowsy, which may lead to an accident. Therefore, there is a need for a system to measure the fatigue level of the driver and alert him or her when he or she feels drowsy to avoid accidents. Thus, we propose a system which comprises a camera installed on the car dashboard. The camera detects the driver’s face, observes the alteration in its facial features, and uses these features to observe the fatigue level. Facial features include eyes and mouth. Principal Component Analysis is then applied to reduce the features while minimizing the amount of information lost. The parameters thus obtained are processed through a Support Vector Classifier for classifying the fatigue level. After that, the classifier output is sent to the alert unit. |
Tasks | |
Published | 2017-09-17 |
URL | http://arxiv.org/abs/1709.05669v1 |
http://arxiv.org/pdf/1709.05669v1.pdf | |
PWC | https://paperswithcode.com/paper/an-improved-fatigue-detection-system-based-on |
Repo | |
Framework | |
BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
Title | BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning |
Authors | Ziming Zhang, Yuanwei Wu, Guanghui Wang |
Abstract | Understanding the global optimality in deep learning (DL) has been attracting more and more attention recently. Conventional DL solvers, however, have not been developed intentionally to seek for such global optimality. In this paper we propose a novel approximation algorithm, BPGrad, towards optimizing deep models globally via branch and pruning. Our BPGrad algorithm is based on the assumption of Lipschitz continuity in DL, and as a result it can adaptively determine the step size for current gradient given the history of previous updates, wherein theoretically no smaller steps can achieve the global optimality. We prove that, by repeating such branch-and-pruning procedure, we can locate the global optimality within finite iterations. Empirically an efficient solver based on BPGrad for DL is proposed as well, and it outperforms conventional DL solvers such as Adagrad, Adadelta, RMSProp, and Adam in the tasks of object recognition, detection, and segmentation. |
Tasks | Object Recognition |
Published | 2017-11-19 |
URL | http://arxiv.org/abs/1711.06959v1 |
http://arxiv.org/pdf/1711.06959v1.pdf | |
PWC | https://paperswithcode.com/paper/bpgrad-towards-global-optimality-in-deep |
Repo | |
Framework | |
Learning MSO-definable hypotheses on string
Title | Learning MSO-definable hypotheses on string |
Authors | Martin Grohe, Christof Löding, Martin Ritzert |
Abstract | We study the classification problems over string data for hypotheses specified by formulas of monadic second-order logic MSO. The goal is to design learning algorithms that run in time polynomial in the size of the training set, independently of or at least sublinear in the size of the whole data set. We prove negative as well as positive results. If the data set is an unprocessed string to which our algorithms have local access, then learning in sublinear time is impossible even for hypotheses definable in a small fragment of first-order logic. If we allow for a linear time pre-processing of the string data to build an index data structure, then learning of MSO-definable hypotheses is possible in time polynomial in the size of the training set, independently of the size of the whole data set. |
Tasks | |
Published | 2017-08-27 |
URL | http://arxiv.org/abs/1708.08081v1 |
http://arxiv.org/pdf/1708.08081v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-mso-definable-hypotheses-on-string |
Repo | |
Framework | |
Predicting Audience’s Laughter Using Convolutional Neural Network
Title | Predicting Audience’s Laughter Using Convolutional Neural Network |
Authors | Lei Chen, Chong Min Lee |
Abstract | For the purpose of automatically evaluating speakers’ humor usage, we build a presentation corpus containing humorous utterances based on TED talks. Compared to previous data resources supporting humor recognition research, ours has several advantages, including (a) both positive and negative instances coming from a homogeneous data set, (b) containing a large number of speakers, and (c) being open. Focusing on using lexical cues for humor recognition, we systematically compare a newly emerging text classification method based on Convolutional Neural Networks (CNNs) with a well-established conventional method using linguistic knowledge. The advantages of the CNN method are both getting higher detection accuracies and being able to learn essential features automatically. |
Tasks | Text Classification |
Published | 2017-02-08 |
URL | http://arxiv.org/abs/1702.02584v2 |
http://arxiv.org/pdf/1702.02584v2.pdf | |
PWC | https://paperswithcode.com/paper/predicting-audiences-laughter-using |
Repo | |
Framework | |
Sequence-to-Sequence ASR Optimization via Reinforcement Learning
Title | Sequence-to-Sequence ASR Optimization via Reinforcement Learning |
Authors | Andros Tjandra, Sakriani Sakti, Satoshi Nakamura |
Abstract | Despite the success of sequence-to-sequence approaches in automatic speech recognition (ASR) systems, the models still suffer from several problems, mainly due to the mismatch between the training and inference conditions. In the sequence-to-sequence architecture, the model is trained to predict the grapheme of the current time-step given the input of speech signal and the ground-truth grapheme history of the previous time-steps. However, it remains unclear how well the model approximates real-world speech during inference. Thus, generating the whole transcription from scratch based on previous predictions is complicated and errors can propagate over time. Furthermore, the model is optimized to maximize the likelihood of training data instead of error rate evaluation metrics that actually quantify recognition quality. This paper presents an alternative strategy for training sequence-to-sequence ASR models by adopting the idea of reinforcement learning (RL). Unlike the standard training scheme with maximum likelihood estimation, our proposed approach utilizes the policy gradient algorithm. We can (1) sample the whole transcription based on the model’s prediction in the training process and (2) directly optimize the model with negative Levenshtein distance as the reward. Experimental results demonstrate that we significantly improved the performance compared to a model trained only with maximum likelihood estimation. |
Tasks | Speech Recognition |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.10774v2 |
http://arxiv.org/pdf/1710.10774v2.pdf | |
PWC | https://paperswithcode.com/paper/sequence-to-sequence-asr-optimization-via |
Repo | |
Framework | |
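The abstract above uses the negative Levenshtein distance between a sampled transcription and the reference as the reward for policy-gradient training. The sketch below computes only that reward with the standard dynamic-programming edit distance; the sampling and policy-gradient update are not shown, and the function names and the character-level comparison are assumptions made for illustration.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop r from ref
                            curr[j - 1] + 1,      # drop h from hyp
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def reward(ref, hyp):
    """Negative edit distance: 0 for a perfect transcription, lower otherwise."""
    return -levenshtein(ref, hyp)


# Hypothetical reference and sampled hypothesis, compared per character.
r = reward("kitten", "sitting")
```

Because the reward is computed on the full sampled sequence, it ties training to the same kind of error count used at evaluation time, which is the mismatch the abstract sets out to address.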