Paper Group ANR 194
Efficient Estimation in the Tails of Gaussian Copulas
Title | Efficient Estimation in the Tails of Gaussian Copulas |
Authors | Kalyani Nagaraj, Jie Xu, Raghu Pasupathy, Soumyadip Ghosh |
Abstract | We consider the question of efficient estimation in the tails of Gaussian copulas. Our special focus is estimating expectations over multi-dimensional constrained sets that have a small implied measure under the Gaussian copula. We propose three estimators, all of which rely on a simple idea: identify certain \emph{dominating} point(s) of the feasible set, and appropriately shift and scale an exponential distribution for subsequent use within an importance sampling measure. As we show, the efficiency of such estimators depends crucially on the local structure of the feasible set around the dominating points. The first of our proposed estimators $\estOpt$ is the “full-information” estimator that actively exploits such local structure to achieve bounded relative error in Gaussian settings. The second and third estimators $\estExp$, $\estLap$ are “partial-information” estimators for use when complete information about the constraint set is not available; they do not exhibit bounded relative error but are shown to achieve polynomial efficiency. We provide sharp asymptotics for all three estimators. For the NORTA setting, where no ready information about the dominating points or the feasible set structure is assumed, we construct a multinomial mixture of the partial-information estimator $\estLap$, resulting in a fourth estimator $\estNt$ with polynomial efficiency that is implementable through the ecoNORTA algorithm. Numerical results on various example problems are remarkable and consistent with theory. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01375v1 |
http://arxiv.org/pdf/1607.01375v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-estimation-in-the-tails-of-gaussian |
Repo | |
Framework | |
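The shift-and-scale exponential importance-sampling idea at the heart of these estimators can be illustrated on a toy problem. The sketch below assumes a simple bivariate upper-orthant tail rather than the paper's general constrained sets: it shifts an exponential proposal past the dominating point and reweights by the likelihood ratio. The rate choice `lam = t` and all names are illustrative assumptions, not the paper's $\estOpt$ or $\estLap$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: the upper-orthant tail P(X1 > t, X2 > t) of a bivariate
# Gaussian with correlation rho -- a stand-in for a Gaussian-copula tail set.
rho, t = 0.5, 4.0
Sigma = np.array([[1.0, rho], [rho, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)
norm_const = 2 * np.pi * np.sqrt(np.linalg.det(Sigma))

def f(x):
    """Bivariate normal density, evaluated row-wise on an (n, 2) array."""
    q = np.einsum("ni,ij,nj->n", x, Sigma_inv, x)
    return np.exp(-0.5 * q) / norm_const

# Proposal: shift an exponential past the dominating point (t, t) in each
# coordinate, so every draw lands inside the rare set by construction.
n, lam = 100_000, t  # rate lam = t is a common asymptotically motivated choice
E = rng.exponential(scale=1.0 / lam, size=(n, 2))
X = t + E
q = (lam * np.exp(-lam * E[:, 0])) * (lam * np.exp(-lam * E[:, 1]))

weights = f(X) / q               # the indicator is 1 for every sample
est = weights.mean()
rel_err = weights.std(ddof=1) / (np.sqrt(n) * est)
print(f"IS estimate: {est:.3e}  (relative error ~ {rel_err:.3f})")
```

Naive Monte Carlo would need on the order of 1/est samples to see even one hit of this set; the shifted proposal makes every sample count.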
Modeling trajectories of mental health: challenges and opportunities
Title | Modeling trajectories of mental health: challenges and opportunities |
Authors | Lauren Erdman, Ekansh Sharma, Eva Unternahrer, Shantala Hari Dass, Kieran ODonnell, Sara Mostafavi, Rachel Edgar, Michael Kobor, Helene Gaudreau, Michael Meaney, Anna Goldenberg |
Abstract | More than two-thirds of mental health problems have their onset during childhood or adolescence. Identifying children at risk of mental illness later in life and predicting the type of illness is not easy. We set out to develop a platform to define subtypes of childhood social-emotional development using longitudinal, multifactorial trait-based measures. Subtypes discovered through this study could ultimately advance psychiatric knowledge of the early behavioural signs of mental illness. To this end we have examined two types of models: latent class mixture models (LCMMs) and Gaussian process (GP) based models. Our findings indicate that while GP models come close in accuracy of predicting future trajectories, LCMMs predict the trajectories just as well in a fraction of the time. Unfortunately, neither of the models is currently accurate enough to lead to immediate clinical impact. The available data related to the development of childhood mental health are often sparse, with only a few time points measured, and require novel methods with improved efficiency and accuracy. |
Tasks | |
Published | 2016-12-04 |
URL | http://arxiv.org/abs/1612.01055v1 |
http://arxiv.org/pdf/1612.01055v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-trajectories-of-mental-health |
Repo | |
Framework | |
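As a rough illustration of the GP side of this comparison, the sketch below fits a Gaussian process to a sparse longitudinal trajectory with scikit-learn. It is generic GP regression under assumed kernel choices and made-up measurements, not the authors' model or the LCMM alternative.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Sparse longitudinal measurements for one child: a few ages (years) and a
# social-emotional trait score at each -- the setting the abstract describes.
ages = np.array([[1.5], [3.0], [5.0], [8.0]])
scores = np.array([0.2, 0.5, 0.4, 0.9])

# RBF kernel for a smooth latent trajectory plus a white-noise term for
# measurement error; hyperparameters are fit by maximum likelihood.
kernel = 1.0 * RBF(length_scale=2.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(ages, scores)

# Predict the future trajectory with uncertainty -- the quantity of interest
# when deciding whether a child is drifting toward an at-risk subtype.
grid = np.linspace(1.0, 12.0, 50).reshape(-1, 1)
mean, std = gp.predict(grid, return_std=True)
print(mean[-5:], std[-5:])
```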
A deep learning model for estimating story points
Title | A deep learning model for estimating story points |
Authors | Morakot Choetkiertikul, Hoa Khanh Dam, Truyen Tran, Trang Pham, Aditya Ghose, Tim Menzies |
Abstract | Although there has been substantial research in software analytics for effort estimation in traditional software projects, little work has been done for estimation in agile projects, especially estimating user stories or issues. Story points are the most common unit of measure used for estimating the effort involved in implementing a user story or resolving an issue. In this paper, we offer for the \emph{first} time a comprehensive dataset for story point-based estimation that contains 23,313 issues from 16 open source projects. We also propose a prediction model for estimating story points based on a novel combination of two powerful deep learning architectures: long short-term memory and recurrent highway networks. Our prediction system is \emph{end-to-end} trainable from raw input data to prediction outcomes without any manual feature engineering. An empirical evaluation demonstrates that our approach consistently outperforms three common effort estimation baselines and two alternatives on both Mean Absolute Error and Standardized Accuracy. |
Tasks | Feature Engineering |
Published | 2016-09-02 |
URL | http://arxiv.org/abs/1609.00489v2 |
http://arxiv.org/pdf/1609.00489v2.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-model-for-estimating-story |
Repo | |
Framework | |
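A minimal sketch of an end-to-end story point regressor follows, assuming PyTorch. It keeps an LSTM document encoder but substitutes a plain feed-forward head for the paper's recurrent highway network; all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class StoryPointEstimator(nn.Module):
    """LSTM encoder over issue-text tokens with a regression head.

    A simplified stand-in for the paper's LSTM + recurrent highway network:
    the highway component is replaced here by a plain feed-forward head.
    """
    def __init__(self, vocab_size, embed_dim=50, hidden_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden_dim, hidden_dim),
                                  nn.ReLU(), nn.Linear(hidden_dim, 1))

    def forward(self, tokens):            # tokens: (batch, seq_len) int ids
        h, _ = self.lstm(self.embed(tokens))
        doc = h.mean(dim=1)               # average-pool token states
        return self.head(doc).squeeze(-1) # predicted story points

model = StoryPointEstimator(vocab_size=20_000)
loss_fn = nn.L1Loss()  # trains directly against Mean Absolute Error
x = torch.randint(1, 20_000, (8, 120))   # a dummy batch of issue reports
y = torch.rand(8) * 10                   # dummy story point targets
loss = loss_fn(model(x), y)
loss.backward()
```

Training against L1 loss aligns the optimization objective with the Mean Absolute Error metric the paper reports.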
Automatically Building Face Datasets of New Domains from Weakly Labeled Data with Pretrained Models
Title | Automatically Building Face Datasets of New Domains from Weakly Labeled Data with Pretrained Models |
Authors | Shengyong Ding, Junyu Wu, Wei Xu, Hongyang Chao |
Abstract | Training data are critical in face recognition systems. However, labeling large-scale face data for a particular domain is very tedious. In this paper, we propose a method to automatically and incrementally construct datasets from massive weakly labeled data of the target domain, which are readily available on the Internet, with the help of a pretrained face model. More specifically, given a large-scale weakly labeled dataset in which each face image is associated with a label, i.e., the name of an identity, we create a graph for each identity with edges linking matched faces verified by the existing model under a tight threshold. Then we use the maximal subgraph as the cleaned data for that identity. With the cleaned dataset, we update the existing face model and use the new model to filter the original dataset to get a larger cleaned dataset. We collect a large weakly labeled dataset containing 530,560 Asian face images of 7,962 identities from the Internet, which will be published for the study of face recognition. By running the filtering process, we obtain a cleaned dataset (99.7+% purity) of size 223,767 (recall 70.9%). On our test dataset of Asian faces, the model trained on the cleaned dataset achieves a recognition rate of 93.1%, clearly outperforming the model trained on the public CASIA dataset, whose recognition rate is 85.9%. |
Tasks | Face Recognition |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08107v1 |
http://arxiv.org/pdf/1611.08107v1.pdf | |
PWC | https://paperswithcode.com/paper/automatically-building-face-datasets-of-new |
Repo | |
Framework | |
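The per-identity graph-cleaning step lends itself to a short sketch. The version below assumes cosine similarity between embeddings from the pretrained model and reads "maximal subgraph" as the largest connected component; the threshold value is a placeholder, and this is illustrative rather than the authors' implementation.

```python
import networkx as nx
import numpy as np

def clean_identity(embeddings, threshold=0.75):
    """Keep the largest group of mutually verified faces for one identity.

    embeddings: (n, d) array of L2-normalized features from the pretrained
    model; threshold is the 'tight' cosine-similarity cutoff.
    """
    sim = embeddings @ embeddings.T
    g = nx.Graph()
    g.add_nodes_from(range(len(embeddings)))
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if sim[i, j] > threshold:        # a verified match
                g.add_edge(i, j)
    largest = max(nx.connected_components(g), key=len)
    return sorted(largest)

# Iterative use: clean every identity, retrain/update the face model on the
# cleaned data, then re-run this filter with the stronger model.
feats = np.random.randn(40, 128)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
kept = clean_identity(feats)
```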
Image Segmentation Using Overlapping Group Sparsity
Title | Image Segmentation Using Overlapping Group Sparsity |
Authors | Shervin Minaee, Yao Wang |
Abstract | Sparse decomposition has been widely used for different applications, such as source separation, image classification, and image denoising. This paper presents a new algorithm for segmentation of an image into background and foreground text and graphics using sparse decomposition. First, the background is represented using a suitable smooth model, which is a linear combination of a few smoothly varying basis functions, and the foreground text and graphics are modeled as a sparse component overlaid on the smooth background. Then the background and foreground are separated using a sparse decomposition framework by imposing prior information that promotes the smoothness of the background and the sparsity and connectivity of foreground pixels. This algorithm has been tested on a dataset of images extracted from HEVC standard test sequences for screen content coding, and is shown to outperform prior methods, including least absolute deviation fitting, the k-means clustering based segmentation in DjVu, and the shape primitive extraction and coding algorithm. |
Tasks | Denoising, Image Classification, Image Denoising, Semantic Segmentation |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.07909v4 |
http://arxiv.org/pdf/1611.07909v4.pdf | |
PWC | https://paperswithcode.com/paper/image-segmentation-using-overlapping-group |
Repo | |
Framework | |
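A stripped-down version of the decomposition can be sketched with a smooth polynomial background basis and an L1 foreground penalty, alternating a least-squares background refit with soft-thresholding of the residual. The paper's actual penalty is overlapping group sparsity, which this sketch replaces with plain L1; the basis choice and λ are assumptions.

```python
import numpy as np

def decompose(img, n_basis=5, lam=0.3, iters=100):
    """Split a (h, w) image into smooth background + sparse foreground.

    Background = low-order 2-D polynomial basis; foreground = sparse
    residual found by alternating least squares and soft-thresholding.
    """
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    yy, xx = yy / h, xx / w
    basis = [np.ones_like(xx)]
    for k in range(1, n_basis):
        basis += [xx**k, yy**k]                            # smooth monomials
    B = np.stack([b.ravel() for b in basis], axis=1)       # (h*w, n_bases)
    y = img.ravel().astype(float)
    s = np.zeros_like(y)
    for _ in range(iters):
        alpha, *_ = np.linalg.lstsq(B, y - s, rcond=None)  # refit background
        r = y - B @ alpha                                  # residual
        s = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)  # soft-threshold
    return (B @ alpha).reshape(h, w), s.reshape(h, w)

background, foreground = decompose(np.random.rand(64, 64))
```

The group-sparsity penalty the paper uses would couple neighboring foreground pixels, encouraging the connected text strokes that plain L1 cannot express.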
Learning Relational Dependency Networks for Relation Extraction
Title | Learning Relational Dependency Networks for Relation Extraction |
Authors | Dileep Viswanathan, Ameet Soni, Jude Shavlik, Sriraam Natarajan |
Abstract | We consider the task of KBP slot filling – extracting relation information from newswire documents for knowledge base construction. We present our pipeline, which employs Relational Dependency Networks (RDNs) to learn linguistic patterns for relation extraction. Additionally, we demonstrate how several components such as weak supervision, word2vec features, joint learning and the use of human advice, can be incorporated in this relational framework. We evaluate the different components in the benchmark KBP 2015 task and show that RDNs effectively model a diverse set of features and perform competitively with current state-of-the-art relation extraction. |
Tasks | Relation Extraction, Slot Filling |
Published | 2016-07-01 |
URL | http://arxiv.org/abs/1607.00424v1 |
http://arxiv.org/pdf/1607.00424v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-relational-dependency-networks-for |
Repo | |
Framework | |
Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding
Title | Sequential Convolutional Neural Networks for Slot Filling in Spoken Language Understanding |
Authors | Ngoc Thang Vu |
Abstract | We investigate the use of convolutional neural networks (CNNs) for the slot filling task in spoken language understanding. We propose a novel CNN architecture for sequence labeling which takes into account the previous context words with preserved order information and pays special attention to the current word with its surrounding context. Moreover, it combines information from the past and the future words for classification. Our proposed CNN architecture outperforms even the previous best ensemble recurrent neural network model and achieves state-of-the-art results with an F1-score of 95.61% on the ATIS benchmark dataset, without using any additional linguistic knowledge or resources. |
Tasks | Slot Filling, Spoken Language Understanding |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07783v1 |
http://arxiv.org/pdf/1606.07783v1.pdf | |
PWC | https://paperswithcode.com/paper/sequential-convolutional-neural-networks-for |
Repo | |
Framework | |
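A bare-bones CNN sequence labeler conveys the architectural idea. The sketch below (assuming PyTorch) convolves over a context window of word embeddings and classifies every token, omitting the paper's past-prediction and surrounding-context mechanisms; the slot count reflects the usual ATIS label set, and all hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class CNNSlotFiller(nn.Module):
    """Per-token slot tagger using convolutions over a context window."""
    def __init__(self, vocab_size, n_slots, embed_dim=100, n_filters=128, k=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # kernel_size k means each token sees k//2 words of context each side
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size=k, padding=k // 2)
        self.out = nn.Linear(n_filters, n_slots)

    def forward(self, tokens):                 # (batch, seq_len)
        e = self.embed(tokens).transpose(1, 2) # (batch, embed, seq)
        h = torch.relu(self.conv(e)).transpose(1, 2)
        return self.out(h)                     # (batch, seq, n_slots) logits

model = CNNSlotFiller(vocab_size=10_000, n_slots=127)  # ATIS has ~127 slot labels
logits = model(torch.randint(1, 10_000, (4, 30)))
```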
Conservative Bandits
Title | Conservative Bandits |
Authors | Yifan Wu, Roshan Shariff, Tor Lattimore, Csaba Szepesvári |
Abstract | We study a novel multi-armed bandit problem that models the challenge faced by a company wishing to explore new strategies to maximize revenue whilst simultaneously maintaining their revenue above a fixed baseline, uniformly over time. While previous work addressed the problem under the weaker requirement of maintaining the revenue constraint only at a given fixed time in the future, the algorithms previously proposed are unsuitable under the more stringent constraint. We consider both the stochastic and the adversarial settings, where we propose natural yet novel strategies and analyze the price of maintaining the constraints. Amongst other things, we prove both high-probability and expectation bounds on the regret, and we consider maintaining the constraints both with high probability and in expectation. For the adversarial setting the price of maintaining the constraint appears to be higher, at least for the algorithm considered. A lower bound is given showing that the algorithm for the stochastic setting is almost optimal. Empirical results obtained in synthetic environments complement our theoretical findings. |
Tasks | |
Published | 2016-02-13 |
URL | http://arxiv.org/abs/1602.04282v1 |
http://arxiv.org/pdf/1602.04282v1.pdf | |
PWC | https://paperswithcode.com/paper/conservative-bandits |
Repo | |
Framework | |
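The core tension here, exploring with a bandit algorithm while keeping revenue above a fraction of a known baseline uniformly in time, can be sketched as a guarded UCB loop. The guard below, which pessimistically assumes zero reward from an exploratory pull before allowing it, is a simplification of the idea, not a reimplementation of the paper's algorithms.

```python
import numpy as np

def conservative_ucb(arms, baseline_mean, alpha=0.1, horizon=5000, seed=0):
    """Guarded UCB sketch: explore only when a pessimistic estimate keeps
    cumulative reward above (1 - alpha) times the baseline's, at all times.
    arms[0] is the known baseline strategy; rewards are Bernoulli.
    """
    rng = np.random.default_rng(seed)
    k = len(arms)
    counts, sums = np.zeros(k), np.zeros(k)
    total = 0.0
    for t in range(1, horizon + 1):
        means = sums / np.maximum(counts, 1)
        bonus = np.sqrt(2 * np.log(t) / np.maximum(counts, 1))
        ucb = np.where(counts > 0, means + bonus, np.inf)  # untried arms first
        i = int(np.argmax(ucb))
        # Pessimistic check: assume the exploratory pull returns 0 reward.
        if total < (1 - alpha) * baseline_mean * t:
            i = 0                              # fall back to the baseline
        r = float(rng.random() < arms[i])
        counts[i] += 1; sums[i] += r; total += r
    return total, counts

total, counts = conservative_ucb(arms=[0.5, 0.4, 0.7], baseline_mean=0.5)
```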
Achieving Human Parity in Conversational Speech Recognition
Title | Achieving Human Parity in Conversational Speech Recognition |
Authors | W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig |
Abstract | Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we measure the human error rate on the widely used NIST 2000 test set, and find that our latest automated system has reached human parity. The error rate of professional transcribers is 5.9% for the Switchboard portion of the data, in which newly acquainted pairs of people discuss an assigned topic, and 11.3% for the CallHome portion where friends and family members have open-ended conversations. In both cases, our automated system establishes a new state of the art, and edges past the human benchmark, achieving error rates of 5.8% and 11.0%, respectively. The key to our system’s performance is the use of various convolutional and LSTM acoustic model architectures, combined with a novel spatial smoothing method and lattice-free MMI acoustic training, multiple recurrent neural network language modeling approaches, and a systematic use of system combination. |
Tasks | Language Modelling, Speech Recognition |
Published | 2016-10-17 |
URL | http://arxiv.org/abs/1610.05256v2 |
http://arxiv.org/pdf/1610.05256v2.pdf | |
PWC | https://paperswithcode.com/paper/achieving-human-parity-in-conversational |
Repo | |
Framework | |
Learning to Start for Sequence to Sequence Architecture
Title | Learning to Start for Sequence to Sequence Architecture |
Authors | Qingfu Zhu, Weinan Zhang, Lianqiang Zhou, Ting Liu |
Abstract | The sequence-to-sequence architecture is widely used in response generation and neural machine translation to model the relationship between two sentences. It typically consists of two parts: an encoder that reads the source sentence and a decoder that generates the target sentence word by word according to the encoder’s output and the last generated word. However, it faces a cold-start problem when generating the first word, as there is no previous word to refer to. Existing work mainly uses a special start symbol to generate the first word. An obvious drawback of this approach is that there is no learnable relationship between words and the start symbol. Furthermore, it may lead to error accumulation during decoding when the first word is incorrectly generated. In this paper, we propose a novel approach that learns to generate the first word in the sequence-to-sequence architecture rather than using a start symbol. Experimental results on the task of response generation for short-text conversation show that the proposed approach outperforms the state-of-the-art approach in both automatic and manual evaluations. |
Tasks | Machine Translation, Short-Text Conversation |
Published | 2016-08-19 |
URL | http://arxiv.org/abs/1608.05554v1 |
http://arxiv.org/pdf/1608.05554v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-start-for-sequence-to-sequence |
Repo | |
Framework | |
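One reading of the proposal is to replace the start symbol with a learned map from the encoder state to the first-word distribution. A sketch under that reading follows, assuming PyTorch and a GRU decoder; all layer shapes are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LearnToStartDecoder(nn.Module):
    """Seq2seq decoder that learns its first word from the encoder state
    instead of conditioning on a fixed start symbol."""
    def __init__(self, vocab_size, hidden_dim=256, embed_dim=128):
        super().__init__()
        self.first_word = nn.Linear(hidden_dim, vocab_size)  # replaces <s>
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, enc_state, max_len=20):
        # Step 0: the first word comes from a learned map on the encoder
        # state, so words and the "start" signal share trainable parameters.
        logits = [self.first_word(enc_state)]
        h = enc_state.unsqueeze(0)                       # (1, batch, hidden)
        tok = logits[-1].argmax(dim=-1)
        for _ in range(max_len - 1):                     # greedy decoding
            o, h = self.rnn(self.embed(tok).unsqueeze(1), h)
            logits.append(self.out(o.squeeze(1)))
            tok = logits[-1].argmax(dim=-1)
        return torch.stack(logits, dim=1)                # (batch, len, vocab)

dec = LearnToStartDecoder(vocab_size=8000)
out = dec(torch.randn(4, 256))                           # dummy encoder states
```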
A Characterization of the Non-Uniqueness of Nonnegative Matrix Factorizations
Title | A Characterization of the Non-Uniqueness of Nonnegative Matrix Factorizations |
Authors | W. Pan, F. Doshi-Velez |
Abstract | Nonnegative matrix factorization (NMF) is a popular dimension reduction technique that produces interpretable decompositions of the data into parts. However, this decomposition is not generally identifiable (even up to permutation and scaling). While other studies have provided criteria under which NMF is identifiable, we present the first (to our knowledge) characterization of the non-identifiability of NMF. We describe exactly when and how non-uniqueness can occur, which has important implications for algorithms to efficiently discover alternate solutions, if they exist. |
Tasks | Dimensionality Reduction |
Published | 2016-04-03 |
URL | http://arxiv.org/abs/1604.00653v1 |
http://arxiv.org/pdf/1604.00653v1.pdf | |
PWC | https://paperswithcode.com/paper/a-characterization-of-the-non-uniqueness-of |
Repo | |
Framework | |
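The basic mechanism behind non-uniqueness is easy to exhibit numerically: any invertible $Q$ that keeps both transformed factors nonnegative yields a second, genuinely different factorization of the same matrix, since $(WQ)(Q^{-1}H) = WH$. The example below constructs one such $Q$ by hand.

```python
import numpy as np

# Two distinct nonnegative factorizations of the same matrix X.
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H = np.array([[1.0, 1.0, 2.0], [0.0, 1.0, 1.0]])
Q = np.array([[1.0, 0.5], [0.0, 1.0]])   # invertible, not a permutation/scaling

W2, H2 = W @ Q, np.linalg.inv(Q) @ H
assert (W2 >= 0).all() and (H2 >= 0).all()   # still a valid NMF
assert np.allclose(W @ H, W2 @ H2)           # same data matrix X
print("Two genuinely different NMFs of the same X")
```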
Shape-based defect classification for Non Destructive Testing
Title | Shape-based defect classification for Non Destructive Testing |
Authors | Gianni D’Angelo, Salvatore Rampone |
Abstract | The aim of this work is to classify the aerospace structure defects detected by eddy current non-destructive testing. The proposed method is based on the assumption that the defect is bound to the reaction of the probe coil impedance during the test. Impedance plane analysis is used to extract a feature vector from the shape of the coil impedance in the complex plane, through the use of some geometric parameters. Shape recognition is tested with three different machine-learning based classifiers: decision trees, neural networks, and Naive Bayes. The performance of the proposed detection system is measured in terms of accuracy, sensitivity, specificity, precision, and Matthews correlation coefficient. Several experiments are performed on a dataset of eddy current signal samples for aircraft structures. The obtained results demonstrate the usefulness of our approach and its competitiveness against existing descriptors. |
Tasks | |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05518v1 |
http://arxiv.org/pdf/1610.05518v1.pdf | |
PWC | https://paperswithcode.com/paper/shape-based-defect-classification-for-non |
Repo | |
Framework | |
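The evaluation setup maps naturally onto a few lines of scikit-learn. The sketch below runs the three classifier families named in the abstract under cross-validation and reports accuracy and Matthews correlation; the feature matrix is a random placeholder standing in for the geometric impedance-shape features.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, matthews_corrcoef

# X: geometric shape features of the coil-impedance curve in the complex
# plane (hypothetical placeholders); y: defect class labels.
X = np.random.rand(200, 6)
y = np.random.randint(0, 3, 200)

for name, clf in [("decision tree", DecisionTreeClassifier()),
                  ("neural network", MLPClassifier(max_iter=1000)),
                  ("naive Bayes", GaussianNB())]:
    pred = cross_val_predict(clf, X, y, cv=5)
    print(f"{name}: acc={accuracy_score(y, pred):.2f} "
          f"MCC={matthews_corrcoef(y, pred):.2f}")
```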
Generative One-Class Models for Text-based Person Retrieval in Forensic Applications
Title | Generative One-Class Models for Text-based Person Retrieval in Forensic Applications |
Authors | David Gerónimo, Hedvig Kjellström |
Abstract | Automatic forensic image analysis assists criminal investigation experts in the search for suspicious persons, abnormal behavior detection, and identity matching in images. In this paper we propose a person retrieval system that uses textual queries (e.g., “black trousers and green shirt”) as descriptions and a one-class generative color model with outlier filtering to represent the images, both to train the models and to perform the search. The method is evaluated in terms of its efficiency in fulfilling the needs of a forensic retrieval system: limited annotation, robustness, extensibility, adaptability, and computational cost. The proposed generative method is compared to a corresponding discriminative approach. Experiments are carried out using a range of queries in three different databases. The experiments show that the two evaluated algorithms provide average retrieval performance and are adaptable to new datasets. The proposed generative algorithm has some advantages over the discriminative one, specifically its capability to work with very few training samples and its much lower computational requirements when the number of training examples increases. |
Tasks | Person Retrieval |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05915v1 |
http://arxiv.org/pdf/1611.05915v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-one-class-models-for-text-based |
Repo | |
Framework | |
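One way to read the "one-class generative color model with outlier filtering" is a Gaussian fit to annotated color samples, trimmed of its most atypical points and refit. The sketch below takes that reading; it is an assumption about the method's shape, not the authors' code, and every function name is hypothetical.

```python
import numpy as np

def fit_color_model(pixels, outlier_pct=10.0):
    """One-class Gaussian color model with outlier filtering (a sketch).

    pixels: (n, 3) color samples annotated for one query term, e.g.
    'green shirt'. Fit mean/covariance, drop the most atypical samples
    by Mahalanobis distance, and refit.
    """
    mu, cov = pixels.mean(0), np.cov(pixels.T) + 1e-6 * np.eye(3)
    d = np.einsum("ni,ij,nj->n", pixels - mu, np.linalg.inv(cov), pixels - mu)
    keep = d <= np.percentile(d, 100.0 - outlier_pct)   # outlier filtering
    inliers = pixels[keep]
    return inliers.mean(0), np.cov(inliers.T) + 1e-6 * np.eye(3)

def score(pixels, mu, cov):
    """Mean Mahalanobis distance of an image region to the color model."""
    d = np.einsum("ni,ij,nj->n", pixels - mu, np.linalg.inv(cov), pixels - mu)
    return float(d.mean())   # lower = better match to the query color

mu, cov = fit_color_model(np.random.rand(500, 3))
print(score(np.random.rand(300, 3), mu, cov))
```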
A Graph-Based Semi-Supervised k Nearest-Neighbor Method for Nonlinear Manifold Distributed Data Classification
Title | A Graph-Based Semi-Supervised k Nearest-Neighbor Method for Nonlinear Manifold Distributed Data Classification |
Authors | Enmei Tu, Yaqian Zhang, Lin Zhu, Jie Yang, Nikola Kasabov |
Abstract | $k$ Nearest Neighbors ($k$NN) is one of the most widely used supervised learning algorithms for classifying Gaussian distributed data, but it does not achieve good results when applied to nonlinear manifold distributed data, especially when only a very limited amount of labeled samples is available. In this paper, we propose a new graph-based $k$NN algorithm which can effectively handle both Gaussian distributed data and nonlinear manifold distributed data. To achieve this goal, we first propose a constrained Tired Random Walk (TRW) by constructing an $R$-level nearest-neighbor strengthened tree over the graph, and then compute a TRW matrix for similarity measurement purposes. After this, the nearest neighbors are identified according to the TRW matrix, and the class label of a query point is determined by the sum of all the TRW weights of its nearest neighbors. To deal with online situations, we also propose a new algorithm to handle sequential samples based on local neighborhood reconstruction. Comparison experiments are conducted on both synthetic and real-world data sets to demonstrate the validity of the proposed new $k$NN algorithm and its improvements over other versions of the $k$NN algorithm. Given the widespread appearance of manifold structures in real-world problems and the popularity of the traditional $k$NN algorithm, the proposed manifold version of $k$NN shows promising potential for classifying manifold-distributed data. |
Tasks | |
Published | 2016-06-03 |
URL | http://arxiv.org/abs/1606.00985v1 |
http://arxiv.org/pdf/1606.00985v1.pdf | |
PWC | https://paperswithcode.com/paper/a-graph-based-semi-supervised-k-nearest |
Repo | |
Framework | |
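A tired random walk admits a convenient closed form when walks of every length are summed with a per-step decay $\alpha$: $T = (1-\alpha)(I - \alpha P)^{-1}$, with $P$ the transition matrix. The sketch below builds $P$ from a dense Gaussian affinity graph rather than the paper's constrained $R$-level strengthened tree, so it illustrates only the TRW-weighted voting step; parameter values are assumptions.

```python
import numpy as np

def tired_random_walk_matrix(X, sigma=1.0, alpha=0.9):
    """TRW similarity over a Gaussian affinity graph:
    T = (1 - alpha) * (I - alpha * P)^{-1}, P the row-normalized affinity."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)
    n = len(X)
    return (1 - alpha) * np.linalg.inv(np.eye(n) - alpha * P)

def trw_knn_predict(X, labels, query_idx, k=5):
    """Classify a query by summing TRW weights of its k nearest labeled points.
    Unlabeled points carry label -1."""
    T = tired_random_walk_matrix(X)
    labeled = np.where(labels >= 0)[0]
    nn = labeled[np.argsort(-T[query_idx, labeled])[:k]]
    classes = np.unique(labels[labeled])
    votes = [T[query_idx, nn[labels[nn] == c]].sum() for c in classes]
    return classes[int(np.argmax(votes))]

X = np.random.rand(50, 2)
labels = np.full(50, -1); labels[:10] = np.arange(10) % 2
print(trw_knn_predict(X, labels, query_idx=20))
```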
Shesop Healthcare: Stress and influenza classification using support vector machine kernel
Title | Shesop Healthcare: Stress and influenza classification using support vector machine kernel |
Authors | Andrien Ivander Wijaya, Ary Setijadi Prihatmanto, Rifki Wijaya |
Abstract | Shesop is an integrated system designed to make human lives easier and to help people with healthcare. Stress and influenza classification is part of Shesop’s application for healthcare devices such as smartwatches, Polar, and Fitbit. The main objective of this paper is to classify new data and report whether a person is stressed, depressed, or has influenza. We use heart rate data collected over months in Bandung, analyze the data, and find the heart rate variability features that are consistently related to stress and flu levels. After finding these variables, we use them as input to a support vector machine. We use the Lagrangian and kernel techniques to transform 2D data into 3D data so that we can apply linear classification in 3D space. In the end, the machine learning results can be used to classify new data and give the final result immediately: stressed or not, influenza or not. |
Tasks | |
Published | 2016-07-16 |
URL | http://arxiv.org/abs/1607.04770v1 |
http://arxiv.org/pdf/1607.04770v1.pdf | |
PWC | https://paperswithcode.com/paper/shesop-healthcare-stress-and-influenza |
Repo | |
Framework | |
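The pipeline described, heart-rate-variability features fed into a kernel SVM, amounts to a few lines of scikit-learn. The feature columns below are hypothetical placeholders; the RBF kernel stands in for the abstract's explicit lift of 2D data into a 3D space where the classes become linearly separable.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# X: heart-rate-variability features per recording (hypothetical columns,
# e.g. mean RR interval, SDNN, RMSSD); y: 1 = stressed/ill, 0 = healthy.
X = np.random.rand(100, 3)
y = np.random.randint(0, 2, 100)

# The kernel trick implicitly maps the features into a higher-dimensional
# space where a linear separator exists, as the abstract describes.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X, y)
print(clf.predict(X[:5]))   # classify new heart-rate data immediately
```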