Paper Group ANR 70
Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization
Title | Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization |
Authors | Shalmali Joshi, Suriya Gunasekar, David Sontag, Joydeep Ghosh |
Abstract | This work proposes a new algorithm for automated and simultaneous phenotyping of multiple co-occurring medical conditions, also referred to as comorbidities, using clinical notes from electronic health records (EHRs). A basic latent factor estimation technique, non-negative matrix factorization (NMF), is augmented with domain-specific constraints to obtain sparse latent factors that are anchored to a fixed set of chronic conditions. The proposed anchoring mechanism ensures a one-to-one, identifiable, and interpretable mapping between the latent factors and the target comorbidities. Qualitative assessment of the empirical results by clinical experts suggests that the proposed model learns clinically interpretable phenotypes while being predictive of 30-day mortality. The proposed method can be readily adapted to any non-negative EHR data across various healthcare institutions. |
Tasks | |
Published | 2016-08-02 |
URL | http://arxiv.org/abs/1608.00704v3 |
http://arxiv.org/pdf/1608.00704v3.pdf | |
PWC | https://paperswithcode.com/paper/identifiable-phenotyping-using-constrained |
Repo | |
Framework | |
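The anchoring idea described in the abstract can be illustrated with a minimal sketch: standard multiplicative-update NMF where a binary mask restricts each latent factor's support to the anchor terms of its target condition. The masking scheme and update rule here are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def anchored_nmf(X, anchor_mask, n_iter=200, eps=1e-9):
    """Sketch of NMF with anchoring: entry (j, k) of the term-by-factor
    matrix W may be nonzero only where anchor_mask[j, k] == 1, tying
    each latent factor k to its condition's anchor terms."""
    rng = np.random.default_rng(0)
    n, m = X.shape                        # notes x vocabulary
    k = anchor_mask.shape[1]
    H = rng.random((n, k)) + eps          # note-level factor loadings
    W = (rng.random((m, k)) + eps) * anchor_mask
    for _ in range(n_iter):
        # multiplicative updates for the squared-error objective, X ~ H W^T
        H *= (X @ W) / (H @ W.T @ W + eps)
        W *= (X.T @ H) / (W @ H.T @ H + eps)
        W *= anchor_mask                  # re-impose the anchoring constraint
    return W, H
```

Because the mask is re-applied after every update, zeros outside each factor's anchor set are preserved, which is what makes the factor-to-condition mapping identifiable in this toy version.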
Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages
Title | Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages |
Authors | Muhammad Imran, Prasenjit Mitra, Carlos Castillo |
Abstract | Microblogging platforms such as Twitter provide active communication channels during mass convergence and emergency events such as earthquakes and typhoons. During the sudden onset of a crisis, affected people post useful information on Twitter that can be used for situational awareness and other humanitarian disaster response efforts, if processed in a timely and effective manner. Processing social media information poses multiple challenges, such as parsing noisy, brief, and informal messages, learning information categories from the incoming stream of messages, and classifying them into different classes, among others. One of the basic necessities of many of these tasks is the availability of data, in particular human-annotated data. In this paper, we present human-annotated Twitter corpora collected during 19 different crises that took place between 2013 and 2015. To demonstrate the utility of the annotations, we train machine learning classifiers. Moreover, we publish the largest word2vec word embeddings to date trained on 52 million crisis-related tweets. To deal with the language issues of tweets, we present human-annotated normalized lexical resources for different lexical variations. |
Tasks | Word Embeddings |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.05894v2 |
http://arxiv.org/pdf/1605.05894v2.pdf | |
PWC | https://paperswithcode.com/paper/twitter-as-a-lifeline-human-annotated-twitter |
Repo | |
Framework | |
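The normalized lexical resources mentioned in the abstract map out-of-vocabulary variants to canonical forms. A minimal sketch of how such a lexicon might be applied, with purely illustrative entries (not taken from the released resource):

```python
# Hypothetical normalization lexicon: lexical variants -> canonical forms.
NORM_LEXICON = {
    "u": "you",
    "pls": "please",
    "ppl": "people",
}

def normalize_tweet(text: str) -> str:
    """Replace known lexical variants token by token, using a
    case-insensitive lookup and leaving unknown tokens untouched."""
    tokens = text.split()
    return " ".join(NORM_LEXICON.get(t.lower(), t) for t in tokens)
```

In practice such a pass typically runs before tokenized tweets are fed to downstream classifiers or embedding training.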
Unsupervised Domain Adaptation with Residual Transfer Networks
Title | Unsupervised Domain Adaptation with Residual Transfer Networks |
Authors | Mingsheng Long, Han Zhu, Jianmin Wang, Michael I. Jordan |
Abstract | The recent success of deep neural networks relies on massive amounts of labeled data. For a target task where labeled data is unavailable, domain adaptation can transfer a learner from a different source domain. In this paper, we propose a new approach to domain adaptation in deep networks that can jointly learn adaptive classifiers and transferable features from labeled data in the source domain and unlabeled data in the target domain. We relax the shared-classifier assumption made by previous methods and assume that the source classifier and target classifier differ by a residual function. We enable classifier adaptation by plugging several layers into the deep network to explicitly learn the residual function with reference to the target classifier. We fuse the features of multiple layers with tensor products and embed them into reproducing kernel Hilbert spaces to match distributions for feature adaptation. The adaptation can be achieved in most feed-forward models by extending them with new residual layers and loss functions, which can be trained efficiently via back-propagation. Empirical evidence shows that the new approach outperforms state-of-the-art methods on standard domain adaptation benchmarks. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2016-02-14 |
URL | http://arxiv.org/abs/1602.04433v2 |
http://arxiv.org/pdf/1602.04433v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptation-with-residual |
Repo | |
Framework | |
On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation
Title | On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation |
Authors | Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu |
Abstract | We propose to train a bi-directional neural network language model (NNLM) with noise contrastive estimation (NCE). Experiments are conducted on a rescoring task on the PTB data set. It is shown that the NCE-trained bi-directional NNLM outperforms the one trained by conventional maximum likelihood training but, regrettably, still does not outperform the baseline uni-directional NNLM. |
Tasks | Language Modelling |
Published | 2016-02-19 |
URL | http://arxiv.org/abs/1602.06064v3 |
http://arxiv.org/pdf/1602.06064v3.pdf | |
PWC | https://paperswithcode.com/paper/on-training-bi-directional-neural-network |
Repo | |
Framework | |
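NCE sidesteps the softmax normalization by training the model to distinguish the true next word from k samples drawn from a noise distribution q. A minimal per-token sketch of the (negated) NCE objective, with illustrative score and probability arguments:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def nce_loss(target_score, target_noise_prob, noise_scores, noise_probs, k):
    """Per-token NCE loss: the model score s(w) competes against the
    scaled noise probability k*q(w) in a logistic discrimination task,
    once for the observed word and once per noise sample."""
    loss = -math.log(sigmoid(target_score - math.log(k * target_noise_prob)))
    for s, q in zip(noise_scores, noise_probs):
        loss -= math.log(1.0 - sigmoid(s - math.log(k * q)))
    return loss
```

Raising the model's score for the observed word lowers the loss, while raising scores of noise samples increases it, which is the gradient signal that replaces the full-vocabulary softmax.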
Attributes as Semantic Units between Natural Language and Visual Recognition
Title | Attributes as Semantic Units between Natural Language and Visual Recognition |
Authors | Marcus Rohrbach |
Abstract | Impressive progress has been made in the fields of computer vision and natural language processing. However, it remains a challenge to find the best point of interaction for these very different modalities. In this chapter we discuss how attributes allow us to exchange information between the two modalities and in this way lead to an interaction on a semantic level. Specifically, we discuss how attributes allow using knowledge mined from language resources for recognizing novel visual categories, how we can generate sentence descriptions of images and video, how we can ground natural language in visual content, and finally, how we can answer natural language questions about images. |
Tasks | |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03249v1 |
http://arxiv.org/pdf/1604.03249v1.pdf | |
PWC | https://paperswithcode.com/paper/attributes-as-semantic-units-between-natural |
Repo | |
Framework | |
Towards Deep Compositional Networks
Title | Towards Deep Compositional Networks |
Authors | Domen Tabernik, Matej Kristan, Jeremy L. Wyatt, Aleš Leonardis |
Abstract | Hierarchical feature learning based on convolutional neural networks (CNN) has recently shown significant potential in various computer vision tasks. While allowing high-quality discriminative feature learning, the downside of CNNs is the lack of explicit structure in features, which often leads to overfitting, absence of reconstruction from partial observations, and limited generative abilities. Explicit structure is inherent in hierarchical compositional models; however, these lack the ability to optimize a well-defined cost function. We propose a novel analytic model of a basic unit in a layered hierarchical model with both explicit compositional structure and a well-defined discriminative cost function. Our experiments on two datasets show that the proposed compositional model performs on a par with standard CNNs on discriminative tasks, while, due to the explicit modeling of structure in the feature units, it affords straightforward visualization of parts and faster inference thanks to the separability of the units. |
Tasks | |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03795v1 |
http://arxiv.org/pdf/1609.03795v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-deep-compositional-networks |
Repo | |
Framework | |
Boosting Variational Inference
Title | Boosting Variational Inference |
Authors | Fangjian Guo, Xiangyu Wang, Kai Fan, Tamara Broderick, David B. Dunson |
Abstract | Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions. For practical reasons, the family of distributions in VI is usually constrained so that it does not include the exact posterior, even as a limit point. Thus, no matter how long VI is run, the resulting approximation will not approach the exact posterior. We propose to instead consider a more flexible approximating family consisting of all possible finite mixtures of a parametric base distribution (e.g., Gaussian). For efficient inference, we borrow ideas from gradient boosting to develop an algorithm we call boosting variational inference (BVI). BVI iteratively improves the current approximation by mixing it with a new component from the base distribution family and thereby yields progressively more accurate posterior approximations as more computing time is spent. Unlike a number of common VI variants including mean-field VI, BVI is able to capture multimodality, general posterior covariance, and nonstandard posterior shapes. |
Tasks | |
Published | 2016-11-17 |
URL | http://arxiv.org/abs/1611.05559v2 |
http://arxiv.org/pdf/1611.05559v2.pdf | |
PWC | https://paperswithcode.com/paper/boosting-variational-inference |
Repo | |
Framework | |
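The greedy add-a-component loop behind BVI can be illustrated with a toy version. BVI proper works with the KL divergence to the posterior; the sketch below only mimics the boosting structure by greedily choosing, at each step, a Gaussian component and mixing weight that best reduce squared error to a target density on a grid. Grid, weight set, and error criterion are simplifying assumptions.

```python
import math

def gauss(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def boosted_mixture(target, grid, n_components=5, sigmas=(0.5, 1.0, 2.0)):
    """Boosting-style mixture fit: each iteration mixes the current
    approximation with one new Gaussian component, picked greedily."""
    approx = [0.0] * len(grid)
    components = []
    for _ in range(n_components):
        best = None
        for mu in grid:                      # candidate component means
            for sigma in sigmas:             # candidate widths
                for w in (0.1, 0.3, 0.5):    # candidate mixing weights
                    cand = [(1 - w) * a + w * gauss(x, mu, sigma)
                            for a, x in zip(approx, grid)]
                    err = sum((c - t) ** 2 for c, t in zip(cand, target))
                    if best is None or err < best[0]:
                        best = (err, cand, (mu, sigma, w))
        _, approx, params = best
        components.append(params)
    return approx, components
```

Even this crude version captures the key property the abstract highlights: a mixture grown one component at a time can fit multimodal targets that a single mean-field factor cannot.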
Evolutionary Approaches to Optimization Problems in Chimera Topologies
Title | Evolutionary Approaches to Optimization Problems in Chimera Topologies |
Authors | Roberto Santana, Zheng Zhu, Helmut G. Katzgraber |
Abstract | Chimera graphs define the topology of one of the first commercially available quantum computers. A variety of optimization problems have been mapped to this topology to evaluate the behavior of quantum-enhanced optimization heuristics in relation to other optimizers; the ability to solve these problems efficiently with classical methods makes them useful benchmarks for quantum machines. In this paper we investigate for the first time the use of Evolutionary Algorithms (EAs) on Ising spin glass instances defined on the Chimera topology. Three genetic algorithms (GAs) and three estimation of distribution algorithms (EDAs) are evaluated over $1000$ hard instances of the Ising spin glass constructed from Sidon sets. We focus on determining whether information about the topology of the graph can be used to improve the results of EAs and on identifying the characteristics of the Ising instances that influence the success rate of GAs and EDAs. |
Tasks | |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.05105v1 |
http://arxiv.org/pdf/1608.05105v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-approaches-to-optimization |
Repo | |
Framework | |
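The objective being optimized is the Ising spin-glass energy E(s) = -Σ J_ij s_i s_j over spin configurations s ∈ {-1, +1}^n. A minimal sketch of a plain GA baseline of the kind compared in the paper (no topology information used; operator choices here are assumptions, not the paper's exact configuration):

```python
import random

def ising_energy(spins, couplings):
    """Energy of a spin configuration: E(s) = -sum_ij J_ij * s_i * s_j."""
    return -sum(J * spins[i] * spins[j] for (i, j), J in couplings.items())

def simple_ga(couplings, n_spins, pop_size=40, generations=100, seed=0):
    """Plain GA: tournament selection, uniform crossover, single-flip mutation."""
    rng = random.Random(seed)
    energy = lambda s: ising_energy(s, couplings)
    pop = [[rng.choice([-1, 1]) for _ in range(n_spins)] for _ in range(pop_size)]
    best = min(pop, key=energy)
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            a = min(rng.sample(pop, 3), key=energy)   # tournament of 3
            b = min(rng.sample(pop, 3), key=energy)
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            if rng.random() < 0.5:                    # occasional spin flip
                k = rng.randrange(n_spins)
                child[k] = -child[k]
            new_pop.append(child)
        pop = new_pop
        gen_best = min(pop, key=energy)
        if energy(gen_best) < energy(best):
            best = gen_best
    return best
```

On a small ferromagnetic ring (all J_ij = 1) the ground state is all spins aligned, which makes a convenient sanity check for the implementation.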
Saliency Detection for Improving Object Proposals
Title | Saliency Detection for Improving Object Proposals |
Authors | Shuhan Chen, Jindong Li, Xuelong Hu, Ping Zhou |
Abstract | Object proposals greatly benefit the object detection task in recent state-of-the-art works. However, existing object proposals usually have low localization accuracy at high intersection-over-union thresholds. To address this, in this paper we apply saliency detection to each bounding box to improve its quality. We first present a geodesic saliency detection method in contour, which is designed to find closed contours. We then apply it to each candidate box at multiple sizes; refined boxes can be easily produced from the obtained saliency maps, which are further used to calculate saliency scores for proposal ranking. Experiments on the PASCAL VOC 2007 test dataset demonstrate that the proposed refinement approach can greatly improve existing models. |
Tasks | Object Detection, Saliency Detection |
Published | 2016-03-14 |
URL | http://arxiv.org/abs/1603.04146v3 |
http://arxiv.org/pdf/1603.04146v3.pdf | |
PWC | https://paperswithcode.com/paper/saliency-detection-for-improving-object |
Repo | |
Framework | |
Decoding Emotional Experience through Physiological Signal Processing
Title | Decoding Emotional Experience through Physiological Signal Processing |
Authors | Maria S. Perez-Rosero, Behnaz Rezaei, Murat Akcakaya, Sarah Ostadabbas |
Abstract | There is an increasing consensus among researchers that making a computer emotionally intelligent, with the ability to decode human affective states, would allow a more meaningful and natural way of human-computer interactions (HCIs). One unobtrusive and non-invasive way of recognizing human affective states entails the exploration of how physiological signals vary under different emotional experiences. In particular, this paper explores the correlation between autonomically-mediated changes in multimodal body signals and discrete emotional states. In order to fully exploit the information in each modality, we have provided an innovative classification approach for three specific physiological signals: Electromyogram (EMG), Blood Volume Pressure (BVP) and Galvanic Skin Response (GSR). These signals are analyzed as inputs to an emotion recognition paradigm based on the fusion of a series of weak learners. Our proposed classification approach achieved 88.1% recognition accuracy, a 17% improvement over the conventional Support Vector Machine (SVM) classifier. Furthermore, in order to avoid information redundancy and the resultant over-fitting, a feature reduction method based on correlation analysis is proposed to optimize the number of features required for training and validating each weak learner. Results showed that despite reducing the feature space dimensionality from 27 to 18 features, our methodology preserved a recognition accuracy of about 85.0%. This reduction in complexity brings us one step closer to embedding this human emotion encoder in wireless and wearable HCI platforms. |
Tasks | Dimensionality Reduction, Emotion Recognition |
Published | 2016-06-01 |
URL | http://arxiv.org/abs/1606.00370v1 |
http://arxiv.org/pdf/1606.00370v1.pdf | |
PWC | https://paperswithcode.com/paper/decoding-emotional-experience-through |
Repo | |
Framework | |
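The correlation-based feature reduction described in the abstract can be sketched as a greedy filter: keep a feature only if it is not highly correlated with a feature already kept. The greedy keep-first rule and the threshold are assumptions for illustration; the paper's exact criterion may differ.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def reduce_features(columns, threshold=0.9):
    """Return indices of features to keep: a feature survives only if its
    absolute correlation with every already-kept feature is below the
    threshold, removing redundant (near-duplicate) features."""
    kept = []
    for i, col in enumerate(columns):
        if all(abs(pearson(col, columns[j])) < threshold for j in kept):
            kept.append(i)
    return kept
```

On the paper's 27-feature space, a filter of this kind would simply be run once before training each weak learner.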
Thesis: Multiple Kernel Learning for Object Categorization
Title | Thesis: Multiple Kernel Learning for Object Categorization |
Authors | Dinesh Govindaraj |
Abstract | Object categorization is a challenging problem, especially when the images have cluttered backgrounds, occlusions, or different lighting conditions. In the past, many descriptors have been proposed which aid object categorization even in such adverse conditions. Each descriptor has its own merits and demerits. Some descriptors are invariant to transformations while others are more discriminative. Past research has shown that employing multiple descriptors rather than any single descriptor leads to better recognition. The problem of learning the optimal combination of the available descriptors for a particular classification task is studied. A Multiple Kernel Learning (MKL) framework has been developed for learning an optimal combination of descriptors for object categorization. Existing MKL formulations often employ block l-1 norm regularization, which is equivalent to selecting a single kernel from a library of kernels. Since essentially a single descriptor is selected, the existing formulations may be sub-optimal for object categorization. An MKL formulation based on block l-infinity norm regularization has been developed, which chooses an optimal combination of kernels as opposed to selecting a single kernel. A Composite Multiple Kernel Learning (CKL) formulation based on mixed l-infinity and l-1 norm regularization has been developed. These formulations result in Second Order Cone Programs (SOCPs). Efficient alternative algorithms for these formulations have also been implemented. Empirical results on benchmark datasets show significant improvement using these new MKL formulations. |
Tasks | |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03247v1 |
http://arxiv.org/pdf/1604.03247v1.pdf | |
PWC | https://paperswithcode.com/paper/thesis-multiple-kernel-learning-for-object |
Repo | |
Framework | |
Abstractive Meeting Summarization Using Dependency Graph Fusion
Title | Abstractive Meeting Summarization Using Dependency Graph Fusion |
Authors | Siddhartha Banerjee, Prasenjit Mitra, Kazunari Sugiyama |
Abstract | Automatic summarization techniques on meeting conversations developed so far have been primarily extractive, resulting in poor summaries. To improve this, we propose an approach to generate abstractive summaries by fusing important content from several utterances. Any meeting generally comprises several discussion topic segments. For each topic segment within a meeting conversation, we aim to generate a one-sentence summary from the most important utterances using an integer linear programming-based sentence fusion approach. Experimental results show that our method can generate more informative summaries than the baselines. |
Tasks | Meeting Summarization |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.07035v1 |
http://arxiv.org/pdf/1609.07035v1.pdf | |
PWC | https://paperswithcode.com/paper/abstractive-meeting-summarization |
Repo | |
Framework | |
Learning theory estimates with observations from general stationary stochastic processes
Title | Learning theory estimates with observations from general stationary stochastic processes |
Authors | Hanyuan Hang, Yunlong Feng, Ingo Steinwart, Johan A. K. Suykens |
Abstract | This paper investigates the supervised learning problem with observations drawn from certain general stationary stochastic processes. Here by \emph{general}, we mean that many stationary stochastic processes can be included. We show that when the stochastic processes satisfy a generalized Bernstein-type inequality, a unified treatment on analyzing the learning schemes with various mixing processes can be conducted and a sharp oracle inequality for generic regularized empirical risk minimization schemes can be established. The obtained oracle inequality is then applied to derive convergence rates for several learning schemes such as empirical risk minimization (ERM), least squares support vector machines (LS-SVMs) using given generic kernels, and SVMs using Gaussian kernels for both least squares and quantile regression. It turns out that for i.i.d.~processes, our learning rates for ERM recover the optimal rates. On the other hand, for non-i.i.d.~processes including geometrically $\alpha$-mixing Markov processes, geometrically $\alpha$-mixing processes with restricted decay, $\phi$-mixing processes, and (time-reversed) geometrically $\mathcal{C}$-mixing processes, our learning rates for SVMs with Gaussian kernels match, up to some arbitrarily small extra term in the exponent, the optimal rates. For the remaining cases, our rates are at least close to the optimal rates. As a by-product, the assumed generalized Bernstein-type inequality also provides an interpretation of the so-called “effective number of observations” for various mixing processes. |
Tasks | |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.02887v1 |
http://arxiv.org/pdf/1605.02887v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-theory-estimates-with-observations |
Repo | |
Framework | |
Optimal Parameter Settings for the $(1+(λ, λ))$ Genetic Algorithm
Title | Optimal Parameter Settings for the $(1+(λ, λ))$ Genetic Algorithm |
Authors | Benjamin Doerr |
Abstract | The $(1+(\lambda,\lambda))$ genetic algorithm is one of the few algorithms for which a super-constant speed-up through the use of crossover could be proven. So far, this algorithm has been used with parameter settings based on intuitive considerations. In this work, we rigorously analyze the whole parameter space and show that the asymptotic time complexity proven by Doerr and Doerr (GECCO 2015) for the intuitive choice is best possible among all settings for population size, mutation probability, and crossover bias. |
Tasks | |
Published | 2016-04-04 |
URL | http://arxiv.org/abs/1604.01088v2 |
http://arxiv.org/pdf/1604.01088v2.pdf | |
PWC | https://paperswithcode.com/paper/optimal-parameter-settings-for-the-1-genetic |
Repo | |
Framework | |
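The algorithm analyzed here is concrete enough to sketch on OneMax, using the standard coupled parameters p = λ/n (mutation rate) and c = 1/λ (crossover bias) that the intuitive choice prescribes. The acceptance rule and evaluation budget below are simplifications for illustration.

```python
import random

def one_max(x):
    return sum(x)

def ollga(n, lam, max_evals=100_000, seed=1):
    """(1+(lambda,lambda)) GA on OneMax.  Mutation phase: lambda offspring
    each flip the same random number ell ~ Bin(n, lambda/n) of bits.
    Crossover phase: lambda offspring take each bit from the mutation
    winner with probability 1/lambda, otherwise from the parent."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    p, c = lam / n, 1.0 / lam
    evals = 0
    while one_max(x) < n and evals < max_evals:
        ell = sum(rng.random() < p for _ in range(n))
        mutants = []
        for _ in range(lam):
            y = x[:]
            for i in rng.sample(range(n), ell):
                y[i] = 1 - y[i]
            mutants.append(y)
        xprime = max(mutants, key=one_max)
        crossed = [[b if rng.random() < c else a for a, b in zip(x, xprime)]
                   for _ in range(lam)]
        ybest = max(crossed, key=one_max)
        if one_max(ybest) >= one_max(x):   # elitist-style acceptance
            x = ybest
        evals += 2 * lam
    return x, evals
```

The point of the two-phase design is that an aggressive mutation rate can uncover good bits, and the biased crossover then repairs the collateral damage by copying only a 1/λ fraction of the mutant back into the parent.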
Human collective intelligence as distributed Bayesian inference
Title | Human collective intelligence as distributed Bayesian inference |
Authors | Peter M. Krafft, Julia Zheng, Wei Pan, Nicolás Della Penna, Yaniv Altshuler, Erez Shmueli, Joshua B. Tenenbaum, Alex Pentland |
Abstract | Collective intelligence is believed to underlie the remarkable success of human society. The formation of accurate shared beliefs is one of the key components of human collective intelligence. How are accurate shared beliefs formed in groups of fallible individuals? Answering this question requires a multiscale analysis. We must understand both the individual decision mechanisms people use, and the properties and dynamics of those mechanisms in the aggregate. As of yet, mathematical tools for such an approach have been lacking. To address this gap, we introduce a new analytical framework: We propose that groups arrive at accurate shared beliefs via distributed Bayesian inference. Distributed inference occurs through information processing at the individual level, and yields rational belief formation at the group level. We instantiate this framework in a new model of human social decision-making, which we validate using a dataset we collected of over 50,000 users of an online social trading platform where investors mimic each other's trades using real money in foreign exchange and other asset markets. We find that in this setting people use a decision mechanism in which popularity is treated as a prior distribution over which decisions are best. This mechanism is boundedly rational at the individual level, but we prove that in the aggregate it implements a type of approximate “Thompson sampling”, a well-known and highly effective single-agent Bayesian machine learning algorithm for sequential decision-making. The perspective of distributed Bayesian inference therefore reveals how collective rationality emerges from the boundedly rational decision mechanisms people use. |
Tasks | Bayesian Inference, Decision Making |
Published | 2016-08-05 |
URL | http://arxiv.org/abs/1608.01987v1 |
http://arxiv.org/pdf/1608.01987v1.pdf | |
PWC | https://paperswithcode.com/paper/human-collective-intelligence-as-distributed |
Repo | |
Framework | |
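The single-agent algorithm the abstract refers to is standard and easy to sketch: Beta-Bernoulli Thompson sampling, where each round an action's success probability is drawn from its Beta posterior and the action with the largest draw is taken. This is the textbook algorithm, not the paper's distributed group-level variant.

```python
import random

def thompson_sampling(true_probs, horizon=2000, seed=0):
    """Beta-Bernoulli Thompson sampling on a k-armed bandit: sample a
    success probability from each arm's Beta posterior, pull the arm
    with the largest draw, then update that arm's posterior."""
    rng = random.Random(seed)
    k = len(true_probs)
    alpha, beta = [1] * k, [1] * k          # uniform Beta(1, 1) priors
    pulls = [0] * k
    for _ in range(horizon):
        draws = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: draws[i])
        reward = rng.random() < true_probs[arm]
        alpha[arm] += reward                # success count
        beta[arm] += 1 - reward             # failure count
        pulls[arm] += 1
    return pulls
```

Posterior sampling naturally balances exploration and exploitation: uncertain arms occasionally produce large draws and get tried, while clearly inferior arms are pulled less and less often as evidence accumulates.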