May 6, 2019

Paper Group ANR 214

Inferring Fine-grained Details on User Activities and Home Location from Social Media: Detecting Drinking-While-Tweeting Patterns in Communities

Title Inferring Fine-grained Details on User Activities and Home Location from Social Media: Detecting Drinking-While-Tweeting Patterns in Communities
Authors Nabil Hossain, Tianran Hu, Roghayeh Feizi, Ann Marie White, Jiebo Luo, Henry Kautz
Abstract Nearly all previous work on geo-locating latent states and activities from social media confounds general discussions about activities, self-reports of users participating in those activities at times in the past or future, and self-reports made at the immediate time and place the activity occurs. Activities, such as alcohol consumption, may occur at different places and types of places, and it is important not only to detect the local regions where these activities occur, but also to analyze the degree of participation in them by local residents. In this paper, we develop new machine learning based methods for fine-grained localization of activities and home locations from Twitter data. We apply these methods to discover and compare alcohol consumption patterns in a large urban area, New York City, and a more suburban and rural area, Monroe County. We find positive correlations between the rate of alcohol consumption reported among a community’s Twitter users and the density of alcohol outlets, demonstrating that the degree of correlation varies significantly between urban and suburban areas. While our experiments are focused on alcohol use, our methods for locating homes and distinguishing temporally-specific self-reports are applicable to a broad range of behaviors and latent states.
Tasks
Published 2016-03-10
URL http://arxiv.org/abs/1603.03181v1
PDF http://arxiv.org/pdf/1603.03181v1.pdf
PWC https://paperswithcode.com/paper/inferring-fine-grained-details-on-user
Repo
Framework
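The home-location idea in the abstract can be illustrated with a simple stand-in heuristic (not the paper's actual machine learning model): snap a user's geotagged late-night posts to a coordinate grid and take the most frequent cell as the likely home. All data below is made up.

```python
from collections import Counter

def infer_home(posts, cell=0.001):
    # stand-in heuristic: the most common grid cell among geotagged
    # posts made late at night is taken as the user's home location
    night = [(lat, lon) for hour, lat, lon in posts if hour >= 22 or hour < 6]
    cells = Counter((round(lat / cell), round(lon / cell)) for lat, lon in night)
    (cy, cx), _ = cells.most_common(1)[0]
    return cy * cell, cx * cell

posts = [
    (23, 43.1612, -77.6109),   # (hour, lat, lon): late-night posts near "home"
    (2,  43.1613, -77.6108),
    (1,  43.1612, -77.6110),
    (14, 43.1300, -77.6400),   # daytime post elsewhere (ignored)
]
home = infer_home(posts)
```

Here the three night posts fall into one 0.001-degree cell, so that cell wins regardless of where daytime activity happens.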

Video Fill in the Blank with Merging LSTMs

Title Video Fill in the Blank with Merging LSTMs
Authors Amir Mazaheri, Dong Zhang, Mubarak Shah
Abstract Given a video and its incomplete textual description with missing words, the Video-Fill-in-the-Blank (ViFitB) task is to automatically find the missing word. The contextual information of the sentences is important to infer the missing words; the visual cues are even more crucial for an accurate inference. In this paper, we present a new method which intuitively takes advantage of the structure of the sentences and employs merging LSTMs (to merge two LSTMs) to tackle the problem with embedded textual and visual cues. In the experiments, we demonstrate the superior performance of the proposed method on the challenging “Movie Fill-in-the-Blank” dataset.
Tasks
Published 2016-10-13
URL http://arxiv.org/abs/1610.04062v1
PDF http://arxiv.org/pdf/1610.04062v1.pdf
PWC https://paperswithcode.com/paper/video-fill-in-the-blank-with-merging-lstms
Repo
Framework
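The merging idea can be sketched with a minimal numpy LSTM: one encoder reads the words before the blank, another reads the words after it, and their hidden states are merged (elementwise max is one of several possible merge operations) before scoring vocabulary candidates. This is a toy illustration, not the paper's architecture; the embeddings and weights are random.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W):
    # one LSTM step; W maps the concatenated [x; h] to the four gates
    z = W @ np.concatenate([x, h])
    i, f, o, g = np.split(z, 4)
    sig = lambda t: 1.0 / (1.0 + np.exp(-t))
    c = sig(f) * c + sig(i) * np.tanh(g)
    return sig(o) * np.tanh(c), c

def encode(seq, W, d):
    h = c = np.zeros(d)
    for x in seq:
        h, c = lstm_step(x, h, c, W)
    return h

d, vocab = 8, 5
E = rng.normal(size=(vocab, d))          # toy word embeddings
W = 0.1 * rng.normal(size=(4 * d, 2 * d))

left_ctx  = [E[1], E[2]]                 # words before the blank
right_ctx = [E[4], E[3]]                 # words after the blank, reversed

# merge the two context encoders, then score every vocabulary word
merged = np.maximum(encode(left_ctx, W, d), encode(right_ctx, W, d))
scores = E @ merged
guess = int(np.argmax(scores))
```

With trained weights and real embeddings, `guess` would be the predicted missing word.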

WordNet2Vec: Corpora Agnostic Word Vectorization Method

Title WordNet2Vec: Corpora Agnostic Word Vectorization Method
Authors Roman Bartusiak, Łukasz Augustyniak, Tomasz Kajdanowicz, Przemysław Kazienko, Maciej Piasecki
Abstract The complex nature of big data resources demands new structuring methods, especially for textual content. WordNet is a good knowledge source for comprehensive abstraction of natural language, since good implementations of it exist for many languages. Since WordNet embeds natural language in the form of a complex network, a transformation mechanism, WordNet2Vec, is proposed in this paper. It creates a vector for each word in WordNet. These vectors encapsulate the general position, or role, of a given word with respect to all other words in the natural language. Any list or set of such vectors carries knowledge about the context of its components within the whole language. Such a word representation can be easily applied to many analytic tasks, like classification or clustering. The usefulness of the WordNet2Vec method was demonstrated in sentiment analysis, i.e., classification with transfer learning on a real Amazon opinion dataset.
Tasks Sentiment Analysis, Transfer Learning
Published 2016-06-10
URL http://arxiv.org/abs/1606.03335v1
PDF http://arxiv.org/pdf/1606.03335v1.pdf
PWC https://paperswithcode.com/paper/wordnet2vec-corpora-agnostic-word
Repo
Framework
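The core transformation — a word's vector is its position relative to every other word in the lexical network — can be sketched with breadth-first-search distances on a tiny hand-made graph. The edges below are hypothetical stand-ins for WordNet relations; the paper's actual transformation operates on the full WordNet network.

```python
from collections import deque

# toy lexical graph standing in for WordNet (hypothetical edges)
graph = {
    "dog": ["animal", "pet"], "cat": ["animal", "pet"],
    "animal": ["dog", "cat", "organism"], "pet": ["dog", "cat"],
    "organism": ["animal"], "car": ["vehicle"], "vehicle": ["car"],
}

def wordnet2vec(word, vocab, graph):
    # vector component for w = transformed shortest-path distance word -> w
    dist = {word: 0}
    q = deque([word])
    while q:
        u = q.popleft()
        for v in graph.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    # unreachable words contribute 0 (no relation in the network)
    return [1.0 / (1 + dist[w]) if w in dist else 0.0 for w in vocab]

vocab = sorted(graph)
v_dog, v_cat, v_car = (wordnet2vec(w, vocab, graph) for w in ("dog", "cat", "car"))
```

Words close in the network ("dog", "cat") get similar vectors, while unrelated words ("car") share no support — which is what makes the vectors usable for downstream classification or clustering without any corpus.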

Semantic Scan: Detecting Subtle, Spatially Localized Events in Text Streams

Title Semantic Scan: Detecting Subtle, Spatially Localized Events in Text Streams
Authors Abhinav Maurya, Kenton Murray, Yandong Liu, Chris Dyer, William W. Cohen, Daniel B. Neill
Abstract Early detection and precise characterization of emerging topics in text streams can be highly useful in applications such as timely and targeted public health interventions and discovering evolving regional business trends. Many methods have been proposed for detecting emerging events in text streams using topic modeling. However, these methods have numerous shortcomings that make them unsuitable for rapid detection of locally emerging events on massive text streams. In this paper, we describe Semantic Scan (SS), which has been developed specifically to overcome these shortcomings in detecting new spatially compact events in text streams. Semantic Scan integrates novel contrastive topic modeling with online document assignment and principled likelihood ratio-based spatial scanning to identify emerging events with unexpected patterns of keywords hidden in text streams. This enables more timely and accurate detection and characterization of anomalous, spatially localized emerging events. Semantic Scan does not require manual intervention or labeled training data, and is robust to noise in real-world text data since it identifies anomalous text patterns that occur in a cluster of new documents rather than an anomaly in a single new document. We compare Semantic Scan to alternative state-of-the-art methods such as Topics over Time, Online LDA, and Labeled LDA on two real-world tasks: (i) a disease surveillance task monitoring free-text Emergency Department chief complaints in Allegheny County, and (ii) an emerging business trend detection task based on Yelp reviews. On both tasks, we find that Semantic Scan provides significantly better event detection and characterization accuracy than competing approaches, while providing up to an order of magnitude speedup.
Tasks
Published 2016-02-13
URL http://arxiv.org/abs/1602.04393v1
PDF http://arxiv.org/pdf/1602.04393v1.pdf
PWC https://paperswithcode.com/paper/semantic-scan-detecting-subtle-spatially
Repo
Framework
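The likelihood-ratio spatial scan at the heart of the method can be sketched with a Kulldorff-style score over a toy grid. For brevity this scans single cells only, whereas the actual method scans spatially compact regions of multiple locations and combines the scan with topic assignments; the counts below are made up.

```python
import math

# toy grid: baseline document counts vs. observed counts per location
baseline = {(0, 0): 10, (0, 1): 10, (1, 0): 10, (1, 1): 10}
observed = {(0, 0): 10, (0, 1): 11, (1, 0): 9,  (1, 1): 25}  # injected cluster

def llr(c, e, C):
    # Kulldorff-style log-likelihood ratio: region count c vs. expectation e,
    # out of C total observed documents; only excesses score positively
    if c <= e:
        return 0.0
    return c * math.log(c / e) + (C - c) * math.log((C - c) / (C - e))

C = sum(observed.values())
B = sum(baseline.values())
scores = {}
for region in observed:                 # single-cell regions, for brevity
    e = C * baseline[region] / B        # expected count under "no event"
    scores[region] = llr(observed[region], e, C)
hotspot = max(scores, key=scores.get)
```

The injected cluster at (1, 1) is the only region with an excess over its baseline expectation, so it gets the only positive score.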

Anomaly Detection with the Voronoi Diagram Evolutionary Algorithm

Title Anomaly Detection with the Voronoi Diagram Evolutionary Algorithm
Authors Luis Martí, Arsène Fansi-Tchango, Laurent Navarro, Marc Schoenauer
Abstract This paper presents the Voronoi diagram-based evolutionary algorithm (VorEAl). VorEAl partitions the input space into abnormal/normal subsets using Voronoi diagrams. The diagrams are evolved using a multi-objective bio-inspired approach in order to jointly optimize classification metrics while also being able to represent areas of the data space that are not present in the training dataset. As part of the paper, VorEAl is experimentally validated and contrasted with similar approaches.
Tasks Anomaly Detection
Published 2016-10-27
URL http://arxiv.org/abs/1610.08640v1
PDF http://arxiv.org/pdf/1610.08640v1.pdf
PWC https://paperswithcode.com/paper/anomaly-detection-with-the-voronoi-diagram
Repo
Framework
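The Voronoi-partition classifier itself is simple to sketch: a point's label is the label of its nearest seed, i.e. of the Voronoi cell containing it. The seeds below are fixed by hand for illustration; in VorEAl their positions and labels are what the evolutionary algorithm optimizes.

```python
# labeled Voronoi seeds (hand-picked here; VorEAl evolves these)
seeds = [((0.0, 0.0), "normal"), ((1.0, 1.0), "normal"), ((5.0, 5.0), "abnormal")]

def classify(point, seeds):
    # label of the Voronoi cell containing the point = label of nearest seed
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return min(seeds, key=lambda s: d2(point, s[0]))[1]
```

Because cells can be labeled "abnormal" even where no training data exists, the partition can represent unseen regions of the space, which is the property the abstract highlights.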

Phase Retrieval Meets Statistical Learning Theory: A Flexible Convex Relaxation

Title Phase Retrieval Meets Statistical Learning Theory: A Flexible Convex Relaxation
Authors Sohail Bahmani, Justin Romberg
Abstract We propose a flexible convex relaxation for the phase retrieval problem that operates in the natural domain of the signal. Therefore, we avoid the prohibitive computational cost associated with “lifting” and semidefinite programming (SDP) in methods such as PhaseLift and compete with recently developed non-convex techniques for phase retrieval. We relax the quadratic equations for phaseless measurements to inequality constraints, each of which represents a symmetric “slab”. Through a simple convex program, our proposed estimator finds an extreme point of the intersection of these slabs that is best aligned with a given anchor vector. We characterize geometric conditions that certify success of the proposed estimator. Furthermore, using classic results in statistical learning theory, we show that for random measurements the geometric certificates hold with high probability at an optimal sample complexity. The phase transition of our estimator is evaluated through simulations. Our numerical experiments also suggest that the proposed method can solve phase retrieval problems with coded diffraction measurements as well.
Tasks
Published 2016-10-13
URL http://arxiv.org/abs/1610.04210v2
PDF http://arxiv.org/pdf/1610.04210v2.pdf
PWC https://paperswithcode.com/paper/phase-retrieval-meets-statistical-learning
Repo
Framework
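The estimator's geometry — find the extreme point of the slab intersection best aligned with an anchor — can be sketched in 2-D by brute-force vertex enumeration, since a linear objective over a polytope is maximized at a vertex. This stands in for the paper's convex program and is only practical in tiny dimensions; the measurements and anchor below are synthetic.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
x_true = np.array([1.0, -0.5])            # signal to recover (up to sign)
A = rng.normal(size=(4, 2))               # measurement vectors a_i
s = np.abs(A @ x_true)                    # slab half-widths |<a_i, x_true>|

# each slab |<a_i, x>| <= s_i becomes two half-planes: +-a_i^T x <= s_i
G = np.vstack([A, -A])
h = np.concatenate([s, s])
anchor = x_true + 0.1 * rng.normal(size=2)   # noisy anchor vector

# enumerate vertices: intersections of two constraint lines that are feasible,
# and keep the one best aligned with the anchor
best, best_val = None, -np.inf
for i, j in combinations(range(len(G)), 2):
    M = G[[i, j]]
    if abs(np.linalg.det(M)) < 1e-9:
        continue                          # parallel faces, no vertex
    v = np.linalg.solve(M, h[[i, j]])
    if np.all(G @ v <= h + 1e-9) and anchor @ v > best_val:
        best, best_val = v, anchor @ v
```

Since every slab is tight at `x_true`, the true signal is itself a vertex of the intersection, so the enumeration's best alignment score is at least that of `x_true`.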

Deep Kinematic Pose Regression

Title Deep Kinematic Pose Regression
Authors Xingyi Zhou, Xiao Sun, Wei Zhang, Shuang Liang, Yichen Wei
Abstract Learning articulated object pose is inherently difficult because the pose is high dimensional but has many structural constraints. Most existing works do not model such constraints and do not guarantee the geometric validity of their pose estimates, therefore requiring a post-processing step to recover the correct geometry if desired, which is cumbersome and sub-optimal. In this work, we propose to directly embed a kinematic object model into deep neural network learning for general articulated object pose estimation. The kinematic function is defined on the appropriately parameterized object motion variables. It is differentiable and can be used in gradient-descent-based optimization during network training. The prior knowledge of the object’s geometric model is fully exploited and the structure is guaranteed to be valid. We show convincing experimental results on a toy example and the 3D human pose estimation problem. For the latter we achieve state-of-the-art results on the Human3.6M dataset.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2016-09-17
URL http://arxiv.org/abs/1609.05317v1
PDF http://arxiv.org/pdf/1609.05317v1.pdf
PWC https://paperswithcode.com/paper/deep-kinematic-pose-regression
Repo
Framework
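The kinematic-layer idea can be sketched without any deep learning framework: a differentiable forward-kinematics function maps joint angles to joint positions, so anything that predicts angles yields geometrically valid poses by construction (fixed bone lengths). This toy 2-D chain is an illustration, not the paper's network or parameterization.

```python
import numpy as np

def forward_kinematics(angles, lengths):
    # 2-D kinematic chain: each joint angle is relative to its parent link,
    # so structural constraints (fixed bone lengths) hold by construction
    pts, theta, p = [np.zeros(2)], 0.0, np.zeros(2)
    for a, l in zip(angles, lengths):
        theta += a                                   # accumulate rotations
        p = p + l * np.array([np.cos(theta), np.sin(theta)])
        pts.append(p.copy())
    return np.array(pts)                             # joint positions, root first

joints = forward_kinematics([np.pi / 4, -np.pi / 8, 0.0], [1.0, 0.8, 0.5])
```

Because every operation is smooth, gradients of a position loss flow back to the angle parameters, which is what lets the kinematic model sit inside gradient-descent training.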

Efficient Dictionary Learning with Sparseness-Enforcing Projections

Title Efficient Dictionary Learning with Sparseness-Enforcing Projections
Authors Markus Thom, Matthias Rapp, Günther Palm
Abstract Learning dictionaries suitable for sparse coding instead of using engineered bases has proven effective in a variety of image processing tasks. This paper studies the optimization of dictionaries on image data where the representation is enforced to be explicitly sparse with respect to a smooth, normalized sparseness measure. This involves the computation of Euclidean projections onto level sets of the sparseness measure. While previous algorithms for this optimization problem had at least quasi-linear time complexity, here the first algorithm with linear time complexity and constant space complexity is proposed. The key for this is the mathematically rigorous derivation of a characterization of the projection’s result based on a soft-shrinkage function. This theory is applied in an original algorithm called Easy Dictionary Learning (EZDL), which learns dictionaries with a simple and fast-to-compute Hebbian-like learning rule. The new algorithm is efficient, expressive and particularly simple to implement. It is demonstrated that despite its simplicity, the proposed learning algorithm is able to generate a rich variety of dictionaries, in particular a topographic organization of atoms or separable atoms. Further, the dictionaries are as expressive as those of benchmark learning algorithms in terms of the reproduction quality on entire images, and result in an equivalent denoising performance. EZDL learns approximately 30 % faster than the already very efficient Online Dictionary Learning algorithm, and is therefore eligible for rapid data set analysis and problems with vast quantities of learning samples.
Tasks Denoising, Dictionary Learning
Published 2016-04-16
URL http://arxiv.org/abs/1604.04767v1
PDF http://arxiv.org/pdf/1604.04767v1.pdf
PWC https://paperswithcode.com/paper/efficient-dictionary-learning-with-sparseness
Repo
Framework
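The projection the abstract describes — onto a level set of a normalized sparseness measure, characterized via soft-shrinkage — can be sketched with Hoyer's sparseness measure and a bisection on the shrinkage threshold. Note this simple bisection is not the paper's linear-time algorithm; it only illustrates the soft-shrinkage characterization.

```python
import numpy as np

def sparseness(x):
    # Hoyer's normalized sparseness measure in [0, 1]
    n, l2 = x.size, np.linalg.norm(x)
    if l2 == 0:
        return 1.0          # treat the zero vector as maximally sparse
    return (np.sqrt(n) - np.abs(x).sum() / l2) / (np.sqrt(n) - 1)

def project(x, target, iters=60):
    # soft-shrinkage with threshold t; larger t -> sparser result,
    # so bisect t until the target sparseness is hit
    lo, hi = 0.0, np.abs(x).max()
    for _ in range(iters):
        t = 0.5 * (lo + hi)
        y = np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
        if sparseness(y) < target:
            lo = t          # not sparse enough yet: shrink harder
        else:
            hi = t
    y = np.sign(x) * np.maximum(np.abs(x) - hi, 0.0)
    return y / np.linalg.norm(y)   # renormalize onto the unit sphere

rng = np.random.default_rng(0)
x = rng.normal(size=64)
y = project(x, 0.8)
```

Sparseness increases monotonically with the threshold, which is why a simple bisection converges; the paper's contribution is doing the exact projection in linear time and constant space.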

Classifiers for centrality determination in proton-nucleus and nucleus-nucleus collisions

Title Classifiers for centrality determination in proton-nucleus and nucleus-nucleus collisions
Authors Igor Altsybeev, Vladimir Kovalenko
Abstract Centrality, as a geometrical property of the collision, is crucial for the physical interpretation of nucleus-nucleus and proton-nucleus experimental data. However, it cannot be accessed directly in event-by-event data analysis. Common methods for centrality estimation in A-A and p-A collisions usually rely on a single detector (either the signal in zero-degree calorimeters or the multiplicity in some semi-central rapidity range). In the present work, we attempt to develop an approach for centrality determination that is based on machine-learning techniques and utilizes information from several detector subsystems simultaneously. Different event classifiers are suggested and evaluated for their selectivity power in terms of the number of participant nucleons and the impact parameter of the collision. Finer centrality resolution may help reduce the impact of so-called volume fluctuations on physical observables studied in heavy-ion experiments such as ALICE at the LHC and the fixed-target experiment NA61/SHINE at the SPS.
Tasks
Published 2016-11-30
URL http://arxiv.org/abs/1612.00312v1
PDF http://arxiv.org/pdf/1612.00312v1.pdf
PWC https://paperswithcode.com/paper/classifiers-for-centrality-determination-in
Repo
Framework
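The benefit of combining detector subsystems can be illustrated with a synthetic example: simulate an impact parameter, two correlated detector signals (a multiplicity-like and a zero-degree-like observable — both toy models, not any experiment's real response), and compare the residual of a single-detector fit against a combined fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
b = rng.uniform(0, 15, n)                            # impact parameter (fm), toy
mult = 400 * np.exp(-b / 4) + rng.normal(0, 15, n)   # multiplicity-like signal
zdc  = 30 * b + rng.normal(0, 40, n)                 # zero-degree-like signal

def fit_residual(X, y):
    # least-squares fit with intercept; return RMS training residual
    X1 = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return np.sqrt(np.mean((y - X1 @ coef) ** 2))

r_mult = fit_residual(mult[:, None], b)                       # one subsystem
r_both = fit_residual(np.column_stack([mult, zdc]), b)        # two subsystems
```

The combined fit nests the single-detector one, so its residual can only improve; with an informative second subsystem the improvement is strict, mirroring the finer centrality resolution the abstract targets.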

Context-dependent feature analysis with random forests

Title Context-dependent feature analysis with random forests
Authors Antonio Sutera, Gilles Louppe, Vân Anh Huynh-Thu, Louis Wehenkel, Pierre Geurts
Abstract In many cases, feature selection is more complicated than identifying a single subset of input variables that would together explain the output. There may be interactions that depend on contextual information, i.e., variables that turn out to be relevant only in specific circumstances. In this setting, the contribution of this paper is to extend the random forest variable importances framework in order (i) to identify variables whose relevance is context-dependent and (ii) to characterize as precisely as possible the effect of contextual information on these variables. The usage and relevance of our framework for highlighting context-dependent variables are illustrated on both artificial and real datasets.
Tasks Feature Selection
Published 2016-05-12
URL http://arxiv.org/abs/1605.03848v1
PDF http://arxiv.org/pdf/1605.03848v1.pdf
PWC https://paperswithcode.com/paper/context-dependent-feature-analysis-with
Repo
Framework
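What "context-dependent relevance" means can be shown on synthetic data: a variable that drives the output only when a contextual variable takes one value. The crude per-context correlation score below is a stand-in for the paper's conditional variable-importance machinery.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
ctx = rng.integers(0, 2, n)          # contextual variable
x1, x2 = rng.normal(size=(2, n))
y = np.where(ctx == 0, x1, x2) + 0.1 * rng.normal(size=n)

def importance(x, y, mask):
    # crude context-conditional relevance: |corr(x, y)| within one context
    return abs(np.corrcoef(x[mask], y[mask])[0, 1])

imp_x1_c0 = importance(x1, y, ctx == 0)   # x1 drives y only when ctx == 0
imp_x1_c1 = importance(x1, y, ctx == 1)
```

A global (unconditional) importance would average the two regimes and understate x1's role; splitting by context exposes it, which is the phenomenon the framework formalizes for random forests.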

How Much Did it Rain? Predicting Real Rainfall Totals Based on Radar Data

Title How Much Did it Rain? Predicting Real Rainfall Totals Based on Radar Data
Authors Adam Lesnikowski
Abstract We applied a variety of parametric and non-parametric machine learning models to predict the probability distribution of rainfall based on 1M training examples over a single year across several U.S. states. Our top performing model based on a squared loss objective was a cross-validated parametric k-nearest-neighbor predictor that took about six days to compute, and was competitive in a world-wide competition.
Tasks
Published 2016-08-06
URL http://arxiv.org/abs/1608.02126v1
PDF http://arxiv.org/pdf/1608.02126v1.pdf
PWC https://paperswithcode.com/paper/how-much-did-it-rain-predicting-real-rainfall
Repo
Framework
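Since the abstract's top performer is a k-nearest-neighbor predictor, a minimal k-NN regressor makes the idea concrete. The radar "features" and rainfall values below are invented for illustration, not from the competition data.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # mean rainfall of the k training points closest to x in feature space
    d = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argsort(d)[:k]].mean()

# toy radar features (reflectivity, echo-top height) -> rainfall (mm)
X = np.array([[10.0, 2.0], [12.0, 2.5], [11.0, 2.2], [40.0, 8.0], [42.0, 8.5]])
y = np.array([1.0, 1.5, 1.2, 20.0, 22.0])
pred = knn_predict(X, y, np.array([11.0, 2.3]))
```

The query lands among the three light-rain examples, so the prediction is their mean; cross-validating k (as the paper does) trades off this local averaging against noise.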

Pessimistic Uplift Modeling

Title Pessimistic Uplift Modeling
Authors Atef Shaar, Talel Abdessalem, Olivier Segard
Abstract Uplift modeling is a machine learning technique that aims to model the heterogeneity of treatment effects. It has been used in the business and health sectors to predict the effect of a specific action on a given individual. Despite its advantages, uplift models show high sensitivity to noise and disturbance, which leads to unreliable results. In this paper we review different approaches to the problem of uplift modeling and demonstrate how disturbance in the data can affect uplift measurement. We propose a new approach, which we call Pessimistic Uplift Modeling, that minimizes disturbance effects. We compared our approach with existing uplift methods on simulated and real datasets. The experiments show that our approach outperforms the existing approaches, especially in high-noise data environments.
Tasks
Published 2016-03-31
URL http://arxiv.org/abs/1603.09738v2
PDF http://arxiv.org/pdf/1603.09738v2.pdf
PWC https://paperswithcode.com/paper/pessimistic-uplift-modeling
Repo
Framework
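The basic uplift estimate — the difference in response rate between treated and control individuals within a segment — can be sketched on synthetic data where the true effect is known. This shows only the standard baseline the paper builds on; the "pessimistic" correction itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x = rng.integers(0, 2, n)                 # one binary feature (segment)
treated = rng.integers(0, 2, n)           # randomized treatment assignment
p = 0.2 + 0.3 * treated * x               # treatment helps only segment x == 1
y = (rng.random(n) < p).astype(float)     # binary response

def uplift_by_segment(seg):
    # difference in mean response between treated and control, per segment
    m = x == seg
    return y[m & (treated == 1)].mean() - y[m & (treated == 0)].mean()

u0, u1 = uplift_by_segment(0), uplift_by_segment(1)
```

The estimate recovers zero uplift in segment 0 and roughly +0.3 in segment 1; with smaller samples or noisy responses these differences become unstable, which is the sensitivity problem the paper addresses.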

Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding

Title Composing Music with Grammar Argumented Neural Networks and Note-Level Encoding
Authors Zheng Sun, Jiaqi Liu, Zewang Zhang, Jingwen Chen, Zhao Huo, Ching Hua Lee, Xiao Zhang
Abstract Creating aesthetically pleasing pieces of art, including music, has been a long-term goal for artificial intelligence research. Despite recent successes of long short-term memory (LSTM) recurrent neural networks (RNNs) in sequential learning, LSTM neural networks have not, by themselves, been able to generate natural-sounding music conforming to music theory. To transcend this inadequacy, we put forward a novel method for music composition that combines LSTMs with grammars motivated by music theory. The main tenets of music theory are encoded as grammar argumented (GA) filters on the training data, such that the machine can be trained to generate music inheriting the naturalness of human-composed pieces from the original dataset while adhering to the rules of music theory. Unlike previous approaches, pitches and durations are encoded as one semantic entity, which we refer to as note-level encoding. This allows easy implementation of music theory grammars, as well as closer emulation of the thinking pattern of a musician. Although the GA rules are applied to the training data and never directly to the LSTM music generation, our machine still composes music that possesses high incidences of diatonic scale notes, small pitch intervals and chords, in deference to music theory.
Tasks Music Generation
Published 2016-11-16
URL http://arxiv.org/abs/1611.05416v2
PDF http://arxiv.org/pdf/1611.05416v2.pdf
PWC https://paperswithcode.com/paper/composing-music-with-grammar-argumented
Repo
Framework

Song From PI: A Musically Plausible Network for Pop Music Generation

Title Song From PI: A Musically Plausible Network for Pop Music Generation
Authors Hang Chu, Raquel Urtasun, Sanja Fidler
Abstract We present a novel framework for generating pop music. Our model is a hierarchical Recurrent Neural Network, where the layers and the structure of the hierarchy encode our prior knowledge about how pop music is composed. In particular, the bottom layers generate the melody, while the higher levels produce the drums and chords. We conduct several human studies that show strong preference of our generated music over that produced by the recent method by Google. We additionally show two applications of our framework: neural dancing and karaoke, as well as neural story singing.
Tasks Music Generation
Published 2016-11-10
URL http://arxiv.org/abs/1611.03477v1
PDF http://arxiv.org/pdf/1611.03477v1.pdf
PWC https://paperswithcode.com/paper/song-from-pi-a-musically-plausible-network
Repo
Framework

The Search Problem in Mixture Models

Title The Search Problem in Mixture Models
Authors Avik Ray, Joe Neeman, Sujay Sanghavi, Sanjay Shakkottai
Abstract We consider the task of learning the parameters of a single component of a mixture model when we are given side information about that component; we call this the “search problem” in mixture models. We would like to solve this with computational and sample complexity lower than that of solving the overall original problem, where one learns the parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and improved computational complexity compared to existing moment-based mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvement in runtime and accuracy.
Tasks Topic Models
Published 2016-10-04
URL http://arxiv.org/abs/1610.00843v2
PDF http://arxiv.org/pdf/1610.00843v2.pdf
PWC https://paperswithcode.com/paper/the-search-problem-in-mixture-models
Repo
Framework