Paper Group ANR 346
A Probabilistic-Based Model for Binary CSP
Title | A Probabilistic-Based Model for Binary CSP |
Authors | Amine Balafrej, Xavier Lorca, Charlotte Truchet |
Abstract | This work introduces a probabilistic-based model for binary CSP that provides a fine-grained analysis of its internal structure. Assuming that a domain modification could occur in the CSP, it shows how to express, in a predictive way, the probability that a domain value becomes inconsistent; then it expresses the expectation of the number of arc-inconsistent values in each domain of the constraint network. Thus, it expresses the expectation of the number of arc-inconsistent values for the whole constraint network. Next, it provides bounds for each of these three probabilistic indicators. Finally, a polytime algorithm, which propagates the probabilistic information, is presented. |
Tasks | |
Published | 2016-06-13 |
URL | http://arxiv.org/abs/1606.03894v1 |
http://arxiv.org/pdf/1606.03894v1.pdf | |
PWC | https://paperswithcode.com/paper/a-probabilistic-based-model-for-binary-csp |
Repo | |
Framework | |
Automatic time-series phenotyping using massive feature extraction
Title | Automatic time-series phenotyping using massive feature extraction |
Authors | Ben D Fulcher, Nick S Jones |
Abstract | Across a far-reaching diversity of scientific and industrial applications, a general key problem involves relating the structure of time-series data to a meaningful outcome, such as detecting anomalous events from sensor recordings, or diagnosing patients from physiological time-series measurements like heart rate or brain activity. Currently, researchers must devote considerable effort to manually devising, or searching for, properties of their time series that are suitable for the particular analysis problem at hand. Addressing this non-systematic and time-consuming procedure, here we introduce a new tool, hctsa, that selects interpretable and useful properties of time series automatically, by comparing implementations of over 7700 time-series features drawn from diverse scientific literatures. Using two exemplar biological applications, we show how hctsa allows researchers to leverage decades of time-series research to quantify and understand informative structure in their time-series data. |
Tasks | Time Series |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05296v1 |
http://arxiv.org/pdf/1612.05296v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-time-series-phenotyping-using |
Repo | |
Framework | |
Flu Detector: Estimating influenza-like illness rates from online user-generated content
Title | Flu Detector: Estimating influenza-like illness rates from online user-generated content |
Authors | Vasileios Lampos |
Abstract | We provide a brief technical description of an online platform for disease monitoring, titled Flu Detector (fludetector.cs.ucl.ac.uk). Flu Detector, in its current version (v.0.5), uses either Twitter or Google search data in conjunction with statistical Natural Language Processing models to estimate the rate of influenza-like illness in the population of England. Its back-end is a live service that collects online data, utilises modern technologies for large-scale text processing, and finally applies statistical inference models that are trained offline. The front-end visualises the various disease rate estimates. Notably, the models based on Google data achieve a high level of accuracy with respect to the most recent four flu seasons in England (2012/13 to 2015/16). This highlights Flu Detector as having great potential to become a complementary source to the domestic traditional flu surveillance schemes. |
Tasks | |
Published | 2016-12-11 |
URL | http://arxiv.org/abs/1612.03494v1 |
http://arxiv.org/pdf/1612.03494v1.pdf | |
PWC | https://paperswithcode.com/paper/flu-detector-estimating-influenza-like |
Repo | |
Framework | |
Community Detection in Political Twitter Networks using Nonnegative Matrix Factorization Methods
Title | Community Detection in Political Twitter Networks using Nonnegative Matrix Factorization Methods |
Authors | Mert Ozer, Nyunsu Kim, Hasan Davulcu |
Abstract | Community detection is a fundamental task in social network analysis. In this paper, we first develop an endorsement-filtered user connectivity network by utilizing Heider’s structural balance theory and certain Twitter triad patterns. Next, we develop three Nonnegative Matrix Factorization frameworks to investigate the contributions of different types of user connectivity and content information in community detection. We show that user content and endorsement-filtered connectivity information are complementary to each other in clustering politically motivated users into pure political communities. Word usage is the strongest indicator of users’ political orientation among all content categories. Incorporating a user-word matrix and a word similarity regularizer provides the missing link in connectivity-only methods, which suffer from detecting an artificially large number of clusters for Twitter networks. |
Tasks | Community Detection |
Published | 2016-08-05 |
URL | http://arxiv.org/abs/1608.01771v1 |
http://arxiv.org/pdf/1608.01771v1.pdf | |
PWC | https://paperswithcode.com/paper/community-detection-in-political-twitter |
Repo | |
Framework | |
Faster Eigenvector Computation via Shift-and-Invert Preconditioning
Title | Faster Eigenvector Computation via Shift-and-Invert Preconditioning |
Authors | Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford |
Abstract | We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$ – i.e. computing a unit vector $x$ such that $x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$: Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^TA$, we show how to compute an $\epsilon$ approximate top eigenvector in time $\tilde O\left(\left[\mathrm{nnz}(A) + \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}\right] \log 1/\epsilon\right)$ and $\tilde O\left(\left[\frac{\mathrm{nnz}(A)^{3/4} (d \cdot \mathrm{sr}(A))^{1/4}}{\sqrt{\mathrm{gap}}}\right] \log 1/\epsilon\right)$. Here $\mathrm{nnz}(A)$ is the number of nonzeros in $A$, $\mathrm{sr}(A)$ is the stable rank, and $\mathrm{gap}$ is the relative eigengap. By separating the $\mathrm{gap}$ dependence from the $\mathrm{nnz}(A)$ term, our first runtime improves upon the classical power and Lanczos methods. It also improves prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on $\mathrm{sr}(A)$ and $\epsilon$. Our second running time improves these further when $\mathrm{nnz}(A) \le \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}$. Online Eigenvector Estimation: Given a distribution $D$ with covariance matrix $\Sigma$ and a vector $x_0$ which is an $O(\mathrm{gap})$ approximate top eigenvector for $\Sigma$, we show how to refine to an $\epsilon$ approximation using $O\left(\frac{\mathrm{var}(D)}{\mathrm{gap} \cdot \epsilon}\right)$ samples from $D$. Here $\mathrm{var}(D)$ is a natural notion of variance. Combining our algorithm with previous work to initialize $x_0$, we obtain improved sample complexity and runtime results under a variety of assumptions on $D$. We achieve our results using a general framework that we believe is of independent interest. We give a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast stochastic variance reduced gradient (SVRG) based system solvers to achieve our claims. |
Tasks | Stochastic Optimization |
Published | 2016-05-26 |
URL | http://arxiv.org/abs/1605.08754v1 |
http://arxiv.org/pdf/1605.08754v1.pdf | |
PWC | https://paperswithcode.com/paper/faster-eigenvector-computation-via-shift-and |
Repo | |
Framework | |
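The reduction named in the abstract above, shift-and-invert preconditioning, can be illustrated with a minimal sketch: run power iteration on $(\lambda I - \Sigma)^{-1}$ for a shift $\lambda$ slightly above $\lambda_1(\Sigma)$, so that each step amounts to approximately solving one linear system. Several things here are assumptions for illustration and not the authors' method: the shift is taken as given, the systems are solved with plain conjugate gradient in place of the SVRG-based solvers, and all function names are hypothetical.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def conjugate_gradient(M, b, iters=50, tol=1e-12):
    """Approximately solve M x = b for a symmetric positive definite M."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    for _ in range(iters):
        if rs < tol:
            break
        Mp = matvec(M, p)
        alpha = rs / dot(p, Mp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * mi for ri, mi in zip(r, Mp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def shift_and_invert_top_eigenvector(sigma, shift, x0, outer_iters=20):
    """Power iteration on (shift * I - sigma)^{-1}; assumes shift > lambda_1(sigma)."""
    d = len(sigma)
    # B is positive definite because the shift exceeds the top eigenvalue (assumed).
    B = [[(shift if i == j else 0.0) - sigma[i][j] for j in range(d)] for i in range(d)]
    x = normalize(x0)
    for _ in range(outer_iters):
        # Each outer step is one approximate linear-system solve.
        x = normalize(conjugate_gradient(B, x))
    return x

def rayleigh_quotient(sigma, x):
    return dot(x, matvec(sigma, x))
```

Because the shifted inverse has eigenvalues $1/(\lambda - \lambda_i)$, a shift close to $\lambda_1$ magnifies the separation between the top eigenvector and the rest, which is why few outer iterations are needed in this sketch.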
Successor Features for Transfer in Reinforcement Learning
Title | Successor Features for Transfer in Reinforcement Learning |
Authors | André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado van Hasselt, David Silver |
Abstract | Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. We propose a transfer framework for the scenario where the reward function changes between tasks but the environment’s dynamics remain the same. Our approach rests on two key ideas: “successor features”, a value function representation that decouples the dynamics of the environment from the rewards, and “generalized policy improvement”, a generalization of dynamic programming’s policy improvement operation that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows the free exchange of information across tasks. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach on firm theoretical ground and present experiments that show that it successfully promotes transfer in practice, significantly outperforming alternative methods in a sequence of navigation tasks and in the control of a simulated robotic arm. |
Tasks | |
Published | 2016-06-16 |
URL | http://arxiv.org/abs/1606.05312v2 |
http://arxiv.org/pdf/1606.05312v2.pdf | |
PWC | https://paperswithcode.com/paper/successor-features-for-transfer-in |
Repo | |
Framework | |
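The two ideas in the abstract above can be combined in a few lines. The sketch below assumes that rewards are linear in some features, so that the value of policy $i$ is $Q_i(s,a) = \psi_i(s,a) \cdot w$ for a task weight vector $w$, and that generalized policy improvement picks the action maximising $\max_i Q_i(s,a)$. The table layout and the name `gpi_action` are hypothetical, and the successor features are taken as already learned.

```python
def gpi_action(psi_per_policy, w):
    """Pick an action by generalized policy improvement in one state.

    psi_per_policy[i][a] is the (assumed, pre-computed) successor-feature
    vector of stored policy i for action a in the current state.
    w is the reward weight vector of the new task.
    """
    best_action, best_value = None, float("-inf")
    for psi in psi_per_policy:
        for action, features in enumerate(psi):
            # Value of following policy i after taking this action, under task w.
            q = sum(f * wi for f, wi in zip(features, w))
            if q > best_value:
                best_action, best_value = action, q
    return best_action, best_value
```

Only `w` changes between tasks in this setting, so the stored successor features can be reused to act on a new task before any further learning.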
Image segmentation based on the hybrid total variation model and the K-means clustering strategy
Title | Image segmentation based on the hybrid total variation model and the K-means clustering strategy |
Authors | Baoli Shi, Zhi-Feng Pang, Jing Xu |
Abstract | The performance of image segmentation relies heavily on the original input image. When the image is contaminated by noise or blur, direct segmentation methods cannot produce an effective segmentation result. In order to efficiently segment the contaminated image, this paper proposes a two-step method based on the hybrid total variation model with a box constraint and the K-means clustering method. In the first step, the hybrid model uses a weighted convex combination of the total variation functional and the high-order total variation as the regularization term to obtain the original clustering data. In order to deal with the non-smooth regularization term, we solve this model by employing the alternating split Bregman method. Then, in the second step, the segmentation is obtained by thresholding this clustering data into different phases, where the thresholds are given by the K-means clustering method. Numerical comparisons show that our proposed model provides more effective segmentation results for noisy and blurred images. |
Tasks | Semantic Segmentation |
Published | 2016-05-30 |
URL | http://arxiv.org/abs/1605.09116v1 |
http://arxiv.org/pdf/1605.09116v1.pdf | |
PWC | https://paperswithcode.com/paper/image-segmentation-based-on-the-hybrid-total |
Repo | |
Framework | |
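The second step described in the abstract above, thresholding the smoothed data with K-means, can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the first step (the hybrid total variation model) is skipped and the input is taken to be already smoothed, thresholds are placed at midpoints between adjacent cluster centres, and all function names are hypothetical.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar intensities; returns sorted cluster centres."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old centre when a cluster receives no points.
        new_centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)

def thresholds_from_centers(centers):
    """Midpoints between adjacent centres (an assumed thresholding rule)."""
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]

def segment(image, thresholds):
    """Label each pixel with the number of thresholds it exceeds."""
    return [[sum(pixel > t for t in thresholds) for pixel in row] for row in image]
```

For example, `segment(image, thresholds_from_centers(kmeans_1d(pixels, 2)))` yields a two-phase labelling, where `pixels` is the flattened list of intensities in `image`.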
Coordinate Friendly Structures, Algorithms and Applications
Title | Coordinate Friendly Structures, Algorithms and Applications |
Authors | Zhimin Peng, Tianyu Wu, Yangyang Xu, Ming Yan, Wotao Yin |
Abstract | This paper focuses on coordinate update methods, which are useful for solving problems involving large or high-dimensional datasets. They decompose a problem into simple subproblems, where each updates one, or a small block of, variables while fixing others. These methods can deal with linear and nonlinear mappings, smooth and nonsmooth functions, as well as convex and nonconvex problems. In addition, they are easy to parallelize. The great performance of coordinate update methods depends on solving simple sub-problems. To derive simple subproblems for several new classes of applications, this paper systematically studies coordinate-friendly operators that perform low-cost coordinate updates. Based on the discovered coordinate friendly operators, as well as operator splitting techniques, we obtain new coordinate update algorithms for a variety of problems in machine learning, image processing, as well as sub-areas of optimization. Several problems are treated with coordinate update for the first time in history. The obtained algorithms are scalable to large instances through parallel and even asynchronous computing. We present numerical examples to illustrate how effective these algorithms are. |
Tasks | |
Published | 2016-01-05 |
URL | http://arxiv.org/abs/1601.00863v3 |
http://arxiv.org/pdf/1601.00863v3.pdf | |
PWC | https://paperswithcode.com/paper/coordinate-friendly-structures-algorithms-and |
Repo | |
Framework | |
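As a minimal illustration of a coordinate update in the sense of the abstract above, the sketch below minimises the quadratic $\frac{1}{2} x^T A x - b^T x$ by updating one variable at a time while the others stay fixed. The problem choice and the name `coordinate_descent` are assumptions for illustration; the paper covers a much broader class of operators and applications.

```python
def coordinate_descent(A, b, sweeps=100):
    """Minimise 0.5 * x^T A x - b^T x for a symmetric positive definite A,
    cycling through the coordinates and fixing all others at each update."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Exact minimiser along coordinate i: set the i-th partial
            # derivative (A x - b)_i to zero and solve for x[i].
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diagonal) / A[i][i]
    return x
```

Each update touches a single row of `A`, which is the low per-update cost that makes such subproblems "simple" in the abstract's sense.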
Automatic Visual Theme Discovery from Joint Image and Text Corpora
Title | Automatic Visual Theme Discovery from Joint Image and Text Corpora |
Authors | Ke Sun, Xianxu Hou, Qian Zhang, Guoping Qiu |
Abstract | A popular approach to semantic image understanding is to manually tag images with keywords and then learn a mapping from visual features to keywords. Manually tagging images is a subjective process and the same or very similar visual contents are often tagged with different keywords. Furthermore, not all tags have the same descriptive power for visual contents, and the large vocabulary available from natural language could result in a very diverse set of keywords. In this paper, we propose an unsupervised visual theme discovery framework as a better (more compact, efficient and effective) alternative to semantic representation of visual contents. We first show that tag-based annotation lacks consistency and compactness for describing visually similar contents. We then learn the visual similarity between tags based on the visual features of the images containing the tags. At the same time, we use a natural language processing technique (word embedding) to measure the semantic similarity between tags. Finally, we cluster tags into visual themes based on their visual similarity and semantic similarity measures using a spectral clustering algorithm. We conduct user studies to evaluate the effectiveness and rationality of the visual themes discovered by our unsupervised algorithm and obtain promising results. We then design three common computer vision tasks, example-based image search, keyword-based image search and image labelling, to explore potential applications of our visual theme discovery framework. In experiments, visual themes significantly outperform tags on semantic image understanding and achieve state-of-the-art performance in all three tasks. This again demonstrates the effectiveness and versatility of the proposed framework. |
Tasks | Image Retrieval, Semantic Similarity, Semantic Textual Similarity |
Published | 2016-09-07 |
URL | http://arxiv.org/abs/1609.01859v1 |
http://arxiv.org/pdf/1609.01859v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-visual-theme-discovery-from-joint |
Repo | |
Framework | |
U-CATCH: Using Color ATtribute of image patCHes in binary descriptors
Title | U-CATCH: Using Color ATtribute of image patCHes in binary descriptors |
Authors | Alisher Abdulkhaev, Ozgur Yilmaz |
Abstract | In this study, we propose a simple yet very effective method for extracting color information through a binary feature description framework. Our method expands the dimension of binary comparisons into RGB and YCbCr spaces, showing more than 100% matching improvement compared to non-color binary descriptors for a wide range of hard-to-match cases. The proposed method is general and can be applied to any binary descriptor to make it color sensitive. It is faster than classical binary descriptors for RGB sampling due to the abandonment of grayscale conversion and has almost identical complexity (insignificant compared to the smoothing operation) for YCbCr sampling. |
Tasks | |
Published | 2016-03-14 |
URL | https://arxiv.org/abs/1603.04408v3 |
https://arxiv.org/pdf/1603.04408v3.pdf | |
PWC | https://paperswithcode.com/paper/u-catch-using-color-attribute-of-image |
Repo | |
Framework | |
Very Efficient Training of Convolutional Neural Networks using Fast Fourier Transform and Overlap-and-Add
Title | Very Efficient Training of Convolutional Neural Networks using Fast Fourier Transform and Overlap-and-Add |
Authors | Tyler Highlander, Andres Rodriguez |
Abstract | Convolutional neural networks (CNNs) are currently state-of-the-art for various classification tasks, but are computationally expensive. Propagating through the convolutional layers is very slow, as each kernel in each layer must sequentially calculate many dot products for a single forward and backward propagation, which equates to $\mathcal{O}(N^{2}n^{2})$ per kernel per layer where the inputs are $N \times N$ arrays and the kernels are $n \times n$ arrays. Convolution can be efficiently performed as a Hadamard product in the frequency domain. The bottleneck is the transformation, which has a cost of $\mathcal{O}(N^{2}\log_2 N)$ using the fast Fourier transform (FFT). However, the increase in efficiency is less significant when $N\gg n$, as is the case in CNNs. We mitigate this by using the “overlap-and-add” technique, reducing the computational complexity to $\mathcal{O}(N^2\log_2 n)$ per kernel. This method increases the algorithm’s efficiency in both the forward and backward propagation, reducing the training and testing time for CNNs. Our empirical results show our method reduces computational time by a factor of up to 16.3 compared with the traditional convolution implementation for an 8 $\times$ 8 kernel and a 224 $\times$ 224 image. |
Tasks | |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06815v1 |
http://arxiv.org/pdf/1601.06815v1.pdf | |
PWC | https://paperswithcode.com/paper/very-efficient-training-of-convolutional |
Repo | |
Framework | |
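The overlap-and-add idea from the abstract above can be shown in one dimension: split the input into short blocks, convolve each block with the kernel through small FFTs, and add the overlapping tails. This is a simplified sketch and not the authors' code. It is 1-D where the paper works on 2-D images, it uses a hand-written radix-2 FFT, and the function names are hypothetical. CNN layers usually compute cross-correlation, which corresponds to convolution with a flipped kernel.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two (unnormalised)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def overlap_add(signal, kernel, block_len):
    """Linear convolution of signal and kernel computed block by block."""
    n = len(kernel)
    # FFT size only has to cover one block plus the kernel tail.
    size = 1
    while size < block_len + n - 1:
        size *= 2
    H = fft(list(kernel) + [0.0] * (size - n))  # kernel spectrum, computed once
    out = [0.0] * (len(signal) + n - 1)
    for start in range(0, len(signal), block_len):
        block = list(signal[start:start + block_len])
        X = fft(block + [0.0] * (size - len(block)))
        # Pointwise (Hadamard) product in the frequency domain.
        Y = fft([x * h for x, h in zip(X, H)], invert=True)
        for i in range(len(block) + n - 1):
            # Tails of neighbouring blocks overlap here and are summed.
            out[start + i] += Y[i].real / size
    return out

def direct_convolve(signal, kernel):
    """Reference O(N * n) convolution for checking the result."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out
```

Because each transform has a length tied to the kernel size and not to the full input, the logarithmic factor depends on $n$ and not $N$, matching the complexity argument in the abstract.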
Emerging Dimension Weights in a Conceptual Spaces Model of Concept Combination
Title | Emerging Dimension Weights in a Conceptual Spaces Model of Concept Combination |
Authors | Martha Lewis, Jonathan Lawry |
Abstract | We investigate the generation of new concepts from combinations of properties as an artificial language develops. To do so, we have developed a new framework for conjunctive concept combination. This framework gives a semantic grounding to the weighted sum approach to concept combination seen in the literature. We implement the framework in a multi-agent simulation of language evolution and show that shared combination weights emerge. The expected value and the variance of these weights across agents may be predicted from the distribution of elements in the conceptual space, as determined by the underlying environment, together with the rate at which agents adopt others’ concepts. When this rate is smaller, the agents are able to converge to weights with lower variance. However, the time taken to converge to a steady state distribution of weights is longer. |
Tasks | |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06763v1 |
http://arxiv.org/pdf/1601.06763v1.pdf | |
PWC | https://paperswithcode.com/paper/emerging-dimension-weights-in-a-conceptual |
Repo | |
Framework | |
A New Distance Measure for Non-Identical Data with Application to Image Classification
Title | A New Distance Measure for Non-Identical Data with Application to Image Classification |
Authors | Muthukaruppan Swaminathan, Pankaj Kumar Yadav, Obdulio Piloto, Tobias Sjöblom, Ian Cheong |
Abstract | Distance measures are part and parcel of many computer vision algorithms. The underlying assumption in all existing distance measures is that feature elements are independent and identically distributed. However, in real-world settings, data generally originate from heterogeneous sources even if they do possess a common data-generating mechanism. Since these sources are not identically distributed by necessity, the assumption of identical distribution is inappropriate. Here, we use statistical analysis to show that feature elements of local image descriptors are indeed non-identically distributed. To test the effect of omitting the unified distribution assumption, we created a new distance measure called the Poisson-Binomial Radius (PBR). PBR is a bin-to-bin distance which accounts for the dispersion of bin-to-bin information. PBR’s performance was evaluated on twelve benchmark data sets covering six different classification and recognition applications: texture, material, leaf, scene, ear biometrics and category-level image classification. Results from these experiments demonstrate that PBR outperforms state-of-the-art distance measures for most of the data sets and achieves comparable performance on the rest, suggesting that accounting for different distributions in distance measures can improve performance in classification and recognition tasks. |
Tasks | Image Classification |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1610.09766v1 |
http://arxiv.org/pdf/1610.09766v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-distance-measure-for-non-identical-data |
Repo | |
Framework | |
Simultaneous Control and Human Feedback in the Training of a Robotic Agent with Actor-Critic Reinforcement Learning
Title | Simultaneous Control and Human Feedback in the Training of a Robotic Agent with Actor-Critic Reinforcement Learning |
Authors | Kory W. Mathewson, Patrick M. Pilarski |
Abstract | This paper contributes a preliminary report on the advantages and disadvantages of incorporating simultaneous human control and feedback signals in the training of a reinforcement learning robotic agent. While robotic human-machine interfaces have become increasingly complex in both form and function, control remains challenging for users. This has resulted in an increasing gap between user control approaches and the number of robotic motors which can be controlled. One way to address this gap is to shift some autonomy to the robot. Semi-autonomous actions of the robotic agent can then be shaped by human feedback, simplifying user control. Most prior work on agent shaping by humans has incorporated training with feedback, or has included indirect control signals. By contrast, in this paper we explore how a human can provide concurrent feedback signals and real-time myoelectric control signals to train a robot’s actor-critic reinforcement learning control system. Using both a physical and a simulated robotic system, we compare training performance on a simple movement task when reward is derived from the environment, when reward is provided by the human, and combinations of these two approaches. Our results indicate that some benefit can be gained with the inclusion of human generated feedback. |
Tasks | |
Published | 2016-06-22 |
URL | http://arxiv.org/abs/1606.06979v1 |
http://arxiv.org/pdf/1606.06979v1.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-control-and-human-feedback-in |
Repo | |
Framework | |
Mining Compatible/Incompatible Entities from Question and Answering via Yes/No Answer Classification using Distant Label Expansion
Title | Mining Compatible/Incompatible Entities from Question and Answering via Yes/No Answer Classification using Distant Label Expansion |
Authors | Hu Xu, Lei Shu, Jingyuan Zhang, Philip S. Yu |
Abstract | Product Community Question Answering (PCQA) provides useful information about products and their features (aspects) that may not be well addressed by product descriptions and reviews. We observe that a product’s compatibility issues with other products are frequently discussed in PCQA and such issues are more frequently addressed in accessories, i.e., via a yes/no question such as “Does this mouse work with Windows 10?”. In this paper, we address the problem of extracting compatible and incompatible products from yes/no questions in PCQA. This problem naturally admits a two-stage framework: first, we perform Complementary Entity (product) Recognition (CER) on yes/no questions; second, we identify the polarities of yes/no answers to assign the complementary entities a compatibility label (compatible, incompatible or unknown). We leverage an existing unsupervised method for the first stage and, for the second stage, build a 3-class classifier by combining a distant PU-learning method (learning from positive and unlabeled examples) with a binary classifier. The benefit of using distant PU-learning is that it can help to expand more implicit yes/no answers without using any human-annotated data. We conduct experiments on 4 products to show that the proposed method is effective. |
Tasks | Community Question Answering, Question Answering |
Published | 2016-12-14 |
URL | http://arxiv.org/abs/1612.04499v1 |
http://arxiv.org/pdf/1612.04499v1.pdf | |
PWC | https://paperswithcode.com/paper/mining-compatibleincompatible-entities-from |
Repo | |
Framework | |