Paper Group ANR 209
Improving Multi-Document Summarization via Text Classification
Title | Improving Multi-Document Summarization via Text Classification |
Authors | Ziqiang Cao, Wenjie Li, Sujian Li, Furu Wei |
Abstract | Developed so far, multi-document summarization has reached its bottleneck due to the lack of sufficient training data and diverse categories of documents. Text classification just makes up for these deficiencies. In this paper, we propose a novel summarization system called TCSum, which leverages plentiful text classification data to improve the performance of multi-document summarization. TCSum projects documents onto distributed representations which act as a bridge between text classification and summarization. It also utilizes the classification results to produce summaries of different styles. Extensive experiments on DUC generic multi-document summarization datasets show that TCSum can achieve state-of-the-art performance without using any hand-crafted features and has the capability to catch the variations of summary styles with respect to different text categories. |
Tasks | Document Summarization, Multi-Document Summarization, Text Classification |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.09238v1 |
http://arxiv.org/pdf/1611.09238v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-multi-document-summarization-via |
Repo | |
Framework | |
Bayesian Non-parametric model to Target Gamification Notifications Using Big Data
Title | Bayesian Non-parametric model to Target Gamification Notifications Using Big Data |
Authors | Meisam Hejazi Nia, Brian Ratchford |
Abstract | I suggest an approach that helps the online marketers to target their Gamification elements to users by modifying the order of the list of tasks that they send to users. It is more realistic and flexible as it allows the model to learn more parameters when the online marketers collect more data. The targeting approach is scalable and quick, and it can be used over streaming data. |
Tasks | |
Published | 2016-11-04 |
URL | http://arxiv.org/abs/1611.02154v1 |
http://arxiv.org/pdf/1611.02154v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-non-parametric-model-to-target |
Repo | |
Framework | |
Maximin Action Identification: A New Bandit Framework for Games
Title | Maximin Action Identification: A New Bandit Framework for Games |
Authors | Aurélien Garivier, Emilie Kaufmann, Wouter Koolen |
Abstract | We study an original problem of pure exploration in a strategic bandit model motivated by Monte Carlo Tree Search. It consists in identifying the best action in a game, when the player may sample random outcomes of sequentially chosen pairs of actions. We propose two strategies for the fixed-confidence setting: Maximin-LUCB, based on lower- and upper-confidence bounds; and Maximin-Racing, which operates by successively eliminating the sub-optimal actions. We discuss the sample complexity of both methods and compare their performance empirically. We sketch a lower bound analysis, and possible connections to an optimal algorithm. |
Tasks | |
Published | 2016-02-15 |
URL | http://arxiv.org/abs/1602.04676v1 |
http://arxiv.org/pdf/1602.04676v1.pdf | |
PWC | https://paperswithcode.com/paper/maximin-action-identification-a-new-bandit |
Repo | |
Framework | |
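As a rough illustration of the target in the abstract above, the sketch below picks the maximin action from a table of mean outcomes: the row whose worst-case column value is largest. It assumes the means are already known, whereas the paper's setting only allows noisy samples of chosen action pairs. The function name `maximin_action` and the example numbers are hypothetical, and this is not Maximin-LUCB or Maximin-Racing.

```python
def maximin_action(means):
    """Return (row index, value) of the action with the best worst-case mean.

    means[i][j] is the assumed mean outcome when the player picks action i
    and the opponent answers with action j.
    """
    # Worst-case outcome of each of the player's actions.
    worst = [min(row) for row in means]
    # The maximin action maximizes that worst case.
    best = max(range(len(means)), key=worst.__getitem__)
    return best, worst[best]


# Hypothetical 3x2 game: action 1 has the highest guaranteed outcome.
example_means = [[0.9, 0.1], [0.5, 0.4], [0.3, 0.35]]
best_action, guaranteed_value = maximin_action(example_means)
```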
Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging
Title | Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging |
Authors | Zachary DeVito, Michael Mara, Michael Zollhöfer, Gilbert Bernstein, Jonathan Ragan-Kelley, Christian Theobalt, Pat Hanrahan, Matthew Fisher, Matthias Nießner |
Abstract | Many graphics and vision problems can be expressed as non-linear least squares optimizations of objective functions over visual data, such as images and meshes. The mathematical descriptions of these functions are extremely concise, but their implementation in real code is tedious, especially when optimized for real-time performance on modern GPUs in interactive applications. In this work, we propose a new language, Opt (available under http://optlang.org), for writing these objective functions over image- or graph-structured unknowns concisely and at a high level. Our compiler automatically transforms these specifications into state-of-the-art GPU solvers based on Gauss-Newton or Levenberg-Marquardt methods. Opt can generate different variations of the solver, so users can easily explore tradeoffs in numerical precision, matrix-free methods, and solver approaches. In our results, we implement a variety of real-world graphics and vision applications. Their energy functions are expressible in tens of lines of code, and produce highly-optimized GPU solver implementations. These solvers have performance competitive with the best published hand-tuned, application-specific GPU solvers, and orders of magnitude beyond a general-purpose auto-generated solver. |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06525v3 |
http://arxiv.org/pdf/1604.06525v3.pdf | |
PWC | https://paperswithcode.com/paper/opt-a-domain-specific-language-for-non-linear |
Repo | |
Framework | |
Nonlinear Adaptive Algorithms on Rank-One Tensor Models
Title | Nonlinear Adaptive Algorithms on Rank-One Tensor Models |
Authors | Felipe C. Pinheiro, Cassio G. Lopes |
Abstract | This work proposes a low-complexity nonlinearity model and develops adaptive algorithms over it. The model is based on the decomposable (or rank-one, in tensor language) Volterra kernels. It may also be described as a product of FIR filters, which explains its low complexity. The rank-one model is also interesting because it comes from a well-posed problem in approximation theory. The paper uses this model in an estimation theory context to develop an exact gradient-type algorithm, from which adaptive algorithms such as the least mean squares (LMS) filter and its data-reuse version, the TRUE-LMS, are derived. Stability and convergence issues are addressed. The algorithms are then tested in simulations, which show their good performance when compared to other nonlinear processing algorithms in the literature. |
Tasks | |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07520v1 |
http://arxiv.org/pdf/1610.07520v1.pdf | |
PWC | https://paperswithcode.com/paper/nonlinear-adaptive-algorithms-on-rank-one |
Repo | |
Framework | |
Distributed Real-Time Sentiment Analysis for Big Data Social Streams
Title | Distributed Real-Time Sentiment Analysis for Big Data Social Streams |
Authors | Amir Hossein Akhavan Rahnama |
Abstract | The big data trend has forced data-centric systems to handle continuous fast data streams. In recent years, real-time analytics on stream data has formed into a new research field, which aims to answer queries about what-is-happening-now with a negligible delay. The real challenge with real-time stream data processing is that it is impossible to store instances of data, and therefore online analytical algorithms are utilized. To perform real-time analytics, pre-processing of data should be performed in a way that only a short summary of the stream is stored in main memory. In addition, due to the high speed of arrival, the average processing time for each instance of data should be short enough that incoming instances are not lost without being captured. Lastly, the learner needs to provide high analytical accuracy measures. Sentinel is a distributed system written in Java that aims to solve this challenge by enforcing both the processing and learning process to be done in distributed form. Sentinel is built on top of Apache Storm, a distributed computing platform. Sentinel's learner, Vertical Hoeffding Tree, is a parallel decision tree-learning algorithm based on the VFDT, with the ability to enable parallel classification in distributed environments. Sentinel also uses SpaceSaving to keep a summary of the data stream and stores its summary in a synopsis data structure. Application of Sentinel on the Twitter Public Stream API is shown and the results are discussed. |
Tasks | Sentiment Analysis |
Published | 2016-12-27 |
URL | http://arxiv.org/abs/1612.08543v1 |
http://arxiv.org/pdf/1612.08543v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-real-time-sentiment-analysis-for |
Repo | |
Framework | |
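The abstract above mentions SpaceSaving as the stream summary. A minimal single-machine sketch of the standard SpaceSaving counter scheme follows, assuming a plain Python iterable as the stream. It is not Sentinel's Java/Storm implementation; the function name `space_saving` and the example stream are hypothetical.

```python
def space_saving(stream, k):
    """Summarize a stream with at most k counters (SpaceSaving scheme).

    Returns (counts, errors): counts[item] is an upper bound on the true
    frequency of item, and counts[item] - errors[item] is a lower bound.
    """
    counts = {}
    errors = {}
    for item in stream:
        if item in counts:
            # Already monitored: just increment its counter.
            counts[item] += 1
        elif len(counts) < k:
            # Free slot: start monitoring the item exactly.
            counts[item] = 1
            errors[item] = 0
        else:
            # No free slot: evict the item with the smallest counter and
            # let the newcomer inherit that count as its possible overestimate.
            victim = min(counts, key=counts.get)
            floor = counts.pop(victim)
            errors.pop(victim)
            counts[item] = floor + 1
            errors[item] = floor
    return counts, errors


# Hypothetical stream of tokens summarized with two counters.
counts, errors = space_saving(["a", "a", "b", "a", "c", "a", "d"], k=2)
```

The linear scan for the minimum counter keeps the sketch short; implementations aimed at high-rate streams typically use a structure that finds the minimum in constant time.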
Classifier Risk Estimation under Limited Labeling Resources
Title | Classifier Risk Estimation under Limited Labeling Resources |
Authors | Anurag Kumar, Bhiksha Raj |
Abstract | In this paper we propose strategies for estimating performance of a classifier when labels cannot be obtained for the whole test set. The number of test instances which can be labeled is very small compared to the whole test data size. The goal then is to obtain a precise estimate of classifier performance using as little labeling resource as possible. Specifically, we try to answer, how to select a subset of the large test set for labeling such that the performance of a classifier estimated on this subset is as close as possible to the one on the whole test set. We propose strategies based on stratified sampling for selecting this subset. We show that these strategies can reduce the variance in estimation of classifier accuracy by a significant amount compared to simple random sampling (over 65% in several cases). Hence, our proposed methods are much more precise compared to random sampling for accuracy estimation under restricted labeling resources. The reduction in number of samples required (compared to random sampling) to estimate the classifier accuracy with only 1% error is as high as 60% in some cases. |
Tasks | |
Published | 2016-07-09 |
URL | http://arxiv.org/abs/1607.02665v2 |
http://arxiv.org/pdf/1607.02665v2.pdf | |
PWC | https://paperswithcode.com/paper/classifier-risk-estimation-under-limited |
Repo | |
Framework | |
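To make the stratified-sampling idea in the abstract above concrete, here is a minimal sketch. It assumes strata are formed by sorting test items on classifier confidence and cutting into equal-size groups, with the labeling budget allocated proportionally; both choices are illustrative assumptions, not necessarily the paper's strategies. The names `stratified_accuracy` and `is_correct` (a stand-in for the human labeling step) are hypothetical.

```python
import random


def stratified_accuracy(scores, is_correct, n_strata, budget, seed=0):
    """Estimate accuracy by labeling only a stratified subset of the test set.

    scores: classifier confidence for each test item (used only to stratify).
    is_correct: callable index -> bool, standing in for obtaining a label
        and checking the classifier's prediction against it.
    Returns (accuracy estimate, number of items labeled).
    """
    rng = random.Random(seed)
    n = len(scores)
    # Sort item indices by confidence and cut them into equal-size strata.
    order = sorted(range(n), key=lambda i: scores[i])
    strata = [order[j * n // n_strata:(j + 1) * n // n_strata]
              for j in range(n_strata)]
    estimate = 0.0
    labeled = 0
    for stratum in strata:
        if not stratum:
            continue
        weight = len(stratum) / n
        # Proportional allocation of the labeling budget, at least one label.
        m = min(len(stratum), max(1, round(budget * weight)))
        sample = rng.sample(stratum, m)
        stratum_accuracy = sum(is_correct(i) for i in sample) / m
        # Weight each stratum's accuracy by its share of the test set.
        estimate += weight * stratum_accuracy
        labeled += m
    return estimate, labeled
```

When correctness is nearly constant within each stratum, the per-stratum estimates vary little from sample to sample, which is the source of the variance reduction over simple random sampling.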
Deep Learning on FPGAs: Past, Present, and Future
Title | Deep Learning on FPGAs: Past, Present, and Future |
Authors | Griffin Lacey, Graham W. Taylor, Shawki Areibi |
Abstract | The rapid growth of data size and accessibility in recent years has instigated a shift of philosophy in algorithm design for artificial intelligence. Instead of engineering algorithms by hand, the ability to learn composable systems automatically from massive amounts of data has led to ground-breaking performance in important domains such as computer vision, speech recognition, and natural language processing. The most popular class of techniques used in these domains is called deep learning, and is seeing significant attention from industry. However, these models require incredible amounts of data and compute power to train, and are limited by the need for better hardware acceleration to accommodate scaling beyond current data and model sizes. While the current solution has been to use clusters of graphics processing units (GPU) as general purpose processors (GPGPU), the use of field programmable gate arrays (FPGA) provides an interesting alternative. Current trends in design tools for FPGAs have made them more compatible with the high-level software practices typically followed in the deep learning community, making FPGAs more accessible to those who build and deploy models. Since FPGA architectures are flexible, this could also allow researchers the ability to explore model-level optimizations beyond what is possible on fixed architectures such as GPUs. As well, FPGAs tend to provide high performance per watt of power consumption, which is of particular importance for application scientists interested in large scale server-based deployment or resource-limited embedded applications. This review takes a look at deep learning and FPGAs from a hardware acceleration perspective, identifying trends and innovations that make these technologies a natural fit, and motivates a discussion on how FPGAs may best serve the needs of the deep learning community moving forward. |
Tasks | Speech Recognition |
Published | 2016-02-13 |
URL | http://arxiv.org/abs/1602.04283v1 |
http://arxiv.org/pdf/1602.04283v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-on-fpgas-past-present-and |
Repo | |
Framework | |
Minimizing the Maximal Loss: How and Why?
Title | Minimizing the Maximal Loss: How and Why? |
Authors | Shai Shalev-Shwartz, Yonatan Wexler |
Abstract | A commonly used learning rule is to approximately minimize the average loss over the training set. Other learning algorithms, such as AdaBoost and hard-SVM, aim at minimizing the maximal loss over the training set. The average loss is more popular, particularly in deep learning, due to three main reasons. First, it can be conveniently minimized using online algorithms, that process few examples at each iteration. Second, it is often argued that there is no sense to minimize the loss on the training set too much, as it will not be reflected in the generalization loss. Last, the maximal loss is not robust to outliers. In this paper we describe and analyze an algorithm that can convert any online algorithm to a minimizer of the maximal loss. We prove that in some situations better accuracy on the training set is crucial to obtain good performance on unseen examples. Last, we propose robust versions of the approach that can handle outliers. |
Tasks | |
Published | 2016-02-04 |
URL | http://arxiv.org/abs/1602.01690v2 |
http://arxiv.org/pdf/1602.01690v2.pdf | |
PWC | https://paperswithcode.com/paper/minimizing-the-maximal-loss-how-and-why |
Repo | |
Framework | |
Estimating Structured Vector Autoregressive Model
Title | Estimating Structured Vector Autoregressive Model |
Authors | Igor Melnyk, Arindam Banerjee |
Abstract | While considerable advances have been made in estimating high-dimensional structured models from independent data using Lasso-type models, limited progress has been made for settings when the samples are dependent. We consider estimating structured VAR (vector auto-regressive models), where the structure can be captured by any suitable norm, e.g., Lasso, group Lasso, order weighted Lasso, sparse group Lasso, etc. In VAR setting with correlated noise, although there is strong dependence over time and covariates, we establish bounds on the non-asymptotic estimation error of structured VAR parameters. Surprisingly, the estimation error is of the same order as that of the corresponding Lasso-type estimator with independent samples, and the analysis holds for any norm. Our analysis relies on results in generic chaining, sub-exponential martingales, and spectral representation of VAR models. Experimental results on synthetic data with a variety of structures as well as real aviation data are presented, validating theoretical results. |
Tasks | |
Published | 2016-02-21 |
URL | http://arxiv.org/abs/1602.06606v2 |
http://arxiv.org/pdf/1602.06606v2.pdf | |
PWC | https://paperswithcode.com/paper/estimating-structured-vector-autoregressive |
Repo | |
Framework | |
Computing Human-Understandable Strategies
Title | Computing Human-Understandable Strategies |
Authors | Sam Ganzfried, Farzana Yusuf |
Abstract | Algorithms for equilibrium computation generally make no attempt to ensure that the computed strategies are understandable by humans. For instance the strategies for the strongest poker agents are represented as massive binary files. In many situations, we would like to compute strategies that can actually be implemented by humans, who may have computational limitations and may only be able to remember a small number of features or components of the strategies that have been computed. We study poker games where private information distributions can be arbitrary. We create a large training set of game instances and solutions, by randomly selecting the information probabilities, and present algorithms that learn from the training instances in order to perform well in games with unseen information distributions. We are able to conclude several new fundamental rules about poker strategy that can be easily implemented by humans. |
Tasks | |
Published | 2016-12-19 |
URL | http://arxiv.org/abs/1612.06340v2 |
http://arxiv.org/pdf/1612.06340v2.pdf | |
PWC | https://paperswithcode.com/paper/computing-human-understandable-strategies |
Repo | |
Framework | |
A Greedy Algorithm to Cluster Specialists
Title | A Greedy Algorithm to Cluster Specialists |
Authors | Sébastien Arnold |
Abstract | Several recent deep neural network experiments leverage the generalist-specialist paradigm for classification. However, no formal study compared the performance of different clustering algorithms for class assignment. In this paper we perform such a study, suggest slight modifications to the clustering procedures, and propose a novel algorithm designed to optimize the performance of the specialist-generalist classification system. Our experiments on the CIFAR-10 and CIFAR-100 datasets allow us to investigate situations for varying number of classes on similar data. We find that our greedy pairs clustering algorithm consistently outperforms other alternatives, while the choice of the confusion matrix has little impact on the final performance. |
Tasks | |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03666v1 |
http://arxiv.org/pdf/1609.03666v1.pdf | |
PWC | https://paperswithcode.com/paper/a-greedy-algorithm-to-cluster-specialists |
Repo | |
Framework | |
Speech Signal Analysis for the Estimation of Heart Rates Under Different Emotional States
Title | Speech Signal Analysis for the Estimation of Heart Rates Under Different Emotional States |
Authors | Aibek Ryskaliyev, Sanzhar Askaruly, Alex Pappachen James |
Abstract | A non-invasive method for the monitoring of heart activity can help to reduce the deaths caused by heart disorders such as stroke, arrhythmia and heart attack. The human voice can be considered biometric data that can be used for estimation of heart rate. In this paper, we propose a method for estimating the heart rate from human speech dynamically using voice signal analysis and by the development of an empirical linear predictor model. The correlation between the voice signal and heart rate is established by classifiers, and prediction of the heart rates with or without emotions is done using linear models. The prediction accuracy was tested using the data collected from 15 subjects, comprising about 4050 samples of speech signals and corresponding electrocardiogram samples. The proposed approach can be used for early non-invasive detection of heart rate changes that can be correlated to an emotional state of the individual and also can be used as a tool for diagnosis of heart conditions in real-time situations. |
Tasks | |
Published | 2016-08-12 |
URL | http://arxiv.org/abs/1608.03720v1 |
http://arxiv.org/pdf/1608.03720v1.pdf | |
PWC | https://paperswithcode.com/paper/speech-signal-analysis-for-the-estimation-of |
Repo | |
Framework | |
QUOTE: “Querying” Users as Oracles in Tag Engines - A Semi-Supervised Learning Approach to Personalized Image Tagging
Title | QUOTE: “Querying” Users as Oracles in Tag Engines - A Semi-Supervised Learning Approach to Personalized Image Tagging |
Authors | Amandianeze O. Nwana, Tsuhan Chen |
Abstract | One common trend in image tagging research is to focus on visually relevant tags, and this tends to ignore the personal and social aspect of tags, especially on photoblogging websites such as Flickr. Previous work has correctly identified that many of the tags that users provide on images are not visually relevant (i.e. representative of the salient content in the image) and they go on to treat such tags as noise, ignoring that the users chose to provide those tags over others that could have been more visually relevant. Another common assumption about user generated tags for images is that the order of these tags provides no useful information for the prediction of tags on future images. This assumption also tends to define usefulness in terms of what is visually relevant to the image. For general tagging or labeling applications that focus on providing visual information about image content, these assumptions are reasonable, but when considering personalized image tagging applications, these assumptions are at best too rigid, ignoring user choice and preferences. We challenge the aforementioned assumptions, and provide a machine learning approach to the problem of personalized image tagging with the following contributions: 1.) We reformulate the personalized image tagging problem as a search/retrieval ranking problem, 2.) We leverage the order of tags, which does not always reflect visual relevance, provided by the user in the past as a cue to their tag preferences, similar to click data, 3.) We propose a technique to augment sparse user tag data (semi-supervision), and 4.) We demonstrate the efficacy of our method on a subset of Flickr images, showing improvement over previous state-of-the-art methods. |
Tasks | |
Published | 2016-01-20 |
URL | http://arxiv.org/abs/1601.06440v1 |
http://arxiv.org/pdf/1601.06440v1.pdf | |
PWC | https://paperswithcode.com/paper/quote-querying-users-as-oracles-in-tag |
Repo | |
Framework | |
An ABC interpretation of the multiple auxiliary variable method
Title | An ABC interpretation of the multiple auxiliary variable method |
Authors | Dennis Prangle, Richard G. Everitt |
Abstract | We show that the auxiliary variable method (Møller et al., 2006; Murray et al., 2006) for inference of Markov random fields can be viewed as an approximate Bayesian computation method for likelihood estimation. |
Tasks | |
Published | 2016-04-27 |
URL | http://arxiv.org/abs/1604.08102v1 |
http://arxiv.org/pdf/1604.08102v1.pdf | |
PWC | https://paperswithcode.com/paper/an-abc-interpretation-of-the-multiple |
Repo | |
Framework | |