Paper Group ANR 266
M4CD: A Robust Change Detection Method for Intelligent Visual Surveillance
Title | M4CD: A Robust Change Detection Method for Intelligent Visual Surveillance |
Authors | Kunfeng Wang, Chao Gou, Fei-Yue Wang |
Abstract | In this paper, we propose a robust change detection method for intelligent visual surveillance. This method, named M4CD, includes three major steps. Firstly, a sample-based background model that integrates color and texture cues is built and updated over time. Secondly, multiple heterogeneous features (including brightness variation, chromaticity variation, and texture variation) are extracted by comparing the input frame with the background model, and a multi-source learning strategy is designed to online estimate the probability distributions for both foreground and background. The three features are approximately conditionally independent, making multi-source learning feasible. Pixel-wise foreground posteriors are then estimated with Bayes rule. Finally, the Markov random field (MRF) optimization and heuristic post-processing techniques are used sequentially to improve accuracy. In particular, a two-layer MRF model is constructed to represent pixel-based and superpixel-based contextual constraints compactly. Experimental results on the CDnet dataset indicate that M4CD is robust under complex environments and ranks among the top methods. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04979v1 |
PDF | http://arxiv.org/pdf/1802.04979v1.pdf |
PWC | https://paperswithcode.com/paper/m4cd-a-robust-change-detection-method-for |
Repo | |
Framework | |
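The core fusion step lends itself to a short sketch. Below is a minimal NumPy illustration of the Bayes-rule fusion of the three approximately conditionally independent features; the function name, array shapes, and the 0.5 prior are illustrative assumptions, and the MRF optimization and post-processing stages are omitted.

```python
import numpy as np

def foreground_posterior(likelihoods_fg, likelihoods_bg, prior_fg=0.5):
    """Fuse per-feature likelihoods with Bayes' rule under the
    conditional-independence assumption stated in the abstract.

    likelihoods_fg / likelihoods_bg: arrays of shape (n_features, H, W)
    holding p(feature | foreground) and p(feature | background).
    """
    # Conditional independence: the joint likelihood factorizes.
    p_fg = prior_fg * np.prod(likelihoods_fg, axis=0)
    p_bg = (1.0 - prior_fg) * np.prod(likelihoods_bg, axis=0)
    return p_fg / (p_fg + p_bg + 1e-12)   # pixel-wise posterior

# Toy example: 3 features (brightness, chromaticity, texture) on a 2x2 frame.
fg = np.random.uniform(0.1, 0.9, size=(3, 2, 2))
bg = np.random.uniform(0.1, 0.9, size=(3, 2, 2))
posterior = foreground_posterior(fg, bg)
mask = posterior > 0.5   # threshold before MRF refinement
```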
Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
Title | Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum |
Authors | Omer Levy, Kenton Lee, Nicholas FitzGerald, Luke Zettlemoyer |
Abstract | LSTMs were introduced to combat vanishing gradients in simple RNNs by augmenting them with gated additive recurrent connections. We present an alternative view to explain the success of LSTMs: the gates themselves are versatile recurrent models that provide more representational power than previously appreciated. We do this by decoupling the LSTM’s gates from the embedded simple RNN, producing a new class of RNNs where the recurrence computes an element-wise weighted sum of context-independent functions of the input. Ablations on a range of problems demonstrate that the gating mechanism alone performs as well as an LSTM in most settings, strongly suggesting that the gates are doing much more in practice than just alleviating vanishing gradients. |
Tasks | |
Published | 2018-05-09 |
URL | http://arxiv.org/abs/1805.03716v1 |
PDF | http://arxiv.org/pdf/1805.03716v1.pdf |
PWC | https://paperswithcode.com/paper/long-short-term-memory-as-a-dynamically |
Repo | |
Framework | |
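The abstract's central object — a recurrence that computes an element-wise weighted sum of context-independent functions of the input — can be sketched in a few lines. The variant below uses coupled gates and lets the gates see the cell state; the paper ablates several such configurations, so treat all names and wiring here as illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_weighted_sum(xs, Wg, Ug, Wi, d):
    """Memory cell as a dynamically computed weighted sum: the content
    written at each step depends on the input alone, and only the gates
    carry recurrence."""
    c = np.zeros(d)
    for x in xs:
        content = Wi @ x                  # context-independent content
        f = sigmoid(Wg @ x + Ug @ c)      # forget gate (recurrent)
        i = 1.0 - f                       # coupled input gate, for brevity
        c = f * c + i * content           # element-wise weighted sum
    return c

d_in, d = 4, 8
xs = [np.random.randn(d_in) for _ in range(10)]
c = gated_weighted_sum(xs, np.random.randn(d, d_in) * 0.1,
                       np.random.randn(d, d) * 0.1,
                       np.random.randn(d, d_in) * 0.1, d)
```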
Distribution-Free Uncertainty Quantification for Kernel Methods by Gradient Perturbations
Title | Distribution-Free Uncertainty Quantification for Kernel Methods by Gradient Perturbations |
Authors | Balázs Csanád Csáji, Krisztián Balázs Kis |
Abstract | We propose a data-driven approach to quantify the uncertainty of models constructed by kernel methods. Our approach minimizes the needed distributional assumptions, hence, instead of working with, for example, Gaussian processes or exponential families, it only requires knowledge about some mild regularity of the measurement noise, such as its being symmetric or exchangeable. We show, by building on recent results from finite-sample system identification, that by perturbing the residuals in the gradient of the objective function, information can be extracted about the amount of uncertainty our model has. In particular, we provide an algorithm to build exact, non-asymptotically guaranteed, distribution-free confidence regions for ideal, noise-free representations of the function we try to estimate. For the typical convex quadratic problems and symmetric noises, the regions are star convex centered around a given nominal estimate, and have efficient ellipsoidal outer approximations. Finally, we illustrate the ideas on typical kernel methods, such as LS-SVC, KRR, $\varepsilon$-SVR and kernelized LASSO. |
Tasks | Gaussian Processes |
Published | 2018-12-23 |
URL | https://arxiv.org/abs/1812.09632v2 |
PDF | https://arxiv.org/pdf/1812.09632v2.pdf |
PWC | https://paperswithcode.com/paper/uncertainty-quantification-for-kernel-methods |
Repo | |
Framework | |
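A hedged sketch of the residual-perturbation idea for kernel ridge regression: perturb the residual signs inside the gradient (valid under symmetric noise) and accept a candidate parameter when its unperturbed gradient norm is not extreme among the perturbed ones. The rank test below is a simplified stand-in for the paper's exact construction; all constants are illustrative.

```python
import numpy as np

def in_confidence_region(theta, K, y, lam, m=100, q=95, rng=None):
    """Sign-perturbed-gradient rank test: accept theta if the unperturbed
    gradient norm is not among the largest (100-q)% of the sign-perturbed
    ones. A simplified sketch, assuming symmetric noise."""
    rng = np.random.default_rng() if rng is None else rng
    r = y - K @ theta                       # residuals at the candidate

    def grad_norm(res):
        g = -2.0 * K @ res + 2.0 * lam * (K @ theta)
        return float(g @ g)

    ref = grad_norm(r)
    perturbed = [grad_norm(rng.choice([-1.0, 1.0], size=r.shape) * r)
                 for _ in range(m - 1)]
    rank = sum(p < ref for p in perturbed)  # perturbed norms smaller than ref
    return rank < (q / 100.0) * m

# Toy usage with a Gaussian kernel matrix.
X = np.random.randn(30, 1)
K = np.exp(-0.5 * (X - X.T) ** 2)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(30)
theta_hat = np.linalg.solve(K + 0.1 * np.eye(30), y)   # nominal KRR estimate
print(in_confidence_region(theta_hat, K, y, lam=0.1))  # True: near-zero gradient
```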
Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds
Title | Dimensionality Reduction has Quantifiable Imperfections: Two Geometric Bounds |
Authors | Kry Yik Chau Lui, Gavin Weiguang Ding, Ruitong Huang, Robert J. McCann |
Abstract | In this paper, we investigate dimensionality reduction (DR) maps in an information retrieval setting from a quantitative topology point of view. In particular, we show that no DR maps can achieve perfect precision and perfect recall simultaneously. Thus a continuous DR map must have imperfect precision. We further prove an upper bound on the precision of Lipschitz continuous DR maps. While precision is a natural measure in an information retrieval setting, it does not measure 'how' wrong the retrieved data is. We therefore propose a new measure based on Wasserstein distance that comes with a similar theoretical guarantee. A key technical step in our proofs is a particular optimization problem of the $L_2$-Wasserstein distance over a constrained set of distributions. We provide a complete solution to this optimization problem, which can be of independent interest on the technical side. |
Tasks | Dimensionality Reduction, Information Retrieval |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1811.00115v1 |
PDF | http://arxiv.org/pdf/1811.00115v1.pdf |
PWC | https://paperswithcode.com/paper/dimensionality-reduction-has-quantifiable |
Repo | |
Framework | |
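The precision notion in the abstract has a simple empirical reading: how many of a point's true high-dimensional neighbours survive in the embedding. The toy check below measures this for a linear (hence Lipschitz) DR map; it illustrates the quantity the bounds govern, not the bounds themselves.

```python
import numpy as np

def knn_indices(Z, k):
    """Indices of the k nearest neighbours of every point (excluding itself)."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def retrieval_precision(X, Y, k=10):
    """Fraction of true k-NN (in high-dimensional X) retrieved among the
    k-NN of the embedding Y. With equal k, precision equals recall here."""
    true_nn, emb_nn = knn_indices(X, k), knn_indices(Y, k)
    hits = [len(set(t) & set(e)) for t, e in zip(true_nn, emb_nn)]
    return np.mean(hits) / k

X = np.random.randn(200, 20)
Y = X @ np.random.randn(20, 2)          # a linear (Lipschitz) DR map
print(retrieval_precision(X, Y))        # strictly below 1 in general
```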
ORGaNICs: A Theory of Working Memory in Brains and Machines
Title | ORGaNICs: A Theory of Working Memory in Brains and Machines |
Authors | David J. Heeger, Wayne E. Mackey |
Abstract | Working memory is a cognitive process that is responsible for temporarily holding and manipulating information. Most of the empirical neuroscience research on working memory has focused on measuring sustained activity in prefrontal cortex (PFC) and/or parietal cortex during simple delayed-response tasks, and most of the models of working memory have been based on neural integrators. But working memory means much more than just holding a piece of information online. We describe a new theory of working memory, based on a recurrent neural circuit that we call ORGaNICs (Oscillatory Recurrent GAted Neural Integrator Circuits). ORGaNICs are a variety of Long Short Term Memory units (LSTMs), imported from machine learning and artificial intelligence. ORGaNICs can be used to explain the complex dynamics of delay-period activity in prefrontal cortex (PFC) during a working memory task. The theory is analytically tractable so that we can characterize the dynamics, and the theory provides a means for reading out information from the dynamically varying responses at any point in time, in spite of the complex dynamics. ORGaNICs can be implemented with a biophysical (electrical circuit) model of pyramidal cells, combined with shunting inhibition via a thalamocortical loop. Although introduced as a computational theory of working memory, ORGaNICs are also applicable to models of sensory processing, motor preparation and motor control. ORGaNICs offer computational advantages compared to other varieties of LSTMs that are commonly used in AI applications. Consequently, ORGaNICs are a framework for canonical computation in brains and machines. |
Tasks | |
Published | 2018-03-16 |
URL | http://arxiv.org/abs/1803.06288v4 |
PDF | http://arxiv.org/pdf/1803.06288v4.pdf |
PWC | https://paperswithcode.com/paper/organics-a-theory-of-working-memory-in-brains |
Repo | |
Framework | |
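As rough intuition for gated recurrent integration of the kind the abstract describes, here is an Euler simulation of a leaky integrator whose input and recurrent drives are modulated by gates. These are not the paper's ORGaNICs equations — only a generic sketch of how gating can hold a stimulus across a delay period.

```python
import numpy as np

def gated_integrator(z, a, b, dt=0.01, tau=0.05):
    """Euler simulation of a leaky integrator with gated input and
    recurrent drives (a generic sketch, not the ORGaNICs equations).
    z: (T, n) input drive; a, b: (T, n) gate time courses in [0, 1]."""
    T, n = z.shape
    y = np.zeros((T, n))
    W = np.eye(n)                       # recurrent weights (identity = pure memory)
    for t in range(1, T):
        y_hat = W @ y[t - 1]            # recurrent drive
        dy = (-y[t - 1] + b[t] * z[t] + a[t] * y_hat) / tau
        y[t] = y[t - 1] + dt * dy
    return y

# Delayed-response toy: input arrives early, gates then hold it.
T, n = 500, 3
z = np.zeros((T, n)); z[20:40] = 1.0
b = np.zeros((T, n)); b[20:40] = 1.0   # open input gate during the stimulus
a = np.ones((T, n))                     # keep recurrent gate open (maintain)
y = gated_integrator(z, a, b)           # y holds the stimulus after it ends
```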
Universal Perceptual Grouping
Title | Universal Perceptual Grouping |
Authors | Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Honggang Zhang |
Abstract | In this work we aim to develop a universal sketch grouper. That is, a grouper that can be applied to sketches of any category in any domain to group constituent strokes/segments into semantically meaningful object parts. The first obstacle to this goal is the lack of large-scale datasets with grouping annotation. To overcome this, we contribute the largest sketch perceptual grouping (SPG) dataset to date, consisting of 20,000 unique sketches evenly distributed over 25 object categories. Furthermore, we propose a novel deep universal perceptual grouping model. The model is learned with both generative and discriminative losses. The generative losses improve the generalisation ability of the model to unseen object categories and datasets. The discriminative losses include a local grouping loss and a novel global grouping loss to enforce global grouping consistency. We show that the proposed model significantly outperforms the state-of-the-art groupers. Further, we show that our grouper is useful for a number of sketch analysis tasks including sketch synthesis and fine-grained sketch-based image retrieval (FG-SBIR). |
Tasks | Image Retrieval, Sketch-Based Image Retrieval |
Published | 2018-08-07 |
URL | http://arxiv.org/abs/1808.02312v1 |
PDF | http://arxiv.org/pdf/1808.02312v1.pdf |
PWC | https://paperswithcode.com/paper/universal-perceptual-grouping |
Repo | |
Framework | |
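The loss composition — a generative term plus local and global grouping terms over stroke embeddings — can be sketched as follows in PyTorch. The concrete loss forms, margins, and tensor shapes below are illustrative assumptions, not the paper's exact definitions.

```python
import torch
import torch.nn.functional as F

def grouping_losses(emb, recon, target, same_group):
    """emb: (n_strokes, d) stroke embeddings; same_group: (n, n) 0/1 matrix."""
    gen = F.mse_loss(recon, target)                     # generative loss
    dist = torch.cdist(emb, emb)                        # pairwise distances
    # Local: pull same-group strokes together, push different-group apart.
    local = (same_group * dist ** 2 +
             (1 - same_group) * F.relu(1.0 - dist) ** 2).mean()
    # Global: each stroke should sit near its whole group's centroid.
    assign = same_group / same_group.sum(dim=1, keepdim=True)
    centroids = assign @ emb                            # per-stroke group centroid
    global_ = ((emb - centroids) ** 2).sum(dim=1).mean()
    return gen + local + global_

emb = torch.randn(6, 16, requires_grad=True)
recon, target = torch.randn(6, 32), torch.randn(6, 32)
same = (torch.arange(6)[:, None] // 3 == torch.arange(6)[None, :] // 3).float()
loss = grouping_losses(emb, recon, target, same)
loss.backward()
```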
Targeted Nonlinear Adversarial Perturbations in Images and Videos
Title | Targeted Nonlinear Adversarial Perturbations in Images and Videos |
Authors | Roberto Rey-de-Castro, Herschel Rabitz |
Abstract | We introduce a method for learning adversarial perturbations targeted to individual images or videos. The learned perturbations are found to be sparse while at the same time containing a high level of feature detail. Thus, the extracted perturbations allow a form of object or action recognition and provide insights into which features the studied deep neural network models consider important when reaching their classification decisions. From an adversarial point of view, the sparse perturbations successfully confused the models into misclassifying, even though on visual examination the perturbed samples still belonged to their original class. This is discussed in terms of a prospective data augmentation scheme. The sparse yet high-quality perturbations may also be leveraged for image or video compression. |
Tasks | Data Augmentation, Temporal Action Localization, Video Compression |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1809.00958v1 |
PDF | http://arxiv.org/pdf/1809.00958v1.pdf |
PWC | https://paperswithcode.com/paper/targeted-nonlinear-adversarial-perturbations |
Repo | |
Framework | |
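A generic way to obtain sparse, targeted perturbations like those described is to minimize a targeted classification loss with an L1 penalty on the perturbation. The sketch below is this standard formulation, not necessarily the paper's specific optimization scheme; the stand-in model and hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

def sparse_targeted_perturbation(model, x, target, steps=200, lr=0.05, l1=1e-3):
    """Learn a perturbation driving `model` to a chosen target class,
    with an L1 term exerting sparsity pressure.
    x: (1, C, H, W) input; target: desired class index."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model(x + delta)
        loss = F.cross_entropy(logits, torch.tensor([target])) \
               + l1 * delta.abs().sum()
        opt.zero_grad(); loss.backward(); opt.step()
    return delta.detach()

# Toy usage with a stand-in model.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 10))
x = torch.rand(1, 3, 8, 8)
delta = sparse_targeted_perturbation(model, x, target=3)
print(model(x + delta).argmax().item())  # ideally 3
```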
Autoencoders Learn Generative Linear Models
Title | Autoencoders Learn Generative Linear Models |
Authors | Thanh V. Nguyen, Raymond K. W. Wong, Chinmay Hegde |
Abstract | We provide a series of results for unsupervised learning with autoencoders. Specifically, we study shallow two-layer autoencoder architectures with shared weights. We focus on three generative models for data that are common in statistical machine learning: (i) the mixture-of-gaussians model, (ii) the sparse coding model, and (iii) the sparsity model with non-negative coefficients. For each of these models, we prove that under suitable choices of hyperparameters, architectures, and initialization, autoencoders learned by gradient descent can successfully recover the parameters of the corresponding model. To our knowledge, this is the first result that rigorously studies the dynamics of gradient descent for weight-sharing autoencoders. Our analysis can be viewed as theoretical evidence that shallow autoencoder modules indeed can be used as feature learning mechanisms for a variety of data models, and may shed insight on how to train larger stacked architectures with autoencoders as basic building blocks. |
Tasks | |
Published | 2018-06-02 |
URL | http://arxiv.org/abs/1806.00572v3 |
PDF | http://arxiv.org/pdf/1806.00572v3.pdf |
PWC | https://paperswithcode.com/paper/autoencoders-learn-generative-linear-models |
Repo | |
Framework | |
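The analyzed setting is easy to instantiate: a two-layer autoencoder with shared (tied) weights, trained by plain gradient descent on mixture-of-Gaussians data. The NumPy sketch below follows that setting with illustrative step size and initialization; the paper's contribution is proving when such dynamics recover the generative model's parameters.

```python
import numpy as np

def relu(z): return np.maximum(z, 0.0)

# Weight-sharing autoencoder x_hat = relu(x W^T + b) W, squared loss.
rng = np.random.default_rng(0)
d, h, n = 10, 4, 2000
centers = rng.normal(size=(h, d))                 # mixture-of-Gaussians data
X = centers[rng.integers(h, size=n)] + 0.05 * rng.normal(size=(n, d))

W = 0.1 * rng.normal(size=(h, d)); b = np.zeros(h)
eta = 0.05
for _ in range(500):
    A = relu(X @ W.T + b)                         # (n, h) activations
    R = A @ W - X                                 # reconstruction residuals
    G_act = (R @ W.T) * (A > 0)                   # backprop through the relu
    gW = (A.T @ R + G_act.T @ X) / n              # both paths of the shared weight
    gb = G_act.mean(axis=0)
    W -= eta * gW; b -= eta * gb
print(np.mean(R ** 2))                            # reconstruction error drops
```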
SMILK, linking natural language and data from the web
Title | SMILK, linking natural language and data from the web |
Authors | Cédric Lopez, Molka Dhouib, Elena Cabrio, Catherine Faron Zucker, Fabien Gandon, Frédérique Segond |
Abstract | As part of the SMILK Joint Lab, we studied the use of Natural Language Processing to: (1) enrich knowledge bases and link data on the web, and conversely (2) use this linked data to improve text analysis and the annotation of textual content, and to support knowledge extraction. The evaluation focused on brand-related information retrieval in the field of cosmetics. This article describes each step of our approach: the creation of ProVoc, an ontology to describe products and brands; the automatic population of a knowledge base, mainly based on ProVoc, from heterogeneous textual resources; and the evaluation of an application that takes the form of a browser plugin providing additional knowledge to users browsing the web. |
Tasks | Information Retrieval |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1901.02055v1 |
PDF | http://arxiv.org/pdf/1901.02055v1.pdf |
PWC | https://paperswithcode.com/paper/smilk-linking-natural-language-and-data-from |
Repo | |
Framework | |
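The knowledge-base population step can be pictured with rdflib. The PV namespace and the class/property names below are hypothetical placeholders standing in for ProVoc, whose actual vocabulary is not reproduced here.

```python
from rdflib import Graph, Literal, Namespace, RDF, URIRef

# Populate a product/brand knowledge base from extracted text, in the
# spirit of the ProVoc pipeline (all names below are placeholders).
PV = Namespace("http://example.org/provoc#")
g = Graph()
g.bind("pv", PV)

def add_product(graph, brand, product_name):
    """Assert 'brand produces product' triples extracted from text."""
    brand_uri = URIRef(PV + brand.replace(" ", "_"))
    prod_uri = URIRef(PV + product_name.replace(" ", "_"))
    graph.add((brand_uri, RDF.type, PV.Brand))
    graph.add((prod_uri, RDF.type, PV.Product))
    graph.add((prod_uri, PV.producedBy, brand_uri))
    graph.add((prod_uri, PV.label, Literal(product_name)))

add_product(g, "AcmeCosmetics", "Hydra Cream")   # hypothetical extraction
print(g.serialize(format="turtle"))
```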
ICA based on Split Generalized Gaussian
Title | ICA based on Split Generalized Gaussian |
Authors | P. Spurek, P. Rola, J. Tabor, A. Czechowski |
Abstract | Independent Component Analysis (ICA) - one of the basic tools in data analysis - aims to find a coordinate system in which the components of the data are independent. The most popular ICA methods, such as FastICA and JADE, use kurtosis as the measure of non-Gaussianity to maximize. However, relying on the fourth-order moment (kurtosis) may not always be justified in practice. One possible solution is to use the third-order moment (skewness) instead of kurtosis, as applied in $ICA_{SG}$ and EcoICA. In this paper we present a competitive approach to ICA based on the Split Generalized Gaussian distribution (SGGD), which is well adapted to heavy-tailed as well as asymmetric data. Consequently, we obtain a method that outperforms the classical approaches in both cases: heavy-tailed and asymmetric data. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.05550v1 |
PDF | http://arxiv.org/pdf/1802.05550v1.pdf |
PWC | https://paperswithcode.com/paper/ica-based-on-split-generalized-gaussian |
Repo | |
Framework | |
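The density underlying the method is worth making concrete: a split generalized Gaussian keeps one shape parameter but uses different scales on each side of the mode, capturing heavy tails (small shape) and asymmetry (unequal scales) at once. The sketch below uses one common parameterization and a direct maximum-likelihood fit; the paper's exact parameterization and its use inside ICA may differ.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gamma

def sgg_logpdf(x, mu, sl, sr, p):
    """Log-density of a split generalized Gaussian with mode mu, left/right
    scales sl/sr, and shape p; normalizer is p / ((sl + sr) * Gamma(1/p))."""
    z = np.where(x < mu, (mu - x) / sl, (x - mu) / sr)
    log_norm = np.log(p) - np.log(sl + sr) - np.log(gamma(1.0 / p))
    return log_norm - z ** p

def fit_sgg(x):
    """Fit (mu, sl, sr, p) by maximum likelihood; scales kept positive
    via an exp reparameterization."""
    nll = lambda t: -np.sum(sgg_logpdf(x, t[0], np.exp(t[1]),
                                       np.exp(t[2]), np.exp(t[3])))
    t0 = np.array([np.median(x), 0.0, 0.0, np.log(2.0)])
    res = minimize(nll, t0, method="Nelder-Mead")
    mu, sl, sr, p = res.x[0], *np.exp(res.x[1:])
    return mu, sl, sr, p

x = np.concatenate([np.random.randn(3000), 2 + 3 * np.random.randn(1000)])
print(fit_sgg(x))   # asymmetric sample -> sl != sr
```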
Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders
Title | Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders |
Authors | Yansen Wang, Chenyi Liu, Minlie Huang, Liqiang Nie |
Abstract | Asking good questions in large-scale, open-domain conversational systems is important yet largely untouched. This task, substantially different from traditional question generation, requires questioning not only with various patterns but also on diverse and relevant topics. We observe that a good question is a natural composition of {\it interrogatives}, {\it topic words}, and {\it ordinary words}. Interrogatives lexicalize the pattern of questioning, topic words address the key information for topic transition in dialogue, and ordinary words play syntactic and grammatical roles in making a natural sentence. We devise two typed decoders (\textit{soft typed decoder} and \textit{hard typed decoder}) in which a type distribution over the three types is estimated and used to modulate the final generation distribution. Extensive experiments show that the typed decoders outperform state-of-the-art baselines and generate more meaningful questions. |
Tasks | Question Generation |
Published | 2018-05-13 |
URL | http://arxiv.org/abs/1805.04843v1 |
PDF | http://arxiv.org/pdf/1805.04843v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-ask-questions-in-open-domain |
Repo | |
Framework | |
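The soft typed decoder admits a compact sketch: estimate a distribution over the three word types from the decoder state, then mix type-specific vocabulary distributions with it. Shapes and parameter names below are illustrative.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_typed_distribution(h, W_types, W_vocab_per_type):
    """Mix type-specific vocabulary distributions with a type distribution
    estimated from the decoder state h (soft typed decoder sketch)."""
    type_probs = softmax(W_types @ h)                     # (3,) over types
    word_probs = softmax(np.stack([W @ h for W in W_vocab_per_type]),
                         axis=-1)                         # (3, V) per type
    return type_probs @ word_probs                        # (V,) final mixture

d, V = 16, 100
h = np.random.randn(d)                                    # decoder state
W_types = np.random.randn(3, d)
W_vocab = [np.random.randn(V, d) for _ in range(3)]       # interrog./topic/ordinary
p = soft_typed_distribution(h, W_types, W_vocab)
print(p.sum())  # 1.0
```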
Traditional Wisdom and Monte Carlo Tree Search Face-to-Face in the Card Game Scopone
Title | Traditional Wisdom and Monte Carlo Tree Search Face-to-Face in the Card Game Scopone |
Authors | Stefano Di Palma, Pier Luca Lanzi |
Abstract | We present the design of a competitive artificial intelligence for Scopone, a popular Italian card game. We compare rule-based players using the most established strategies (one for beginners and two for advanced players) against players using Monte Carlo Tree Search (MCTS) and Information Set Monte Carlo Tree Search (ISMCTS) with different reward functions and simulation strategies. MCTS requires complete information about the game state and thus implements a cheating player, while ISMCTS can deal with incomplete information and thus implements a fair player. Our results show that, as expected, the cheating MCTS outperforms all the other strategies; ISMCTS is stronger than all the rule-based players, including those implementing the most advanced known strategies, and also proves a challenging opponent for human players. |
Tasks | |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.06813v1 |
PDF | http://arxiv.org/pdf/1807.06813v1.pdf |
PWC | https://paperswithcode.com/paper/traditional-wisdom-and-monte-carlo-tree |
Repo | |
Framework | |
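The MCTS/ISMCTS distinction comes down to how hidden cards are handled. The skeleton below conveys only the determinization idea behind ISMCTS — sample a complete deal consistent with the player's information set, then evaluate moves by rollout; a real implementation would also build a shared tree with UCB selection. All callbacks are assumed interfaces, not a Scopone engine.

```python
def ismcts(root_state, legal_moves, sample_hidden, step, rollout, n_iter=1000):
    """Flat, ISMCTS-flavoured search over an information set.

    legal_moves(state)   -> iterable of moves available to us
    sample_hidden(state) -> a fully observable determinization of the state
    step(state, move)    -> successor state after playing `move`
    rollout(state)       -> value of a random playout from `state`
    """
    scores = {m: [0.0, 0] for m in legal_moves(root_state)}
    for _ in range(n_iter):
        det = sample_hidden(root_state)      # one possible world
        for m in scores:
            total, n = scores[m]
            value = rollout(step(det, m))    # random playout from there
            scores[m] = [total + value, n + 1]
    # Return the move with the best average value across determinizations.
    return max(scores, key=lambda m: scores[m][0] / max(scores[m][1], 1))
```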
Convergence of Online Mirror Descent
Title | Convergence of Online Mirror Descent |
Authors | Yunwen Lei, Ding-Xuan Zhou |
Abstract | In this paper we consider online mirror descent (OMD) algorithms, a class of scalable online learning algorithms that exploit the geometric structure of data through mirror maps. Necessary and sufficient conditions are presented in terms of the step-size sequence $\{\eta_t\}_t$ for the convergence of an OMD algorithm with respect to the expected Bregman distance induced by the mirror map. The condition is $\lim_{t\to\infty}\eta_t=0$, $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of positive variances. It reduces to $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of zero variances, for which linear convergence may be achieved by taking a constant step-size sequence. A sufficient condition on the almost sure convergence is also given. We establish tight error bounds under mild conditions on the mirror map, the loss function, and the regularizer. Our results are achieved by novel analysis of the one-step progress of the OMD algorithm using the smoothness and strong convexity of the mirror map and the loss function. |
Tasks | |
Published | 2018-02-18 |
URL | https://arxiv.org/abs/1802.06357v2 |
PDF | https://arxiv.org/pdf/1802.06357v2.pdf |
PWC | https://paperswithcode.com/paper/convergence-of-online-mirror-descent |
Repo | |
Framework | |
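OMD is compact enough to state in code. With the entropy mirror map on the probability simplex the update becomes exponentiated gradient, shown below with step sizes $\eta_t = c/\sqrt{t}$, which satisfy the paper's positive-variance condition ($\eta_t \to 0$, $\sum_t \eta_t = \infty$); the constant $c$ is illustrative.

```python
import numpy as np

def omd_simplex(grads, d, c=1.0):
    """Online mirror descent on the simplex with the entropy mirror map."""
    x = np.full(d, 1.0 / d)
    for t, g in enumerate(grads, start=1):
        eta = c / np.sqrt(t)
        x = x * np.exp(-eta * g)      # gradient step in the dual (mirror) space
        x /= x.sum()                  # map back to the simplex
    return x

# Toy: linear losses f_t(x) = <g_t, x>; the iterate drifts toward the
# coordinate with the smallest average loss.
rng = np.random.default_rng(0)
grads = [np.array([1.0, 0.5, 2.0]) + 0.1 * rng.normal(size=3)
         for _ in range(2000)]
print(omd_simplex(grads, d=3))        # mass concentrates on index 1
```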
Impersonation: Modeling Persona in Smart Responses to Email
Title | Impersonation: Modeling Persona in Smart Responses to Email |
Authors | Rajeev Gupta, Ranganath Kondapally, Chakrapani Ravi Kiran |
Abstract | In this paper, we present the design, implementation, and effectiveness of generating personalized suggestions for email replies. To personalize email responses based on the user's style and personality, we model the user's persona from her past responses to emails. This model is combined with the language model built across users from the past responses of all user emails. A user's model captures her typical responses given a particular context. The context includes the email received, the recipient of the email, and other external signals such as calendar activities and preferences. The context, along with the user's personality (e.g., extrovert, formal, reserved, etc.), is used to suggest responses. These responses can mix multiple modes: email replies (textual), audio clips, etc. This helps the responses mimic the user as closely as possible and keeps the user productive while retaining her personal mark in the replies. |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04456v1 |
PDF | http://arxiv.org/pdf/1806.04456v1.pdf |
PWC | https://paperswithcode.com/paper/impersonation-modeling-persona-in-smart |
Repo | |
Framework | |
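One simple reading of combining the user's model with the cross-user model is a linear interpolation of their reply distributions. The sketch below is that reading only — the interpolation weight and the stub models are illustrative assumptions, and the paper does not specify this exact combination rule.

```python
def interpolate_reply_scores(user_model, global_model, context, lam=0.6):
    """Blend a per-user response model with a cross-user language model.
    Both models map a context to {candidate_reply: probability}."""
    p_user = user_model(context)
    p_global = global_model(context)
    candidates = set(p_user) | set(p_global)
    scores = {c: lam * p_user.get(c, 0.0) + (1 - lam) * p_global.get(c, 0.0)
              for c in candidates}
    # Top-3 suggestions, most user-like first.
    return sorted(scores, key=scores.get, reverse=True)[:3]

# Stub models for illustration.
user = lambda ctx: {"Sounds good!": 0.5, "Will do.": 0.3}
glob = lambda ctx: {"Thanks!": 0.4, "Sounds good!": 0.2}
print(interpolate_reply_scores(user, glob, {"email": "Can you review this?"}))
```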
MS-ASL: A Large-Scale Data Set and Benchmark for Understanding American Sign Language
Title | MS-ASL: A Large-Scale Data Set and Benchmark for Understanding American Sign Language |
Authors | Hamid Reza Vaezi Joze, Oscar Koller |
Abstract | Sign language recognition is a challenging and often underestimated problem comprising multi-modal articulators (handshape, orientation, movement, upper body and face) that integrate asynchronously on multiple streams. Learning powerful statistical models in such a scenario requires much data, particularly to apply recent advances of the field. However, labeled data is a scarce resource for sign language due to the enormous cost of transcribing these unwritten languages. We propose the first real-life large-scale sign language data set comprising over 25,000 annotated videos, which we thoroughly evaluate with state-of-the-art methods from sign and related action recognition. Unlike the current state of the art, the data set makes it possible to investigate generalization to unseen individuals (signer-independent test) in a realistic setting with over 200 signers. Previous work mostly deals with limited-vocabulary tasks, while here we cover a large class count of 1000 signs in challenging and unconstrained real-life recording conditions. We further propose I3D, known from video classification, as a powerful and suitable architecture for sign language recognition, outperforming the current state of the art by a large margin. The data set is publicly available to the community. |
Tasks | Sign Language Recognition, Temporal Action Localization, Video Classification |
Published | 2018-12-03 |
URL | https://arxiv.org/abs/1812.01053v2 |
PDF | https://arxiv.org/pdf/1812.01053v2.pdf |
PWC | https://paperswithcode.com/paper/ms-asl-a-large-scale-data-set-and-benchmark |
Repo | |
Framework | |
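Fine-tuning a 3D CNN for 1000-class isolated sign recognition, in the spirit of the paper's I3D baseline, looks as follows. torchvision ships no I3D, so r3d_18 stands in here; the clip shape and hyperparameters are illustrative.

```python
import torch
from torchvision.models.video import r3d_18

# r3d_18 as a stand-in 3D CNN; weights="DEFAULT" would load Kinetics
# pretraining and mirror the transfer setup, weights=None stays offline.
model = r3d_18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 1000)   # 1000 sign classes

clips = torch.rand(2, 3, 16, 112, 112)    # (batch, C, T, H, W) video clips
labels = torch.tensor([3, 917])
opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss = torch.nn.functional.cross_entropy(model(clips), labels)
opt.zero_grad(); loss.backward(); opt.step()
```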