Paper Group ANR 214
Streaming kernel regression with provably adaptive mean, variance, and regularization
Title | Streaming kernel regression with provably adaptive mean, variance, and regularization |
Authors | Audrey Durand, Odalric-Ambrym Maillard, Joelle Pineau |
Abstract | We consider the problem of streaming kernel regression, where observations arrive sequentially and the goal is to recover an underlying mean function assumed to belong to an RKHS. The variance of the noise is not assumed to be known. In this context, we tackle the problem of tuning the regularization parameter adaptively at each time step, while maintaining tight confidence bound estimates on the value of the mean function at each point. To this end, we first generalize existing results for finite-dimensional linear regression with fixed regularization and known variance to the kernel setup with a regularization parameter allowed to be a measurable function of past observations. Then, using appropriate self-normalized inequalities, we build upper and lower bound estimates for the variance, leading to Bernstein-like concentration bounds. The latter is used to define the adaptive regularization. The bounds resulting from our technique are valid uniformly over all observation points and all time steps, and are compared against the literature with numerical experiments. Finally, the potential of these tools is illustrated by an application to kernelized bandits, where we revisit the Kernel UCB and Kernel Thompson Sampling procedures and show the benefits of the novel adaptive kernel tuning strategy. |
Tasks | |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00768v1 |
http://arxiv.org/pdf/1708.00768v1.pdf | |
PWC | https://paperswithcode.com/paper/streaming-kernel-regression-with-provably |
Repo | |
Framework | |
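A minimal sketch of the setup this abstract describes: kernel ridge regression refit as observations stream in, with a GP-style confidence width around the mean estimate. The RBF kernel, the fixed regularization value `lam`, and the constant confidence scaling `beta` are illustrative assumptions; the paper's contribution is precisely to let the regularizer and the variance bound adapt to the data.

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    """RBF kernel matrix between the rows of X and the rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class StreamingKernelRegressor:
    """Kernel ridge regression refit after each observation.

    The paper lets lam be a measurable function of the history;
    a plain float keeps this sketch simple."""
    def __init__(self, lam=1.0, gamma=1.0):
        self.lam, self.gamma = lam, gamma
        self.X, self.y = [], []

    def observe(self, x, y):
        self.X.append(x); self.y.append(y)

    def predict(self, Xq, beta=2.0):
        X = np.asarray(self.X); y = np.asarray(self.y)
        K = rbf(X, X, self.gamma) + self.lam * np.eye(len(X))
        alpha = np.linalg.solve(K, y)
        k = rbf(Xq, X, self.gamma)                 # (queries, t)
        mean = k @ alpha
        # GP-style posterior width; beta stands in for the paper's
        # self-normalized confidence scaling.
        var = 1.0 - np.einsum('qt,tu,qu->q', k, np.linalg.inv(K), k)
        width = beta * np.sqrt(np.maximum(var, 0.0))
        return mean, mean - width, mean + width

rng = np.random.default_rng(0)
reg = StreamingKernelRegressor(lam=0.1, gamma=5.0)
for t in range(50):
    x = rng.uniform(0, 1, size=1)
    reg.observe(x, np.sin(4 * x[0]) + 0.1 * rng.normal())
mean, lo, hi = reg.predict(np.linspace(0, 1, 5)[:, None])
```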
The neighborhood lattice for encoding partial correlations in a Hilbert space
Title | The neighborhood lattice for encoding partial correlations in a Hilbert space |
Authors | Arash A. Amini, Bryon Aragam, Qing Zhou |
Abstract | Neighborhood regression has been a successful approach in graphical and structural equation modeling, with applications to learning undirected and directed graphical models. We extend these ideas by defining and studying an algebraic structure called the neighborhood lattice based on a generalized notion of neighborhood regression. We show that this algebraic structure has the potential to provide an economical encoding of all conditional independence statements in a Gaussian distribution (or conditional uncorrelatedness in general), even in cases where no graphical model exists that could “perfectly” encode all such statements. We study the computational complexity of computing these structures and show that, under a sparsity assumption, they can be computed in polynomial time, even in the absence of the assumption of perfectness with respect to a graph. On the other hand, assuming perfectness, we show how these neighborhood lattices may be “graphically” computed using the separation properties of the so-called partial correlation graph. We also draw connections with directed acyclic graphical models and Bayesian networks. We derive these results using an abstract generalization of partial uncorrelatedness, called partial orthogonality, which allows us to use algebraic properties of projection operators on Hilbert spaces to significantly simplify and extend existing ideas and arguments. Consequently, our results apply to a wide range of random objects and data structures, such as random vectors, data matrices, and functions. |
Tasks | Dimensionality Reduction |
Published | 2017-11-03 |
URL | http://arxiv.org/abs/1711.00991v2 |
http://arxiv.org/pdf/1711.00991v2.pdf | |
PWC | https://paperswithcode.com/paper/the-neighborhood-lattice-for-encoding-partial |
Repo | |
Framework | |
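The projection-based notion of partial orthogonality is easy to illustrate for random vectors: two coordinates are partially uncorrelated given a set S exactly when the residuals of their least-squares projections onto the S-coordinates are orthogonal. A numpy sketch of that check (the raw sample residual correlation is reported; a real procedure would apply a statistical test rather than eyeball the value):

```python
import numpy as np

def residual(Z, v):
    """Residual of the least-squares projection of v onto the columns of Z."""
    if Z.shape[1] == 0:
        return v - v.mean()
    coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ coef

def partial_corr(X, i, j, S):
    """Sample correlation of X[:, i] and X[:, j] after projecting out X[:, S]."""
    Z = X[:, list(S)]
    ri, rj = residual(Z, X[:, i]), residual(Z, X[:, j])
    return (ri @ rj) / (np.linalg.norm(ri) * np.linalg.norm(rj))

rng = np.random.default_rng(1)
z = rng.normal(size=500)
x = z + 0.1 * rng.normal(size=500)   # x and y are related only through z
y = z + 0.1 * rng.normal(size=500)
X = np.column_stack([x, y, z])
print(partial_corr(X, 0, 1, S=[]))   # near 1: marginally correlated
print(partial_corr(X, 0, 1, S=[2]))  # near 0: conditionally uncorrelated given z
```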
Strategically knowing how
Title | Strategically knowing how |
Authors | Raul Fervari, Andreas Herzig, Yanjun Li, Yanjing Wang |
Abstract | In this paper, we propose a single-agent logic of goal-directed knowing how extending the standard epistemic logic of knowing that with a new knowing how operator. The semantics of the new operator is based on the idea that knowing how to achieve $\phi$ means that there exists a (uniform) strategy such that the agent knows that it can make sure $\phi$. We give an intuitive axiomatization of our logic and prove the soundness, completeness, and decidability of the logic. The crucial axioms relating knowing that and knowing how illustrate our understanding of knowing how in this setting. This logic can be used in representing both knowledge-that and knowledge-how. |
Tasks | |
Published | 2017-05-15 |
URL | http://arxiv.org/abs/1705.05254v1 |
http://arxiv.org/pdf/1705.05254v1.pdf | |
PWC | https://paperswithcode.com/paper/strategically-knowing-how |
Repo | |
Framework | |
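A toy model checker for the semantics sketched in the abstract, under the simplifying assumption that a strategy is a single action: "knowing how to achieve φ" at state s holds when some action is executable at every state the agent cannot distinguish from s and all of its outcomes satisfy φ. The state space, actions, and names below are entirely hypothetical.

```python
# Epistemic indistinguishability partition and labeled transitions.
epist = [frozenset({'s1', 's2'}), frozenset({'s3'}), frozenset({'s4'})]
trans = {  # (state, action) -> set of successor states
    ('s1', 'a'): {'s3'}, ('s2', 'a'): {'s3'},
    ('s1', 'b'): {'s3'}, ('s2', 'b'): {'s4'},
}
actions = {'a', 'b'}

def cell(s):
    """Epistemic equivalence class of state s."""
    return next(c for c in epist if s in c)

def knows_how(s, goal):
    """One-shot 'knowing how': some single action is executable at
    every state indistinguishable from s, and all of its outcomes
    satisfy the goal. (The paper's uniform strategies may be
    multi-step; one action keeps the sketch short.)"""
    return any(
        all((u, a) in trans and trans[(u, a)] <= goal for u in cell(s))
        for a in actions)

print(knows_how('s1', {'s3'}))  # True: 'a' guarantees s3 from both s1 and s2
print(knows_how('s4', {'s3'}))  # False: no action is executable at s4
```

Note that action 'b' alone would not witness know-how at 's1': it reaches the goal from 's1' but not from the indistinguishable 's2', which is exactly the uniformity requirement.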
An Ensemble Framework for Detecting Community Changes in Dynamic Networks
Title | An Ensemble Framework for Detecting Community Changes in Dynamic Networks |
Authors | Timothy La Fond, Geoffrey Sanders, Christine Klymko, Van Emden Henson |
Abstract | Dynamic networks, especially those representing social networks, undergo constant evolution of their community structure over time. Nodes can migrate between different communities, communities can split into multiple new communities, communities can merge together, etc. In order to represent dynamic networks with evolving communities it is essential to use a dynamic model rather than a static one. Here we use a dynamic stochastic block model where the underlying block model is different at different times. In order to represent the structural changes expressed by this dynamic model, the network is split into discrete time segments and a clustering algorithm assigns block memberships for each segment. In this paper we show that using an ensemble of clustering assignments compensates for the variance of scalable clustering algorithms and produces superior results in terms of pairwise precision and pairwise recall. We also demonstrate that the dynamic clustering produced by the ensemble can be visualized as a flowchart which encapsulates the community evolution succinctly. |
Tasks | |
Published | 2017-07-24 |
URL | http://arxiv.org/abs/1708.08136v1 |
http://arxiv.org/pdf/1708.08136v1.pdf | |
PWC | https://paperswithcode.com/paper/an-ensemble-framework-for-detecting-community |
Repo | |
Framework | |
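A minimal sketch of the ensemble idea on a single time segment: run a randomized base clustering several times, accumulate a co-association matrix counting how often each pair of nodes lands in the same cluster, and cluster that matrix once more for a consensus assignment. K-means is a stand-in here; the paper uses scalable graph clustering on dynamic-SBM segments.

```python
import numpy as np
from sklearn.cluster import KMeans

def consensus_labels(X, k, n_runs=25):
    """Ensemble clustering via a co-association matrix.

    Each base run uses a different random seed; pairs that
    frequently co-cluster get a high co-association score."""
    n = len(X)
    co = np.zeros((n, n))
    for seed in range(n_runs):
        lab = KMeans(n_clusters=k, n_init=1, random_state=seed).fit_predict(X)
        co += (lab[:, None] == lab[None, :])
    co /= n_runs
    # Treat each row of the co-association matrix as a feature vector
    # and cluster once more to obtain the consensus labels.
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(co)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels = consensus_labels(X, k=2)
```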
Learning Approximately Objective Priors
Title | Learning Approximately Objective Priors |
Authors | Eric Nalisnick, Padhraic Smyth |
Abstract | Informative Bayesian priors are often difficult to elicit, and when this is the case, modelers usually turn to noninformative or objective priors. However, objective priors such as the Jeffreys and reference priors are not tractable to derive for many models of interest. We address this issue by proposing techniques for learning reference prior approximations: we select a parametric family and optimize a black-box lower bound on the reference prior objective to find the member of the family that serves as a good approximation. We experimentally demonstrate the method’s effectiveness by recovering Jeffreys priors and learning the Variational Autoencoder’s reference prior. |
Tasks | |
Published | 2017-04-04 |
URL | http://arxiv.org/abs/1704.01168v3 |
http://arxiv.org/pdf/1704.01168v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-approximately-objective-priors |
Repo | |
Framework | |
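The reference prior maximizes the mutual information between parameter and data, so a crude stand-in for the paper's black-box lower-bound optimization is to search a parametric family for the MI maximizer directly. A Monte Carlo sketch for a Bernoulli model with a Beta family, where grid search replaces the paper's gradient-based variational optimization and all constants are illustrative:

```python
import numpy as np
from scipy.stats import beta, binom

def mi_estimate(a, b, n_trials=20, n_mc=1000, seed=0):
    """Plug-in Monte Carlo estimate of I(theta; X) between a
    Beta(a, b) parameter and n_trials Bernoulli observations."""
    rng = np.random.default_rng(seed)
    theta = beta.rvs(a, b, size=n_mc, random_state=rng)
    x = binom.rvs(n_trials, theta, random_state=rng)
    log_lik = binom.logpmf(x, n_trials, theta)
    # Marginal likelihood p(x) estimated with a fresh prior sample.
    theta2 = beta.rvs(a, b, size=n_mc, random_state=rng)
    log_marg = np.log(np.mean(
        binom.pmf(x[:, None], n_trials, theta2[None, :]), axis=1))
    return np.mean(log_lik - log_marg)

grid = [0.25, 0.5, 1.0, 2.0]
best = max(((a, b) for a in grid for b in grid),
           key=lambda ab: mi_estimate(*ab))
print(best)
```

The maximizer should drift toward Beta(1/2, 1/2), the Jeffreys (and reference) prior for the Bernoulli model, which is the kind of recovery result the abstract reports.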
Micro-Doppler Based Human-Robot Classification Using Ensemble and Deep Learning Approaches
Title | Micro-Doppler Based Human-Robot Classification Using Ensemble and Deep Learning Approaches |
Authors | Sherif Abdulatif, Qian Wei, Fady Aziz, Bernhard Kleiner, Urs Schneider |
Abstract | Radar sensors can be used for analyzing the induced frequency shifts due to micro-motions in both range and velocity dimensions identified as micro-Doppler ($\boldsymbol{\mu}$-D) and micro-Range ($\boldsymbol{\mu}$-R), respectively. Different moving targets will have unique $\boldsymbol{\mu}$-D and $\boldsymbol{\mu}$-R signatures that can be used for target classification. Such classification can be used in numerous fields, such as gait recognition, safety and surveillance. In this paper, a 25 GHz FMCW Single-Input Single-Output (SISO) radar is used in industrial safety for real-time human-robot identification. Due to the real-time constraint, joint Range-Doppler (R-D) maps are directly analyzed for our classification problem. Furthermore, a comparison between the conventional classical learning approaches with handcrafted extracted features, ensemble classifiers and deep learning approaches is presented. For ensemble classifiers, restructured range and velocity profiles are passed directly to ensemble trees, such as gradient boosting and random forest without feature extraction. Finally, a Deep Convolutional Neural Network (DCNN) is used and raw R-D images are directly fed into the constructed network. DCNN shows a superior performance of 99% accuracy in identifying humans from robots on a single R-D map. |
Tasks | Gait Recognition |
Published | 2017-11-25 |
URL | http://arxiv.org/abs/1711.09177v3 |
http://arxiv.org/pdf/1711.09177v3.pdf | |
PWC | https://paperswithcode.com/paper/micro-doppler-based-human-robot |
Repo | |
Framework | |
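A small PyTorch sketch in the spirit of the DCNN the abstract feeds raw R-D maps into: a single-channel image goes through a few conv/pool stages and a linear head scores human vs. robot. The input size (64x64), channel widths, and depth are assumptions; the paper does not fix them here.

```python
import torch
import torch.nn as nn

class RDClassifier(nn.Module):
    """Tiny CNN over single-channel range-Doppler maps."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(64 * 8 * 8, n_classes)  # 64x64 input -> 8x8 maps

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = RDClassifier()
logits = model(torch.randn(4, 1, 64, 64))  # a batch of 4 R-D maps
print(logits.shape)                        # torch.Size([4, 2])
```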
Locality preserving projection on SPD matrix Lie group: algorithm and analysis
Title | Locality preserving projection on SPD matrix Lie group: algorithm and analysis |
Authors | Yangyang Li, Ruqian Lu |
Abstract | Symmetric positive definite (SPD) matrices used as feature descriptors in image recognition are usually high dimensional. Traditional manifold learning is only applicable to reducing the dimension of high-dimensional vector-form data. For high-dimensional SPD matrices, directly using manifold learning algorithms to reduce the dimension of matrix-form data is impossible. The SPD matrix must first be transformed into a long vector, and then the dimension of this vector must be reduced. However, this approach breaks the spatial structure of the SPD matrix space. To overcome this limitation, we propose a new dimension reduction algorithm on SPD matrix space to transform high-dimensional SPD matrices into low-dimensional SPD matrices. Our work is based on the fact that the set of all SPD matrices with the same size has a Lie group structure, and we aim to transfer manifold learning to the SPD matrix Lie group. We use the basic idea of the manifold learning algorithm called locality preserving projection (LPP) to construct the corresponding Laplacian matrix on the SPD matrix Lie group. Thus, we call our approach Lie-LPP to emphasize its Lie group character. We present a detailed algorithm analysis and show through experiments that Lie-LPP achieves effective results on human action recognition and human face recognition. |
Tasks | Dimensionality Reduction, Face Recognition, Temporal Action Localization |
Published | 2017-03-28 |
URL | http://arxiv.org/abs/1703.09499v2 |
http://arxiv.org/pdf/1703.09499v2.pdf | |
PWC | https://paperswithcode.com/paper/locality-preserving-projection-on-spd-matrix |
Repo | |
Framework | |
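A simplified sketch of the Lie-LPP pipeline: embed each SPD matrix via the matrix logarithm (log-Euclidean coordinates), then run standard locality preserving projection on the vectorized logs. The paper constructs the Laplacian on the SPD Lie group itself; this vectorized variant is an assumption made to keep the sketch short, as are the neighborhood size and kernel width.

```python
import numpy as np
from scipy.linalg import logm, eigh

def lie_lpp(spds, n_neighbors=5, n_components=2, sigma=1.0):
    """LPP on log-Euclidean coordinates of SPD matrices."""
    d = spds[0].shape[0]
    X = np.array([logm(S)[np.triu_indices(d)] for S in spds]).real.T
    d2 = ((X.T[:, None] - X.T[None]) ** 2).sum(-1)
    W = np.exp(-d2 / sigma)
    far = np.argsort(d2, axis=1)[:, n_neighbors + 1:]
    for i, cols in enumerate(far):        # keep only k nearest neighbors
        W[i, cols] = 0.0
    W = np.maximum(W, W.T)                # symmetrize the affinity graph
    D = np.diag(W.sum(1)); L = D - W
    # Generalized eigenproblem of LPP: X L X^T a = lam X D X^T a.
    vals, vecs = eigh(X @ L @ X.T, X @ D @ X.T)
    return vecs[:, :n_components].T @ X   # low-dimensional embedding

rng = np.random.default_rng(0)
spds = []
for _ in range(30):
    B = rng.normal(size=(4, 4))
    spds.append(np.eye(4) + 0.1 * B @ B.T)   # random SPD matrices
print(lie_lpp(spds).shape)                   # (2, 30)
```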
MultiNet: Multi-Modal Multi-Task Learning for Autonomous Driving
Title | MultiNet: Multi-Modal Multi-Task Learning for Autonomous Driving |
Authors | Sauhaarda Chowdhuri, Tushar Pankaj, Karl Zipser |
Abstract | Autonomous driving requires operation in different behavioral modes ranging from lane following and intersection crossing to turning and stopping. However, most existing deep learning approaches to autonomous driving do not consider the behavioral mode in the training strategy. This paper describes a technique for learning multiple distinct behavioral modes in a single deep neural network through the use of multi-modal multi-task learning. We study the effectiveness of this approach, denoted MultiNet, using self-driving model cars for driving in unstructured environments such as sidewalks and unpaved roads. Using labeled data from over one hundred hours of driving our fleet of 1/10th scale model cars, we trained different neural networks to predict the steering angle and driving speed of the vehicle in different behavioral modes. We show that in each case, MultiNet networks outperform networks trained on individual modes while using a fraction of the total number of parameters. |
Tasks | Autonomous Driving, Multi-Task Learning |
Published | 2017-09-16 |
URL | http://arxiv.org/abs/1709.05581v4 |
http://arxiv.org/pdf/1709.05581v4.pdf | |
PWC | https://paperswithcode.com/paper/multinet-multi-modal-multi-task-learning-for |
Repo | |
Framework | |
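A sketch of the MultiNet idea: one network, several behavioral modes, with the mode fed in as a one-hot vector alongside image features, and separate heads for steering angle and driving speed. Layer sizes and the fusion point are assumptions; the abstract describes the general technique, not this exact architecture.

```python
import torch
import torch.nn as nn

class ModeConditionedDriver(nn.Module):
    """Multi-modal multi-task sketch: image + behavioral mode in,
    steering and speed out."""
    def __init__(self, n_modes=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fuse = nn.Sequential(nn.Linear(32 + n_modes, 64), nn.ReLU())
        self.steer = nn.Linear(64, 1)   # steering-angle head
        self.speed = nn.Linear(64, 1)   # driving-speed head

    def forward(self, image, mode_onehot):
        h = self.fuse(torch.cat([self.encoder(image), mode_onehot], dim=1))
        return self.steer(h), self.speed(h)

net = ModeConditionedDriver(n_modes=3)
img = torch.randn(2, 3, 96, 96)
mode = torch.eye(3)[[0, 2]]             # two samples, in modes 0 and 2
steer, speed = net(img, mode)
```

Conditioning one shared trunk on the mode is what lets a single network cover all modes with a fraction of the parameters of per-mode networks, which is the comparison the abstract reports.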
A polynomial time algorithm for the Lambek calculus with brackets of bounded order
Title | A polynomial time algorithm for the Lambek calculus with brackets of bounded order |
Authors | Max Kanovich, Stepan Kuznetsov, Glyn Morrill, Andre Scedrov |
Abstract | Lambek calculus is a logical foundation of categorial grammar, a linguistic paradigm of grammar as logic and parsing as deduction. Pentus (2010) gave a polynomial-time algorithm for determining provability of bounded-depth formulas in the Lambek calculus with empty antecedents allowed. Pentus’ algorithm is based on tabularisation of proof nets. Lambek calculus with brackets is a conservative extension of Lambek calculus with bracket modalities, suitable for the modeling of syntactical domains. In this paper we give an algorithm for provability in the Lambek calculus with brackets allowing empty antecedents. Our algorithm runs in polynomial time when both the formula depth and the bracket nesting depth are bounded. It combines a Pentus-style tabularisation of proof nets with an automata-theoretic treatment of bracketing. |
Tasks | |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00694v2 |
http://arxiv.org/pdf/1705.00694v2.pdf | |
PWC | https://paperswithcode.com/paper/a-polynomial-time-algorithm-for-the-lambek |
Repo | |
Framework | |
Domain Adaptation from Synthesis to Reality in Single-model Detector for Video Smoke Detection
Title | Domain Adaptation from Synthesis to Reality in Single-model Detector for Video Smoke Detection |
Authors | Gao Xu, Yongming Zhang, Qixing Zhang, Gaohua Lin, Jinjun Wang |
Abstract | This paper proposes a method for video smoke detection using synthetic smoke samples. The virtual data can automatically offer precise and rich annotated samples. However, the learning of smoke representations will be hurt by the appearance gap between real and synthetic smoke samples. Existing research mainly works on adaptation to samples extracted from the original annotated samples, treating object detection and domain adaptation as two independent parts. To train a strong detector with rich synthetic samples, we construct the adaptation at the detection layer of state-of-the-art single-model detectors (SSD and MS-CNN). The training procedure is end-to-end: classification, localization, and adaptation are combined in the learning. The performance of the proposed model surpasses the original baseline in our experiments. Meanwhile, our results show that detectors based on adversarial adaptation are superior to detectors based on discrepancy adaptation. Code will be made publicly available at http://smoke.ustc.edu.cn. Moreover, the domain adaptation for a two-stage detector is described in Appendix A. |
Tasks | Domain Adaptation, Object Detection |
Published | 2017-09-24 |
URL | http://arxiv.org/abs/1709.08142v3 |
http://arxiv.org/pdf/1709.08142v3.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptation-from-synthesis-to-reality |
Repo | |
Framework | |
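The abstract does not spell out its adversarial formulation; one standard way to realize adversarial feature alignment at a detection layer is a gradient reversal layer (Ganin & Lempitsky) feeding a domain classifier, sketched below. The feature dimension (256) and the two-layer domain head are assumptions; the paper attaches the adaptation to SSD / MS-CNN detection layers.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity on the forward pass, negated
    (scaled) gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

# A domain classifier on detection-layer features: training through
# the reversed gradient pushes synthetic and real smoke features to
# become indistinguishable.
domain_head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2))

features = torch.randn(8, 256, requires_grad=True)   # detection features
domain_logits = domain_head(GradReverse.apply(features, 1.0))
loss = nn.functional.cross_entropy(
    domain_logits,
    torch.tensor([0, 0, 0, 0, 1, 1, 1, 1]))          # 0=synthetic, 1=real
loss.backward()   # feature gradients are reversed, confusing the domain head
```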
A New Family of Near-metrics for Universal Similarity
Title | A New Family of Near-metrics for Universal Similarity |
Authors | Chu Wang, Iraj Saniee, William S. Kennedy, Chris A. White |
Abstract | We propose a family of near-metrics based on local graph diffusion to capture similarity for a wide class of data sets. These quasi-metametrics, as their names suggest, dispense with one or two standard axioms of metric spaces, specifically distinguishability and symmetry, so that similarity between data points of arbitrary type and form can be measured broadly and effectively. The proposed near-metric family includes the forward k-step diffusion and its reverse, typically on the graph consisting of data objects and their features. By construction, this family of near-metrics is particularly appropriate for categorical data, continuous data, and vector representations of images and text extracted via deep learning approaches. We conduct extensive experiments to evaluate the performance of this family of similarity measures and compare and contrast with traditional measures of similarity used for each specific application and with the ground truth when available. We show that for structured data including categorical and continuous data, the near-metrics corresponding to normalized forward k-step diffusion (k small) work as one of the best performing similarity measures; for vector representations of text and images including those extracted from deep learning, the near-metrics derived from normalized and reverse k-step graph diffusion (k very small) exhibit outstanding ability to distinguish data points from different classes. |
Tasks | |
Published | 2017-07-21 |
URL | http://arxiv.org/abs/1707.06903v3 |
http://arxiv.org/pdf/1707.06903v3.pdf | |
PWC | https://paperswithcode.com/paper/a-new-family-of-near-metrics-for-universal |
Repo | |
Framework | |
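A sketch of the forward k-step diffusion on an object-feature graph: a random walk alternates object → feature → object, and two objects are compared through their k-step distributions. The row normalization and the L1 comparison below are assumptions about details the abstract leaves open; the reverse member of the family would use the transposed walk.

```python
import numpy as np

def forward_k_step(A, k=2):
    """Object-to-object forward k-step diffusion on the bipartite
    object-by-feature incidence matrix A."""
    n, m = A.shape
    P = np.zeros((n + m, n + m))
    P[:n, n:] = A / A.sum(1, keepdims=True)   # object -> feature
    P[n:, :n] = A.T / A.sum(0)[:, None]       # feature -> object
    Pk = np.linalg.matrix_power(P, 2 * k)     # k object-to-object hops
    return Pk[:n, :n]

def near_metric(A, i, j, k=2):
    """L1 gap between the k-step distributions started at i and j."""
    D = forward_k_step(A, k)
    return np.abs(D[i] - D[j]).sum()

A = np.array([[1, 1, 0, 0],   # objects 0 and 1 share features; 2 differs
              [1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)
print(near_metric(A, 0, 1), near_metric(A, 0, 2))  # small vs. large
```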
Delving Deeper into MOOC Student Dropout Prediction
Title | Delving Deeper into MOOC Student Dropout Prediction |
Authors | Jacob Whitehill, Kiran Mohan, Daniel Seaton, Yigal Rosen, Dustin Tingley |
Abstract | In order to obtain reliable accuracy estimates for automatic MOOC dropout predictors, it is important to train and test them in a manner consistent with how they will be used in practice. Yet most prior research on MOOC dropout prediction has measured test accuracy on the same course used for training the classifier, which can lead to overly optimistic accuracy estimates. In order to understand better how accuracy is affected by the training+testing regime, we compared the accuracy of a standard dropout prediction architecture (clickstream features + logistic regression) across 4 different training paradigms. Results suggest that (1) training and testing on the same course (“post-hoc”) can overestimate accuracy by several percentage points; (2) dropout classifiers trained on proxy labels based on students’ persistence are surprisingly competitive with post-hoc training (87.33% versus 90.20% AUC averaged over 8 weeks of 40 HarvardX MOOCs); and (3) classifier performance does not vary significantly with the academic discipline. Finally, we also investigate new dropout prediction architectures based on deep, fully-connected, feed-forward neural networks and find that (4) networks with as many as 5 hidden layers can statistically significantly increase test accuracy over that of logistic regression. |
Tasks | |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06404v1 |
http://arxiv.org/pdf/1702.06404v1.pdf | |
PWC | https://paperswithcode.com/paper/delving-deeper-into-mooc-student-dropout |
Repo | |
Framework | |
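A sketch of the regime comparison at the heart of the abstract: the same clickstream-features + logistic-regression pipeline evaluated "post-hoc" (train and test on one course) versus cross-course (train on course A, test on course B). The synthetic features stand in for real clickstream counts, and the distribution shift between the two courses is an assumption made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_course(n=2000, shift=0.0):
    """Synthetic course: 5 clickstream-like features, dropout labels."""
    X = rng.normal(shift, 1.0, size=(n, 5))
    p = 1 / (1 + np.exp(-(X @ np.array([1, -1, 0.5, 0, 0]) - shift)))
    return X, rng.binomial(1, p)

Xa, ya = make_course()               # training course
Xb, yb = make_course(shift=0.5)      # a different course

model = LogisticRegression(max_iter=1000).fit(Xa, ya)
print('post-hoc AUC    :', roc_auc_score(ya, model.predict_proba(Xa)[:, 1]))
print('cross-course AUC:', roc_auc_score(yb, model.predict_proba(Xb)[:, 1]))
```

The gap between the two printed AUCs is the kind of optimism the paper quantifies on real HarvardX courses.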
A Strategy for an Uncompromising Incremental Learner
Title | A Strategy for an Uncompromising Incremental Learner |
Authors | Ragav Venkatesan, Hemanth Venkateswara, Sethuraman Panchanathan, Baoxin Li |
Abstract | Multi-class supervised learning systems require the knowledge of the entire range of labels they predict. Often when learnt incrementally, they suffer from catastrophic forgetting. To avoid this, generous leeways have to be made to the philosophy of incremental learning that either force a part of the machine to not learn, or retrain the machine again with a selection of the historic data. While these hacks work to various degrees, they do not adhere to the spirit of incremental learning. In this article, we redefine incremental learning with stringent conditions that do not allow for any undesirable relaxations and assumptions. We design a strategy involving generative models and the distillation of dark knowledge as a means of hallucinating data along with appropriate targets from past distributions. We call this technique phantom sampling. Using an implementation based on deep neural networks, we show that phantom sampling substantially mitigates catastrophic forgetting during incremental learning. We apply these strategies to competitive multi-class incremental learning of deep neural networks. Using various benchmark datasets, we demonstrate that strict incremental learning can be achieved with our strategy. We further put our strategy to the test on challenging cases, including cross-domain increments and incrementing on a novel label space. We also propose a trivial extension to unbounded-continual learning and identify potential for future development. |
Tasks | Continual Learning |
Published | 2017-05-02 |
URL | http://arxiv.org/abs/1705.00744v2 |
http://arxiv.org/pdf/1705.00744v2.pdf | |
PWC | https://paperswithcode.com/paper/a-strategy-for-an-uncompromising-incremental |
Repo | |
Framework | |
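A sketch of phantom sampling as the abstract describes it: a generative model hallucinates data from past distributions, the frozen old network supplies soft "dark knowledge" targets, and the incremental network trains on phantom and new samples together. The MLP sizes, the temperature, and the Gaussian stand-in for the generator are assumptions, not the paper's exact components.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

old_net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 5))
new_net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 7))
old_net.eval()                              # frozen teacher over 5 old classes
opt = torch.optim.Adam(new_net.parameters(), lr=1e-3)
T = 2.0                                     # distillation temperature

for step in range(100):
    phantom = torch.randn(32, 16)           # generator stand-in for past data
    x_new = torch.randn(32, 16)             # data for the 2 novel classes
    y_new = torch.randint(5, 7, (32,))
    with torch.no_grad():
        soft = F.softmax(old_net(phantom) / T, dim=1)   # dark-knowledge targets
    logits_old = new_net(phantom)[:, :5]    # old-class outputs of the student
    distill = F.kl_div(F.log_softmax(logits_old / T, dim=1), soft,
                       reduction='batchmean') * T * T
    fresh = F.cross_entropy(new_net(x_new), y_new)
    opt.zero_grad(); (distill + fresh).backward(); opt.step()
```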
Frequency patterns of semantic change: Corpus-based evidence of a near-critical dynamics in language change
Title | Frequency patterns of semantic change: Corpus-based evidence of a near-critical dynamics in language change |
Authors | Quentin Feltgen, Benjamin Fagard, Jean-Pierre Nadal |
Abstract | It is generally believed that, when a linguistic item acquires a new meaning, its overall frequency of use in the language rises with time with an S-shaped growth curve. Yet, this claim has only been supported by a limited number of case studies. In this paper, we provide the first corpus-based quantitative confirmation of the genericity of the S-curve in language change. Moreover, we uncover another generic pattern, a latency phase of variable duration preceding the S-growth, during which the frequency of use of the semantically expanding word remains low and more or less constant. We also propose a usage-based model of language change supported by cognitive considerations, which predicts that both phases, the latency and the fast S-growth, take place. The driving mechanism is a stochastic dynamics, a random walk in the space of frequency of use. The underlying deterministic dynamics highlights the role of a control parameter, the strength of the cognitive impetus governing the onset of change, which tunes the system in the vicinity of a saddle-node bifurcation. In the neighborhood of the critical point, the latency phase corresponds to the diffusion time over the critical region, and the S-growth to the fast convergence that follows. The duration of the two phases is computed as specific first passage times of the random walk process, leading to distributions that closely fit those extracted from our dataset. We argue that our results are not specific to the studied corpus, but apply to semantic change in general. |
Tasks | |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00203v3 |
http://arxiv.org/pdf/1703.00203v3.pdf | |
PWC | https://paperswithcode.com/paper/frequency-patterns-of-semantic-change-corpus |
Repo | |
Framework | |
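The two generic patterns the abstract reports, a low flat latency phase followed by fast S-growth, are captured qualitatively by a logistic curve, and fitting one to a frequency series recovers the onset time and growth rate. A sketch on synthetic data (all numbers illustrative; the paper's model is a random walk near a saddle-node bifurcation, not a deterministic logistic):

```python
import numpy as np
from scipy.optimize import curve_fit

def s_curve(t, f_max, rate, t0):
    """Logistic growth: low plateau (latency), fast rise, saturation."""
    return f_max / (1 + np.exp(-rate * (t - t0)))

rng = np.random.default_rng(0)
t = np.arange(200, dtype=float)
freq = s_curve(t, f_max=1e-4, rate=0.1, t0=120) \
       + 2e-6 * rng.normal(size=t.size)   # noisy frequency-of-use series

params, _ = curve_fit(s_curve, t, freq, p0=(1e-4, 0.05, 100))
print(params)   # recovered (f_max, rate, t0)
```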
Perspectival Knowledge in PSOA RuleML: Representation, Model Theory, and Translation
Title | Perspectival Knowledge in PSOA RuleML: Representation, Model Theory, and Translation |
Authors | Harold Boley, Gen Zou |
Abstract | In Positional-Slotted Object-Applicative (PSOA) RuleML, a predicate application (atom) can have an Object IDentifier (OID) and descriptors that may be positional arguments (tuples) or attribute-value pairs (slots). PSOA RuleML explicitly specifies for each descriptor whether it is to be interpreted under the perspective of the predicate in whose scope it occurs. This predicate-dependency dimension refines the space between oidless, positional atoms (relationships) and oidful, slotted atoms (framepoints): While relationships use only a predicate-scope-sensitive (predicate-dependent) tuple and framepoints use only predicate-scope-insensitive (predicate-independent) slots, PSOA uses a systematics of orthogonal constructs also permitting atoms with (predicate-)independent tuples and atoms with (predicate-)dependent slots. This supports data and knowledge representation where a slot attribute can have different values depending on the predicate. PSOA thus extends object-oriented multi-membership and multiple inheritance. Based on objectification, PSOA laws are given: Besides unscoping and centralization, the semantic restriction and transformation of describution permits rescoping of one atom’s independent descriptors to another atom with the same OID but a different predicate. For inheritance, default descriptors are realized by rules. On top of a metamodel and a Grailog visualization, PSOA’s atom systematics for facts, queries, and rules is explained. The presentation and (XML-)serialization syntaxes of PSOA RuleML are introduced. Its model-theoretic semantics is formalized by extending the interpretation functions for dependent descriptors. The open-source PSOATransRun system realizes PSOA RuleML by a translator to runtime predicates, including for dependent tuples (prdtupterm) and slots (prdsloterm). Our tests show efficiency advantages of dependent and tupled modeling. |
Tasks | |
Published | 2017-12-07 |
URL | https://arxiv.org/abs/1712.02869v3 |
https://arxiv.org/pdf/1712.02869v3.pdf | |
PWC | https://paperswithcode.com/paper/perspectival-knowledge-in-psoa-ruleml |
Repo | |
Framework | |
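A toy rendering of the atom systematics the abstract describes: a PSOA atom applies a predicate to an optional OID plus descriptors, where every tuple and slot carries a flag for predicate dependence. Field names and the three examples are illustrative only, not PSOA RuleML's actual presentation or XML serialization syntax.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple

@dataclass
class Atom:
    """Predicate application with OID, tuples, and slots; each
    descriptor records whether it is predicate-dependent."""
    predicate: str
    oid: Optional[str] = None
    tuples: List[Tuple[tuple, bool]] = field(default_factory=list)   # (args, dependent?)
    slots: List[Tuple[str, Any, bool]] = field(default_factory=list) # (attr, value, dependent?)

# A relationship: oidless, a single dependent tuple.
rel = Atom('betweenRel', tuples=[(('a', 'b', 'c'), True)])
# A framepoint: oidful, predicate-independent slots only.
frame = Atom('person', oid='o1',
             slots=[('name', 'Ada', False), ('age', 36, False)])
# The orthogonal systematics also permits mixes, e.g. an independent
# tuple together with a dependent slot on the same oidful atom.
mixed = Atom('family', oid='o2', tuples=[(('Ada', 'Alan'), False)],
             slots=[('child', 'Kit', True)])
```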