Paper Group ANR 223
Unsupervised feature learning from finite data by message passing: discontinuous versus continuous phase transition
Title | Unsupervised feature learning from finite data by message passing: discontinuous versus continuous phase transition |
Authors | Haiping Huang, Taro Toyoizumi |
Abstract | Unsupervised neural network learning extracts hidden features from unlabeled training data. This is used as a pretraining step for further supervised learning in deep networks. Hence, understanding unsupervised learning is of fundamental importance. Here, we study the unsupervised learning from a finite number of data, based on the restricted Boltzmann machine learning. Our study inspires an efficient message passing algorithm to infer the hidden feature, and estimate the entropy of candidate features consistent with the data. Our analysis reveals that the learning requires only a few data if the feature is salient and extensively many if the feature is weak. Moreover, the entropy of candidate features monotonically decreases with data size and becomes negative (i.e., entropy crisis) before the message passing becomes unstable, suggesting a discontinuous phase transition. In terms of convergence time of the message passing algorithm, the unsupervised learning exhibits an easy-hard-easy phenomenon as the training data size increases. All these properties are reproduced in an approximate Hopfield model, with an exception that the entropy crisis is absent, and only continuous phase transition is observed. This key difference is also confirmed in a handwritten digits dataset. This study deepens our understanding of unsupervised learning from a finite number of data, and may provide insights into its role in training deep networks. |
Tasks | |
Published | 2016-08-12 |
URL | http://arxiv.org/abs/1608.03714v2 |
http://arxiv.org/pdf/1608.03714v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-feature-learning-from-finite |
Repo | |
Framework | |
Commonsense Knowledge Enhanced Embeddings for Solving Pronoun Disambiguation Problems in Winograd Schema Challenge
Title | Commonsense Knowledge Enhanced Embeddings for Solving Pronoun Disambiguation Problems in Winograd Schema Challenge |
Authors | Quan Liu, Hui Jiang, Zhen-Hua Ling, Xiaodan Zhu, Si Wei, Yu Hu |
Abstract | In this paper, we propose commonsense knowledge enhanced embeddings (KEE) for solving the Pronoun Disambiguation Problems (PDP). The PDP task we investigate in this paper is a complex coreference resolution task which requires the utilization of commonsense knowledge. This task is a standard first round test set in the 2016 Winograd Schema Challenge. In this task, traditional linguistic features that are useful for coreference resolution, e.g. context and gender information, are no longer effective. Therefore, the KEE models are proposed to provide a general framework to make use of commonsense knowledge for solving the PDP problems. Since the PDP task doesn’t have training data, the KEE models would be used during the unsupervised feature extraction process. To evaluate the effectiveness of the KEE models, we propose to incorporate various commonsense knowledge bases, including ConceptNet, WordNet, and CauseCom, into the KEE training process. We achieved the best performance by applying the proposed methods to the 2016 Winograd Schema Challenge. In addition, experiments conducted on the standard PDP task indicate that the proposed KEE models could solve the PDP problems by achieving 66.7% accuracy, which is a new state-of-the-art performance. |
Tasks | Coreference Resolution |
Published | 2016-11-13 |
URL | http://arxiv.org/abs/1611.04146v2 |
http://arxiv.org/pdf/1611.04146v2.pdf | |
PWC | https://paperswithcode.com/paper/commonsense-knowledge-enhanced-embeddings-for |
Repo | |
Framework | |
`Who would have thought of that!': A Hierarchical Topic Model for Extraction of Sarcasm-prevalent Topics and Sarcasm Detection
Title | `Who would have thought of that!': A Hierarchical Topic Model for Extraction of Sarcasm-prevalent Topics and Sarcasm Detection |
Authors | Aditya Joshi, Prayas Jain, Pushpak Bhattacharyya, Mark Carman |
Abstract | Topic Models have been reported to be beneficial for aspect-based sentiment analysis. This paper reports a simple topic model for sarcasm detection, a first, to the best of our knowledge. Designed on the basis of the intuition that sarcastic tweets are likely to have a mixture of words of both sentiments as against tweets with literal sentiment (either positive or negative), our hierarchical topic model discovers sarcasm-prevalent topics and topic-level sentiment. Using a dataset of tweets labeled using hashtags, the model estimates topic-level and sentiment-level distributions. Our evaluation shows that topics such as `work', `gun laws', `weather' are sarcasm-prevalent topics. Our model is also able to discover the mixture of sentiment-bearing words that exist in a text of a given sentiment-related label. Finally, we apply our model to predict sarcasm in tweets. We outperform two prior works based on statistical classifiers with specific features, by around 25%. |
Tasks | Aspect-Based Sentiment Analysis, Sarcasm Detection, Sentiment Analysis, Topic Models |
Published | 2016-11-14 |
URL | http://arxiv.org/abs/1611.04326v2 |
http://arxiv.org/pdf/1611.04326v2.pdf | |
PWC | https://paperswithcode.com/paper/who-would-have-thought-of-that-a-hierarchical |
Repo | |
Framework | |
PCT and Beyond: Towards a Computational Framework for `Intelligent’ Communicative Systems
Title | PCT and Beyond: Towards a Computational Framework for `Intelligent’ Communicative Systems |
Authors | Prof. Roger K. Moore |
Abstract | Recent years have witnessed increasing interest in the potential benefits of `intelligent’ autonomous machines such as robots. Honda’s Asimo humanoid robot, iRobot’s Roomba robot vacuum cleaner and Google’s driverless cars have fired the imagination of the general public, and social media buzz with speculation about a utopian world of helpful robot assistants or the coming robot apocalypse! However, there is a long way to go before autonomous systems reach the level of capabilities required for even the simplest of tasks involving human-robot interaction - especially if it involves communicative behaviour such as speech and language. Of course the field of Artificial Intelligence (AI) has made great strides in these areas, and has moved on from abstract high-level rule-based paradigms to embodied architectures whose operations are grounded in real physical environments. What is still missing, however, is an overarching theory of intelligent communicative behaviour that informs system-level design decisions in order to provide a more coherent approach to system integration. This chapter introduces the beginnings of such a framework inspired by the principles of Perceptual Control Theory (PCT). In particular, it is observed that PCT has hitherto tended to view perceptual processes as a relatively straightforward series of transformations from sensation to perception, and has overlooked the potential of powerful generative model-based solutions that have emerged in practical fields such as visual or auditory scene analysis. Starting from first principles, a sequence of arguments is presented which not only shows how these ideas might be integrated into PCT, but which also extends PCT towards a remarkably symmetric architecture for a needs-driven communicative agent. It is concluded that, if behaviour is the control of perception, then perception is the simulation of behaviour. |
Tasks | |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05379v1 |
http://arxiv.org/pdf/1611.05379v1.pdf | |
PWC | https://paperswithcode.com/paper/pct-and-beyond-towards-a-computational |
Repo | |
Framework | |
Distribution Free Learning with Local Queries
Title | Distribution Free Learning with Local Queries |
Authors | Galit Bary-Weisberg, Amit Daniely, Shai Shalev-Shwartz |
Abstract | The model of learning with local membership queries interpolates between the PAC model and the membership queries model by allowing the learner to query the label of any example that is similar to an example in the training set. This model, recently proposed and studied by Awasthi, Feldman and Kanade, aims to facilitate practical use of membership queries. We continue this line of work, proving both positive and negative results in the distribution free setting. We restrict to the boolean cube $\{-1, 1\}^n$, and say that a query is $q$-local if it is of a Hamming distance $\le q$ from some training example. On the positive side, we show that $1$-local queries already give an additional strength, and allow to learn a certain type of DNF formulas. On the negative side, we show that even $\left(n^{0.99}\right)$-local queries cannot help to learn various classes including Automata, DNFs and more. Likewise, $q$-local queries for any constant $q$ cannot help to learn Juntas, Decision Trees, Sparse Polynomials and more. Moreover, for these classes, an algorithm that uses $\left(\log^{0.99}(n)\right)$-local queries would lead to a breakthrough in the best known running times. |
Tasks | |
Published | 2016-03-11 |
URL | http://arxiv.org/abs/1603.03714v1 |
http://arxiv.org/pdf/1603.03714v1.pdf | |
PWC | https://paperswithcode.com/paper/distribution-free-learning-with-local-queries |
Repo | |
Framework | |
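The $q$-local condition in the abstract above can be illustrated with a short sketch. This is only an illustration of the definition (a query is $q$-local if it lies within Hamming distance $q$ of some training example on $\{-1, 1\}^n$), not code from the paper; the function names `hamming_distance` and `is_q_local` and the toy training set are assumptions made for this example.

```python
from typing import Iterable, Sequence


def hamming_distance(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of coordinates in which x and y differ."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return sum(1 for a, b in zip(x, y) if a != b)


def is_q_local(query: Sequence[int],
               training_examples: Iterable[Sequence[int]],
               q: int) -> bool:
    """True if the query is within Hamming distance q of some training example."""
    return any(hamming_distance(query, ex) <= q for ex in training_examples)


# Hypothetical toy training set on the boolean cube {-1, 1}^4.
train = [(1, 1, 1, 1), (-1, -1, 1, 1)]

print(is_q_local((1, 1, 1, -1), train, q=1))    # one flip away from the first example
print(is_q_local((1, -1, -1, -1), train, q=1))  # three flips away from both examples
```

Under this definition, a learner restricted to $1$-local queries may only ask about points that differ from a training example in at most one coordinate.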
Automatic Sarcasm Detection: A Survey
Title | Automatic Sarcasm Detection: A Survey |
Authors | Aditya Joshi, Pushpak Bhattacharyya, Mark James Carman |
Abstract | Automatic sarcasm detection is the task of predicting sarcasm in text. This is a crucial step to sentiment analysis, considering prevalence and challenges of sarcasm in sentiment-bearing text. Beginning with an approach that used speech-based features, sarcasm detection has witnessed great interest from the sentiment analysis community. This paper is the first known compilation of past work in automatic sarcasm detection. We observe three milestones in the research so far: semi-supervised pattern extraction to identify implicit sentiment, use of hashtag-based supervision, and use of context beyond target text. In this paper, we describe datasets, approaches, trends and issues in sarcasm detection. We also discuss representative performance values, shared tasks and pointers to future work, as given in prior works. In terms of resources that could be useful for understanding state-of-the-art, the survey presents several useful illustrations - most prominently, a table that summarizes past papers along different dimensions such as features, annotation techniques, data forms, etc. |
Tasks | Sarcasm Detection, Sentiment Analysis |
Published | 2016-02-10 |
URL | http://arxiv.org/abs/1602.03426v2 |
http://arxiv.org/pdf/1602.03426v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-sarcasm-detection-a-survey |
Repo | |
Framework | |
Feature Selection Library (MATLAB Toolbox)
Title | Feature Selection Library (MATLAB Toolbox) |
Authors | Giorgio Roffo |
Abstract | Feature Selection Library (FSLib) is a widely applicable MATLAB library for Feature Selection (FS). FS is an essential component of machine learning and data mining which has been studied for many years under many different conditions and in diverse scenarios. These algorithms aim at ranking and selecting a subset of relevant features according to their degrees of relevance, preference, or importance as defined in a specific application. Because feature selection can reduce the number of features used for training classification models, it alleviates the effect of the curse of dimensionality, speeds up the learning process, improves the model’s performance, and enhances data understanding. This short report provides an overview of the feature selection algorithms included in the FSLib MATLAB toolbox among filter, embedded, and wrapper methods. |
Tasks | Feature Selection |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01327v6 |
http://arxiv.org/pdf/1607.01327v6.pdf | |
PWC | https://paperswithcode.com/paper/feature-selection-library-matlab-toolbox |
Repo | |
Framework | |
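As a rough illustration of what a filter method does (ranking features by a relevance score computed independently of any classifier), the sketch below ranks features by the absolute Pearson correlation with the label. It is written in Python rather than MATLAB and does not use or reproduce FSLib; the function names `pearson` and `rank_features` and the toy data are assumptions for this example only.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows: Sequence[Sequence[float]],
                  labels: Sequence[float]) -> List[int]:
    """Return feature indices sorted from most to least relevant."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append(abs(pearson(column, labels)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: feature 0 copies the label, feature 1 is constant,
# feature 2 is only partially related to the label.
rows = [(0, 5, 0), (1, 5, 1), (0, 5, 1), (1, 5, 1)]
labels = [0, 1, 0, 1]

print(rank_features(rows, labels))
```

A subset would then be selected by keeping the first few indices of the ranking; embedded and wrapper methods instead involve the learning algorithm itself in the scoring.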
Stochastic single flux quantum neuromorphic computing using magnetically tunable Josephson junctions
Title | Stochastic single flux quantum neuromorphic computing using magnetically tunable Josephson junctions |
Authors | S. E. Russek, C. A. Donnelly, M. L. Schneider, B. Baek, M. R. Pufall, W. H. Rippard, P. F. Hopkins, P. D. Dresselhaus, S. P. Benz |
Abstract | Single flux quantum (SFQ) circuits form a natural neuromorphic technology with SFQ pulses and superconducting transmission lines simulating action potentials and axons, respectively. Here we present a new component, magnetic Josephson junctions, that have a tunability and re-configurability that was lacking from previous SFQ neuromorphic circuits. The nanoscale magnetic structure acts as a tunable synaptic constituent that modifies the junction critical current. These circuits can operate near the thermal limit where stochastic firing of the neurons is an essential component of the technology. This technology has the ability to create complex neural systems with greater than $10^{21}$ neural firings per second with approximately 1 W dissipation. |
Tasks | |
Published | 2016-11-12 |
URL | http://arxiv.org/abs/1612.09292v1 |
http://arxiv.org/pdf/1612.09292v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-single-flux-quantum-neuromorphic |
Repo | |
Framework | |
Deep Active Contours
Title | Deep Active Contours |
Authors | Christian Rupprecht, Elizabeth Huaroc, Maximilian Baust, Nassir Navab |
Abstract | We propose a method for interactive boundary extraction which combines a deep, patch-based representation with an active contour framework. We train a class-specific convolutional neural network which predicts a vector pointing from the respective point on the evolving contour towards the closest point on the boundary of the object of interest. These predictions form a vector field which is then used for evolving the contour by the Sobolev active contour framework proposed by Sundaramoorthi et al. The resulting interactive segmentation method is very efficient in terms of required computational resources and can even be trained on comparatively small graphics cards. We evaluate the potential of the proposed method on both medical and non-medical challenge data sets, such as the STACOM data set and the PASCAL VOC 2012 data set. |
Tasks | Interactive Segmentation |
Published | 2016-07-18 |
URL | http://arxiv.org/abs/1607.05074v1 |
http://arxiv.org/pdf/1607.05074v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-active-contours |
Repo | |
Framework | |
Ego2Top: Matching Viewers in Egocentric and Top-view Videos
Title | Ego2Top: Matching Viewers in Egocentric and Top-view Videos |
Authors | Shervin Ardeshir, Ali Borji |
Abstract | Egocentric cameras are becoming increasingly popular and provide us with large amounts of videos, captured from the first person perspective. At the same time, surveillance cameras and drones offer an abundance of visual information, often captured from top-view. Although these two sources of information have been separately studied in the past, they have not been collectively studied and related. Having a set of egocentric cameras and a top-view camera capturing the same area, we propose a framework to identify the egocentric viewers in the top-view video. We utilize two types of features for our assignment procedure. Unary features encode what a viewer (seen from top-view or recording an egocentric video) visually experiences over time. Pairwise features encode the relationship between the visual content of a pair of viewers. Modeling each view (egocentric or top) by a graph, the assignment process is formulated as spectral graph matching. Evaluating our method over a dataset of 50 top-view and 188 egocentric videos taken in different scenarios demonstrates the efficiency of the proposed approach in assigning egocentric viewers to identities present in top-view camera. We also study the effect of different parameters such as the number of egocentric viewers and visual features. |
Tasks | Graph Matching |
Published | 2016-07-24 |
URL | http://arxiv.org/abs/1607.06986v2 |
http://arxiv.org/pdf/1607.06986v2.pdf | |
PWC | https://paperswithcode.com/paper/ego2top-matching-viewers-in-egocentric-and |
Repo | |
Framework | |
Optimal dictionary for least squares representation
Title | Optimal dictionary for least squares representation |
Authors | Mohammed Rayyan Sheriff, Debasish Chatterjee |
Abstract | Dictionaries are collections of vectors used for representations of random vectors in Euclidean spaces. Recent research on optimal dictionaries is focused on constructing dictionaries that offer sparse representations, i.e., $\ell_0$-optimal representations. Here we consider the problem of finding optimal dictionaries with which representations of samples of a random vector are optimal in an $\ell_2$-sense: optimality of representation is defined as attaining the minimal average $\ell_2$-norm of the coefficients used to represent the random vector. With the help of recent results on rank-$1$ decompositions of symmetric positive semidefinite matrices, we provide an explicit description of $\ell_2$-optimal dictionaries as well as their algorithmic constructions in polynomial time. |
Tasks | |
Published | 2016-03-07 |
URL | http://arxiv.org/abs/1603.02074v3 |
http://arxiv.org/pdf/1603.02074v3.pdf | |
PWC | https://paperswithcode.com/paper/optimal-dictionary-for-least-squares |
Repo | |
Framework | |
CDVAE: Co-embedding Deep Variational Auto Encoder for Conditional Variational Generation
Title | CDVAE: Co-embedding Deep Variational Auto Encoder for Conditional Variational Generation |
Authors | Jiajun Lu, Aditya Deshpande, David Forsyth |
Abstract | Problems such as predicting a new shading field (Y) for an image (X) are ambiguous: many very distinct solutions are good. Representing this ambiguity requires building a conditional model $P(Y \mid X)$ of the prediction, conditioned on the image. Such a model is difficult to train, because we do not usually have training data containing many different shadings for the same image. As a result, we need different training examples to share data to produce good models. This presents a danger we call “code space collapse” - the training procedure produces a model that has a very good loss score, but which represents the conditional distribution poorly. We demonstrate an improved method for building conditional models by exploiting a metric constraint on training data that prevents code space collapse. We demonstrate our model on two example tasks using real data: image saturation adjustment, image relighting. We describe quantitative metrics to evaluate ambiguous generation results. Our results quantitatively and qualitatively outperform different strong baselines. |
Tasks | |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00132v2 |
http://arxiv.org/pdf/1612.00132v2.pdf | |
PWC | https://paperswithcode.com/paper/cdvae-co-embedding-deep-variational-auto |
Repo | |
Framework | |
When the map is better than the territory
Title | When the map is better than the territory |
Authors | Erik P Hoel |
Abstract | The causal structure of any system can be analyzed at a multitude of spatial and temporal scales. It has long been thought that while higher scale (macro) descriptions of causal structure may be useful to observers, they are at best a compressed description and at worst leave out critical information. However, recent research applying information theory to causal analysis has shown that the causal structure of some systems can actually come into focus (be more informative) at a macroscale (Hoel et al. 2013). That is, a macro model of a system (a map) can be more informative than a fully detailed model of the system (the territory). This has been called causal emergence. While causal emergence may at first glance seem counterintuitive, this paper grounds the phenomenon in a classic concept from information theory: Shannon’s discovery of the channel capacity. I argue that systems have a particular causal capacity, and that different causal models of those systems take advantage of that capacity to various degrees. For some systems, only macroscale causal models use the full causal capacity. Such macroscale causal models can either be coarse-grains, or may leave variables and states out of the model (exogenous) in various ways, which can improve the model’s efficacy and its informativeness via the same mathematical principles of how error-correcting codes take advantage of an information channel’s capacity. As model choice increases, the causal capacity of a system approaches the channel capacity. Ultimately, this provides a general framework for understanding how the causal structure of some systems cannot be fully captured by even the most detailed microscopic model. |
Tasks | |
Published | 2016-12-30 |
URL | http://arxiv.org/abs/1612.09592v1 |
http://arxiv.org/pdf/1612.09592v1.pdf | |
PWC | https://paperswithcode.com/paper/when-the-map-is-better-than-the-territory |
Repo | |
Framework | |
Planogram Compliance Checking Based on Detection of Recurring Patterns
Title | Planogram Compliance Checking Based on Detection of Recurring Patterns |
Authors | Song Liu, Wanqing Li, Stephen Davis, Christian Ritz, Hongda Tian |
Abstract | In this paper, a novel method for automatic planogram compliance checking in retail chains is proposed without requiring product template images for training. Product layout is extracted from an input image by means of unsupervised recurring pattern detection and matched via graph matching with the expected product layout specified by a planogram to measure the level of compliance. A divide and conquer strategy is employed to improve the speed. Specifically, the input image is divided into several regions based on the planogram. Recurring patterns are detected in each region respectively and then merged together to estimate the product layout. Experimental results on real data have verified the efficacy of the proposed method. Compared with a template-based method, higher accuracies are achieved by the proposed method over a wide range of products. |
Tasks | Graph Matching |
Published | 2016-02-22 |
URL | http://arxiv.org/abs/1602.06647v1 |
http://arxiv.org/pdf/1602.06647v1.pdf | |
PWC | https://paperswithcode.com/paper/planogram-compliance-checking-based-on |
Repo | |
Framework | |
Efficient Linear Programming for Dense CRFs
Title | Efficient Linear Programming for Dense CRFs |
Authors | Thalaiyasingam Ajanthan, Alban Desmaison, Rudy Bunel, Mathieu Salzmann, Philip H. S. Torr, M. Pawan Kumar |
Abstract | The fully connected conditional random field (CRF) with Gaussian pairwise potentials has proven popular and effective for multi-class semantic segmentation. While the energy of a dense CRF can be minimized accurately using a linear programming (LP) relaxation, the state-of-the-art algorithm is too slow to be useful in practice. To alleviate this deficiency, we introduce an efficient LP minimization algorithm for dense CRFs. To this end, we develop a proximal minimization framework, where the dual of each proximal problem is optimized via block coordinate descent. We show that each block of variables can be efficiently optimized. Specifically, for one block, the problem decomposes into significantly smaller subproblems, each of which is defined over a single pixel. For the other block, the problem is optimized via conditional gradient descent. This has two advantages: 1) the conditional gradient can be computed in a time linear in the number of pixels and labels; and 2) the optimal step size can be computed analytically. Our experiments on standard datasets provide compelling evidence that our approach outperforms all existing baselines including the previous LP based approach for dense CRFs. |
Tasks | Semantic Segmentation |
Published | 2016-11-29 |
URL | http://arxiv.org/abs/1611.09718v2 |
http://arxiv.org/pdf/1611.09718v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-linear-programming-for-dense-crfs |
Repo | |
Framework | |
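The abstract above notes that, for conditional gradient descent, the optimal step size can be computed analytically. The sketch below shows that idea on a generic toy problem only: minimizing a convex quadratic $\tfrac{1}{2}x^\top A x + b^\top x$ over the probability simplex. It is not the paper's dual objective or its dense CRF algorithm; the function name `frank_wolfe_simplex`, the stopping tolerance, and the test problem are all assumptions for illustration.

```python
from typing import List, Sequence


def frank_wolfe_simplex(A: Sequence[Sequence[float]],
                        b: Sequence[float],
                        x0: Sequence[float],
                        iters: int = 100,
                        tol: float = 1e-12) -> List[float]:
    """Conditional gradient (Frank-Wolfe) for 0.5*x'Ax + b'x on the simplex."""
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        # Gradient of the quadratic: A x + b.
        grad = [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
        # Linear minimization over the simplex picks the vertex with the smallest gradient entry.
        k = min(range(n), key=lambda i: grad[i])
        # Direction from the current point towards that vertex.
        d = [-xi for xi in x]
        d[k] += 1.0
        slope = sum(g * di for g, di in zip(grad, d))
        if slope >= -tol:
            break  # no descent direction left: x is (approximately) optimal
        curvature = sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))
        # Exact line search for a quadratic, clipped to the feasible step range [0, 1].
        step = 1.0 if curvature <= 0 else min(1.0, -slope / curvature)
        x = [xi + step * di for xi, di in zip(x, d)]
    return x


# Hypothetical test problem: minimize 0.5*||x||^2 over the 2-D simplex.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
solution = frank_wolfe_simplex(A, b, x0=[1.0, 0.0])
print(solution)
```

Because the objective is quadratic along any search direction, the line search has the closed form used for `step`, so no step-size tuning is required in this toy setting.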