Paper Group ANR 691
To Learn or Not to Learn Features for Deformable Registration?
Title | To Learn or Not to Learn Features for Deformable Registration? |
Authors | Aabhas Majumdar, Raghav Mehta, Jayanthi Sivaswamy |
Abstract | Feature-based registration has been popular with a variety of features ranging from voxel intensity to Self-Similarity Context (SSC). In this paper, we examine the question of how features learnt using various Deep Learning (DL) frameworks can be used for deformable registration and whether this feature learning is necessary or not. We investigate the use of features learned by different DL methods in the current state-of-the-art discrete registration framework and analyze their performance on 2 publicly available datasets. We draw insights into the type of DL framework useful for feature learning and the impact, if any, of the complexity of different DL models and brain parcellation methods on the performance of discrete registration. Our results indicate that the registration performance with DL features and SSC is comparable and stable across datasets, whereas this does not hold for low-level features. |
Tasks | |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.01057v3 |
http://arxiv.org/pdf/1709.01057v3.pdf | |
PWC | https://paperswithcode.com/paper/to-learn-or-not-to-learn-features-for |
Repo | |
Framework | |
Improving Text Normalization by Optimizing Nearest Neighbor Matching
Title | Improving Text Normalization by Optimizing Nearest Neighbor Matching |
Authors | Salman Ahmad Ansari, Usman Zafar, Asim Karim |
Abstract | Text normalization is an essential task in the processing and analysis of social media, which is dominated by informal writing. It aims to map informal words to their intended standard forms. Previously proposed text normalization approaches typically require manual selection of parameters for improved performance. In this paper, we present an automatic optimization-based nearest neighbor matching approach for text normalization. This approach is motivated by the observation that text normalization is essentially a matching problem and nearest neighbor matching with an adaptive similarity function is the most direct procedure for it. Our similarity function incorporates weighted contributions of contextual, string, and phonetic similarity, and the nearest neighbor matching involves a minimum similarity threshold. These four parameters are tuned efficiently using grid search. We evaluate the performance of our approach on two benchmark datasets. The results demonstrate that parameter tuning on small labeled datasets produces state-of-the-art text normalization performance. Thus, this approach allows practically easy construction of evolving domain-specific normalization lexicons. |
Tasks | |
Published | 2017-12-27 |
URL | http://arxiv.org/abs/1712.09518v1 |
http://arxiv.org/pdf/1712.09518v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-text-normalization-by-optimizing |
Repo | |
Framework | |
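The matching procedure described in the abstract above (a weighted similarity, a minimum threshold, and a grid search over the four parameters) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the helpers `string_sim`, `phonetic_key`, `phonetic_sim`, `normalize` and `grid_search` are hypothetical names, the phonetic key is a crude stand-in for a real phonetic encoding, and contextual similarity is assumed to arrive as a precomputed dictionary.

```python
from difflib import SequenceMatcher
from itertools import product


def string_sim(a, b):
    """String similarity in [0, 1] (assumption: difflib ratio as a stand-in)."""
    return SequenceMatcher(None, a, b).ratio()


def phonetic_key(word):
    """Crude phonetic key (assumption): drop non-initial vowels, collapse repeats."""
    out = []
    for i, ch in enumerate(word.lower()):
        if i > 0 and ch in "aeiou":
            continue
        if out and out[-1] == ch:
            continue
        out.append(ch)
    return "".join(out)


def phonetic_sim(a, b):
    return string_sim(phonetic_key(a), phonetic_key(b))


def normalize(word, lexicon, context_sim, weights, threshold):
    """Return the nearest lexicon entry, or the word itself if below threshold.

    context_sim is a hypothetical dict mapping (word, candidate) to a score.
    """
    w_ctx, w_str, w_pho = weights
    best, best_score = None, -1.0
    for cand in lexicon:
        score = (w_ctx * context_sim.get((word, cand), 0.0)
                 + w_str * string_sim(word, cand)
                 + w_pho * phonetic_sim(word, cand))
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else word


def grid_search(pairs, lexicon, context_sim,
                steps=(0.0, 0.5, 1.0), thresholds=(0.3, 0.6, 0.9)):
    """Pick the three weights and the threshold that maximize accuracy on labeled pairs."""
    best_params, best_acc = None, -1.0
    for w_ctx, w_str, w_pho, thr in product(steps, steps, steps, thresholds):
        hits = sum(
            normalize(src, lexicon, context_sim, (w_ctx, w_str, w_pho), thr) == tgt
            for src, tgt in pairs
        )
        acc = hits / len(pairs)
        if acc > best_acc:
            best_params, best_acc = ((w_ctx, w_str, w_pho), thr), acc
    return best_params, best_acc
```

A call such as `grid_search([("tmrw", "tomorrow"), ("2day", "today")], ["tomorrow", "today", "tonight"], {})` tunes the parameters on two labeled pairs; the grid values are arbitrary choices for the sketch.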
DAG-based Long Short-Term Memory for Neural Word Segmentation
Title | DAG-based Long Short-Term Memory for Neural Word Segmentation |
Authors | Xinchi Chen, Zhan Shi, Xipeng Qiu, Xuanjing Huang |
Abstract | Neural word segmentation has attracted more and more research interest for its ability to alleviate the effort of feature engineering and to utilize external resources through pre-trained character or word embeddings. In this paper, we propose a new neural model to incorporate word-level information for Chinese word segmentation. Unlike the previous word-based models, our model still adopts the framework of character-based sequence labeling, which has advantages in both effectiveness and efficiency at the inference stage. To utilize the word-level information, we also propose a new long short-term memory (LSTM) architecture over a directed acyclic graph (DAG). Experimental results demonstrate that our model leads to better performance than the baseline models. |
Tasks | Chinese Word Segmentation, Feature Engineering, Word Embeddings |
Published | 2017-07-02 |
URL | http://arxiv.org/abs/1707.00248v1 |
http://arxiv.org/pdf/1707.00248v1.pdf | |
PWC | https://paperswithcode.com/paper/dag-based-long-short-term-memory-for-neural |
Repo | |
Framework | |
Supervised Classification: Quite a Brief Overview
Title | Supervised Classification: Quite a Brief Overview |
Authors | Marco Loog |
Abstract | The original problem of supervised classification considers the task of automatically assigning objects to their respective classes on the basis of numerical measurements derived from these objects. Classifiers are the tools that implement the actual functional mapping from these measurements—also called features or inputs—to the so-called class label—or output. The fields of pattern recognition and machine learning study ways of constructing such classifiers. The main idea behind supervised methods is that of learning from examples: given a number of example input-output relations, to what extent can the general mapping be learned that takes any new and unseen feature vector to its correct class? This chapter provides a basic introduction to the underlying ideas of how to come to a supervised classification problem. In addition, it provides an overview of some specific classification techniques, delves into the issues of object representation and classifier evaluation, and (very) briefly covers some variations on the basic supervised classification task that may also be of interest to the practitioner. |
Tasks | |
Published | 2017-10-25 |
URL | http://arxiv.org/abs/1710.09230v1 |
http://arxiv.org/pdf/1710.09230v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-classification-quite-a-brief |
Repo | |
Framework | |
A Frame Tracking Model for Memory-Enhanced Dialogue Systems
Title | A Frame Tracking Model for Memory-Enhanced Dialogue Systems |
Authors | Hannes Schulz, Jeremie Zumer, Layla El Asri, Shikhar Sharma |
Abstract | Recently, resources and tasks were proposed to go beyond state tracking in dialogue systems. An example is the frame tracking task, which requires recording multiple frames, one for each user goal set during the dialogue. This allows a user, for instance, to compare items corresponding to different goals. This paper proposes a model which takes as input the list of frames created so far during the dialogue, the current user utterance as well as the dialogue acts, slot types, and slot values associated with this utterance. The model then outputs the frame being referenced by each triple of dialogue act, slot type, and slot value. We show that on the recently published Frames dataset, this model significantly outperforms a previously proposed rule-based baseline. In addition, we propose an extensive analysis of the frame tracking task by dividing it into sub-tasks and assessing their difficulty with respect to our model. |
Tasks | |
Published | 2017-06-06 |
URL | http://arxiv.org/abs/1706.01690v1 |
http://arxiv.org/pdf/1706.01690v1.pdf | |
PWC | https://paperswithcode.com/paper/a-frame-tracking-model-for-memory-enhanced |
Repo | |
Framework | |
A hierarchical loss and its problems when classifying non-hierarchically
Title | A hierarchical loss and its problems when classifying non-hierarchically |
Authors | Cinna Wu, Mark Tygert, Yann LeCun |
Abstract | Failing to distinguish between a sheepdog and a skyscraper should be worse and penalized more than failing to distinguish between a sheepdog and a poodle; after all, sheepdogs and poodles are both breeds of dogs. However, existing metrics of failure (so-called “loss” or “win”) used in textual or visual classification/recognition via neural networks seldom leverage a-priori information, such as a sheepdog being more similar to a poodle than to a skyscraper. We define a metric that, inter alia, can penalize failure to distinguish between a sheepdog and a skyscraper more than failure to distinguish between a sheepdog and a poodle. Unlike previously employed possibilities, this metric is based on an ultrametric tree associated with any given tree organization into a semantically meaningful hierarchy of a classifier’s classes. An ultrametric tree is a tree with a so-called ultrametric distance metric such that all leaves are at the same distance from the root. Unfortunately, extensive numerical experiments indicate that the standard practice of training neural networks via stochastic gradient descent with random starting points often drives down the hierarchical loss nearly as much when minimizing the standard cross-entropy loss as when trying to minimize the hierarchical loss directly. Thus, this hierarchical loss is unreliable as an objective for plain, randomly started stochastic gradient descent to minimize; the main value of the hierarchical loss may be merely as a meaningful metric of success of a classifier. |
Tasks | |
Published | 2017-09-01 |
URL | https://arxiv.org/abs/1709.01062v2 |
https://arxiv.org/pdf/1709.01062v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-loss-for-classification |
Repo | |
Framework | |
The Voynich Manuscript is Written in Natural Language: The Pahlavi Hypothesis
Title | The Voynich Manuscript is Written in Natural Language: The Pahlavi Hypothesis |
Authors | J. Michael Herrmann |
Abstract | The late medieval Voynich Manuscript (VM) has resisted decryption and was considered a meaningless hoax or an unsolvable cipher. Here, we provide evidence that the VM is written in natural language by establishing a relation of the Voynich alphabet and the Iranian Pahlavi script. Many of the Voynich characters are upside-down versions of their Pahlavi counterparts, which may be an effect of different writing directions. Other Voynich letters can be explained as ligatures or departures from Pahlavi with the intent to cope with known problems due to the stupendous ambiguity of Pahlavi text. While a translation of the VM text is not attempted here, we can confirm the Voynich-Pahlavi relation at the character level by the transcription of many words from the VM illustrations and from parts of the main text. Many of the transcribed words can be identified as terms from Zoroastrian cosmology which is in line with the use of Pahlavi script in Zoroastrian communities from medieval times. |
Tasks | |
Published | 2017-09-06 |
URL | http://arxiv.org/abs/1709.01634v2 |
http://arxiv.org/pdf/1709.01634v2.pdf | |
PWC | https://paperswithcode.com/paper/the-voynich-manuscript-is-written-in-natural |
Repo | |
Framework | |
Adversarial Generation of Natural Language
Title | Adversarial Generation of Natural Language |
Authors | Sai Rajeswar, Sandeep Subramanian, Francis Dutil, Christopher Pal, Aaron Courville |
Abstract | Generative Adversarial Networks (GANs) have gathered a lot of attention from the computer vision community, yielding impressive results for image generation. Advances in the adversarial generation of natural language from noise however are not commensurate with the progress made in generating images, and still lag far behind likelihood based methods. In this paper, we take a step towards generating natural language with a GAN objective alone. We introduce a simple baseline that addresses the discrete output space problem without relying on gradient estimators and show that it is able to achieve state-of-the-art results on a Chinese poem generation dataset. We present quantitative results on generating sentences from context-free and probabilistic context-free grammars, and qualitative language modeling results. A conditional version is also described that can generate sequences conditioned on sentence characteristics. |
Tasks | Image Generation, Language Modelling |
Published | 2017-05-31 |
URL | http://arxiv.org/abs/1705.10929v1 |
http://arxiv.org/pdf/1705.10929v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-generation-of-natural-language |
Repo | |
Framework | |
Accelerating Permutation Testing in Voxel-wise Analysis through Subspace Tracking: A new plugin for SnPM
Title | Accelerating Permutation Testing in Voxel-wise Analysis through Subspace Tracking: A new plugin for SnPM |
Authors | Felipe Gutierrez-Barragan, Vamsi K. Ithapu, Chris Hinrichs, Camille Maumet, Sterling C. Johnson, Thomas E. Nichols, Vikas Singh, the ADNI |
Abstract | Permutation testing is a non-parametric method for obtaining the max null distribution used to compute corrected $p$-values that provide strong control of false positives. In neuroimaging, however, the computational burden of running such an algorithm can be significant. We find that by viewing the permutation testing procedure as the construction of a very large permutation testing matrix, $T$, one can exploit structural properties derived from the data and the test statistics to reduce the runtime under certain conditions. In particular, we see that $T$ is low-rank plus a low-variance residual. This makes $T$ a good candidate for low-rank matrix completion, where only a very small number of entries of $T$ ($\sim 0.35\%$ of all entries in our experiments) have to be computed to obtain a good estimate. Based on this observation, we present RapidPT, an algorithm that efficiently recovers the max null distribution commonly obtained through regular permutation testing in voxel-wise analysis. We present an extensive validation on a synthetic dataset and four datasets of varying size against two baselines: Statistical NonParametric Mapping (SnPM13) and a standard permutation testing implementation (referred to as NaivePT). We find that RapidPT achieves its best runtime performance on medium-sized datasets ($50 \leq n \leq 200$), with speedups of 1.5x - 38x (vs. SnPM13) and 20x - 1000x (vs. NaivePT). For larger datasets ($n \geq 200$) RapidPT outperforms NaivePT (6x - 200x) on all datasets, and provides large speedups over SnPM13 (2x - 15x) when more than 10000 permutations are needed. The implementation is a standalone toolbox and is also integrated within SnPM13, able to leverage multi-core architectures when available. |
Tasks | Low-Rank Matrix Completion, Matrix Completion |
Published | 2017-03-04 |
URL | http://arxiv.org/abs/1703.01506v2 |
http://arxiv.org/pdf/1703.01506v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-permutation-testing-in-voxel |
Repo | |
Framework | |
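For readers unfamiliar with the max null distribution mentioned in the abstract above, the following is a minimal sketch of the plain (naive) permutation procedure that RapidPT is designed to speed up. It is not RapidPT and not the SnPM implementation: the function names, the absolute mean-difference statistic, and the two-group design are assumptions chosen to keep the example short.

```python
import random


def mean_diff(data, labels):
    """Per-voxel absolute difference of group means (assumed test statistic).

    data: list of subjects, each a list of voxel values; labels: 0/1 per subject.
    """
    g1 = [row for row, lab in zip(data, labels) if lab == 1]
    g0 = [row for row, lab in zip(data, labels) if lab == 0]
    n_vox = len(data[0])
    return [
        abs(sum(r[v] for r in g1) / len(g1) - sum(r[v] for r in g0) / len(g0))
        for v in range(n_vox)
    ]


def max_null_distribution(data, labels, n_perm, seed=0):
    """For each random relabeling, keep the maximum statistic over all voxels."""
    rng = random.Random(seed)
    perm = list(labels)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(perm)  # shuffling preserves the group sizes
        maxima.append(max(mean_diff(data, perm)))
    return maxima


def corrected_p_values(data, labels, n_perm=1000, seed=0):
    """Corrected p-value per voxel: share of permutation maxima >= observed value."""
    observed = mean_diff(data, labels)
    maxima = max_null_distribution(data, labels, n_perm, seed)
    return [(1 + sum(m >= t for m in maxima)) / (1 + n_perm) for t in observed]
```

Each permutation here recomputes every voxel statistic, which corresponds to filling in one full column of the matrix $T$; this cost is what the low-rank completion idea in the abstract aims to avoid.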
Cost-sensitive detection with variational autoencoders for environmental acoustic sensing
Title | Cost-sensitive detection with variational autoencoders for environmental acoustic sensing |
Authors | Yunpeng Li, Ivan Kiskin, Davide Zilli, Marianne Sinka, Henry Chan, Kathy Willis, Stephen Roberts |
Abstract | Environmental acoustic sensing involves the retrieval and processing of audio signals to better understand our surroundings. While large-scale acoustic data make manual analysis infeasible, they provide a suitable playground for machine learning approaches. Most existing machine learning techniques developed for environmental acoustic sensing do not provide flexible control of the trade-off between the false positive rate and the false negative rate. This paper presents a cost-sensitive classification paradigm, in which the hyper-parameters of classifiers and the structure of variational autoencoders are selected in a principled Neyman-Pearson framework. We examine the performance of the proposed approach using a dataset from the HumBug project which aims to detect the presence of mosquitoes using sound collected by simple embedded devices. |
Tasks | |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02488v1 |
http://arxiv.org/pdf/1712.02488v1.pdf | |
PWC | https://paperswithcode.com/paper/cost-sensitive-detection-with-variational |
Repo | |
Framework | |
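The Neyman-Pearson idea referred to in the abstract above, keeping the false positive rate under a budget while minimizing the false negative rate, can be illustrated on a single decision threshold. This is a simplified sketch under assumed inputs (validation scores for negative and positive clips), not the paper's procedure for selecting classifier hyper-parameters or autoencoder structure; `np_threshold` is a hypothetical helper.

```python
def np_threshold(neg_scores, pos_scores, alpha):
    """Choose a score threshold with empirical FPR <= alpha and minimal FNR.

    A sample is predicted positive when its score is >= the threshold.
    Returns a tuple (threshold, fpr, fnr).
    """
    candidates = sorted(set(neg_scores) | set(pos_scores))
    candidates.append(float("inf"))  # fallback: never predict positive
    best = None
    for thr in candidates:
        fpr = sum(s >= thr for s in neg_scores) / len(neg_scores)
        fnr = sum(s < thr for s in pos_scores) / len(pos_scores)
        # Keep the feasible threshold with the smallest false negative rate.
        if fpr <= alpha and (best is None or fnr < best[2]):
            best = (thr, fpr, fnr)
    return best
```

Tightening `alpha` moves the chosen threshold upward and trades missed detections for fewer false alarms, which is the kind of explicit control over the trade-off that the abstract says most existing techniques lack.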
Maintaining Ad-Hoc Communication Network in Area Protection Scenarios with Adversarial Agents
Title | Maintaining Ad-Hoc Communication Network in Area Protection Scenarios with Adversarial Agents |
Authors | Marika Ivanová, Pavel Surynek, Diep Thi Ngoc Nguyen |
Abstract | We address a problem of area protection in graph-based scenarios with multiple mobile agents where connectivity is maintained among agents to ensure they can communicate. The problem consists of two adversarial teams of agents that move in an undirected graph shared by both teams. Agents are placed in vertices of the graph; at most one agent can occupy a vertex; and they can move into adjacent vertices in a conflict-free way. Teams have asymmetric goals: the aim of one team (attackers) is to invade a given area, while the aim of the opposing team (defenders) is to protect the area from being entered by attackers by occupying selected vertices. The team of defenders needs to maintain connectivity of the vertices occupied by its own agents in a visibility graph. The visibility graph models the possibility of communication between pairs of vertices. We study strategies for allocating vertices to be occupied by the team of defenders to block attacking agents while connectivity is maintained at the same time. To do this we reserve a subset of defending agents that do not try to block the attackers but instead are placed to support connectivity of the team. The performance of strategies is tested in multiple benchmarks. The success of a strategy is heavily dependent on the type of the instance, and so one of the contributions of this work is that we identify suitable strategies for diverse instance types. |
Tasks | |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.01070v1 |
http://arxiv.org/pdf/1709.01070v1.pdf | |
PWC | https://paperswithcode.com/paper/maintaining-ad-hoc-communication-network-in |
Repo | |
Framework | |
SAR: Semantic Analysis for Recommendation
Title | SAR: Semantic Analysis for Recommendation |
Authors | Han Xiao, Lian Meng |
Abstract | Recommendation systems are in common demand in daily life, and matrix completion is a widely adopted technique for this task. However, most matrix completion methods lack semantic interpretation and usually result in weak-semantic recommendations. To this end, this paper proposes a $S$emantic $A$nalysis approach for $R$ecommendation systems $(SAR)$, which applies a two-level hierarchical generative process that assigns semantic properties and categories to users and items. $SAR$ learns semantic representations of users/items merely from user ratings on items, which offers a new path to recommendation by semantic matching with the learned representations. Extensive experiments demonstrate that $SAR$ outperforms other state-of-the-art baselines substantially. |
Tasks | Matrix Completion |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06247v4 |
http://arxiv.org/pdf/1702.06247v4.pdf | |
PWC | https://paperswithcode.com/paper/sar-semantic-analysis-for-recommendation |
Repo | |
Framework | |
Scalable Multi-Class Gaussian Process Classification using Expectation Propagation
Title | Scalable Multi-Class Gaussian Process Classification using Expectation Propagation |
Authors | Carlos Villacampa-Calvo, Daniel Hernández-Lobato |
Abstract | This paper describes an expectation propagation (EP) method for multi-class classification with Gaussian processes that scales well to very large datasets. In such a method the estimate of the log-marginal-likelihood involves a sum across the data instances. This enables efficient training using stochastic gradients and mini-batches. When this type of training is used, the computational cost does not depend on the number of data instances $N$. Furthermore, extra assumptions in the approximate inference process make the memory cost independent of $N$. The consequence is that the proposed EP method can be used on datasets with millions of instances. We empirically compare this method with alternative approaches that approximate the required computations using variational inference. The results show that it performs similarly to, or even better than, these techniques, which sometimes give significantly worse predictive distributions in terms of the test log-likelihood. Besides this, the training process of the proposed approach also seems to converge in a smaller number of iterations. |
Tasks | Gaussian Processes |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07258v1 |
http://arxiv.org/pdf/1706.07258v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-multi-class-gaussian-process |
Repo | |
Framework | |
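The mini-batch argument in the abstract above can be written out as follows. This is the generic unbiased subsampling estimator, given here only as an illustration; the symbols $f_i$, $\theta$ and $\mathcal{B}$ are notation assumed for this note, and the paper's actual EP objective may contain further terms that do not decompose this way.

```latex
F(\theta) = \sum_{i=1}^{N} f_i(\theta),
\qquad
\hat{F}(\theta) = \frac{N}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} f_i(\theta),
\qquad
\mathbb{E}_{\mathcal{B}}\big[\hat{F}(\theta)\big] = F(\theta)
```

Here $\mathcal{B}$ is a mini-batch drawn uniformly at random from the $N$ instances. Evaluating $\hat{F}$ or its gradient touches only $|\mathcal{B}|$ terms, which is why the per-step cost can be independent of $N$.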
Rational coordination with no communication or conventions
Title | Rational coordination with no communication or conventions |
Authors | Valentin Goranko, Antti Kuusisto, Raine Rönnholm |
Abstract | We study pure coordination games where in every outcome, all players have identical payoffs, ‘win’ or ‘lose’. We identify and discuss a range of ‘purely rational principles’ guiding the reasoning of rational players in such games and analyze which classes of coordination games can be solved by such players with no preplay communication or conventions. We observe that it is highly nontrivial to delineate a boundary between purely rational principles and other decision methods, such as conventions, for solving such coordination games. |
Tasks | |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07412v2 |
http://arxiv.org/pdf/1706.07412v2.pdf | |
PWC | https://paperswithcode.com/paper/rational-coordination-with-no-communication |
Repo | |
Framework | |
Matrix Completion from $O(n)$ Samples in Linear Time
Title | Matrix Completion from $O(n)$ Samples in Linear Time |
Authors | David Gamarnik, Quan Li, Hongyi Zhang |
Abstract | We consider the problem of reconstructing a rank-$k$ $n \times n$ matrix $M$ from a sampling of its entries. Under a certain incoherence assumption on $M$ and for the case when both the rank and the condition number of $M$ are bounded, it was shown in \cite{CandesRecht2009, CandesTao2010, keshavan2010, Recht2011, Jain2012, Hardt2014} that $M$ can be recovered exactly or approximately (depending on some trade-off between accuracy and computational complexity) using $O(n \, \text{poly}(\log n))$ samples in super-linear time $O(n^{a} \, \text{poly}(\log n))$ for some constant $a \geq 1$. In this paper, we propose a new matrix completion algorithm using a novel sampling scheme based on a union of independent sparse random regular bipartite graphs. We show that under the same conditions w.h.p. our algorithm recovers an $\epsilon$-approximation of $M$ in terms of the Frobenius norm using $O(n \log^2(1/\epsilon))$ samples and in linear time $O(n \log^2(1/\epsilon))$. This provides the best known bounds both on the sample complexity and computational complexity for reconstructing (approximately) an unknown low-rank matrix. The novelty of our algorithm is two new steps of thresholding singular values and rescaling singular vectors in the application of the “vanilla” alternating minimization algorithm. The structure of sparse random regular graphs is used heavily for controlling the impact of these regularization steps. |
Tasks | Matrix Completion |
Published | 2017-02-08 |
URL | http://arxiv.org/abs/1702.02267v4 |
http://arxiv.org/pdf/1702.02267v4.pdf | |
PWC | https://paperswithcode.com/paper/matrix-completion-from-on-samples-in-linear |
Repo | |
Framework | |
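For orientation, the "vanilla" alternating minimization that the abstract above builds on is usually stated as below. This is the standard textbook form, not the paper's algorithm: the symbols $\Omega$ (the set of sampled entries), $U$, $V$ and the iteration index $t$ are notation assumed here, and the paper's additional thresholding and rescaling steps are not shown.

```latex
\min_{U, V \in \mathbb{R}^{n \times k}}
\sum_{(i,j) \in \Omega} \big( M_{ij} - (U V^{\top})_{ij} \big)^2

U^{(t+1)} = \arg\min_{U} \sum_{(i,j) \in \Omega}
\big( M_{ij} - (U V^{(t)\top})_{ij} \big)^2,
\qquad
V^{(t+1)} = \arg\min_{V} \sum_{(i,j) \in \Omega}
\big( M_{ij} - (U^{(t+1)} V^{\top})_{ij} \big)^2
```

Each half-step is an ordinary least-squares problem that decouples across the rows of $U$ (or of $V$), so its cost scales with the number of sampled entries $|\Omega|$ rather than with $n^2$.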