May 5, 2019

Paper Group ANR 443

SS4MCT: A Statistical Stemmer for Morphologically Complex Texts. Lifted Rule Injection for Relation Embeddings. Salient Region Detection with Convex Hull Overlap. Extraction of Skin Lesions from Non-Dermoscopic Images Using Deep Learning. Stochastic Multidimensional Scaling. Generalized RBF kernel for incomplete data. A Framework for Parallel and D …

SS4MCT: A Statistical Stemmer for Morphologically Complex Texts


Title	SS4MCT: A Statistical Stemmer for Morphologically Complex Texts
Authors	Javid Dadashkarimi, Hossein Nasr Esfahani, Heshaam Faili, Azadeh Shakery
Abstract	There have been multiple attempts to resolve various inflection matching problems in information retrieval. Stemming is a common approach to this end. Among many techniques for stemming, statistical stemming has been shown to be effective in a number of languages, particularly highly inflected languages. In this paper we propose a method for finding affixes in different positions of a word. Common statistical techniques heavily rely on string similarity in terms of prefix and suffix matching. Since infixes are common in irregular/informal inflections in morphologically complex texts, it is required to find infixes for stemming. In this paper we propose a method whose aim is to find statistical inflectional rules based on minimum edit distance table of word pairs and the likelihoods of the rules in a language. These rules are used to statistically stem words and can be used in different text mining tasks. Experimental results on CLEF 2008 and CLEF 2009 English-Persian CLIR tasks indicate that the proposed method significantly outperforms all the baselines in terms of MAP.
Tasks	Information Retrieval
Published	2016-05-25
URL	http://arxiv.org/abs/1605.07852v2
PDF	http://arxiv.org/pdf/1605.07852v2.pdf
PWC	https://paperswithcode.com/paper/ss4mct-a-statistical-stemmer-for
Repo
Framework
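
The abstract derives its inflectional rules from the minimum edit distance table of word pairs. As a rough illustration of that table only, the sketch below builds the standard Levenshtein dynamic-programming table for one pair. The function name `edit_distance_table` is hypothetical, and the rule extraction and likelihood estimation described in the abstract are not reproduced here.

```python
def edit_distance_table(source: str, target: str) -> list[list[int]]:
    """Return the full Levenshtein table; table[i][j] is the cost of
    turning source[:i] into target[:j] with unit-cost edits."""
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i  # delete the first i characters of source
    for j in range(cols):
        table[0][j] = j  # insert the first j characters of target
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,                 # deletion
                table[i][j - 1] + 1,                 # insertion
                table[i - 1][j - 1] + substitution,  # match or substitution
            )
    return table


table = edit_distance_table("walking", "walked")
distance = table[-1][-1]  # bottom-right cell holds the overall distance
```

Backtracking through such a table shows where the two words differ (prefix, infix, or suffix), which is the kind of positional information a rule-based stemmer could use.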

Lifted Rule Injection for Relation Embeddings


Title	Lifted Rule Injection for Relation Embeddings
Authors	Thomas Demeester, Tim Rocktäschel, Sebastian Riedel
Abstract	Methods based on representation learning currently hold the state-of-the-art in many natural language processing and knowledge base inference tasks. Yet, a major challenge is how to efficiently incorporate commonsense knowledge into such models. A recent approach regularizes relation and entity representations by propositionalization of first-order logic rules. However, propositionalization does not scale beyond domains with only few entities and rules. In this paper we present a highly efficient method for incorporating implication rules into distributed representations for automated knowledge base construction. We map entity-tuple embeddings into an approximately Boolean space and encourage a partial ordering over relation embeddings based on implication rules mined from WordNet. Surprisingly, we find that the strong restriction of the entity-tuple embedding space does not hurt the expressiveness of the model and even acts as a regularizer that improves generalization. By incorporating few commonsense rules, we achieve an increase of 2 percentage points mean average precision over a matrix factorization baseline, while observing a negligible increase in runtime.
Tasks	Representation Learning
Published	2016-06-27
URL	http://arxiv.org/abs/1606.08359v2
PDF	http://arxiv.org/pdf/1606.08359v2.pdf
PWC	https://paperswithcode.com/paper/lifted-rule-injection-for-relation-embeddings
Repo
Framework

Salient Region Detection with Convex Hull Overlap


Title	Salient Region Detection with Convex Hull Overlap
Authors	Yongqing Liang, Cheng Jin, Yuejie Zhang
Abstract	In this paper, we establish a novel bottom-up cue named Convex Hull Overlap (CHO), and then propose an effective approach to detect salient regions using the combination of the CHO cue and a global contrast cue. Our scheme differs significantly from earlier work in two respects: 1) The hierarchical segmentation model based on Normalized Graph-Cut fits the splitting and merging processes in human visual perception; 2) Previous work only focuses on color and texture cues, while our CHO cue bridges the obvious gap between the spatial region covering and the region saliency. CHO is a kind of improved and enhanced Gestalt cue, while other popular figure-ground cues such as convexity and surroundedness can be regarded as special cases of CHO. Our experiments on a large amount of public data have obtained very positive results.
Tasks
Published	2016-12-10
URL	http://arxiv.org/abs/1612.03284v1
PDF	http://arxiv.org/pdf/1612.03284v1.pdf
PWC	https://paperswithcode.com/paper/salient-region-detection-with-convex-hull
Repo
Framework

Extraction of Skin Lesions from Non-Dermoscopic Images Using Deep Learning


Title	Extraction of Skin Lesions from Non-Dermoscopic Images Using Deep Learning
Authors	Mohammad H. Jafari, Ebrahim Nasr-Esfahani, Nader Karimi, S. M. Reza Soroushmehr, Shadrokh Samavi, Kayvan Najarian
Abstract	Melanoma is among the most aggressive types of cancer. However, it is highly curable if detected in its early stages. Prescreening of suspicious moles and lesions for malignancy is of great importance. Detection can be done using images captured by standard cameras, which are preferable due to their low cost and availability. One important step in computerized evaluation of skin lesions is accurate detection of the lesion region, i.e. segmentation of an image into two regions: lesion and normal skin. Accurate segmentation can be challenging due to factors such as illumination variation and low contrast between lesion and healthy skin. In this paper, a method based on deep neural networks is proposed for accurate extraction of a lesion region. The input image is preprocessed and then its patches are fed to a convolutional neural network (CNN). Local texture and global structure of the patches are processed in order to assign pixels to lesion or normal classes. A method for effective selection of training patches is used for more accurate detection of a lesion border. The output segmentation mask is refined by post-processing operations. The experimental results of qualitative and quantitative evaluations demonstrate that our method can outperform other state-of-the-art algorithms in the literature.
Tasks
Published	2016-09-08
URL	http://arxiv.org/abs/1609.02374v1
PDF	http://arxiv.org/pdf/1609.02374v1.pdf
PWC	https://paperswithcode.com/paper/extraction-of-skin-lesions-from-non
Repo
Framework

Stochastic Multidimensional Scaling


Title	Stochastic Multidimensional Scaling
Authors	Ketan Rajawat, Sandeep Kumar
Abstract	Multidimensional scaling (MDS) is a popular dimensionality reduction technique that has been widely used for network visualization and cooperative localization. However, the traditional stress minimization formulation of MDS necessitates the use of batch optimization algorithms that are not scalable to large-sized problems. This paper considers an alternative stochastic stress minimization framework that is amenable to incremental and distributed solutions. A novel linear-complexity stochastic optimization algorithm is proposed that is provably convergent and simple to implement. The applicability of the proposed algorithm to localization and visualization tasks is also expounded. Extensive tests on synthetic and real datasets demonstrate the efficacy of the proposed algorithm.
Tasks	Dimensionality Reduction, Stochastic Optimization
Published	2016-12-21
URL	http://arxiv.org/abs/1612.07089v1
PDF	http://arxiv.org/pdf/1612.07089v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-multidimensional-scaling
Repo
Framework
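
To make the idea of stochastic stress minimization concrete, the sketch below runs plain stochastic gradient descent on the raw stress, updating one randomly chosen pair of points per step. This is a generic illustration, not the algorithm proposed in the paper: the function names, the step size `lr`, the step count, and the unit-square test data are all assumptions chosen for the example.

```python
import math
import random


def stress(points, dist):
    """Raw stress: sum over pairs of (embedded distance - target distance)^2."""
    total = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(points[i], points[j]) - dist[i][j]) ** 2
    return total


def stochastic_mds(dist, dim=2, steps=5000, lr=0.05, seed=0):
    """Embed n items in `dim` dimensions, sampling one pair per step."""
    rng = random.Random(seed)
    n = len(dist)
    points = [[rng.random() for _ in range(dim)] for _ in range(n)]
    initial = [p[:] for p in points]  # copy kept only for comparison
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        current = max(math.dist(points[i], points[j]), 1e-12)  # avoid division by zero
        # Gradient of (current - target)^2 w.r.t. x_i is 2 * (current - target) * (x_i - x_j) / current.
        scale = 2.0 * lr * (current - dist[i][j]) / current
        for k in range(dim):
            delta = scale * (points[i][k] - points[j][k])
            points[i][k] -= delta
            points[j][k] += delta
    return initial, points


# Target distances taken from the four corners of a unit square (illustrative data).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
target = [[math.dist(a, b) for b in square] for a in square]
initial, embedded = stochastic_mds(target)
```

Each update touches only two points, so the cost per step does not depend on the number of items, which is what makes pairwise sampling attractive for incremental or distributed settings.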

Generalized RBF kernel for incomplete data


Title	Generalized RBF kernel for incomplete data
Authors	Łukasz Struski, Marek Śmieja, Jacek Tabor
Abstract	We construct the $\bf genRBF$ kernel, which generalizes the classical Gaussian RBF kernel to the case of incomplete data. We model the uncertainty contained in missing attributes making use of data distribution and associate every point with a conditional probability density function. This allows us to embed incomplete data into the function space and to define a kernel between two missing data points based on the scalar product in $L_2$. Experiments show that the introduced kernel applied to an SVM classifier gives better results than other state-of-the-art methods, especially when a large number of features is missing. Moreover, it is easy to implement and can be used together with any kernel approaches with no additional modifications.
Tasks
Published	2016-12-05
URL	http://arxiv.org/abs/1612.01480v2
PDF	http://arxiv.org/pdf/1612.01480v2.pdf
PWC	https://paperswithcode.com/paper/generalized-rbf-kernel-for-incomplete-data
Repo
Framework
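
For reference, the classical Gaussian RBF kernel that the abstract generalizes can be written in a few lines. The sketch below covers only the complete-data case; the function name and the `gamma` parameterization are assumptions for illustration, and the genRBF construction for missing attributes is not reproduced here.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Classical Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points give the maximum value
far = rbf_kernel([0.0, 0.0], [3.0, 4.0])    # similarity decays with squared distance
```

This form needs every coordinate of both points, which is why a point with missing attributes cannot be passed to it directly.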

A Framework for Parallel and Distributed Training of Neural Networks


Title	A Framework for Parallel and Distributed Training of Neural Networks
Authors	Simone Scardapane, Paolo Di Lorenzo
Abstract	The aim of this paper is to develop a general framework for training neural networks (NNs) in a distributed environment, where training data is partitioned over a set of agents that communicate with each other through a sparse, possibly time-varying, connectivity pattern. In such a distributed scenario, the training problem can be formulated as the (regularized) optimization of a non-convex social cost function, given by the sum of local (non-convex) costs, where each agent contributes with a single error term defined with respect to its local dataset. To devise a flexible and efficient solution, we customize a recently proposed framework for non-convex optimization over networks, which hinges on a (primal) convexification-decomposition technique to handle non-convexity, and a dynamic consensus procedure to diffuse information among the agents. Several typical choices for the training criterion (e.g., squared loss, cross entropy, etc.) and regularization (e.g., $\ell_2$ norm, sparsity inducing penalties, etc.) are included in the framework and explored throughout the paper. Convergence to a stationary solution of the social non-convex problem is guaranteed under mild assumptions. Additionally, we show a principled way allowing each agent to exploit a possible multi-core architecture (e.g., a local cloud) in order to parallelize its local optimization step, resulting in strategies that are both distributed (across the agents) and parallel (inside each agent) in nature. A comprehensive set of experimental results validates the proposed approach.
Tasks
Published	2016-10-24
URL	http://arxiv.org/abs/1610.07448v3
PDF	http://arxiv.org/pdf/1610.07448v3.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-parallel-and-distributed
Repo
Framework

A Simple, Fast Diverse Decoding Algorithm for Neural Generation


Title	A Simple, Fast Diverse Decoding Algorithm for Neural Generation
Authors	Jiwei Li, Will Monroe, Dan Jurafsky
Abstract	In this paper, we propose a simple, fast decoding algorithm that fosters diversity in neural generation. The algorithm modifies the standard beam search algorithm by adding an inter-sibling ranking penalty, favoring choosing hypotheses from diverse parents. We evaluate the proposed model on the tasks of dialogue response generation, abstractive summarization and machine translation. We find that diverse decoding helps across all tasks, especially those for which reranking is needed. We further propose a variation that is capable of automatically adjusting its diversity decoding rates for different inputs using reinforcement learning (RL). We observe a further performance boost from this RL technique. This paper includes material from the unpublished script “Mutual Information and Diverse Decoding Improve Neural Machine Translation” (Li and Jurafsky, 2016).
Tasks	Abstractive Text Summarization, Machine Translation
Published	2016-11-25
URL	http://arxiv.org/abs/1611.08562v2
PDF	http://arxiv.org/pdf/1611.08562v2.pdf
PWC	https://paperswithcode.com/paper/a-simple-fast-diverse-decoding-algorithm-for
Repo
Framework
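
The inter-sibling ranking penalty can be illustrated with a single beam-search step. In the sketch below, the children of each parent hypothesis are ranked among themselves, and a child of rank k has `gamma * k` subtracted from its score before the global top-K selection. The `expand` callback, the toy log-probabilities, and the choice to carry the unpenalized score forward are assumptions of this example, not details taken from the paper.

```python
import math


def diverse_beam_step(beams, expand, beam_size, gamma):
    """One decoding step with an inter-sibling ranking penalty.

    beams:  list of (tokens, log_prob) hypotheses.
    expand: hypothetical callback returning [(token, log_prob), ...] for a prefix.
    """
    candidates = []
    for tokens, score in beams:
        # Rank the children of this parent from best to worst.
        siblings = sorted(expand(tokens), key=lambda item: item[1], reverse=True)
        for rank, (token, log_prob) in enumerate(siblings, start=1):
            new_score = score + log_prob
            # Select on the penalized score, but keep the model score for the next step.
            candidates.append((new_score - gamma * rank, tokens + [token], new_score))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [(tokens, score) for _, tokens, score in candidates[:beam_size]]


def toy_expand(tokens):
    # Toy distribution: the same two continuations for every prefix.
    return [("x", math.log(0.6)), ("y", math.log(0.4))]


beams = [(["a"], -0.1), (["b"], -1.0)]
plain = diverse_beam_step(beams, toy_expand, beam_size=2, gamma=0.0)
diverse = diverse_beam_step(beams, toy_expand, beam_size=2, gamma=1.0)
```

With `gamma=0.0` both surviving hypotheses descend from the stronger parent `a`; with `gamma=1.0` the second-ranked child of `a` is penalized enough that the best child of `b` takes its place.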

A scalable convolutional neural network for task-specified scenarios via knowledge distillation


Title	A scalable convolutional neural network for task-specified scenarios via knowledge distillation
Authors	Mengnan Shi, Fei Qin, Qixiang Ye, Zhenjun Han, Jianbin Jiao
Abstract	In this paper, we explore the redundancy in convolutional neural networks, which scales with the complexity of vision tasks. Considering that many front-end visual systems are interested in only a limited range of visual targets, removing task-specified network redundancy can promote a wide range of potential applications. We propose a task-specified knowledge distillation algorithm to derive a simplified model with pre-set computation cost and minimized accuracy loss, which suits resource-constrained front-end systems well. Experiments on the MNIST and CIFAR10 datasets demonstrate the feasibility of the proposed approach as well as the existence of task-specified redundancy.
Tasks
Published	2016-09-19
URL	http://arxiv.org/abs/1609.05695v2
PDF	http://arxiv.org/pdf/1609.05695v2.pdf
PWC	https://paperswithcode.com/paper/a-scalable-convolutional-neural-network-for
Repo
Framework

Online and Distributed learning of Gaussian mixture models by Bayesian Moment Matching


Title	Online and Distributed learning of Gaussian mixture models by Bayesian Moment Matching
Authors	Priyank Jaini, Pascal Poupart
Abstract	The Gaussian mixture model is a classic technique for clustering and data modeling that is used in numerous applications. With the rise of big data, there is a need for parameter estimation techniques that can handle streaming data and distribute the computation over several processors. While online variants of the Expectation Maximization (EM) algorithm exist, their data efficiency is reduced by a stochastic approximation of the E-step and it is not clear how to distribute the computation over multiple processors. We propose a Bayesian learning technique that lends itself naturally to online and distributed computation. Since the Bayesian posterior is not tractable, we project it onto a family of tractable distributions after each observation by matching a set of sufficient moments. This Bayesian moment matching technique compares favorably to online EM in terms of time and accuracy on a set of data modeling benchmarks.
Tasks
Published	2016-09-19
URL	http://arxiv.org/abs/1609.05881v1
PDF	http://arxiv.org/pdf/1609.05881v1.pdf
PWC	https://paperswithcode.com/paper/online-and-distributed-learning-of-gaussian
Repo
Framework

ABA+: Assumption-Based Argumentation with Preferences


Title	ABA+: Assumption-Based Argumentation with Preferences
Authors	Kristijonas Čyras, Francesca Toni
Abstract	We present ABA+, a new approach to handling preferences in a well known structured argumentation formalism, Assumption-Based Argumentation (ABA). In ABA+, preference information given over assumptions is incorporated directly into the attack relation, thus resulting in attack reversal. ABA+ conservatively extends ABA and exhibits various desirable features regarding relationship among argumentation semantics as well as preference handling. We also introduce Weak Contraposition, a principle concerning reasoning with rules and preferences that relaxes the standard principle of contraposition, while guaranteeing additional desirable features for ABA+.
Tasks
Published	2016-10-10
URL	http://arxiv.org/abs/1610.03024v2
PDF	http://arxiv.org/pdf/1610.03024v2.pdf
PWC	https://paperswithcode.com/paper/aba-assumption-based-argumentation-with
Repo
Framework

Feature ranking for multi-label classification using Markov Networks


Title	Feature ranking for multi-label classification using Markov Networks
Authors	Paweł Teisseyre
Abstract	We propose a simple and efficient method for ranking features in multi-label classification. The method produces a ranking of features showing their relevance in predicting labels, which in turn allows choosing a final subset of features. The procedure is based on Markov Networks and allows modeling the dependencies between labels and features in a direct way. In the first step we build a simple network using only labels and then we test how much adding a single feature affects the initial network. More specifically, in the first step we use the Ising model whereas the second step is based on the score statistic, which allows testing the significance of added features very quickly. The proposed approach does not require transformation of label space, gives interpretable results and allows for attractive visualization of dependency structure. We give a theoretical justification of the procedure by discussing some theoretical properties of the Ising model and the score statistic. We also discuss a feature ranking procedure based on fitting the Ising model using $l_1$ regularized logistic regressions. Numerical experiments show that the proposed methods outperform the conventional approaches on the considered artificial and real datasets.
Tasks	Multi-Label Classification
Published	2016-02-24
URL	http://arxiv.org/abs/1602.07464v1
PDF	http://arxiv.org/pdf/1602.07464v1.pdf
PWC	https://paperswithcode.com/paper/feature-ranking-for-multi-label
Repo
Framework

Learning Temporal Dependence from Time-Series Data with Latent Variables


Title	Learning Temporal Dependence from Time-Series Data with Latent Variables
Authors	Hossein Hosseini, Sreeram Kannan, Baosen Zhang, Radha Poovendran
Abstract	We consider the setting where a collection of time series, modeled as random processes, evolve in a causal manner, and one is interested in learning the graph governing the relationships of these processes. A special case of wide interest and applicability is the setting where the noise is Gaussian and relationships are Markov and linear. We study this setting with two additional features: firstly, each random process has a hidden (latent) state, which we use to model the internal memory possessed by the variables (similar to hidden Markov models). Secondly, each variable can depend on its latent memory state through a random lag (rather than a fixed lag), thus modeling memory recall with differing lags at distinct times. Under this setting, we develop an estimator and prove that under a genericity assumption, the parameters of the model can be learned consistently. We also propose a practical adaptation of this estimator, which demonstrates significant performance gains in both synthetic and real-world datasets.
Tasks	Time Series
Published	2016-08-27
URL	http://arxiv.org/abs/1608.07636v1
PDF	http://arxiv.org/pdf/1608.07636v1.pdf
PWC	https://paperswithcode.com/paper/learning-temporal-dependence-from-time-series
Repo
Framework

Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies


Title	Square Root Graphical Models: Multivariate Generalizations of Univariate Exponential Families that Permit Positive Dependencies
Authors	David I. Inouye, Pradeep Ravikumar, Inderjit S. Dhillon
Abstract	We develop Square Root Graphical Models (SQR), a novel class of parametric graphical models that provides multivariate generalizations of univariate exponential family distributions. Previous multivariate graphical models [Yang et al. 2015] did not allow positive dependencies for the exponential and Poisson generalizations. However, in many real-world datasets, variables clearly have positive dependencies. For example, the airport delay time in New York—modeled as an exponential distribution—is positively related to the delay time in Boston. With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix—a condition akin to the positive definiteness of the Gaussian covariance matrix. Our Poisson generalization allows for both positive and negative dependencies without any constraints on the parameter values. We also develop parameter estimation methods using node-wise regressions with $\ell_1$ regularization and likelihood approximation methods using sampling. Finally, we demonstrate our exponential generalization on a synthetic dataset and a real-world dataset of airport delay times.
Tasks
Published	2016-03-11
URL	http://arxiv.org/abs/1603.03629v2
PDF	http://arxiv.org/pdf/1603.03629v2.pdf
PWC	https://paperswithcode.com/paper/square-root-graphical-models-multivariate
Repo
Framework

Localizing and Orienting Street Views Using Overhead Imagery


Title	Localizing and Orienting Street Views Using Overhead Imagery
Authors	Nam Vo, James Hays
Abstract	In this paper we aim to determine the location and orientation of a ground-level query image by matching to a reference database of overhead (e.g. satellite) images. For this task we collect a new dataset with one million pairs of street view and overhead images sampled from eleven U.S. cities. We explore several deep CNN architectures for cross-domain matching – Classification, Hybrid, Siamese, and Triplet networks. Classification and Hybrid architectures are accurate but slow since they allow only partial feature precomputation. We propose a new loss function which significantly improves the accuracy of Siamese and Triplet embedding networks while maintaining their applicability to large-scale retrieval tasks like image geolocalization. This image matching task is challenging not just because of the dramatic viewpoint difference between ground-level and overhead imagery but because the orientation (i.e. azimuth) of the street views is unknown making correspondence even more difficult. We examine several mechanisms to match in spite of this – training for rotation invariance, sampling possible rotations at query time, and explicitly predicting relative rotation of ground and overhead images with our deep networks. It turns out that explicit orientation supervision also improves location prediction accuracy. Our best performing architectures are roughly 2.5 times as accurate as the commonly used Siamese network baseline.
Tasks
Published	2016-07-30
URL	http://arxiv.org/abs/1608.00161v2
PDF	http://arxiv.org/pdf/1608.00161v2.pdf
PWC	https://paperswithcode.com/paper/localizing-and-orienting-street-views-using
Repo
Framework