May 5, 2019

3198 words 16 mins read

Paper Group ANR 471

Local Sparse Approximation for Image Restoration with Adaptive Block Size Selection

Title Local Sparse Approximation for Image Restoration with Adaptive Block Size Selection
Authors Sujit Kumar Sahoo
Abstract In this paper, the problem of image restoration (denoising and inpainting) is approached using sparse approximation of local image blocks. The local image blocks are extracted by sliding square windows over the image. An adaptive block size selection procedure for local sparse approximation is proposed, which affects the global recovery of the underlying image. Ideally, the adaptive local block selection yields the minimum mean square error (MMSE) in the recovered image. This framework yields an image clustered by the selected block size; each cluster is then restored separately using sparse approximation. The results obtained with the proposed framework are comparable with those of recently proposed image restoration techniques.
Tasks Denoising, Image Restoration
Published 2016-12-20
URL http://arxiv.org/abs/1612.06738v1
PDF http://arxiv.org/pdf/1612.06738v1.pdf
PWC https://paperswithcode.com/paper/local-sparse-approximation-for-image
Repo
Framework
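
A minimal sketch of the block-wise sparse-approximation idea described above. The 2D DCT dictionary, the OMP sparsity level, and the residual-based choice among candidate block sizes are illustrative assumptions standing in for the paper's MMSE-driven adaptive selection; this is not the author's implementation.

```python
# Hedged sketch: sliding-window sparse approximation for denoising with a
# crude adaptive block-size choice (residual as a proxy for the MMSE criterion).
import numpy as np
from scipy.fft import dct
from sklearn.linear_model import orthogonal_mp

def dct_dictionary(b):
    """Orthonormal 2D DCT dictionary for b x b blocks (atoms as columns)."""
    C = dct(np.eye(b), axis=0, norm="ortho")
    return np.kron(C, C).T                      # shape (b*b, b*b), orthogonal

def denoise(noisy, block_sizes=(6, 8, 12), n_nonzero=6, stride=2):
    H, W = noisy.shape
    acc = np.zeros_like(noisy, dtype=float)
    weight = np.zeros_like(noisy, dtype=float)
    dicts = {b: dct_dictionary(b) for b in block_sizes}
    for i in range(0, H - max(block_sizes), stride):
        for j in range(0, W - max(block_sizes), stride):
            best = None
            for b in block_sizes:               # try each candidate block size
                patch = noisy[i:i + b, j:j + b].reshape(-1, 1)
                coef = orthogonal_mp(dicts[b], patch, n_nonzero_coefs=n_nonzero)
                rec = dicts[b] @ coef
                err = np.mean((rec.ravel() - patch.ravel()) ** 2)
                if best is None or err < best[0]:
                    best = (err, b, rec.reshape(b, b))
            _, b, rec = best
            acc[i:i + b, j:j + b] += rec        # average overlapping estimates
            weight[i:i + b, j:j + b] += 1.0
    weight[weight == 0] = 1.0
    return acc / weight
```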

Data Collection for Interactive Learning through the Dialog

Title Data Collection for Interactive Learning through the Dialog
Authors Miroslav Vodolán, Filip Jurčíček
Abstract This paper presents a dataset collected from natural dialogs that enables testing the ability of dialog systems to learn new facts from user utterances over the course of a dialog. Such interactive learning addresses one of the most pervasive problems of open-domain dialog systems: the sparsity of facts a dialog system can reason about. The proposed dataset, consisting of 1,900 collected dialogs, allows simulating the interactive acquisition of denotations and question explanations from users, which can then be used for interactive learning.
Tasks
Published 2016-03-31
URL http://arxiv.org/abs/1603.09631v2
PDF http://arxiv.org/pdf/1603.09631v2.pdf
PWC https://paperswithcode.com/paper/data-collection-for-interactive-learning
Repo
Framework

Quantifying Radiographic Knee Osteoarthritis Severity using Deep Convolutional Neural Networks

Title Quantifying Radiographic Knee Osteoarthritis Severity using Deep Convolutional Neural Networks
Authors Joseph Antony, Kevin McGuinness, Noel E. O’Connor, Kieran Moran
Abstract This paper proposes a new approach to automatically quantify the severity of knee osteoarthritis (OA) from radiographs using deep convolutional neural networks (CNNs). Clinically, knee OA severity is assessed using Kellgren & Lawrence (KL) grades, a five-point scale. Previous work on automatically predicting KL grades from radiograph images was based on training shallow classifiers using a variety of hand-engineered features. We demonstrate that classification accuracy can be significantly improved using deep convolutional neural network models pre-trained on ImageNet and fine-tuned on knee OA images. Furthermore, we argue that it is more appropriate to assess the accuracy of automatic knee OA severity predictions using a continuous distance-based evaluation metric like mean squared error than it is to use classification accuracy. This leads to formulating the prediction of KL grades as a regression problem and further improves accuracy. Results on a dataset of X-ray images and KL grades from the Osteoarthritis Initiative (OAI) show a sizable improvement over the current state of the art.
Tasks
Published 2016-09-08
URL http://arxiv.org/abs/1609.02469v1
PDF http://arxiv.org/pdf/1609.02469v1.pdf
PWC https://paperswithcode.com/paper/quantifying-radiographic-knee-osteoarthritis
Repo
Framework
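
The regression formulation above can be illustrated with a short fine-tuning sketch. The backbone (torchvision's resnet18), the optimizer settings, and the data pipeline are assumptions made for illustration; the paper's actual networks and training setup differ.

```python
# Minimal sketch of "KL grade as regression": fine-tune an ImageNet-pretrained
# backbone with an MSE objective on grades 0-4. Backbone and hyperparameters
# are illustrative choices, not the paper's configuration.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 1)      # single continuous KL-grade output

criterion = nn.MSELoss()                           # distance-based objective
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images, kl_grades):
    """images: (N, 3, 224, 224) float tensor; kl_grades: (N,) floats in [0, 4]."""
    optimizer.zero_grad()
    pred = model(images).squeeze(1)
    loss = criterion(pred, kl_grades.float())
    loss.backward()
    optimizer.step()
    return loss.item()

def grade_accuracy(pred, target):
    """Round the continuous prediction back to the nearest KL grade."""
    return (pred.round().clamp(0, 4) == target).float().mean().item()
```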

Textual Paralanguage and its Implications for Marketing Communications

Title Textual Paralanguage and its Implications for Marketing Communications
Authors Andrea Webb Luangrath, Joann Peck, Victor A. Barger
Abstract Both face-to-face communication and communication in online environments convey information beyond the actual verbal message. In a traditional face-to-face conversation, paralanguage, or the ancillary meaning- and emotion-laden aspects of speech that are not actual verbal prose, gives contextual information that allows interactors to more appropriately understand the message being conveyed. In this paper, we conceptualize textual paralanguage (TPL), which we define as written manifestations of nonverbal audible, tactile, and visual elements that supplement or replace written language and that can be expressed through words, symbols, images, punctuation, demarcations, or any combination of these elements. We develop a typology of textual paralanguage using data from Twitter, Facebook, and Instagram. We present a conceptual framework of antecedents and consequences of brands’ use of textual paralanguage. Implications for theory and practice are discussed.
Tasks
Published 2016-05-22
URL http://arxiv.org/abs/1605.06799v1
PDF http://arxiv.org/pdf/1605.06799v1.pdf
PWC https://paperswithcode.com/paper/textual-paralanguage-and-its-implications-for
Repo
Framework

Changepoint Detection in the Presence of Outliers

Title Changepoint Detection in the Presence of Outliers
Authors Paul Fearnhead, Guillem Rigaill
Abstract Many traditional methods for identifying changepoints can struggle in the presence of outliers, or when the noise is heavy-tailed. Often they will infer additional changepoints in order to fit the outliers. To overcome this problem, data often needs to be pre-processed to remove outliers, though this is difficult for applications where the data needs to be analysed online. We present an approach to changepoint detection that is robust to the presence of outliers. The idea is to adapt existing penalised cost approaches for detecting changes so that they use loss functions that are less sensitive to outliers. We argue that loss functions that are bounded, such as the classical biweight loss, are particularly suitable – as we show that only bounded loss functions are robust to arbitrarily extreme outliers. We present an efficient dynamic programming algorithm that can find the optimal segmentation under our penalised cost criteria. Importantly, this algorithm can be used in settings where the data needs to be analysed online. We show that we can consistently estimate the number of changepoints, and accurately estimate their locations, using the biweight loss function. We demonstrate the usefulness of our approach for applications such as analysing well-log data, detecting copy number variation, and detecting tampering of wireless devices.
Tasks
Published 2016-09-23
URL http://arxiv.org/abs/1609.07363v2
PDF http://arxiv.org/pdf/1609.07363v2.pdf
PWC https://paperswithcode.com/paper/changepoint-detection-in-the-presence-of
Repo
Framework
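
The penalised-cost idea with a bounded loss can be sketched with a simple O(n^2) optimal-partitioning recursion. The biweight-style segment cost below is approximated by searching the segment location parameter over observed values, and the recursion stands in for the authors' far more efficient dynamic programming algorithm; the penalty and threshold values are illustrative.

```python
# Hedged sketch of changepoint detection under a penalised cost with a
# bounded loss min((y - mu)^2, K^2), per the idea in the abstract above.
import numpy as np

def biweight_cost(segment, K):
    """Approximate min over mu of sum_i min((y_i - mu)^2, K^2), searching mu over the data."""
    losses = np.minimum((segment[None, :] - segment[:, None]) ** 2, K ** 2)
    return losses.sum(axis=1).min()

def changepoints(y, beta, K):
    n = len(y)
    F = np.full(n + 1, np.inf)
    F[0] = -beta                          # so every segment pays exactly one penalty beta
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        for s in range(t):
            cand = F[s] + biweight_cost(y[s:t], K) + beta
            if cand < F[t]:
                F[t], last[t] = cand, s
    cps, t = [], n                        # backtrack the optimal segmentation
    while t > 0:
        cps.append(last[t])
        t = last[t]
    return sorted(cps)[1:]                # drop the leading 0; the rest are changepoints

# A clear mean shift plus one extreme outlier: under the bounded loss the outlier
# should not induce a spurious changepoint, while the true shift at 60 is found.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 1, 60), rng.normal(5, 1, 60)])
y[30] = 40.0
print(changepoints(y, beta=3 * np.log(len(y)), K=3.0))
```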

Tracking Words in Chinese Poetry of Tang and Song Dynasties with the China Biographical Database

Title Tracking Words in Chinese Poetry of Tang and Song Dynasties with the China Biographical Database
Authors Chao-Lin Liu, Kuo-Feng Luo
Abstract Large-scale comparisons between the poetry of the Tang and Song dynasties shed light on how words, collocations, and expressions were used and shared among the poets. That some words were used only in Tang poetry and some only in Song poetry could lead to interesting research in linguistics. That the most frequent colors differ between Tang and Song poetry provides a trace of the changing social circumstances of the dynasties. Results of the current work link to research topics in lexicography, semantics, and social transitions. We discuss our findings and present our algorithms for efficient comparisons among the poems, which are crucial for completing billions of comparisons within an acceptable time.
Tasks
Published 2016-11-19
URL http://arxiv.org/abs/1611.06320v2
PDF http://arxiv.org/pdf/1611.06320v2.pdf
PWC https://paperswithcode.com/paper/tracking-words-in-chinese-poetry-of-tang-and
Repo
Framework
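
As a rough illustration of why corpus-scale comparisons need not be done poem-against-poem, the sketch below indexes character bigrams per corpus once and then answers "used only in Tang" / "used only in Song" questions with set operations. Treating character bigrams as word candidates, and the two toy poem lines, are simplifying assumptions, not the authors' algorithm.

```python
# Illustrative sketch: per-corpus bigram indexing replaces all-pairs comparison.
from collections import Counter

def bigram_counts(poems):
    """Count character bigrams within each whitespace-separated line segment."""
    counts = Counter()
    for poem in poems:
        for segment in poem.split():
            counts.update(segment[i:i + 2] for i in range(len(segment) - 1))
    return counts

# Toy corpora: one famous Tang line (Li Bai) and one Song line (Su Shi).
tang = bigram_counts(["床前明月光 疑是地上霜"])
song = bigram_counts(["明月几时有 把酒问青天"])

only_tang = set(tang) - set(song)        # candidate words exclusive to Tang poetry
only_song = set(song) - set(tang)
shared = set(tang) & set(song)           # e.g. "明月" appears in both toy corpora
print(sorted(only_tang), sorted(only_song), sorted(shared))
```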

The Pessimistic Limits and Possibilities of Margin-based Losses in Semi-supervised Learning

Title The Pessimistic Limits and Possibilities of Margin-based Losses in Semi-supervised Learning
Authors Jesse H. Krijthe, Marco Loog
Abstract Consider a classification problem where we have both labeled and unlabeled data available. We show that for linear classifiers defined by convex margin-based surrogate losses that are decreasing, it is impossible to construct any semi-supervised approach that is able to guarantee an improvement over the supervised classifier measured by this surrogate loss on the labeled and unlabeled data. For convex margin-based loss functions that also increase, we demonstrate safe improvements are possible.
Tasks
Published 2016-12-28
URL http://arxiv.org/abs/1612.08875v3
PDF http://arxiv.org/pdf/1612.08875v3.pdf
PWC https://paperswithcode.com/paper/the-pessimistic-limits-and-possibilities-of
Repo
Framework
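
The comparison implicit in the abstract can be made concrete with a small sketch: a candidate (semi-supervised) linear classifier "safely improves" on the supervised one only if its surrogate loss on the labeled and unlabeled data is no worse under every possible labeling of the unlabeled points. Because the loss decomposes over samples, the worst case can be taken per unlabeled point. The hinge loss and the toy data below are illustrative choices, not the paper's setup.

```python
# Sketch of the pessimistic, worst-case-labeling comparison criterion.
import numpy as np

def hinge(margins):
    return np.maximum(0.0, 1.0 - margins)

def guaranteed_improvement(w_cand, w_sup, X_lab, y_lab, X_unl, loss=hinge):
    """True iff w_cand is no worse than w_sup for every labeling of the unlabeled data."""
    diff_lab = loss(y_lab * (X_lab @ w_cand)) - loss(y_lab * (X_lab @ w_sup))
    # Per unlabeled point, an adversary picks the label that most favours w_sup.
    diff_pos = loss(X_unl @ w_cand) - loss(X_unl @ w_sup)        # if the true label were +1
    diff_neg = loss(-(X_unl @ w_cand)) - loss(-(X_unl @ w_sup))  # if the true label were -1
    worst = np.maximum(diff_pos, diff_neg)
    return bool(diff_lab.sum() + worst.sum() <= 0.0)

rng = np.random.default_rng(1)
X_lab = rng.normal(size=(10, 2))
y_lab = rng.choice([-1.0, 1.0], size=10)
X_unl = rng.normal(size=(50, 2))
w_sup = np.array([1.0, 0.0])                     # hypothetical supervised solution
w_cand = np.array([0.9, 0.1])                    # hypothetical semi-supervised candidate
print(guaranteed_improvement(w_cand, w_sup, X_lab, y_lab, X_unl))
```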

Elicitation for Preferences Single Peaked on Trees

Title Elicitation for Preferences Single Peaked on Trees
Authors Palash Dey, Neeldhara Misra
Abstract In multiagent systems, we often have a set of agents, each of which has a preference ordering over a set of items, and one would like to know these preference orderings for various tasks, for example, data analysis, preference aggregation, and voting. However, we often have a large number of items, which makes it impractical to ask the agents for their complete preference ordering. In such scenarios, we usually elicit these agents’ preferences by asking (a hopefully small number of) comparison queries, each asking an agent to compare two items. Prior works on preference elicitation focus on the unrestricted domain and the domain of single-peaked preferences, and show that preferences in the single-peaked domain can be elicited with far fewer queries than in the unrestricted domain. We extend this line of research and study preference elicitation for preferences single-peaked on trees, which form a strict superset of the domain of single-peaked preferences. We show that the query complexity crucially depends on the number of leaves, the path cover number, and the distance from a path of the underlying single-peaked tree, whereas other natural parameters like maximum degree, diameter, and pathwidth do not play any direct role in determining query complexity. We then investigate the query complexity of finding a weak Condorcet winner for preferences single-peaked on a tree and show that this task has much lower query complexity than preference elicitation. Here again we observe that the number of leaves in the underlying single-peaked tree and the path cover number of the tree influence the query complexity of the problem.
Tasks
Published 2016-04-15
URL http://arxiv.org/abs/1604.04403v1
PDF http://arxiv.org/pdf/1604.04403v1.pdf
PWC https://paperswithcode.com/paper/elicitation-for-preferences-single-peaked-on
Repo
Framework
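
For context, the sketch below shows only the unrestricted-domain baseline: eliciting one agent's full ranking by sorting the items with pairwise comparison queries, which costs on the order of m log m queries per agent. It is a generic query-counting illustration, not the paper's tree-based algorithm, whose point is that single-peakedness on a tree allows far fewer queries depending on parameters such as the number of leaves and the path cover number.

```python
# Generic comparison-query counter for preference elicitation (baseline only).
from functools import cmp_to_key

def elicit_ranking(items, prefers):
    """Elicit a full ranking; prefers(a, b) -> True if the agent prefers a to b (one query)."""
    queries = 0
    def cmp(a, b):
        nonlocal queries
        queries += 1
        return -1 if prefers(a, b) else 1
    ranking = sorted(items, key=cmp_to_key(cmp))
    return ranking, queries

# Toy agent whose true preference is ascending item index; items arrive shuffled.
items = list(range(16))[::-1]
ranking, q = elicit_ranking(items, lambda a, b: a < b)
print(ranking[:5], "...", q, "comparison queries")
```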

Active Ranking from Pairwise Comparisons and when Parametric Assumptions Don’t Help

Title Active Ranking from Pairwise Comparisons and when Parametric Assumptions Don’t Help
Authors Reinhard Heckel, Nihar B. Shah, Kannan Ramchandran, Martin J. Wainwright
Abstract We consider sequential or active ranking of a set of n items based on noisy pairwise comparisons. Items are ranked according to the probability that a given item beats a randomly chosen item, and ranking refers to partitioning the items into sets of pre-specified sizes according to their scores. This notion of ranking includes as special cases the identification of the top-k items and the total ordering of the items. We first analyze a sequential ranking algorithm that counts the number of comparisons won, and uses these counts to decide whether to stop, or to compare another pair of items, chosen based on confidence intervals specified by the data collected up to that point. We prove that this algorithm succeeds in recovering the ranking using a number of comparisons that is optimal up to logarithmic factors. This guarantee does not require any structural properties of the underlying pairwise probability matrix, unlike a significant body of past work on pairwise ranking based on parametric models such as the Thurstone or Bradley-Terry-Luce models. It has been a long-standing open question as to whether or not imposing these parametric assumptions allows for improved ranking algorithms. For stochastic comparison models, in which the pairwise probabilities are bounded away from zero, our second contribution is to resolve this issue by proving a lower bound for parametric models. This shows, perhaps surprisingly, that these popular parametric modeling choices offer at most logarithmic gains for stochastic comparisons.
Tasks
Published 2016-06-28
URL http://arxiv.org/abs/1606.08842v2
PDF http://arxiv.org/pdf/1606.08842v2.pdf
PWC https://paperswithcode.com/paper/active-ranking-from-pairwise-comparisons-and
Repo
Framework
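
A simplified sketch of the counting-plus-confidence-interval idea: estimate each item's score (its probability of beating a randomly chosen opponent) from noisy comparisons, and stop querying an item once its confidence interval resolves its top-k membership. The elimination rule, confidence radius, and Bradley-Terry-style toy oracle below are illustrative; they are not the paper's exact algorithm or analysis.

```python
# Hedged sketch: active top-k selection via win counts and Hoeffding-style intervals.
import numpy as np

def active_top_k(n, k, compare, delta=0.05, max_rounds=3000, seed=0):
    """compare(i, j) -> 1.0 if item i wins a noisy comparison against item j."""
    rng = np.random.default_rng(seed)
    wins, counts = np.zeros(n), np.zeros(n)
    lcb, ucb = np.full(n, -np.inf), np.full(n, np.inf)
    undecided = set(range(n))
    for t in range(1, max_rounds + 1):
        for i in undecided:                          # query only unresolved items
            j = int((i + rng.integers(1, n)) % n)    # uniformly random opponent != i
            wins[i] += compare(i, j)
            counts[i] += 1
        idx = np.array(sorted(undecided))
        score = wins[idx] / counts[idx]
        rad = np.sqrt(np.log(4 * n * t * t / delta) / (2 * counts[idx]))
        lcb[idx], ucb[idx] = score - rad, score + rad    # resolved items keep frozen bounds
        for i in list(undecided):
            if np.sum(ucb < lcb[i]) >= n - k or np.sum(lcb > ucb[i]) >= k:
                undecided.discard(i)                 # i's top-k membership is resolved
        if not undecided:
            break
    estimate = wins / np.maximum(counts, 1)
    return list(np.argsort(-estimate)[:k])           # report the estimated top-k set

# Toy oracle with Bradley-Terry-style win probabilities: higher theta, better item.
theta = np.linspace(0.0, 6.0, 20)
beats = lambda i, j: float(np.random.rand() < 1.0 / (1.0 + np.exp(theta[j] - theta[i])))
print(sorted(active_top_k(20, 5, beats)))            # typically recovers items 15..19
```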

Applications of Data Mining (DM) in Science and Engineering: State of the art and perspectives

Title Applications of Data Mining (DM) in Science and Engineering: State of the art and perspectives
Authors Jose A. García Gutiérrez
Abstract The continuous increase in the availability of data of all kinds, coupled with the development of high-speed communication networks, the popularization of cloud computing, the growth of data centers, and the emergence of high-performance computing, makes it essential to develop techniques for more efficient processing and analysis of large datasets and for the extraction of valuable information. In the following pages we discuss the development of this field in recent decades and its potential and applicability in the various branches of scientific research. We also briefly review the different families of algorithms included in the data mining research area, their scalability as the dimensionality of the input data increases, how this can be addressed, and how different methods behave in scenarios where information is distributed or processed in a decentralized manner so as to improve performance in heterogeneous environments.
Tasks
Published 2016-09-17
URL http://arxiv.org/abs/1609.05401v1
PDF http://arxiv.org/pdf/1609.05401v1.pdf
PWC https://paperswithcode.com/paper/applications-of-data-mining-dm-in-science-and
Repo
Framework

Improving training of deep neural networks via Singular Value Bounding

Title Improving training of deep neural networks via Singular Value Bounding
Authors Kui Jia
Abstract Deep learning methods have recently achieved great success on many computer vision problems, with image classification and object detection as prominent examples. In spite of these practical successes, optimization of deep networks remains an active topic in deep learning research. In this work, we focus on investigating the network solution properties that can potentially lead to good performance. Our research is inspired by theoretical and empirical results that use orthogonal matrices to initialize networks, but we are interested in investigating how orthogonal weight matrices perform when network training converges. To this end, we propose to constrain the solutions of weight matrices to the orthogonal feasible set during the whole process of network training, and achieve this by a simple yet effective method called Singular Value Bounding (SVB). In SVB, all singular values of each weight matrix are simply bounded in a narrow band around the value of 1. Based on the same motivation, we also propose Bounded Batch Normalization (BBN), which improves Batch Normalization by removing its potential risk of ill-conditioned layer transforms. We present both theoretical and empirical results to justify our proposed methods. Experiments on benchmark image classification datasets show the efficacy of our proposed SVB and BBN. In particular, we achieve state-of-the-art error rates of 3.06% on CIFAR10 and 16.90% on CIFAR100 using off-the-shelf network architectures (Wide ResNets). Our preliminary results on ImageNet also show promise for large-scale learning.
Tasks Image Classification, Object Detection
Published 2016-11-18
URL http://arxiv.org/abs/1611.06013v3
PDF http://arxiv.org/pdf/1611.06013v3.pdf
PWC https://paperswithcode.com/paper/improving-training-of-deep-neural-networks
Repo
Framework
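
A minimal sketch of the SVB projection as described in the abstract: every few optimization steps, each weight matrix is projected back so that all of its singular values lie in a narrow band around 1. The band width, the update frequency, and the flattening of convolution kernels to 2-D are illustrative choices; BBN is not sketched.

```python
# Hedged sketch of Singular Value Bounding: clip singular values into [1/(1+eps), 1+eps].
import torch

@torch.no_grad()
def singular_value_bounding(model, eps=0.05):
    """Project every >=2-D weight matrix so its singular values lie near 1."""
    for p in model.parameters():
        if p.dim() < 2:
            continue                                  # skip biases and normalization scales
        W = p.reshape(p.shape[0], -1)                 # flatten conv kernels to 2-D
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        S = S.clamp(1.0 / (1.0 + eps), 1.0 + eps)     # bound all singular values near 1
        p.copy_(((U * S) @ Vh).reshape(p.shape))

# Example usage inside a training loop (the frequency is an illustrative choice):
#   if step % 100 == 0:
#       singular_value_bounding(model)
```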

Input Aggregated Network for Face Video Representation

Title Input Aggregated Network for Face Video Representation
Authors Zhen Dong, Su Jia, Chi Zhang, Mingtao Pei
Abstract Recently, deep neural networks have shown promising performance in face image recognition. The inputs of most networks are face images, and there is hardly any work reported in the literature on networks with face videos as input. To sufficiently discover the useful information contained in face videos, we present a novel network architecture called the input aggregated network, which is able to learn fixed-length representations for variable-length face videos. To accomplish this goal, an aggregation unit is designed to model a face video with various frames as a point on a Riemannian manifold, and the mapping unit aims at mapping the point into a high-dimensional space where face videos belonging to the same subject are close-by and others are distant. These two units, together with the frame representation unit, build an end-to-end learning system that can learn representations of face videos for specific tasks. Experiments on two public face video datasets demonstrate the effectiveness of the proposed network.
Tasks
Published 2016-03-22
URL http://arxiv.org/abs/1603.06655v1
PDF http://arxiv.org/pdf/1603.06655v1.pdf
PWC https://paperswithcode.com/paper/input-aggregated-network-for-face-video
Repo
Framework
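
One common way to realize "a face video as a point on a Riemannian manifold" is covariance pooling on the SPD manifold followed by a log-map into a flat space. The sketch below uses that instantiation as a hedged stand-in for the paper's aggregation and mapping units, whose exact design may differ.

```python
# Hedged sketch: covariance pooling over frame features + matrix-log mapping unit.
import torch
import torch.nn as nn

class CovarianceAggregation(nn.Module):
    def __init__(self, feat_dim, embed_dim, eps=1e-4):
        super().__init__()
        self.eps = eps
        self.mapping = nn.Sequential(                     # stand-in for the "mapping unit"
            nn.Linear(feat_dim * feat_dim, 512), nn.ReLU(),
            nn.Linear(512, embed_dim))

    def forward(self, frame_feats):
        """frame_feats: (T, D) features for a variable-length face video."""
        X = frame_feats - frame_feats.mean(dim=0, keepdim=True)
        cov = X.T @ X / max(X.shape[0] - 1, 1)
        cov = cov + self.eps * torch.eye(cov.shape[0])    # keep it strictly SPD
        eigval, eigvec = torch.linalg.eigh(cov)
        log_cov = eigvec @ torch.diag(torch.log(eigval.clamp_min(self.eps))) @ eigvec.T
        return self.mapping(log_cov.reshape(-1))          # fixed-length embedding

video = torch.randn(37, 64)                               # 37 frames of 64-d toy features
print(CovarianceAggregation(64, 128)(video).shape)        # -> torch.Size([128])
```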

Left-corner Methods for Syntactic Modeling with Universal Structural Constraints

Title Left-corner Methods for Syntactic Modeling with Universal Structural Constraints
Authors Hiroshi Noji
Abstract The primary goal of this thesis is to identify a better syntactic constraint, or bias, that is language-independent but also efficiently exploitable during sentence processing. We focus on a particular syntactic construction called center-embedding, which is well studied in psycholinguistics and noted to cause particular difficulty for comprehension. Since people use language as a tool for communication, one expects such complex constructions to be avoided for communication efficiency. From a computational perspective, center-embedding is closely related to the left-corner parsing algorithm, which can capture the degree of center-embedding of a parse tree as it is being constructed. This connection suggests that left-corner methods can be a tool to exploit the universal syntactic constraint that people avoid generating center-embedded structures. We explore such uses of center-embedding as well as left-corner methods extensively through several theoretical and empirical examinations. Our primary task is unsupervised grammar induction. In this task, the input to the algorithm is a collection of sentences, from which the model tries to extract salient patterns as a grammar. This is a particularly hard problem, although we expect the universal constraint to help improve performance, since it can effectively restrict the possible search space for the model. We build the model by extending the left-corner parsing algorithm to efficiently tabulate the search space while excluding structures whose degree of center-embedding exceeds a specific threshold. We examine the effectiveness of our approach on many treebanks, and demonstrate that our constraint often leads to better parsing performance. We thus conclude that left-corner methods are particularly useful for syntax-oriented systems, as they can efficiently exploit the inherent universal constraints in languages.
Tasks
Published 2016-08-01
URL http://arxiv.org/abs/1608.00293v1
PDF http://arxiv.org/pdf/1608.00293v1.pdf
PWC https://paperswithcode.com/paper/left-corner-methods-for-syntactic-modeling
Repo
Framework

On Choosing Training and Testing Data for Supervised Algorithms in Ground Penetrating Radar Data for Buried Threat Detection

Title On Choosing Training and Testing Data for Supervised Algorithms in Ground Penetrating Radar Data for Buried Threat Detection
Authors Daniël Reichman, Leslie M. Collins, Jordan M. Malof
Abstract Ground penetrating radar (GPR) is one of the most popular and successful sensing modalities that has been investigated for landmine and subsurface threat detection. Many of the detection algorithms applied to this task are supervised and therefore require labeled examples of target and non-target data for training. Training data most often consists of 2-dimensional images (or patches) of GPR data, from which features are extracted and provided to the classifier during training and testing. Identifying desirable training and testing locations from which to extract patches, which we term “keypoints”, is well established in the literature. In contrast, however, a large variety of strategies have been proposed regarding keypoint utilization (e.g., how many of the identified keypoints should be used at target, or non-target, locations). Given the variety of keypoint utilization strategies that are available, it is very unclear (i) which strategies are best, or (ii) whether the choice of strategy has a large impact on classifier performance. We address these questions by presenting a taxonomy of existing utilization strategies, and then evaluating their effectiveness on a large dataset using many different classifiers and features. We analyze the results and propose a new strategy, called PatchSelect, which outperforms the other strategies across all experiments.
Tasks
Published 2016-12-11
URL http://arxiv.org/abs/1612.03477v1
PDF http://arxiv.org/pdf/1612.03477v1.pdf
PWC https://paperswithcode.com/paper/on-choosing-training-and-testing-data-for
Repo
Framework
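
To make the notion of a "keypoint utilization strategy" concrete, the sketch below extracts fixed-size patches around keypoints in a B-scan and contrasts two generic strategies (use every keypoint vs. keep only the highest-energy one). The strategies, patch size, and toy data are illustrative; PatchSelect itself is not reproduced here.

```python
# Illustrative sketch of keypoint-based patch extraction and two toy utilization strategies.
import numpy as np

def extract_patch(bscan, row, col, h=24, w=10):
    """Cut an h x w patch roughly centred on (row, col), clipped at the array border."""
    r0, c0 = max(row - h // 2, 0), max(col - w // 2, 0)
    return bscan[r0:r0 + h, c0:c0 + w]

def use_all_keypoints(bscan, keypoints):
    return [extract_patch(bscan, r, c) for r, c in keypoints]

def use_strongest_keypoint(bscan, keypoints):
    energy = lambda rc: float(np.abs(extract_patch(bscan, *rc)).sum())
    return [extract_patch(bscan, *max(keypoints, key=energy))]

bscan = np.random.randn(128, 64)                 # toy B-scan
keypoints = [(40, 20), (42, 21), (90, 50)]       # toy keypoint locations
print(len(use_all_keypoints(bscan, keypoints)),
      len(use_strongest_keypoint(bscan, keypoints)))
```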

Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution

Title Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution
Authors Ting Liu, Yiming Cui, Qingyu Yin, Weinan Zhang, Shijin Wang, Guoping Hu
Abstract Most existing approaches to zero pronoun resolution rely heavily on annotated data, which is often released by shared task organizers; the lack of annotated data therefore becomes a major obstacle to progress on the zero pronoun resolution task. It is also expensive to label data manually for better performance. To alleviate this problem, in this paper we propose a simple but novel approach to automatically generate large-scale pseudo training data for zero pronoun resolution. Furthermore, we successfully transfer a cloze-style reading comprehension neural network model to the zero pronoun resolution task and propose a two-step training mechanism to overcome the gap between the pseudo training data and the real data. Experimental results show that the proposed approach significantly outperforms the state-of-the-art systems with an absolute improvement of 3.1% in F-score on OntoNotes 5.0 data.
Tasks Reading Comprehension
Published 2016-06-06
URL http://arxiv.org/abs/1606.01603v3
PDF http://arxiv.org/pdf/1606.01603v3.pdf
PWC https://paperswithcode.com/paper/generating-and-exploiting-large-scale-pseudo
Repo
Framework
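
A toy sketch of the pseudo-training-data idea: pick a content word in a document that also occurs earlier, blank one later occurrence, and ask the model to recover it cloze-style from the context. The sentence splitting and word-picking heuristics below are simplified stand-ins for the paper's generation procedure.

```python
# Hedged sketch: generating cloze-style pseudo examples from plain documents.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on"}

def make_pseudo_examples(document, blank="<blank>"):
    """Blank one earlier-seen content word per sentence and keep it as the answer."""
    sentences = [s.strip() for s in re.split(r"[.!?]", document) if s.strip()]
    examples = []
    for idx in range(1, len(sentences)):
        context = " ".join(sentences[:idx])
        seen = Counter(w.lower() for w in context.split())
        words = sentences[idx].split()
        for pos, w in enumerate(words):
            if w.isalpha() and w.lower() not in STOPWORDS and seen[w.lower()] > 0:
                query = " ".join(words[:pos] + [blank] + words[pos + 1:])
                examples.append({"context": context, "query": query, "answer": w})
                break                     # one pseudo example per sentence is enough here
    return examples

doc = "The committee met on Monday. The committee approved the budget."
print(make_pseudo_examples(doc))
# -> one example whose query is "The <blank> approved the budget" with answer "committee"
```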