July 26, 2019

2989 words 15 mins read

Paper Group ANR 762

From Task Classification Towards Similarity Measures for Recommendation in Crowdsourcing Systems. A First Step in Combining Cognitive Event Features and Natural Language Representations to Predict Emotions. Krylov Subspace Recycling for Fast Iterative Least-Squares in Machine Learning. Beyond Parity: Fairness Objectives for Collaborative Filtering. …

From Task Classification Towards Similarity Measures for Recommendation in Crowdsourcing Systems

Title From Task Classification Towards Similarity Measures for Recommendation in Crowdsourcing Systems
Authors Steffen Schnitzer, Svenja Neitzel, Christoph Rensing
Abstract Task selection in micro-task markets can be supported by recommender systems that help individuals find appropriate tasks. Previous work showed that, when selecting a micro-task, semantic aspects such as the required action and the comprehensibility are rated as more important than factual aspects such as the payment or the required completion time. This work lays a foundation for creating similarity measures that reflect such semantic aspects. To this end, we show that an automatic classification based on task descriptions is possible. Additionally, we propose similarity measures to cluster micro-tasks according to semantic aspects.
Tasks Recommendation Systems
Published 2017-07-20
URL http://arxiv.org/abs/1707.06562v1
PDF http://arxiv.org/pdf/1707.06562v1.pdf
PWC https://paperswithcode.com/paper/from-task-classification-towards-similarity
Repo
Framework
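
The classification-plus-similarity idea above can be illustrated with a generic text pipeline. The sketch below is not the authors' system: it uses TF-IDF features over invented task descriptions, a logistic-regression classifier, and cosine similarity as a stand-in similarity measure.

```python
# Minimal sketch (not the authors' pipeline). Task texts and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

descriptions = [
    "Transcribe the audio clip into English text.",
    "Label each image with the objects it contains.",
    "Categorize the sentiment of this product review.",
    "Draw bounding boxes around cars in the photo.",
]
labels = ["transcription", "image annotation", "text classification", "image annotation"]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(descriptions)

# Automatic classification of tasks from their descriptions.
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(vectorizer.transform(["Find and box all pedestrians in the image."])))

# A simple similarity measure over the same representation, usable for
# clustering micro-tasks or recommending tasks similar to previously chosen ones.
print(cosine_similarity(X).round(2))
```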

A First Step in Combining Cognitive Event Features and Natural Language Representations to Predict Emotions

Title A First Step in Combining Cognitive Event Features and Natural Language Representations to Predict Emotions
Authors Andres Campero, Bjarke Felbo, Joshua B. Tenenbaum, Rebecca Saxe
Abstract We explore the representational space of emotions by combining methods from different academic fields. Cognitive science has proposed appraisal theory as a view on human emotion with previous research showing how human-rated abstract event features can predict fine-grained emotions and capture the similarity space of neural patterns in mentalizing brain regions. At the same time, natural language processing (NLP) has demonstrated how transfer and multitask learning can be used to cope with scarcity of annotated data for text modeling. The contribution of this work is to show that appraisal theory can be combined with NLP for mutual benefit. First, fine-grained emotion prediction can be improved to human-level performance by using NLP representations in addition to appraisal features. Second, using the appraisal features as auxiliary targets during training can improve predictions even when only text is available as input. Third, we obtain a representation with a similarity matrix that better correlates with the neural activity across regions. Best results are achieved when the model is trained to simultaneously predict appraisals, emotions and emojis using a shared representation. While these results are preliminary, the integration of cognitive neuroscience and NLP techniques opens up an interesting direction for future research.
Tasks
Published 2017-10-23
URL http://arxiv.org/abs/1710.08048v1
PDF http://arxiv.org/pdf/1710.08048v1.pdf
PWC https://paperswithcode.com/paper/a-first-step-in-combining-cognitive-event
Repo
Framework
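
As a concrete illustration of the shared-representation setup described above, here is a toy multitask model: a small encoder over placeholder text embeddings with one head regressing appraisal features as auxiliary targets and another classifying emotions. All tensors, dimensions, and the loss weighting are assumptions made for the sketch, not the paper's configuration (which also predicts emojis).

```python
# Toy multitask sketch, not the paper's model. All tensors are random
# placeholders and the dimensions / loss weight are arbitrary assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

n, d_text, d_appraisal, n_emotions = 64, 300, 38, 20
text_emb = torch.randn(n, d_text)            # stand-in for NLP text representations
appraisals = torch.randn(n, d_appraisal)     # stand-in for human-rated event features
emotions = torch.randint(0, n_emotions, (n,))

class MultiTaskEmotion(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_text, 128), nn.ReLU())
        self.appraisal_head = nn.Linear(128, d_appraisal)   # auxiliary targets
        self.emotion_head = nn.Linear(128, n_emotions)      # fine-grained emotions

    def forward(self, x):
        h = self.encoder(x)                  # shared representation
        return self.appraisal_head(h), self.emotion_head(h)

model = MultiTaskEmotion()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    appr_pred, emo_logits = model(text_emb)
    # Emotion classification plus appraisal regression as an auxiliary loss,
    # so appraisal supervision shapes the shared representation.
    loss = F.cross_entropy(emo_logits, emotions) + 0.5 * F.mse_loss(appr_pred, appraisals)
    loss.backward()
    opt.step()
print(float(loss))
```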

Krylov Subspace Recycling for Fast Iterative Least-Squares in Machine Learning

Title Krylov Subspace Recycling for Fast Iterative Least-Squares in Machine Learning
Authors Filip de Roos, Philipp Hennig
Abstract Solving symmetric positive definite linear problems is a fundamental computational task in machine learning. The exact solution, famously, is cubically expensive in the size of the matrix. To alleviate this problem, several linear-time approximations, such as spectral and inducing-point methods, have been suggested and are now in wide use. These are low-rank approximations that choose the low-rank space a priori and do not refine it over time. While this allows linear cost in the dataset size, it also causes a finite, uncorrected approximation error. Authors from numerical linear algebra have explored ways to iteratively refine such low-rank approximations, at the cost of a small number of matrix-vector multiplications. This idea is particularly interesting in the many situations in machine learning where one has to solve a sequence of related symmetric positive definite linear problems. From the machine learning perspective, such deflation methods can be interpreted as transfer learning of a low-rank approximation across a time series of numerical tasks. We study the use of such methods for our field. Our empirical results show that, on regression and classification problems of intermediate size, this approach can interpolate between low computational cost and numerical precision.
Tasks Time Series, Transfer Learning
Published 2017-06-01
URL http://arxiv.org/abs/1706.00241v1
PDF http://arxiv.org/pdf/1706.00241v1.pdf
PWC https://paperswithcode.com/paper/krylov-subspace-recycling-for-fast-iterative
Repo
Framework
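
The recycling idea can be sketched in a few lines: keep a small basis gathered from earlier solves, project the new right-hand side onto it to get a strong initial guess, and let conjugate gradients finish the job. This is a simplified stand-in for the deflation methods discussed in the paper; the recycled basis here is just a random orthonormal matrix.

```python
# Simplified numpy illustration, not the paper's exact deflation algorithm.
import numpy as np

def cg(A, b, x0, tol=1e-8, maxiter=200):
    """Plain conjugate gradients for an SPD system, started from x0."""
    x, r = x0.copy(), b - A @ x0
    p = r.copy()
    for _ in range(maxiter):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x

def recycled_solve(A, b, W):
    """Galerkin projection onto the recycled subspace span(W), then CG refinement."""
    coeffs = np.linalg.solve(W.T @ (A @ W), W.T @ b)   # small k-by-k system
    return cg(A, b, x0=W @ coeffs)

rng = np.random.default_rng(0)
n, k = 200, 10
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                       # a random SPD matrix
W, _ = np.linalg.qr(rng.standard_normal((n, k)))  # stand-in for a recycled basis
b = rng.standard_normal(n)
x = recycled_solve(A, b, W)
print(np.linalg.norm(A @ x - b))                  # residual of the refined solution
```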

Beyond Parity: Fairness Objectives for Collaborative Filtering

Title Beyond Parity: Fairness Objectives for Collaborative Filtering
Authors Sirui Yao, Bert Huang
Abstract We study fairness in collaborative-filtering recommender systems, which are sensitive to discrimination that exists in historical data. Biased data can lead collaborative-filtering methods to make unfair predictions for users from minority groups. We identify the insufficiency of existing fairness metrics and propose four new metrics that address different forms of unfairness. These fairness metrics can be optimized by adding fairness terms to the learning objective. Experiments on synthetic and real data show that our new metrics can better measure fairness than the baseline, and that the fairness objectives effectively help reduce unfairness.
Tasks Recommendation Systems
Published 2017-05-24
URL http://arxiv.org/abs/1705.08804v2
PDF http://arxiv.org/pdf/1705.08804v2.pdf
PWC https://paperswithcode.com/paper/beyond-parity-fairness-objectives-for
Repo
Framework
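
To make the idea of a fairness term concrete, the sketch below computes one simple group-based unfairness measure (the absolute gap in mean signed prediction error between two user groups) and adds it to a squared-error objective. The paper defines its own four metrics; this generic stand-in and the synthetic ratings are illustrative assumptions.

```python
# Illustrative only: one generic group-based unfairness measure, not the
# paper's four metrics. Ratings, predictions, and group labels are synthetic.
import numpy as np

def group_error_gap(y_true, y_pred, group):
    """Absolute difference in mean signed error between two user groups."""
    err = y_pred - y_true
    return abs(err[group == 0].mean() - err[group == 1].mean())

def penalized_loss(y_true, y_pred, group, lam=1.0):
    """Squared error plus a fairness term that could be added to a CF objective."""
    return np.mean((y_pred - y_true) ** 2) + lam * group_error_gap(y_true, y_pred, group)

rng = np.random.default_rng(1)
ratings = rng.integers(1, 6, size=200).astype(float)
preds = ratings + rng.normal(0.0, 0.5, size=200)
groups = rng.integers(0, 2, size=200)
preds[groups == 1] += 0.8            # simulate systematic bias against one group
print(group_error_gap(ratings, preds, groups))
print(penalized_loss(ratings, preds, groups))
```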

Learning Structured Natural Language Representations for Semantic Parsing

Title Learning Structured Natural Language Representations for Semantic Parsing
Authors Jianpeng Cheng, Siva Reddy, Vijay Saraswat, Mirella Lapata
Abstract We introduce a neural semantic parser that converts natural language utterances to intermediate representations in the form of predicate-argument structures, which are induced with a transition system and subsequently mapped to target domains. The semantic parser is trained end-to-end using annotated logical forms or their denotations. We obtain competitive results on various datasets. The induced predicate-argument structures shed light on the types of representations useful for semantic parsing and how these are different from linguistically motivated ones.
Tasks Semantic Parsing
Published 2017-04-27
URL http://arxiv.org/abs/1704.08387v3
PDF http://arxiv.org/pdf/1704.08387v3.pdf
PWC https://paperswithcode.com/paper/learning-structured-natural-language
Repo
Framework
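
A transition system of the kind mentioned in the abstract can be illustrated with a toy interpreter: a sequence of actions manipulates a stack to build a nested predicate-argument structure. The action inventory and the example derivation below are invented for illustration and are not the authors' transition system.

```python
# Toy transition system; action names and the example derivation are invented.

def execute(actions):
    """Apply NT(pred) / TER(arg) / REDUCE actions and return the final structure."""
    stack = []
    for act in actions:
        if act[0] == "NT":            # open a predicate with an empty argument list
            stack.append([act[1]])
        elif act[0] == "TER":         # attach a terminal argument to the open predicate
            stack[-1].append(act[1])
        elif act[0] == "REDUCE":      # close the predicate and attach it to its parent
            closed = tuple(stack.pop())
            if not stack:
                return closed
            stack[-1].append(closed)
    return tuple(stack[-1])

# "which city hosted the 2012 olympics"  ->  answer(city(host(2012_olympics)))
actions = [("NT", "answer"), ("NT", "city"), ("NT", "host"),
           ("TER", "2012_olympics"), ("REDUCE",), ("REDUCE",), ("REDUCE",)]
print(execute(actions))
```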

A Faster Patch Ordering Method for Image Denoising

Title A Faster Patch Ordering Method for Image Denoising
Authors Badre Munir
Abstract Among patch-based image denoising methods, smooth ordering of local patches (patch ordering) has been shown to give state-of-the-art results. For image denoising, the patch ordering method forms two large TSPs (Traveling Salesman Problems) composed of nodes in N-dimensional space. Ten approximate solutions of the two large TSPs are then used in a filtering process to form the reconstructed image. The use of large TSPs makes patch ordering a computationally intensive method. A modified patch ordering method for image denoising is proposed. In the proposed method, several smaller TSPs are formed and the filtering process is varied to work with the solutions of these smaller TSPs. In terms of PSNR, the denoising results of the proposed method differ from those of the original method by 0.016 dB to 0.032 dB on average. In the original method, solving the TSPs was observed to consume 85% of the execution time. In the proposed method, the time for solving the TSPs can be reduced to half of that required by the original method. The proposed method can denoise images in 40% less time.
Tasks Denoising, Image Denoising
Published 2017-04-26
URL http://arxiv.org/abs/1704.08090v1
PDF http://arxiv.org/pdf/1704.08090v1.pdf
PWC https://paperswithcode.com/paper/a-faster-patch-ordering-method-for-image
Repo
Framework
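
A heavily simplified version of the patch-ordering pipeline is sketched below: patches are ordered by a greedy nearest-neighbour tour (a cheap TSP approximation), centre pixels are smoothed with a 1D filter along that tour, and the filtered values are written back. This is meant only to show the mechanics, not the proposed smaller-TSP method or its timings.

```python
# Simplified numpy sketch of the patch-ordering mechanics, not the proposed method.
import numpy as np

def greedy_tour(patches):
    """Greedy nearest-neighbour ordering of patch vectors (a rough TSP approximation)."""
    remaining = list(range(len(patches)))
    order = [remaining.pop(0)]
    while remaining:
        d = np.sum((patches[remaining] - patches[order[-1]]) ** 2, axis=1)
        order.append(remaining.pop(int(np.argmin(d))))
    return order

def patch_order_denoise(img, patch=5, window=7):
    h, w = img.shape
    r = patch // 2
    coords = [(i, j) for i in range(r, h - r) for j in range(r, w - r)]
    patches = np.array([img[i - r:i + r + 1, j - r:j + r + 1].ravel() for i, j in coords])
    order = greedy_tour(patches)
    centres = np.array([img[coords[k]] for k in order])
    smoothed = np.convolve(centres, np.ones(window) / window, mode="same")  # 1D filter
    out = img.copy()
    for val, k in zip(smoothed, order):
        out[coords[k]] = val
    return out

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))   # smooth placeholder image
noisy = clean + rng.normal(0.0, 0.1, clean.shape)
print(np.abs(noisy - clean).mean(), np.abs(patch_order_denoise(noisy) - clean).mean())
```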

Methods for Detecting Paraphrase Plagiarism

Title Methods for Detecting Paraphrase Plagiarism
Authors Victor Thompson
Abstract Paraphrase plagiarism is one of the difficult challenges facing plagiarism detection systems. Paraphrasing occurs when texts are lexically or syntactically altered to look different but retain their original meaning. Most plagiarism detection systems (many of which are commercial) are designed to detect word co-occurrences and light modifications, but are unable to detect severe semantic and structural alterations such as those seen in many academic documents, so many paraphrase plagiarism cases go undetected. In this paper, we approach the problem of paraphrase plagiarism by proposing methods for detecting the most common techniques (phenomena) used in paraphrasing texts, namely lexical substitution, insertion/deletion, and word and phrase reordering, and we combine these methods into a paraphrase detection model. We evaluated our proposed methods and model on collections containing paraphrased texts. Experimental results show a significant improvement in performance when the methods were combined (the proposed model) as opposed to running them individually. The results also show that the proposed paraphrase detection model outperformed a standard baseline (based on greedy string tiling) as well as previous studies.
Tasks
Published 2017-12-29
URL http://arxiv.org/abs/1712.10309v1
PDF http://arxiv.org/pdf/1712.10309v1.pdf
PWC https://paperswithcode.com/paper/methods-for-detecting-paraphrase-plagiarism
Repo
Framework
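
A toy version of the combination idea is shown below: sentences are normalised with a small synonym table (standing in for lexical-substitution detection) and compared with a set-based overlap that ignores word order (tolerating reordering). The synonym table and scoring are invented for this sketch and are far simpler than the paper's methods.

```python
# Toy illustration only; the synonym table and the scoring are invented.
def normalise(tokens, synonyms):
    """Map tokens to canonical forms so simple lexical substitutions still match."""
    return {synonyms.get(t, t) for t in tokens}

def paraphrase_score(a, b, synonyms):
    """Order-insensitive Jaccard overlap after synonym normalisation."""
    sa = normalise(a.lower().split(), synonyms)
    sb = normalise(b.lower().split(), synonyms)
    return len(sa & sb) / len(sa | sb)

synonyms = {"purchased": "bought", "automobile": "car", "rapidly": "quickly"}
s1 = "She purchased a new automobile rapidly"
s2 = "Quickly she bought a new car"
print(paraphrase_score(s1, s2, synonyms))   # high despite substitution and reordering
```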

Polish Read Speech Corpus for Speech Tools and Services

Title Polish Read Speech Corpus for Speech Tools and Services
Authors Danijel Koržinek, Krzysztof Marasek, Łukasz Brocki, Krzysztof Wołk
Abstract This paper describes the speech processing activities conducted at the Polish consortium of the CLARIN project. The purpose of this segment of the project was to develop specific tools that would allow for automatic and semi-automatic processing of large quantities of acoustic speech data. The tools include the following: grapheme-to-phoneme conversion, speech-to-text alignment, voice activity detection, speaker diarization, keyword spotting and automatic speech transcription. Furthermore, in order to develop these tools, a large high-quality studio speech corpus was recorded and released under an open license, to encourage development in the area of Polish speech research. Another purpose of the corpus was to serve as a reference for studies in phonetics and pronunciation. All the tools and resources were released on the Polish CLARIN website. This paper discusses the current status and future plans for the project.
Tasks Action Detection, Activity Detection, Keyword Spotting, Speaker Diarization
Published 2017-06-01
URL http://arxiv.org/abs/1706.00245v1
PDF http://arxiv.org/pdf/1706.00245v1.pdf
PWC https://paperswithcode.com/paper/polish-read-speech-corpus-for-speech-tools
Repo
Framework
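
Of the tools listed, voice activity detection is the easiest to illustrate compactly. The sketch below is a generic energy-based detector, not the CLARIN-PL tool: it frames the signal, computes short-time energy, and flags frames that exceed a noise-relative threshold.

```python
# Generic energy-based VAD sketch; thresholds and frame sizes are arbitrary.
import numpy as np

def vad(signal, sr, frame_ms=25, hop_ms=10, threshold_db=15.0):
    """Return a boolean speech/non-speech decision per frame."""
    frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame, hop)]
    energy_db = 10 * np.log10(np.array([np.mean(f ** 2) for f in frames]) + 1e-12)
    floor = np.percentile(energy_db, 10)       # rough noise-floor estimate
    return energy_db > floor + threshold_db

sr = 16000
t = np.arange(sr) / sr
silence = 0.001 * np.random.default_rng(0).standard_normal(sr)
tone = 0.3 * np.sin(2 * np.pi * 220 * t)       # stand-in for a voiced segment
mask = vad(np.concatenate([silence, tone, silence]), sr)
print(mask.mean())                             # fraction of frames flagged as speech
```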

Attend and Diagnose: Clinical Time Series Analysis using Attention Models

Title Attend and Diagnose: Clinical Time Series Analysis using Attention Models
Authors Huan Song, Deepta Rajan, Jayaraman J. Thiagarajan, Andreas Spanias
Abstract With the widespread adoption of electronic health records, there is an increased emphasis on predictive models that can effectively deal with clinical time-series data. Powered by Recurrent Neural Network (RNN) architectures with Long Short-Term Memory (LSTM) units, deep neural networks have achieved state-of-the-art results in several clinical prediction tasks. Despite the success of RNNs, their sequential nature prohibits parallelized computing, making them inefficient, particularly when processing long sequences. Recently, architectures based solely on attention mechanisms have shown remarkable success in transduction tasks in NLP, while being computationally superior. In this paper, for the first time, we utilize attention models for clinical time-series modeling, thereby dispensing with recurrence entirely. We develop the SAnD (Simply Attend and Diagnose) architecture, which employs a masked self-attention mechanism and uses positional encoding and dense interpolation strategies to incorporate temporal order. Furthermore, we develop a multi-task variant of SAnD to jointly infer models for multiple diagnosis tasks. Using the recent MIMIC-III benchmark datasets, we demonstrate that the proposed approach achieves state-of-the-art performance in all tasks, outperforming LSTM models and classical baselines with hand-engineered features.
Tasks Time Series, Time Series Analysis
Published 2017-11-10
URL http://arxiv.org/abs/1711.03905v2
PDF http://arxiv.org/pdf/1711.03905v2.pdf
PWC https://paperswithcode.com/paper/attend-and-diagnose-clinical-time-series
Repo
Framework
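
The two ingredients named in the abstract, masked self-attention and positional encoding, can be shown in a few lines of numpy. The sketch below is not the full SAnD model (it omits learned projections, multiple heads, and dense interpolation); the sequence length and feature count are arbitrary placeholders.

```python
# Minimal numpy sketch of masked self-attention + positional encoding; not SAnD.
import numpy as np

def positional_encoding(T, d):
    """Standard sinusoidal positional encoding of shape (T, d)."""
    pos = np.arange(T)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d)
    pe = np.zeros((T, d))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

def masked_self_attention(X):
    """Scaled dot-product self-attention with a causal mask (no peeking ahead)."""
    T, d = X.shape
    Q = K = V = X                                   # learned projections omitted
    scores = Q @ K.T / np.sqrt(d)
    scores[np.triu_indices(T, k=1)] = -1e9          # mask future time steps
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

T, d = 48, 16                                       # e.g. 48 hourly measurements, 16 signals
X = np.random.default_rng(0).standard_normal((T, d)) + positional_encoding(T, d)
print(masked_self_attention(X).shape)               # (48, 16)
```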

On the Relation between Color Image Denoising and Classification

Title On the Relation between Color Image Denoising and Classification
Authors Jiqing Wu, Radu Timofte, Zhiwu Huang, Luc Van Gool
Abstract A large amount of the image denoising literature focuses on single-channel images and often validates the proposed methods experimentally on tens of images at most. In this paper, we investigate the interaction between denoising and classification on a large-scale dataset. Inspired by classification models, we propose a novel deep learning architecture for color (multichannel) image denoising and report results on thousands of images from the ImageNet dataset as well as commonly used imagery. We study the importance of (sufficient) training data and how semantic class information can be traded for improved denoising results. As a result, our method greatly improves PSNR performance, by 0.34 - 0.51 dB on average over state-of-the-art methods on a large-scale dataset. We conclude that it is beneficial to incorporate in classification models. On the other hand, we also study how noise affects classification performance. In the end, we come to a number of interesting conclusions, some being counter-intuitive.
Tasks Denoising, Image Denoising
Published 2017-04-05
URL http://arxiv.org/abs/1704.01372v1
PDF http://arxiv.org/pdf/1704.01372v1.pdf
PWC https://paperswithcode.com/paper/on-the-relation-between-color-image-denoising
Repo
Framework
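
As a generic illustration of deep colour-image denoising, the sketch below defines a small residual CNN in PyTorch that predicts and subtracts the noise. It is not the architecture proposed in the paper; depth, width, and the random training batch are placeholder assumptions.

```python
# Generic residual denoising CNN sketch; not the architecture from the paper.
import torch
import torch.nn as nn

class SmallDenoiser(nn.Module):
    def __init__(self, channels=3, width=32, depth=5):
        super().__init__()
        layers = [nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(width, channels, 3, padding=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return x - self.net(x)       # predict the noise residual and subtract it

model = SmallDenoiser()
clean = torch.rand(4, 3, 64, 64)                      # placeholder colour images
noisy = clean + 0.1 * torch.randn(4, 3, 64, 64)
out = model(noisy)
loss = nn.functional.mse_loss(out, clean)
loss.backward()                                       # one optimizer step would follow
print(out.shape, float(loss))
```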

How to Read Many-Objective Solution Sets in Parallel Coordinates

Title How to Read Many-Objective Solution Sets in Parallel Coordinates
Authors Miqing Li, Liangli Zhen, Xin Yao
Abstract The rapid development of evolutionary algorithms for handling many-objective optimization problems requires viable methods of visualizing high-dimensional solution sets. Parallel coordinates, which scale well to high-dimensional data, are one such method and have been frequently used in evolutionary many-objective optimization. However, the parallel coordinates plot is not as straightforward as the classic scatter plot for presenting the information contained in a solution set. In this paper, we make some observations about the parallel coordinates plot in terms of comparing the quality of solution sets, understanding the shape and distribution of a solution set, and reflecting the relation between objectives. We hope that these observations can provide some guidelines for the proper use of parallel coordinates in evolutionary many-objective optimization.
Tasks
Published 2017-04-30
URL http://arxiv.org/abs/1705.00368v1
PDF http://arxiv.org/pdf/1705.00368v1.pdf
PWC https://paperswithcode.com/paper/how-to-read-many-objective-solution-sets-in
Repo
Framework
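
Producing such a plot is straightforward; the sketch below draws a random placeholder solution set in parallel coordinates with matplotlib, normalising each objective to [0, 1] so that the axes are comparable.

```python
# Parallel-coordinates plot of a random placeholder solution set.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
solutions = rng.random((30, 6))                      # 30 solutions, 6 objectives
lo, hi = solutions.min(axis=0), solutions.max(axis=0)
norm = (solutions - lo) / (hi - lo)                  # scale each objective to [0, 1]

fig, ax = plt.subplots(figsize=(7, 3))
for row in norm:
    ax.plot(range(norm.shape[1]), row, alpha=0.5)    # one polyline per solution
ax.set_xticks(range(norm.shape[1]))
ax.set_xticklabels([f"f{i + 1}" for i in range(norm.shape[1])])
ax.set_ylabel("normalised objective value")
plt.tight_layout()
plt.savefig("parallel_coordinates.png")
```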

Learning a collaborative multiscale dictionary based on robust empirical mode decomposition

Title Learning a collaborative multiscale dictionary based on robust empirical mode decomposition
Authors Rui Chen, Huizhu Jia, Xiaodong Xie, Wen Gao
Abstract Dictionary learning is a challenging topic in many image processing areas. The basic goal is to learn a sparse representation from an overcomplete basis set. By combining the advantages of generic multiscale representations with learning-based adaptivity, multiscale dictionary representation approaches are powerful at capturing the structural characteristics of natural images. However, existing multiscale learning approaches still suffer from three main weaknesses: inadaptability to diverse scales of image data, sensitivity to noise and outliers, and difficulty in determining the optimal dictionary structure. In this paper, we present a novel multiscale dictionary learning paradigm for sparse image representations based on an improved empirical mode decomposition. This powerful data-driven analysis tool for multi-dimensional signals can fully adaptively decompose the image into multiscale oscillating components according to the intrinsic modes of the data itself. This treatment yields a robust and effective sparse representation, and meanwhile generates a raw base dictionary at multiple geometric scales and spatial frequency bands. This dictionary is refined by selecting optimal oscillating atoms based on frequency clustering. In order to further enhance sparsity and generalization, a tolerance dictionary is learned using a coherence-regularized model, and a fast proximal scheme is developed to optimize this model. The multiscale dictionary is taken as the product of the oscillating dictionary and the tolerance dictionary. Experimental results demonstrate that the proposed learning approach achieves superior performance in sparse image representation compared with several competing methods. We also show promising results in an image denoising application.
Tasks Denoising, Dictionary Learning, Image Denoising
Published 2017-04-04
URL http://arxiv.org/abs/1704.04422v1
PDF http://arxiv.org/pdf/1704.04422v1.pdf
PWC https://paperswithcode.com/paper/learning-a-collaborative-multiscale
Repo
Framework
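
For readers unfamiliar with the sparse-representation setting, the sketch below runs generic patch-based dictionary learning with scikit-learn. It does not reproduce the paper's EMD-based multiscale construction or the coherence-regularised tolerance dictionary; the input image is a random placeholder.

```python
# Generic patch-based dictionary learning with scikit-learn; placeholder image.
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
image = rng.random((64, 64))                          # placeholder grayscale image
patches = extract_patches_2d(image, (8, 8), max_patches=500, random_state=0)
X = patches.reshape(len(patches), -1)
X -= X.mean(axis=1, keepdims=True)                    # remove the DC component

dico = MiniBatchDictionaryLearning(n_components=64, alpha=1.0, random_state=0)
codes = dico.fit_transform(X)                         # sparse codes of the patches
print(dico.components_.shape, float((codes != 0).mean()))  # dictionary size, code density
```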

Material Classification using Neural Networks

Title Material Classification using Neural Networks
Authors Anca Sticlaru
Abstract The recognition and classification of the diverse materials that exist in the environment around us is a key visual competence that computer vision systems have focused on in recent years. Identifying materials in images is a difficult task that has benefited from recent progress in neural networks, which make it possible to train architectures to extract features for this challenging problem. This project uses state-of-the-art Convolutional Neural Network (CNN) techniques and Support Vector Machine (SVM) classifiers to classify materials and analyze the results. Building on several widely used material databases, a selection of CNN architectures is evaluated to understand which approach best extracts features for the task. The results gathered over four material datasets and nine CNNs show that the best overall configuration, a CNN with a linear SVM, can achieve up to ~92.5% mean average precision while applying transfer learning, a relevant recent direction in computer vision. By limiting the amount of information extracted to the layer before the last fully connected layer, transfer learning is used to analyze the contribution of shading information and reflectance in deciding the material category an image belongs to. In addition to the main topic of the project, the evaluation of the nine CNN architectures, it is examined whether using transfer learning instead of extracting information from the last convolutional layer improves the total accuracy of the system. The comparison shows that the accuracy and performance of the system improve, especially on the datasets that consist of a large number of images.
Tasks Material Classification, Transfer Learning
Published 2017-10-17
URL http://arxiv.org/abs/1710.06854v1
PDF http://arxiv.org/pdf/1710.06854v1.pdf
PWC https://paperswithcode.com/paper/material-classification-using-neural-networks
Repo
Framework
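
The feature-extraction-plus-linear-SVM pipeline can be sketched with a torchvision backbone standing in for the nine evaluated CNNs. The code below assumes a recent torchvision (the weights enum API) and downloads ImageNet weights; images and labels are random placeholders.

```python
# Feature extraction with a pretrained CNN + linear SVM; a stand-in pipeline.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import LinearSVC

backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()              # keep the layer before the final classifier
backbone.eval()

images = torch.rand(16, 3, 224, 224)     # placeholder batch of material photos
labels = torch.randint(0, 4, (16,))      # placeholder material categories
with torch.no_grad():
    features = backbone(images)          # 512-d transfer-learned descriptors

svm = LinearSVC().fit(features.numpy(), labels.numpy())
print(svm.score(features.numpy(), labels.numpy()))
```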

On Singleton Arc Consistency for CSPs Defined by Monotone Patterns

Title On Singleton Arc Consistency for CSPs Defined by Monotone Patterns
Authors Clement Carbonnel, David A. Cohen, Martin C. Cooper, Stanislav Zivny
Abstract Singleton arc consistency is an important type of local consistency which has been recently shown to solve all constraint satisfaction problems (CSPs) over constraint languages of bounded width. We aim to characterise all classes of CSPs defined by a forbidden pattern that are solved by singleton arc consistency and closed under removing constraints. We identify five new patterns whose absence ensures solvability by singleton arc consistency, four of which are provably maximal and three of which generalise 2-SAT. Combined with simple counter-examples for other patterns, we make significant progress towards a complete classification.
Tasks
Published 2017-04-20
URL http://arxiv.org/abs/1704.06215v4
PDF http://arxiv.org/pdf/1704.06215v4.pdf
PWC https://paperswithcode.com/paper/on-singleton-arc-consistency-for-csps-defined
Repo
Framework
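
For context, singleton arc consistency can be demonstrated on a tiny binary CSP. The sketch below implements a plain AC-3 revision loop and a singleton check written from the textbook definitions, not from the paper; the example constraint network (x < y < z over {1, 2, 3}) is invented.

```python
# Textbook-style (singleton) arc consistency on a tiny binary CSP.
def ac3(domains, constraints):
    """Enforce arc consistency; return False if some domain becomes empty."""
    queue = list(constraints)
    while queue:
        x, y = queue.pop()
        allowed = constraints[(x, y)]
        pruned = {a for a in domains[x]
                  if not any((a, b) in allowed for b in domains[y])}
        if pruned:
            domains[x] -= pruned
            if not domains[x]:
                return False
            queue.extend(arc for arc in constraints if arc[1] == x)
    return True

def singleton_ac(domains, constraints):
    """Remove values whose tentative assignment makes the AC-3 closure fail."""
    changed = True
    while changed:
        changed = False
        for x in domains:
            for a in set(domains[x]):
                test = {v: set(d) for v, d in domains.items()}
                test[x] = {a}                       # tentatively assign x = a
                if not ac3(test, constraints):
                    domains[x].discard(a)
                    changed = True
    return all(domains.values())

# Tiny example: x < y < z over {1, 2, 3}.
domains = {v: {1, 2, 3} for v in "xyz"}
lt = {(a, b) for a in range(1, 4) for b in range(1, 4) if a < b}
gt = {(b, a) for (a, b) in lt}
constraints = {("x", "y"): lt, ("y", "x"): gt, ("y", "z"): lt, ("z", "y"): gt}
print(singleton_ac(domains, constraints), domains)
```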

Differentially Private Empirical Risk Minimization with Input Perturbation

Title Differentially Private Empirical Risk Minimization with Input Perturbation
Authors Kazuto Fukuchi, Quang Khai Tran, Jun Sakuma
Abstract We propose input perturbation, a novel framework for differentially private ERM. Existing differentially private ERM methods implicitly assume that data contributors submit their private data to a database, expecting the database to invoke a differentially private mechanism when publishing the learned model. In input perturbation, each data contributor independently randomizes her/his own data and submits the perturbed data to the database. We show that the input perturbation framework theoretically guarantees that the model learned from the randomized data eventually satisfies differential privacy with the prescribed privacy parameters. At the same time, input perturbation ensures that local differential privacy holds with respect to the server. We also show that the excess risk bound of the model learned with input perturbation is $O(1/n)$ under a certain condition, where $n$ is the sample size. This matches the excess risk bound of the state of the art.
Tasks
Published 2017-10-20
URL http://arxiv.org/abs/1710.07425v1
PDF http://arxiv.org/pdf/1710.07425v1.pdf
PWC https://paperswithcode.com/paper/differentially-private-empirical-risk
Repo
Framework
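
The input-perturbation workflow is easy to mimic end to end: each contributor adds noise to her/his own record locally, and the server fits an ordinary model on the perturbed data. In the numpy sketch below, the noise scale and the ridge-regression learner are arbitrary placeholders and are not calibrated to any privacy parameters as in the paper.

```python
# Illustrative numpy sketch of local input perturbation + ordinary learning.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

sigma = 0.5                                        # placeholder local noise scale
X_priv = X + sigma * rng.standard_normal((n, d))   # each contributor perturbs locally
y_priv = y + sigma * rng.standard_normal(n)

def ridge(X, y, lam=1.0):
    """Closed-form ridge regression, a stand-in ERM learner on the server side."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

w_hat = ridge(X_priv, y_priv)                      # the server never sees raw data
print(np.linalg.norm(w_hat - w_true))              # estimation error under perturbation
```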