February 2, 2020

2645 words 13 mins read

Paper Group AWR 52

Word Similarity Datasets for Thai: Construction and Evaluation. COS960: A Chinese Word Similarity Dataset of 960 Word Pairs. A Resource for Computational Experiments on Mapudungun. Bottom-up Object Detection by Grouping Extreme and Center Points. Common Voice: A Massively-Multilingual Speech Corpus. Generalized Presentation Attack Detection: a face …

Word Similarity Datasets for Thai: Construction and Evaluation

Title Word Similarity Datasets for Thai: Construction and Evaluation
Authors Ponrudee Netisopakul, Gerhard Wohlgenannt, Aleksei Pulich
Abstract Distributional semantics in the form of word embeddings are an essential ingredient to many modern natural language processing systems. The quantification of semantic similarity between words can be used to evaluate the ability of a system to perform semantic interpretation. To this end, a number of word similarity datasets have been created for the English language over the last decades. For the Thai language, few such resources are available. In this work, we create three Thai word similarity datasets by translating and re-rating the popular WordSim-353, SimLex-999 and SemEval-2017-Task-2 datasets. The three datasets contain 1852 word pairs in total and have different characteristics in terms of difficulty, domain coverage, and notion of similarity (relatedness vs. similarity). These features help to gain a broader picture of the properties of an evaluated word embedding model. We include baseline evaluations with existing Thai embedding models, and identify the high ratio of out-of-vocabulary words as one of the biggest challenges. All datasets, evaluation results, and a tool for easy evaluation of new Thai embedding models are available to the NLP community online.
Tasks Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2019-04-08
URL http://arxiv.org/abs/1904.04307v1
PDF http://arxiv.org/pdf/1904.04307v1.pdf
PWC https://paperswithcode.com/paper/word-similarity-datasets-for-thai
Repo https://github.com/gwohlgen/thai_word_similarity
Framework none
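
The evaluation described in the entry above reduces to ranking: cosine similarities from an embedding model are compared against the human ratings with Spearman correlation, and the out-of-vocabulary ratio is reported alongside. A minimal sketch of that recipe, assuming a plain word-to-vector dict and (word1, word2, rating) tuples (this is not the authors' released evaluation tool):

```python
import numpy as np
from scipy.stats import spearmanr

def evaluate_word_similarity(embeddings, pairs):
    """Spearman correlation between human ratings and embedding cosine similarities.

    embeddings: dict mapping word -> 1-D numpy vector (assumed format)
    pairs: list of (word1, word2, human_rating) tuples
    Returns (rho, oov_ratio); OOV pairs are skipped, mirroring the common practice
    the paper discusses for high-OOV Thai models.
    """
    model_scores, human_scores, oov = [], [], 0
    for w1, w2, rating in pairs:
        if w1 not in embeddings or w2 not in embeddings:
            oov += 1
            continue
        v1, v2 = embeddings[w1], embeddings[w2]
        cos = float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
        model_scores.append(cos)
        human_scores.append(rating)
    rho, _ = spearmanr(model_scores, human_scores)
    return rho, oov / len(pairs)
```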

COS960: A Chinese Word Similarity Dataset of 960 Word Pairs

Title COS960: A Chinese Word Similarity Dataset of 960 Word Pairs
Authors Junjie Huang, Fanchao Qi, Chenghao Yang, Zhiyuan Liu, Maosong Sun
Abstract Word similarity computation is a widely recognized task in the field of lexical semantics. Most proposed datasets test the similarity of single-morpheme word pairs, while few focus on words of two or more morphemes. In this work, we propose COS960, a benchmark dataset with 960 pairs of Chinese wOrd Similarity, where all the words have two morphemes, span three Part-of-Speech (POS) tags, and are annotated by humans for similarity rather than relatedness. We give a detailed description of the dataset construction and annotation process, and test a range of word embedding models on it. The dataset of this paper can be obtained from https://github.com/thunlp/COS960.
Tasks
Published 2019-06-01
URL https://arxiv.org/abs/1906.00247v2
PDF https://arxiv.org/pdf/1906.00247v2.pdf
PWC https://paperswithcode.com/paper/190600247
Repo https://github.com/thunlp/COS960
Framework none

A Resource for Computational Experiments on Mapudungun

Title A Resource for Computational Experiments on Mapudungun
Authors Mingjun Duan, Carlos Fasola, Sai Krishna Rallabandi, Rodolfo M. Vega, Antonios Anastasopoulos, Lori Levin, Alan W Black
Abstract We present a resource for computational experiments on Mapudungun, a polysynthetic indigenous language spoken in Chile with upwards of 200 thousand speakers. We provide 142 hours of culturally significant conversations in the domain of medical treatment. The conversations are fully transcribed and translated into Spanish. The transcriptions also include annotations for code-switching and non-standard pronunciations. We also provide baseline results on three core NLP tasks: speech recognition, speech synthesis, and machine translation between Spanish and Mapudungun. We further explore other applications for which the corpus will be suitable, including the study of code-switching, historical orthography change, linguistic structure, and sociological and anthropological studies.
Tasks Machine Translation, Speech Recognition, Speech Synthesis
Published 2019-12-04
URL https://arxiv.org/abs/1912.01772v1
PDF https://arxiv.org/pdf/1912.01772v1.pdf
PWC https://paperswithcode.com/paper/a-resource-for-computational-experiments-on
Repo https://github.com/mingjund/mapudungun-corpus
Framework none

Bottom-up Object Detection by Grouping Extreme and Center Points

Title Bottom-up Object Detection by Grouping Extreme and Center Points
Authors Xingyi Zhou, Jiacheng Zhuo, Philipp Krähenbühl
Abstract With the advent of deep learning, object detection drifted from a bottom-up to a top-down recognition problem. State of the art algorithms enumerate a near-exhaustive list of object locations and classify each into: object or not. In this paper, we show that bottom-up approaches still perform competitively. We detect four extreme points (top-most, left-most, bottom-most, right-most) and one center point of objects using a standard keypoint estimation network. We group the five keypoints into a bounding box if they are geometrically aligned. Object detection is then a purely appearance-based keypoint estimation problem, without region classification or implicit feature learning. The proposed method performs on-par with the state-of-the-art region based detection methods, with a bounding box AP of 43.2% on COCO test-dev. In addition, our estimated extreme points directly span a coarse octagonal mask, with a COCO Mask AP of 18.9%, much better than the Mask AP of vanilla bounding boxes. Extreme point guided segmentation further improves this to 34.6% Mask AP.
Tasks Object Detection
Published 2019-01-23
URL http://arxiv.org/abs/1901.08043v3
PDF http://arxiv.org/pdf/1901.08043v3.pdf
PWC https://paperswithcode.com/paper/bottom-up-object-detection-by-grouping
Repo https://github.com/xingyizhou/ExtremeNet
Framework pytorch
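
The grouping step in the entry above is a brute-force geometric check: for every combination of top, left, bottom and right extreme points, the center of the implied box is looked up in the center heatmap, and the five keypoints become a detection only if a sufficiently strong center response is found there. A hedged numpy sketch of that check (array layouts and the threshold are assumptions, not the paper's exact settings):

```python
import numpy as np

def group_extreme_points(tops, lefts, bottoms, rights, center_heatmap, center_thresh=0.1):
    """Group extreme points into boxes when their geometric center is confirmed.

    tops/lefts/bottoms/rights: iterables of (x, y, score) keypoints for one class (assumed layout)
    center_heatmap: 2-D array of center-point scores
    Returns a list of (x1, y1, x2, y2, score) detections.
    """
    h, w = center_heatmap.shape
    detections = []
    for tx, ty, ts in tops:
        for lx, ly, ls in lefts:
            for bx, by, bs in bottoms:
                for rx, ry, rs in rights:
                    # geometric validity: top above bottom, left to the left of right
                    if not (ty <= by and lx <= rx):
                        continue
                    cx = min(max(int((lx + rx) / 2), 0), w - 1)
                    cy = min(max(int((ty + by) / 2), 0), h - 1)
                    # accept only if the center heatmap fires at the implied center
                    if center_heatmap[cy, cx] >= center_thresh:
                        score = (ts + ls + bs + rs + center_heatmap[cy, cx]) / 5
                        detections.append((lx, ty, rx, by, score))
    return detections
```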

Common Voice: A Massively-Multilingual Speech Corpus

Title Common Voice: A Massively-Multilingual Speech Corpus
Authors Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler, Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, Gregor Weber
Abstract The Common Voice corpus is a massively-multilingual collection of transcribed speech intended for speech technology research and development. Common Voice is designed for Automatic Speech Recognition purposes but can be useful in other domains (e.g. language identification). To achieve scale and sustainability, the Common Voice project employs crowdsourcing for both data collection and data validation. The most recent release includes 29 languages, and as of November 2019 there are a total of 38 languages collecting data. Over 50,000 individuals have participated so far, resulting in 2,500 hours of collected audio. To our knowledge this is the largest audio corpus in the public domain for speech recognition, both in terms of number of hours and number of languages. As an example use case for Common Voice, we present speech recognition experiments using Mozilla’s DeepSpeech Speech-to-Text toolkit. By applying transfer learning from a source English model, we find an average Character Error Rate improvement of 5.99 +/- 5.48 for twelve target languages (German, French, Italian, Turkish, Catalan, Slovenian, Welsh, Irish, Breton, Tatar, Chuvash, and Kabyle). For most of these languages, these are the first ever published results on end-to-end Automatic Speech Recognition.
Tasks Language Identification, Speech Recognition, Transfer Learning
Published 2019-12-13
URL https://arxiv.org/abs/1912.06670v2
PDF https://arxiv.org/pdf/1912.06670v2.pdf
PWC https://paperswithcode.com/paper/common-voice-a-massively-multilingual-speech
Repo https://github.com/facebookresearch/covost
Framework none
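
The Character Error Rate figures quoted above are character-level edit distances normalized by reference length, so a reported improvement is a drop in that quantity. A self-contained sketch of the metric (plain Levenshtein distance, not DeepSpeech's internal scorer):

```python
def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = character-level Levenshtein distance / len(reference)."""
    m, n = len(reference), len(hypothesis)
    # prev[j] holds the edit distance between reference[:i-1] and hypothesis[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n] / max(m, 1)

# Example: one character deleted out of twelve reference characters -> CER of about 0.083.
print(character_error_rate("common voice", "comon voice"))
```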

Generalized Presentation Attack Detection: a face anti-spoofing evaluation proposal

Title Generalized Presentation Attack Detection: a face anti-spoofing evaluation proposal
Authors Artur Costa-Pazo, David Jimenez-Cabello, Esteban Vazquez-Fernandez, Jose L. Alba-Castro, Roberto J. López-Sastre
Abstract Over the past few years, Presentation Attack Detection (PAD) has become a fundamental part of facial recognition systems. Although much effort has been devoted to anti-spoofing research, generalization in real scenarios remains a challenge. In this paper we present a new open-source evaluation framework to study the generalization capacity of face PAD methods, coined here as face-GPAD. This framework facilitates the creation of new protocols focused on the generalization problem establishing fair procedures of evaluation and comparison between PAD solutions. We also introduce a large aggregated and categorized dataset to address the problem of incompatibility between publicly available datasets. Finally, we propose a benchmark adding two novel evaluation protocols: one for measuring the effect introduced by the variations in face resolution, and the second for evaluating the influence of adversarial operating conditions.
Tasks Face Anti-Spoofing
Published 2019-04-12
URL http://arxiv.org/abs/1904.06213v1
PDF http://arxiv.org/pdf/1904.06213v1.pdf
PWC https://paperswithcode.com/paper/generalized-presentation-attack-detection-a
Repo https://github.com/Gradiant/bob.paper.icb2019.gradgpad
Framework none

Property Inference for Deep Neural Networks

Title Property Inference for Deep Neural Networks
Authors Divya Gopinath, Hayes Converse, Corina S. Pasareanu, Ankur Taly
Abstract We present techniques for automatically inferring formal properties of feed-forward neural networks. We observe that a significant part (if not all) of the logic of feed-forward networks is captured in the activation status (‘on’ or ‘off’) of its neurons. We propose to extract patterns based on neuron decisions as preconditions that imply a certain desirable output property, e.g., the prediction being a certain class. We present techniques to extract input properties, encoding convex predicates on the input space that imply given output properties, and layer properties, representing network properties captured in the hidden layers that imply the desired output behavior. We apply our techniques to networks for the MNIST and ACASXU applications. Our experiments highlight the use of the inferred properties in a variety of tasks, such as explaining predictions, providing robustness guarantees, simplifying proofs, and network distillation.
Tasks
Published 2019-04-29
URL https://arxiv.org/abs/1904.13215v2
PDF https://arxiv.org/pdf/1904.13215v2.pdf
PWC https://paperswithcode.com/paper/finding-invariants-in-deep-neural-networks
Repo https://github.com/safednn-nasa/Prophecy
Framework none
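
The extraction idea in the entry above can be made concrete with activation signatures: for inputs the network assigns to one class, record which post-ReLU neurons are 'on' or 'off', and keep the neurons whose status never changes as a candidate layer-level precondition. A hedged numpy sketch of that step (the data layout is an assumption, and the full technique also validates such patterns formally, which this sketch omits):

```python
import numpy as np

def extract_layer_pattern(hidden_activations, predictions, target_class):
    """Infer a neuron on/off pattern shared by all inputs predicted as target_class.

    hidden_activations: (N, H) array of post-ReLU activations at one layer (assumed given)
    predictions: (N,) array of predicted labels
    Returns a dict neuron_index -> 'on'/'off' for neurons with a constant status,
    i.e. a candidate layer-level precondition for the class in the paper's sense.
    """
    mask = predictions == target_class
    status = hidden_activations[mask] > 0          # True = 'on', False = 'off'
    always_on = status.all(axis=0)
    always_off = (~status).all(axis=0)
    pattern = {i: 'on' for i in np.flatnonzero(always_on)}
    pattern.update({i: 'off' for i in np.flatnonzero(always_off)})
    return pattern
```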

Fast Interactive Object Annotation with Curve-GCN

Title Fast Interactive Object Annotation with Curve-GCN
Authors Huan Ling, Jun Gao, Amlan Kar, Wenzheng Chen, Sanja Fidler
Abstract Manually labeling objects by tracing their boundaries is a laborious process. In Polygon-RNN++, the authors proposed Polygon-RNN, which produces polygonal annotations in a recurrent manner using a CNN-RNN architecture, allowing interactive correction via humans-in-the-loop. We propose a new framework that alleviates the sequential nature of Polygon-RNN by predicting all vertices simultaneously using a Graph Convolutional Network (GCN). Our model is trained end-to-end. It supports object annotation by either polygons or splines, facilitating labeling efficiency for both line-based and curved objects. We show that Curve-GCN outperforms all existing approaches in automatic mode, including the powerful PSP-DeepLab, and is significantly more efficient in interactive mode than Polygon-RNN++. Our model runs at 29.3 ms in automatic mode and 2.6 ms in interactive mode, making it 10x and 100x faster than Polygon-RNN++, respectively.
Tasks
Published 2019-03-16
URL http://arxiv.org/abs/1903.06874v1
PDF http://arxiv.org/pdf/1903.06874v1.pdf
PWC https://paperswithcode.com/paper/fast-interactive-object-annotation-with-curve
Repo https://github.com/fidler-lab/curve-gcn
Framework pytorch
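
The non-sequential step that distinguishes the model above from Polygon-RNN is easy to see in code: contour points are nodes of a cycle graph, and graph convolutions over that graph regress a coordinate offset for every vertex at once. A hedged PyTorch sketch of such a propagation-and-offset head (an illustration of the idea, not the released curve-gcn code; feature extraction from the image is assumed to happen upstream):

```python
import torch
import torch.nn as nn

class ContourGCNLayer(nn.Module):
    """One graph-convolution step over a closed polygon: each vertex mixes its
    own features with those of its two neighbours along the contour."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.self_fc = nn.Linear(in_dim, out_dim)
        self.neigh_fc = nn.Linear(in_dim, out_dim)

    def forward(self, x):                        # x: (batch, n_vertices, in_dim)
        left = torch.roll(x, shifts=1, dims=1)   # previous vertex on the cycle
        right = torch.roll(x, shifts=-1, dims=1) # next vertex on the cycle
        return torch.relu(self.self_fc(x) + self.neigh_fc(left + right))

class ContourOffsetHead(nn.Module):
    """Stack of contour GCN layers ending in a 2-D offset per vertex,
    predicted for all vertices simultaneously rather than one at a time."""
    def __init__(self, in_dim, hidden=64, n_layers=3):
        super().__init__()
        dims = [in_dim] + [hidden] * n_layers
        self.layers = nn.ModuleList(ContourGCNLayer(a, b) for a, b in zip(dims, dims[1:]))
        self.offset = nn.Linear(hidden, 2)

    def forward(self, vertex_feats):
        h = vertex_feats
        for layer in self.layers:
            h = layer(h)
        return self.offset(h)                    # (batch, n_vertices, 2) offsets
```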

A Simple Baseline for Audio-Visual Scene-Aware Dialog

Title A Simple Baseline for Audio-Visual Scene-Aware Dialog
Authors Idan Schwartz, Alexander Schwing, Tamir Hazan
Abstract The recently proposed audio-visual scene-aware dialog task paves the way to a more data-driven way of learning virtual assistants, smart speakers and car navigation systems. However, very little is known to date about how to effectively extract meaningful information from a plethora of sensors that pound the computational engine of those devices. Therefore, in this paper, we provide and carefully analyze a simple baseline for audio-visual scene-aware dialog which is trained end-to-end. Our method differentiates useful signals from distracting ones in a data-driven manner, using an attention mechanism. We evaluate the proposed approach on the recently introduced and challenging audio-visual scene-aware dataset, and demonstrate the key features that permit it to outperform the current state-of-the-art by more than 20% on CIDEr.
Tasks
Published 2019-04-11
URL http://arxiv.org/abs/1904.05876v1
PDF http://arxiv.org/pdf/1904.05876v1.pdf
PWC https://paperswithcode.com/paper/a-simple-baseline-for-audio-visual-scene
Repo https://github.com/idansc/simple-avsd
Framework pytorch
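
The attention mechanism mentioned above can be pictured as question-conditioned weighting of per-modality features before fusion. A minimal PyTorch sketch under that reading (dimensions and module names are assumptions, not the released simple-avsd code):

```python
import torch
import torch.nn as nn

class ModalityAttention(nn.Module):
    """Score each modality feature against the question and fuse by weighted sum."""
    def __init__(self, feat_dim, query_dim):
        super().__init__()
        self.query_proj = nn.Linear(query_dim, feat_dim)

    def forward(self, modality_feats, question_feat):
        # modality_feats: (batch, n_modalities, feat_dim), e.g. video/audio/dialog history
        # question_feat:  (batch, query_dim)
        q = self.query_proj(question_feat).unsqueeze(2)          # (batch, feat_dim, 1)
        scores = torch.bmm(modality_feats, q).squeeze(2)         # (batch, n_modalities)
        weights = torch.softmax(scores, dim=1)                   # data-driven relevance
        fused = (weights.unsqueeze(2) * modality_feats).sum(1)   # (batch, feat_dim)
        return fused, weights
```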

Contrastive Adaptation Network for Unsupervised Domain Adaptation

Title Contrastive Adaptation Network for Unsupervised Domain Adaptation
Authors Guoliang Kang, Lu Jiang, Yi Yang, Alexander G Hauptmann
Abstract Unsupervised Domain Adaptation (UDA) makes predictions for the target domain data while manual annotations are only available in the source domain. Previous methods minimize the domain discrepancy neglecting the class information, which may lead to misalignment and poor generalization performance. To address this issue, this paper proposes Contrastive Adaptation Network (CAN) optimizing a new metric which explicitly models the intra-class domain discrepancy and the inter-class domain discrepancy. We design an alternating update strategy for training CAN in an end-to-end manner. Experiments on two real-world benchmarks, Office-31 and VisDA-2017, demonstrate that CAN performs favorably against the state-of-the-art methods and produces more discriminative features.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2019-01-04
URL http://arxiv.org/abs/1901.00976v2
PDF http://arxiv.org/pdf/1901.00976v2.pdf
PWC https://paperswithcode.com/paper/contrastive-adaptation-network-for
Repo https://github.com/kgl-prml/Contrastive-Adaptation-Network-for-Unsupervised-Domain-Adaptation
Framework pytorch
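
The new metric described above splits an MMD-style discrepancy by class: same-class terms across domains are minimized while different-class terms are maximized, with target labels supplied by pseudo-labeling during the alternating updates. A hedged numpy sketch of that intra/inter split, using a linear-kernel MMD for brevity (the paper uses richer kernel estimates; this only illustrates the decomposition):

```python
import numpy as np

def mmd_linear(a, b):
    """Squared MMD between two feature sets under a linear kernel: ||mean(a) - mean(b)||^2."""
    return float(np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2))

def contrastive_domain_discrepancy(src_feats, src_labels, tgt_feats, tgt_pseudo, classes):
    """Return (intra-class discrepancy to minimize, inter-class discrepancy to maximize)."""
    intra, inter, n_intra, n_inter = 0.0, 0.0, 0, 0
    for c1 in classes:
        s = src_feats[src_labels == c1]
        for c2 in classes:
            t = tgt_feats[tgt_pseudo == c2]
            if len(s) == 0 or len(t) == 0:
                continue
            d = mmd_linear(s, t)
            if c1 == c2:
                intra, n_intra = intra + d, n_intra + 1
            else:
                inter, n_inter = inter + d, n_inter + 1
    return intra / max(n_intra, 1), inter / max(n_inter, 1)

# Training would minimize (intra - inter) plus the source classification loss,
# alternating with re-estimation of the target pseudo-labels.
```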

Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks

Title Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks
Authors Alejandro Molina, Patrick Schramowski, Kristian Kersting
Abstract The performance of deep network learning strongly depends on the choice of the non-linear activation function associated with each neuron. However, deciding on the best activation is non-trivial, and the choice depends on the architecture, hyper-parameters, and even on the dataset. Typically these activations are fixed by hand before training. Here, we demonstrate how to eliminate the reliance on first picking fixed activation functions by using flexible parametric rational functions instead. The resulting Padé Activation Units (PAUs) can both approximate common activation functions and also learn new ones while providing compact representations. Our empirical evidence shows that end-to-end learning deep networks with PAUs can increase the predictive performance. Moreover, PAUs pave the way to approximations with provable robustness. https://github.com/ml-research/pau
Tasks
Published 2019-07-15
URL https://arxiv.org/abs/1907.06732v3
PDF https://arxiv.org/pdf/1907.06732v3.pdf
PWC https://paperswithcode.com/paper/pade-activation-units-end-to-end-learning-of
Repo https://github.com/ml-research/pau
Framework pytorch
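
A Padé activation is a rational function y = P(x)/Q(x) with learnable coefficients in both numerator and denominator; a "safe" variant keeps the denominator bounded away from zero. A hedged PyTorch sketch of that form (degrees, initialization, and the exact safe parameterization here are illustrative choices; see the linked repo for the authors' implementation):

```python
import torch
import torch.nn as nn

class PadeActivation(nn.Module):
    """Learnable rational activation y = P(x) / Q(x) with P of degree m and
    Q(x) = 1 + |b_1 x + ... + b_n x^n|, so Q can never vanish."""
    def __init__(self, m=5, n=4):
        super().__init__()
        self.numerator = nn.Parameter(torch.randn(m + 1) * 0.1)   # a_0 .. a_m
        self.denominator = nn.Parameter(torch.randn(n) * 0.1)     # b_1 .. b_n

    def forward(self, x):
        num_powers = torch.stack([x ** k for k in range(len(self.numerator))], dim=-1)
        p = (num_powers * self.numerator).sum(dim=-1)
        den_powers = torch.stack([x ** (k + 1) for k in range(len(self.denominator))], dim=-1)
        q = 1.0 + torch.abs((den_powers * self.denominator).sum(dim=-1))
        return p / q

# Drop-in replacement for a fixed nonlinearity, e.g.:
# layer = nn.Sequential(nn.Linear(128, 128), PadeActivation())
```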

Heteroscedastic Gaussian Process Regression on the Alkenone over Sea Surface Temperatures

Title Heteroscedastic Gaussian Process Regression on the Alkenone over Sea Surface Temperatures
Authors Taehee Lee, Charles E. Lawrence
Abstract To better reconstruct historical sea surface temperatures (SSTs), it is important to construct a good calibration model for the associated proxies. In this paper, we introduce a new model for alkenone ($\mathrm{U}_{37}^{\mathrm{K}'}$) based on the heteroscedastic Gaussian process (GP) regression method. Our nonparametric approach not only deals with the variable pattern of noise over SSTs but also contains a Bayesian method for classifying potential outliers.
Tasks Calibration
Published 2019-12-18
URL https://arxiv.org/abs/1912.08843v1
PDF https://arxiv.org/pdf/1912.08843v1.pdf
PWC https://paperswithcode.com/paper/heteroscedastic-gaussian-process-regression
Repo https://github.com/eilion/HGPR_SST_Proxy_Cal
Framework none
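
"Heteroscedastic" here means the observation-noise variance depends on the input, so the diagonal term added to the kernel matrix varies per data point instead of being a single sigma^2. A minimal numpy sketch of the resulting posterior mean under an RBF kernel, with the per-point noise variances assumed given (how they are modelled, and the outlier classification, are beyond this sketch):

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def hetero_gp_posterior_mean(x_train, y_train, noise_var, x_test):
    """GP posterior mean with input-dependent noise variance on the diagonal.

    noise_var: per-training-point noise variances (the heteroscedastic part);
    in the paper these would themselves be modelled, here they are assumed given.
    """
    K = rbf_kernel(x_train, x_train) + np.diag(noise_var)
    K_star = rbf_kernel(x_test, x_train)
    alpha = np.linalg.solve(K, y_train)
    return K_star @ alpha
```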

Invert to Learn to Invert

Title Invert to Learn to Invert
Authors Patrick Putzky, Max Welling
Abstract Iterative learning-to-infer approaches have become popular solvers for inverse problems. However, their memory requirements during training grow linearly with model depth, which in practice limits model expressiveness. In this work, we propose an iterative inverse model with constant memory that relies on invertible networks to avoid storing intermediate activations. As a result, the proposed approach allows us to train models with 400 layers on 3D volumes in an MRI image reconstruction task. In experiments on a public data set, we demonstrate that these deeper, and thus more expressive, networks perform state-of-the-art image reconstruction.
Tasks Image Reconstruction
Published 2019-11-25
URL https://arxiv.org/abs/1911.10914v1
PDF https://arxiv.org/pdf/1911.10914v1.pdf
PWC https://paperswithcode.com/paper/invert-to-learn-to-invert-1
Repo https://github.com/pputzky/invertible_rim
Framework pytorch
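
The constant-memory property above comes from invertibility: if every block's input can be recomputed exactly from its output, intermediate activations need not be stored during the forward pass. A hedged PyTorch sketch of the standard additive coupling that has this property (an illustration of the mechanism, not the architecture in the linked invertible_rim repo):

```python
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    """Split the features in two halves; y1 = x1, y2 = x2 + f(x1).
    The inverse is exact, so x1, x2 can be recomputed from y1, y2
    instead of being stored for backpropagation."""
    def __init__(self, half_dim, hidden=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(half_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, half_dim))

    def forward(self, x1, x2):
        return x1, x2 + self.f(x1)

    def inverse(self, y1, y2):
        return y1, y2 - self.f(y1)

# Sanity check: inverting the forward pass recovers the inputs (up to floating point),
# which is what allows activations to be recomputed instead of cached.
block = AdditiveCoupling(8)
a, b = torch.randn(4, 8), torch.randn(4, 8)
y1, y2 = block(a, b)
x1, x2 = block.inverse(y1, y2)
assert torch.allclose(a, x1) and torch.allclose(b, x2, atol=1e-6)
```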

Learning Sparse Nonparametric DAGs

Title Learning Sparse Nonparametric DAGs
Authors Xun Zheng, Chen Dan, Bryon Aragam, Pradeep Ravikumar, Eric P. Xing
Abstract We develop a framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data. Our approach is based on a recent algebraic characterization of DAGs that led to a fully continuous program for score-based learning of DAG models parametrized by a linear structural equation model (SEM). We extend this algebraic characterization to nonparametric SEM by leveraging nonparametric sparsity based on partial derivatives, resulting in a continuous optimization problem that can be applied to a variety of nonparametric and semiparametric models including GLMs, additive noise models, and index models as special cases. Unlike existing approaches that require specific modeling choices, loss functions, or algorithms, we present a completely general framework that can be applied to general nonlinear models (e.g. without additive noise), general differentiable loss functions, and generic black-box optimization routines. The code is available at https://github.com/xunzheng/notears.
Tasks
Published 2019-09-29
URL https://arxiv.org/abs/1909.13189v2
PDF https://arxiv.org/pdf/1909.13189v2.pdf
PWC https://paperswithcode.com/paper/learning-sparse-nonparametric-dags
Repo https://github.com/xunzheng/notears
Framework tf
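
The algebraic characterization referred to above is the trace-exponential acyclicity function from NOTEARS: h(W) = tr(exp(W ∘ W)) - d is zero exactly when the weighted adjacency matrix W describes a DAG, which turns structure learning into a smooth constrained program. A short numpy/scipy sketch of the constraint itself (the nonparametric extension replaces the entries of W with partial-derivative-based dependence measures):

```python
import numpy as np
from scipy.linalg import expm

def notears_acyclicity(W):
    """h(W) = tr(exp(W * W)) - d; equals 0 iff the graph with weighted adjacency
    matrix W has no directed cycles (W * W is the elementwise square)."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

# A chain 0 -> 1 -> 2 is acyclic (h = 0); adding the edge 2 -> 0 creates a cycle (h > 0).
chain = np.array([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
cycle = chain.copy(); cycle[2, 0] = 1.0
print(notears_acyclicity(chain))  # ~0.0
print(notears_acyclicity(cycle))  # > 0
```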

Patch Learning

Title Patch Learning
Authors Dongrui Wu, Jerry M. Mendel
Abstract There have been different strategies to improve the performance of a machine learning model, e.g., increasing the depth, width, and/or nonlinearity of the model, and using ensemble learning to aggregate multiple base/weak learners in parallel or in series. This paper proposes a novel strategy called patch learning (PL) for this problem. It consists of three steps: 1) train an initial global model using all training data; 2) identify from the initial global model the patches which contribute the most to the learning error, and train a (local) patch model for each such patch; and 3) update the global model using training data that do not fall into any patch. To use a PL model, we first determine if the input falls into any patch. If yes, then the corresponding patch model is used to compute the output. Otherwise, the global model is used. We explain in detail how PL can be implemented using fuzzy systems. Experiments on five regression problems (1D/2D/3D curve fitting, nonlinear system identification, and chaotic time-series prediction) verified its effectiveness. To our knowledge, the PL idea has not appeared in the literature before, and it opens up a promising new line of research in machine learning.
Tasks Time Series, Time Series Prediction
Published 2019-06-01
URL https://arxiv.org/abs/1906.00158v1
PDF https://arxiv.org/pdf/1906.00158v1.pdf
PWC https://paperswithcode.com/paper/190600158
Repo https://github.com/drwuHUST/Patch-Learning
Framework none
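
The three steps above map directly onto a small training loop: fit a global model, carve out "patches" around the worst-error regions, fit one local model per patch, refit the global model on everything outside the patches, and route inputs accordingly at prediction time. A hedged 1-D sketch with scikit-learn regressors (the paper realizes PL with fuzzy systems; plain decision trees and fixed-width interval patches are used here purely for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_patch_learning(x, y, n_patches=2, patch_width=0.2):
    """Step 1: global model; step 2: patch models around the worst errors;
    step 3: refit the global model on points outside every patch."""
    x = x.reshape(-1, 1)
    global_model = DecisionTreeRegressor(max_depth=3).fit(x, y)
    errors = np.abs(global_model.predict(x) - y)
    # centers of the worst-error points define interval patches (illustrative choice)
    centers = x[np.argsort(errors)[-n_patches:], 0]
    patches, patch_models = [], []
    for c in centers:
        lo, hi = c - patch_width, c + patch_width
        inside = (x[:, 0] >= lo) & (x[:, 0] <= hi)
        patches.append((lo, hi))
        patch_models.append(DecisionTreeRegressor(max_depth=3).fit(x[inside], y[inside]))
    outside = ~np.any([(x[:, 0] >= lo) & (x[:, 0] <= hi) for lo, hi in patches], axis=0)
    global_model.fit(x[outside], y[outside])
    return global_model, patches, patch_models

def predict_patch_learning(x_new, global_model, patches, patch_models):
    """Route each input to its patch model if it falls inside a patch, else use the global model."""
    preds = []
    for v in np.atleast_1d(x_new):
        for (lo, hi), m in zip(patches, patch_models):
            if lo <= v <= hi:
                preds.append(m.predict([[v]])[0])
                break
        else:
            preds.append(global_model.predict([[v]])[0])
    return np.array(preds)
```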