January 25, 2020

Paper Group NANR 98

Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts. Small Steps and Giant Leaps: Minimal Newton Solvers for Deep Learning. Learning Latent Semantic Representation from Pre-defined Generative Model. ACCELERATING NONCONVEX LEARNING VIA REPLICA EXCHANGE LANGEVIN DIFFUSION. Cardiff University at SemEval-20 …


Title	Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts
Authors	Luke Breitfeller, Emily Ahn, David Jurgens, Yulia Tsvetkov
Abstract	Microaggressions are subtle, often veiled, manifestations of human biases. These uncivil interactions can have a powerful negative impact on people by marginalizing minorities and disadvantaged groups. The linguistic subtlety of microaggressions in communication has made it difficult for researchers to analyze their exact nature, and to quantify and extract microaggressions automatically. Specifically, the lack of a corpus of real-world microaggressions and objective criteria for annotating them have prevented researchers from addressing these problems at scale. In this paper, we devise a general but nuanced, computationally operationalizable typology of microaggressions based on a small subset of data that we have. We then create two datasets: one with examples of diverse types of microaggressions recollected by their targets, and another with gender-based microaggressions in public conversations on social media. We introduce a new, more objective, criterion for annotation and an active-learning based procedure that increases the likelihood of surfacing posts containing microaggressions. Finally, we analyze the trends that emerge from these new datasets.
Tasks	Active Learning
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1176/
PDF	https://www.aclweb.org/anthology/D19-1176
PWC	https://paperswithcode.com/paper/finding-microaggressions-in-the-wild-a-case
Repo
Framework

Small Steps and Giant Leaps: Minimal Newton Solvers for Deep Learning


Title	Small Steps and Giant Leaps: Minimal Newton Solvers for Deep Learning
Authors	Joao F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi
Abstract	We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it only requires two additional forward-mode automatic differentiation operations per iteration, which has a computational cost comparable to two standard forward passes and is easy to implement. Our method addresses long-standing issues with current second-order solvers, which invert an approximate Hessian matrix every iteration exactly or by conjugate-gradient methods, procedures that are much slower than an SGD step. Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration with just two passes over the network. This estimate has the same size and is similar to the momentum variable that is commonly used in SGD. No estimate of the Hessian is maintained. We first validate our method, called CurveBall, on small problems with known solutions (noisy Rosenbrock function and degenerate 2-layer linear networks), where current deep learning solvers struggle. We then train several large models on CIFAR and ImageNet, including ResNet and VGG-f networks, where we demonstrate faster convergence with no hyperparameter tuning. We also show our optimiser’s generality by testing on a large set of randomly generated architectures.
Tasks
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Henriques_Small_Steps_and_Giant_Leaps_Minimal_Newton_Solvers_for_Deep_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Henriques_Small_Steps_and_Giant_Leaps_Minimal_Newton_Solvers_for_Deep_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/small-steps-and-giant-leaps-minimal-newton-2
Repo
Framework
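
The abstract above describes keeping a single momentum-sized estimate of the inverse-Hessian-projected gradient and refreshing it once per iteration. A minimal sketch of that idea follows, under loud assumptions: the toy quadratic, the fixed `rho` and `beta` values, and the exact form of the update are illustrative choices and are not taken from the paper, and the Hessian-vector product is written out analytically rather than obtained by forward-mode automatic differentiation.

```python
# Sketch only: the update rule, hyperparameters and test problem are assumptions.

A = [[3.0, 1.0], [1.0, 2.0]]  # Hessian of the toy quadratic (hypothetical problem)
B = [1.0, 1.0]


def matvec(a, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def grad(w):
    """Gradient of f(w) = 0.5 * w^T A w - B^T w."""
    return [av - b for av, b in zip(matvec(A, w), B)]


def hess_vec(w, v):
    """Hessian-vector product; for this quadratic it is simply A v."""
    return matvec(A, v)


def curveball_like(w, steps=200, rho=0.9, beta=0.1):
    """Keep one buffer z that tracks -H^{-1} g and use it as the step."""
    z = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        hz = hess_vec(w, z)
        # One gradient step on the local model 0.5 z^T H z + z^T g,
        # combined with a forgetting factor rho. No Hessian is stored.
        z = [rho * zi - beta * (hzi + gi) for zi, hzi, gi in zip(z, hz, g)]
        w = [wi + zi for wi, zi in zip(w, z)]
    return w


w_final = curveball_like([0.0, 0.0])
```

On this quadratic the iterate approaches the exact minimiser (0.2, 0.4). Only Hessian-vector products are used, which is what keeps the per-step cost close to that of a first-order method.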

Learning Latent Semantic Representation from Pre-defined Generative Model


Title	Learning Latent Semantic Representation from Pre-defined Generative Model
Authors	Jin-Young Kim, Sung-Bae Cho
Abstract	Learning representations of data is an important issue in machine learning. Though GANs have led to significant improvements in data representations, they still have several problems, such as unstable training, a hidden data manifold, and huge computational overhead. A GAN tends to produce data without any information about the manifold of the data, which hinders control over the features to be generated. Moreover, most GANs have a large manifold, resulting in poor scalability. In this paper, we propose a novel GAN to control the latent semantic representation, called LSC-GAN, which allows us to produce the desired data and learns a representation of the data efficiently. Unlike conventional GAN models with a hidden distribution of the latent space, we define the distributions explicitly in advance; they are trained to generate the data based on the corresponding features by inputting latent variables that follow the distribution. Because deploying various distributions in one latent space enlarges its scale without changing its dimension and makes training unstable, we need to separate the process of defining the distributions explicitly from the operation of generation. We prove that a VAE is proper for the former and modify the loss function of the VAE to map the data into the pre-defined latent space so as to locate the reconstructed data as close as possible to the input data according to its characteristics. Moreover, we add the KL divergence to the loss function of LSC-GAN to include this process. The decoder of the VAE, which generates the data with the corresponding features from the pre-defined latent space, is used as the generator of the LSC-GAN. Several experiments on the CelebA dataset are conducted to verify the usefulness of the proposed method to generate desired data stably and efficiently, achieving a high compression ratio that can hold about 24 pixels of information in each dimension of latent space. In addition, our model learns the reverse of features, such as not laughing (rather, frowning), using only data of ordinary and smiling facial expressions.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=Hyg1Ls0cKQ
PDF	https://openreview.net/pdf?id=Hyg1Ls0cKQ
PWC	https://paperswithcode.com/paper/learning-latent-semantic-representation-from
Repo
Framework

ACCELERATING NONCONVEX LEARNING VIA REPLICA EXCHANGE LANGEVIN DIFFUSION


Title	ACCELERATING NONCONVEX LEARNING VIA REPLICA EXCHANGE LANGEVIN DIFFUSION
Authors	Yi Chen, Jinglin Chen, Jing Dong, Jian Peng, Zhaoran Wang
Abstract	Langevin diffusion is a powerful method for nonconvex optimization, which enables the escape from local minima by injecting noise into the gradient. In particular, the temperature parameter controlling the noise level gives rise to a tradeoff between "global exploration" and "local exploitation", which correspond to high and low temperatures. To attain the advantages of both regimes, we propose to use replica exchange, which swaps between two Langevin diffusions with different temperatures. We theoretically analyze the acceleration effect of replica exchange from two perspectives: (i) the convergence in $\chi^2$-divergence, and (ii) the large deviation principle. Such an acceleration effect allows us to approach the global minima faster. Furthermore, by discretizing the replica exchange Langevin diffusion, we obtain a discrete-time algorithm. For such an algorithm, we quantify its discretization error in theory and demonstrate its acceleration effect in practice.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=SJfPFjA9Fm
PDF	https://openreview.net/pdf?id=SJfPFjA9Fm
PWC	https://paperswithcode.com/paper/accelerating-nonconvex-learning-via-replica
Repo
Framework
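
The swapping of two Langevin chains at different temperatures described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the double-well energy, the step size, both temperatures and the Metropolis-style swap acceptance are assumptions chosen for the example, and the paper's swapping rate may be defined differently.

```python
import math
import random


def energy(x):
    """Hypothetical 1-D double-well objective with its global minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def energy_grad(x):
    return 4.0 * x * (x * x - 1.0) + 0.3


def langevin_step(x, step, temperature, rng):
    """One Euler step of Langevin dynamics: gradient descent plus scaled noise."""
    noise = rng.gauss(0.0, 1.0)
    return x - step * energy_grad(x) + math.sqrt(2.0 * step * temperature) * noise


def swap_probability(u_cold, u_hot, t_cold, t_hot):
    """Assumed Metropolis-style acceptance: always swap when the hot chain is lower."""
    exponent = (1.0 / t_cold - 1.0 / t_hot) * (u_cold - u_hot)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def replica_exchange(x0, steps=2000, step=0.01, t_cold=0.05, t_hot=1.0, seed=0):
    rng = random.Random(seed)
    x_cold = x_hot = x0
    best_x, best_u = x0, energy(x0)
    for _ in range(steps):
        x_cold = langevin_step(x_cold, step, t_cold, rng)  # local exploitation
        x_hot = langevin_step(x_hot, step, t_hot, rng)     # global exploration
        p = swap_probability(energy(x_cold), energy(x_hot), t_cold, t_hot)
        if rng.random() < p:
            x_cold, x_hot = x_hot, x_cold
        u = energy(x_cold)
        if u < best_u:
            best_x, best_u = x_cold, u
    return best_x, best_u


best_x, best_u = replica_exchange(1.0)
```

The cold chain does the fine-grained descent, while the hot chain crosses energy barriers; a swap hands a promising hot-chain position to the cold chain.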

Cardiff University at SemEval-2019 Task 4: Linguistic Features for Hyperpartisan News Detection


Title	Cardiff University at SemEval-2019 Task 4: Linguistic Features for Hyperpartisan News Detection
Authors	Carla Pérez-Almendros, Luis Espinosa-Anke, Steven Schockaert
Abstract	This paper summarizes our contribution to the Hyperpartisan News Detection task in SemEval 2019. We experiment with two different approaches: 1) an SVM classifier based on word vector averages and hand-crafted linguistic features, and 2) a BiLSTM-based neural text classifier trained on a filtered training set. Surprisingly, despite their different nature, both approaches achieve an accuracy of 0.74. The main focus of this paper is to further analyze the remarkable fact that a simple feature-based approach can perform on par with modern neural classifiers. We also highlight the effectiveness of our filtering strategy for training the neural network on a large but noisy training set.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2158/
PDF	https://www.aclweb.org/anthology/S19-2158
PWC	https://paperswithcode.com/paper/cardiff-university-at-semeval-2019-task-4
Repo
Framework

AiFu at SemEval-2019 Task 10: A Symbolic and Sub-symbolic Integrated System for SAT Math Question Answering


Title	AiFu at SemEval-2019 Task 10: A Symbolic and Sub-symbolic Integrated System for SAT Math Question Answering
Authors	Yifan Liu, Keyu Ding, Yi Zhou
Abstract	AiFu won first place in the SemEval-2019 Task 10 "Math Question Answering" competition. This paper describes how it works technically and reports and analyzes some essential experimental results.
Tasks	Question Answering
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2154/
PDF	https://www.aclweb.org/anthology/S19-2154
PWC	https://paperswithcode.com/paper/aifu-at-semeval-2019-task-10-a-symbolic-and
Repo
Framework

End-to-end learning of pharmacological assays from high-resolution microscopy images


Title	End-to-end learning of pharmacological assays from high-resolution microscopy images
Authors	Markus Hofmarcher, Elisabeth Rumetshofer, Sepp Hochreiter, Günter Klambauer
Abstract	Predicting the outcome of pharmacological assays based on high-resolution microscopy images of treated cells is a crucial task in drug discovery which tremendously increases discovery rates. However, end-to-end learning on these images with convolutional neural networks (CNNs) has not been ventured for this task because it has been considered infeasible and overly complex. On the largest available public dataset, we compare several state-of-the-art CNNs trained in an end-to-end fashion with models based on a cell-centric approach involving segmentation. We found that CNNs operating on full images containing hundreds of cells perform significantly better at assay prediction than networks operating on a single-cell level. Surprisingly, we could predict 29% of the 209 pharmacological assays at high predictive performance (AUC > 0.9). We compared a novel CNN architecture called “GapNet” against four competing CNN architectures and found that it performs on par with the best methods and at the same time has the lowest training time. Our results demonstrate that end-to-end learning on high-resolution imaging data is not only possible but even outperforms cell-centric and segmentation-dependent approaches. Hence, the costly cell segmentation and feature extraction steps are not necessary; in fact, they even hamper predictive performance. Our work further suggests that many pharmacological assays could be replaced by high-resolution microscopy imaging together with convolutional neural networks.
Tasks	Cell Segmentation, Drug Discovery
Published	2019-05-01
URL	https://openreview.net/forum?id=S1gBgnR9Y7
PDF	https://openreview.net/pdf?id=S1gBgnR9Y7
PWC	https://paperswithcode.com/paper/end-to-end-learning-of-pharmacological-assays
Repo
Framework

Cross-Entropy Loss Leads To Poor Margins


Title	Cross-Entropy Loss Leads To Poor Margins
Authors	Kamil Nar, Orhan Ocal, S. Shankar Sastry, Kannan Ramchandran
Abstract	Neural networks could misclassify inputs that are slightly different from their training data, which indicates a small margin between their decision boundaries and the training dataset. In this work, we study the binary classification of linearly separable datasets and show that linear classifiers could also have decision boundaries that lie close to their training dataset if cross-entropy loss is used for training. In particular, we show that if the features of the training dataset lie in a low-dimensional affine subspace and the cross-entropy loss is minimized by using a gradient method, the margin between the training points and the decision boundary could be much smaller than the optimal value. This result is contrary to the conclusions of recent related works such as (Soudry et al., 2018), and we identify the reason for this contradiction. In order to improve the margin, we introduce differential training, which is a training paradigm that uses a loss function defined on pairs of points from each class. We show that the decision boundary of a linear classifier trained with differential training indeed achieves the maximum margin. The results reveal the use of cross-entropy loss as one of the hidden culprits of adversarial examples and introduce a new direction to make neural networks robust against them.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=ByfbnsA9Km
PDF	https://openreview.net/pdf?id=ByfbnsA9Km
PWC	https://paperswithcode.com/paper/cross-entropy-loss-leads-to-poor-margins
Repo
Framework
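
The abstract above characterises differential training only as using a loss defined on pairs of points from each class. The sketch below shows one assumed instance of that idea, a logistic loss on the differences between positive and negative points, trained by plain gradient descent. The loss form, the toy dataset, the learning rate and the helper names are all hypothetical and are not taken from the paper.

```python
import math


def pairwise_differences(positives, negatives):
    """All difference vectors x_pos - x_neg, one per cross-class pair."""
    return [[p - n for p, n in zip(xp, xn)] for xp in positives for xn in negatives]


def train_on_pairs(positives, negatives, steps=500, lr=0.1):
    """Gradient descent on the mean of log(1 + exp(-w . (x_pos - x_neg)))."""
    diffs = pairwise_differences(positives, negatives)
    w = [0.0] * len(positives[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for d in diffs:
            score = sum(wi * di for wi, di in zip(w, d))
            # d/d(score) of log(1 + exp(-score)) is -1 / (1 + exp(score)).
            coeff = -1.0 / (1.0 + math.exp(score))
            for k, dk in enumerate(d):
                grad[k] += coeff * dk / len(diffs)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def margin(w, positives, negatives):
    """Half the gap between the classes along w (bias placed midway)."""
    norm = math.sqrt(sum(wi * wi for wi in w))

    def project(x):
        return sum(wi * xi for wi, xi in zip(w, x)) / norm

    lowest_pos = min(project(x) for x in positives)
    highest_neg = max(project(x) for x in negatives)
    return (lowest_pos - highest_neg) / 2.0


POS = [[2.0, 0.0], [3.0, 1.0]]    # hypothetical toy data
NEG = [[0.0, 0.0], [-1.0, 1.0]]
w = train_on_pairs(POS, NEG)
m = margin(w, POS, NEG)
```

For this symmetric toy set the learned direction stays on the first axis, so the measured margin equals the largest achievable value of 1. Whether that holds in general is the subject of the paper's analysis, not of this sketch.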

Risk Factors Extraction from Clinical Texts based on Linked Open Data


Title	Risk Factors Extraction from Clinical Texts based on Linked Open Data
Authors	Svetla Boytcheva, Galia Angelova, Zhivko Angelov
Abstract	This paper presents experiments in risk factors analysis based on clinical texts enhanced with Linked Open Data (LOD). The idea is to determine whether a patient has risk factors for a specific disease analyzing only his/her outpatient records. A semantic graph of "meta-knowledge" about a disease of interest is constructed, with integrated multilingual terms (labels) of symptoms, risk factors etc. coming from Wikidata, PubMed, Wikipedia and MESH, and linked to clinical records of individual patients via ICD-10 codes. Then a predictive model is trained to foretell whether patients are at risk to develop the disease of interest. The testing was done using outpatient records from a nation-wide repository available for the period 2011-2016. The results show improvement of the overall performance of all tested algorithms (kNN, Naive Bayes, Tree, Logistic regression, ANN), when the clinical texts are enriched with LOD resources.
Tasks
Published	2019-09-01
URL	https://www.aclweb.org/anthology/R19-1019/
PDF	https://www.aclweb.org/anthology/R19-1019
PWC	https://paperswithcode.com/paper/risk-factors-extraction-from-clinical-texts
Repo
Framework

Predicting Humorousness and Metaphor Novelty with Gaussian Process Preference Learning


Title	Predicting Humorousness and Metaphor Novelty with Gaussian Process Preference Learning
Authors	Edwin Simpson, Erik-Lân Do Dinh, Tristan Miller, Iryna Gurevych
Abstract	The inability to quantify key aspects of creative language is a frequent obstacle to natural language understanding. To address this, we introduce novel tasks for evaluating the creativeness of language, namely, scoring and ranking text by humorousness and metaphor novelty. To sidestep the difficulty of assigning discrete labels or numeric scores, we learn from pairwise comparisons between texts. We introduce a Bayesian approach for predicting humorousness and metaphor novelty using Gaussian process preference learning (GPPL), which achieves a Spearman's ρ of 0.56 against gold using word embeddings and linguistic features. Our experiments show that given sparse, crowdsourced annotation data, ranking using GPPL outperforms best-worst scaling. We release a new dataset for evaluating humour containing 28,210 pairwise comparisons of 4,030 texts, and make our software freely available.
Tasks	Word Embeddings
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1572/
PDF	https://www.aclweb.org/anthology/P19-1572
PWC	https://paperswithcode.com/paper/predicting-humorousness-and-metaphor-novelty
Repo
Framework

Empirical Linguistic Study of Sentence Embeddings


Title	Empirical Linguistic Study of Sentence Embeddings
Authors	Katarzyna Krasnowska-Kieraś, Alina Wróblewska
Abstract	The purpose of the research is to answer the question whether linguistic information is retained in vector representations of sentences. We introduce a method of analysing the content of sentence embeddings based on universal probing tasks, along with the classification datasets for two contrasting languages. We perform a series of probing and downstream experiments with different types of sentence embeddings, followed by a thorough analysis of the experimental results. Aside from dependency parser-based embeddings, linguistic information is retained best in the recently proposed LASER sentence embeddings.
Tasks	Sentence Embeddings
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1573/
PDF	https://www.aclweb.org/anthology/P19-1573
PWC	https://paperswithcode.com/paper/empirical-linguistic-study-of-sentence
Repo
Framework

Exploring Numeracy in Word Embeddings


Title	Exploring Numeracy in Word Embeddings
Authors	Aakanksha Naik, Abhilasha Ravichander, Carolyn Rose, Eduard Hovy
Abstract	Word embeddings are now pervasive across NLP subfields as the de-facto method of forming text representations. In this work, we show that existing embedding models are inadequate at constructing representations that capture salient aspects of mathematical meaning for numbers, which is important for language understanding. Numbers are ubiquitous and frequently appear in text. Inspired by cognitive studies on how humans perceive numbers, we develop an analysis framework to test how well word embeddings capture two essential properties of numbers: magnitude (e.g. 3<4) and numeration (e.g. 3=three). Our experiments reveal that most models capture an approximate notion of magnitude, but are inadequate at capturing numeration. We hope that our observations provide a starting point for the development of methods which better capture numeracy in NLP systems.
Tasks	Word Embeddings
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1329/
PDF	https://www.aclweb.org/anthology/P19-1329
PWC	https://paperswithcode.com/paper/exploring-numeracy-in-word-embeddings
Repo
Framework
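
The two properties named in the abstract above, magnitude and numeration, can be turned into simple similarity checks. The sketch below is one assumed way to do so with cosine similarity. The test definitions, the function names and the two-dimensional toy vectors are hypothetical, were chosen so that both checks pass, and say nothing about how real embedding models behave or how the paper's framework is defined.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query, candidates, emb):
    """Candidate whose embedding is most cosine-similar to the query's."""
    return max(candidates, key=lambda c: cosine(emb[query], emb[c]))


def magnitude_test(emb, number, closer, farther):
    """Pass if `number` is more similar to the numerically closer candidate."""
    return cosine(emb[number], emb[closer]) > cosine(emb[number], emb[farther])


def numeration_test(emb, digit, word, distractors):
    """Pass if the digit form is nearest to its own word form."""
    return nearest(digit, [word] + distractors, emb) == word


# Hypothetical toy embeddings, hand-built for illustration only.
TOY = {
    "3": [1.0, 0.30], "4": [1.0, 0.40], "9": [1.0, 0.90],
    "three": [0.9, 0.31], "four": [0.9, 0.40], "nine": [0.9, 0.88],
}

magnitude_ok = magnitude_test(TOY, "3", "4", "9")
numeration_ok = numeration_test(TOY, "3", "three", ["four", "nine"])
```

With a real model, `TOY` would be replaced by a lookup into its vocabulary, and the pass rate over many numbers would be reported rather than a single check.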

EED: Extended Edit Distance Measure for Machine Translation


Title	EED: Extended Edit Distance Measure for Machine Translation
Authors	Peter Stanchev, Weiyue Wang, Hermann Ney
Abstract	Over the years a number of machine translation metrics have been developed in order to evaluate the accuracy and quality of machine-generated translations. Metrics such as BLEU and TER have been used for decades. However, with the rapid progress of machine translation systems, the need for better metrics is growing. This paper proposes an extension of the edit distance, which achieves better human correlation, whilst remaining fast, flexible and easy to understand.
Tasks	Machine Translation
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5359/
PDF	https://www.aclweb.org/anthology/W19-5359
PWC	https://paperswithcode.com/paper/eed-extended-edit-distance-measure-for
Repo
Framework

Filtering Pseudo-References by Paraphrasing for Automatic Evaluation of Machine Translation


Title	Filtering Pseudo-References by Paraphrasing for Automatic Evaluation of Machine Translation
Authors	Ryoma Yoshimura, Hiroki Shimanaka, Yukio Matsumura, Hayahide Yamagishi, Mamoru Komachi
Abstract	In this paper, we introduce our participation in the WMT 2019 Metric Shared Task. We propose an improved version of sentence BLEU using filtered pseudo-references. We propose a method to filter pseudo-references by paraphrasing for automatic evaluation of machine translation (MT). We use the outputs of off-the-shelf MT systems as pseudo-references filtered by paraphrasing in addition to a single human reference (gold reference). We use BERT fine-tuned with paraphrase corpus to filter pseudo-references by checking the paraphrasability with the gold reference. Our experimental results of the WMT 2016 and 2017 datasets show that our method achieved higher correlation with human evaluation than the sentence BLEU (SentBLEU) baselines with a single reference and with unfiltered pseudo-references.
Tasks	Machine Translation
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5360/
PDF	https://www.aclweb.org/anthology/W19-5360
PWC	https://paperswithcode.com/paper/filtering-pseudo-references-by-paraphrasing
Repo
Framework

λ-Net: Reconstruct Hyperspectral Images From a Snapshot Measurement


Title	λ-Net: Reconstruct Hyperspectral Images From a Snapshot Measurement
Authors	Xin Miao, Xin Yuan, Yunchen Pu, Vassilis Athitsos
Abstract	We propose the λ-net, which reconstructs hyperspectral images (e.g., with 24 spectral channels) from a single shot measurement. This task is usually termed snapshot compressive-spectral imaging (SCI), which enjoys low cost, low bandwidth and high-speed sensing rate via capturing the three-dimensional (3D) signal, i.e., (x, y, λ), using a 2D snapshot. Though proposed more than a decade ago, the poor quality and low speed of reconstruction algorithms preclude wide applications of SCI. To address this challenge, in this paper, we develop a dual-stage generative model to reconstruct the desired 3D signal in SCI, dubbed λ-net. Results on both simulation and real datasets demonstrate the significant advantages of λ-net, which leads to >4dB improvement in PSNR for real-mask-in-the-loop simulation data compared to the current state-of-the-art. Furthermore, λ-net can finish the reconstruction task within sub-seconds instead of hours taken by the most recently proposed DeSCI algorithm, thus speeding up the reconstruction >1000 times.
Tasks
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Miao_l-Net_Reconstruct_Hyperspectral_Images_From_a_Snapshot_Measurement_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Miao_l-Net_Reconstruct_Hyperspectral_Images_From_a_Snapshot_Measurement_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/l-net-reconstruct-hyperspectral-images-from-a
Repo
Framework