Paper Group NANR 216
Private Testing of Distributions via Sample Permutations
Title | Private Testing of Distributions via Sample Permutations |
Authors | Maryam Aliakbarpour, Ilias Diakonikolas, Daniel Kane, Ronitt Rubinfeld |
Abstract | Statistical tests are at the heart of many scientific tasks. To validate their hypotheses, researchers in the medical and social sciences use individuals’ data. The sensitivity of participants’ data requires the design of statistical tests that ensure the privacy of the individuals in the most efficient way. In this paper, we use the framework of property testing to design algorithms to test the properties of the distribution that the data is drawn from with respect to differential privacy. In particular, we investigate testing two fundamental properties of distributions: (1) testing the equivalence of two distributions when we have unequal numbers of samples from the two distributions; (2) testing the independence of two random variables. In both cases, we show that our testers achieve near-optimal sample complexity (up to logarithmic factors). Moreover, our dependence on the privacy parameter is an additive term, which indicates that differential privacy can be obtained in most parameter regimes for free. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9270-private-testing-of-distributions-via-sample-permutations | http://papers.nips.cc/paper/9270-private-testing-of-distributions-via-sample-permutations.pdf |
PWC | https://paperswithcode.com/paper/private-testing-of-distributions-via-sample |
Repo | |
Framework | |
Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection
Title | Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection |
Authors | Mykhailo Shvets, Wei Liu, Alexander C. Berg |
Abstract | Single-frame object detectors sometimes perform well on videos, even without temporal context. However, challenges such as occlusion, motion blur, and rare poses of objects are hard to resolve without temporal awareness. Thus, there is a strong need to improve video object detection by considering long-range temporal dependencies. In this paper, we present a light-weight modification to a single-frame detector that accounts for arbitrarily long dependencies in a video. It improves the accuracy of a single-frame detector significantly with negligible compute overhead. The key component of our approach is a novel temporal relation module, operating on object proposals, that learns the similarities between proposals from different frames and selects proposals from the past and/or future to support current proposals. Our final “causal” model, without any offline post-processing steps, runs at a similar speed to a single-frame detector and achieves state-of-the-art video object detection on the ImageNet VID dataset. |
Tasks | Object Detection, Video Object Detection |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Shvets_Leveraging_Long-Range_Temporal_Relationships_Between_Proposals_for_Video_Object_Detection_ICCV_2019_paper.html | http://openaccess.thecvf.com/content_ICCV_2019/papers/Shvets_Leveraging_Long-Range_Temporal_Relationships_Between_Proposals_for_Video_Object_Detection_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/leveraging-long-range-temporal-relationships |
Repo | |
Framework | |
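The temporal relation module described in the abstract above attends from current-frame proposals to proposals pooled from other frames, using learned similarities to select supporting features. A minimal sketch of that attention step in plain Python (an illustration under assumed dot-product similarity, not the authors' implementation; all names are hypothetical):

```python
import math

def temporal_relation(current, support):
    """Attend from current-frame proposals to support-frame proposals.

    current: list of feature vectors (lists of floats) for the current frame
    support: list of feature vectors pooled from past and/or future frames
    Returns the current features augmented with attended support features.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    augmented = []
    for cur in current:
        # Similarity of this proposal to every support proposal.
        sims = [dot(cur, sup) for sup in support]
        # Softmax the similarities into attention weights.
        m = max(sims)
        exps = [math.exp(s - m) for s in sims]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Weighted sum of support features, fused back by addition.
        agg = [sum(w * sup[i] for w, sup in zip(weights, support))
               for i in range(len(cur))]
        augmented.append([c + a for c, a in zip(cur, agg)])
    return augmented

cur = [[1.0, 0.0], [0.0, 1.0]]
sup = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
out = temporal_relation(cur, sup)
print(len(out), len(out[0]))  # 2 2
```

Because the aggregation is a weighted sum over however many support proposals are given, the same step works for arbitrarily long temporal windows.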
Predicting Historical Phonetic Features using Deep Neural Networks: A Case Study of the Phonetic System of Proto-Indo-European
Title | Predicting Historical Phonetic Features using Deep Neural Networks: A Case Study of the Phonetic System of Proto-Indo-European |
Authors | Frederik Hartmann |
Abstract | Traditional historical linguistics lacks the means to empirically assess its assumptions regarding the phonetic systems of past languages and language stages, since most current methods rely on comparative tools to gain insights into the phonetic features of sounds in proto- or ancestor languages. The paper at hand presents a computational method based on deep neural networks to predict phonetic features of historical sounds whose exact quality is unknown, and to test the overall coherence of reconstructed historical phonetic features. The method utilizes the principles of coarticulation, local predictability, and statistical phonological constraints to predict the phonetic features of a sound from the features of its immediate phonetic environment. The validity of this method is assessed using New High German phonetic data, and its specific application to diachronic linguistics is demonstrated in a case study of the phonetic system of Proto-Indo-European. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4713/ | https://www.aclweb.org/anthology/W19-4713 |
PWC | https://paperswithcode.com/paper/predicting-historical-phonetic-features-using |
Repo | |
Framework | |
Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study
Title | Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study |
Authors | Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou |
Abstract | Sequence-to-sequence (seq2seq) models have achieved tremendous success in text generation tasks. However, there is no guarantee that they can always generate sentences without grammatical errors. In this paper, we present a preliminary empirical study on whether and how much automatic grammatical error correction can help improve seq2seq text generation. We conduct experiments across various seq2seq text generation tasks including machine translation, formality style transfer, and sentence compression and simplification. Experiments show that a state-of-the-art grammatical error correction system can improve the grammaticality of generated text and can bring task-oriented improvements in tasks where target sentences are in a formal style. |
Tasks | Grammatical Error Correction, Machine Translation, Sentence Compression, Style Transfer, Text Generation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1609/ | https://www.aclweb.org/anthology/P19-1609 |
PWC | https://paperswithcode.com/paper/automatic-grammatical-error-correction-for |
Repo | |
Framework | |
Thompson Sampling and Approximate Inference
Title | Thompson Sampling and Approximate Inference |
Authors | My Phan, Yasin Abbasi Yadkori, Justin Domke |
Abstract | We study the effects of approximate inference on the performance of Thompson sampling in $k$-armed bandit problems. Thompson sampling is a successful algorithm for online decision-making but requires posterior inference, which often must be approximated in practice. We show that even a small constant inference error (in $\alpha$-divergence) can lead to poor performance (linear regret) due to under-exploration (for $\alpha<1$) or over-exploration (for $\alpha>0$) by the approximation. While for $\alpha > 0$ this is unavoidable, for $\alpha \leq 0$ the regret can be improved by adding a small amount of forced exploration, even when the inference error is a large constant. |
Tasks | Decision Making |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9084-thompson-sampling-and-approximate-inference | http://papers.nips.cc/paper/9084-thompson-sampling-and-approximate-inference.pdf |
PWC | https://paperswithcode.com/paper/thompson-sampling-and-approximate-inference-1 |
Repo | |
Framework | |
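For context on the abstract above: exact Thompson sampling on a Bernoulli $k$-armed bandit keeps a Beta posterior per arm, samples a mean from each posterior every round, and pulls the argmax. A textbook sketch of this exact-inference baseline (not the paper's approximate-inference variant; function and parameter names are hypothetical):

```python
import random

def thompson_bernoulli(true_means, horizon, seed=0):
    """Exact Thompson sampling for a Bernoulli k-armed bandit.

    Keeps a Beta(successes + 1, failures + 1) posterior per arm; each
    round samples a mean from every posterior and pulls the arm with
    the largest sampled mean.
    """
    rng = random.Random(seed)
    k = len(true_means)
    succ = [0] * k
    fail = [0] * k
    total_reward = 0
    for _ in range(horizon):
        # One posterior sample per arm.
        samples = [rng.betavariate(succ[i] + 1, fail[i] + 1) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        total_reward += reward
        if reward:
            succ[arm] += 1
        else:
            fail[arm] += 1
    return total_reward, succ, fail

reward, succ, fail = thompson_bernoulli([0.2, 0.8], horizon=500)
# The 0.8 arm should accumulate the large majority of the pulls.
```

The paper's point is that replacing the exact Beta posterior sample with an approximate one can break this exploration behavior, depending on the sign of the $\alpha$-divergence used for the approximation.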
Aligning Artificial Neural Networks to the Brain yields Shallow Recurrent Architectures
Title | Aligning Artificial Neural Networks to the Brain yields Shallow Recurrent Architectures |
Authors | Jonas Kubilius, Martin Schrimpf, Ha Hong, Najib J. Majaj, Rishi Rajalingham, Elias B. Issa, Kohitij Kar, Pouya Bashivan, Jonathan Prescott-Roy, Kailyn Schmidt, Aran Nayebi, Daniel Bear, Daniel L. K. Yamins, James J. DiCarlo |
Abstract | Deep artificial neural networks with spatially repeated processing (a.k.a., deep convolutional ANNs) have been established as the best class of candidate models of visual processing in the primate ventral visual processing stream. Over the past five years, these ANNs have evolved from a simple feedforward eight-layer architecture in AlexNet to extremely deep and branching NASNet architectures, demonstrating increasingly better object categorization performance. Here we ask, as ANNs have continued to evolve in performance, are they also strong candidate models for the brain? To answer this question, we developed Brain-Score, a composite of neural and behavioral benchmarks for determining how brain-like a model is, together with an online platform where models can receive a Brain-Score and compare against other models. Despite high scores, typical deep models from the machine learning community are often hard to map onto the brain’s anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. To further map onto anatomy and validate our approach, we built CORnet-S: an ANN guided by Brain-Score with the anatomical constraints of compactness and recurrence. Although a shallow model with four anatomically mapped areas and recurrent connectivity, CORnet-S is a top model on Brain-Score and outperforms similarly compact models on ImageNet. Analyzing CORnet-S circuitry variants revealed recurrence as the main predictive factor of both Brain-Score and ImageNet top-1 performance. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJeY6sR9KX | https://openreview.net/pdf?id=BJeY6sR9KX |
PWC | https://paperswithcode.com/paper/aligning-artificial-neural-networks-to-the |
Repo | |
Framework | |
Syntax-aware Semantic Role Labeling without Parsing
Title | Syntax-aware Semantic Role Labeling without Parsing |
Authors | Rui Cai, Mirella Lapata |
Abstract | In this paper we focus on learning dependency aware representations for semantic role labeling without recourse to an external parser. The backbone of our model is an LSTM-based semantic role labeler jointly trained with two auxiliary tasks: predicting the dependency label of a word and whether there exists an arc linking it to the predicate. The auxiliary tasks provide syntactic information that is specific to semantic role labeling and are learned from training data (dependency annotations) without relying on existing dependency parsers, which can be noisy (e.g., on out-of-domain data or infrequent constructions). Experimental results on the CoNLL-2009 benchmark dataset show that our model outperforms the state of the art in English, and consistently improves performance in other languages, including Chinese, German, and Spanish. |
Tasks | Semantic Role Labeling |
Published | 2019-03-01 |
URL | https://www.aclweb.org/anthology/Q19-1022/ | https://www.aclweb.org/anthology/Q19-1022 |
PWC | https://paperswithcode.com/paper/syntax-aware-semantic-role-labeling-without |
Repo | |
Framework | |
Gradient-based learning for F-measure and other performance metrics
Title | Gradient-based learning for F-measure and other performance metrics |
Authors | Yu Gai, Zheng Zhang, Kyunghyun Cho |
Abstract | Many important classification performance metrics, e.g., the $F$-measure, are non-differentiable and non-decomposable, and are thus unfriendly to the gradient descent algorithm. Consequently, despite their popularity as evaluation metrics, these metrics are rarely optimized as training objectives in the neural network community. In this paper, we propose an empirical utility maximization scheme with provable learning guarantees to address the non-differentiability of these metrics. We then derive a strongly consistent gradient estimator to handle non-decomposability. These innovations enable end-to-end optimization of these metrics with the same computational complexity as optimizing a decomposable and differentiable metric, e.g., the cross-entropy loss. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=H1zxjsCqKQ | https://openreview.net/pdf?id=H1zxjsCqKQ |
PWC | https://paperswithcode.com/paper/gradient-based-learning-for-f-measure-and |
Repo | |
Framework | |
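A common way to see why $F$-measure resists gradient descent, and the flavor of workaround the abstract above pursues, is the "soft" F1 surrogate: hard 0/1 predictions are replaced by probabilities so the TP/FP/FN counts become differentiable. This generic sketch (names hypothetical, and simpler than the paper's utility-maximization scheme) illustrates the idea:

```python
def soft_f1(probs, labels, eps=1e-8):
    """Differentiable surrogate for F1.

    Hard predictions are replaced by predicted probabilities of the
    positive class, so true positives, false positives, and false
    negatives become soft counts.

    probs:  predicted probabilities of the positive class
    labels: 0/1 ground-truth labels
    Returns a loss in [0, 1]; minimizing it maximizes soft F1.
    """
    tp = sum(p * y for p, y in zip(probs, labels))
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    fn = sum((1 - p) * y for p, y in zip(probs, labels))
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - f1

# Confident, correct predictions give (near) zero loss; hedged
# predictions on the same labels give a strictly larger loss.
good = soft_f1([1.0, 0.0, 1.0], [1, 0, 1])   # ≈ 0
hedged = soft_f1([0.5, 0.5, 0.5], [1, 0, 1])  # larger
```

Note that this surrogate is still non-decomposable (the loss is over the whole batch, not a sum over examples), which is the part the paper's consistent gradient estimator is designed to handle.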
Step-wise Refinement Classification Approach for Enterprise Legal Litigation
Title | Step-wise Refinement Classification Approach for Enterprise Legal Litigation |
Authors | Ying Mao, Xian Wang, Jianbo Tang, Changliang Li |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5505/ | https://www.aclweb.org/anthology/W19-5505 |
PWC | https://paperswithcode.com/paper/step-wise-refinement-classification-approach |
Repo | |
Framework | |
Proceedings of the First Workshop on Financial Technology and Natural Language Processing
Title | Proceedings of the First Workshop on Financial Technology and Natural Language Processing |
Authors | |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5500/ | https://www.aclweb.org/anthology/W19-5500 |
PWC | https://paperswithcode.com/paper/proceedings-of-the-first-workshop-on-12 |
Repo | |
Framework | |
Equity Beyond Bias in Language Technologies for Education
Title | Equity Beyond Bias in Language Technologies for Education |
Authors | Elijah Mayfield, Michael Madaio, Shrimai Prabhumoye, David Gerritsen, Brittany McLaughlin, Ezekiel Dixon-Román, Alan W Black |
Abstract | There is a long record of research on equity in schools. As machine learning researchers begin to study fairness and bias in earnest, language technologies in education have an unusually strong theoretical and applied foundation to build on. Here, we introduce concepts from culturally relevant pedagogy and other frameworks for teaching and learning, identifying future work on equity in NLP. We present case studies in a range of topics like intelligent tutoring systems, computer-assisted language learning, automated essay scoring, and sentiment analysis in classrooms, and provide an actionable agenda for research. |
Tasks | Sentiment Analysis |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4446/ | https://www.aclweb.org/anthology/W19-4446 |
PWC | https://paperswithcode.com/paper/equity-beyond-bias-in-language-technologies |
Repo | |
Framework | |
Representation Compression and Generalization in Deep Neural Networks
Title | Representation Compression and Generalization in Deep Neural Networks |
Authors | Ravid Shwartz-Ziv, Amichai Painsky, Naftali Tishby |
Abstract | Understanding the groundbreaking performance of Deep Neural Networks is one of the greatest challenges to the scientific community today. In this work, we introduce an information-theoretic viewpoint on the behavior of deep networks’ optimization processes and their generalization abilities, by studying the Information Plane: the plane of the mutual information between the input variable and the desired label, for each hidden layer. Specifically, we show that the training of the network is characterized by a rapid increase in the mutual information (MI) between the layers and the target label, followed by a longer decrease in the MI between the layers and the input variable. Further, we explicitly show that these two fundamental information-theoretic quantities correspond to the generalization error of the network, as a result of introducing a new generalization bound that is exponential in the representation compression. The analysis focuses on typical patterns of large-scale problems. For this purpose, we introduce a novel analytic bound on the mutual information between consecutive layers in the network. An important consequence of our analysis is a super-linear boost in training time with the number of non-degenerate hidden layers, demonstrating the computational benefit of the hidden layers. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SkeL6sCqK7 | https://openreview.net/pdf?id=SkeL6sCqK7 |
PWC | https://paperswithcode.com/paper/representation-compression-and-generalization |
Repo | |
Framework | |
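The Information Plane analysis in the abstract above rests on estimating mutual information between a (discretized) layer representation and the input or the label. A minimal plug-in estimate over paired discrete samples can be sketched as follows (an illustration of the quantity being tracked, not the paper's analytic bound; names are hypothetical):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples.

    Builds empirical marginals and the joint from the samples and sums
    p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    """
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # p_joint / (p(x) * p(y)) == c * n / (px[x] * py[y])
        mi += p_joint * math.log2(c * n / (px[x] * py[y]))
    return mi

# X fully determines Y here, so I(X; Y) equals H(X) = 1 bit.
xs = [0, 0, 1, 1] * 25
ys = list(xs)
print(mutual_information(xs, ys))  # 1.0
```

In an Information Plane study, one would compute this quantity between binned hidden activations and the input, and between the same activations and the label, at successive training epochs.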
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Title | Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019) |
Authors | |
Abstract | |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6100/ | https://www.aclweb.org/anthology/D19-6100 |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-deep |
Repo | |
Framework | |
Cross-Cultural Transfer Learning for Text Classification
Title | Cross-Cultural Transfer Learning for Text Classification |
Authors | Dor Ringel, Gal Lavee, Ido Guy, Kira Radinsky |
Abstract | Large training datasets are required to achieve competitive performance in most natural language tasks. The acquisition process for these datasets is labor-intensive, expensive, and time-consuming. This process is also prone to human errors. In this work, we show that cross-cultural differences can be harnessed for natural language text classification. We present a transfer-learning framework that leverages widely available unaligned bilingual corpora for classification tasks, using no task-specific data. Our empirical evaluation on two tasks – formality classification and sarcasm detection – shows that the cross-cultural difference between German and American English, as manifested in product review text, can be applied to achieve good performance for formality classification, while the difference between Japanese and American English can be applied to achieve good performance for sarcasm detection – both without any task-specific labeled data. |
Tasks | Sarcasm Detection, Text Classification, Transfer Learning |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1400/ | https://www.aclweb.org/anthology/D19-1400 |
PWC | https://paperswithcode.com/paper/cross-cultural-transfer-learning-for-text |
Repo | |
Framework | |
Siamese Networks: The Tale of Two Manifolds
Title | Siamese Networks: The Tale of Two Manifolds |
Authors | Soumava Kumar Roy, Mehrtash Harandi, Richard Nock, Richard Hartley |
Abstract | Siamese networks are non-linear deep models that have found their way into a broad set of problems in learning theory, thanks to their embedding capabilities. In this paper, we study Siamese networks from a new perspective and question the validity of their training procedure. We show that in the majority of cases, the objective of a Siamese network is endowed with an invariance property. Neglecting this invariance property hinders the training of Siamese networks. To alleviate this issue, we propose two Riemannian structures and generalize a well-established accelerated stochastic gradient descent method to take the proposed Riemannian structures into account. Our empirical evaluations suggest that by making use of the Riemannian geometry, we achieve state-of-the-art results against several algorithms for the challenging problem of fine-grained image classification. |
Tasks | Fine-Grained Image Classification, Image Classification |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Roy_Siamese_Networks_The_Tale_of_Two_Manifolds_ICCV_2019_paper.html | http://openaccess.thecvf.com/content_ICCV_2019/papers/Roy_Siamese_Networks_The_Tale_of_Two_Manifolds_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/siamese-networks-the-tale-of-two-manifolds |
Repo | |
Framework | |
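A Siamese objective of the kind discussed above compares embeddings of paired inputs, pulling matching pairs together and pushing mismatched pairs apart. A standard contrastive loss sketch for one pair (a generic Euclidean baseline, not the paper's Riemannian formulation; names are hypothetical):

```python
import math

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Standard contrastive loss for one pair of embeddings.

    same=1: pull the pair together (loss is the squared distance).
    same=0: push the pair at least `margin` apart (hinge on distance).
    """
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

# A matching pair that is already close incurs a small loss; the same
# close pair labeled as mismatched is penalized toward the margin.
pos = contrastive_loss([0.0, 0.0], [0.1, 0.0], same=1)  # ≈ 0.01
neg = contrastive_loss([0.0, 0.0], [0.1, 0.0], same=0)  # ≈ 0.81
```

The paper's observation, roughly, is that such objectives are invariant under certain transformations of the embeddings, and that optimizing on a Riemannian structure respecting that invariance trains better than plain Euclidean SGD.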