January 24, 2020

2309 words 11 mins read

Paper Group NANR 216

Private Testing of Distributions via Sample Permutations

Title Private Testing of Distributions via Sample Permutations
Authors Maryam Aliakbarpour, Ilias Diakonikolas, Daniel Kane, Ronitt Rubinfeld
Abstract Statistical tests are at the heart of many scientific tasks. To validate their hypotheses, researchers in the medical and social sciences use individuals’ data. The sensitivity of participants’ data requires the design of statistical tests that ensure the privacy of the individuals in the most efficient way. In this paper, we use the framework of property testing to design algorithms to test the properties of the distribution that the data is drawn from with respect to differential privacy. In particular, we investigate testing two fundamental properties of distributions: (1) testing the equivalence of two distributions when we have unequal numbers of samples from the two distributions, and (2) testing independence of two random variables. In both cases, we show that our testers achieve near optimal sample complexity (up to logarithmic factors). Moreover, our dependence on the privacy parameter is an additive term, which indicates that differential privacy can be obtained in most parameter regimes for free.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9270-private-testing-of-distributions-via-sample-permutations
PDF http://papers.nips.cc/paper/9270-private-testing-of-distributions-via-sample-permutations.pdf
PWC https://paperswithcode.com/paper/private-testing-of-distributions-via-sample
Repo
Framework

Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection

Title Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection
Authors Mykhailo Shvets, Wei Liu, Alexander C. Berg
Abstract Single-frame object detectors perform well on videos sometimes, even without temporal context. However, challenges such as occlusion, motion blur, and rare poses of objects are hard to resolve without temporal awareness. Thus, there is a strong need to improve video object detection by considering long-range temporal dependencies. In this paper, we present a light-weight modification to a single-frame detector that accounts for arbitrarily long dependencies in a video. It improves the accuracy of a single-frame detector significantly with negligible compute overhead. The key component of our approach is a novel temporal relation module, operating on object proposals, that learns the similarities between proposals from different frames and selects proposals from the past and/or future to support current proposals. Our final “causal” model, without any offline post-processing steps, runs at a similar speed as a single-frame detector and achieves state-of-the-art video object detection on the ImageNet VID dataset.
Tasks Object Detection, Video Object Detection
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Shvets_Leveraging_Long-Range_Temporal_Relationships_Between_Proposals_for_Video_Object_Detection_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Shvets_Leveraging_Long-Range_Temporal_Relationships_Between_Proposals_for_Video_Object_Detection_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/leveraging-long-range-temporal-relationships
Repo
Framework
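The temporal relation module described above is, at its core, attention between proposal features across frames: similarities between current and supporting proposals are learned, and supporting features are aggregated accordingly. A minimal numpy sketch of that idea (not the authors' implementation; shapes and names are illustrative):

```python
import numpy as np

def temporal_relation(current, support, scale=None):
    """Attend from current-frame proposal features to proposals
    from other frames and return augmented features.

    current: (n, d) proposal features for the current frame
    support: (m, d) proposal features pooled from past/future frames
    """
    n, d = current.shape
    scale = scale or np.sqrt(d)
    # Similarity between every current proposal and every support proposal.
    logits = current @ support.T / scale           # (n, m)
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    # Aggregate supporting features and fuse them with the current ones.
    context = weights @ support                    # (n, d)
    return current + context

rng = np.random.default_rng(0)
cur = rng.normal(size=(5, 16))
sup = rng.normal(size=(20, 16))
out = temporal_relation(cur, sup)
print(out.shape)  # (5, 16)
```

In the paper's causal setting, `support` would contain only proposals from past frames, which is what lets the model run online at near single-frame speed.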

Predicting Historical Phonetic Features using Deep Neural Networks: A Case Study of the Phonetic System of Proto-Indo-European

Title Predicting Historical Phonetic Features using Deep Neural Networks: A Case Study of the Phonetic System of Proto-Indo-European
Authors Frederik Hartmann
Abstract Traditional historical linguistics lacks the possibility to empirically assess its assumptions regarding the phonetic systems of past languages and language stages, since most current methods rely on comparative tools to gain insights into phonetic features of sounds in proto- or ancestor languages. The paper at hand presents a computational method based on deep neural networks to predict phonetic features of historical sounds where the exact quality is unknown and to test the overall coherence of reconstructed historical phonetic features. The method utilizes the principles of coarticulation, local predictability and statistical phonological constraints to predict phonetic features from the features of their immediate phonetic environment. The validity of this method will be assessed using New High German phonetic data, and its specific application to diachronic linguistics will be demonstrated in a case study of the phonetic system of Proto-Indo-European.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4713/
PDF https://www.aclweb.org/anthology/W19-4713
PWC https://paperswithcode.com/paper/predicting-historical-phonetic-features-using
Repo
Framework

Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study

Title Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study
Authors Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou
Abstract Sequence-to-sequence (seq2seq) models have achieved tremendous success in text generation tasks. However, there is no guarantee that they can always generate sentences without grammatical errors. In this paper, we present a preliminary empirical study on whether and how much automatic grammatical error correction can help improve seq2seq text generation. We conduct experiments across various seq2seq text generation tasks including machine translation, formality style transfer, sentence compression and simplification. Experiments show the state-of-the-art grammatical error correction system can improve the grammaticality of generated text and can bring task-oriented improvements in the tasks where target sentences are in a formal style.
Tasks Grammatical Error Correction, Machine Translation, Sentence Compression, Style Transfer, Text Generation
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1609/
PDF https://www.aclweb.org/anthology/P19-1609
PWC https://paperswithcode.com/paper/automatic-grammatical-error-correction-for
Repo
Framework

Thompson Sampling and Approximate Inference

Title Thompson Sampling and Approximate Inference
Authors My Phan, Yasin Abbasi Yadkori, Justin Domke
Abstract We study the effects of approximate inference on the performance of Thompson sampling in $k$-armed bandit problems. Thompson sampling is a successful algorithm for online decision-making but requires posterior inference, which often must be approximated in practice. We show that even small constant inference error (in $\alpha$-divergence) can lead to poor performance (linear regret) due to under-exploration (for $\alpha<1$) or over-exploration (for $\alpha>0$) by the approximation. While for $\alpha > 0$ this is unavoidable, for $\alpha \leq 0$ the regret can be improved by adding a small amount of forced exploration even when the inference error is a large constant.
Tasks Decision Making
Published 2019-12-01
URL http://papers.nips.cc/paper/9084-thompson-sampling-and-approximate-inference
PDF http://papers.nips.cc/paper/9084-thompson-sampling-and-approximate-inference.pdf
PWC https://paperswithcode.com/paper/thompson-sampling-and-approximate-inference-1
Repo
Framework
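As a baseline for the discussion above, here is exact Thompson sampling for a Bernoulli bandit with conjugate Beta posteriors, i.e., the zero-inference-error case that the paper perturbs (a minimal sketch; the $\alpha$-divergence approximations studied in the paper are not reproduced):

```python
import numpy as np

def thompson_bernoulli(true_means, horizon, rng):
    """Exact Thompson sampling for a k-armed Bernoulli bandit
    with conjugate Beta(1, 1) priors; returns cumulative regret."""
    k = len(true_means)
    successes = np.ones(k)  # Beta alpha parameters
    failures = np.ones(k)   # Beta beta parameters
    regret = 0.0
    best = max(true_means)
    for _ in range(horizon):
        # Draw one posterior sample per arm and play the arm
        # whose sampled mean is largest.
        samples = rng.beta(successes, failures)
        arm = int(np.argmax(samples))
        reward = rng.random() < true_means[arm]
        successes[arm] += reward
        failures[arm] += 1 - reward
        regret += best - true_means[arm]
    return regret

rng = np.random.default_rng(0)
r = thompson_bernoulli([0.3, 0.5, 0.7], horizon=2000, rng=rng)
print(r)  # cumulative regret; grows sublinearly with the horizon
```

The paper's point is that replacing `rng.beta(...)` with a posterior approximation carrying constant $\alpha$-divergence error can turn this sublinear regret linear, unless a small amount of forced exploration is added.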

Aligning Artificial Neural Networks to the Brain yields Shallow Recurrent Architectures

Title Aligning Artificial Neural Networks to the Brain yields Shallow Recurrent Architectures
Authors Jonas Kubilius, Martin Schrimpf, Ha Hong, Najib J. Majaj, Rishi Rajalingham, Elias B. Issa, Kohitij Kar, Pouya Bashivan, Jonathan Prescott-Roy, Kailyn Schmidt, Aran Nayebi, Daniel Bear, Daniel L. K. Yamins, James J. DiCarlo
Abstract Deep artificial neural networks with spatially repeated processing (a.k.a., deep convolutional ANNs) have been established as the best class of candidate models of visual processing in the primate ventral visual processing stream. Over the past five years, these ANNs have evolved from a simple feedforward eight-layer architecture in AlexNet to extremely deep and branching NASNet architectures, demonstrating increasingly better object categorization performance. Here we ask, as ANNs have continued to evolve in performance, are they also strong candidate models for the brain? To answer this question, we developed Brain-Score, a composite of neural and behavioral benchmarks for determining how brain-like a model is, together with an online platform where models can receive a Brain-Score and compare against other models. Despite high scores, typical deep models from the machine learning community are often hard to map onto the brain’s anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. To further map onto anatomy and validate our approach, we built CORnet-S: an ANN guided by Brain-Score with the anatomical constraints of compactness and recurrence. Although a shallow model with four anatomically mapped areas and recurrent connectivity, CORnet-S is a top model on Brain-Score and outperforms similarly compact models on ImageNet. Analyzing CORnet-S circuitry variants revealed recurrence as the main predictive factor of both Brain-Score and ImageNet top-1 performance.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=BJeY6sR9KX
PDF https://openreview.net/pdf?id=BJeY6sR9KX
PWC https://paperswithcode.com/paper/aligning-artificial-neural-networks-to-the
Repo
Framework

Syntax-aware Semantic Role Labeling without Parsing

Title Syntax-aware Semantic Role Labeling without Parsing
Authors Rui Cai, Mirella Lapata
Abstract In this paper we focus on learning dependency aware representations for semantic role labeling without recourse to an external parser. The backbone of our model is an LSTM-based semantic role labeler jointly trained with two auxiliary tasks: predicting the dependency label of a word and whether there exists an arc linking it to the predicate. The auxiliary tasks provide syntactic information that is specific to semantic role labeling and are learned from training data (dependency annotations) without relying on existing dependency parsers, which can be noisy (e.g., on out-of-domain data or infrequent constructions). Experimental results on the CoNLL-2009 benchmark dataset show that our model outperforms the state of the art in English, and consistently improves performance in other languages, including Chinese, German, and Spanish.
Tasks Semantic Role Labeling
Published 2019-03-01
URL https://www.aclweb.org/anthology/Q19-1022/
PDF https://www.aclweb.org/anthology/Q19-1022
PWC https://paperswithcode.com/paper/syntax-aware-semantic-role-labeling-without
Repo
Framework

Gradient-based learning for F-measure and other performance metrics

Title Gradient-based learning for F-measure and other performance metrics
Authors Yu Gai, Zheng Zhang, Kyunghyun Cho
Abstract Many important classification performance metrics, e.g. $F$-measure, are non-differentiable and non-decomposable, and are thus unfriendly to gradient descent algorithms. Consequently, despite their popularity as evaluation metrics, these metrics are rarely optimized as training objectives in the neural network community. In this paper, we propose an empirical utility maximization scheme with provable learning guarantees to address the non-differentiability of these metrics. We then derive a strongly consistent gradient estimator to handle non-decomposability. These innovations enable end-to-end optimization of these metrics with the same computational complexity as optimizing a decomposable and differentiable metric, e.g. cross-entropy loss.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=H1zxjsCqKQ
PDF https://openreview.net/pdf?id=H1zxjsCqKQ
PWC https://paperswithcode.com/paper/gradient-based-learning-for-f-measure-and
Repo
Framework
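A common way to make $F$-measure amenable to gradient descent, related in spirit to the approach above though not the paper's estimator, is to replace hard prediction counts with sums of predicted probabilities, giving a differentiable "soft" F1 loss (a minimal sketch):

```python
import numpy as np

def soft_f1_loss(probs, labels, eps=1e-8):
    """Differentiable surrogate for 1 - F1: true/false positive
    counts are replaced by sums of predicted probabilities.

    probs:  (n,) predicted positive-class probabilities in [0, 1]
    labels: (n,) binary ground-truth labels
    """
    tp = np.sum(probs * labels)        # soft true positives
    fp = np.sum(probs * (1 - labels))  # soft false positives
    fn = np.sum((1 - probs) * labels)  # soft false negatives
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - f1

labels = np.array([1, 0, 1, 1, 0])
perfect = soft_f1_loss(np.array([1.0, 0.0, 1.0, 1.0, 0.0]), labels)
noisy = soft_f1_loss(np.array([0.9, 0.2, 0.8, 0.7, 0.1]), labels)
print(perfect, noisy)  # perfect predictions give ~0 loss
```

This surrogate is decomposable only at the batch level, which is exactly the non-decomposability issue the paper's strongly consistent gradient estimator is designed to handle.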

Step-wise Refinement Classification Approach for Enterprise Legal Litigation

Title Step-wise Refinement Classification Approach for Enterprise Legal Litigation
Authors Ying Mao, Xian Wang, Jianbo Tang, Changliang Li
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5505/
PDF https://www.aclweb.org/anthology/W19-5505
PWC https://paperswithcode.com/paper/step-wise-refinement-classification-approach
Repo
Framework

Proceedings of the First Workshop on Financial Technology and Natural Language Processing

Title Proceedings of the First Workshop on Financial Technology and Natural Language Processing
Authors
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5500/
PDF https://www.aclweb.org/anthology/W19-5500
PWC https://paperswithcode.com/paper/proceedings-of-the-first-workshop-on-12
Repo
Framework

Equity Beyond Bias in Language Technologies for Education

Title Equity Beyond Bias in Language Technologies for Education
Authors Elijah Mayfield, Michael Madaio, Shrimai Prabhumoye, David Gerritsen, Brittany McLaughlin, Ezekiel Dixon-Román, Alan W Black
Abstract There is a long record of research on equity in schools. As machine learning researchers begin to study fairness and bias in earnest, language technologies in education have an unusually strong theoretical and applied foundation to build on. Here, we introduce concepts from culturally relevant pedagogy and other frameworks for teaching and learning, identifying future work on equity in NLP. We present case studies in a range of topics like intelligent tutoring systems, computer-assisted language learning, automated essay scoring, and sentiment analysis in classrooms, and provide an actionable agenda for research.
Tasks Sentiment Analysis
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4446/
PDF https://www.aclweb.org/anthology/W19-4446
PWC https://paperswithcode.com/paper/equity-beyond-bias-in-language-technologies
Repo
Framework

REPRESENTATION COMPRESSION AND GENERALIZATION IN DEEP NEURAL NETWORKS

Title REPRESENTATION COMPRESSION AND GENERALIZATION IN DEEP NEURAL NETWORKS
Authors Ravid Shwartz-Ziv, Amichai Painsky, Naftali Tishby
Abstract Understanding the groundbreaking performance of Deep Neural Networks is one of the greatest challenges to the scientific community today. In this work, we introduce an information theoretic viewpoint on the behavior of deep networks optimization processes and their generalization abilities. We do so by studying the Information Plane: for each hidden layer, the mutual information it retains about the input variable and about the desired label. Specifically, we show that the training of the network is characterized by a rapid increase in the mutual information (MI) between the layers and the target label, followed by a longer decrease in the MI between the layers and the input variable. Further, we explicitly show that these two fundamental information-theoretic quantities correspond to the generalization error of the network, as a result of introducing a new generalization bound that is exponential in the representation compression. The analysis focuses on typical patterns of large-scale problems. For this purpose, we introduce a novel analytic bound on the mutual information between consecutive layers in the network. An important consequence of our analysis is a super-linear boost in training time with the number of non-degenerate hidden layers, demonstrating the computational benefit of the hidden layers.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SkeL6sCqK7
PDF https://openreview.net/pdf?id=SkeL6sCqK7
PWC https://paperswithcode.com/paper/representation-compression-and-generalization
Repo
Framework
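The Information Plane quantities described above can be estimated, in the simplest discrete setting, by binning activations and computing mutual information from the joint histogram (a rough illustrative sketch, not the authors' analytic bound):

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram-based estimate of I(X; Y) in nats for 1-D samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()             # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = pxy > 0                          # avoid log(0) on empty bins
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
dependent = mutual_information(x, x + 0.1 * rng.normal(size=5000))
independent = mutual_information(x, rng.normal(size=5000))
print(dependent, independent)  # the dependent pair carries far more information
```

Tracking estimates like these for (input, layer) and (layer, label) pairs over training epochs is what traces out the fitting-then-compression trajectory on the Information Plane.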

Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)

Title Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Authors
Abstract
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6100/
PDF https://www.aclweb.org/anthology/D19-6100
PWC https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-deep
Repo
Framework

Cross-Cultural Transfer Learning for Text Classification

Title Cross-Cultural Transfer Learning for Text Classification
Authors Dor Ringel, Gal Lavee, Ido Guy, Kira Radinsky
Abstract Large training datasets are required to achieve competitive performance in most natural language tasks. The acquisition process for these datasets is labor intensive, expensive, and time consuming. This process is also prone to human errors. In this work, we show that cross-cultural differences can be harnessed for natural language text classification. We present a transfer-learning framework that leverages widely-available unaligned bilingual corpora for classification tasks, using no task-specific data. Our empirical evaluation on two tasks, formality classification and sarcasm detection, shows that the cross-cultural difference between German and American English, as manifested in product review text, can be applied to achieve good performance for formality classification, while the difference between Japanese and American English can be applied to achieve good performance for sarcasm detection, both without any task-specific labeled data.
Tasks Sarcasm Detection, Text Classification, Transfer Learning
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1400/
PDF https://www.aclweb.org/anthology/D19-1400
PWC https://paperswithcode.com/paper/cross-cultural-transfer-learning-for-text
Repo
Framework

Siamese Networks: The Tale of Two Manifolds

Title Siamese Networks: The Tale of Two Manifolds
Authors Soumava Kumar Roy, Mehrtash Harandi, Richard Nock, Richard Hartley
Abstract Siamese networks are non-linear deep models that have found their way into a broad set of problems in learning theory, thanks to their embedding capabilities. In this paper, we study Siamese networks from a new perspective and question the validity of their training procedure. We show that in the majority of cases, the objective of a Siamese network is endowed with an invariance property. Neglecting the invariance property leads to a hindrance in training the Siamese networks. To alleviate this issue, we propose two Riemannian structures and generalize a well-established accelerated stochastic gradient descent method to take into account the proposed Riemannian structures. Our empirical evaluations suggest that by making use of the Riemannian geometry, we achieve state-of-the-art results against several algorithms for the challenging problem of fine-grained image classification.
Tasks Fine-Grained Image Classification, Image Classification
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Roy_Siamese_Networks_The_Tale_of_Two_Manifolds_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Roy_Siamese_Networks_The_Tale_of_Two_Manifolds_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/siamese-networks-the-tale-of-two-manifolds
Repo
Framework
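The standard Siamese training objective that the paper revisits is the contrastive loss over embedding pairs; a minimal numpy sketch of that baseline objective (the paper's Riemannian structures and accelerated optimizer are not reproduced):

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Contrastive loss for a batch of embedding pairs.

    emb_a, emb_b: (n, d) embeddings from the two Siamese branches
    same:         (n,) 1 if the pair shares a label, else 0
    """
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    # Pull matching pairs together; push non-matching pairs
    # apart until they are at least `margin` away.
    pos = same * dist ** 2
    neg = (1 - same) * np.maximum(margin - dist, 0.0) ** 2
    return float(np.mean(pos + neg))

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 8))
loss_matched = contrastive_loss(a, a + 0.01, np.ones(4))
loss_far = contrastive_loss(a, a + 0.01, np.zeros(4))
print(loss_matched, loss_far)  # near-identical pairs: low loss when labeled same
```

The invariance the paper points to lives in this objective: it depends on embeddings only through pairwise distances, so whole families of transformed embeddings achieve the same loss, which motivates optimizing on a quotient (Riemannian) geometry rather than plain Euclidean parameter space.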