Paper Group NANR 216
Private Testing of Distributions via Sample Permutations
Title | Private Testing of Distributions via Sample Permutations |
Authors | Maryam Aliakbarpour, Ilias Diakonikolas, Daniel Kane, Ronitt Rubinfeld |
Abstract | Statistical tests are at the heart of many scientific tasks. To validate their hypotheses, researchers in the medical and social sciences use individuals’ data. The sensitivity of participants’ data requires the design of statistical tests that ensure the privacy of the individuals in the most efficient way. In this paper, we use the framework of property testing to design algorithms to test the properties of the distribution that the data is drawn from with respect to differential privacy. In particular, we investigate testing two fundamental properties of distributions: (1) testing the equivalence of two distributions when we have unequal numbers of samples from the two distributions; (2) testing the independence of two random variables. In both cases, we show that our testers achieve near-optimal sample complexity (up to logarithmic factors). Moreover, our dependence on the privacy parameter is an additive term, which indicates that differential privacy can be obtained in most parameter regimes for free. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9270-private-testing-of-distributions-via-sample-permutations | http://papers.nips.cc/paper/9270-private-testing-of-distributions-via-sample-permutations.pdf |
PWC | https://paperswithcode.com/paper/private-testing-of-distributions-via-sample |
Repo | |
Framework | |
Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection
Title | Leveraging Long-Range Temporal Relationships Between Proposals for Video Object Detection |
Authors | Mykhailo Shvets, Wei Liu, Alexander C. Berg |
Abstract | Single-frame object detectors sometimes perform well on videos, even without temporal context. However, challenges such as occlusion, motion blur, and rare poses of objects are hard to resolve without temporal awareness. Thus, there is a strong need to improve video object detection by considering long-range temporal dependencies. In this paper, we present a light-weight modification to a single-frame detector that accounts for arbitrarily long dependencies in a video. It improves the accuracy of a single-frame detector significantly with negligible compute overhead. The key component of our approach is a novel temporal relation module, operating on object proposals, that learns the similarities between proposals from different frames and selects proposals from the past and/or future to support current proposals. Our final “causal” model, without any offline post-processing steps, runs at a similar speed to a single-frame detector and achieves state-of-the-art video object detection on the ImageNet VID dataset. |
Tasks | Object Detection, Video Object Detection |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Shvets_Leveraging_Long-Range_Temporal_Relationships_Between_Proposals_for_Video_Object_Detection_ICCV_2019_paper.html | http://openaccess.thecvf.com/content_ICCV_2019/papers/Shvets_Leveraging_Long-Range_Temporal_Relationships_Between_Proposals_for_Video_Object_Detection_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/leveraging-long-range-temporal-relationships |
Repo | |
Framework | |
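The temporal relation module described in the abstract above attends from current-frame proposals to proposals pooled from other frames, using learned similarities to select supporting features. A minimal sketch of that attention step in plain Python (an illustration under assumed dot-product similarity, not the authors' implementation; all names are hypothetical):

```python
import math

def temporal_relation(current, support):
    """Attend from current-frame proposals to support-frame proposals.

    current: list of feature vectors (lists of floats) for the current frame
    support: list of feature vectors pooled from past and/or future frames
    Returns the current features augmented with attended support features.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    augmented = []
    for cur in current:
        # Similarity of this proposal to every support proposal.
        sims = [dot(cur, sup) for sup in support]
        # Softmax the similarities into attention weights.
        m = max(sims)
        exps = [math.exp(s - m) for s in sims]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Weighted sum of support features, fused back by addition.
        agg = [sum(w * sup[i] for w, sup in zip(weights, support))
               for i in range(len(cur))]
        augmented.append([c + a for c, a in zip(cur, agg)])
    return augmented

cur = [[1.0, 0.0], [0.0, 1.0]]
sup = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]]
out = temporal_relation(cur, sup)
print(len(out), len(out[0]))  # 2 2
```

Because the aggregation is a weighted sum over however many support proposals are given, the same step works for arbitrarily long temporal windows.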
Predicting Historical Phonetic Features using Deep Neural Networks: A Case Study of the Phonetic System of Proto-Indo-European
Title | Predicting Historical Phonetic Features using Deep Neural Networks: A Case Study of the Phonetic System of Proto-Indo-European |
Authors | Frederik Hartmann |
Abstract | Traditional historical linguistics lacks the means to empirically assess its assumptions regarding the phonetic systems of past languages and language stages, since most current methods rely on comparative tools to gain insights into the phonetic features of sounds in proto- or ancestor languages. The paper at hand presents a computational method based on deep neural networks to predict phonetic features of historical sounds whose exact quality is unknown, and to test the overall coherence of reconstructed historical phonetic features. The method utilizes the principles of coarticulation, local predictability, and statistical phonological constraints to predict the phonetic features of a sound from the features of its immediate phonetic environment. The validity of this method is assessed using New High German phonetic data, and its specific application to diachronic linguistics is demonstrated in a case study of the phonetic system of Proto-Indo-European. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4713/ | https://www.aclweb.org/anthology/W19-4713 |
PWC | https://paperswithcode.com/paper/predicting-historical-phonetic-features-using |
Repo | |
Framework | |
Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study
Title | Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study |
Authors | Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou |
Abstract | Sequence-to-sequence (seq2seq) models have achieved tremendous success in text generation tasks. However, there is no guarantee that they can always generate sentences without grammatical errors. In this paper, we present a preliminary empirical study on whether and how much automatic grammatical error correction can help improve seq2seq text generation. We conduct experiments across various seq2seq text generation tasks including machine translation, formality style transfer, and sentence compression and simplification. Experiments show that a state-of-the-art grammatical error correction system can improve the grammaticality of generated text and can bring task-oriented improvements in tasks where target sentences are in a formal style. |
Tasks | Grammatical Error Correction, Machine Translation, Sentence Compression, Style Transfer, Text Generation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1609/ | https://www.aclweb.org/anthology/P19-1609 |
PWC | https://paperswithcode.com/paper/automatic-grammatical-error-correction-for |
Repo | |
Framework | |
Thompson Sampling and Approximate Inference
Title | Thompson Sampling and Approximate Inference |
Authors | My Phan, Yasin Abbasi Yadkori, Justin Domke |
Abstract | We study the effects of approximate inference on the performance of Thompson sampling in $k$-armed bandit problems. Thompson sampling is a successful algorithm for online decision-making but requires posterior inference, which often must be approximated in practice. We show that even a small constant inference error (in $\alpha$-divergence) can lead to poor performance (linear regret) due to under-exploration (for $\alpha<1$) or over-exploration (for $\alpha>0$) by the approximation. While for $\alpha > 0$ this is unavoidable, for $\alpha \leq 0$ the regret can be improved by adding a small amount of forced exploration, even when the inference error is a large constant. |
Tasks | Decision Making |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9084-thompson-sampling-and-approximate-inference | http://papers.nips.cc/paper/9084-thompson-sampling-and-approximate-inference.pdf |
PWC | https://paperswithcode.com/paper/thompson-sampling-and-approximate-inference-1 |
Repo | |
Framework | |
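For context on the abstract above: exact Thompson sampling on a Bernoulli $k$-armed bandit keeps a Beta posterior per arm, samples a mean from each posterior every round, and pulls the argmax. A textbook sketch of this exact-inference baseline (not the paper's approximate-inference variant; function and parameter names are hypothetical):

```python
import random

def thompson_bernoulli(true_means, horizon, seed=0):
    """Exact Thompson sampling for a Bernoulli k-armed bandit.

    Keeps a Beta(successes + 1, failures + 1) posterior per arm; each
    round samples a mean from every posterior and pulls the arm with
    the largest sampled mean.
    """
    rng = random.Random(seed)
    k = len(true_means)
    succ = [0] * k
    fail = [0] * k
    total_reward = 0
    for _ in range(horizon):
        # One posterior sample per arm.
        samples = [rng.betavariate(succ[i] + 1, fail[i] + 1) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        total_reward += reward
        if reward:
            succ[arm] += 1
        else:
            fail[arm] += 1
    return total_reward, succ, fail

reward, succ, fail = thompson_bernoulli([0.2, 0.8], horizon=500)
# The 0.8 arm should accumulate the large majority of the pulls.
```

The paper's point is that replacing the exact Beta posterior sample with an approximate one can break this exploration behavior, depending on the sign of the $\alpha$-divergence used for the approximation.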
Aligning Artificial Neural Networks to the Brain yields Shallow Recurrent Architectures
Title | Aligning Artificial Neural Networks to the Brain yields Shallow Recurrent Architectures |
Authors | Jonas Kubilius, Martin Schrimpf, Ha Hong, Najib J. Majaj, Rishi Rajalingham, Elias B. Issa, Kohitij Kar, Pouya Bashivan, Jonathan Prescott-Roy, Kailyn Schmidt, Aran Nayebi, Daniel Bear, Daniel L. K. Yamins, James J. DiCarlo |
Abstract | Deep artificial neural networks with spatially repeated processing (a.k.a., deep convolutional ANNs) have been established as the best class of candidate models of visual processing in the primate ventral visual processing stream. Over the past five years, these ANNs have evolved from a simple feedforward eight-layer architecture in AlexNet to extremely deep and branching NASNet architectures, demonstrating increasingly better object categorization performance. Here we ask, as ANNs have continued to evolve in performance, are they also strong candidate models for the brain? To answer this question, we developed Brain-Score, a composite of neural and behavioral benchmarks for determining how brain-like a model is, together with an online platform where models can receive a Brain-Score and compare against other models. Despite high scores, typical deep models from the machine learning community are often hard to map onto the brain’s anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. To further map onto anatomy and validate our approach, we built CORnet-S: an ANN guided by Brain-Score with the anatomical constraints of compactness and recurrence. Although a shallow model with four anatomically mapped areas and recurrent connectivity, CORnet-S is a top model on Brain-Score and outperforms similarly compact models on ImageNet. Analyzing CORnet-S circuitry variants revealed recurrence as the main predictive factor of both Brain-Score and ImageNet top-1 performance. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJeY6sR9KX | https://openreview.net/pdf?id=BJeY6sR9KX |
PWC | https://paperswithcode.com/paper/aligning-artificial-neural-networks-to-the |
Repo | |
Framework | |
Syntax-aware Semantic Role Labeling without Parsing
Title | Syntax-aware Semantic Role Labeling without Parsing |
Authors | Rui Cai, Mirella Lapata |
Abstract | In this paper we focus on learning dependency aware representations for semantic role labeling without recourse to an external parser. The backbone of our model is an LSTM-based semantic role labeler jointly trained with two auxiliary tasks: predicting the dependency label of a word and whether there exists an arc linking it to the predicate. The auxiliary tasks provide syntactic information that is specific to semantic role labeling and are learned from training data (dependency annotations) without relying on existing dependency parsers, which can be noisy (e.g., on out-of-domain data or infrequent constructions). Experimental results on the CoNLL-2009 benchmark dataset show that our model outperforms the state of the art in English, and consistently improves performance in other languages, including Chinese, German, and Spanish. |
Tasks | Semantic Role Labeling |
Published | 2019-03-01 |
URL | https://www.aclweb.org/anthology/Q19-1022/ | https://www.aclweb.org/anthology/Q19-1022 |
PWC | https://paperswithcode.com/paper/syntax-aware-semantic-role-labeling-without |
Repo | |
Framework | |
Gradient-based learning for F-measure and other performance metrics
Title | Gradient-based learning for F-measure and other performance metrics |
Authors | Yu Gai, Zheng Zhang, Kyunghyun Cho |
Abstract | Many important classification performance metrics, e.g., the $F$-measure, are non-differentiable and non-decomposable, and are thus unfriendly to the gradient descent algorithm. Consequently, despite their popularity as evaluation metrics, these metrics are rarely optimized as training objectives in the neural network community. In this paper, we propose an empirical utility maximization scheme with provable learning guarantees to address the non-differentiability of these metrics. We then derive a strongly consistent gradient estimator to handle non-decomposability. These innovations enable end-to-end optimization of these metrics with the same computational complexity as optimizing a decomposable and differentiable metric, e.g., the cross-entropy loss. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=H1zxjsCqKQ | https://openreview.net/pdf?id=H1zxjsCqKQ |
PWC | https://paperswithcode.com/paper/gradient-based-learning-for-f-measure-and |
Repo | |
Framework | |
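A common way to see why $F$-measure resists gradient descent, and the flavor of workaround the abstract above pursues, is the "soft" F1 surrogate: hard 0/1 predictions are replaced by probabilities so the TP/FP/FN counts become differentiable. This generic sketch (names hypothetical, and simpler than the paper's utility-maximization scheme) illustrates the idea:

```python
def soft_f1(probs, labels, eps=1e-8):
    """Differentiable surrogate for F1.

    Hard predictions are replaced by predicted probabilities of the
    positive class, so true positives, false positives, and false
    negatives become soft counts.

    probs:  predicted probabilities of the positive class
    labels: 0/1 ground-truth labels
    Returns a loss in [0, 1]; minimizing it maximizes soft F1.
    """
    tp = sum(p * y for p, y in zip(probs, labels))
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    fn = sum((1 - p) * y for p, y in zip(probs, labels))
    f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - f1

# Confident, correct predictions give (near) zero loss; hedged
# predictions on the same labels give a strictly larger loss.
good = soft_f1([1.0, 0.0, 1.0], [1, 0, 1])   # ≈ 0
hedged = soft_f1([0.5, 0.5, 0.5], [1, 0, 1])  # larger
```

Note that this surrogate is still non-decomposable (the loss is over the whole batch, not a sum over examples), which is the part the paper's consistent gradient estimator is designed to handle.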
Step-wise Refinement Classification Approach for Enterprise Legal Litigation
Title | Step-wise Refinement Classification Approach for Enterprise Legal Litigation |
Authors | Ying Mao, Xian Wang, Jianbo Tang, Changliang Li |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5505/ | https://www.aclweb.org/anthology/W19-5505 |
PWC | https://paperswithcode.com/paper/step-wise-refinement-classification-approach |
Repo | |
Framework | |
Proceedings of the First Workshop on Financial Technology and Natural Language Processing
Title | Proceedings of the First Workshop on Financial Technology and Natural Language Processing |
Authors | |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5500/ | https://www.aclweb.org/anthology/W19-5500 |
PWC | https://paperswithcode.com/paper/proceedings-of-the-first-workshop-on-12 |
Repo | |
Framework | |
Equity Beyond Bias in Language Technologies for Education
Title | Equity Beyond Bias in Language Technologies for Education |
Authors | Elijah Mayfield, Michael Madaio, Shrimai Prabhumoye, David Gerritsen, Brittany McLaughlin, Ezekiel Dixon-Román, Alan W Black |
Abstract | There is a long record of research on equity in schools. As machine learning researchers begin to study fairness and bias in earnest, language technologies in education have an unusually strong theoretical and applied foundation to build on. Here, we introduce concepts from culturally relevant pedagogy and other frameworks for teaching and learning, identifying future work on equity in NLP. We present case studies in a range of topics like intelligent tutoring systems, computer-assisted language learning, automated essay scoring, and sentiment analysis in classrooms, and provide an actionable agenda for research. |
Tasks | Sentiment Analysis |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4446/ | https://www.aclweb.org/anthology/W19-4446 |
PWC | https://paperswithcode.com/paper/equity-beyond-bias-in-language-technologies |
Repo | |
Framework | |
Representation Compression and Generalization in Deep Neural Networks
Title | Representation Compression and Generalization in Deep Neural Networks |
Authors | Ravid Shwartz-Ziv, Amichai Painsky, Naftali Tishby |
Abstract | Understanding the groundbreaking performance of Deep Neural Networks is one of the greatest challenges to the scientific community today. In this work, we introduce an information-theoretic viewpoint on the behavior of deep networks’ optimization processes and their generalization abilities, by studying the Information Plane: the plane of the mutual information between the input variable and the desired label, for each hidden layer. Specifically, we show that the training of the network is characterized by a rapid increase in the mutual information (MI) between the layers and the target label, followed by a longer decrease in the MI between the layers and the input variable. Further, we explicitly show that these two fundamental information-theoretic quantities correspond to the generalization error of the network, as a result of introducing a new generalization bound that is exponential in the representation compression. The analysis focuses on typical patterns of large-scale problems. For this purpose, we introduce a novel analytic bound on the mutual information between consecutive layers in the network. An important consequence of our analysis is a super-linear boost in training time with the number of non-degenerate hidden layers, demonstrating the computational benefit of the hidden layers. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SkeL6sCqK7 | https://openreview.net/pdf?id=SkeL6sCqK7 |
PWC | https://paperswithcode.com/paper/representation-compression-and-generalization |
Repo | |
Framework | |
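The Information Plane analysis in the abstract above rests on estimating mutual information between a (discretized) layer representation and the input or the label. A minimal plug-in estimate over paired discrete samples can be sketched as follows (an illustration of the quantity being tracked, not the paper's analytic bound; names are hypothetical):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples.

    Builds empirical marginals and the joint from the samples and sums
    p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    """
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # p_joint / (p(x) * p(y)) == c * n / (px[x] * py[y])
        mi += p_joint * math.log2(c * n / (px[x] * py[y]))
    return mi

# X fully determines Y here, so I(X; Y) equals H(X) = 1 bit.
xs = [0, 0, 1, 1] * 25
ys = list(xs)
print(mutual_information(xs, ys))  # 1.0
```

In an Information Plane study, one would compute this quantity between binned hidden activations and the input, and between the same activations and the label, at successive training epochs.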
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Title | Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019) |
Authors | |
Abstract | |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-6100/ | https://www.aclweb.org/anthology/D19-6100 |
PWC | https://paperswithcode.com/paper/proceedings-of-the-2nd-workshop-on-deep |
Repo | |
Framework | |
Cross-Cultural Transfer Learning for Text Classification
Title | Cross-Cultural Transfer Learning for Text Classification |
Authors | Dor Ringel, Gal Lavee, Ido Guy, Kira Radinsky |
Abstract | Large training datasets are required to achieve competitive performance in most natural language tasks. The acquisition process for these datasets is labor-intensive, expensive, and time-consuming. This process is also prone to human errors. In this work, we show that cross-cultural differences can be harnessed for natural language text classification. We present a transfer-learning framework that leverages widely available unaligned bilingual corpora for classification tasks, using no task-specific data. Our empirical evaluation on two tasks – formality classification and sarcasm detection – shows that the cross-cultural difference between German and American English, as manifested in product review text, can be applied to achieve good performance for formality classification, while the difference between Japanese and American English can be applied to achieve good performance for sarcasm detection – both without any task-specific labeled data. |
Tasks | Sarcasm Detection, Text Classification, Transfer Learning |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1400/ | https://www.aclweb.org/anthology/D19-1400 |
PWC | https://paperswithcode.com/paper/cross-cultural-transfer-learning-for-text |
Repo | |
Framework | |
Siamese Networks: The Tale of Two Manifolds
Title | Siamese Networks: The Tale of Two Manifolds |
Authors | Soumava Kumar Roy, Mehrtash Harandi, Richard Nock, Richard Hartley |
Abstract | Siamese networks are non-linear deep models that have found their way into a broad set of problems in learning theory, thanks to their embedding capabilities. In this paper, we study Siamese networks from a new perspective and question the validity of their training procedure. We show that in the majority of cases, the objective of a Siamese network is endowed with an invariance property. Neglecting this invariance property hinders the training of Siamese networks. To alleviate this issue, we propose two Riemannian structures and generalize a well-established accelerated stochastic gradient descent method to take the proposed Riemannian structures into account. Our empirical evaluations suggest that by making use of the Riemannian geometry, we achieve state-of-the-art results against several algorithms for the challenging problem of fine-grained image classification. |
Tasks | Fine-Grained Image Classification, Image Classification |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Roy_Siamese_Networks_The_Tale_of_Two_Manifolds_ICCV_2019_paper.html | http://openaccess.thecvf.com/content_ICCV_2019/papers/Roy_Siamese_Networks_The_Tale_of_Two_Manifolds_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/siamese-networks-the-tale-of-two-manifolds |
Repo | |
Framework | |
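A Siamese objective of the kind discussed above compares embeddings of paired inputs, pulling matching pairs together and pushing mismatched pairs apart. A standard contrastive loss sketch for one pair (a generic Euclidean baseline, not the paper's Riemannian formulation; names are hypothetical):

```python
import math

def contrastive_loss(emb_a, emb_b, same, margin=1.0):
    """Standard contrastive loss for one pair of embeddings.

    same=1: pull the pair together (loss is the squared distance).
    same=0: push the pair at least `margin` apart (hinge on distance).
    """
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2

# A matching pair that is already close incurs a small loss; the same
# close pair labeled as mismatched is penalized toward the margin.
pos = contrastive_loss([0.0, 0.0], [0.1, 0.0], same=1)  # ≈ 0.01
neg = contrastive_loss([0.0, 0.0], [0.1, 0.0], same=0)  # ≈ 0.81
```

The paper's observation, roughly, is that such objectives are invariant under certain transformations of the embeddings, and that optimizing on a Riemannian structure respecting that invariance trains better than plain Euclidean SGD.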