Paper Group NANR 11
An Automatic Operation Batching Strategy for the Backward Propagation of Neural Networks Having Dynamic Computation Graphs
Title | An Automatic Operation Batching Strategy for the Backward Propagation of Neural Networks Having Dynamic Computation Graphs |
Authors | Yuchen Qiao, Kenjiro Taura |
Abstract | Organizing the same operations in the computation graph of a neural network into batches is one of the important methods to improve the speed of training deep learning models and applications since it helps to execute operations with the same type in parallel and to make full use of the available hardware resources. This batching task is usually done by the developers manually and it becomes more difficult when the neural networks have dynamic computation graphs because of the input data with varying structures or the dynamic flow control. Several automatic batching strategies were proposed and integrated into some deep learning toolkits so that the programmers don’t have to be responsible for this task. These strategies, however, will miss some important opportunities to group the operations in the backward propagation of training neural networks. In this paper, we proposed a strategy which provides more efficient automatic batching and brings benefits to the memory access in the backward propagation. We also test our strategy on a variety of benchmarks with dynamic computation graphs. The result shows that it really brings further improvements in the training speed when our strategy is working with the existing automatic strategies. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SkxXwo0qYm |
https://openreview.net/pdf?id=SkxXwo0qYm | |
PWC | https://paperswithcode.com/paper/an-automatic-operation-batching-strategy-for |
Repo | |
Framework | |
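The general idea of batching same-type operations, as described in the abstract above, can be illustrated with a small sketch. This is not the authors' backward-propagation strategy; it is a plain depth-based grouping of forward operations, and the graph encoding (`nodes` as a dict of name to `(op_type, inputs)`) and the function name are hypothetical choices made for illustration only.

```python
from collections import defaultdict


def batch_by_depth_and_type(nodes):
    """Group graph nodes that share an operation type and a depth.

    nodes: dict mapping node name -> (op_type, list of input node names).
    Returns a list of (depth, op_type, sorted node names), ordered so that
    every batch only depends on batches that appear earlier in the list.
    """
    depth = {}

    def node_depth(name):
        # Depth is 0 for leaves, otherwise 1 + the deepest input.
        if name not in depth:
            _, inputs = nodes[name]
            depth[name] = 1 + max((node_depth(i) for i in inputs), default=-1)
        return depth[name]

    groups = defaultdict(list)
    for name, (op_type, _) in nodes.items():
        groups[(node_depth(name), op_type)].append(name)

    return [(d, op, sorted(names)) for (d, op), names in sorted(groups.items())]


# A tiny tree-shaped graph: three embedding lookups feeding two tanh nodes.
example_graph = {
    "x1": ("embed", []),
    "x2": ("embed", []),
    "x3": ("embed", []),
    "h1": ("tanh", ["x1", "x2"]),
    "h2": ("tanh", ["h1", "x3"]),
}
batches = batch_by_depth_and_type(example_graph)
```

Here the three `embed` nodes land in one batch, while the two `tanh` nodes stay separate because one depends on the other.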
Deep Learning and Sociophonetics: Automatic Coding of Rhoticity Using Neural Networks
Title | Deep Learning and Sociophonetics: Automatic Coding of Rhoticity Using Neural Networks |
Authors | Sarah Gupta, Anthony DiPadova |
Abstract | Automated extraction methods are widely available for vowels, but automated methods for coding rhoticity have lagged far behind. R-fulness versus r-lessness (in words like park, store, etc.) is a classic and frequently cited variable, but it is still commonly coded by human analysts rather than automated methods. Human-coding requires extensive resources and lacks replicability, making it difficult to compare large datasets across research groups. Can reliable automated methods be developed to aid in coding rhoticity? In this study, we use Neural Networks/Deep Learning, training our model on 208 Boston-area speakers. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-3013/ |
https://www.aclweb.org/anthology/N19-3013 | |
PWC | https://paperswithcode.com/paper/deep-learning-and-sociophonetics-automatic |
Repo | |
Framework | |
Abstraction based Output Range Analysis for Neural Networks
Title | Abstraction based Output Range Analysis for Neural Networks |
Authors | Pavithra Prabhakar, Zahra Rahimi Afzal |
Abstract | In this paper, we consider the problem of output range analysis for feed-forward neural networks. The current approaches reduce the problem to satisfiability and optimization solving which are NP-hard problems, and whose computational complexity increases with the number of neurons in the network. We present a novel abstraction technique that constructs a simpler neural network with fewer neurons, albeit with interval weights called interval neural network (INN) which over-approximates the output range of the given neural network. We reduce the output range analysis on the INNs to solving a mixed integer linear programming problem. Our experimental results highlight the trade-off between the computation time and the precision of the computed output range. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9708-abstraction-based-output-range-analysis-for-neural-networks |
http://papers.nips.cc/paper/9708-abstraction-based-output-range-analysis-for-neural-networks.pdf | |
PWC | https://paperswithcode.com/paper/abstraction-based-output-range-analysis-for |
Repo | |
Framework | |
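To make the notion of an over-approximated output range concrete, the sketch below pushes an input box through one ReLU layer whose weights are intervals, using naive interval arithmetic. This is a deliberately simpler technique than the mixed integer linear programming formulation the abstract describes, and it generally gives looser bounds; the function names and the data layout are assumptions made for illustration.

```python
def interval_mul(a, b):
    """Product of two intervals a = (lo, hi) and b = (lo, hi)."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def interval_relu_layer(inputs, weights, biases):
    """Bound the outputs of one ReLU layer with interval weights.

    inputs:  list of (lo, hi) bounds, one per input neuron.
    weights: weights[j][i] is the (lo, hi) weight from input i to neuron j.
    biases:  biases[j] is the (lo, hi) bias of neuron j.
    Returns a list of (lo, hi) bounds that contain every reachable output.
    """
    outputs = []
    for weight_row, bias in zip(weights, biases):
        lo, hi = bias
        for x, w in zip(inputs, weight_row):
            p_lo, p_hi = interval_mul(x, w)
            lo += p_lo
            hi += p_hi
        # ReLU is monotone, so it can be applied to each bound separately.
        outputs.append((max(lo, 0.0), max(hi, 0.0)))
    return outputs


# One neuron, two inputs: x0 in [0, 1], x1 in [-1, 1],
# weight on x0 in [1, 2], weight on x1 exactly -1, zero bias.
bounds = interval_relu_layer(
    inputs=[(0.0, 1.0), (-1.0, 1.0)],
    weights=[[(1.0, 2.0), (-1.0, -1.0)]],
    biases=[(0.0, 0.0)],
)
print(bounds)  # [(0.0, 3.0)]
```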
My Turn To Read: An Interleaved E-book Reading Tool for Developing and Struggling Readers
Title | My Turn To Read: An Interleaved E-book Reading Tool for Developing and Struggling Readers |
Authors | Nitin Madnani, Beata Beigman Klebanov, Anastassia Loukina, Binod Gyawali, Patrick Lange, John Sabatini, Michael Flor |
Abstract | Literacy is crucial for functioning in modern society. It underpins everything from educational attainment and employment opportunities to health outcomes. We describe My Turn To Read, an app that uses interleaved reading to help developing and struggling readers improve reading skills while reading for meaning and pleasure. We hypothesize that the longer-term impact of the app will be to help users become better, more confident readers with an increased stamina for extended reading. We describe the technology and present preliminary evidence in support of this hypothesis. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-3024/ |
https://www.aclweb.org/anthology/P19-3024 | |
PWC | https://paperswithcode.com/paper/my-turn-to-read-an-interleaved-e-book-reading |
Repo | |
Framework | |
HR-TD: A Regularized TD Method to Avoid Over-Generalization
Title | HR-TD: A Regularized TD Method to Avoid Over-Generalization |
Authors | Ishan Durugkar, Bo Liu, Peter Stone |
Abstract | Temporal Difference learning with function approximation has been widely used recently and has led to several successful results. However, compared with the original tabular-based methods, one major drawback of temporal difference learning with neural networks and other function approximators is that they tend to over-generalize across temporally successive states, resulting in slow convergence and even instability. In this work, we propose a novel TD learning method, Hadamard product Regularized TD (HR-TD), that reduces over-generalization and thus leads to faster convergence. This approach can be easily applied to both linear and nonlinear function approximators. HR-TD is evaluated on several linear and nonlinear benchmark domains, where we show improvement in learning behavior and performance. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=rylbWhC5Ym |
https://openreview.net/pdf?id=rylbWhC5Ym | |
PWC | https://paperswithcode.com/paper/hr-td-a-regularized-td-method-to-avoid-over |
Repo | |
Framework | |
Data Augmentation by Data Noising for Open-vocabulary Slots in Spoken Language Understanding
Title | Data Augmentation by Data Noising for Open-vocabulary Slots in Spoken Language Understanding |
Authors | Hwa-Yeon Kim, Yoon-Hyung Roh, Young-Kil Kim |
Abstract | One of the main challenges in Spoken Language Understanding (SLU) is dealing with ‘open-vocabulary’ slots. Recently, SLU models based on neural network were proposed, but it is still difficult to recognize the slots of unknown words or ‘open-vocabulary’ slots because of the high cost of creating a manually tagged SLU dataset. This paper proposes data noising, which reflects the characteristics of the ‘open-vocabulary’ slots, for data augmentation. We applied it to an attention based bi-directional recurrent neural network (Liu and Lane, 2016) and experimented with three datasets: Airline Travel Information System (ATIS), Snips, and MIT-Restaurant. We achieved performance improvements of up to 0.57% and 3.25 in intent prediction (accuracy) and slot filling (f1-score), respectively. Our method is advantageous because it does not require additional memory and it can be applied simultaneously with the training process of the model. |
Tasks | Data Augmentation, Slot Filling, Spoken Language Understanding |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-3014/ |
https://www.aclweb.org/anthology/N19-3014 | |
PWC | https://paperswithcode.com/paper/data-augmentation-by-data-noising-for-open |
Repo | |
Framework | |
Triad Constraints for Learning Causal Structure of Latent Variables
Title | Triad Constraints for Learning Causal Structure of Latent Variables |
Authors | Ruichu Cai, Feng Xie, Clark Glymour, Zhifeng Hao, Kun Zhang |
Abstract | Learning causal structure from observational data has attracted much attention, and it is notoriously challenging to find the underlying structure in the presence of confounders (hidden direct common causes of two variables). In this paper, by properly leveraging the non-Gaussianity of the data, we propose to estimate the structure over latent variables with the so-called Triad constraints: we design a form of “pseudo-residual” from three variables, and show that when causal relations are linear and noise terms are non-Gaussian, the causal direction between the latent variables for the three observed variables is identifiable by checking a certain kind of independence relationship. In other words, the Triad constraints help us to locate latent confounders and determine the causal direction between them. This goes far beyond the Tetrad constraints and reveals more information about the underlying structure from non-Gaussian data. Finally, based on the Triad constraints, we develop a two-step algorithm to learn the causal structure corresponding to measurement models. Experimental results on both synthetic and real data demonstrate the effectiveness and reliability of our method. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9448-triad-constraints-for-learning-causal-structure-of-latent-variables |
http://papers.nips.cc/paper/9448-triad-constraints-for-learning-causal-structure-of-latent-variables.pdf | |
PWC | https://paperswithcode.com/paper/triad-constraints-for-learning-causal |
Repo | |
Framework | |
Divisive Language and Propaganda Detection using Multi-head Attention Transformers with Deep Learning BERT-based Language Models for Binary Classification
Title | Divisive Language and Propaganda Detection using Multi-head Attention Transformers with Deep Learning BERT-based Language Models for Binary Classification |
Authors | Norman Mapes, Anna White, Radhika Medury, Sumeet Dua |
Abstract | On the NLP4IF 2019 sentence level propaganda classification task, we used a BERT language model that was pre-trained on Wikipedia and BookCorpus as team ltuorp ranking #1 of 26. It uses deep learning in the form of an attention transformer. We substituted the final layer of the neural network to a linear real valued output neuron from a layer of softmaxes. The backpropagation trained the entire neural network and not just the last layer. Training took 3 epochs and on our computation resources this took approximately one day. The pre-trained model consisted of uncased words and there were 12-layers, 768-hidden neurons with 12-heads for a total of 110 million parameters. The articles used in the training data promote divisive language similar to state-actor-funded influence operations on social media. Twitter shows state-sponsored examples designed to maximize division occurring across political lines, ranging from “Obama calls me a clinger, Hillary calls me deplorable, ... and Trump calls me an American” oriented to the political right, to Russian propaganda featuring “Black Lives Matter” material with suggestions of institutional racism in US police forces oriented to the political left. We hope that raising awareness through our work will reduce the polarizing dialogue for the betterment of nations. |
Tasks | Language Modelling |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5014/ |
https://www.aclweb.org/anthology/D19-5014 | |
PWC | https://paperswithcode.com/paper/divisive-language-and-propaganda-detection |
Repo | |
Framework | |
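The figure of roughly 110 million parameters quoted in the abstract above can be checked with a short calculation. The abstract gives only the layer count, hidden size, and head count; the vocabulary size (30522), maximum position count (512), segment-type count (2), and feed-forward width (3072) used below are assumptions taken from the commonly published BERT-base uncased configuration, not from this listing.

```python
def bert_parameter_count(
    layers=12,            # from the abstract
    hidden=768,           # from the abstract
    vocab=30522,          # assumption: BERT-base uncased vocabulary size
    max_positions=512,    # assumption: learned position embeddings
    type_vocab=2,         # assumption: segment A / segment B embeddings
    intermediate=3072,    # assumption: feed-forward width, 4 * hidden
):
    """Count weights and biases of a BERT-style encoder with a pooler."""
    layer_norm = 2 * hidden  # scale and shift vectors
    embeddings = (vocab + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections; the number of heads only
    # splits these matrices and does not change the parameter count.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


total = bert_parameter_count()
print(total)  # 109482240
```

Under these assumptions the total is about 109.5 million, which matches the rounded 110 million in the abstract. Replacing the softmax head with a single linear output neuron, as the authors describe, would add only `hidden + 1` parameters on top of this.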
Proceedings of the Workshop on Language Technology for Digital Historical Archives
Title | Proceedings of the Workshop on Language Technology for Digital Historical Archives |
Authors | |
Abstract | |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-9000/ |
https://www.aclweb.org/anthology/W19-9000 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-workshop-on-language-3 |
Repo | |
Framework | |
Sea-Thru: A Method for Removing Water From Underwater Images
Title | Sea-Thru: A Method for Removing Water From Underwater Images |
Authors | Derya Akkaynak, Tali Treibitz |
Abstract | Robust recovery of lost colors in underwater images remains a challenging problem. We recently showed that this was partly due to the prevalent use of an atmospheric image formation model for underwater images. We proposed a physically accurate model that explicitly showed: 1) the attenuation coefficient of the signal is not uniform across the scene but depends on object range and reflectance, 2) the coefficient governing the increase in backscatter with distance differs from the signal attenuation coefficient. Here, we present a method that recovers color with the revised model using RGBD images. The Sea-thru method first calculates backscatter using the darkest pixels in the image and their known range information. Then, it uses an estimate of the spatially varying illuminant to obtain the range-dependent attenuation coefficient. Using more than 1,100 images from two optically different water bodies, which we make available, we show that our method outperforms those using the atmospheric model. Consistent removal of water will open up large underwater datasets to powerful computer vision and machine learning algorithms, creating exciting opportunities for the future of underwater exploration and conservation. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Akkaynak_Sea-Thru_A_Method_for_Removing_Water_From_Underwater_Images_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Akkaynak_Sea-Thru_A_Method_for_Removing_Water_From_Underwater_Images_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/sea-thru-a-method-for-removing-water-from |
Repo | |
Framework | |
An adaptive Mirror-Prox method for variational inequalities with singular operators
Title | An adaptive Mirror-Prox method for variational inequalities with singular operators |
Authors | Kimon Antonakopoulos, Veronica Belmega, Panayotis Mertikopoulos |
Abstract | Lipschitz continuity is a central requirement for achieving the optimal O(1/T) rate of convergence in monotone, deterministic variational inequalities (a setting that includes convex minimization, convex-concave optimization, nonatomic games, and many other problems). However, in many cases of practical interest, the operator defining the variational inequality may become singular at the boundary of the feasible region, precluding in this way the use of fast gradient methods that attain this rate (such as Nemirovski’s mirror-prox algorithm and its variants). To address this issue, we propose a novel smoothness condition which we call Bregman smoothness, and which relates the variation of the operator to that of a suitably chosen Bregman function. Leveraging this condition, we derive an adaptive mirror prox algorithm which attains an O(1/T) rate of convergence in problems with possibly singular operators, without any prior knowledge of the problem’s Bregman constant (the Bregman analogue of the Lipschitz constant). We also present an extension of our algorithm to stochastic variational inequalities where the algorithm achieves a $O(1/\sqrt{T})$ convergence rate. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9053-an-adaptive-mirror-prox-method-for-variational-inequalities-with-singular-operators |
http://papers.nips.cc/paper/9053-an-adaptive-mirror-prox-method-for-variational-inequalities-with-singular-operators.pdf | |
PWC | https://paperswithcode.com/paper/an-adaptive-mirror-prox-method-for |
Repo | |
Framework | |
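For readers unfamiliar with mirror-prox, the sketch below shows its simplest special case: with a Euclidean setup the method reduces to the projected extragradient scheme, which takes a look-ahead step and then updates using the operator evaluated at the look-ahead point. This is the classical fixed-step method, not the adaptive Bregman-smooth variant proposed in the paper; the test problem (the bilinear saddle point of x·y on a box), the step size, and all names are assumptions chosen for illustration.

```python
def project_box(z, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^2."""
    return tuple(min(hi, max(lo, v)) for v in z)


def operator(z):
    """Monotone operator of min_x max_y x*y: F(x, y) = (y, -x)."""
    x, y = z
    return (y, -x)


def extragradient(z0, step=0.1, iterations=2000):
    """Fixed-step Euclidean mirror-prox (extragradient).

    Returns the last iterate and the average of the look-ahead points;
    the classical O(1/T) guarantee is stated for that average.
    """
    z = z0
    total = [0.0, 0.0]
    for _ in range(iterations):
        g = operator(z)
        # Look-ahead (extrapolation) step from the current point.
        z_half = project_box((z[0] - step * g[0], z[1] - step * g[1]))
        g_half = operator(z_half)
        # Update from the current point using the look-ahead operator value.
        z = project_box((z[0] - step * g_half[0], z[1] - step * g_half[1]))
        total[0] += z_half[0]
        total[1] += z_half[1]
    average = (total[0] / iterations, total[1] / iterations)
    return z, average


last_iterate, averaged = extragradient((0.5, 0.5))
```

On this problem the unique solution is (0, 0). Plain simultaneous gradient steps spiral outward here, while the look-ahead step makes the iterates spiral inward.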
Evaluating Research Novelty Detection: Counterfactual Approaches
Title | Evaluating Research Novelty Detection: Counterfactual Approaches |
Authors | Reinald Kim Amplayo, Seung-won Hwang, Min Song |
Abstract | In this paper, we explore strategies to evaluate models for the task research paper novelty detection: Given all papers released at a given date, which of the papers discuss new ideas and influence future research? We find the novelty is not a singular concept, and thus inherently lacks of ground truth annotations with cross-annotator agreement, which is a major obstacle in evaluating these models. Test-of-time award is closest to such annotation, which can only be made retrospectively and is extremely scarce. We thus propose to compare and evaluate models using counterfactual simulations. First, we ask models if they can differentiate papers at time $t$ and counterfactual paper from future time $t+d$. Second, we ask models if they can predict test-of-time award at $t+d$. These are proxies that can be agreed by human annotators and easily augmented by correlated signals, using which evaluation can be done through four tasks: classification, ranking, correlation and feature selection. We show these proxy evaluation methods complement each other regarding error handling, coverage, interpretability, and scope, and thus altogether contribute to the observation of the relative strength of existing models. |
Tasks | Feature Selection |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5315/ |
https://www.aclweb.org/anthology/D19-5315 | |
PWC | https://paperswithcode.com/paper/evaluating-research-novelty-detection |
Repo | |
Framework | |
Detecting harassment in real-time as conversations develop
Title | Detecting harassment in real-time as conversations develop |
Authors | Wessel Stoop, Florian Kunneman, Antal van den Bosch, Ben Miller |
Abstract | We developed a machine-learning-based method to detect video game players that harass teammates or opponents in chat earlier in the conversation. This real-time technology would allow gaming companies to intervene during games, such as issue warnings or muting or banning a player. In a proof-of-concept experiment on League of Legends data we compute and visualize evaluation metrics for a machine learning classifier as conversations unfold, and observe that the optimal precision and recall of detecting toxic players at each moment in the conversation depends on the confidence threshold of the classifier: the threshold should start low, and increase as the conversation unfolds. How fast this sliding threshold should increase depends on the training set size. |
Tasks | League of Legends |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3503/ |
https://www.aclweb.org/anthology/W19-3503 | |
PWC | https://paperswithcode.com/paper/detecting-harassment-in-real-time-as |
Repo | |
Framework | |
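The sliding threshold described in the abstract above (start low, rise as the conversation unfolds) can be sketched as follows. The linear schedule and the specific values of `start`, `rate`, and `cap` are hypothetical; the abstract says only that the threshold should increase and that the rate depends on the training set size.

```python
def sliding_threshold(message_index, start=0.5, rate=0.02, cap=0.95):
    """Confidence threshold that grows linearly with the message index.

    All three parameters are illustrative placeholders, not values
    reported by the authors.
    """
    return min(cap, start + rate * message_index)


def first_flag(confidences, **schedule):
    """Return the index of the first message at which the classifier's
    confidence reaches the current threshold, or None if it never does."""
    for index, confidence in enumerate(confidences):
        if confidence >= sliding_threshold(index, **schedule):
            return index
    return None


# Classifier confidence that a player is toxic, after each chat message.
flagged_at = first_flag([0.40, 0.55, 0.60, 0.90])
never_flagged = first_flag([0.10, 0.20])
```

In this example the player is flagged at the second message (index 1), where the confidence of 0.55 exceeds the threshold of 0.52.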
Learning to Control the Fine-grained Sentiment for Story Ending Generation
Title | Learning to Control the Fine-grained Sentiment for Story Ending Generation |
Authors | Fuli Luo, Damai Dai, Pengcheng Yang, Tianyu Liu, Baobao Chang, Zhifang Sui, Xu Sun |
Abstract | Automatic story ending generation is an interesting and challenging task in natural language generation. Previous studies are mainly limited to generate coherent, reasonable and diversified story endings, and few works focus on controlling the sentiment of story endings. This paper focuses on generating a story ending which meets the given fine-grained sentiment intensity. There are two major challenges to this task. First is the lack of story corpus which has fine-grained sentiment labels. Second is the difficulty of explicitly controlling sentiment intensity when generating endings. Therefore, we propose a generic and novel framework which consists of a sentiment analyzer and a sentimental generator, respectively addressing the two challenges. The sentiment analyzer adopts a series of methods to acquire sentiment intensities of the story dataset. The sentimental generator introduces the sentiment intensity into decoder via a Gaussian Kernel Layer to control the sentiment of the output. To the best of our knowledge, this is the first endeavor to control the fine-grained sentiment for story ending generation without manually annotating sentiment labels. Experiments show that our proposed framework can generate story endings which are not only more coherent and fluent but also able to meet the given sentiment intensity better. |
Tasks | Text Generation |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1603/ |
https://www.aclweb.org/anthology/P19-1603 | |
PWC | https://paperswithcode.com/paper/learning-to-control-the-fine-grained |
Repo | |
Framework | |
The evolution of spatial rationales in Tesnière’s stemmas
Title | The evolution of spatial rationales in Tesnière’s stemmas |
Authors | Nicolas Mazziotta |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-7709/ |
https://www.aclweb.org/anthology/W19-7709 | |
PWC | https://paperswithcode.com/paper/the-evolution-of-spatial-rationales-in |
Repo | |
Framework | |