October 15, 2019

Paper Group NANR 179

Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design. Using Hedge Detection to Improve Committed Belief Tagging. Action Verb Corpus. No Spurious Local Minima in a Two Hidden Unit ReLU Network. Supervised Open Information Extraction. The Manifold Assumption and Defenses Against Adversarial Perturbations. Towards Robust Int …

Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design

Title Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design
Authors Daniel Neil, Marwin Segler, Laura Guasch, Mohamed Ahmed, Dean Plumbley, Matthew Sellwood, Nathan Brown
Abstract The design of small molecules with bespoke properties is of central importance to drug discovery. However, significant challenges remain for computational methods, despite recent advances such as deep recurrent networks and reinforcement learning strategies for sequence generation, and it can be difficult to compare results across different works. This work proposes 19 benchmarks selected by subject experts, expands smaller datasets previously used to approximately 1.1 million training molecules, and explores how to apply new reinforcement learning techniques effectively for molecular design. The benchmarks here, built as OpenAI Gym environments, will be open-sourced to encourage innovation in molecular design algorithms and to enable usage by those without a background in chemistry. Finally, this work explores recent developments in reinforcement-learning methods with excellent sample complexity (the A2C and PPO algorithms) and investigates their behavior in molecular generation, demonstrating significant performance gains compared to standard reinforcement learning techniques.
Tasks Drug Discovery
Published 2018-01-01
URL https://openreview.net/forum?id=HkcTe-bR-
PDF https://openreview.net/pdf?id=HkcTe-bR-
PWC https://paperswithcode.com/paper/exploring-deep-recurrent-models-with
Repo
Framework

Using Hedge Detection to Improve Committed Belief Tagging

Title Using Hedge Detection to Improve Committed Belief Tagging
Authors Morgan Ulinski, Seth Benjamin, Julia Hirschberg
Abstract We describe a novel method for identifying hedge terms using a set of manually constructed rules. We present experiments adding hedge features to a committed belief system to improve classification. We compare performance of this system (a) without hedging features, (b) with dictionary-based features, and (c) with rule-based features. We find that using hedge features improves performance of the committed belief system, particularly in identifying instances of non-committed belief and reported belief.
Tasks Sentence Classification
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-1301/
PDF https://www.aclweb.org/anthology/W18-1301
PWC https://paperswithcode.com/paper/using-hedge-detection-to-improve-committed
Repo
Framework

Action Verb Corpus

Title Action Verb Corpus
Authors Stephanie Gross, Matthias Hirschmanner, Brigitte Krenn, Friedrich Neubarth, Michael Zillich
Abstract
Tasks Action Classification, Language Acquisition, Language Modelling, Question Answering, Visual Question Answering
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1338/
PDF https://www.aclweb.org/anthology/L18-1338
PWC https://paperswithcode.com/paper/action-verb-corpus
Repo
Framework

No Spurious Local Minima in a Two Hidden Unit ReLU Network

Title No Spurious Local Minima in a Two Hidden Unit ReLU Network
Authors Chenwei Wu, Jiajun Luo, Jason D. Lee
Abstract Deep learning models can be efficiently optimized via stochastic gradient descent, but there is little theoretical evidence to support this. A key question in optimization is to understand when the optimization landscape of a neural network is amenable to gradient-based optimization. We focus on a simple neural network, a two-layer ReLU network with two hidden units, and show that all local minimizers are global. This, combined with recent work of Lee et al. (2017) and Lee et al. (2016), shows that gradient descent converges to the global minimizer.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=B14uJzW0b
PDF https://openreview.net/pdf?id=B14uJzW0b
PWC https://paperswithcode.com/paper/no-spurious-local-minima-in-a-two-hidden-unit
Repo
Framework

Supervised Open Information Extraction

Title Supervised Open Information Extraction
Authors Gabriel Stanovsky, Julian Michael, Luke Zettlemoyer, Ido Dagan
Abstract We present data and methods that enable a supervised learning approach to Open Information Extraction (Open IE). Central to the approach is a novel formulation of Open IE as a sequence tagging problem, addressing challenges such as encoding multiple extractions for a predicate. We also develop a bi-LSTM transducer, extending recent deep Semantic Role Labeling models to extract Open IE tuples and provide confidence scores for tuning their precision-recall tradeoff. Furthermore, we show that the recently released Question-Answer Meaning Representation dataset can be automatically converted into an Open IE corpus which significantly increases the amount of available training data. Our supervised model outperforms the existing state-of-the-art Open IE systems on benchmark datasets.
Tasks Knowledge Base Population, Natural Language Inference, Open Information Extraction, Question Answering, Semantic Role Labeling
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1081/
PDF https://www.aclweb.org/anthology/N18-1081
PWC https://paperswithcode.com/paper/supervised-open-information-extraction
Repo
Framework

The Manifold Assumption and Defenses Against Adversarial Perturbations

Title The Manifold Assumption and Defenses Against Adversarial Perturbations
Authors Xi Wu, Uyeong Jang, Lingjiao Chen, Somesh Jha
Abstract In the adversarial-perturbation problem of neural networks, an adversary starts with a neural network model $F$ and a point $\mathbf{x}$ that $F$ classifies correctly, and applies a small perturbation to $\mathbf{x}$ to produce another point $\mathbf{x}'$ that $F$ classifies incorrectly. In this paper, we propose taking into account the inherent confidence information produced by models when studying adversarial perturbations, where a natural measure of "confidence" is $\|F(\mathbf{x})\|_\infty$ (i.e. how confident $F$ is about its prediction?). Motivated by a thought experiment based on the manifold assumption, we propose a "goodness property" of models which states that confident regions of a good model should be well separated. We give formalizations of this property and examine existing robust training objectives in view of them. Interestingly, we find that a recent objective by Madry et al. encourages training a model that satisfies well our formal version of the goodness property, but has a weak control of points that are wrong but with low confidence. However, if Madry et al.'s model is indeed a good solution to their objective, then good and bad points are now distinguishable and we can try to embed uncertain points back to the closest confident region to get (hopefully) correct predictions. We thus propose embedding objectives and algorithms, and perform an empirical study using this method. Our experimental results are encouraging: Madry et al.'s model wrapped with our embedding procedure achieves almost perfect success rate in defending against attacks that the base model fails on, while retaining good generalization behavior.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=Hk-FlMbAZ
PDF https://openreview.net/pdf?id=Hk-FlMbAZ
PWC https://paperswithcode.com/paper/the-manifold-assumption-and-defenses-against
Repo
Framework
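
The confidence measure named in the abstract above, $\|F(\mathbf{x})\|_\infty$, can be illustrated with a minimal sketch. It assumes that $F$ outputs a probability vector obtained from a softmax over class scores, which the abstract does not state; the function names `softmax` and `confidence` and the example scores are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities (assumed output form of F)."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(probs):
    """Infinity norm of the output vector: its largest absolute entry."""
    return max(abs(p) for p in probs)


# Hypothetical class scores for a three-class model.
probs = softmax([2.0, 0.5, -1.0])
print(confidence(probs))
```

Under this assumption the measure is simply the probability assigned to the predicted class, so a value close to 1 marks a confident prediction and a value close to 1 divided by the number of classes marks an uncertain one.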

Towards Robust Interpretability with Self-Explaining Neural Networks

Title Towards Robust Interpretability with Self-Explaining Neural Networks
Authors David Alvarez Melis, Tommi Jaakkola
Abstract Most recent work on interpretability of complex machine learning models has focused on estimating a-posteriori explanations for previously trained models around specific predictions. Self-explaining models where interpretability plays a key role already during learning have received much less attention. We propose three desiderata for explanations in general – explicitness, faithfulness, and stability – and show that existing methods do not satisfy them. In response, we design self-explaining models in stages, progressively generalizing linear classifiers to complex yet architecturally explicit models. Faithfulness and stability are enforced via regularization specifically tailored to such models. Experimental results across various benchmark datasets show that our framework offers a promising direction for reconciling model complexity and interpretability.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/8003-towards-robust-interpretability-with-self-explaining-neural-networks
PDF http://papers.nips.cc/paper/8003-towards-robust-interpretability-with-self-explaining-neural-networks.pdf
PWC https://paperswithcode.com/paper/towards-robust-interpretability-with-self
Repo
Framework

Attention-free encoder decoder for morphological processing

Title Attention-free encoder decoder for morphological processing
Authors Stefan Daniel Dumitrescu, Tiberiu Boros
Abstract
Tasks Lemmatization
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-3007/
PDF https://www.aclweb.org/anthology/K18-3007
PWC https://paperswithcode.com/paper/attention-free-encoder-decoder-for
Repo
Framework

Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation

Title Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation
Authors Chaitanya Ryali, Gautam Reddy, Angela J. Yu
Abstract Understanding how humans and animals learn about statistical regularities in stable and volatile environments, and utilize these regularities to make predictions and decisions, is an important problem in neuroscience and psychology. Using a Bayesian modeling framework, specifically the Dynamic Belief Model (DBM), it has previously been shown that humans tend to make the default assumption that environmental statistics undergo abrupt, unsignaled changes, even when environmental statistics are actually stable. Because exact Bayesian inference in this setting, an example of switching state space models, is computationally intense, a number of approximately Bayesian and heuristic algorithms have been proposed to account for learning/prediction in the brain. Here, we examine a neurally plausible algorithm, a special case of leaky integration dynamics we denote as EXP (for exponential filtering), that is significantly simpler than all previously suggested algorithms except for the delta-learning rule, and which far outperforms the delta rule in approximating Bayesian prediction performance. We derive the theoretical relationship between DBM and EXP, and show that EXP gains computational efficiency by foregoing the representation of inferential uncertainty (as does the delta rule), but that it nevertheless achieves near-Bayesian performance due to its ability to incorporate a "persistent prior" influence unique to DBM and absent from the other algorithms. Furthermore, we show that EXP is comparable to DBM but better than all other models in reproducing human behavior in a visual search task, suggesting that human learning and prediction also incorporates an element of persistent prior. More broadly, our work demonstrates that when observations are information-poor, detecting changes or modulating the learning rate is both difficult and (thus) unnecessary for making Bayes-optimal predictions.
Tasks Bayesian Inference
Published 2018-12-01
URL http://papers.nips.cc/paper/7543-demystifying-excessively-volatile-human-learning-a-bayesian-persistent-prior-and-a-neural-approximation
PDF http://papers.nips.cc/paper/7543-demystifying-excessively-volatile-human-learning-a-bayesian-persistent-prior-and-a-neural-approximation.pdf
PWC https://paperswithcode.com/paper/demystifying-excessively-volatile-human
Repo
Framework
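
The idea of a leaky integrator with a persistent prior, described in the abstract above, can be sketched as follows. This is an illustration only: the update form, the function name `leaky_filter`, and the weights `w_prior` and `w_obs` are assumptions made for the example, not the coefficients the paper derives from DBM.

```python
def leaky_filter(observations, prior_mean=0.5, w_prior=0.2, w_obs=0.3):
    """Predict each binary observation before seeing it.

    Each new estimate is a fixed-weight mix of three terms: a constant
    pull toward the prior mean (the assumed "persistent prior" term),
    the latest observation, and the previous estimate. The weights are
    illustrative values, not fitted or derived ones.
    """
    w_prev = 1.0 - w_prior - w_obs
    if min(w_prior, w_obs, w_prev) < 0.0:
        raise ValueError("weights must be non-negative and sum to at most 1")

    estimate = prior_mean
    predictions = []
    for x in observations:
        predictions.append(estimate)  # prediction made before observing x
        estimate = w_prior * prior_mean + w_obs * x + w_prev * estimate
    return predictions


preds = leaky_filter([1] * 200)
print(preds[0], preds[1], preds[-1])
```

Setting `w_prior` to 0 removes the constant pull and leaves a plain exponential filter. With a non-zero `w_prior`, the estimate stays strictly below 1 even after a long run of identical observations, which is the qualitative effect of the prior never washing out.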

HUMIR at IEST-2018: Lexicon-Sensitive and Left-Right Context-Sensitive BiLSTM for Implicit Emotion Recognition

Title HUMIR at IEST-2018: Lexicon-Sensitive and Left-Right Context-Sensitive BiLSTM for Implicit Emotion Recognition
Authors Behzad Naderalvojoud, Alaettin Ucan, Ebru Akcapinar Sezer
Abstract This paper describes the approaches used in the HUMIR system for the WASSA-2018 shared task on implicit emotion recognition. The objective of this task is to predict the emotion expressed by the target word that has been excluded from the given tweet. We treat this task as word sense disambiguation in which the target word is considered as a synthetic word that can express 6 emotions depending on the context. To predict the correct emotion, we propose a deep neural network model that uses two BiLSTM networks to represent the contexts on the left and right sides of the target word. The BiLSTM outputs obtained from the left and right contexts are considered as context-sensitive features. These features are used in a feed-forward neural network to predict the target word emotion. Besides this approach, we also combine the BiLSTM model with lexicon-based and emotion-based features. Finally, we employ all models in the final system using the Bagging ensemble method. We achieved a macro F-measure value of 68.8 on the official test set and ranked sixth out of 30 participants.
Tasks Emotion Recognition, Feature Engineering, Word Sense Disambiguation
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-6225/
PDF https://www.aclweb.org/anthology/W18-6225
PWC https://paperswithcode.com/paper/humir-at-iest-2018-lexicon-sensitive-and-left
Repo
Framework
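
The macro F-measure reported in the abstract above averages per-class F1 scores with equal weight per class. A minimal sketch of that computation follows; the function name `macro_f1` and the toy labels are hypothetical and are not data from the shared task.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over every label that occurs."""
    labels = sorted(set(gold) | set(predicted))
    pairs = list(zip(gold, predicted))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three emotion labels (made-up data).
gold = ["joy", "joy", "fear", "anger"]
pred = ["joy", "fear", "fear", "anger"]
print(macro_f1(gold, pred))
```

Because every class contributes equally, a rare emotion that is predicted poorly lowers the score as much as a frequent one would.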

End-to-End Abnormality Detection in Medical Imaging

Title End-to-End Abnormality Detection in Medical Imaging
Authors Dufan Wu, Kyungsang Kim, Bin Dong, Quanzheng Li
Abstract Deep neural networks (DNN) have shown promising performance in computer vision. In medical imaging, encouraging results have been achieved with deep learning for applications such as segmentation, lesion detection and classification. Nearly all of the deep learning based image analysis methods work on reconstructed images, which are obtained from original acquisitions via solving inverse problems (reconstruction). The reconstruction algorithms are designed for human observers, but not necessarily optimized for DNNs, which can often observe features that are incomprehensible to human eyes. Hence, it is desirable to train the DNNs directly from the original data, which lie in a different domain from the images. In this paper, we propose an end-to-end DNN for abnormality detection in medical imaging. To align the acquisition with the annotations made by radiologists in the image domain, a DNN was built as the unrolled version of iterative reconstruction algorithms to map the acquisitions to images, followed by a 3D convolutional neural network (CNN) to detect the abnormality in the reconstructed images. The two networks were trained jointly in order to optimize the entire DNN for the detection task from the original acquisitions. The DNN was implemented for lung nodule detection in low-dose chest computed tomography (CT), where a numerical simulation was done to generate acquisitions from 1,018 chest CT images with radiologists' annotations. The proposed end-to-end DNN demonstrated better sensitivity and accuracy for the task compared to a two-step approach, in which the reconstruction and detection DNNs were trained separately. A significant reduction of the false positive rate on suspicious lesions was observed, which is crucial for the known over-diagnosis in low-dose lung CT imaging. The images reconstructed by the proposed end-to-end network also presented enhanced details in the region of interest.
Tasks Anomaly Detection, Computed Tomography (CT), Lung Nodule Detection
Published 2018-01-01
URL https://openreview.net/forum?id=rk1FQA0pW
PDF https://openreview.net/pdf?id=rk1FQA0pW
PWC https://paperswithcode.com/paper/end-to-end-abnormality-detection-in-medical
Repo
Framework

The Word Sense Disambiguation Test Suite at WMT18

Title The Word Sense Disambiguation Test Suite at WMT18
Authors Annette Rios, Mathias Müller, Rico Sennrich
Abstract We present a task to measure an MT system's capability to translate ambiguous words with their correct sense according to the given context. The task is based on the German–English Word Sense Disambiguation (WSD) test set ContraWSD (Rios Gonzales et al., 2017), but it has been filtered to reduce noise, and the evaluation has been adapted to assess MT output directly rather than scoring existing translations. We evaluate all German–English submissions to the WMT'18 shared translation task, plus a number of submissions from previous years, and find that performance on the task has markedly improved compared to the 2016 WMT submissions (81%→93% accuracy on the WSD task). We also find that the unsupervised submissions to the task have a low WSD capability, and predominantly translate ambiguous source words with the same sense.
Tasks Machine Translation, Word Sense Disambiguation
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-6437/
PDF https://www.aclweb.org/anthology/W18-6437
PWC https://paperswithcode.com/paper/the-word-sense-disambiguation-test-suite-at
Repo
Framework

SocialNLP 2018 EmotionX Challenge Overview: Recognizing Emotions in Dialogues

Title SocialNLP 2018 EmotionX Challenge Overview: Recognizing Emotions in Dialogues
Authors Chao-Chun Hsu, Lun-Wei Ku
Abstract This paper gives an overview of the Dialogue Emotion Recognition Challenge, EmotionX, at the Sixth SocialNLP Workshop, in which participants recognize the emotion of each utterance in dialogues. This challenge offers the EmotionLines dataset as the experimental material. The EmotionLines dataset contains conversations from Friends TV show transcripts (Friends) and real chatting logs (EmotionPush), where every dialogue utterance is labeled with emotions. Organizers provide baseline results. 18 teams registered in this challenge and 5 of them submitted their results successfully. The best team achieves an unweighted accuracy of 62.48 and 62.5 on EmotionPush and Friends, respectively. In this paper we present the task definition, test collection, the evaluation results of the groups that participated in this challenge, and their approaches.
Tasks Common Sense Reasoning, Emotion Recognition
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3505/
PDF https://www.aclweb.org/anthology/W18-3505
PWC https://paperswithcode.com/paper/socialnlp-2018-emotionx-challenge-overview
Repo
Framework

Sheffield Submissions for WMT18 Multimodal Translation Shared Task

Title Sheffield Submissions for WMT18 Multimodal Translation Shared Task
Authors Chiraag Lala, Pranava Swaroop Madhyastha, Carolina Scarton, Lucia Specia
Abstract This paper describes the University of Sheffield's submissions to the WMT18 Multimodal Machine Translation shared task. We participated in both tasks 1 and 1b. For task 1, we build on a standard sequence to sequence attention-based neural machine translation system (NMT) and investigate the utility of multimodal re-ranking approaches. More specifically, n-best translation candidates from this system are re-ranked using novel multimodal cross-lingual word sense disambiguation models. For task 1b, we explore three approaches: (i) re-ranking based on cross-lingual word sense disambiguation (as for task 1), (ii) re-ranking based on consensus of NMT n-best lists from German-Czech, French-Czech and English-Czech systems, and (iii) data augmentation by generating English source data through machine translation from French to English and from German to English followed by hypothesis selection using a multimodal-reranker.
Tasks Data Augmentation, Machine Translation, Multimodal Machine Translation, Word Sense Disambiguation
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-6442/
PDF https://www.aclweb.org/anthology/W18-6442
PWC https://paperswithcode.com/paper/sheffield-submissions-for-wmt18-multimodal
Repo
Framework

Multilingual Short Text Responses Clustering for Mobile Educational Activities: a Preliminary Exploration

Title Multilingual Short Text Responses Clustering for Mobile Educational Activities: a Preliminary Exploration
Authors Yuen-Hsien Tseng, Lung-Hao Lee, Yu-Ta Chien, Chun-Yen Chang, Tsung-Yen Li
Abstract Text clustering is a powerful technique to detect topics from document corpora, so as to provide information browsing, analysis, and organization. On the other hand, the Instant Response System (IRS) has been widely used in recent years to enhance student engagement in class and thus improve their learning effectiveness. However, the lack of functions to process short text responses from the IRS prevents the further application of IRS in classes. Therefore, this study aims to propose a proper short text clustering module for the IRS, and demonstrate our implemented techniques through real-world examples, so as to provide experiences and insights for further study. In particular, we have compared three clustering methods and the result shows that theoretically better methods need not lead to better results, as there are various factors that may affect the final performance.
Tasks Text Clustering
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3723/
PDF https://www.aclweb.org/anthology/W18-3723
PWC https://paperswithcode.com/paper/multilingual-short-text-responses-clustering
Repo
Framework