Paper Group NANR 179
Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design. Using Hedge Detection to Improve Committed Belief Tagging. Action Verb Corpus. No Spurious Local Minima in a Two Hidden Unit ReLU Network. Supervised Open Information Extraction. The Manifold Assumption and Defenses Against Adversarial Perturbations. Towards Robust Int …
Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design
Title | Exploring Deep Recurrent Models with Reinforcement Learning for Molecule Design |
Authors | Daniel Neil, Marwin Segler, Laura Guasch, Mohamed Ahmed, Dean Plumbley, Matthew Sellwood, Nathan Brown |
Abstract | The design of small molecules with bespoke properties is of central importance to drug discovery. However, significant challenges remain for computational methods, despite recent advances such as deep recurrent networks and reinforcement learning strategies for sequence generation, and it can be difficult to compare results across different works. This work proposes 19 benchmarks selected by subject experts, expands smaller datasets previously used to approximately 1.1 million training molecules, and explores how to apply new reinforcement learning techniques effectively for molecular design. The benchmarks here, built as OpenAI Gym environments, will be open-sourced to encourage innovation in molecular design algorithms and to enable usage by those without a background in chemistry. Finally, this work explores recent developments in reinforcement-learning methods with excellent sample complexity (the A2C and PPO algorithms) and investigates their behavior in molecular generation, demonstrating significant performance gains compared to standard reinforcement learning techniques. |
Tasks | Drug Discovery |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HkcTe-bR- |
https://openreview.net/pdf?id=HkcTe-bR- | |
PWC | https://paperswithcode.com/paper/exploring-deep-recurrent-models-with |
Repo | |
Framework | |
Using Hedge Detection to Improve Committed Belief Tagging
Title | Using Hedge Detection to Improve Committed Belief Tagging |
Authors | Morgan Ulinski, Seth Benjamin, Julia Hirschberg |
Abstract | We describe a novel method for identifying hedge terms using a set of manually constructed rules. We present experiments adding hedge features to a committed belief system to improve classification. We compare performance of this system (a) without hedging features, (b) with dictionary-based features, and (c) with rule-based features. We find that using hedge features improves performance of the committed belief system, particularly in identifying instances of non-committed belief and reported belief. |
Tasks | Sentence Classification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1301/ |
https://www.aclweb.org/anthology/W18-1301 | |
PWC | https://paperswithcode.com/paper/using-hedge-detection-to-improve-committed |
Repo | |
Framework | |
Action Verb Corpus
Title | Action Verb Corpus |
Authors | Stephanie Gross, Matthias Hirschmanner, Brigitte Krenn, Friedrich Neubarth, Michael Zillich |
Abstract | |
Tasks | Action Classification, Language Acquisition, Language Modelling, Question Answering, Visual Question Answering |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1338/ |
https://www.aclweb.org/anthology/L18-1338 | |
PWC | https://paperswithcode.com/paper/action-verb-corpus |
Repo | |
Framework | |
No Spurious Local Minima in a Two Hidden Unit ReLU Network
Title | No Spurious Local Minima in a Two Hidden Unit ReLU Network |
Authors | Chenwei Wu, Jiajun Luo, Jason D. Lee |
Abstract | Deep learning models can be efficiently optimized via stochastic gradient descent, but there is little theoretical evidence to support this. A key question in optimization is to understand when the optimization landscape of a neural network is amenable to gradient-based optimization. We focus on a simple two-layer ReLU network with two hidden units, and show that all local minimizers are global. This, combined with recent work of Lee et al. (2017) and Lee et al. (2016), shows that gradient descent converges to the global minimizer. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=B14uJzW0b |
https://openreview.net/pdf?id=B14uJzW0b | |
PWC | https://paperswithcode.com/paper/no-spurious-local-minima-in-a-two-hidden-unit |
Repo | |
Framework | |
Supervised Open Information Extraction
Title | Supervised Open Information Extraction |
Authors | Gabriel Stanovsky, Julian Michael, Luke Zettlemoyer, Ido Dagan |
Abstract | We present data and methods that enable a supervised learning approach to Open Information Extraction (Open IE). Central to the approach is a novel formulation of Open IE as a sequence tagging problem, addressing challenges such as encoding multiple extractions for a predicate. We also develop a bi-LSTM transducer, extending recent deep Semantic Role Labeling models to extract Open IE tuples and provide confidence scores for tuning their precision-recall tradeoff. Furthermore, we show that the recently released Question-Answer Meaning Representation dataset can be automatically converted into an Open IE corpus which significantly increases the amount of available training data. Our supervised model outperforms the existing state-of-the-art Open IE systems on benchmark datasets. |
Tasks | Knowledge Base Population, Natural Language Inference, Open Information Extraction, Question Answering, Semantic Role Labeling |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1081/ |
https://www.aclweb.org/anthology/N18-1081 | |
PWC | https://paperswithcode.com/paper/supervised-open-information-extraction |
Repo | |
Framework | |
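To illustrate what casting Open IE as sequence tagging can look like, the sketch below decodes one BIO-tagged sentence into a predicate and its arguments. The tag names (`P`, `A0`, `A1`), the helper `decode_bio`, and the example sentence are assumptions made for illustration; the paper's exact label scheme may differ.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into labeled text spans.

    Tags take the form "B-X" (begin span X), "I-X" (continue span X),
    or "O" (outside any span). Returns a dict mapping label -> span text.
    """
    spans = {}
    current = None  # label of the span currently being built, if any
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            spans[current] = [token]
        elif tag.startswith("I-") and current == tag[2:]:
            spans[current].append(token)
        else:
            # "O" tag or an inconsistent "I-" tag closes the current span.
            current = None
    return {label: " ".join(words) for label, words in spans.items()}


# Hypothetical example: one extraction for the predicate "was born".
tokens = ["Obama", "was", "born", "in", "Hawaii"]
tags = ["B-A0", "B-P", "I-P", "B-A1", "I-A1"]
print(decode_bio(tokens, tags))
```

In this sketch a sentence with several extractions would need one tag sequence per extraction, since a single sequence holds only one span per label.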
The Manifold Assumption and Defenses Against Adversarial Perturbations
Title | The Manifold Assumption and Defenses Against Adversarial Perturbations |
Authors | Xi Wu, Uyeong Jang, Lingjiao Chen, Somesh Jha |
Abstract | In the adversarial-perturbation problem of neural networks, an adversary starts with a neural network model $F$ and a point $\mathbf{x}$ that $F$ classifies correctly, and applies a small perturbation to $\mathbf{x}$ to produce another point $\mathbf{x}'$ that $F$ classifies incorrectly. In this paper, we propose taking into account the inherent confidence information produced by models when studying adversarial perturbations, where a natural measure of "confidence" is $\lVert F(\mathbf{x}) \rVert_\infty$ (i.e., how confident $F$ is about its prediction). Motivated by a thought experiment based on the manifold assumption, we propose a "goodness property" of models which states that confident regions of a good model should be well separated. We give formalizations of this property and examine existing robust training objectives in view of them. Interestingly, we find that a recent objective by Madry et al. encourages training a model that satisfies well our formal version of the goodness property, but has a weak control of points that are wrong but with low confidence. However, if Madry et al.'s model is indeed a good solution to their objective, then good and bad points are now distinguishable and we can try to embed uncertain points back to the closest confident region to get (hopefully) correct predictions. We thus propose embedding objectives and algorithms, and perform an empirical study using this method. Our experimental results are encouraging: Madry et al.'s model wrapped with our embedding procedure achieves almost perfect success rate in defending against attacks that the base model fails on, while retaining good generalization behavior. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Hk-FlMbAZ |
https://openreview.net/pdf?id=Hk-FlMbAZ | |
PWC | https://paperswithcode.com/paper/the-manifold-assumption-and-defenses-against |
Repo | |
Framework | |
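The confidence measure named in the abstract above, the infinity norm of the model output $F(\mathbf{x})$, is simply the largest entry of the output vector. A minimal sketch, assuming $F$ returns a softmax probability vector; the function names and the 0.9 threshold are hypothetical and are not taken from the paper:

```python
import math


def softmax(logits):
    """Convert raw scores into a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(probs):
    """Infinity norm of the output vector: its largest absolute entry."""
    return max(abs(p) for p in probs)


def is_confident(probs, threshold=0.9):
    """Hypothetical rule: treat a point as confident above a fixed threshold."""
    return confidence(probs) >= threshold


probs = softmax([4.0, 0.5, 0.1])
print(confidence(probs), is_confident(probs))
```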
Towards Robust Interpretability with Self-Explaining Neural Networks
Title | Towards Robust Interpretability with Self-Explaining Neural Networks |
Authors | David Alvarez Melis, Tommi Jaakkola |
Abstract | Most recent work on interpretability of complex machine learning models has focused on estimating a-posteriori explanations for previously trained models around specific predictions. Self-explaining models where interpretability plays a key role already during learning have received much less attention. We propose three desiderata for explanations in general – explicitness, faithfulness, and stability – and show that existing methods do not satisfy them. In response, we design self-explaining models in stages, progressively generalizing linear classifiers to complex yet architecturally explicit models. Faithfulness and stability are enforced via regularization specifically tailored to such models. Experimental results across various benchmark datasets show that our framework offers a promising direction for reconciling model complexity and interpretability. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8003-towards-robust-interpretability-with-self-explaining-neural-networks |
http://papers.nips.cc/paper/8003-towards-robust-interpretability-with-self-explaining-neural-networks.pdf | |
PWC | https://paperswithcode.com/paper/towards-robust-interpretability-with-self |
Repo | |
Framework | |
Attention-free encoder decoder for morphological processing
Title | Attention-free encoder decoder for morphological processing |
Authors | Stefan Daniel Dumitrescu, Tiberiu Boros |
Abstract | |
Tasks | Lemmatization |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-3007/ |
https://www.aclweb.org/anthology/K18-3007 | |
PWC | https://paperswithcode.com/paper/attention-free-encoder-decoder-for |
Repo | |
Framework | |
Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation
Title | Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation |
Authors | Chaitanya Ryali, Gautam Reddy, Angela J. Yu |
Abstract | Understanding how humans and animals learn about statistical regularities in stable and volatile environments, and utilize these regularities to make predictions and decisions, is an important problem in neuroscience and psychology. Using a Bayesian modeling framework, specifically the Dynamic Belief Model (DBM), it has previously been shown that humans tend to make the default assumption that environmental statistics undergo abrupt, unsignaled changes, even when environmental statistics are actually stable. Because exact Bayesian inference in this setting, an example of switching state space models, is computationally intense, a number of approximately Bayesian and heuristic algorithms have been proposed to account for learning/prediction in the brain. Here, we examine a neurally plausible algorithm, a special case of leaky integration dynamics we denote as EXP (for exponential filtering), that is significantly simpler than all previously suggested algorithms except for the delta-learning rule, and which far outperforms the delta rule in approximating Bayesian prediction performance. We derive the theoretical relationship between DBM and EXP, and show that EXP gains computational efficiency by foregoing the representation of inferential uncertainty (as does the delta rule), but that it nevertheless achieves near-Bayesian performance due to its ability to incorporate a "persistent prior" influence unique to DBM and absent from the other algorithms. Furthermore, we show that EXP is comparable to DBM but better than all other models in reproducing human behavior in a visual search task, suggesting that human learning and prediction also incorporates an element of persistent prior. More broadly, our work demonstrates that when observations are information-poor, detecting changes or modulating the learning rate is both difficult and (thus) unnecessary for making Bayes-optimal predictions. |
Tasks | Bayesian Inference |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7543-demystifying-excessively-volatile-human-learning-a-bayesian-persistent-prior-and-a-neural-approximation |
http://papers.nips.cc/paper/7543-demystifying-excessively-volatile-human-learning-a-bayesian-persistent-prior-and-a-neural-approximation.pdf | |
PWC | https://paperswithcode.com/paper/demystifying-excessively-volatile-human |
Repo | |
Framework | |
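The contrast drawn in the abstract above between the delta rule and leaky integration with a persistent prior can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's derived EXP parameterization: the function names, the weights, and the prior mean of 0.5 are all hypothetical.

```python
def delta_rule(observations, lr=0.2, init=0.5):
    """Delta rule: move the estimate toward each observation by a fixed step."""
    p = init
    estimates = []
    for x in observations:
        p = p + lr * (x - p)
        estimates.append(p)
    return estimates


def leaky_with_prior(observations, w_obs=0.2, w_prior=0.1, prior=0.5, init=0.5):
    """Leaky integration with a constant pull toward a fixed prior mean.

    The observation, prior, and memory weights sum to 1, so the estimate
    stays in [0, 1] for binary observations (assumed weights, see lead-in).
    """
    w_mem = 1.0 - w_obs - w_prior
    p = init
    estimates = []
    for x in observations:
        p = w_prior * prior + w_obs * x + w_mem * p
        estimates.append(p)
    return estimates


obs = [1] * 50
print(delta_rule(obs)[-1], leaky_with_prior(obs)[-1])
```

On a long run of identical observations the delta rule converges to the observed value, while the constant prior term keeps the second estimate short of it. This shows a persistent prior influence in general terms only and does not reproduce the paper's results.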
HUMIR at IEST-2018: Lexicon-Sensitive and Left-Right Context-Sensitive BiLSTM for Implicit Emotion Recognition
Title | HUMIR at IEST-2018: Lexicon-Sensitive and Left-Right Context-Sensitive BiLSTM for Implicit Emotion Recognition |
Authors | Behzad Naderalvojoud, Alaettin Ucan, Ebru Akcapinar Sezer |
Abstract | This paper describes the approaches used in the HUMIR system for the WASSA-2018 shared task on implicit emotion recognition. The objective of this task is to predict the emotion expressed by the target word that has been excluded from the given tweet. We treat this task as a word sense disambiguation problem in which the target word is considered a synthetic word that can express 6 emotions depending on the context. To predict the correct emotion, we propose a deep neural network model that uses two BiLSTM networks to represent the contexts on the left and right sides of the target word. The BiLSTM outputs obtained from the left and right contexts are considered context-sensitive features. These features are used in a feed-forward neural network to predict the target word emotion. Besides this approach, we also combine the BiLSTM model with lexicon-based and emotion-based features. Finally, we employ all models in the final system using the Bagging ensemble method. We achieved a macro F-measure of 68.8 on the official test set and ranked sixth out of 30 participants. |
Tasks | Emotion Recognition, Feature Engineering, Word Sense Disambiguation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6225/ |
https://www.aclweb.org/anthology/W18-6225 | |
PWC | https://paperswithcode.com/paper/humir-at-iest-2018-lexicon-sensitive-and-left |
Repo | |
Framework | |
End-to-End Abnormality Detection in Medical Imaging
Title | End-to-End Abnormality Detection in Medical Imaging |
Authors | Dufan Wu, Kyungsang Kim, Bin Dong, Quanzheng Li |
Abstract | Deep neural networks (DNN) have shown promising performance in computer vision. In medical imaging, encouraging results have been achieved with deep learning for applications such as segmentation, lesion detection and classification. Nearly all of the deep learning based image analysis methods work on reconstructed images, which are obtained from original acquisitions via solving inverse problems (reconstruction). The reconstruction algorithms are designed for human observers, but not necessarily optimized for DNNs, which can often observe features that are incomprehensible to human eyes. Hence, it is desirable to train the DNNs directly from the original data, which lie in a different domain from the images. In this paper, we propose an end-to-end DNN for abnormality detection in medical imaging. To align the acquisition with the annotations made by radiologists in the image domain, a DNN was built as the unrolled version of iterative reconstruction algorithms to map the acquisitions to images, followed by a 3D convolutional neural network (CNN) to detect the abnormality in the reconstructed images. The two networks were trained jointly in order to optimize the entire DNN for the detection task from the original acquisitions. The DNN was implemented for lung nodule detection in low-dose chest computed tomography (CT), where a numerical simulation was done to generate acquisitions from 1,018 chest CT images with radiologists’ annotations. The proposed end-to-end DNN demonstrated better sensitivity and accuracy for the task compared to a two-step approach, in which the reconstruction and detection DNNs were trained separately. A significant reduction of the false positive rate on suspicious lesions was observed, which is crucial for the known over-diagnosis in low-dose lung CT imaging. The images reconstructed by the proposed end-to-end network also presented enhanced details in the region of interest. |
Tasks | Anomaly Detection, Computed Tomography (CT), Lung Nodule Detection |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rk1FQA0pW |
https://openreview.net/pdf?id=rk1FQA0pW | |
PWC | https://paperswithcode.com/paper/end-to-end-abnormality-detection-in-medical |
Repo | |
Framework | |
The Word Sense Disambiguation Test Suite at WMT18
Title | The Word Sense Disambiguation Test Suite at WMT18 |
Authors | Annette Rios, Mathias Müller, Rico Sennrich |
Abstract | We present a task to measure an MT system's capability to translate ambiguous words with their correct sense according to the given context. The task is based on the German–English Word Sense Disambiguation (WSD) test set ContraWSD (Rios Gonzales et al., 2017), but it has been filtered to reduce noise, and the evaluation has been adapted to assess MT output directly rather than scoring existing translations. We evaluate all German–English submissions to the WMT'18 shared translation task, plus a number of submissions from previous years, and find that performance on the task has markedly improved compared to the 2016 WMT submissions (81%→93% accuracy on the WSD task). We also find that the unsupervised submissions to the task have a low WSD capability, and predominantly translate ambiguous source words with the same sense. |
Tasks | Machine Translation, Word Sense Disambiguation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6437/ |
https://www.aclweb.org/anthology/W18-6437 | |
PWC | https://paperswithcode.com/paper/the-word-sense-disambiguation-test-suite-at |
Repo | |
Framework | |
SocialNLP 2018 EmotionX Challenge Overview: Recognizing Emotions in Dialogues
Title | SocialNLP 2018 EmotionX Challenge Overview: Recognizing Emotions in Dialogues |
Authors | Chao-Chun Hsu, Lun-Wei Ku |
Abstract | This paper presents an overview of the Dialogue Emotion Recognition Challenge, EmotionX, at the Sixth SocialNLP Workshop, in which participants recognize the emotion of each utterance in dialogues. This challenge offers the EmotionLines dataset as the experimental material. The EmotionLines dataset contains conversations from Friends TV show transcripts (Friends) and real chatting logs (EmotionPush), where every dialogue utterance is labeled with emotions. Organizers provide baseline results. 18 teams registered for this challenge and 5 of them submitted their results successfully. The best team achieves unweighted accuracies of 62.48 and 62.5 on EmotionPush and Friends, respectively. In this paper we present the task definition, test collection, the evaluation results of the groups that participated in this challenge, and their approaches. |
Tasks | Common Sense Reasoning, Emotion Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3505/ |
https://www.aclweb.org/anthology/W18-3505 | |
PWC | https://paperswithcode.com/paper/socialnlp-2018-emotionx-challenge-overview |
Repo | |
Framework | |
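The abstract above reports unweighted accuracy without defining it. A minimal sketch, assuming it means the mean of per-class accuracies so that every emotion class counts equally regardless of frequency; the function name and the toy labels are hypothetical:

```python
from collections import defaultdict


def unweighted_accuracy(gold, predicted):
    """Mean of per-class accuracies, so each class contributes equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy example: always predicting the majority class.
gold = ["joy", "joy", "joy", "anger"]
predicted = ["joy", "joy", "joy", "joy"]
print(unweighted_accuracy(gold, predicted))
```

Under this assumed definition, a system that only predicts the majority class is penalized for every class it ignores, which plain accuracy on imbalanced data would hide.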
Sheffield Submissions for WMT18 Multimodal Translation Shared Task
Title | Sheffield Submissions for WMT18 Multimodal Translation Shared Task |
Authors | Chiraag Lala, Pranava Swaroop Madhyastha, Carolina Scarton, Lucia Specia |
Abstract | This paper describes the University of Sheffield's submissions to the WMT18 Multimodal Machine Translation shared task. We participated in both tasks 1 and 1b. For task 1, we build on a standard sequence to sequence attention-based neural machine translation system (NMT) and investigate the utility of multimodal re-ranking approaches. More specifically, n-best translation candidates from this system are re-ranked using novel multimodal cross-lingual word sense disambiguation models. For task 1b, we explore three approaches: (i) re-ranking based on cross-lingual word sense disambiguation (as for task 1), (ii) re-ranking based on consensus of NMT n-best lists from German-Czech, French-Czech and English-Czech systems, and (iii) data augmentation by generating English source data through machine translation from French to English and from German to English followed by hypothesis selection using a multimodal-reranker. |
Tasks | Data Augmentation, Machine Translation, Multimodal Machine Translation, Word Sense Disambiguation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6442/ |
https://www.aclweb.org/anthology/W18-6442 | |
PWC | https://paperswithcode.com/paper/sheffield-submissions-for-wmt18-multimodal |
Repo | |
Framework | |
Multilingual Short Text Responses Clustering for Mobile Educational Activities: a Preliminary Exploration
Title | Multilingual Short Text Responses Clustering for Mobile Educational Activities: a Preliminary Exploration |
Authors | Yuen-Hsien Tseng, Lung-Hao Lee, Yu-Ta Chien, Chun-Yen Chang, Tsung-Yen Li |
Abstract | Text clustering is a powerful technique to detect topics from document corpora, so as to provide information browsing, analysis, and organization. On the other hand, the Instant Response System (IRS) has been widely used in recent years to enhance student engagement in class and thus improve their learning effectiveness. However, the lack of functions to process short text responses from the IRS prevents the further application of IRS in classes. Therefore, this study aims to propose a proper short text clustering module for the IRS, and demonstrate our implemented techniques through real-world examples, so as to provide experiences and insights for further study. In particular, we have compared three clustering methods and the result shows that theoretically better methods need not lead to better results, as there are various factors that may affect the final performance. |
Tasks | Text Clustering |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3723/ |
https://www.aclweb.org/anthology/W18-3723 | |
PWC | https://paperswithcode.com/paper/multilingual-short-text-responses-clustering |
Repo | |
Framework | |