January 25, 2020

Paper Group NAWR 14

Answering Naturally: Factoid to Full length Answer Generation

Title Answering Naturally: Factoid to Full length Answer Generation
Authors Vaishali Pal, Manish Shrivastava, Irshad Bhat
Abstract In recent years, the task of Question Answering over passages, also pitched as reading comprehension, has evolved into a very active research area. A reading comprehension system extracts a span of text, comprising named entities, dates, small phrases, etc., which serve as the answer to a given question. However, these spans of text would result in an unnatural reading experience in a conversational system. Usually, dialogue systems solve this issue by using template-based language generation. These systems, though adequate for a domain specific task, are too restrictive and predefined for a domain independent system. In order to present the user with a more conversational experience, we propose a pointer generator based full-length answer generator which can be used with most QA systems. Our system generates a full length answer given a question and the extracted factoid/span answer without relying on the passage from where the answer was extracted. We also present a dataset of 315000 question, factoid answer and full length answer triples. We have evaluated our system using ROUGE-1, ROUGE-2, ROUGE-L, and BLEU, and achieved a BLEU score of 74.05 and a ROUGE-L score of 86.25.
Tasks Question Answering, Reading Comprehension, Text Generation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5401/
PDF https://www.aclweb.org/anthology/D19-5401
PWC https://paperswithcode.com/paper/answering-naturally-factoid-to-full-length
Repo https://github.com/kolk/AnsweringNaturally
Framework pytorch

One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues

Title One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues
Authors Chongyang Tao, Wei Wu, Can Xu, Wenpeng Hu, Dongyan Zhao, Rui Yan
Abstract Currently, researchers have paid great attention to retrieval-based dialogues in open-domain. In particular, people study the problem by investigating context-response matching for multi-turn response selection based on publicly recognized benchmark data sets. State-of-the-art methods require a response to interact with each utterance in a context from the beginning, but the interaction is performed in a shallow way. In this work, we let utterance-response interaction go deep by proposing an interaction-over-interaction network (IoI). The model performs matching by stacking multiple interaction blocks in which residual information from one time of interaction initiates the interaction process again. Thus, matching information within an utterance-response pair is extracted from the interaction of the pair in an iterative fashion, and the information flows along the chain of the blocks via representations. Evaluation results on three benchmark data sets indicate that IoI can significantly outperform state-of-the-art methods in terms of various matching metrics. Through further analysis, we also unveil how the depth of interaction affects the performance of IoI.
Tasks Conversational Response Selection
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1001/
PDF https://www.aclweb.org/anthology/P19-1001
PWC https://paperswithcode.com/paper/one-time-of-interaction-may-not-be-enough-go
Repo https://github.com/chongyangtao/IOI
Framework tf

SUPERVISED POLICY UPDATE

Title SUPERVISED POLICY UPDATE
Authors Quan Vuong, Yiming Zhang, Keith W. Ross
Abstract We propose a new sample-efficient methodology, called Supervised Policy Update (SPU), for deep reinforcement learning. Starting with data generated by the current policy, SPU formulates and solves a constrained optimization problem in the non-parameterized proximal policy space. Using supervised regression, it then converts the optimal non-parameterized policy to a parameterized policy, from which it draws new samples. The methodology is general in that it applies to both discrete and continuous action spaces, and can handle a wide variety of proximity constraints for the non-parameterized optimization problem. We show how the Natural Policy Gradient and Trust Region Policy Optimization (NPG/TRPO) problems, and the Proximal Policy Optimization (PPO) problem can be addressed by this methodology. The SPU implementation is much simpler than TRPO. In terms of sample efficiency, our extensive experiments show SPU outperforms TRPO in Mujoco simulated robotic tasks and outperforms PPO in Atari video game tasks.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SJxTroR9F7
PDF https://openreview.net/pdf?id=SJxTroR9F7
PWC https://paperswithcode.com/paper/supervised-policy-update
Repo https://github.com/quanvuong/Supervised_Policy_Update
Framework tf

Manifold Mixup: Learning Better Representations by Interpolating Hidden States

Title Manifold Mixup: Learning Better Representations by Interpolating Hidden States
Authors Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Aaron Courville, Ioannis Mitliagkas, Yoshua Bengio
Abstract Deep networks often perform well on the data distribution on which they are trained, yet give incorrect (and often very confident) answers when evaluated on points from off of the training distribution. This is exemplified by the adversarial examples phenomenon but can also be seen in terms of model generalization and domain shift. Ideally, a model would assign lower confidence to points unlike those from the training distribution. We propose a regularizer which addresses this issue by training with interpolated hidden states and encouraging the classifier to be less confident at these points. Because the hidden states are learned, this has an important effect of encouraging the hidden states for a class to be concentrated in such a way so that interpolations within the same class or between two different classes do not intersect with the real data points from other classes. This has a major advantage in that it avoids the underfitting which can result from interpolating in the input space. We prove that the exact condition for this problem of underfitting to be avoided by Manifold Mixup is that the dimensionality of the hidden states exceeds the number of classes, which is often the case in practice. Additionally, this concentration can be seen as making the features in earlier layers more discriminative. We show that despite requiring no significant additional computation, Manifold Mixup achieves large improvements over strong baselines in supervised learning, robustness to single-step adversarial attacks, semi-supervised learning, and Negative Log-Likelihood on held out samples.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=rJlRKjActQ
PDF https://openreview.net/pdf?id=rJlRKjActQ
PWC https://paperswithcode.com/paper/manifold-mixup-learning-better
Repo https://github.com/vikasverma1077/manifold_mixup
Framework pytorch
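
The interpolation idea in the Manifold Mixup abstract above can be illustrated with a minimal sketch. It assumes that hidden states and one-hot labels are plain Python lists and that the mixing weight `lam` is drawn from a Beta(alpha, alpha) distribution, as is common in mixup-style training; the function name `mixup_pair` and the value of `alpha` are hypothetical and are not taken from the linked repository.

```python
import random


def mixup_pair(h_a, h_b, y_a, y_b, lam):
    """Linearly interpolate two hidden-state vectors and their labels.

    lam = 1.0 returns the first example unchanged, lam = 0.0 the second.
    """
    h_mix = [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return h_mix, y_mix


# Assumed hyperparameter: alpha controls how far lam tends to sit from 0 or 1.
alpha = 2.0
lam = random.betavariate(alpha, alpha)

# Two toy hidden states with one-hot labels for two classes.
h_mix, y_mix = mixup_pair([1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam)
```

Because the soft label `y_mix` is no longer one-hot, a classifier trained on the pair `(h_mix, y_mix)` is pushed toward lower confidence at interpolated points, which is the behaviour the abstract describes.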

Efficient characterization of electrically evoked responses for neural interfaces

Title Efficient characterization of electrically evoked responses for neural interfaces
Authors Nishal Shah, Sasidhar Madugula, Pawel Hottowy, Alexander Sher, Alan Litke, Liam Paninski, E.J. Chichilnisky
Abstract Future neural interfaces will read and write population neural activity with high spatial and temporal resolution, for diverse applications. For example, an artificial retina may restore vision to the blind by electrically stimulating retinal ganglion cells. Such devices must tune their function, based on stimulating and recording, to match the function of the circuit. However, existing methods for characterizing the neural interface scale poorly with the number of electrodes, limiting their practical applicability. This work tests the idea that using prior information from previous experiments and closed-loop measurements may greatly increase the efficiency of the neural interface. Large-scale, high-density electrical recording and stimulation in primate retina were used as a lab prototype for an artificial retina. Three key calibration steps were optimized: spike sorting in the presence of stimulation artifacts, response modeling, and adaptive stimulation. For spike sorting, exploiting the similarity of electrical artifact across electrodes and experiments substantially reduced the number of required measurements. For response modeling, a joint model that captures the inverse relationship between recorded spike amplitude and electrical stimulation threshold from previously recorded retinas resulted in greater consistency and efficiency. For adaptive stimulation, choosing which electrodes to stimulate based on probability estimates from previous measurements improved efficiency. Similar improvements resulted from using either non-adaptive stimulation with a joint model across cells, or adaptive stimulation with an independent model for each cell. Finally, image reconstruction revealed that these improvements may translate to improved performance of an artificial retina.
Tasks Calibration, Image Reconstruction
Published 2019-12-01
URL http://papers.nips.cc/paper/9588-efficient-characterization-of-electrically-evoked-responses-for-neural-interfaces
PDF http://papers.nips.cc/paper/9588-efficient-characterization-of-electrically-evoked-responses-for-neural-interfaces.pdf
PWC https://paperswithcode.com/paper/efficient-characterization-of-electrically
Repo https://github.com/Chichilnisky-Lab/shah-neurips-2019
Framework tf

Post training 4-bit quantization of convolutional networks for rapid-deployment

Title Post training 4-bit quantization of convolutional networks for rapid-deployment
Authors Ron Banner, Yury Nahshan, Daniel Soudry
Abstract Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources. Neural network quantization has significant benefits in reducing the amount of intermediate results, but it often requires the full datasets and time-consuming fine tuning to recover the accuracy lost after quantization. This paper introduces the first practical 4-bit post training quantization approach: it does not involve training the quantized model (fine-tuning), nor does it require the availability of the full dataset. We target the quantization of both activations and weights and suggest three complementary methods for minimizing quantization error at the tensor level, two of which obtain a closed-form analytical solution. Combining these methods, our approach achieves accuracy that is just a few percent less than the state-of-the-art baseline across a wide range of convolutional models. The source code to replicate all experiments is available on GitHub: https://github.com/submission2019/cnn-quantization.
Tasks Quantization
Published 2019-12-01
URL http://papers.nips.cc/paper/9008-post-training-4-bit-quantization-of-convolutional-networks-for-rapid-deployment
PDF http://papers.nips.cc/paper/9008-post-training-4-bit-quantization-of-convolutional-networks-for-rapid-deployment.pdf
PWC https://paperswithcode.com/paper/post-training-4-bit-quantization-of-1
Repo https://github.com/submission2019/cnn-quantization
Framework pytorch
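
To make "quantization error at the tensor level" in the abstract above concrete, here is a minimal sketch of a generic uniform quantizer with a clipping threshold. It is not the paper's method: the function names, the symmetric clipping range, and the mean-squared-error measure are assumptions chosen for illustration only.

```python
def quantize_uniform(values, clip, num_bits=4):
    """Clip values to [-clip, clip] and round onto 2**num_bits evenly spaced levels."""
    steps = 2 ** num_bits - 1            # number of intervals between lowest and highest level
    step = 2.0 * clip / steps
    out = []
    for v in values:
        c = max(-clip, min(clip, v))     # saturate outliers at the clipping threshold
        idx = round((c + clip) / step)   # integer code in [0, steps]
        out.append(-clip + idx * step)   # map the code back to a real value
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy tensor with one outlier; compare the error for two clipping thresholds.
tensor = [-0.9, -0.2, 0.1, 0.3, 0.8, 6.0]
err_tight = mse(tensor, quantize_uniform(tensor, clip=1.0))
err_wide = mse(tensor, quantize_uniform(tensor, clip=6.0))
```

A tight threshold represents small values finely but distorts the outlier, while a wide threshold does the opposite. Choosing the threshold is the kind of trade-off a method that minimizes quantization error has to resolve.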

Selecting Optimal Decisions via Distributionally Robust Nearest-Neighbor Regression

Title Selecting Optimal Decisions via Distributionally Robust Nearest-Neighbor Regression
Authors Ruidi Chen, Ioannis Paschalidis
Abstract This paper develops a prediction-based prescriptive model for optimal decision making that (i) predicts the outcome under each action using a robust nonlinear model, and (ii) adopts a randomized prescriptive policy determined by the predicted outcomes. The predictive model combines a new regularized regression technique, which was developed using Distributionally Robust Optimization (DRO) with an ambiguity set constructed from the Wasserstein metric, with the K-Nearest Neighbors (K-NN) regression, which helps to capture the nonlinearity embedded in the data. We show theoretical results that guarantee the out-of-sample performance of the predictive model, and prove the optimality of the randomized policy in terms of the expected true future outcome. We demonstrate the proposed methodology on a hypertension dataset, showing that our prescribed treatment leads to a larger reduction in the systolic blood pressure compared to a series of alternatives. A clinically meaningful threshold level used to activate the randomized policy is also derived under a sub-Gaussian assumption on the predicted outcome.
Tasks Decision Making
Published 2019-12-01
URL http://papers.nips.cc/paper/8363-selecting-optimal-decisions-via-distributionally-robust-nearest-neighbor-regression
PDF http://papers.nips.cc/paper/8363-selecting-optimal-decisions-via-distributionally-robust-nearest-neighbor-regression.pdf
PWC https://paperswithcode.com/paper/selecting-optimal-decisions-via
Repo https://github.com/noc-lab/Select-Optimal-Decisions-via-DRO-KNN.git
Framework none

More Complete Resultset Retrieval from Large Heterogeneous RDF Sources

Title More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
Authors Andre Valdestilhas, Tommaso Soru, Muhammad Saleem
Abstract Over the last years, the Web of Data has grown significantly. Various interfaces such as LOD Stats, LOD Laundromat, and SPARQL endpoints provide access to hundreds of thousands of RDF datasets, representing billions of facts. These datasets are available in different formats such as raw data dumps and HDT files or directly accessible via SPARQL endpoints. Querying such a large amount of distributed data is particularly challenging and many of these datasets cannot be directly queried using the SPARQL query language. In order to tackle these problems, we present WimuQ, an integrated query engine to execute SPARQL queries and retrieve results from a large amount of heterogeneous RDF data sources. Presently, WimuQ is able to execute both federated and non-federated SPARQL queries over a total of 668,166 datasets from LOD Stats and LOD Laundromat as well as 559 active SPARQL endpoints. These data sources represent a total of 221.7 billion triples from more than 5 terabytes of information from datasets retrieved using the service “Where is My URI” (WIMU). Our evaluation on state-of-the-art real-data benchmarks shows that WimuQ retrieves more complete results for the benchmark queries.
Tasks RDF Dataset Discovery
Published 2019-11-12
URL https://dl.acm.org/doi/10.1145/3360901.3364436#d2419191e1
PDF https://svn.aksw.org/papers/2019/KCAP2019_WIMUQ/public.pdf
PWC https://paperswithcode.com/paper/more-complete-resultset-retrieval-from-large
Repo https://github.com/firmao/wimut
Framework none

A Dataset for Noun Compositionality Detection for a Slavic Language

Title A Dataset for Noun Compositionality Detection for a Slavic Language
Authors Dmitry Puzyrev, Artem Shelmanov, Alexander Panchenko, Ekaterina Artemova
Abstract This paper presents the first gold-standard resource for Russian annotated with compositionality information of noun compounds. The compound phrases are collected from the Universal Dependency treebanks according to part of speech patterns, such as ADJ+NOUN or NOUN+NOUN, using the gold-standard annotations. Each compound phrase is annotated by two experts and a moderator according to the following schema: the phrase can be either compositional, non-compositional, or ambiguous (i.e., depending on the context it can be interpreted as either compositional or non-compositional). We conduct an experimental evaluation of models and methods for predicting compositionality of noun compounds in unsupervised and supervised setups. We show that methods from previous work evaluated on the proposed Russian-language resource achieve performance comparable to results on English corpora.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3708/
PDF https://www.aclweb.org/anthology/W19-3708
PWC https://paperswithcode.com/paper/a-dataset-for-noun-compositionality-detection
Repo https://github.com/slangtech/ru-comps
Framework none

Muscle-actuated Human Simulation and Control

Title Muscle-actuated Human Simulation and Control
Authors Seunghwan Lee, Kyoungmin Lee, Moonseok Park, Jehee Lee
Abstract Many anatomical factors, such as bone geometry and muscle condition, interact to affect human movements. This work aims to build a comprehensive musculoskeletal model and its control system that reproduces realistic human movements driven by muscle contraction dynamics. The variations in the anatomic model generate a spectrum of human movements ranging from typical to highly stylistic movements. To do so, we discuss scalable and reliable simulation of anatomical features, robust control of under-actuated dynamical systems based on deep reinforcement learning, and modeling of pose-dependent joint limits. The key technical contribution is a scalable, two-level imitation learning algorithm that can deal with a comprehensive full-body musculoskeletal model with 346 muscles. We demonstrate the predictive simulation of dynamic motor skills under anatomical conditions including bone deformity, muscle weakness, contracture, and the use of a prosthesis. We also simulate various pathological gaits and predictively visualize how orthopedic surgeries improve post-operative gaits.
Tasks Imitation Learning
Published 2019-07-23
URL http://mrl.snu.ac.kr/research/ProjectScalable/Page.htm
PDF http://mrl.snu.ac.kr/research/ProjectScalable/Paper.pdf
PWC https://paperswithcode.com/paper/muscle-actuated-human-simulation-and-control
Repo https://github.com/lsw9021/MASS
Framework pytorch

A state-space model for inferring effective connectivity of latent neural dynamics from simultaneous EEG/fMRI

Title A state-space model for inferring effective connectivity of latent neural dynamics from simultaneous EEG/fMRI
Authors Tao Tu, John Paisley, Stefan Haufe, Paul Sajda
Abstract Inferring effective connectivity between spatially segregated brain regions is important for understanding human brain dynamics in health and disease. Non-invasive neuroimaging modalities, such as electroencephalography (EEG) and functional magnetic resonance imaging (fMRI), are often used to make measurements and infer connectivity. However, most studies do not consider integrating the two modalities even though each is an indirect measure of the latent neural dynamics and each has its own spatial and/or temporal limitations. In this study, we develop a linear state-space model to infer the effective connectivity in a distributed brain network based on simultaneously recorded EEG and fMRI data. Our method first identifies task-dependent and subject-dependent regions of interest (ROI) based on the analysis of fMRI data. Directed influences between the latent neural states at these ROIs are then modeled as a multivariate autoregressive (MVAR) process driven by various exogenous inputs. The latent neural dynamics give rise to the observed scalp EEG measurements via a biophysically informed linear EEG forward model. We use a mean-field variational Bayesian approach to infer the posterior distribution of latent states and model parameters. The performance of the model was evaluated on two sets of simulations. Our results emphasize the importance of obtaining accurate spatial localization of ROIs from fMRI. Finally, we applied the model to simultaneously recorded EEG-fMRI data from 10 subjects during a Face-Car-House visual categorization task and compared the change in connectivity induced by different stimulus categories.
Tasks EEG
Published 2019-12-01
URL http://papers.nips.cc/paper/8714-a-state-space-model-for-inferring-effective-connectivity-of-latent-neural-dynamics-from-simultaneous-eegfmri
PDF http://papers.nips.cc/paper/8714-a-state-space-model-for-inferring-effective-connectivity-of-latent-neural-dynamics-from-simultaneous-eegfmri.pdf
PWC https://paperswithcode.com/paper/a-state-space-model-for-inferring-effective
Repo https://github.com/taotu/VBLDS_Connectivity_EEG_fMRI
Framework none
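
The multivariate autoregressive (MVAR) process named in the abstract above can be sketched as a simple recursion. This is an illustrative toy under stated assumptions, not the paper's model: it uses a first-order process, omits the noise term and the EEG forward model, and the names `mvar_step`, `simulate`, `A`, and `B` are hypothetical.

```python
def mvar_step(A, x_prev, B, u):
    """One step of x_t = A x_{t-1} + B u_t (noise term omitted for clarity)."""
    n = len(x_prev)
    return [
        sum(A[i][j] * x_prev[j] for j in range(n))
        + sum(B[i][k] * u[k] for k in range(len(u)))
        for i in range(n)
    ]


def simulate(A, B, x0, inputs):
    """Roll the recursion forward over a sequence of exogenous inputs."""
    states = [x0]
    for u in inputs:
        states.append(mvar_step(A, states[-1], B, u))
    return states


# Two latent regions: region 0 receives the input and drives region 1.
A = [[0.5, 0.0],
     [0.25, 0.5]]   # A[1][0] != 0 encodes a directed influence 0 -> 1
B = [[1.0],
     [0.0]]         # only region 0 is driven by the exogenous input
states = simulate(A, B, [0.0, 0.0], [[1.0], [0.0], [0.0]])
```

In this toy setup the off-diagonal entries of `A` play the role of directed influences between regions: region 1 only becomes active after region 0 has been stimulated.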

TopNet: Structural Point Cloud Decoder

Title TopNet: Structural Point Cloud Decoder
Authors Lyne P. Tchapmi, Vineet Kosaraju, Hamid Rezatofighi, Ian Reid, Silvio Savarese
Abstract 3D point cloud generation is of great use for 3D scene modeling and understanding. Real-world 3D object point clouds can be properly described by a collection of low-level and high-level structures such as surfaces, geometric primitives, semantic parts, etc. In fact, there exist many different representations of a 3D object point cloud as a set of point groups. Existing frameworks for point cloud generation either do not consider structure in their proposed solutions, or assume and enforce a specific structure/topology, e.g., a collection of manifolds or surfaces, for the generated point cloud of a 3D object. In this work, we propose a novel decoder that generates a structured point cloud without assuming any specific structure or topology on the underlying point set. Our decoder is softly constrained to generate a point cloud following a hierarchical rooted tree structure. We show that given enough capacity and allowing for redundancies, the proposed decoder is very flexible and able to learn any arbitrary grouping of points including any topology on the point set. We evaluate our decoder on the task of point cloud generation for 3D point cloud shape completion. Combined with encoders from existing frameworks, we show that our proposed decoder significantly outperforms state-of-the-art 3D point cloud completion methods on the ShapeNet dataset.
Tasks Point Cloud Generation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Tchapmi_TopNet_Structural_Point_Cloud_Decoder_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Tchapmi_TopNet_Structural_Point_Cloud_Decoder_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/topnet-structural-point-cloud-decoder
Repo https://github.com/lynetcha/completion3d
Framework pytorch

Towards Improving Neural Named Entity Recognition with Gazetteers

Title Towards Improving Neural Named Entity Recognition with Gazetteers
Authors Tianyu Liu, Jin-Ge Yao, Chin-Yew Lin
Abstract Most of the recently proposed neural models for named entity recognition have been purely data-driven, with a strong emphasis on getting rid of the efforts for collecting external resources or designing hand-crafted features. This could increase the chance of overfitting since the models cannot access any supervision signal beyond the small amount of annotated data, limiting their power to generalize beyond the annotated entities. In this work, we show that properly utilizing external gazetteers could benefit segmental neural NER models. We add a simple module on the recently proposed hybrid semi-Markov CRF architecture and observe some promising results.
Tasks Named Entity Recognition
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1524/
PDF https://www.aclweb.org/anthology/P19-1524
PWC https://paperswithcode.com/paper/towards-improving-neural-named-entity
Repo https://github.com/lyutyuh/acl19_subtagger
Framework pytorch

Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter

Title Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter
Authors Muhammad Okky Ibrohim, Indra Budi
Abstract Hate speech and abusive language spreading on social media need to be detected automatically to avoid conflict between citizens. Moreover, hate speech has a target, category, and level that also need to be detected to help the authorities prioritize which hate speech must be addressed immediately. This research discusses multi-label text classification for abusive language and hate speech detection, including detecting the target, category, and level of hate speech in Indonesian Twitter, using a machine learning approach with Support Vector Machine (SVM), Naive Bayes (NB), and Random Forest Decision Tree (RFDT) classifiers and Binary Relevance (BR), Label Power-set (LP), and Classifier Chains (CC) as the data transformation methods. We used several kinds of feature extraction, namely term frequency, orthography, and lexicon features. Our experimental results show that, in general, the RFDT classifier using LP as the transformation method gives the best accuracy with fast computational time.
Tasks Hate Speech Detection, Multi-Label Text Classification, Text Classification
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3506/
PDF https://www.aclweb.org/anthology/W19-3506
PWC https://paperswithcode.com/paper/multi-label-hate-speech-and-abusive-language
Repo https://github.com/okkyibrohim/id-multi-label-hate-speech-and-abusive-language-detection
Framework none
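
Two of the data transformation methods named in the abstract above, Binary Relevance and Label Power-set, can be shown on a toy example. The sketch below uses their standard definitions; the label names and function names are hypothetical and are not taken from the paper's dataset or repository.

```python
def binary_relevance(label_sets, all_labels):
    """Turn a multi-label problem into one binary target vector per label."""
    return {
        label: [1 if label in labels else 0 for labels in label_sets]
        for label in all_labels
    }


def label_powerset(label_sets):
    """Turn a multi-label problem into one multi-class problem.

    Every distinct combination of labels becomes its own class id.
    """
    class_ids = {}
    targets = []
    for labels in label_sets:
        key = tuple(sorted(labels))
        if key not in class_ids:
            class_ids[key] = len(class_ids)
        targets.append(class_ids[key])
    return targets, class_ids


# Four toy tweets with hypothetical labels.
tweets = [{"hate", "abusive"}, {"abusive"}, set(), {"hate", "abusive"}]
br_targets = binary_relevance(tweets, ["hate", "abusive"])
lp_targets, lp_classes = label_powerset(tweets)
```

Binary Relevance trains one independent classifier per label, so it ignores label co-occurrence. Label Power-set trains a single classifier over label combinations, which captures co-occurrence at the cost of a class count that grows with the number of distinct combinations.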

Transfer and Exploration via the Information Bottleneck

Title Transfer and Exploration via the Information Bottleneck
Authors Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew Botvinick, Yoshua Bengio, Sergey Levine
Abstract A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out decision states. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned model with an information bottleneck, we can identify decision states by examining where the model accesses the goal state through the bottleneck. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=rJg8yhAqKm
PDF https://openreview.net/pdf?id=rJg8yhAqKm
PWC https://paperswithcode.com/paper/transfer-and-exploration-via-the-information
Repo https://github.com/maximecb/gym-minigrid
Framework pytorch