January 25, 2020

Paper Group NAWR 14

Answering Naturally: Factoid to Full length Answer Generation. One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues. SUPERVISED POLICY UPDATE. Manifold Mixup: Learning Better Representations by Interpolating Hidden States. Efficient characterization of electrically evoke …

Answering Naturally: Factoid to Full length Answer Generation


Title	Answering Naturally: Factoid to Full length Answer Generation
Authors	Vaishali Pal, Manish Shrivastava, Irshad Bhat
Abstract	In recent years, the task of Question Answering over passages, also pitched as reading comprehension, has evolved into a very active research area. A reading comprehension system extracts a span of text, comprising named entities, dates, small phrases, etc., which serves as the answer to a given question. However, these spans of text would result in an unnatural reading experience in a conversational system. Usually, dialogue systems solve this issue by using template-based language generation. These systems, though adequate for a domain-specific task, are too restrictive and predefined for a domain-independent system. In order to present the user with a more conversational experience, we propose a pointer-generator-based full-length answer generator which can be used with most QA systems. Our system generates a full-length answer given a question and the extracted factoid/span answer without relying on the passage from where the answer was extracted. We also present a dataset of 315000 question, factoid answer and full-length answer triples. We have evaluated our system using ROUGE-1, ROUGE-2, ROUGE-L and BLEU and achieved a BLEU score of 74.05 and a ROUGE-L score of 86.25.
Tasks	Question Answering, Reading Comprehension, Text Generation
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5401/
PDF	https://www.aclweb.org/anthology/D19-5401
PWC	https://paperswithcode.com/paper/answering-naturally-factoid-to-full-length
Repo	https://github.com/kolk/AnsweringNaturally
Framework	pytorch
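
The ROUGE-L score reported in the abstract is built on the longest common subsequence (LCS) between a generated answer and a reference. The following is a minimal sketch of an LCS-based F1 score, assuming lowercase whitespace tokenization and equal weighting of precision and recall. It is not the authors' evaluation script, and the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """LCS-based F1 between two strings (assumed whitespace tokenization)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Calling `rouge_l_f1(generated_answer, reference_answer)` on a full-length answer and its reference gives a value between 0 and 1; published ROUGE implementations may differ in tokenization, stemming, and the precision/recall weighting.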

One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues


Title	One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues
Authors	Chongyang Tao, Wei Wu, Can Xu, Wenpeng Hu, Dongyan Zhao, Rui Yan
Abstract	Currently, researchers have paid great attention to retrieval-based dialogues in open-domain. In particular, people study the problem by investigating context-response matching for multi-turn response selection based on publicly recognized benchmark data sets. State-of-the-art methods require a response to interact with each utterance in a context from the beginning, but the interaction is performed in a shallow way. In this work, we let utterance-response interaction go deep by proposing an interaction-over-interaction network (IoI). The model performs matching by stacking multiple interaction blocks in which residual information from one time of interaction initiates the interaction process again. Thus, matching information within an utterance-response pair is extracted from the interaction of the pair in an iterative fashion, and the information flows along the chain of the blocks via representations. Evaluation results on three benchmark data sets indicate that IoI can significantly outperform state-of-the-art methods in terms of various matching metrics. Through further analysis, we also unveil how the depth of interaction affects the performance of IoI.
Tasks	Conversational Response Selection
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1001/
PDF	https://www.aclweb.org/anthology/P19-1001
PWC	https://paperswithcode.com/paper/one-time-of-interaction-may-not-be-enough-go
Repo	https://github.com/chongyangtao/IOI
Framework	tf

SUPERVISED POLICY UPDATE


Title	SUPERVISED POLICY UPDATE
Authors	Quan Vuong, Yiming Zhang, Keith W. Ross
Abstract	We propose a new sample-efficient methodology, called Supervised Policy Update (SPU), for deep reinforcement learning. Starting with data generated by the current policy, SPU formulates and solves a constrained optimization problem in the non-parameterized proximal policy space. Using supervised regression, it then converts the optimal non-parameterized policy to a parameterized policy, from which it draws new samples. The methodology is general in that it applies to both discrete and continuous action spaces, and can handle a wide variety of proximity constraints for the non-parameterized optimization problem. We show how the Natural Policy Gradient and Trust Region Policy Optimization (NPG/TRPO) problems, and the Proximal Policy Optimization (PPO) problem can be addressed by this methodology. The SPU implementation is much simpler than TRPO. In terms of sample efficiency, our extensive experiments show SPU outperforms TRPO in Mujoco simulated robotic tasks and outperforms PPO in Atari video game tasks.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=SJxTroR9F7
PDF	https://openreview.net/pdf?id=SJxTroR9F7
PWC	https://paperswithcode.com/paper/supervised-policy-update
Repo	https://github.com/quanvuong/Supervised_Policy_Update
Framework	tf

Manifold Mixup: Learning Better Representations by Interpolating Hidden States


Title	Manifold Mixup: Learning Better Representations by Interpolating Hidden States
Authors	Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Aaron Courville, Ioannis Mitliagkas, Yoshua Bengio
Abstract	Deep networks often perform well on the data distribution on which they are trained, yet give incorrect (and often very confident) answers when evaluated on points from off of the training distribution. This is exemplified by the adversarial examples phenomenon but can also be seen in terms of model generalization and domain shift. Ideally, a model would assign lower confidence to points unlike those from the training distribution. We propose a regularizer which addresses this issue by training with interpolated hidden states and encouraging the classifier to be less confident at these points. Because the hidden states are learned, this has an important effect of encouraging the hidden states for a class to be concentrated in such a way so that interpolations within the same class or between two different classes do not intersect with the real data points from other classes. This has a major advantage in that it avoids the underfitting which can result from interpolating in the input space. We prove that the exact condition for this problem of underfitting to be avoided by Manifold Mixup is that the dimensionality of the hidden states exceeds the number of classes, which is often the case in practice. Additionally, this concentration can be seen as making the features in earlier layers more discriminative. We show that despite requiring no significant additional computation, Manifold Mixup achieves large improvements over strong baselines in supervised learning, robustness to single-step adversarial attacks, semi-supervised learning, and Negative Log-Likelihood on held out samples.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=rJlRKjActQ
PDF	https://openreview.net/pdf?id=rJlRKjActQ
PWC	https://paperswithcode.com/paper/manifold-mixup-learning-better
Repo	https://github.com/vikasverma1077/manifold_mixup
Framework	pytorch
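
The core operation described in the abstract, training on interpolated hidden states with correspondingly softened targets, can be sketched for a single pair of examples. This is a minimal illustration, not the authors' implementation: drawing the mixing coefficient from a Beta(alpha, alpha) distribution is an assumption borrowed from input-space mixup, and the function names are hypothetical.

```python
import random


def mix(u, v, lam):
    """Convex combination lam * u + (1 - lam) * v of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(u, v)]


def manifold_mixup_pair(h_a, y_a, h_b, y_b, alpha=2.0, rng=random):
    """Interpolate two hidden states and their one-hot labels with a shared lam.

    lam ~ Beta(alpha, alpha) is an assumption, not taken from the abstract.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed_hidden = mix(h_a, h_b, lam)
    mixed_target = mix(y_a, y_b, lam)  # soft label: lower confidence between classes
    return mixed_hidden, mixed_target, lam
```

In a full network the mixed hidden state would be passed through the remaining layers and the loss computed against the mixed target; which layer to mix at is a design choice this sketch leaves open.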

Efficient characterization of electrically evoked responses for neural interfaces


Title	Efficient characterization of electrically evoked responses for neural interfaces
Authors	Nishal Shah, Sasidhar Madugula, Pawel Hottowy, Alexander Sher, Alan Litke, Liam Paninski, E.J. Chichilnisky
Abstract	Future neural interfaces will read and write population neural activity with high spatial and temporal resolution, for diverse applications. For example, an artificial retina may restore vision to the blind by electrically stimulating retinal ganglion cells. Such devices must tune their function, based on stimulating and recording, to match the function of the circuit. However, existing methods for characterizing the neural interface scale poorly with the number of electrodes, limiting their practical applicability. This work tests the idea that using prior information from previous experiments and closed-loop measurements may greatly increase the efficiency of the neural interface. Large-scale, high-density electrical recording and stimulation in primate retina were used as a lab prototype for an artificial retina. Three key calibration steps were optimized: spike sorting in the presence of stimulation artifacts, response modeling, and adaptive stimulation. For spike sorting, exploiting the similarity of electrical artifact across electrodes and experiments substantially reduced the number of required measurements. For response modeling, a joint model that captures the inverse relationship between recorded spike amplitude and electrical stimulation threshold from previously recorded retinas resulted in greater consistency and efficiency. For adaptive stimulation, choosing which electrodes to stimulate based on probability estimates from previous measurements improved efficiency. Similar improvements resulted from using either non-adaptive stimulation with a joint model across cells, or adaptive stimulation with an independent model for each cell. Finally, image reconstruction revealed that these improvements may translate to improved performance of an artificial retina.
Tasks	Calibration, Image Reconstruction
Published	2019-12-01
URL	http://papers.nips.cc/paper/9588-efficient-characterization-of-electrically-evoked-responses-for-neural-interfaces
PDF	http://papers.nips.cc/paper/9588-efficient-characterization-of-electrically-evoked-responses-for-neural-interfaces.pdf
PWC	https://paperswithcode.com/paper/efficient-characterization-of-electrically
Repo	https://github.com/Chichilnisky-Lab/shah-neurips-2019
Framework	tf

Post training 4-bit quantization of convolutional networks for rapid-deployment


Title	Post training 4-bit quantization of convolutional networks for rapid-deployment
Authors	Ron Banner, Yury Nahshan, Daniel Soudry
Abstract	Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources. Neural network quantization has significant benefits in reducing the amount of intermediate results, but it often requires the full datasets and time-consuming fine-tuning to recover the accuracy lost after quantization. This paper introduces the first practical 4-bit post-training quantization approach: it does not involve training the quantized model (fine-tuning), nor does it require the availability of the full dataset. We target the quantization of both activations and weights and suggest three complementary methods for minimizing quantization error at the tensor level, two of which obtain a closed-form analytical solution. Combining these methods, our approach achieves accuracy that is just a few percent less than the state-of-the-art baseline across a wide range of convolutional models. The source code to replicate all experiments is available on GitHub: https://github.com/submission2019/cnn-quantization.
Tasks	Quantization
Published	2019-12-01
URL	http://papers.nips.cc/paper/9008-post-training-4-bit-quantization-of-convolutional-networks-for-rapid-deployment
PDF	http://papers.nips.cc/paper/9008-post-training-4-bit-quantization-of-convolutional-networks-for-rapid-deployment.pdf
PWC	https://paperswithcode.com/paper/post-training-4-bit-quantization-of-1
Repo	https://github.com/submission2019/cnn-quantization
Framework	pytorch
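
To make "quantization error at the tensor level" concrete, here is a minimal sketch of uniform 4-bit quantization with a clipping threshold. It is an illustration only: the grid search over clip values stands in for the closed-form solutions mentioned in the abstract, and all function names are hypothetical rather than taken from the linked repository.

```python
def quantize_uniform(values, clip, bits=4):
    """Clip to [-clip, clip] and round onto 2**bits evenly spaced levels."""
    intervals = 2 ** bits - 1              # 15 intervals between 16 grid points
    step = 2.0 * clip / intervals
    out = []
    for v in values:
        c = max(-clip, min(clip, v))       # saturate outliers
        code = round((c + clip) / step)    # integer code in [0, intervals]
        out.append(-clip + code * step)    # de-quantized value
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def best_clip(values, candidates, bits=4):
    """Illustrative grid search for the clip value with the lowest MSE."""
    return min(candidates,
               key=lambda c: mse(values, quantize_uniform(values, c, bits)))
```

A small clip value gives fine resolution but saturates large values, while a large one covers the range with coarse steps; `best_clip` simply picks whichever candidate balances the two best for the given tensor.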

Selecting Optimal Decisions via Distributionally Robust Nearest-Neighbor Regression


Title	Selecting Optimal Decisions via Distributionally Robust Nearest-Neighbor Regression
Authors	Ruidi Chen, Ioannis Paschalidis
Abstract	This paper develops a prediction-based prescriptive model for optimal decision making that (i) predicts the outcome under each action using a robust nonlinear model, and (ii) adopts a randomized prescriptive policy determined by the predicted outcomes. The predictive model combines a new regularized regression technique, which was developed using Distributionally Robust Optimization (DRO) with an ambiguity set constructed from the Wasserstein metric, with the K-Nearest Neighbors (K-NN) regression, which helps to capture the nonlinearity embedded in the data. We show theoretical results that guarantee the out-of-sample performance of the predictive model, and prove the optimality of the randomized policy in terms of the expected true future outcome. We demonstrate the proposed methodology on a hypertension dataset, showing that our prescribed treatment leads to a larger reduction in the systolic blood pressure compared to a series of alternatives. A clinically meaningful threshold level used to activate the randomized policy is also derived under a sub-Gaussian assumption on the predicted outcome.
Tasks	Decision Making
Published	2019-12-01
URL	http://papers.nips.cc/paper/8363-selecting-optimal-decisions-via-distributionally-robust-nearest-neighbor-regression
PDF	http://papers.nips.cc/paper/8363-selecting-optimal-decisions-via-distributionally-robust-nearest-neighbor-regression.pdf
PWC	https://paperswithcode.com/paper/selecting-optimal-decisions-via
Repo	https://github.com/noc-lab/Select-Optimal-Decisions-via-DRO-KNN.git
Framework	none

More Complete Resultset Retrieval from Large Heterogeneous RDF Sources


Title	More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
Authors	Andre Valdestilhas, Tommaso Soru, Muhammad Saleem
Abstract	Over the last years, the Web of Data has grown significantly. Various interfaces such as LOD Stats, LOD Laundromat, and SPARQL endpoints provide access to hundreds of thousands of RDF datasets, representing billions of facts. These datasets are available in different formats such as raw data dumps and HDT files or directly accessible via SPARQL endpoints. Querying such a large amount of distributed data is particularly challenging, and many of these datasets cannot be directly queried using the SPARQL query language. In order to tackle these problems, we present WimuQ, an integrated query engine to execute SPARQL queries and retrieve results from a large number of heterogeneous RDF data sources. Presently, WimuQ is able to execute both federated and non-federated SPARQL queries over a total of 668,166 datasets from LOD Stats and LOD Laundromat as well as 559 active SPARQL endpoints. These data sources represent a total of 221.7 billion triples from more than 5 terabytes of information from datasets retrieved using the service “Where is My URI” (WIMU). Our evaluation on state-of-the-art real-data benchmarks shows that WimuQ retrieves more complete results for the benchmark queries.
Tasks	RDF Dataset Discovery
Published	2019-11-12
URL	https://dl.acm.org/doi/10.1145/3360901.3364436#d2419191e1
PDF	https://svn.aksw.org/papers/2019/KCAP2019_WIMUQ/public.pdf
PWC	https://paperswithcode.com/paper/more-complete-resultset-retrieval-from-large
Repo	https://github.com/firmao/wimut
Framework	none

A Dataset for Noun Compositionality Detection for a Slavic Language


Title	A Dataset for Noun Compositionality Detection for a Slavic Language
Authors	Dmitry Puzyrev, Artem Shelmanov, Alexander Panchenko, Ekaterina Artemova
Abstract	This paper presents the first gold-standard resource for Russian annotated with compositionality information of noun compounds. The compound phrases are collected from the Universal Dependency treebanks according to part of speech patterns, such as ADJ+NOUN or NOUN+NOUN, using the gold-standard annotations. Each compound phrase is annotated by two experts and a moderator according to the following schema: the phrase can be either compositional, non-compositional, or ambiguous (i.e., depending on the context it can be interpreted as either compositional or non-compositional). We conduct an experimental evaluation of models and methods for predicting compositionality of noun compounds in unsupervised and supervised setups. We show that methods from previous work evaluated on the proposed Russian-language resource achieve performance comparable with results on English corpora.
Tasks
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-3708/
PDF	https://www.aclweb.org/anthology/W19-3708
PWC	https://paperswithcode.com/paper/a-dataset-for-noun-compositionality-detection
Repo	https://github.com/slangtech/ru-comps
Framework	none

Muscle-actuated Human Simulation and Control


Title	Muscle-actuated Human Simulation and Control
Authors	Seunghwan Lee, Kyoungmin Lee, Moonseok Park, Jehee Lee
Abstract	Many anatomical factors, such as bone geometry and muscle condition, interact to affect human movements. This work aims to build a comprehensive musculoskeletal model and its control system that reproduces realistic human movements driven by muscle contraction dynamics. The variations in the anatomic model generate a spectrum of human movements ranging from typical to highly stylistic movements. To do so, we discuss scalable and reliable simulation of anatomical features, robust control of under-actuated dynamical systems based on deep reinforcement learning, and modeling of pose-dependent joint limits. The key technical contribution is a scalable, two-level imitation learning algorithm that can deal with a comprehensive full-body musculoskeletal model with 346 muscles. We demonstrate the predictive simulation of dynamic motor skills under anatomical conditions including bone deformity, muscle weakness, contracture, and the use of a prosthesis. We also simulate various pathological gaits and predictively visualize how orthopedic surgeries improve post-operative gaits.
Tasks	Imitation Learning
Published	2019-07-23
URL	http://mrl.snu.ac.kr/research/ProjectScalable/Page.htm
PDF	http://mrl.snu.ac.kr/research/ProjectScalable/Paper.pdf
PWC	https://paperswithcode.com/paper/muscle-actuated-human-simulation-and-control
Repo	https://github.com/lsw9021/MASS
Framework	pytorch

A state-space model for inferring effective connectivity of latent neural dynamics from simultaneous EEG/fMRI


Title	A state-space model for inferring effective connectivity of latent neural dynamics from simultaneous EEG/fMRI
Authors	Tao Tu, John Paisley, Stefan Haufe, Paul Sajda
Abstract	Inferring effective connectivity between spatially segregated brain regions is important for understanding human brain dynamics in health and disease. Non-invasive neuroimaging modalities, such as electroencephalography (EEG) and functional magnetic resonance imaging (fMRI), are often used to make measurements and infer connectivity. However, most studies do not consider integrating the two modalities even though each is an indirect measure of the latent neural dynamics and each has its own spatial and/or temporal limitations. In this study, we develop a linear state-space model to infer the effective connectivity in a distributed brain network based on simultaneously recorded EEG and fMRI data. Our method first identifies task-dependent and subject-dependent regions of interest (ROI) based on the analysis of fMRI data. Directed influences between the latent neural states at these ROIs are then modeled as a multivariate autoregressive (MVAR) process driven by various exogenous inputs. The latent neural dynamics give rise to the observed scalp EEG measurements via a biophysically informed linear EEG forward model. We use a mean-field variational Bayesian approach to infer the posterior distribution of latent states and model parameters. The performance of the model was evaluated on two sets of simulations. Our results emphasize the importance of obtaining accurate spatial localization of ROIs from fMRI. Finally, we applied the model to simultaneously recorded EEG-fMRI data from 10 subjects during a Face-Car-House visual categorization task and compared the change in connectivity induced by different stimulus categories.
Tasks	EEG
Published	2019-12-01
URL	http://papers.nips.cc/paper/8714-a-state-space-model-for-inferring-effective-connectivity-of-latent-neural-dynamics-from-simultaneous-eegfmri
PDF	http://papers.nips.cc/paper/8714-a-state-space-model-for-inferring-effective-connectivity-of-latent-neural-dynamics-from-simultaneous-eegfmri.pdf
PWC	https://paperswithcode.com/paper/a-state-space-model-for-inferring-effective
Repo	https://github.com/taotu/VBLDS_Connectivity_EEG_fMRI
Framework	none
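
The generative structure named in the abstract, latent states following an MVAR process driven by exogenous inputs and mapped to EEG through a linear forward model, can be sketched as a simulation. The sketch below rests on several assumptions not stated in the abstract: the process is first order, noise terms are omitted, and the matrices `A` (connectivity), `B` (input weights), and `L` (lead field) are hypothetical placeholders. It does not cover the variational Bayesian inference.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def simulate_mvar1(A, B, inputs, x0):
    """Noise-free first-order MVAR: x[t] = A x[t-1] + B u[t]."""
    states = []
    x = list(x0)
    for u in inputs:
        x = [p + q for p, q in zip(matvec(A, x), matvec(B, u))]
        states.append(x)
    return states


def eeg_forward(L, states):
    """Linear forward model: y[t] = L x[t], one EEG vector per time step."""
    return [matvec(L, x) for x in states]
```

Off-diagonal entries of `A` play the role of directed influences between regions: in this toy setup a nonzero `A[1][0]` lets activity in the first region drive the second on the next time step.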

TopNet: Structural Point Cloud Decoder


Title	TopNet: Structural Point Cloud Decoder
Authors	Lyne P. Tchapmi, Vineet Kosaraju, Hamid Rezatofighi, Ian Reid, Silvio Savarese
Abstract	3D point cloud generation is of great use for 3D scene modeling and understanding. Real-world 3D object point clouds can be properly described by a collection of low-level and high-level structures such as surfaces, geometric primitives, semantic parts, etc. In fact, there exist many different representations of a 3D object point cloud as a set of point groups. Existing frameworks for point cloud generation either do not consider structure in their proposed solutions, or assume and enforce a specific structure/topology, e.g. a collection of manifolds or surfaces, for the generated point cloud of a 3D object. In this work, we propose a novel decoder that generates a structured point cloud without assuming any specific structure or topology on the underlying point set. Our decoder is softly constrained to generate a point cloud following a hierarchical rooted tree structure. We show that given enough capacity and allowing for redundancies, the proposed decoder is very flexible and able to learn any arbitrary grouping of points including any topology on the point set. We evaluate our decoder on the task of point cloud generation for 3D point cloud shape completion. Combined with encoders from existing frameworks, we show that our proposed decoder significantly outperforms state-of-the-art 3D point cloud completion methods on the ShapeNet dataset.
Tasks	Point Cloud Generation
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Tchapmi_TopNet_Structural_Point_Cloud_Decoder_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Tchapmi_TopNet_Structural_Point_Cloud_Decoder_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/topnet-structural-point-cloud-decoder
Repo	https://github.com/lynetcha/completion3d
Framework	pytorch

Towards Improving Neural Named Entity Recognition with Gazetteers


Title	Towards Improving Neural Named Entity Recognition with Gazetteers
Authors	Tianyu Liu, Jin-Ge Yao, Chin-Yew Lin
Abstract	Most of the recently proposed neural models for named entity recognition have been purely data-driven, with a strong emphasis on getting rid of the efforts for collecting external resources or designing hand-crafted features. This could increase the chance of overfitting since the models cannot access any supervision signal beyond the small amount of annotated data, limiting their power to generalize beyond the annotated entities. In this work, we show that properly utilizing external gazetteers could benefit segmental neural NER models. We add a simple module on the recently proposed hybrid semi-Markov CRF architecture and observe some promising results.
Tasks	Named Entity Recognition
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1524/
PDF	https://www.aclweb.org/anthology/P19-1524
PWC	https://paperswithcode.com/paper/towards-improving-neural-named-entity
Repo	https://github.com/lyutyuh/acl19_subtagger
Framework	pytorch

Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter


Title	Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter
Authors	Muhammad Okky Ibrohim, Indra Budi
Abstract	Hate speech and abusive language spreading on social media need to be detected automatically to avoid conflict between citizens. Moreover, hate speech has a target, category, and level that also need to be detected to help the authorities prioritize which hate speech must be addressed immediately. This research discusses multi-label text classification for abusive language and hate speech detection, including detecting the target, category, and level of hate speech, in Indonesian Twitter using a machine learning approach with Support Vector Machine (SVM), Naive Bayes (NB), and Random Forest Decision Tree (RFDT) classifiers and Binary Relevance (BR), Label Power-set (LP), and Classifier Chains (CC) as the data transformation methods. We used several kinds of feature extraction: term frequency, orthography, and lexicon features. Our experimental results show that, in general, the RFDT classifier using LP as the transformation method gives the best accuracy with fast computational time.
Tasks	Hate Speech Detection, Multi-Label Text Classification, Text Classification
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-3506/
PDF	https://www.aclweb.org/anthology/W19-3506
PWC	https://paperswithcode.com/paper/multi-label-hate-speech-and-abusive-language
Repo	https://github.com/okkyibrohim/id-multi-label-hate-speech-and-abusive-language-detection
Framework	none
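
Two of the data transformation methods named in the abstract, Binary Relevance and Label Power-set, turn a multi-label problem into targets that ordinary classifiers can handle. The sketch below shows only the target transformation, with hypothetical function and label names; it is not the authors' code and leaves out Classifier Chains and the classifiers themselves.

```python
def binary_relevance(label_sets, labels):
    """One binary target vector per label: 1 where the sample carries that label."""
    return {lab: [int(lab in s) for s in label_sets] for lab in labels}


def label_powerset(label_sets):
    """Map each distinct label combination to a single multi-class id."""
    class_ids = {}
    targets = []
    for s in label_sets:
        key = tuple(sorted(s))
        if key not in class_ids:
            class_ids[key] = len(class_ids)   # next unused class id
        targets.append(class_ids[key])
    return targets, class_ids
```

Binary Relevance trains one independent binary classifier per label, whereas Label Power-set trains a single multi-class classifier whose classes are the label combinations seen in the data, so it can capture label co-occurrence at the cost of many rare classes.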

Transfer and Exploration via the Information Bottleneck


Title	Transfer and Exploration via the Information Bottleneck
Authors	Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew Botvinick, Yoshua Bengio, Sergey Levine
Abstract	A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out decision states. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned model with an information bottleneck, we can identify decision states by examining where the model accesses the goal state through the bottleneck. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=rJg8yhAqKm
PDF	https://openreview.net/pdf?id=rJg8yhAqKm
PWC	https://paperswithcode.com/paper/transfer-and-exploration-via-the-information
Repo	https://github.com/maximecb/gym-minigrid
Framework	pytorch