October 16, 2019

Paper Group NAWR 31

Veyn at PARSEME Shared Task 2018: Recurrent Neural Networks for VMWE Identification


Title	Veyn at PARSEME Shared Task 2018: Recurrent Neural Networks for VMWE Identification
Authors	Nicolas Zampieri, Manon Scholivet, Carlos Ramisch, Benoit Favre
Abstract	This paper describes the Veyn system, submitted to the closed track of the PARSEME Shared Task 2018 on automatic identification of verbal multiword expressions (VMWEs). Veyn is based on a sequence tagger using recurrent neural networks. We represent VMWEs using a variant of the begin-inside-outside encoding scheme combined with the VMWE category tag. In addition to the system description, we present development experiments to determine the best tagging scheme. Veyn is freely available, covers 19 languages, and was ranked ninth (MWE-based) and eighth (Token-based) among 13 submissions, considering macro-averaged F1 across languages.
Tasks	Machine Translation, Word Embeddings
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4933/
PDF	https://www.aclweb.org/anthology/W18-4933
PWC	https://paperswithcode.com/paper/veyn-at-parseme-shared-task-2018-recurrent
Repo	https://github.com/zamp13/Veyn
Framework	tf
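
The tagging scheme named in the abstract, begin-inside-outside labels combined with the VMWE category, can be illustrated with a minimal sketch. The function name `encode_bio`, the span format, and the example sentence are assumptions made for illustration; they are not taken from the Veyn code base, and the exact variant Veyn uses may differ.

```python
from typing import List, Tuple


def encode_bio(n_tokens: int, spans: List[Tuple[List[int], str]]) -> List[str]:
    """Label each token as O, B-<category>, or I-<category>.

    spans holds (token indices, category) pairs, one per expression.
    Indices may be non-contiguous, since verbal MWEs can contain gaps.
    """
    tags = ["O"] * n_tokens
    for indices, category in spans:
        ordered = sorted(indices)
        # The first token of the expression opens it ...
        tags[ordered[0]] = f"B-{category}"
        # ... and every later token continues it.
        for i in ordered[1:]:
            tags[i] = f"I-{category}"
    return tags


# Hypothetical example: "took ... shower" tagged as a light-verb construction.
tokens = ["She", "took", "a", "quick", "shower"]
tags = encode_bio(len(tokens), [([1, 4], "LVC.full")])
print(list(zip(tokens, tags)))
```

Tokens inside the gap ("a", "quick") stay `O` in this sketch; a scheme variant could instead give them a dedicated gap label.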

Real-world multiobject, multigrasp detection


Title	Real-world multiobject, multigrasp detection
Authors	Fu-Jen Chu, Ruinian Xu and Patricio A. Vela
Abstract	A deep learning architecture is proposed to predict graspable locations for robotic manipulation. It considers situations where no, one, or multiple object(s) are seen. By defining the learning problem to be classification with null hypothesis competition instead of regression, the deep neural network with red, green, blue and depth (RGB-D) image input predicts multiple grasp candidates for a single object or multiple objects, in a single shot. The method outperforms state-of-the-art approaches on the Cornell dataset with 96.0% and 96.1% accuracy on image-wise and object-wise splits, respectively. Evaluation on a multiobject dataset illustrates the generalization capability of the architecture. Grasping experiments achieve 96.0% grasp localization and 89.0% grasping success rates on a test set of household objects. The real-time process takes less than 0.25 s from image to plan.
Tasks	Robotic Grasping
Published	2018-10-01
URL	https://arxiv.org/abs/1802.00520
PDF	https://arxiv.org/pdf/1802.00520.pdf
PWC	https://paperswithcode.com/paper/real-world-multiobject-multigrasp-detection
Repo	https://github.com/ivalab/grasp_multiObject_multiGrasp
Framework	tf

Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control


Title	Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control
Authors	Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
Abstract	A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization. To this end, we introduce universal planning networks (UPN). UPNs embed differentiable planning within a goal-directed policy. This planning computation unrolls a forward model in a latent space and infers an optimal action plan through gradient descent trajectory optimization. The plan-by-gradient-descent process and its underlying representations are learned end-to-end to directly optimize a supervised imitation learning objective. We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images. The learned representations can be leveraged to specify distance-based rewards to reach new target states for model-free reinforcement learning, resulting in substantially more effective learning when solving new tasks described via image-based goals. We were able to achieve successful transfer of visuomotor planning strategies across robots with significantly different morphologies and actuation capabilities. Visit https://sites.google.com/view/upn-public/home for video highlights.
Tasks	Imitation Learning
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=2340
PDF	http://proceedings.mlr.press/v80/srinivas18b/srinivas18b.pdf
PWC	https://paperswithcode.com/paper/universal-planning-networks-learning
Repo	https://github.com/aravindsrinivas/upn
Framework	tf

Optimally Selected Minimal Learning Machine


Title	Optimally Selected Minimal Learning Machine
Authors	Átilla N. Maia, Madson L. D. Dias, João P. P. Gomes, Ajalmar R. da Rocha Neto
Abstract	This paper introduces a new approach to selecting reference points (RPs) for the minimal learning machine (MLM) in classification tasks. A critical issue related to the training process in MLM is the selection of RPs, from which the distances are taken. In its original formulation, the MLM selects the RPs randomly from the data. We propose a new method called optimally selected minimal learning machine (OS-MLM) to select the RPs. Our proposal relies on the multiresponse sparse regression (MRSR) ranking method, which is used to sort the patterns in terms of relevance. After doing so, the leave-one-out (LOO) criterion is also used in order to select an appropriate number of reference points. Based on the simulations we carried out, our proposal achieved a lower number of reference points with an equivalent, or even superior, accuracy with respect to the original MLM and its variants.
Tasks
Published	2018-11-09
URL	https://link.springer.com/chapter/10.1007%2F978-3-030-03493-1_70
PDF	https://www.researchgate.net/profile/Madson_Dias/publication/328819483_Optimally_Selected_Minimal_Learning_Machine/links/5d7a605e4585157fde0fce53/Optimally-Selected-Minimal-Learning-Machine.pdf
PWC	https://paperswithcode.com/paper/optimally-selected-minimal-learning-machine
Repo	https://github.com/omadson/scikit-mlm
Framework	none

DataSEARCH at IEST 2018: Multiple Word Embedding based Models for Implicit Emotion Classification of Tweets with Deep Learning


Title	DataSEARCH at IEST 2018: Multiple Word Embedding based Models for Implicit Emotion Classification of Tweets with Deep Learning
Authors	Yasas Senarath, Uthayasanker Thayasivam
Abstract	This paper describes an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple neural networks. The system described in this paper is composed of a sequential combination of Long Short-Term Memory and Convolutional Neural Network for feature extraction and Feedforward Neural Network for classification. In this paper, we successfully show that features extracted using multiple pre-trained embeddings can be used to improve the overall performance of the system with Emoji being one of the significant features. The evaluations show that our approach outperforms the baseline system by more than 8% without using any external corpus or lexicon. This approach is ranked 8th in Implicit Emotion Shared Task (IEST) at WASSA-2018.
Tasks	Emotion Classification, Opinion Mining, Sentiment Analysis, Text Classification
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-6230/
PDF	https://www.aclweb.org/anthology/W18-6230
PWC	https://paperswithcode.com/paper/datasearch-at-iest-2018-multiple-word
Repo	https://github.com/ysenarath/opinion-lab
Framework	none

Can We Gain More from Orthogonality Regularizations in Training Deep Networks?


Title	Can We Gain More from Orthogonality Regularizations in Training Deep Networks?
Authors	Nitin Bansal, Xiaohan Chen, Zhangyang Wang
Abstract	This paper seeks to answer the question: as the (near-) orthogonality of weights is found to be a favorable property for training deep convolutional neural networks, how can we enforce it in more effective and easy-to-use ways? We develop novel orthogonality regularizations on training deep CNNs, utilizing various advanced analytical tools such as mutual coherence and restricted isometry property. These plug-and-play regularizations can be conveniently incorporated into training almost any CNN without extra hassle. We then benchmark their effects on state-of-the-art models: ResNet, WideResNet, and ResNeXt, on several most popular computer vision datasets: CIFAR-10, CIFAR-100, SVHN and ImageNet. We observe consistent performance gains after applying those proposed regularizations, in terms of both the final accuracies achieved, and faster and more stable convergences. We have made our codes and pre-trained models publicly available: https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/7680-can-we-gain-more-from-orthogonality-regularizations-in-training-deep-networks
PDF	http://papers.nips.cc/paper/7680-can-we-gain-more-from-orthogonality-regularizations-in-training-deep-networks.pdf
PWC	https://paperswithcode.com/paper/can-we-gain-more-from-orthogonality-1
Repo	https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality
Framework	pytorch
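
As a rough illustration of what an orthogonality regularizer measures, the sketch below computes a soft orthogonality penalty of the form ||W^T W - I||_F^2 with NumPy. This is one common formulation, chosen here as an assumption; the regularizers the paper derives from mutual coherence and the restricted isometry property are not reproduced, and the function name is hypothetical.

```python
import numpy as np


def soft_orthogonality_penalty(weight: np.ndarray) -> float:
    """Return ||W^T W - I||_F^2 for a 2-D weight matrix W.

    Assumption: convolution kernels have already been reshaped to 2-D,
    with one column per output channel.
    """
    gram = weight.T @ weight
    identity = np.eye(gram.shape[0])
    return float(np.sum((gram - identity) ** 2))


rng = np.random.default_rng(0)
# Reduced QR yields an 8x4 matrix with orthonormal columns.
q, _ = np.linalg.qr(rng.standard_normal((8, 4)))

print(soft_orthogonality_penalty(q))        # approximately zero
print(soft_orthogonality_penalty(2.0 * q))  # W^T W = 4I, so 4 * 3^2 = 36 up to rounding
```

In training, a term like this would typically be scaled by a small coefficient and added to the task loss; the coefficient and schedule are left out of this sketch.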

Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning


Title	Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Authors	Jinyoung Yeo, Gyeongbok Lee, Gengyu Wang, Seungtaek Choi, Hyunsouk Cho, Reinald Kim Amplayo, Seung-won Hwang
Abstract
Tasks	Image Captioning, Question Answering, Video Question Answering, Visual Question Answering, Visual Reasoning
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1316/
PDF	https://www.aclweb.org/anthology/L18-1316
PWC	https://paperswithcode.com/paper/visual-choice-of-plausible-alternatives-an
Repo	https://github.com/antest1/VCOPA-Dataset
Framework	none

Discretely Relaxing Continuous Variables for tractable Variational Inference


Title	Discretely Relaxing Continuous Variables for tractable Variational Inference
Authors	Trefor Evans, Prasanth Nair
Abstract	We explore a new research direction in Bayesian variational inference with discrete latent variable priors where we exploit Kronecker matrix algebra for efficient and exact computations of the evidence lower bound (ELBO). The proposed “DIRECT” approach has several advantages over its predecessors: (i) it can exactly compute ELBO gradients (i.e. unbiased, zero-variance gradient estimates), eliminating the need for high-variance stochastic gradient estimators and enabling the use of quasi-Newton optimization methods; (ii) its training complexity is independent of the number of training points, permitting inference on large datasets; and (iii) its posterior samples consist of sparse and low-precision quantized integers which permit fast inference on hardware-limited devices. In addition, our DIRECT models can exactly compute statistical moments of the parameterized predictive posterior without relying on Monte Carlo sampling. The DIRECT approach is not practical for all likelihoods; however, we identify a popular model structure which is practical, and demonstrate accurate inference using latent variables discretized as extremely low-precision 4-bit quantized integers. While the ELBO computations considered in the numerical studies require over 10^2352 log-likelihood evaluations, we train on datasets with over two million points in just seconds.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/8247-discretely-relaxing-continuous-variables-for-tractable-variational-inference
PDF	http://papers.nips.cc/paper/8247-discretely-relaxing-continuous-variables-for-tractable-variational-inference.pdf
PWC	https://paperswithcode.com/paper/discretely-relaxing-continuous-variables-for-1
Repo	https://github.com/treforevans/direct
Framework	tf

Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages


Title	Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages
Authors	Michelle Yuan, Benjamin Van Durme, Jordan L. Ying
Abstract	Multilingual topic models can reveal patterns in cross-lingual document collections. However, existing models lack speed and interactivity, which prevents adoption in everyday corpora exploration or quick moving situations (e.g., natural disasters, political instability). First, we propose a multilingual anchoring algorithm that builds an anchor-based topic model for documents in different languages. Then, we incorporate interactivity to develop MTAnchor (Multilingual Topic Anchors), a system that allows users to refine the topic model. We test our algorithms on labeled English, Chinese, and Sinhalese documents. Within minutes, our methods can produce interpretable topics that are useful for specific classification tasks.
Tasks	Topic Models
Published	2018-12-01
URL	http://papers.nips.cc/paper/8083-multilingual-anchoring-interactive-topic-modeling-and-alignment-across-languages
PDF	http://papers.nips.cc/paper/8083-multilingual-anchoring-interactive-topic-modeling-and-alignment-across-languages.pdf
PWC	https://paperswithcode.com/paper/multilingual-anchoring-interactive-topic
Repo	https://github.com/forest-snow/mtanchor_demo
Framework	none

PyCM: Multiclass confusion matrix library in Python


Title	PyCM: Multiclass confusion matrix library in Python
Authors	Sepand Haghighi, Masoomeh Jasemi, Shaahin Hessabi, Alireza Zolanvari
Abstract	PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix input. It is a tool for post-classification model evaluation that supports most per-class and overall statistics parameters. PyCM is the Swiss Army knife of confusion matrices, targeted mainly at data scientists who need a broad array of metrics for predictive models and an accurate evaluation of a large variety of classifiers.
Tasks
Published	2018-05-29
URL	http://joss.theoj.org/papers/10.21105/joss.00729
PDF	https://www.theoj.org/joss-papers/joss.00729/10.21105.joss.00729.pdf
PWC	https://paperswithcode.com/paper/pycm-multiclass-confusion-matrix-library-in
Repo	https://github.com/sepandhaghighi/pycm
Framework	none
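
To make the idea of a multi-class confusion matrix with per-class statistics concrete, here is a minimal standard-library sketch. It deliberately does not use PyCM's own API; the function names and the sample vectors are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def confusion_matrix(actual: Sequence[int], predicted: Sequence[int]) -> Dict[int, Dict[int, int]]:
    """Build matrix[actual_class][predicted_class] = count from two label vectors."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    classes = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in classes} for a in classes}


def per_class_precision_recall(matrix: Dict[int, Dict[int, int]]) -> Dict[int, Tuple[float, float]]:
    """Return {class: (precision, recall)}, using 0.0 when a denominator is zero."""
    stats = {}
    for c in matrix:
        true_positive = matrix[c][c]
        predicted_as_c = sum(matrix[a][c] for a in matrix)  # column sum
        actually_c = sum(matrix[c].values())                # row sum
        precision = true_positive / predicted_as_c if predicted_as_c else 0.0
        recall = true_positive / actually_c if actually_c else 0.0
        stats[c] = (precision, recall)
    return stats


actual = [2, 1, 0, 2, 1, 0, 2]
predicted = [2, 0, 0, 2, 1, 1, 2]
matrix = confusion_matrix(actual, predicted)
stats = per_class_precision_recall(matrix)
print(matrix)
print(stats)
```

A dedicated library adds many more per-class and overall statistics on top of this table; the sketch only shows where those numbers come from.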

AMLA: an AutoML frAmework for Neural Network Design


Title	AMLA: an AutoML frAmework for Neural Network Design
Authors	Purushotham Kamath, Abhishek Singh, Debo Dutta
Abstract	AMLA is an Automatic Machine Learning frAmework for implementing and deploying neural architecture search algorithms. Neural architecture search algorithms are AutoML algorithms whose goal is to generate optimal neural network structures for a given task. AMLA is designed to deploy these algorithms at scale and allow comparison of the performance of the networks generated by different AutoML algorithms. Its key architectural features are the decoupling of the network generation from the network evaluation, support for network instrumentation, open model specification and a microservices based architecture for deployment at scale. In AMLA, AutoML algorithms and training/evaluation code are written as containerized microservices that can be deployed at scale on a public or private infrastructure. The microservices communicate via well defined interfaces and models are persisted using standard model definition formats, allowing the plug and play of the AutoML algorithms as well as the AI/ML libraries. This makes it easy to prototype, compare, benchmark and deploy autoML algorithms in production. AMLA is currently being used to deploy an AutoML algorithm that generates Convolutional Neural Networks (CNNs) used for image classification.
Tasks	AutoML, Hyperparameter Optimization, Image Classification, Neural Architecture Search
Published	2018-01-01
URL	http://pkamath.com/publications/papers/amla_automl18.pdf
PDF	http://pkamath.com/publications/papers/amla_automl18.pdf
PWC	https://paperswithcode.com/paper/amla-an-automl-framework-for-neural-network
Repo	https://github.com/CiscoAI/amla
Framework	tf

Long-Term Mobile Traffic Forecasting Using Deep Spatio-Temporal Neural Networks


Title	Long-Term Mobile Traffic Forecasting Using Deep Spatio-Temporal Neural Networks
Authors	Chaoyun Zhang
Abstract	Forecasting with high accuracy the volume of data traffic that mobile users will consume is becoming increasingly important for precision traffic engineering, demand-aware network resource allocation, as well as public transportation. Measurements collection in dense urban deployments is however complex and expensive, and the post-processing required to make predictions is highly non-trivial, given the intricate spatio-temporal variability of mobile traffic due to user mobility. To overcome these challenges, in this paper we harness the exceptional feature extraction abilities of deep learning and propose a Spatio-Temporal neural Network (STN) architecture purposely designed for precise network-wide mobile traffic forecasting. We present a mechanism that fine tunes the STN and enables its operation with only limited ground truth observations. We then introduce a Double STN technique (D-STN), which uniquely combines the STN predictions with historical statistics, thereby making faithful long-term mobile traffic projections. Experiments we conduct with real-world mobile traffic data sets, collected over 60 days in both urban and rural areas, demonstrate that the proposed (D-)STN schemes perform up to 10-hour long predictions with remarkable accuracy, irrespective of the time of day when they are triggered. Specifically, our solutions achieve up to 61% smaller prediction errors as compared to widely used forecasting approaches, while operating with up to 600 times shorter measurement intervals.
Tasks
Published	2018-07-26
URL	https://sci-hub.tw/10.1145/3209582.3209606
PDF	https://sci-hub.tw/10.1145/3209582.3209606
PWC	https://paperswithcode.com/paper/long-term-mobile-traffic-forecasting-using
Repo	https://github.com/vyokky/Mobihoc-18-STN-mobile-traffic-forecasting
Framework	tf

Aspect-based summarization of pros and cons in unstructured product reviews


Title	Aspect-based summarization of pros and cons in unstructured product reviews
Authors	Florian Kunneman, Sander Wubben, Antal van den Bosch, Emiel Krahmer
Abstract	We developed three systems for generating pros and cons summaries of product reviews. Automating this task eases the writing of product reviews, and offers readers quick access to the most important information. We compared SynPat, a system based on syntactic phrases selected on the basis of valence scores, against a neural-network-based system trained to map bag-of-words representations of reviews directly to pros and cons, and the same neural system trained on clusters of word-embedding encodings of similar pros and cons. We evaluated the systems in two ways: first on held-out reviews with gold-standard pros and cons, and second by asking human annotators to rate the systems' output on relevance and completeness. In the second evaluation, the gold-standard pros and cons were assessed along with the system output. We find that the human-generated summaries are not deemed as significantly more relevant or complete than the SynPat systems; the latter are scored higher than the human-generated summaries on a precision metric. The neural approaches yield a lower performance in the human assessment, and are outperformed by the baseline.
Tasks	Aspect-Based Sentiment Analysis, Sentiment Analysis
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1188/
PDF	https://www.aclweb.org/anthology/C18-1188
PWC	https://paperswithcode.com/paper/aspect-based-summarization-of-pros-and-cons
Repo	https://github.com/fkunneman/Product_review_summary
Framework	none

NLP-Cube: End-to-End Raw Text Processing With Neural Networks


Title	NLP-Cube: End-to-End Raw Text Processing With Neural Networks
Authors	Tiberiu Boros, Stefan Daniel Dumitrescu, Ruxandra Burtica
Abstract	We introduce NLP-Cube: an end-to-end Natural Language Processing framework, evaluated in CoNLL's "Multilingual Parsing from Raw Text to Universal Dependencies 2018" Shared Task. It performs sentence splitting, tokenization, compound word expansion, lemmatization, tagging and parsing. Based entirely on recurrent neural networks, written in Python, this ready-to-use open source system is freely available on GitHub. For each task we describe and discuss its specific network architecture, closing with an overview on the results obtained in the competition.
Tasks	Lemmatization, Tokenization
Published	2018-10-01
URL	https://www.aclweb.org/anthology/K18-2017/
PDF	https://www.aclweb.org/anthology/K18-2017
PWC	https://paperswithcode.com/paper/nlp-cube-end-to-end-raw-text-processing-with
Repo	https://github.com/adobe/NLP-Cube
Framework	none

HPI-DHC at TREC 2018 Precision Medicine Track


Title	HPI-DHC at TREC 2018 Precision Medicine Track
Authors	Michel Oleynik, Erik Faessler, Ariane Morassi Sasso, Arpita Kappattanavar, Benjamin Bergner, Harry Freitas da Cruz, Jan-Philipp Sachs, Suparno Datta, Erwin Bottinger
Abstract	The TREC-PM challenge aims for advances in the field of information retrieval applied to precision medicine. Here we describe our experimental setup and the achieved results in its 2018 edition. We explored the use of unsupervised topic models, supervised document classification, and rule-based query-time search term boosting and expansion. We participated in the biomedical articles and clinical trials subtasks and were among the three highest-scoring teams. Our results showed that query expansion associated with hand-crafted rules contributes to better values of information retrieval metrics. However, the use of a precision medicine classifier did not show the expected improvement for the biomedical abstracts subtask. In the future, we plan to add different terminologies to replace hand-crafted rules and experiment with negation detection.
Tasks	Document Classification, Information Retrieval, Negation Detection, Topic Models
Published	2018-11-14
URL	https://trec.nist.gov/pubs/trec27/papers/hpi-dhc-PM.pdf
PDF	https://trec.nist.gov/pubs/trec27/papers/hpi-dhc-PM.pdf
PWC	https://paperswithcode.com/paper/hpi-dhc-at-trec-2018-precision-medicine-track
Repo	https://github.com/hpi-dhc/trec-pm
Framework	none