Paper Group NAWR 31
Veyn at PARSEME Shared Task 2018: Recurrent Neural Networks for VMWE Identification. Real-world multiobject, multigrasp detection. Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control. Optimally Selected Minimal Learning Machine. DataSEARCH at IEST 2018: Multiple Word Embedding based Models for Implicit Emotion …
Veyn at PARSEME Shared Task 2018: Recurrent Neural Networks for VMWE Identification
Title | Veyn at PARSEME Shared Task 2018: Recurrent Neural Networks for VMWE Identification |
Authors | Nicolas Zampieri, Manon Scholivet, Carlos Ramisch, Benoit Favre |
Abstract | This paper describes the Veyn system, submitted to the closed track of the PARSEME Shared Task 2018 on automatic identification of verbal multiword expressions (VMWEs). Veyn is based on a sequence tagger using recurrent neural networks. We represent VMWEs using a variant of the begin-inside-outside encoding scheme combined with the VMWE category tag. In addition to the system description, we present development experiments to determine the best tagging scheme. Veyn is freely available, covers 19 languages, and was ranked ninth (MWE-based) and eighth (Token-based) among 13 submissions, considering macro-averaged F1 across languages. |
Tasks | Machine Translation, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4933/ |
https://www.aclweb.org/anthology/W18-4933 | |
PWC | https://paperswithcode.com/paper/veyn-at-parseme-shared-task-2018-recurrent |
Repo | https://github.com/zamp13/Veyn |
Framework | tf |
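
The Veyn abstract above describes a begin-inside-outside (BIO) encoding combined with the VMWE category tag. The following is a minimal sketch of that general idea, not Veyn's actual code: the function name `encode_bio_cat`, the example sentence, and the choice to tag tokens inside a gap as `O` are all assumptions made for illustration. The abstract only says Veyn uses "a variant" of the scheme, so the exact variant may differ.

```python
from typing import List, Tuple


def encode_bio_cat(n_tokens: int, spans: List[Tuple[List[int], str]]) -> List[str]:
    """Tag each token as O, B-<category>, or I-<category>.

    spans holds (token indices, category) pairs; the indices of one
    expression may be non-contiguous. Tokens outside every expression,
    including tokens in a gap, are tagged O (an assumption of this sketch).
    """
    tags = ["O"] * n_tokens
    for indices, category in spans:
        ordered = sorted(indices)
        tags[ordered[0]] = f"B-{category}"   # first token of the expression
        for i in ordered[1:]:
            tags[i] = f"I-{category}"        # remaining tokens
    return tags


# Hypothetical example: "took ... shower" as a light-verb construction.
tokens = ["She", "took", "a", "quick", "shower"]
tags = encode_bio_cat(len(tokens), [([1, 4], "LVC.full")])
print(list(zip(tokens, tags)))
```

A sequence tagger can then be trained to predict one such tag per token, which lets a single output layer carry both the span boundaries and the expression category.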
Real-world multiobject, multigrasp detection
Title | Real-world multiobject, multigrasp detection |
Authors | Fu-Jen Chu, Ruinian Xu and Patricio A. Vela |
Abstract | A deep learning architecture is proposed to predict graspable locations for robotic manipulation. It considers situations where no, one, or multiple object(s) are seen. By defining the learning problem to be classification with null hypothesis competition instead of regression, the deep neural network with red, green, blue and depth (RGB-D) image input predicts multiple grasp candidates for a single object or multiple objects, in a single shot. The method outperforms state-of-the-art approaches on the Cornell dataset with 96.0% and 96.1% accuracy on imagewise and object-wise splits, respectively. Evaluation on a multiobject dataset illustrates the generalization capability of the architecture. Grasping experiments achieve 96.0% grasp localization and 89.0% grasping success rates on a test set of household objects. The real-time process takes less than 0.25 s from image to plan. |
Tasks | Robotic Grasping |
Published | 2018-10-01 |
URL | https://arxiv.org/abs/1802.00520 |
https://arxiv.org/pdf/1802.00520.pdf | |
PWC | https://paperswithcode.com/paper/real-world-multiobject-multigrasp-detection |
Repo | https://github.com/ivalab/grasp_multiObject_multiGrasp |
Framework | tf |
Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control
Title | Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control |
Authors | Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn |
Abstract | A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization. To this end, we introduce universal planning networks (UPN). UPNs embed differentiable planning within a goal-directed policy. This planning computation unrolls a forward model in a latent space and infers an optimal action plan through gradient descent trajectory optimization. The plan-by-gradient-descent process and its underlying representations are learned end-to-end to directly optimize a supervised imitation learning objective. We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images. The learned representations can be leveraged to specify distance-based rewards to reach new target states for model-free reinforcement learning, resulting in substantially more effective learning when solving new tasks described via image-based goals. We were able to achieve successful transfer of visuomotor planning strategies across robots with significantly different morphologies and actuation capabilities. Visit https://sites.google.com/view/upn-public/home for video highlights. |
Tasks | Imitation Learning |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2340 |
http://proceedings.mlr.press/v80/srinivas18b/srinivas18b.pdf | |
PWC | https://paperswithcode.com/paper/universal-planning-networks-learning |
Repo | https://github.com/aravindsrinivas/upn |
Framework | tf |
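
The UPN abstract above describes unrolling a forward model and inferring an action plan by gradient descent. The sketch below illustrates only that inner planning loop on a toy problem. Everything in it is an assumption made for illustration: the one-dimensional state, the hand-written forward model `x_{t+1} = x_t + a_t`, the analytic gradient, and the name `plan_by_gradient_descent`. In UPN the encoder and forward model are learned networks and the planner is trained end-to-end, none of which is shown here.

```python
def plan_by_gradient_descent(start, goal, horizon=5, lr=0.05, steps=200):
    """Optimize an action sequence so the unrolled final state reaches goal.

    Toy forward model (an assumption of this sketch): x_{t+1} = x_t + a_t.
    """
    actions = [0.0] * horizon
    for _ in range(steps):
        # Unroll the forward model over the whole plan.
        x = start
        for a in actions:
            x = x + a
        # Loss is (x_T - goal)^2. With this additive model the gradient
        # with respect to every action is the same: 2 * (x_T - goal).
        grad = 2.0 * (x - goal)
        actions = [a - lr * grad for a in actions]
    return actions


plan = plan_by_gradient_descent(start=0.0, goal=1.0)
final_state = sum(plan)  # start is 0.0, so the final state is the action sum
print(plan, final_state)
```

With these settings each update halves the remaining distance to the goal, so the plan converges quickly; a learned, nonlinear forward model would need automatic differentiation instead of the closed-form gradient used here.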
Optimally Selected Minimal Learning Machine
Title | Optimally Selected Minimal Learning Machine |
Authors | Átilla N. Maia, Madson L. D. Dias, João P. P. Gomes, Ajalmar R. da Rocha Neto |
Abstract | This paper introduces a new approach to selecting reference points (RPs) for the minimal learning machine (MLM) in classification tasks. A critical issue related to the training process in MLM is the selection of RPs, from which the distances are taken. In its original formulation, the MLM selects the RPs randomly from the data. We propose a new method called optimally selected minimal learning machine (OS-MLM) to select the RPs. Our proposal relies on the multiresponse sparse regression (MRSR) ranking method, which is used to sort the patterns in terms of relevance. After doing so, the leave-one-out (LOO) criterion is also used in order to select an appropriate number of reference points. Based on the simulations we carried out, our proposal achieved a lower number of reference points with an equivalent, or even superior, accuracy with respect to the original MLM and its variants. |
Tasks | |
Published | 2018-11-09 |
URL | https://link.springer.com/chapter/10.1007%2F978-3-030-03493-1_70 |
https://www.researchgate.net/profile/Madson_Dias/publication/328819483_Optimally_Selected_Minimal_Learning_Machine/links/5d7a605e4585157fde0fce53/Optimally-Selected-Minimal-Learning-Machine.pdf | |
PWC | https://paperswithcode.com/paper/optimally-selected-minimal-learning-machine |
Repo | https://github.com/omadson/scikit-mlm |
Framework | none |
DataSEARCH at IEST 2018: Multiple Word Embedding based Models for Implicit Emotion Classification of Tweets with Deep Learning
Title | DataSEARCH at IEST 2018: Multiple Word Embedding based Models for Implicit Emotion Classification of Tweets with Deep Learning |
Authors | Yasas Senarath, Uthayasanker Thayasivam |
Abstract | This paper describes an approach to solve implicit emotion classification with the use of pre-trained word embedding models to train multiple neural networks. The system described in this paper is composed of a sequential combination of Long Short-Term Memory and Convolutional Neural Network for feature extraction and Feedforward Neural Network for classification. In this paper, we successfully show that features extracted using multiple pre-trained embeddings can be used to improve the overall performance of the system with Emoji being one of the significant features. The evaluations show that our approach outperforms the baseline system by more than 8% without using any external corpus or lexicon. This approach is ranked 8th in Implicit Emotion Shared Task (IEST) at WASSA-2018. |
Tasks | Emotion Classification, Opinion Mining, Sentiment Analysis, Text Classification |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6230/ |
https://www.aclweb.org/anthology/W18-6230 | |
PWC | https://paperswithcode.com/paper/datasearch-at-iest-2018-multiple-word |
Repo | https://github.com/ysenarath/opinion-lab |
Framework | none |
Can We Gain More from Orthogonality Regularizations in Training Deep Networks?
Title | Can We Gain More from Orthogonality Regularizations in Training Deep Networks? |
Authors | Nitin Bansal, Xiaohan Chen, Zhangyang Wang |
Abstract | This paper seeks to answer the question: as the (near-) orthogonality of weights is found to be a favorable property for training deep convolutional neural networks, how can we enforce it in more effective and easy-to-use ways? We develop novel orthogonality regularizations on training deep CNNs, utilizing various advanced analytical tools such as mutual coherence and restricted isometry property. These plug-and-play regularizations can be conveniently incorporated into training almost any CNN without extra hassle. We then benchmark their effects on state-of-the-art models: ResNet, WideResNet, and ResNeXt, on several most popular computer vision datasets: CIFAR-10, CIFAR-100, SVHN and ImageNet. We observe consistent performance gains after applying those proposed regularizations, in terms of both the final accuracies achieved, and faster and more stable convergences. We have made our codes and pre-trained models publicly available: https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7680-can-we-gain-more-from-orthogonality-regularizations-in-training-deep-networks |
http://papers.nips.cc/paper/7680-can-we-gain-more-from-orthogonality-regularizations-in-training-deep-networks.pdf | |
PWC | https://paperswithcode.com/paper/can-we-gain-more-from-orthogonality-1 |
Repo | https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality |
Framework | pytorch |
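
The orthogonality abstract above names mutual coherence as one of the analytical tools behind its regularizers. As a reference point, here is a minimal sketch of the standard definition of mutual coherence: the largest absolute normalized inner product between two distinct columns. The function name and the example matrices are assumptions, and how the paper turns this quantity into a training penalty is not shown.

```python
import math


def mutual_coherence(columns):
    """Largest |<w_i, w_j>| / (||w_i|| * ||w_j||) over distinct columns.

    columns is a list of equal-length vectors with nonzero norm.
    A value of 0.0 means the columns are mutually orthogonal.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    norms = [math.sqrt(dot(c, c)) for c in columns]
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            value = abs(dot(columns[i], columns[j])) / (norms[i] * norms[j])
            best = max(best, value)
    return best


orthogonal = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.0], [1.0, 1.0]]
print(mutual_coherence(orthogonal), mutual_coherence(correlated))
```

Lower coherence means the weight vectors are closer to orthogonal, which is the property the abstract reports as favorable for training deep CNNs.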
Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Title | Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning |
Authors | Jinyoung Yeo, Gyeongbok Lee, Gengyu Wang, Seungtaek Choi, Hyunsouk Cho, Reinald Kim Amplayo, Seung-won Hwang |
Abstract | |
Tasks | Image Captioning, Question Answering, Video Question Answering, Visual Question Answering, Visual Reasoning |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1316/ |
https://www.aclweb.org/anthology/L18-1316 | |
PWC | https://paperswithcode.com/paper/visual-choice-of-plausible-alternatives-an |
Repo | https://github.com/antest1/VCOPA-Dataset |
Framework | none |
Discretely Relaxing Continuous Variables for tractable Variational Inference
Title | Discretely Relaxing Continuous Variables for tractable Variational Inference |
Authors | Trefor Evans, Prasanth Nair |
Abstract | We explore a new research direction in Bayesian variational inference with discrete latent variable priors where we exploit Kronecker matrix algebra for efficient and exact computations of the evidence lower bound (ELBO). The proposed “DIRECT” approach has several advantages over its predecessors: (i) it can exactly compute ELBO gradients (i.e. unbiased, zero-variance gradient estimates), eliminating the need for high-variance stochastic gradient estimators and enabling the use of quasi-Newton optimization methods; (ii) its training complexity is independent of the number of training points, permitting inference on large datasets; and (iii) its posterior samples consist of sparse and low-precision quantized integers which permit fast inference on hardware-limited devices. In addition, our DIRECT models can exactly compute statistical moments of the parameterized predictive posterior without relying on Monte Carlo sampling. The DIRECT approach is not practical for all likelihoods; however, we identify a popular model structure which is practical, and demonstrate accurate inference using latent variables discretized as extremely low-precision 4-bit quantized integers. While the ELBO computations considered in the numerical studies require over 10^2352 log-likelihood evaluations, we train on datasets with over two million points in just seconds. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8247-discretely-relaxing-continuous-variables-for-tractable-variational-inference |
http://papers.nips.cc/paper/8247-discretely-relaxing-continuous-variables-for-tractable-variational-inference.pdf | |
PWC | https://paperswithcode.com/paper/discretely-relaxing-continuous-variables-for-1 |
Repo | https://github.com/treforevans/direct |
Framework | tf |
Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages
Title | Multilingual Anchoring: Interactive Topic Modeling and Alignment Across Languages |
Authors | Michelle Yuan, Benjamin Van Durme, Jordan L. Ying |
Abstract | Multilingual topic models can reveal patterns in cross-lingual document collections. However, existing models lack speed and interactivity, which prevents adoption in everyday corpora exploration or quick moving situations (e.g., natural disasters, political instability). First, we propose a multilingual anchoring algorithm that builds an anchor-based topic model for documents in different languages. Then, we incorporate interactivity to develop MTAnchor (Multilingual Topic Anchors), a system that allows users to refine the topic model. We test our algorithms on labeled English, Chinese, and Sinhalese documents. Within minutes, our methods can produce interpretable topics that are useful for specific classification tasks. |
Tasks | Topic Models |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8083-multilingual-anchoring-interactive-topic-modeling-and-alignment-across-languages |
http://papers.nips.cc/paper/8083-multilingual-anchoring-interactive-topic-modeling-and-alignment-across-languages.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-anchoring-interactive-topic |
Repo | https://github.com/forest-snow/mtanchor_demo |
Framework | none |
PyCM: Multiclass confusion matrix library in Python
Title | PyCM: Multiclass confusion matrix library in Python |
Authors | Sepand Haghighi, Masoomeh Jasemi, Shaahin Hessabi, Alireza Zolanvari |
Abstract | PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix input. It is a tool for post-classification model evaluation that supports most class-level and overall statistics parameters. PyCM is the Swiss-army knife of confusion matrices, targeted mainly at data scientists who need a broad array of metrics for predictive models and an accurate evaluation of a large variety of classifiers. |
Tasks | |
Published | 2018-05-29 |
URL | http://joss.theoj.org/papers/10.21105/joss.00729 |
https://www.theoj.org/joss-papers/joss.00729/10.21105.joss.00729.pdf | |
PWC | https://paperswithcode.com/paper/pycm-multiclass-confusion-matrix-library-in |
Repo | https://github.com/sepandhaghighi/pycm |
Framework | none |
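
The PyCM abstract above is about building a multi-class confusion matrix from actual and predicted label vectors and deriving per-class statistics from it. The sketch below shows that computation in plain Python so it runs without any dependency. It deliberately does not use PyCM's own API; the function names and the sample vectors are assumptions made for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return matrix[a][p]: how often true class a was predicted as p."""
    classes = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in classes} for a in classes}


def per_class_precision_recall(matrix):
    """Return {class: (precision, recall)} from a confusion matrix."""
    stats = {}
    for c in matrix:
        true_positive = matrix[c][c]
        predicted_as_c = sum(matrix[a][c] for a in matrix)  # column sum
        actually_c = sum(matrix[c].values())                # row sum
        precision = true_positive / predicted_as_c if predicted_as_c else 0.0
        recall = true_positive / actually_c if actually_c else 0.0
        stats[c] = (precision, recall)
    return stats


# Hypothetical label vectors for a three-class problem.
actual = [2, 1, 0, 2, 0, 1]
predicted = [0, 1, 0, 2, 0, 2]
matrix = confusion_matrix(actual, predicted)
stats = per_class_precision_recall(matrix)
print(matrix)
print(stats)
```

A dedicated library is worth using once more than a handful of metrics are needed, since every additional statistic is another function of the same row sums, column sums, and diagonal.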
AMLA: an AutoML frAmework for Neural Network Design
Title | AMLA: an AutoML frAmework for Neural Network Design |
Authors | Purushotham Kamath, Abhishek Singh, Debo Dutta |
Abstract | AMLA is an Automatic Machine Learning frAmework for implementing and deploying neural architecture search algorithms. Neural architecture search algorithms are AutoML algorithms whose goal is to generate optimal neural network structures for a given task. AMLA is designed to deploy these algorithms at scale and allow comparison of the performance of the networks generated by different AutoML algorithms. Its key architectural features are the decoupling of the network generation from the network evaluation, support for network instrumentation, open model specification and a microservices based architecture for deployment at scale. In AMLA, AutoML algorithms and training/evaluation code are written as containerized microservices that can be deployed at scale on a public or private infrastructure. The microservices communicate via well defined interfaces and models are persisted using standard model definition formats, allowing the plug and play of the AutoML algorithms as well as the AI/ML libraries. This makes it easy to prototype, compare, benchmark and deploy autoML algorithms in production. AMLA is currently being used to deploy an AutoML algorithm that generates Convolutional Neural Networks (CNNs) used for image classification. |
Tasks | AutoML, Hyperparameter Optimization, Image Classification, Neural Architecture Search |
Published | 2018-01-01 |
URL | http://pkamath.com/publications/papers/amla_automl18.pdf |
http://pkamath.com/publications/papers/amla_automl18.pdf | |
PWC | https://paperswithcode.com/paper/amla-an-automl-framework-for-neural-network |
Repo | https://github.com/CiscoAI/amla |
Framework | tf |
Long-Term Mobile Traffic Forecasting Using Deep Spatio-Temporal Neural Networks
Title | Long-Term Mobile Traffic Forecasting Using Deep Spatio-Temporal Neural Networks |
Authors | Chaoyun Zhang |
Abstract | Forecasting with high accuracy the volume of data traffic that mobile users will consume is becoming increasingly important for precision traffic engineering, demand-aware network resource allocation, as well as public transportation. Measurements collection in dense urban deployments is however complex and expensive, and the post-processing required to make predictions is highly non-trivial, given the intricate spatio-temporal variability of mobile traffic due to user mobility. To overcome these challenges, in this paper we harness the exceptional feature extraction abilities of deep learning and propose a Spatio-Temporal neural Network (STN) architecture purposely designed for precise network-wide mobile traffic forecasting. We present a mechanism that fine tunes the STN and enables its operation with only limited ground truth observations. We then introduce a Double STN technique (D-STN), which uniquely combines the STN predictions with historical statistics, thereby making faithful long-term mobile traffic projections. Experiments we conduct with real-world mobile traffic data sets, collected over 60 days in both urban and rural areas, demonstrate that the proposed (D-)STN schemes perform up to 10-hour long predictions with remarkable accuracy, irrespective of the time of day when they are triggered. Specifically, our solutions achieve up to 61% smaller prediction errors as compared to widely used forecasting approaches, while operating with up to 600 times shorter measurement intervals. |
Tasks | |
Published | 2018-07-26 |
URL | https://sci-hub.tw/10.1145/3209582.3209606 |
https://sci-hub.tw/10.1145/3209582.3209606 | |
PWC | https://paperswithcode.com/paper/long-term-mobile-traffic-forecasting-using |
Repo | https://github.com/vyokky/Mobihoc-18-STN-mobile-traffic-forecasting |
Framework | tf |
Aspect-based summarization of pros and cons in unstructured product reviews
Title | Aspect-based summarization of pros and cons in unstructured product reviews |
Authors | Florian Kunneman, Sander Wubben, Antal van den Bosch, Emiel Krahmer |
Abstract | We developed three systems for generating pros and cons summaries of product reviews. Automating this task eases the writing of product reviews, and offers readers quick access to the most important information. We compared SynPat, a system based on syntactic phrases selected on the basis of valence scores, against a neural-network-based system trained to map bag-of-words representations of reviews directly to pros and cons, and the same neural system trained on clusters of word-embedding encodings of similar pros and cons. We evaluated the systems in two ways: first on held-out reviews with gold-standard pros and cons, and second by asking human annotators to rate the systems' output on relevance and completeness. In the second evaluation, the gold-standard pros and cons were assessed along with the system output. We find that the human-generated summaries are not deemed as significantly more relevant or complete than the SynPat systems; the latter are scored higher than the human-generated summaries on a precision metric. The neural approaches yield a lower performance in the human assessment, and are outperformed by the baseline. |
Tasks | Aspect-Based Sentiment Analysis, Sentiment Analysis |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1188/ |
https://www.aclweb.org/anthology/C18-1188 | |
PWC | https://paperswithcode.com/paper/aspect-based-summarization-of-pros-and-cons |
Repo | https://github.com/fkunneman/Product_review_summary |
Framework | none |
NLP-Cube: End-to-End Raw Text Processing With Neural Networks
Title | NLP-Cube: End-to-End Raw Text Processing With Neural Networks |
Authors | Tiberiu Boros, Stefan Daniel Dumitrescu, Ruxandra Burtica |
Abstract | We introduce NLP-Cube: an end-to-end Natural Language Processing framework, evaluated in CoNLL's "Multilingual Parsing from Raw Text to Universal Dependencies 2018" Shared Task. It performs sentence splitting, tokenization, compound word expansion, lemmatization, tagging and parsing. Based entirely on recurrent neural networks, written in Python, this ready-to-use open source system is freely available on GitHub. For each task we describe and discuss its specific network architecture, closing with an overview on the results obtained in the competition. |
Tasks | Lemmatization, Tokenization |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2017/ |
https://www.aclweb.org/anthology/K18-2017 | |
PWC | https://paperswithcode.com/paper/nlp-cube-end-to-end-raw-text-processing-with |
Repo | https://github.com/adobe/NLP-Cube |
Framework | none |
HPI-DHC at TREC 2018 Precision Medicine Track
Title | HPI-DHC at TREC 2018 Precision Medicine Track |
Authors | Michel Oleynik, Erik Faessler, Ariane Morassi Sasso, Arpita Kappattanavar, Benjamin Bergner, Harry Freitas da Cruz, Jan-Philipp Sachs, Suparno Datta, Erwin Bottinger |
Abstract | The TREC-PM challenge aims for advances in the field of information retrieval applied to precision medicine. Here we describe our experimental setup and the achieved results in its 2018 edition. We explored the use of unsupervised topic models, supervised document classification, and rule-based query-time search term boosting and expansion. We participated in the biomedical articles and clinical trials subtasks and were among the three highest-scoring teams. Our results showed that query expansion associated with hand-crafted rules contributes to better values of information retrieval metrics. However, the use of a precision medicine classifier did not show the expected improvement for the biomedical abstracts subtask. In the future, we plan to add different terminologies to replace hand-crafted rules and experiment with negation detection. |
Tasks | Document Classification, Information Retrieval, Negation Detection, Topic Models |
Published | 2018-11-14 |
URL | https://trec.nist.gov/pubs/trec27/papers/hpi-dhc-PM.pdf |
https://trec.nist.gov/pubs/trec27/papers/hpi-dhc-PM.pdf | |
PWC | https://paperswithcode.com/paper/hpi-dhc-at-trec-2018-precision-medicine-track |
Repo | https://github.com/hpi-dhc/trec-pm |
Framework | none |