January 25, 2020

Paper Group ANR 1667

Word-order biases in deep-agent emergent communication

Title Word-order biases in deep-agent emergent communication
Authors Rahma Chaabouni, Eugene Kharitonov, Alessandro Lazaric, Emmanuel Dupoux, Marco Baroni
Abstract Sequence-processing neural networks led to remarkable progress on many NLP tasks. As a consequence, there has been increasing interest in understanding to what extent they process language as humans do. We aim here to uncover which biases such models display with respect to “natural” word-order constraints. We train models to communicate about paths in a simple gridworld, using miniature languages that reflect or violate various natural language trends, such as the tendency to avoid redundancy or to minimize long-distance dependencies. We study how the controlled characteristics of our miniature languages affect individual learning and their stability across multiple network generations. The results draw a mixed picture. On the one hand, neural networks show a strong tendency to avoid long-distance dependencies. On the other hand, there is no clear preference for the efficient, non-redundant encoding of information that is widely attested in natural language. We thus suggest inoculating a notion of “effort” into neural networks, as a possible way to make their linguistic behavior more human-like.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12330v3
PDF https://arxiv.org/pdf/1905.12330v3.pdf
PWC https://paperswithcode.com/paper/word-order-biases-in-deep-agent-emergent
Repo
Framework

Motion Planning Networks: Bridging the Gap Between Learning-based and Classical Motion Planners

Title Motion Planning Networks: Bridging the Gap Between Learning-based and Classical Motion Planners
Authors Ahmed H. Qureshi, Yinglong Miao, Anthony Simeonov, Michael C. Yip
Abstract This paper describes Motion Planning Networks (MPNet), a computationally efficient, learning-based neural planner for solving motion planning problems. MPNet uses neural networks to learn general near-optimal heuristics for path planning in seen and unseen environments. It receives environment information as point-clouds, as well as a robot’s initial and desired goal configurations and recursively calls itself to bidirectionally generate connectable paths. In addition to finding directly connectable and near-optimal paths in a single pass, we show that worst-case theoretical guarantees can be proven if we merge this neural network strategy with classical sample-based planners in a hybrid approach while still retaining significant computational and optimality improvements. To learn the MPNet models, we present an active continual learning approach that enables MPNet to learn from streaming data and actively ask for expert demonstrations when needed, drastically reducing data for training. We validate MPNet against gold-standard and state-of-the-art planning methods in a variety of problems from 2D to 7D robot configuration spaces in challenging and cluttered environments, with results showing significant and consistently stronger performance metrics, and motivating neural planning in general as a modern strategy for solving motion planning problems efficiently.
Tasks Continual Learning, Motion Planning
Published 2019-07-13
URL https://arxiv.org/abs/1907.06013v2
PDF https://arxiv.org/pdf/1907.06013v2.pdf
PWC https://paperswithcode.com/paper/motion-planning-networks-bridging-the-gap
Repo
Framework

A Speech Act Classifier for Persian Texts and its Application in Identify Speech Act of Rumors

Title A Speech Act Classifier for Persian Texts and its Application in Identify Speech Act of Rumors
Authors Zoleikha Jahanbakhsh-Nagadeh, Mohammad-Reza Feizi-Derakhshi, Arash Sharifi
Abstract Speech Acts (SAs) are one of the important areas of pragmatics, which give us a better understanding of the state of mind of the people and convey an intended language function. Knowledge of the SA of a text can be helpful in analyzing that text in natural language processing applications. This study presents a dictionary-based statistical technique for Persian SA recognition. The proposed technique classifies a text into seven classes of SA based on four criteria: lexical, syntactic, semantic, and surface features. WordNet is used as the tool for extracting synonyms and enriching the feature dictionary. To evaluate the proposed technique, we used four classification methods: Random Forest (RF), Support Vector Machine (SVM), Naive Bayes (NB), and K-Nearest Neighbors (KNN). The experimental results demonstrate that the proposed method, with RF and SVM as the best classifiers, achieved state-of-the-art performance with an accuracy of 0.95 for classification of Persian SAs. The original aim of this work is to introduce an application of SA recognition to social media content, especially the common SAs in rumors. Therefore, the proposed system was used to determine the common SAs in rumors. The results showed that Persian rumors are often expressed in three SA classes including narrative, question, and threat, and in some cases with the request SA.
Tasks
Published 2019-01-12
URL http://arxiv.org/abs/1901.03904v2
PDF http://arxiv.org/pdf/1901.03904v2.pdf
PWC https://paperswithcode.com/paper/a-speech-act-classifier-for-persian-texts-and
Repo
Framework

MGP-AttTCN: An Interpretable Machine Learning Model for the Prediction of Sepsis

Title MGP-AttTCN: An Interpretable Machine Learning Model for the Prediction of Sepsis
Authors Margherita Rosnati, Vincent Fortuin
Abstract With a mortality rate of 5.4 million lives worldwide every year and a healthcare cost of more than 16 billion dollars in the USA alone, sepsis is one of the leading causes of hospital mortality and an increasing concern in the ageing western world. Recently, medical and technological advances have helped re-define the illness criteria of this disease, which is otherwise poorly understood by the medical society. Together with the rise of widely accessible Electronic Health Records, the advances in data mining and complex nonlinear algorithms are a promising avenue for the early detection of sepsis. This work contributes to the research effort in the field of automated sepsis detection with an open-access labelling of the medical MIMIC-III data set. Moreover, we propose MGP-AttTCN: a joint multitask Gaussian Process and attention-based deep learning model to early predict the occurrence of sepsis in an interpretable manner. We show that our model outperforms the current state-of-the-art and present evidence that different labelling heuristics lead to discrepancies in task difficulty.
Tasks Interpretable Machine Learning
Published 2019-09-27
URL https://arxiv.org/abs/1909.12637v1
PDF https://arxiv.org/pdf/1909.12637v1.pdf
PWC https://paperswithcode.com/paper/mgp-atttcn-an-interpretable-machine-learning
Repo
Framework

Time series model selection with a meta-learning approach; evidence from a pool of forecasting algorithms

Title Time series model selection with a meta-learning approach; evidence from a pool of forecasting algorithms
Authors Sasan Barak, Mahdi Nasiri, Mehrdad Rostamzadeh
Abstract One of the challenging questions in time series forecasting is how to find the best algorithm. In recent years, a recommender system scheme has been developed for time series analysis using a meta-learning approach. This system selects the best forecasting method with consideration of the time series characteristics. In this paper, we propose a novel approach focusing on some of the unanswered questions resulting from the use of meta-learning in time series forecasting. Three main gaps in previous works are addressed: analyzing various subsets of top forecasters as inputs for meta-learners; evaluating the effect of forecasting error measures; and assessing the role of the dimensionality of the feature space on the forecasting errors of meta-learners. All of these objectives are achieved with the help of a diverse state-of-the-art pool of forecasters and meta-learners. For this purpose, first, a pool of forecasting algorithms is implemented on the NN5 competition dataset and ranked based on the two error measures. Then, six machine-learning classifiers, known as meta-learners, are trained on the extracted features of the time series in order to assign the most suitable forecasting method for the various subsets of the pool of forecasters. Furthermore, two dimensionality reduction methods are implemented in order to investigate the role of feature space dimension on the performance of meta-learners. In general, it was found that meta-learners were able to defeat all of the individual benchmark forecasters; this performance was improved even after applying the feature selection method.
Tasks Dimensionality Reduction, Feature Selection, Meta-Learning, Model Selection, Recommendation Systems, Time Series, Time Series Analysis, Time Series Forecasting
Published 2019-08-22
URL https://arxiv.org/abs/1908.08489v1
PDF https://arxiv.org/pdf/1908.08489v1.pdf
PWC https://paperswithcode.com/paper/time-series-model-selection-with-a-meta
Repo
Framework

Is It Worth the Attention? A Comparative Evaluation of Attention Layers for Argument Unit Segmentation

Title Is It Worth the Attention? A Comparative Evaluation of Attention Layers for Argument Unit Segmentation
Authors Maximilian Spliethöver, Jonas Klaff, Hendrik Heuer
Abstract Attention mechanisms have seen some success for natural language processing downstream tasks in recent years and generated new state-of-the-art results. A thorough evaluation of the attention mechanism for the task of Argumentation Mining is missing, though. With this paper, we report a comparative evaluation of attention layers in combination with a bidirectional long short-term memory network, which is the current state-of-the-art approach to the unit segmentation task. We also compare sentence-level contextualized word embeddings to pre-generated ones. Our findings suggest that for this task the additional attention layer does not improve upon a less complex approach. In most cases, the contextualized embeddings also do not show an improvement on the baseline score.
Tasks Word Embeddings
Published 2019-06-24
URL https://arxiv.org/abs/1906.10068v1
PDF https://arxiv.org/pdf/1906.10068v1.pdf
PWC https://paperswithcode.com/paper/is-it-worth-the-attention-a-comparative
Repo
Framework

Adaptive Weight Decay for Deep Neural Networks

Title Adaptive Weight Decay for Deep Neural Networks
Authors Kensuke Nakamura, Byung-Woo Hong
Abstract Regularization in the optimization of deep neural networks is often critical to avoid undesirable over-fitting, leading to better generalization of the model. One of the most popular regularization algorithms imposes an L-2 penalty on the model parameters, resulting in the decay of parameters, called weight-decay; the decay rate is generally constant across all model parameters over the course of optimization. In contrast to the previous approach based on a constant rate of weight-decay, we propose to consider the residual, which measures dissimilarity between the current state of the model and observations, when determining the weight-decay for each parameter adaptively. In this adaptive weight-decay (AdaDecay), the gradient norms are normalized within each layer and the degree of regularization for each parameter is determined in proportion to the magnitude of its gradient using the sigmoid function. We empirically demonstrate the effectiveness of AdaDecay in comparison to the state-of-the-art optimization algorithms using popular benchmark datasets: MNIST, Fashion-MNIST, and CIFAR-10 with conventional neural network models ranging from shallow to deep. The quantitative evaluation of our proposed algorithm indicates that AdaDecay improves generalization, leading to better accuracy across all the datasets and models.
Tasks
Published 2019-07-21
URL https://arxiv.org/abs/1907.08931v2
PDF https://arxiv.org/pdf/1907.08931v2.pdf
PWC https://paperswithcode.com/paper/adaptive-weight-decay-for-deep-neural
Repo
Framework
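
The per-parameter decay described in the AdaDecay abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `adaptive_decay_step`, the parameters `base_decay` and `alpha`, the use of the mean and standard deviation for layer-wise normalization, and the sigmoid scaled to the range (0, 2) are all assumptions made for illustration. The abstract only states that gradient norms are normalized within each layer and passed through a sigmoid.

```python
import math

def adaptive_decay_step(weights, grads, lr=0.1, base_decay=1e-2, alpha=1.0):
    """One SGD step on a single layer with a per-parameter weight-decay rate.

    Hypothetical sketch: the normalization and scaling are assumptions,
    not the formulas from the paper.
    """
    mags = [abs(g) for g in grads]
    mean = sum(mags) / len(mags)
    var = sum((m - mean) ** 2 for m in mags) / len(mags)
    # Small epsilon avoids division by zero when all magnitudes are equal.
    std = math.sqrt(var) + 1e-12
    new_weights = []
    for w, g, m in zip(weights, grads, mags):
        z = (m - mean) / std  # gradient magnitude normalized within the layer
        # Sigmoid scaled to (0, 2): an average-sized gradient keeps the base rate.
        scale = 2.0 / (1.0 + math.exp(-alpha * z))
        new_weights.append(w - lr * (g + base_decay * scale * w))
    return new_weights
```

With this choice, a layer whose gradient magnitudes are all equal behaves like ordinary constant weight decay, while parameters with larger-than-average gradients are decayed more strongly.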

QCNN: Quantile Convolutional Neural Network

Title QCNN: Quantile Convolutional Neural Network
Authors Gábor Petneházi
Abstract Convolutional neural networks can do time series forecasting. They can learn local patterns in time, and they can be modified to learn only from the history (ignoring the future) and to use a long look-back window when doing so. A further simple modification enables them to forecast not the mean, but arbitrary quantiles of the distribution. And one last thing to make this all work: the CNN forecaster’s complexity and flexibility requires much data, that is, preferably multiple time series. When this is met, the proposed QCNN framework can be competitive. It is demonstrated on a financial problem of huge practical importance: Value at Risk forecasting. By contributing to the stability of financial systems, deep learning may find one further way to improve our lives.
Tasks Time Series, Time Series Forecasting
Published 2019-08-21
URL https://arxiv.org/abs/1908.07978v2
PDF https://arxiv.org/pdf/1908.07978v2.pdf
PWC https://paperswithcode.com/paper/190807978
Repo
Framework
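
The QCNN abstract mentions forecasting arbitrary quantiles instead of the mean. It does not name the training objective, so the sketch below assumes the standard pinball (quantile) loss, which is a common way to fit a quantile forecaster. The function name `pinball_loss` and its arguments are illustrative.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss for a quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff > 0) costs tau per unit,
        # over-prediction costs (1 - tau) per unit.
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)
```

For a Value at Risk forecast one would pick a small `tau` (for example 0.05), so that predictions above the realized value are penalized far more heavily than predictions below it. With `tau = 0.5` the loss reduces to half the mean absolute error.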

Non-negative Sparse and Collaborative Representation for Pattern Classification

Title Non-negative Sparse and Collaborative Representation for Pattern Classification
Authors Jun Xu, Zhou Xu, Wangpeng An, Haoqian Wang, David Zhang
Abstract Sparse representation (SR) and collaborative representation (CR) have been successfully applied in many pattern classification tasks such as face recognition. In this paper, we propose a novel Non-negative Sparse and Collaborative Representation (NSCR) for pattern classification. The NSCR representation of each test sample is obtained by seeking a non-negative sparse and collaborative representation vector that represents the test sample as a linear combination of training samples. We observe that the non-negativity can make the SR and CR more discriminative and effective for pattern classification. Based on the proposed NSCR, we propose an NSCR-based classifier for pattern classification. Extensive experiments on benchmark datasets demonstrate that the proposed NSCR-based classifier outperforms the previous SR- or CR-based approaches, as well as state-of-the-art deep approaches, on diverse challenging pattern classification tasks.
Tasks Face Recognition
Published 2019-08-20
URL https://arxiv.org/abs/1908.07956v2
PDF https://arxiv.org/pdf/1908.07956v2.pdf
PWC https://paperswithcode.com/paper/190807956
Repo
Framework

Understanding Feature Selection and Feature Memorization in Recurrent Neural Networks

Title Understanding Feature Selection and Feature Memorization in Recurrent Neural Networks
Authors Bokang Zhu, Richong Zhang, Dingkun Long, Yongyi Mao
Abstract In this paper, we propose a test, called Flagged-1-Bit (F1B) test, to study the intrinsic capability of recurrent neural networks in sequence learning. Four different recurrent network models are studied both analytically and experimentally using this test. Our results suggest that in general there exists a conflict between feature selection and feature memorization in sequence learning. Such a conflict can be resolved either using a gating mechanism as in LSTM, or by increasing the state dimension as in Vanilla RNN. Gated models resolve this conflict by adaptively adjusting their state-update equations, whereas Vanilla RNN resolves this conflict by assigning different dimensions different tasks. Insights into feature selection and memorization in recurrent networks are given.
Tasks Feature Selection
Published 2019-03-03
URL http://arxiv.org/abs/1903.00906v1
PDF http://arxiv.org/pdf/1903.00906v1.pdf
PWC https://paperswithcode.com/paper/understanding-feature-selection-and-feature
Repo
Framework

DPOD: 6D Pose Object Detector and Refiner

Title DPOD: 6D Pose Object Detector and Refiner
Authors Sergey Zakharov, Ivan Shugurov, Slobodan Ilic
Abstract In this paper we present a novel deep learning method for 3D object detection and 6D pose estimation from RGB images. Our method, named DPOD (Dense Pose Object Detector), estimates dense multi-class 2D-3D correspondence maps between an input image and available 3D models. Given the correspondences, a 6DoF pose is computed via PnP and RANSAC. An additional RGB pose refinement of the initial pose estimates is performed using a custom deep learning-based refinement scheme. Our results and comparison to a vast number of related works demonstrate that a large number of correspondences is beneficial for obtaining high-quality 6D poses both before and after refinement. Unlike other methods that mainly use real data for training and do not train on synthetic renderings, we perform evaluation on both synthetic and real training data demonstrating superior results before and after refinement when compared to all recent detectors. While being precise, the presented approach is still real-time capable.
Tasks 3D Object Detection, 6D Pose Estimation, 6D Pose Estimation using RGB, Object Detection, Pose Estimation
Published 2019-02-28
URL https://arxiv.org/abs/1902.11020v3
PDF https://arxiv.org/pdf/1902.11020v3.pdf
PWC https://paperswithcode.com/paper/dpod-dense-6d-pose-object-detector-in-rgb
Repo
Framework

Defending Against Physically Realizable Attacks on Image Classification

Title Defending Against Physically Realizable Attacks on Image Classification
Authors Tong Wu, Liang Tong, Yevgeniy Vorobeychik
Abstract We study the problem of defending deep neural network approaches for image classification from physically realizable attacks. First, we demonstrate that the two most scalable and effective methods for learning robust models, adversarial training with PGD attacks and randomized smoothing, exhibit very limited effectiveness against three of the highest profile physical attacks. Next, we propose a new abstract adversarial model, rectangular occlusion attacks, in which an adversary places a small adversarially crafted rectangle in an image, and develop two approaches for efficiently computing the resulting adversarial examples. Finally, we demonstrate that adversarial training using our new attack yields image classification models that exhibit high robustness against the physically realizable attacks we study, offering the first effective generic defense against such attacks.
Tasks Image Classification
Published 2019-09-20
URL https://arxiv.org/abs/1909.09552v2
PDF https://arxiv.org/pdf/1909.09552v2.pdf
PWC https://paperswithcode.com/paper/defending-against-physically-realizable
Repo
Framework
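
The rectangular occlusion attack in the abstract above places a small rectangle in the image where it hurts the classifier most. The sketch below covers only a position search with a fixed fill value on a 2D grayscale image stored as nested lists. The names `place_rectangle` and `best_occlusion`, the `loss_fn` callable, and the `fill` and `stride` parameters are hypothetical. The abstract does not describe the two search procedures the authors developed, and crafting the rectangle's contents adversarially is left out here.

```python
def place_rectangle(image, top, left, height, width, fill):
    """Return a copy of a 2D grayscale image with a filled rectangle pasted in."""
    patched = [row[:] for row in image]
    for r in range(top, top + height):
        for c in range(left, left + width):
            patched[r][c] = fill
    return patched

def best_occlusion(image, loss_fn, height, width, fill=0.5, stride=1):
    """Try every rectangle position and keep the one with the highest loss.

    loss_fn is a stand-in for the classifier's loss on the patched image.
    Returns (loss, top, left, patched_image), or None if the rectangle
    does not fit inside the image.
    """
    rows, cols = len(image), len(image[0])
    best = None
    for top in range(0, rows - height + 1, stride):
        for left in range(0, cols - width + 1, stride):
            candidate = place_rectangle(image, top, left, height, width, fill)
            score = loss_fn(candidate)
            if best is None or score > best[0]:
                best = (score, top, left, candidate)
    return best
```

Adversarial training, as described in the abstract, would then train the model on the patched images returned by such a search.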

Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection

Title Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection
Authors Zhilei Liu, Jiahui Dong, Cuicui Zhang, Longbiao Wang, Jianwu Dang
Abstract Most existing AU detection works that consider AU relationships rely on probabilistic graphical models with manually extracted features. This paper proposes an end-to-end deep learning framework for facial AU detection with a graph convolutional network (GCN) for AU relation modeling, which has not been explored before. In particular, AU-related regions are extracted first, and latent representations full of AU information are learned through an auto-encoder. Moreover, each latent representation vector is fed into the GCN as a node, and the connection mode of the GCN is determined based on the relationships of AUs. Finally, the assembled features updated through the GCN are concatenated for AU detection. Extensive experiments on BP4D and DISFA benchmarks demonstrate that our framework significantly outperforms the state-of-the-art methods for facial AU detection. The proposed framework is also validated through a series of ablation studies.
Tasks Action Unit Detection, Facial Action Unit Detection
Published 2019-10-23
URL https://arxiv.org/abs/1910.10334v1
PDF https://arxiv.org/pdf/1910.10334v1.pdf
PWC https://paperswithcode.com/paper/relation-modeling-with-graph-convolutional
Repo
Framework
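
To make the GCN step in the abstract above concrete, here is a minimal sketch of one graph-convolution layer in the widely used symmetric-normalized form, ReLU(D^-1/2 (A + I) D^-1/2 H W). The abstract does not specify the exact layer, so this form is an assumption. Each row of `feats` plays the role of one AU's latent vector, and `adj` encodes which AUs are connected. All names are illustrative.

```python
import math

def gcn_layer(adj, feats, weight):
    """One graph-convolution step on a graph with a non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    # Symmetric normalization by node degree.
    norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weight), len(weight[0])
    # Aggregate neighbour features for every node.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(n_in)]
           for i in range(n)]
    # Apply the shared linear map followed by ReLU.
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_in)))
             for o in range(n_out)]
            for i in range(n)]
```

Stacking such layers lets information flow between related AUs, and the updated node features can then be concatenated for detection as the abstract describes.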

Deep Industrial Espionage

Title Deep Industrial Espionage
Authors Samuel Albanie, James Thewlis, Sebastien Ehrhardt, Joao Henriques
Abstract The theory of deep learning is now considered largely solved, and is well understood by researchers and influencers alike. To maintain our relevance, we therefore seek to apply our skills to under-explored, lucrative applications of this technology. To this end, we propose Deep Industrial Espionage, an efficient end-to-end framework for industrial information propagation and productisation. Specifically, given a single image of a product or service, we aim to reverse-engineer, rebrand and distribute a copycat of the product at a profitable price-point to consumers in an emerging market, all within a single forward pass of a Neural Network. Differently from prior work in machine perception which has been restricted to classifying, detecting and reasoning about object instances, our method offers tangible business value in a wide range of corporate settings. Our approach draws heavily on a promising recent arxiv paper until its original authors’ names can no longer be read (we use felt tip pen). We then rephrase the anonymised paper, add the word “novel” to the title, and submit it to a prestigious, closed-access espionage journal who assure us that someday, we will be entitled to some fraction of their extortionate readership fees.
Tasks
Published 2019-04-01
URL http://arxiv.org/abs/1904.01114v1
PDF http://arxiv.org/pdf/1904.01114v1.pdf
PWC https://paperswithcode.com/paper/deep-industrial-espionage
Repo
Framework

Investigating the Successes and Failures of BERT for Passage Re-Ranking

Title Investigating the Successes and Failures of BERT for Passage Re-Ranking
Authors Harshith Padigela, Hamed Zamani, W. Bruce Croft
Abstract The bidirectional encoder representations from transformers (BERT) model has recently advanced the state-of-the-art in passage re-ranking. In this paper, we analyze the results produced by a fine-tuned BERT model to better understand the reasons behind such substantial improvements. To this aim, we focus on the MS MARCO passage re-ranking dataset and provide potential reasons for the successes and failures of BERT for retrieval. In more detail, we empirically study a set of hypotheses and provide additional analysis to explain the successful performance of BERT.
Tasks Passage Re-Ranking
Published 2019-05-05
URL https://arxiv.org/abs/1905.01758v1
PDF https://arxiv.org/pdf/1905.01758v1.pdf
PWC https://paperswithcode.com/paper/investigating-the-successes-and-failures-of
Repo
Framework