January 25, 2020

Paper Group NANR 38

PCNN: Environment Adaptive Model Without Finetuning


Title	PCNN: Environment Adaptive Model Without Finetuning
Authors	Boyuan Feng, Kun Wan, Shu Yang, Yufei Ding
Abstract	Convolutional Neural Networks (CNNs) have achieved tremendous success for many computer vision tasks, which shows a promising perspective of deploying CNNs on mobile platforms. An obstacle to this promising perspective is the tension between intensive resource consumption of CNNs and limited resource budget on mobile platforms. Existing works generally utilize a simpler architecture with lower accuracy for a higher energy-efficiency, i.e., trading accuracy for resource consumption. An emerging opportunity to both increase accuracy and decrease resource consumption is class skew, i.e., the strong temporal and spatial locality of the appearance of classes. However, it is challenging to efficiently utilize the class skew due to both the frequent switches and the huge number of class skews. Existing works use transfer learning to adapt the model towards the class skew during runtime, which consumes resources intensively. In this paper, we propose the probability layer, an easily-implemented and highly flexible add-on module that adapts the model efficiently during runtime without any fine-tuning and achieves equivalent or better performance than transfer learning. Further, both increased accuracy and decreased resource consumption can be achieved during runtime through the combination of the probability layer and pruning methods.
Tasks	Transfer Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=S1eVe2AqKX
PDF	https://openreview.net/pdf?id=S1eVe2AqKX
PWC	https://paperswithcode.com/paper/pcnn-environment-adaptive-model-without
Repo
Framework
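
The abstract does not give the formula behind the probability layer, so the following is only a sketch of one generic way to adapt a classifier to class skew without fine-tuning: re-weighting the softmax outputs by the ratio of the runtime class frequencies to the training class frequencies. The function name `reweight_for_class_skew` and the toy numbers are assumptions made for illustration, not the authors' implementation.

```python
def reweight_for_class_skew(probs, train_prior, runtime_prior):
    """Rescale class probabilities by runtime_prior / train_prior, then renormalize.

    Hypothetical helper: a generic prior-shift correction, not the paper's layer.
    """
    scaled = [p * r / t for p, t, r in zip(probs, train_prior, runtime_prior)]
    total = sum(scaled)
    if total == 0:
        # Every class was ruled out by the runtime prior; fall back to the input.
        return list(probs)
    return [s / total for s in scaled]


# Toy example: the model slightly prefers class 0, but class 1 dominates
# the current environment, so the adapted prediction shifts to class 1.
probs = [0.40, 0.35, 0.25]
train_prior = [1 / 3, 1 / 3, 1 / 3]
runtime_prior = [0.1, 0.8, 0.1]
adapted = reweight_for_class_skew(probs, train_prior, runtime_prior)
```

Because the adjustment only touches the output probabilities, no weights are updated, which is the property the abstract emphasizes.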

A Single Attention-Based Combination of CNN and RNN for Relation Classification


Title	A Single Attention-Based Combination of CNN and RNN for Relation Classification
Authors	Xiaoyu Guo, Hui Zhang, Haijun Yang, Lianyuan Xu, Zhiwen Ye
Abstract	As a vital task in natural language processing, relation classification aims to identify relation types between entities from texts. In this paper, we propose a novel Att-RCNN model to extract text features and classify relations by combining recurrent neural network (RNN) and convolutional neural network (CNN). This network structure utilizes RNN to extract higher level contextual representations of words and CNN to obtain sentence features for the relation classification task. In addition to this network structure, both word-level and sentence-level attention mechanisms are employed in Att-RCNN to strengthen critical words and features to promote the model performance. Moreover, we conduct experiments on four distinct datasets: SemEval-2010 task 8, SemEval-2018 task 7 (two subtask datasets), and KBP37 dataset. Compared with the previous public models, Att-RCNN has the overall best performance and achieves the highest F1 score, especially on the KBP37 dataset.
Tasks	Relation Classification
Published	2019-02-06
URL	https://ieeexplore.ieee.org/document/8606107
PDF	https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8606107
PWC	https://paperswithcode.com/paper/a-single-attention-based-combination-of-cnn
Repo
Framework

Proceedings of the 3rd Workshop on Neural Generation and Translation


Title	Proceedings of the 3rd Workshop on Neural Generation and Translation
Authors
Abstract
Tasks
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5600/
PDF	https://www.aclweb.org/anthology/D19-5600
PWC	https://paperswithcode.com/paper/proceedings-of-the-3rd-workshop-on-neural
Repo
Framework

An adaptable task-oriented dialog system for stand-alone embedded devices


Title	An adaptable task-oriented dialog system for stand-alone embedded devices
Authors	Long Duong, Vu Cong Duy Hoang, Tuyen Quang Pham, Yu-Heng Hong, Vladislavs Dovgalecs, Guy Bashkansky, Jason Black, Andrew Bleeker, Serge Le Huitouze, Mark Johnson
Abstract	This paper describes a spoken-language end-to-end task-oriented dialogue system for small embedded devices such as home appliances. While the current system implements a smart alarm clock with advanced calendar scheduling functionality, the system is designed to make it easy to port to other application domains (e.g., the dialogue component factors out domain-specific execution from domain-general actions such as requesting and updating slot values). The system does not require internet connectivity because all components, including speech recognition, natural language understanding, dialogue management, execution and text-to-speech, run locally on the embedded device (our demo uses a Raspberry Pi). This simplifies deployment, minimizes server costs and, most importantly, eliminates user privacy risks. A demo video for the alarm domain is available at youtu.be/N3IBMGocvHU
Tasks	Dialogue Management, Speech Recognition
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-3009/
PDF	https://www.aclweb.org/anthology/P19-3009
PWC	https://paperswithcode.com/paper/an-adaptable-task-oriented-dialog-system-for
Repo
Framework

The Effectiveness of Pre-Trained Code Embeddings


Title	The Effectiveness of Pre-Trained Code Embeddings
Authors	Ben Trevett, Donald Reay, N. K. Taylor
Abstract	Word embeddings are widely used in machine learning based natural language processing systems. It is common to use pre-trained word embeddings which provide benefits such as reduced training time and improved overall performance. There has been a recent interest in applying natural language processing techniques to programming languages. However, none of this recent work uses pre-trained embeddings on code tokens. Using extreme summarization as the downstream task, we show that using pre-trained embeddings on code tokens provides the same benefits as it does to natural languages, achieving: over 1.9x speedup, 5% improvement in test loss, 4% improvement in F1 scores, and resistance to over-fitting. We also show that the choice of language used for the embeddings does not have to match that of the task to achieve these benefits and that even embeddings pre-trained on human languages provide these benefits to programming languages.
Tasks	Word Embeddings
Published	2019-05-01
URL	https://openreview.net/forum?id=H1glKiCqtm
PDF	https://openreview.net/pdf?id=H1glKiCqtm
PWC	https://paperswithcode.com/paper/the-effectiveness-of-pre-trained-code
Repo
Framework

M^3RL: Mind-aware Multi-agent Management Reinforcement Learning


Title	M^3RL: Mind-aware Multi-agent Management Reinforcement Learning
Authors	Tianmin Shu, Yuandong Tian
Abstract	Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward. In this paper, we aim to address this from a different angle. In particular, we consider scenarios where there are self-interested agents (i.e., worker agents) which have their own minds (preferences, intentions, skills, etc.) and can not be dictated to perform tasks they do not wish to do. For achieving optimal coordination among these agents, we train a super agent (i.e., the manager) to manage them by first inferring their minds based on both current and past observations and then initiating contracts to assign suitable tasks to workers and promise to reward them with corresponding bonuses so that they will agree to work together. The objective of the manager is maximizing the overall productivity as well as minimizing payments made to the workers for ad-hoc worker teaming. To train the manager, we propose Mind-aware Multi-agent Management Reinforcement Learning (M^3RL), which consists of agent modeling and policy learning. We have evaluated our approach in two environments, Resource Collection and Crafting, to simulate multi-agent management problems with various task settings and multiple designs for the worker agents. The experimental results have validated the effectiveness of our approach in modeling worker agents’ minds online, and in achieving optimal ad-hoc teaming with good generalization and fast adaptation.
Tasks	Multi-agent Reinforcement Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=BkzeUiRcY7
PDF	https://openreview.net/pdf?id=BkzeUiRcY7
PWC	https://paperswithcode.com/paper/m3rl-mind-aware-multi-agent-management
Repo
Framework

Object Detection With Location-Aware Deformable Convolution and Backward Attention Filtering


Title	Object Detection With Location-Aware Deformable Convolution and Backward Attention Filtering
Authors	Chen Zhang, Joohee Kim
Abstract	Multi-class and multi-scale object detection for autonomous driving is challenging because of the high variation in object scales and the cluttered background in complex street scenes. Context information and high-resolution features are the keys to achieve a good performance in multi-scale object detection. However, context information is typically unevenly distributed, and the high-resolution feature map also contains distractive low-level features. In this paper, we propose a location-aware deformable convolution and a backward attention filtering to improve the detection performance. The location-aware deformable convolution extracts the unevenly distributed context features by sampling the input from where informative context exists. Different from the original deformable convolution, the proposed method applies an individual convolutional layer on each input sampling grid location to obtain a wide and unique receptive field for a better offset estimation. Meanwhile, the backward attention filtering module filters the high-resolution feature map by highlighting the informative features and suppressing the distractive features using the semantic features from the deep layers. Extensive experiments are conducted on the KITTI object detection and PASCAL VOC 2007 datasets. The proposed method shows an average 6% performance improvement over the Faster R-CNN baseline, and it has the top-3 performance on the KITTI leaderboard with the fastest processing speed.
Tasks	Autonomous Driving, Object Detection
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Zhang_Object_Detection_With_Location-Aware_Deformable_Convolution_and_Backward_Attention_Filtering_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhang_Object_Detection_With_Location-Aware_Deformable_Convolution_and_Backward_Attention_Filtering_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/object-detection-with-location-aware
Repo
Framework

Jumpout: Improved Dropout for Deep Neural Networks with Rectified Linear Units


Title	Jumpout: Improved Dropout for Deep Neural Networks with Rectified Linear Units
Authors	Shengjie Wang, Tianyi Zhou, Jeff Bilmes
Abstract	Dropout is a simple yet effective technique to improve generalization performance and prevent overfitting in deep neural networks (DNNs). In this paper, we discuss three novel observations about dropout to better understand the generalization of DNNs with rectified linear unit (ReLU) activations: 1) dropout is a smoothing technique that encourages each local linear model of a DNN to be trained on data points from nearby regions; 2) a constant dropout rate can result in effective neural-deactivation rates that are significantly different for layers with different fractions of activated neurons; and 3) the rescaling factor of dropout causes an inconsistency to occur between the normalization during training and testing conditions when batch normalization is also used. The above leads to three simple but nontrivial improvements to dropout resulting in our proposed method “Jumpout.” Jumpout samples the dropout rate using a monotone decreasing distribution (such as the right part of a truncated Gaussian), so the local linear model at each data point is trained, with high probability, to work better for data points from nearby than from more distant regions. Instead of tuning a dropout rate for each layer and applying it to all samples, jumpout moreover adaptively normalizes the dropout rate at each layer and every training sample/batch, so the effective dropout rate applied to the activated neurons is kept the same. Moreover, we rescale the outputs of jumpout for a better trade-off that keeps both the variance and mean of neurons more consistent between training and test phases, which mitigates the incompatibility between dropout and batch normalization. Compared to the original dropout, jumpout shows significantly improved performance on CIFAR10, CIFAR100, Fashion-MNIST, STL10, SVHN, ImageNet-1k, etc., while introducing negligible additional memory and computation costs.
Tasks
Published	2019-05-01
URL	https://openreview.net/forum?id=r1gRCiA5Ym
PDF	https://openreview.net/pdf?id=r1gRCiA5Ym
PWC	https://paperswithcode.com/paper/jumpout-improved-dropout-for-deep-neural
Repo
Framework
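
Two of the ideas in the abstract above can be sketched in a few lines: drawing the dropout rate from the right half of a truncated Gaussian, and normalizing that rate by the fraction of activated (positive) ReLU units. This is a minimal sketch under stated assumptions: the names `sample_dropout_rate` and `jumpout_like`, the values of `sigma` and `p_max`, and the choice to divide the rate by the active fraction are illustrative readings of the abstract. The output rescaling is plain inverted dropout, used as a stand-in because the abstract does not specify the modified rescaling.

```python
import random


def sample_dropout_rate(rng, sigma=0.1, p_max=0.5):
    """Draw a rate from the right half of a zero-mean Gaussian, truncated to [0, p_max]."""
    while True:
        p = abs(rng.gauss(0.0, sigma))
        if p <= p_max:  # rejection step implements the truncation
            return p


def jumpout_like(activations, rng, sigma=0.1, p_max=0.5):
    """Apply ReLU, then dropout with a sampled, activity-normalized rate."""
    if not activations:
        return []
    relu = [max(0.0, a) for a in activations]
    active_fraction = sum(1 for a in relu if a > 0) / len(relu)
    p = sample_dropout_rate(rng, sigma, p_max)
    # Assumption: divide by the active fraction so the rate felt by the
    # activated units is comparable across layers; cap it at p_max.
    p_eff = min(p / active_fraction, p_max) if active_fraction > 0 else 0.0
    keep = 1.0 - p_eff
    # Stand-in rescaling (standard inverted dropout), not the paper's variant.
    return [a / keep if rng.random() < keep else 0.0 for a in relu]


out = jumpout_like([-1.0, 2.0, 3.0, -0.5], random.Random(0))
```

Small rates are drawn most often because the half-Gaussian density decreases monotonically, which matches the abstract's description of the sampling distribution.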

Harvey Mudd College at SemEval-2019 Task 4: The Carl Kolchak Hyperpartisan News Detector


Title	Harvey Mudd College at SemEval-2019 Task 4: The Carl Kolchak Hyperpartisan News Detector
Authors	Celena Chen, Celine Park, Jason Dwyer, Julie Medero
Abstract	We use various natural language processing and machine learning methods to perform the Hyperpartisan News Detection task. In particular, some of the features we look at are bag-of-words features, the title's length, number of capitalized words in the title, and the sentiment of the sentences and the title. By adding these features, we see improvements in our evaluation metrics compared to the baseline values. We find that sentiment analysis helps improve our evaluation metrics. We do not see a benefit from feature selection. Overall, our system achieves an accuracy of 0.739, finishing 18th out of 42 submissions to the task. From our work, it is evident that both title features and sentiment of articles are meaningful to the hyperpartisanship of news articles.
Tasks	Feature Selection, Sentiment Analysis
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2164/
PDF	https://www.aclweb.org/anthology/S19-2164
PWC	https://paperswithcode.com/paper/harvey-mudd-college-at-semeval-2019-task-4
Repo
Framework
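
The title features named in the abstract above are simple to compute. The sketch below is illustrative only: the function name `title_features` is hypothetical, title length is measured in characters, and a word counts as capitalized when its first character is uppercase. The abstract does not say which definitions the authors used.

```python
def title_features(title):
    """Return simple surface features of a news title (hypothetical helper)."""
    words = title.split()
    return {
        # Assumption: length in characters; a word count is reported as well.
        "title_length": len(title),
        "num_words": len(words),
        # Assumption: "capitalized" means the first character is uppercase.
        "num_capitalized_words": sum(1 for w in words if w[0].isupper()),
    }


features = title_features("BREAKING: Senate Passes New bill")
```

In a full system these values would be concatenated with bag-of-words and sentiment features before being passed to a classifier.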

ProSeqo: Projection Sequence Networks for On-Device Text Classification


Title	ProSeqo: Projection Sequence Networks for On-Device Text Classification
Authors	Zornitsa Kozareva, Sujith Ravi
Abstract	We propose a novel on-device sequence model for text classification using recurrent projections. Our model ProSeqo uses dynamic recurrent projections without the need to store or look up any pre-trained embeddings. This results in fast and compact neural networks that can perform on-device inference for complex short and long text classification tasks. We conducted exhaustive evaluation on multiple text classification tasks. Results show that ProSeqo outperformed state-of-the-art neural and on-device approaches for short text classification tasks such as dialog act and intent prediction. To the best of our knowledge, ProSeqo is the first on-device long text classification neural model. It achieved comparable results to previous neural approaches for news article, answers and product categorization, while preserving small memory footprint and maintaining high accuracy.
Tasks	Product Categorization, Text Classification
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1402/
PDF	https://www.aclweb.org/anthology/D19-1402
PWC	https://paperswithcode.com/paper/proseqo-projection-sequence-networks-for-on
Repo
Framework

Automatic Product Categorization for Official Statistics


Title	Automatic Product Categorization for Official Statistics
Authors	Andrea Roberson
Abstract	The North American Product Classification System (NAPCS) is a comprehensive, hierarchical classification system for products (goods and services) that is consistent across the three North American countries. Beginning in 2017, the Economic Census will use NAPCS to produce economy-wide product tabulations. Respondents are asked to report data from a long, pre-specified list of potential products in a given industry, with some lists containing more than 50 potential products. Businesses have expressed the desire to alternatively supply Universal Product Codes (UPC) to the U.S. Census Bureau. Much work has been done around the categorization of products using product descriptions. No study has applied these efforts for the calculation of official statistics (statistics published by government agencies) using only the text of UPC product descriptions. The question we address in this paper is: Given UPC codes and their associated product descriptions, can we accurately predict NAPCS? We tested the feasibility of businesses submitting a spreadsheet with Universal Product Codes and their associated text descriptions. This novel strategy classified text with very high accuracy rates: all of our algorithms surpassed 90 percent.
Tasks	Product Categorization
Published	2019-08-01
URL	https://www.aclweb.org/anthology/papers/W/W19/W19-3623/
PDF	https://www.aclweb.org/anthology/W19-3623
PWC	https://paperswithcode.com/paper/automatic-product-categorization-for-official
Repo
Framework

Spider-Jerusalem at SemEval-2019 Task 4: Hyperpartisan News Detection


Title	Spider-Jerusalem at SemEval-2019 Task 4: Hyperpartisan News Detection
Authors	Amal Alabdulkarim, Tariq Alhindi
Abstract	This paper describes our system for detecting hyperpartisan news articles, which was submitted for the shared task in SemEval 2019 on Hyperpartisan News Detection. We developed a Support Vector Machine (SVM) model that uses TF-IDF of tokens, Linguistic Inquiry and Word Count (LIWC) features, and structural features such as number of paragraphs and hyperlink count in an article. The model was trained on 645 articles from two classes: mainstream and hyperpartisan. Our system was ranked seventeenth out of forty-two participating teams in the binary classification task with an accuracy score of 0.742 on the blind test set (the accuracy of the top ranked system was 0.822). We provide a detailed description of our preprocessing steps, discussion of our experiments using different combinations of features, and analysis of our results and prediction errors.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2170/
PDF	https://www.aclweb.org/anthology/S19-2170
PWC	https://paperswithcode.com/paper/spider-jerusalem-at-semeval-2019-task-4
Repo
Framework
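
To make the feature set in the abstract above concrete, here is a minimal standard-library sketch of TF-IDF token weights and the two structural features it mentions (paragraph count and hyperlink count). The tokenizer, the exact TF-IDF variant, the blank-line paragraph convention, and all function names are assumptions for illustration. The LIWC features and the SVM itself are omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_vectors(documents):
    """Return one {token: tf * idf} dict per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append({
            tok: (c / total) * math.log(n_docs / doc_freq[tok])
            for tok, c in counts.items()
        })
    return vectors


def structural_features(article):
    """Count paragraphs (blank-line separated) and http(s) links."""
    paragraphs = [p for p in re.split(r"\n\s*\n", article) if p.strip()]
    hyperlinks = re.findall(r"https?://\S+", article)
    return {"num_paragraphs": len(paragraphs), "num_hyperlinks": len(hyperlinks)}


vecs = tfidf_vectors(["the senate voted", "the house voted today"])
struct = structural_features("First paragraph http://a.example\n\nSecond paragraph")
```

Tokens that occur in every document, such as "the" in the toy corpus, receive a weight of zero under this unsmoothed IDF, so only distinguishing tokens contribute.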

Relating Word Embedding Gender Biases to Gender Gaps: A Cross-Cultural Analysis


Title	Relating Word Embedding Gender Biases to Gender Gaps: A Cross-Cultural Analysis
Authors	Scott Friedman, Sonja Schmer-Galunder, Anthony Chen, Jeffrey Rye
Abstract	Modern models for common NLP tasks often employ machine learning techniques and train on journalistic, social media, or other culturally-derived text. These have recently been scrutinized for racial and gender biases, stemming from inherent bias in their training text. These biases are often sub-optimal and recent work poses methods to rectify them; however, these biases may shed light on actual racial or gender gaps in the culture(s) that produced the training text, thereby helping us understand cultural context through big data. This paper presents an approach for quantifying gender bias in word embeddings, and then using them to characterize statistical gender gaps in education, politics, economics, and health. We validate these metrics on 2018 Twitter data spanning 51 U.S. regions and 99 countries. We correlate state and country word embedding biases with 18 international and 5 U.S.-based statistical gender gaps, characterizing regularities and predictive strength.
Tasks	Word Embeddings
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-3803/
PDF	https://www.aclweb.org/anthology/W19-3803
PWC	https://paperswithcode.com/paper/relating-word-embedding-gender-biases-to
Repo
Framework
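
The abstract above does not state which bias metric the authors use. One common formulation, shown here purely as an illustration, scores a word by the difference between its cosine similarity to a male reference vector and to a female reference vector. The function names and the two-dimensional toy vectors are assumptions and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def gender_bias(word_vec, male_vec, female_vec):
    """Positive values lean toward the male reference, negative toward the female one."""
    return cosine(word_vec, male_vec) - cosine(word_vec, female_vec)


# Toy 2-D embeddings, chosen only to make the arithmetic easy to follow.
male, female = [1.0, 0.0], [0.0, 1.0]
bias_a = gender_bias([1.0, 0.0], male, female)
bias_b = gender_bias([1.0, 1.0], male, female)
```

Scores of this kind, computed per region, could then be correlated with region-level gender gap statistics, which is the kind of analysis the abstract describes.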

DeepVCP: An End-to-End Deep Neural Network for Point Cloud Registration


Title	DeepVCP: An End-to-End Deep Neural Network for Point Cloud Registration
Authors	Weixin Lu, Guowei Wan, Yao Zhou, Xiangyu Fu, Pengfei Yuan, Shiyu Song
Abstract	We present DeepVCP - a novel end-to-end learning-based 3D point cloud registration framework that achieves comparable registration accuracy to prior state-of-the-art geometric methods. Different from other keypoint based methods where a RANSAC procedure is usually needed, we implement the use of various deep neural network structures to establish an end-to-end trainable network. Our keypoint detector is trained through this end-to-end structure and enables the system to avoid the interference of dynamic objects, leverages the help of sufficiently salient features on stationary objects, and as a result, achieves high robustness. Rather than searching the corresponding points among existing points, the key contribution is that we innovatively generate them based on learned matching probabilities among a group of candidates, which can boost the registration accuracy. We comprehensively validate the effectiveness of our approach using both the KITTI dataset and the Apollo-SouthBay dataset. Results demonstrate that our method achieves comparable registration accuracy and runtime efficiency to the state-of-the-art geometry-based methods, but with higher robustness to inaccurate initial poses. Detailed ablation and visualization analysis are included to further illustrate the behavior and insights of our network. The low registration error and high robustness of our method make it attractive to the substantial applications relying on the point cloud registration task.
Tasks	Point Cloud Registration
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Lu_DeepVCP_An_End-to-End_Deep_Neural_Network_for_Point_Cloud_Registration_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Lu_DeepVCP_An_End-to-End_Deep_Neural_Network_for_Point_Cloud_Registration_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/deepvcp-an-end-to-end-deep-neural-network-for
Repo
Framework
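
The key idea in the abstract above, generating a corresponding point from matching probabilities over a group of candidates instead of picking an existing point, can be illustrated as a probability-weighted average of candidate positions. This is a sketch under assumptions: the softmax over raw scores and the names `softmax` and `generate_corresponding_point` are illustrative, and in the actual system the scores would come from the learned network.

```python
import math


def softmax(scores):
    """Convert raw matching scores to probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate_corresponding_point(candidates, scores):
    """Return the probability-weighted mean of 3D candidate points."""
    probs = softmax(scores)
    return tuple(
        sum(p * c[axis] for p, c in zip(probs, candidates))
        for axis in range(3)
    )


# Two candidates with equal scores yield their midpoint.
point = generate_corresponding_point([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [0.0, 0.0])
```

Because the result is a smooth function of the scores, the generated point can fall between candidates, and the operation stays differentiable for end-to-end training.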

Topology Reconstruction of Tree-Like Structure in Images via Structural Similarity Measure and Dominant Set Clustering


Title	Topology Reconstruction of Tree-Like Structure in Images via Structural Similarity Measure and Dominant Set Clustering
Authors	Jianyang Xie, Yitian Zhao, Yonghuai Liu, Pan Su, Yifan Zhao, Jun Cheng, Yalin Zheng, Jiang Liu
Abstract	The reconstruction and analysis of tree-like topological structures in biomedical images are crucial for biologists and surgeons to understand biomedical conditions and plan surgical procedures. The underlying tree-structure topology reveals how different curvilinear components are anatomically connected to each other. Existing automated topology reconstruction methods have great difficulty in identifying the connectivity when two or more curvilinear components cross or bifurcate, due to their projection ambiguity, imaging noise and low contrast. In this paper, we propose a novel curvilinear structural similarity measure to guide a dominant-set clustering approach to address this indispensable issue. The novel similarity measure takes into account both intensity and geometric properties in representing the curvilinear structure locally and globally, and groups curvilinear objects at crossover points into different connected branches by dominant-set clustering. The proposed method is applicable to different imaging modalities, and quantitative and qualitative results on retinal vessel, plant root, and neuronal network datasets show that our methodology is capable of advancing the current state-of-the-art techniques.
Tasks
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Xie_Topology_Reconstruction_of_Tree-Like_Structure_in_Images_via_Structural_Similarity_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Xie_Topology_Reconstruction_of_Tree-Like_Structure_in_Images_via_Structural_Similarity_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/topology-reconstruction-of-tree-like
Repo
Framework