January 25, 2020


Paper Group NANR 56

Learning to Learn with Conditional Class Dependencies. UBC-NLP at SemEval-2019 Task 4: Hyperpartisan News Detection With Attention-Based Bi-LSTMs. Exploring Social Bias in Chatbots using Stereotype Knowledge. QE BERT: Bilingual BERT Using Multi-task Learning for Neural Quality Estimation. Sentiment Analysis for Multilingual Corpora. Recursive Routi …

Learning to Learn with Conditional Class Dependencies


Title	Learning to Learn with Conditional Class Dependencies
Authors	Xiang Jiang, Mohammad Havaei, Farshid Varno, Gabriel Chartrand, Nicolas Chapados, Stan Matwin
Abstract	Neural networks can learn to extract statistical properties from data, but they seldom make use of structured information from the label space to help representation learning. Although some label structure can implicitly be obtained when training on huge amounts of data, in a few-shot learning context where little data is available, making explicit use of the label structure can inform the model to reshape the representation space to reflect a global sense of class dependencies. We propose a meta-learning framework, Conditional class-Aware Meta-Learning (CAML), that conditionally transforms feature representations based on a metric space that is trained to capture inter-class dependencies. This enables a conditional modulation of the feature representations of the base-learner to impose regularities informed by the label space. Experiments show that the conditional transformation in CAML leads to more disentangled representations and achieves competitive results on the miniImageNet benchmark.
Tasks	Few-Shot Learning, Meta-Learning, Representation Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=BJfOXnActQ
PDF	https://openreview.net/pdf?id=BJfOXnActQ
PWC	https://paperswithcode.com/paper/learning-to-learn-with-conditional-class
Repo
Framework

UBC-NLP at SemEval-2019 Task 4: Hyperpartisan News Detection With Attention-Based Bi-LSTMs


Title	UBC-NLP at SemEval-2019 Task 4: Hyperpartisan News Detection With Attention-Based Bi-LSTMs
Authors	Chiyu Zhang, Arun Rajendran, Muhammad Abdul-Mageed
Abstract	We present our deep learning models submitted to the SemEval-2019 Task 4 competition on Hyperpartisan News Detection. We obtain our best results with a Bi-LSTM network equipped with a self-attention mechanism. Among 33 participating teams, our submitted system ranks top 7 (65.3% accuracy) on the 'labels-by-publisher' sub-task and top 24 out of 44 teams (68.3% accuracy) on the 'labels-by-article' sub-task. We also report a model that scores higher than the 8th ranking system (78.5% accuracy) on the 'labels-by-article' sub-task.
Tasks
Published	2019-06-01
URL	https://www.aclweb.org/anthology/S19-2188/
PDF	https://www.aclweb.org/anthology/S19-2188
PWC	https://paperswithcode.com/paper/ubc-nlp-at-semeval-2019-task-4-hyperpartisan
Repo
Framework

Exploring Social Bias in Chatbots using Stereotype Knowledge


Title	Exploring Social Bias in Chatbots using Stereotype Knowledge
Authors	Nayeon Lee, Andrea Madotto, Pascale Fung
Abstract	Exploring social bias in chatbots is an important, yet relatively unexplored problem. In this paper, we propose an approach to understand social bias in chatbots by leveraging stereotype knowledge. It allows interesting comparisons of bias between chatbots and humans, and provides intuitive analysis of existing chatbots by borrowing the finer-grained concepts of sexism and racism.
Tasks	Chatbot
Published	2019-08-01
URL	https://www.aclweb.org/anthology/papers/W/W19/W19-3655/
PDF	https://www.aclweb.org/anthology/W19-3655
PWC	https://paperswithcode.com/paper/exploring-social-bias-in-chatbots-using
Repo
Framework

QE BERT: Bilingual BERT Using Multi-task Learning for Neural Quality Estimation


Title	QE BERT: Bilingual BERT Using Multi-task Learning for Neural Quality Estimation
Authors	Hyun Kim, Joon-Ho Lim, Hyun-Ki Kim, Seung-Hoon Na
Abstract	For translation quality estimation at word and sentence levels, this paper presents a novel approach based on BERT, which has recently achieved impressive results on various natural language processing tasks. Our proposed model is a BERT re-purposed for translation quality estimation and uses multi-task learning for the sentence-level task and word-level subtasks (i.e., source word, target word, and target gap). Experimental results on the WMT19 Quality Estimation shared task show that our systems achieve competitive results and provide significant improvements over the baseline.
Tasks	Multi-Task Learning
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-5407/
PDF	https://www.aclweb.org/anthology/W19-5407
PWC	https://paperswithcode.com/paper/qe-bert-bilingual-bert-using-multi-task
Repo
Framework

Sentiment Analysis for Multilingual Corpora


Title	Sentiment Analysis for Multilingual Corpora
Authors	Svitlana Galeshchuk, Ju Qiu, Julien Jourdan
Abstract	The paper presents a generic approach to the supervised sentiment analysis of social media content in Slavic languages. The method proposes translating the documents from the original language to English with Google's Neural Translation Model. The resulting texts are then converted to vectors by averaging the vectorial representation of words derived from a pre-trained Word2Vec English model. Testing the approach with several machine learning methods on Polish, Slovenian and Croatian Twitter datasets returns up to 86% classification accuracy on the out-of-sample data.
Tasks	Sentiment Analysis
Published	2019-08-01
URL	https://www.aclweb.org/anthology/W19-3717/
PDF	https://www.aclweb.org/anthology/W19-3717
PWC	https://paperswithcode.com/paper/sentiment-analysis-for-multilingual-corpora
Repo
Framework
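
The vector-averaging step described in this abstract can be illustrated with a minimal sketch. It assumes word vectors are already available as a plain dictionary; the function name `average_embedding` and the toy 2-dimensional vocabulary are hypothetical and do not come from the paper, which uses a pre-trained Word2Vec English model.

```python
from typing import Dict, List, Optional

Vector = List[float]


def average_embedding(tokens: List[str], vectors: Dict[str, Vector]) -> Optional[Vector]:
    """Average the vectors of all in-vocabulary tokens; return None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    # Component-wise mean over the known word vectors.
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Toy vocabulary with made-up values (an assumption, not real Word2Vec output).
toy_vectors = {"good": [1.0, 0.0], "movie": [0.0, 1.0], "bad": [-1.0, 0.0]}

print(average_embedding("good movie".split(), toy_vectors))  # prints [0.5, 0.5]
```

Out-of-vocabulary tokens are simply skipped here; how the authors handle them is not stated in the abstract.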

Recursive Routing Networks: Learning to Compose Modules for Language Understanding


Title	Recursive Routing Networks: Learning to Compose Modules for Language Understanding
Authors	Ignacio Cases, Clemens Rosenbaum, Matthew Riemer, Atticus Geiger, Tim Klinger, Alex Tamkin, Olivia Li, Sandhini Agarwal, Joshua D. Greene, Dan Jurafsky, Christopher Potts, Lauri Karttunen
Abstract	We introduce Recursive Routing Networks (RRNs), which are modular, adaptable models that learn effectively in diverse environments. RRNs consist of a set of functions, typically organized into a grid, and a meta-learner decision-making component called the router. The model jointly optimizes the parameters of the functions and the meta-learner's policy for routing inputs through those functions. RRNs can be incorporated into existing architectures in a number of ways; we explore adding them to word representation layers, recurrent network hidden layers, and classifier layers. Our evaluation task is natural language inference (NLI). Using the MultiNLI corpus, we show that an RRN's routing decisions reflect the high-level genre structure of that corpus. To show that RRNs can learn to specialize to more fine-grained semantic distinctions, we introduce a new corpus of NLI examples involving implicative predicates, and show that the model components become fine-tuned to the inferential signatures that are characteristic of these predicates.
Tasks	Decision Making, Natural Language Inference
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1365/
PDF	https://www.aclweb.org/anthology/N19-1365
PWC	https://paperswithcode.com/paper/recursive-routing-networks-learning-to
Repo
Framework

FEED: Feature-level Ensemble Effect for knowledge Distillation


Title	FEED: Feature-level Ensemble Effect for knowledge Distillation
Authors	SeongUk Park, Nojun Kwak
Abstract	This paper proposes a versatile and powerful training algorithm named Feature-level Ensemble Effect for knowledge Distillation (FEED), which is inspired by the work of factor transfer. Factor transfer is a knowledge transfer method that improves the performance of a student network with a strong teacher network. It transfers the knowledge of a teacher at the feature map level using a high-capacity teacher network, and our training algorithm FEED is an extension of it. FEED aims to transfer ensemble knowledge, using either multiple teachers in parallel or multiple training sequences. Adapting the peer-teaching framework, we introduce a couple of training algorithms that transfer ensemble knowledge to the student at the feature map level, both of which help the student network find more generalized solutions in the parameter space. Experimental results on CIFAR-100 and ImageNet show that our method, FEED, has clear performance enhancements, without introducing any additional parameters or computations at test time.
Tasks	Transfer Learning
Published	2019-05-01
URL	https://openreview.net/forum?id=BJxYEsAqY7
PDF	https://openreview.net/pdf?id=BJxYEsAqY7
PWC	https://paperswithcode.com/paper/feed-feature-level-ensemble-effect-for
Repo
Framework

A Partially Rule-Based Approach to AMR Generation


Title	A Partially Rule-Based Approach to AMR Generation
Authors	Emma Manning
Abstract	This paper presents a new approach to generating English text from Abstract Meaning Representation (AMR). In contrast to the neural and statistical MT approaches used in other AMR generation systems, this one is largely rule-based, supplemented only by a language model and simple statistical linearization models, allowing for more control over the output. We also address the difficulties of automatically evaluating AMR generation systems and the problems with BLEU for this task. We compare automatic metrics to human evaluations and show that while METEOR and TER arguably reflect human judgments better than BLEU, further research into suitable evaluation metrics is needed.
Tasks	Language Modelling
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-3009/
PDF	https://www.aclweb.org/anthology/N19-3009
PWC	https://paperswithcode.com/paper/a-partially-rule-based-approach-to-amr
Repo
Framework

Leveraging Medical Literature for Section Prediction in Electronic Health Records


Title	Leveraging Medical Literature for Section Prediction in Electronic Health Records
Authors	Sara Rosenthal, Ken Barker, Zhicheng Liang
Abstract	Electronic Health Records (EHRs) contain both structured content and unstructured (text) content about a patient's medical history. In the unstructured text parts, there are common sections such as Assessment and Plan, Social History, and Medications. These sections help physicians find information easily and can be used by an information retrieval system to return specific information sought by a user. However, it is common that the exact format of sections in a particular EHR does not adhere to known patterns. Therefore, being able to predict sections and headers in EHRs automatically is beneficial to physicians. Prior approaches in EHR section prediction have only used text data from EHRs and have required significant manual annotation. We propose using sections from medical literature (e.g., textbooks, journals, web content) that contain content similar to that found in EHR sections. Our approach uses data from a different kind of source where labels are provided without the need of a time-consuming annotation effort. We use this data to train two models: an RNN and a BERT-based model. We apply the learned models along with source data via transfer learning to predict sections in EHRs. Our results show that medical literature can provide helpful supervision signal for this classification task.
Tasks	Information Retrieval, Transfer Learning
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-1492/
PDF	https://www.aclweb.org/anthology/D19-1492
PWC	https://paperswithcode.com/paper/leveraging-medical-literature-for-section
Repo
Framework

Cross-lingual Subjectivity Detection for Resource Lean Languages


Title	Cross-lingual Subjectivity Detection for Resource Lean Languages
Authors	Ida Amini, Samane Karimi, Azadeh Shakery
Abstract	Wide and universal changes in web content due to the growth of web 2 applications increase the importance of user-generated content on the web. Therefore, related research areas such as sentiment analysis, opinion mining and subjectivity detection receive much attention from the research community. Due to the diverse languages that web users use to express their opinions and sentiments, research areas like subjectivity detection should present methods which are practicable for all languages. An important prerequisite to effectively achieve this aim is considering the limitations in resource-lean languages. In this paper, cross-lingual subjectivity detection on resource-lean languages is investigated using two different approaches: a language-model-based and a learning-to-rank approach. Experimental results show the impact of different factors on the performance of subjectivity detection methods using English resources to detect the subjectivity score of Persian documents. The experiments demonstrate that the proposed learning-to-rank method outperforms the baseline method in ranking documents based on their subjectivity degree.
Tasks	Language Modelling, Learning-To-Rank, Opinion Mining, Sentiment Analysis
Published	2019-06-01
URL	https://www.aclweb.org/anthology/W19-1310/
PDF	https://www.aclweb.org/anthology/W19-1310
PWC	https://paperswithcode.com/paper/cross-lingual-subjectivity-detection-for
Repo
Framework

MANIFOLDNET: A DEEP NEURAL NETWORK FOR MANIFOLD-VALUED DATA


Title	MANIFOLDNET: A DEEP NEURAL NETWORK FOR MANIFOLD-VALUED DATA
Authors	Rudrasis Chakraborty, Jose Bouza, Jonathan Manton, Baba C. Vemuri
Abstract	Developing deep neural networks (DNNs) for manifold-valued data sets has gained much interest of late in the deep learning research community. Examples of manifold-valued data include data from omnidirectional cameras on automobiles, drones etc., diffusion magnetic resonance imaging, elastography and others. In this paper, we present a novel theoretical framework for DNNs to cope with manifold-valued data inputs. In doing this generalization, we draw parallels to the widely popular convolutional neural networks (CNNs). We call our network the ManifoldNet. As in vector spaces where convolutions are equivalent to computing the weighted mean of functions, an analogous definition for manifold-valued data can be constructed involving the computation of the weighted Fréchet Mean (wFM). To this end, we present a provably convergent recursive computation of the wFM of the given data, where the weights make up the convolution mask, to be learned. Further, we prove that the proposed wFM layer achieves a contraction mapping and hence the ManifoldNet does not need the additional non-linear ReLU unit used in standard CNNs. Operations such as pooling in traditional CNN are no longer necessary in this setting since wFM is already a pooling type operation. Analogous to the equivariance of convolution in Euclidean space to translations, we prove that the wFM is equivariant to the action of the group of isometries admitted by the Riemannian manifold on which the data reside. This equivariance property facilitates weight sharing within the network. We present experiments, using the ManifoldNet framework, to achieve video classification and image reconstruction using an auto-encoder+decoder setting. Experimental results demonstrate the efficacy of ManifoldNet in the context of classification and reconstruction accuracy.
Tasks	Image Reconstruction, Video Classification
Published	2019-05-01
URL	https://openreview.net/forum?id=SyzjBiR9t7
PDF	https://openreview.net/pdf?id=SyzjBiR9t7
PWC	https://paperswithcode.com/paper/manifoldnet-a-deep-neural-network-for
Repo
Framework
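
To give a feel for what a recursive weighted Fréchet mean computation can look like, here is a minimal sketch on the unit sphere. It uses a common incremental scheme: start at the first point, then move along the geodesic toward each new point by the fraction w_n / (w_1 + ... + w_n). Whether this matches the paper's exact estimator is an assumption; the names `slerp` and `recursive_wfm` are hypothetical, and the code assumes positive weights and no antipodal pairs of points.

```python
import math
from typing import List, Sequence


def slerp(p: Sequence[float], q: Sequence[float], t: float) -> List[float]:
    """Point at fraction t along the shortest geodesic from p to q on the unit sphere."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    theta = math.acos(dot)  # geodesic distance between p and q
    if theta < 1e-12:
        return list(p)  # points coincide; nothing to interpolate
    s = math.sin(theta)
    a = math.sin((1.0 - t) * theta) / s
    b = math.sin(t * theta) / s
    return [a * x + b * y for x, y in zip(p, q)]


def recursive_wfm(points: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Incremental weighted mean on the sphere (assumed scheme, see lead-in)."""
    mean = list(points[0])
    total = weights[0]
    for x, w in zip(points[1:], weights[1:]):
        total += w
        # Step toward the new point by its share of the accumulated weight.
        mean = slerp(mean, x, w / total)
    return mean


# Three points on the unit circle at 0, 30 and 60 degrees, equally weighted.
pts = [[math.cos(a), math.sin(a)] for a in (0.0, math.pi / 6, math.pi / 3)]
m = recursive_wfm(pts, [1.0, 1.0, 1.0])
print(math.degrees(math.atan2(m[1], m[0])))
```

In flat Euclidean space the same recursion, with straight-line interpolation in place of `slerp`, reduces to the ordinary running weighted average.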

Joint Prediction for Kinematic Trajectories in Vehicle-Pedestrian-Mixed Scenes


Title	Joint Prediction for Kinematic Trajectories in Vehicle-Pedestrian-Mixed Scenes
Authors	Huikun Bi, Zhong Fang, Tianlu Mao, Zhaoqi Wang, Zhigang Deng
Abstract	Trajectory prediction for objects is challenging and critical for various applications (e.g., autonomous driving, and anomaly detection). Most of the existing methods focus on homogeneous pedestrian trajectory prediction, where pedestrians are treated as particles without size. However, they fall short of handling crowded vehicle-pedestrian-mixed scenes directly since vehicles, constrained by kinematics in reality, should ideally be treated as rigid, non-particle objects. In this paper, we tackle this problem using separate LSTMs for heterogeneous vehicles and pedestrians. Specifically, we use an oriented bounding box to represent each vehicle, calculated based on its position and orientation, to denote its kinematic trajectories. We then propose a framework called VP-LSTM to predict the kinematic trajectories of both vehicles and pedestrians simultaneously. In order to evaluate our model, a large dataset containing the trajectories of both vehicles and pedestrians in vehicle-pedestrian-mixed scenes is specially built. Through comparisons between our method and state-of-the-art approaches, we show the effectiveness and advantages of our method on kinematic trajectory prediction in vehicle-pedestrian-mixed scenes.
Tasks	Anomaly Detection, Autonomous Driving, Trajectory Prediction
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Bi_Joint_Prediction_for_Kinematic_Trajectories_in_Vehicle-Pedestrian-Mixed_Scenes_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Bi_Joint_Prediction_for_Kinematic_Trajectories_in_Vehicle-Pedestrian-Mixed_Scenes_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/joint-prediction-for-kinematic-trajectories
Repo
Framework
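
The oriented bounding box mentioned in this abstract is computed from a vehicle's position and orientation. The sketch below shows one standard way to obtain the four corners of such a box in 2D. The parametrization (center, heading angle in radians, length, width) and the function name `oriented_box_corners` are assumptions made for illustration, not details taken from the paper.

```python
import math
from typing import List, Tuple


def oriented_box_corners(
    cx: float, cy: float, heading: float, length: float, width: float
) -> List[Tuple[float, float]]:
    """Corners of a rectangle centered at (cx, cy) and rotated by `heading` radians."""
    c, s = math.cos(heading), math.sin(heading)
    hl, hw = length / 2.0, width / 2.0
    # Corners in the vehicle's own frame: front-left, front-right, rear-right, rear-left.
    local = [(hl, hw), (hl, -hw), (-hl, -hw), (-hl, hw)]
    # Rotate each corner by the heading, then translate to the vehicle position.
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in local]


# A 4 x 2 vehicle at the origin, heading along the positive x-axis.
print(oriented_box_corners(0.0, 0.0, 0.0, 4.0, 2.0))
```

Unlike a single point, this representation changes when the vehicle turns in place, which is the kind of information a particle model discards.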

Probabilistic Model-Based Dynamic Architecture Search


Title	Probabilistic Model-Based Dynamic Architecture Search
Authors	Nozomu Yoshinari, Kento Uchida, Shota Saito, Shinichi Shirakawa, Youhei Akimoto
Abstract	The architecture search methods for convolutional neural networks (CNNs) have shown promising results. These methods require significant computational resources, as they repeat the neural network training many times to evaluate and search the architectures. Developing a computationally efficient architecture search method is an important research topic. In this paper, we assume that the structure parameters of CNNs are categorical variables, such as types and connectivities of layers, and they are regarded as the learnable parameters. Introducing the multivariate categorical distribution as the underlying distribution for the structure parameters, we formulate a differentiable loss for the training task, where the training of the weights and the optimization of the parameters of the distribution for the structure parameters are coupled. They are trained using stochastic gradient descent, leading to the optimization of the structure parameters within a single training run. We apply the proposed method to search the architecture for two computer vision tasks: image classification and inpainting. The experimental results show that the proposed architecture search method is fast and can achieve comparable performance to the existing methods.
Tasks	Image Classification, Neural Architecture Search
Published	2019-05-01
URL	https://openreview.net/forum?id=Ske1-209Y7
PDF	https://openreview.net/pdf?id=Ske1-209Y7
PWC	https://paperswithcode.com/paper/probabilistic-model-based-dynamic
Repo
Framework

Understanding and Improving Hidden Representations for Neural Machine Translation


Title	Understanding and Improving Hidden Representations for Neural Machine Translation
Authors	Guanlin Li, Lemao Liu, Xintong Li, Conghui Zhu, Tiejun Zhao, Shuming Shi
Abstract	Multilayer architectures are currently the gold standard for large-scale neural machine translation. Existing works have explored some methods for understanding the hidden representations; however, they have not sought to improve the translation quality rationally according to their understanding. Towards understanding for performance improvement, we first artificially construct a sequence of nested relative tasks and measure the feature generalization ability of the learned hidden representation over these tasks. Based on our understanding, we then propose to regularize the layer-wise representations with all tree-induced tasks. To overcome the computational bottleneck resulting from the large number of regularization terms, we design efficient approximation methods by selecting a few coarse-to-fine tasks for regularization. Extensive experiments on two widely-used datasets demonstrate that the proposed methods only lead to small extra overheads in training but no additional overheads in testing, and achieve consistent improvements (up to +1.3 BLEU) compared to the state-of-the-art translation model.
Tasks	Machine Translation
Published	2019-06-01
URL	https://www.aclweb.org/anthology/N19-1046/
PDF	https://www.aclweb.org/anthology/N19-1046
PWC	https://paperswithcode.com/paper/understanding-and-improving-hidden
Repo
Framework

Deep RL-based Trajectory Planning for AoI Minimization in UAV-assisted IoT


Title	Deep RL-based Trajectory Planning for AoI Minimization in UAV-assisted IoT
Authors	Conghao Zhou, Hongli He, Peng Yang, Feng Lyu, Wen Wu, Nan Cheng, Xuemin (Sherman) Shen
Abstract	Due to the flexibility and low deployment cost, unmanned aerial vehicles (UAVs) have been widely used to assist cellular networks in providing extended coverage for Internet of Things (IoT) networks. Existing throughput or delay-based UAV trajectory planning methods cannot meet the requirement of collecting fresh information from IoT devices. In this paper, by taking age-of-information (AoI) as a measure of information freshness, we investigate AoI-based UAV trajectory planning for fresh data collection. To model the complicated association and interaction pattern between UAV and IoT devices, the UAV trajectory planning problem is formulated as a Markov decision process (MDP) to capture the dynamics of UAV locations. As network topology and traffic generation pattern are unknown ahead, we propose an AoI-based trajectory planning (A-TP) algorithm using deep reinforcement learning (RL) technique. To accelerate the learning process during online decision making, the off-line pre-training of deep neural networks is performed. Extensive simulation results demonstrate that the proposed algorithm can significantly reduce the AoI of collected IoT data, as compared to other benchmark approaches.
Tasks	Decision Making
Published	2019-12-09
URL	https://ieeexplore.ieee.org/document/8928091/
PDF	https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8928091
PWC	https://paperswithcode.com/paper/deep-rl-based-trajectory-planning-for-aoi
Repo
Framework
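
Age-of-information as a freshness measure can be made concrete with a toy discrete-time model. In the sketch below, every device's age grows by one per time slot and resets to one when the UAV collects from that device. This reset rule, the fixed visiting schedule, and the function name `simulate_aoi` are simplifying assumptions for illustration; they are not the paper's MDP formulation or its A-TP algorithm.

```python
from typing import List, Optional


def simulate_aoi(num_devices: int, schedule: List[Optional[int]]) -> float:
    """Average per-device AoI over a schedule of visited device indices (None = no visit)."""
    if num_devices <= 0 or not schedule:
        raise ValueError("need at least one device and one time slot")
    ages = [1] * num_devices
    total = 0
    for visited in schedule:
        # Information at every device gets one slot older.
        ages = [a + 1 for a in ages]
        if visited is not None:
            # Collecting fresh data resets that device's age (assumed reset value: 1).
            ages[visited] = 1
        total += sum(ages)
    return total / (len(schedule) * num_devices)


print(simulate_aoi(2, [0, 1, 0, 1]))  # prints 1.5 (alternating visits)
print(simulate_aoi(2, [0, 0, 0, 0]))  # prints 2.25 (device 1 is never visited)
```

Even in this toy setting, the visiting order changes the average age, which is the quantity a trajectory planner of this kind tries to keep low.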