Paper Group NANR 56
Learning to Learn with Conditional Class Dependencies. UBC-NLP at SemEval-2019 Task 4: Hyperpartisan News Detection With Attention-Based Bi-LSTMs. Exploring Social Bias in Chatbots using Stereotype Knowledge. QE BERT: Bilingual BERT Using Multi-task Learning for Neural Quality Estimation. Sentiment Analysis for Multilingual Corpora. Recursive Routi …
Learning to Learn with Conditional Class Dependencies
Title | Learning to Learn with Conditional Class Dependencies |
Authors | Xiang Jiang, Mohammad Havaei, Farshid Varno, Gabriel Chartrand, Nicolas Chapados, Stan Matwin |
Abstract | Neural networks can learn to extract statistical properties from data, but they seldom make use of structured information from the label space to help representation learning. Although some label structure can implicitly be obtained when training on huge amounts of data, in a few-shot learning context where little data is available, making explicit use of the label structure can inform the model to reshape the representation space to reflect a global sense of class dependencies. We propose a meta-learning framework, Conditional class-Aware Meta-Learning (CAML), that conditionally transforms feature representations based on a metric space that is trained to capture inter-class dependencies. This enables a conditional modulation of the feature representations of the base-learner to impose regularities informed by the label space. Experiments show that the conditional transformation in CAML leads to more disentangled representations and achieves competitive results on the miniImageNet benchmark. |
Tasks | Few-Shot Learning, Meta-Learning, Representation Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJfOXnActQ |
PDF | https://openreview.net/pdf?id=BJfOXnActQ |
PWC | https://paperswithcode.com/paper/learning-to-learn-with-conditional-class |
Repo | |
Framework | |
UBC-NLP at SemEval-2019 Task 4: Hyperpartisan News Detection With Attention-Based Bi-LSTMs
Title | UBC-NLP at SemEval-2019 Task 4: Hyperpartisan News Detection With Attention-Based Bi-LSTMs |
Authors | Chiyu Zhang, Arun Rajendran, Muhammad Abdul-Mageed |
Abstract | We present our deep learning models submitted to the SemEval-2019 Task 4 competition focused on Hyperpartisan News Detection. We acquire best results with a Bi-LSTM network equipped with a self-attention mechanism. Among 33 participating teams, our submitted system ranks top 7 (65.3% accuracy) on the 'labels-by-publisher' sub-task and top 24 out of 44 teams (68.3% accuracy) on the 'labels-by-article' sub-task. We also report a model that scores higher than the 8th ranking system (78.5% accuracy) on the 'labels-by-article' sub-task. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2188/ |
PDF | https://www.aclweb.org/anthology/S19-2188 |
PWC | https://paperswithcode.com/paper/ubc-nlp-at-semeval-2019-task-4-hyperpartisan |
Repo | |
Framework | |
Exploring Social Bias in Chatbots using Stereotype Knowledge
Title | Exploring Social Bias in Chatbots using Stereotype Knowledge |
Authors | Nayeon Lee, Andrea Madotto, Pascale Fung |
Abstract | Exploring social bias in chatbots is an important, yet relatively unexplored problem. In this paper, we propose an approach to understand social bias in chatbots by leveraging stereotype knowledge. It allows interesting comparison of bias between chatbots and humans, and provides intuitive analysis of existing chatbots by borrowing the finer-grain concepts of sexism and racism. |
Tasks | Chatbot |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/papers/W/W19/W19-3655/ |
PDF | https://www.aclweb.org/anthology/W19-3655 |
PWC | https://paperswithcode.com/paper/exploring-social-bias-in-chatbots-using |
Repo | |
Framework | |
QE BERT: Bilingual BERT Using Multi-task Learning for Neural Quality Estimation
Title | QE BERT: Bilingual BERT Using Multi-task Learning for Neural Quality Estimation |
Authors | Hyun Kim, Joon-Ho Lim, Hyun-Ki Kim, Seung-Hoon Na |
Abstract | For translation quality estimation at word and sentence levels, this paper presents a novel approach based on BERT that recently has achieved impressive results on various natural language processing tasks. Our proposed model is re-purposed BERT for the translation quality estimation and uses multi-task learning for the sentence-level task and word-level subtasks (i.e., source word, target word, and target gap). Experimental results on Quality Estimation shared task of WMT19 show that our systems show competitive results and provide significant improvements over the baseline. |
Tasks | Multi-Task Learning |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5407/ |
PDF | https://www.aclweb.org/anthology/W19-5407 |
PWC | https://paperswithcode.com/paper/qe-bert-bilingual-bert-using-multi-task |
Repo | |
Framework | |
Sentiment Analysis for Multilingual Corpora
Title | Sentiment Analysis for Multilingual Corpora |
Authors | Svitlana Galeshchuk, Ju Qiu, Julien Jourdan |
Abstract | The paper presents a generic approach to the supervised sentiment analysis of social media content in Slavic languages. The method proposes translating the documents from the original language to English with Google's Neural Translation Model. The resulting texts are then converted to vectors by averaging the vectorial representation of words derived from a pre-trained Word2Vec English model. Testing the approach with several machine learning methods on Polish, Slovenian and Croatian Twitter datasets returns up to 86% of classification accuracy on the out-of-sample data. |
Tasks | Sentiment Analysis |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3717/ |
PDF | https://www.aclweb.org/anthology/W19-3717 |
PWC | https://paperswithcode.com/paper/sentiment-analysis-for-multilingual-corpora |
Repo | |
Framework | |
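The document-vector step described in the abstract above (averaging word vectors) can be sketched as follows. This is a minimal illustration and not the authors' code: the tiny `EMBEDDINGS` table is a hypothetical stand-in for a pre-trained Word2Vec English model, and the whitespace tokenizer and the decision to skip out-of-vocabulary words are assumptions.

```python
from typing import Dict, List

# Hypothetical toy embedding table standing in for a pre-trained Word2Vec model.
EMBEDDINGS: Dict[str, List[float]] = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "movie": [0.0, 1.0],
}


def document_vector(text: str, embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of all in-vocabulary tokens; zero vector if none match."""
    tokens = text.lower().split()
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
```

The resulting fixed-length vector can then be passed to any standard classifier, which is presumably what "several machine learning methods" refers to in the abstract.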
Recursive Routing Networks: Learning to Compose Modules for Language Understanding
Title | Recursive Routing Networks: Learning to Compose Modules for Language Understanding |
Authors | Ignacio Cases, Clemens Rosenbaum, Matthew Riemer, Atticus Geiger, Tim Klinger, Alex Tamkin, Olivia Li, Sandhini Agarwal, Joshua D. Greene, Dan Jurafsky, Christopher Potts, Lauri Karttunen |
Abstract | We introduce Recursive Routing Networks (RRNs), which are modular, adaptable models that learn effectively in diverse environments. RRNs consist of a set of functions, typically organized into a grid, and a meta-learner decision-making component called the router. The model jointly optimizes the parameters of the functions and the meta-learner's policy for routing inputs through those functions. RRNs can be incorporated into existing architectures in a number of ways; we explore adding them to word representation layers, recurrent network hidden layers, and classifier layers. Our evaluation task is natural language inference (NLI). Using the MultiNLI corpus, we show that an RRN's routing decisions reflect the high-level genre structure of that corpus. To show that RRNs can learn to specialize to more fine-grained semantic distinctions, we introduce a new corpus of NLI examples involving implicative predicates, and show that the model components become fine-tuned to the inferential signatures that are characteristic of these predicates. |
Tasks | Decision Making, Natural Language Inference |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1365/ |
PDF | https://www.aclweb.org/anthology/N19-1365 |
PWC | https://paperswithcode.com/paper/recursive-routing-networks-learning-to |
Repo | |
Framework | |
FEED: Feature-level Ensemble Effect for knowledge Distillation
Title | FEED: Feature-level Ensemble Effect for knowledge Distillation |
Authors | SeongUk Park, Nojun Kwak |
Abstract | This paper proposes a versatile and powerful training algorithm named Feature-level Ensemble Effect for knowledge Distillation (FEED), which is inspired by the work of factor transfer. The factor transfer is one of the knowledge transfer methods that improves the performance of a student network with a strong teacher network. It transfers the knowledge of a teacher in the feature map level using high-capacity teacher network, and our training algorithm FEED is an extension of it. FEED aims to transfer ensemble knowledge, using either multiple teachers in parallel or multiple training sequences. Adapting the peer-teaching framework, we introduce a couple of training algorithms that transfer ensemble knowledge to the student at the feature map level, both of which help the student network find more generalized solutions in the parameter space. Experimental results on CIFAR-100 and ImageNet show that our method, FEED, has clear performance enhancements, without introducing any additional parameters or computations at test time. |
Tasks | Transfer Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BJxYEsAqY7 |
PDF | https://openreview.net/pdf?id=BJxYEsAqY7 |
PWC | https://paperswithcode.com/paper/feed-feature-level-ensemble-effect-for |
Repo | |
Framework | |
A Partially Rule-Based Approach to AMR Generation
Title | A Partially Rule-Based Approach to AMR Generation |
Authors | Emma Manning |
Abstract | This paper presents a new approach to generating English text from Abstract Meaning Representation (AMR). In contrast to the neural and statistical MT approaches used in other AMR generation systems, this one is largely rule-based, supplemented only by a language model and simple statistical linearization models, allowing for more control over the output. We also address the difficulties of automatically evaluating AMR generation systems and the problems with BLEU for this task. We compare automatic metrics to human evaluations and show that while METEOR and TER arguably reflect human judgments better than BLEU, further research into suitable evaluation metrics is needed. |
Tasks | Language Modelling |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-3009/ |
PDF | https://www.aclweb.org/anthology/N19-3009 |
PWC | https://paperswithcode.com/paper/a-partially-rule-based-approach-to-amr |
Repo | |
Framework | |
Leveraging Medical Literature for Section Prediction in Electronic Health Records
Title | Leveraging Medical Literature for Section Prediction in Electronic Health Records |
Authors | Sara Rosenthal, Ken Barker, Zhicheng Liang |
Abstract | Electronic Health Records (EHRs) contain both structured content and unstructured (text) content about a patient's medical history. In the unstructured text parts, there are common sections such as Assessment and Plan, Social History, and Medications. These sections help physicians find information easily and can be used by an information retrieval system to return specific information sought by a user. However, it is common that the exact format of sections in a particular EHR does not adhere to known patterns. Therefore, being able to predict sections and headers in EHRs automatically is beneficial to physicians. Prior approaches in EHR section prediction have only used text data from EHRs and have required significant manual annotation. We propose using sections from medical literature (e.g., textbooks, journals, web content) that contain content similar to that found in EHR sections. Our approach uses data from a different kind of source where labels are provided without the need of a time-consuming annotation effort. We use this data to train two models: an RNN and a BERT-based model. We apply the learned models along with source data via transfer learning to predict sections in EHRs. Our results show that medical literature can provide helpful supervision signal for this classification task. |
Tasks | Information Retrieval, Transfer Learning |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1492/ |
PDF | https://www.aclweb.org/anthology/D19-1492 |
PWC | https://paperswithcode.com/paper/leveraging-medical-literature-for-section |
Repo | |
Framework | |
Cross-lingual Subjectivity Detection for Resource Lean Languages
Title | Cross-lingual Subjectivity Detection for Resource Lean Languages |
Authors | Ida Amini, Samane Karimi, Azadeh Shakery |
Abstract | Wide and universal changes in the web content due to the growth of web 2 applications increase the importance of user-generated content on the web. Therefore, the related research areas such as sentiment analysis, opinion mining and subjectivity detection receive much attention from the research community. Due to the diverse languages that web-users use to express their opinions and sentiments, research areas like subjectivity detection should present methods which are practicable on all languages. An important prerequisite to effectively achieve this aim is considering the limitations in resource-lean languages. In this paper, cross-lingual subjectivity detection on resource lean languages is investigated using two different approaches: a language-model based and a learning-to-rank approach. Experimental results show the impact of different factors on the performance of subjectivity detection methods using English resources to detect the subjectivity score of Persian documents. The experiments demonstrate that the proposed learning-to-rank method outperforms the baseline method in ranking documents based on their subjectivity degree. |
Tasks | Language Modelling, Learning-To-Rank, Opinion Mining, Sentiment Analysis |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/W19-1310/ |
PDF | https://www.aclweb.org/anthology/W19-1310 |
PWC | https://paperswithcode.com/paper/cross-lingual-subjectivity-detection-for |
Repo | |
Framework | |
ManifoldNet: A Deep Neural Network for Manifold-Valued Data
Title | ManifoldNet: A Deep Neural Network for Manifold-Valued Data |
Authors | Rudrasis Chakraborty, Jose Bouza, Jonathan Manton, Baba C. Vemuri |
Abstract | Developing deep neural networks (DNNs) for manifold-valued data sets has gained much interest of late in the deep learning research community. Examples of manifold-valued data include data from omnidirectional cameras on automobiles, drones etc., diffusion magnetic resonance imaging, elastography and others. In this paper, we present a novel theoretical framework for DNNs to cope with manifold-valued data inputs. In doing this generalization, we draw parallels to the widely popular convolutional neural networks (CNNs). We call our network the ManifoldNet. As in vector spaces where convolutions are equivalent to computing the weighted mean of functions, an analogous definition for manifold-valued data can be constructed involving the computation of the weighted Fréchet Mean (wFM). To this end, we present a provably convergent recursive computation of the wFM of the given data, where the weights make up the convolution mask, to be learned. Further, we prove that the proposed wFM layer achieves a contraction mapping and hence the ManifoldNet does not need the additional non-linear ReLU unit used in standard CNNs. Operations such as pooling in traditional CNN are no longer necessary in this setting since wFM is already a pooling type operation. Analogous to the equivariance of convolution in Euclidean space to translations, we prove that the wFM is equivariant to the action of the group of isometries admitted by the Riemannian manifold on which the data reside. This equivariance property facilitates weight sharing within the network. We present experiments, using the ManifoldNet framework, to achieve video classification and image reconstruction using an auto-encoder+decoder setting. Experimental results demonstrate the efficacy of ManifoldNet in the context of classification and reconstruction accuracy. |
Tasks | Image Reconstruction, Video Classification |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SyzjBiR9t7 |
PDF | https://openreview.net/pdf?id=SyzjBiR9t7 |
PWC | https://paperswithcode.com/paper/manifoldnet-a-deep-neural-network-for |
Repo | |
Framework | |
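The recursive weighted-mean idea mentioned in the ManifoldNet abstract can be illustrated with a small sketch. It is not the paper's implementation: it assumes a common incremental scheme in which step n moves the running estimate toward the n-th point along a geodesic by the fraction w_n / (w_1 + ... + w_n), and it assumes strictly positive weights. The function names and the two example geodesics (real line and unit circle) are hypothetical choices for illustration.

```python
import math
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")


def recursive_weighted_mean(
    points: Sequence[P],
    weights: Sequence[float],
    geodesic: Callable[[P, P, float], P],
) -> P:
    """Fold points into a running mean by stepping along geodesics.

    Each step moves from the current estimate toward the next point by the
    fraction (its weight) / (sum of weights seen so far).
    """
    if not points or len(points) != len(weights):
        raise ValueError("need equally many points and weights, at least one")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    mean = points[0]
    total = weights[0]
    for x, w in zip(points[1:], weights[1:]):
        total += w
        mean = geodesic(mean, x, w / total)
    return mean


def line_geodesic(a: float, b: float, t: float) -> float:
    """Geodesic on the real line: linear interpolation."""
    return a + t * (b - a)


def circle_geodesic(a: float, b: float, t: float) -> float:
    """Geodesic on the unit circle (angles in radians): shortest arc from a to b."""
    delta = math.atan2(math.sin(b - a), math.cos(b - a))  # wrapped to [-pi, pi]
    return a + t * delta
```

With `line_geodesic` the recursion reproduces the ordinary weighted average, which is a quick sanity check; swapping in a geodesic for another manifold changes only the interpolation step.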
Joint Prediction for Kinematic Trajectories in Vehicle-Pedestrian-Mixed Scenes
Title | Joint Prediction for Kinematic Trajectories in Vehicle-Pedestrian-Mixed Scenes |
Authors | Huikun Bi, Zhong Fang, Tianlu Mao, Zhaoqi Wang, Zhigang Deng |
Abstract | Trajectory prediction for objects is challenging and critical for various applications (e.g., autonomous driving, and anomaly detection). Most of the existing methods focus on homogeneous pedestrian trajectories prediction, where pedestrians are treated as particles without size. However, they fall short of handling crowded vehicle-pedestrian-mixed scenes directly since vehicles, limited with kinematics in reality, should be treated as rigid, non-particle objects ideally. In this paper, we tackle this problem using separate LSTMs for heterogeneous vehicles and pedestrians. Specifically, we use an oriented bounding box to represent each vehicle, calculated based on its position and orientation, to denote its kinematic trajectories. We then propose a framework called VP-LSTM to predict the kinematic trajectories of both vehicles and pedestrians simultaneously. In order to evaluate our model, a large dataset containing the trajectories of both vehicles and pedestrians in vehicle-pedestrian-mixed scenes is specially built. Through comparisons between our method with state-of-the-art approaches, we show the effectiveness and advantages of our method on kinematic trajectories prediction in vehicle-pedestrian-mixed scenes. |
Tasks | Anomaly Detection, Autonomous Driving, Trajectory Prediction |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Bi_Joint_Prediction_for_Kinematic_Trajectories_in_Vehicle-Pedestrian-Mixed_Scenes_ICCV_2019_paper.html |
PDF | http://openaccess.thecvf.com/content_ICCV_2019/papers/Bi_Joint_Prediction_for_Kinematic_Trajectories_in_Vehicle-Pedestrian-Mixed_Scenes_ICCV_2019_paper.pdf |
PWC | https://paperswithcode.com/paper/joint-prediction-for-kinematic-trajectories |
Repo | |
Framework | |
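The oriented bounding box mentioned in the abstract above, computed from a vehicle's position and orientation, can be sketched as below. This is only an illustration under stated assumptions: the abstract does not give the box parameterization, so the `length` and `width` arguments, the corner ordering, and the heading convention (radians, counter-clockwise from the +x axis) are all hypothetical.

```python
import math
from typing import List, Tuple


def oriented_bounding_box(
    cx: float, cy: float, heading: float, length: float, width: float
) -> List[Tuple[float, float]]:
    """Return the four corners (front-left, front-right, rear-right, rear-left).

    heading is in radians, measured counter-clockwise from the +x axis.
    """
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    half_l, half_w = length / 2.0, width / 2.0
    # Corner offsets in the vehicle frame: x forward, y to the left.
    local = [(half_l, half_w), (half_l, -half_w), (-half_l, -half_w), (-half_l, half_w)]
    # Rotate each offset by the heading, then translate to the vehicle centre.
    return [(cx + dx * cos_h - dy * sin_h, cy + dx * sin_h + dy * cos_h) for dx, dy in local]
```

Representing a vehicle by its corners rather than a single point is what lets a model treat it as a rigid object with extent, in contrast to the particle view of pedestrians described in the abstract.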
Probabilistic Model-Based Dynamic Architecture Search
Title | Probabilistic Model-Based Dynamic Architecture Search |
Authors | Nozomu Yoshinari, Kento Uchida, Shota Saito, Shinichi Shirakawa, Youhei Akimoto |
Abstract | The architecture search methods for convolutional neural networks (CNNs) have shown promising results. These methods require significant computational resources, as they repeat the neural network training many times to evaluate and search the architectures. Developing the computationally efficient architecture search method is an important research topic. In this paper, we assume that the structure parameters of CNNs are categorical variables, such as types and connectivities of layers, and they are regarded as the learnable parameters. Introducing the multivariate categorical distribution as the underlying distribution for the structure parameters, we formulate a differentiable loss for the training task, where the training of the weights and the optimization of the parameters of the distribution for the structure parameters are coupled. They are trained using the stochastic gradient descent, leading to the optimization of the structure parameters within a single training. We apply the proposed method to search the architecture for two computer vision tasks: image classification and inpainting. The experimental results show that the proposed architecture search method is fast and can achieve comparable performance to the existing methods. |
Tasks | Image Classification, Neural Architecture Search |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Ske1-209Y7 |
PDF | https://openreview.net/pdf?id=Ske1-209Y7 |
PWC | https://paperswithcode.com/paper/probabilistic-model-based-dynamic |
Repo | |
Framework | |
Understanding and Improving Hidden Representations for Neural Machine Translation
Title | Understanding and Improving Hidden Representations for Neural Machine Translation |
Authors | Guanlin Li, Lemao Liu, Xintong Li, Conghui Zhu, Tiejun Zhao, Shuming Shi |
Abstract | Multilayer architectures are currently the gold standard for large-scale neural machine translation. Existing works have explored some methods for understanding the hidden representations; however, they have not sought to improve the translation quality rationally according to their understanding. Towards understanding for performance improvement, we first artificially construct a sequence of nested relative tasks and measure the feature generalization ability of the learned hidden representation over these tasks. Based on our understanding, we then propose to regularize the layer-wise representations with all tree-induced tasks. To overcome the computational bottleneck resulting from the large number of regularization terms, we design efficient approximation methods by selecting a few coarse-to-fine tasks for regularization. Extensive experiments on two widely-used datasets demonstrate the proposed methods only lead to small extra overheads in training but no additional overheads in testing, and achieve consistent improvements (up to +1.3 BLEU) compared to the state-of-the-art translation model. |
Tasks | Machine Translation |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1046/ |
PDF | https://www.aclweb.org/anthology/N19-1046 |
PWC | https://paperswithcode.com/paper/understanding-and-improving-hidden |
Repo | |
Framework | |
Deep RL-based Trajectory Planning for AoI Minimization in UAV-assisted IoT
Title | Deep RL-based Trajectory Planning for AoI Minimization in UAV-assisted IoT |
Authors | Conghao Zhou, Hongli He, Peng Yang, Feng Lyu, Wen Wu, Nan Cheng, Xuemin (Sherman) Shen |
Abstract | Due to the flexibility and low deployment cost, unmanned aerial vehicles (UAVs) have been widely used to assist cellular networks in providing extended coverage for Internet of Things (IoT) networks. Existing throughput or delay-based UAV trajectory planning methods cannot meet the requirement of collecting fresh information from IoT devices. In this paper, by taking age-of-information (AoI) as a measure of information freshness, we investigate AoI-based UAV trajectory planning for fresh data collection. To model the complicated association and interaction pattern between UAV and IoT devices, the UAV trajectory planning problem is formulated as a Markov decision process (MDP) to capture the dynamics of UAV locations. As network topology and traffic generation pattern are unknown ahead, we propose an AoI-based trajectory planning (A-TP) algorithm using deep reinforcement learning (RL) technique. To accelerate the learning process during online decision making, the off-line pre-training of deep neural networks is performed. Extensive simulation results demonstrate that the proposed algorithm can significantly reduce the AoI of collected IoT data, as compared to other benchmark approaches. |
Tasks | Decision Making |
Published | 2019-12-09 |
URL | https://ieeexplore.ieee.org/document/8928091/ |
PDF | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8928091 |
PWC | https://paperswithcode.com/paper/deep-rl-based-trajectory-planning-for-aoi |
Repo | |
Framework | |
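As a small illustration of the age-of-information measure used in the abstract above: AoI at a given time is commonly defined as the elapsed time since the generation of the freshest update received so far. The sketch below computes it for a single device. It is not the A-TP algorithm, and the `(generated_at, received_at)` data model and function name are assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple


def age_of_information(now: float, deliveries: Iterable[Tuple[float, float]]) -> Optional[float]:
    """AoI at time `now` given (generated_at, received_at) pairs for one device.

    Returns None if nothing has been received by `now`.
    """
    freshest = None
    for generated_at, received_at in deliveries:
        # Only updates that have already arrived count toward freshness.
        if received_at <= now and (freshest is None or generated_at > freshest):
            freshest = generated_at
    return None if freshest is None else now - freshest
```

A trajectory planner of the kind described would aim to keep this quantity low across devices, for example by using its negative as part of a reward signal; that reward design is an assumption here, not something stated in the abstract.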