January 25, 2020

Paper Group NAWR 36

A Composable Specification Language for Reinforcement Learning Tasks

Title A Composable Specification Language for Reinforcement Learning Tasks
Authors Kishor Jothimurugan, Rajeev Alur, Osbert Bastani
Abstract Reinforcement learning is a promising approach for learning control policies for robot tasks. However, specifying complex tasks (e.g., with multiple objectives and safety constraints) can be challenging, since the user must design a reward function that encodes the entire task. Furthermore, the user often needs to manually shape the reward to ensure convergence of the learning algorithm. We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward shaping. We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9462-a-composable-specification-language-for-reinforcement-learning-tasks
PDF http://papers.nips.cc/paper/9462-a-composable-specification-language-for-reinforcement-learning-tasks.pdf
PWC https://paperswithcode.com/paper/a-composable-specification-language-for
Repo https://github.com/keyshor/spectrl_tool
Framework pytorch

Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets

Title Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets
Authors Nicole Peinelt, Maria Liakata, Dong Nguyen
Abstract Existing datasets for scoring text pairs in terms of semantic similarity contain instances whose resolution differs according to the degree of difficulty. This paper proposes to distinguish obvious from non-obvious text pairs based on superficial lexical overlap and ground-truth labels. We characterise existing datasets in terms of containing difficult cases and find that recently proposed models struggle to capture the non-obvious cases of semantic similarity. We describe metrics that emphasise cases of similarity which require more complex inference and propose that these are used for evaluating systems for semantic similarity.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1268/
PDF https://www.aclweb.org/anthology/P19-1268
PWC https://paperswithcode.com/paper/aiming-beyond-the-obvious-identifying-non
Repo https://github.com/wuningxi/LexSim
Framework none
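
The split into obvious and non-obvious pairs described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: Jaccard overlap on whitespace tokens and the `threshold` value are assumptions chosen for the example, and the function names are hypothetical.

```python
def jaccard_overlap(text_a: str, text_b: str) -> float:
    """Jaccard overlap between the lowercase token sets of two texts."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def difficulty(text_a: str, text_b: str, label: int, threshold: float = 0.5) -> str:
    """Call a pair 'obvious' when surface overlap agrees with the gold label.

    label is 1 for semantically similar pairs and 0 otherwise.
    threshold is a hypothetical cut-off, not a value taken from the paper.
    """
    high_overlap = jaccard_overlap(text_a, text_b) >= threshold
    return "obvious" if high_overlap == bool(label) else "non-obvious"
```

Under these assumptions, a paraphrase pair with few shared words and a positive label counts as non-obvious, which is the kind of case the abstract says current models struggle with.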

Beyond Word Attention: Using Segment Attention in Neural Relation Extraction

Title Beyond Word Attention: Using Segment Attention in Neural Relation Extraction
Authors Bowen Yu, Zhenyu Zhang, Tingwen Liu, Bin Wang, Sujian Li, Quangang Li
Abstract Relation extraction studies the issue of predicting semantic relations between pairs of entities in sentences. Attention mechanisms are often used in this task to alleviate the inner-sentence noise by performing soft selections of words independently. Based on the observation that information pertinent to relations is usually contained within segments (continuous words in a sentence), it is possible to make use of this phenomenon for better extraction. In this paper, we aim to incorporate such segment information into a neural relation extractor. Our approach views the attention mechanism as linear-chain conditional random fields over a set of latent variables whose edges encode the desired structure, and regards attention weight as the marginal distribution of each word being selected as a part of the relational expression. Experimental results show that our method can attend to continuous relational expressions without explicit annotations, and achieves state-of-the-art performance on the large-scale TACRED dataset.
Tasks Relation Extraction
Published 2019-08-10
URL https://www.ijcai.org/proceedings/2019/750
PDF https://www.ijcai.org/proceedings/2019/0750.pdf
PWC https://paperswithcode.com/paper/beyond-word-attention-using-segment-attention
Repo https://github.com/yubowen-ph/segment
Framework pytorch

LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning

Title LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning
Authors Yali Du, Lei Han, Meng Fang, Ji Liu, Tianhong Dai, Dacheng Tao
Abstract A great challenge in cooperative decentralized multi-agent reinforcement learning (MARL) is generating diversified behaviors for each individual agent when receiving only a team reward. Prior studies have paid much effort on reward shaping or designing a centralized critic that can discriminatively credit the agents. In this paper, we propose to merge the two directions and learn each agent an intrinsic reward function which diversely stimulates the agents at each time step. Specifically, the intrinsic reward for a specific agent will be involved in computing a distinct proxy critic for the agent to direct the updating of its individual policy. Meanwhile, the parameterized intrinsic reward function will be updated towards maximizing the expected accumulated team reward from the environment so that the objective is consistent with the original MARL problem. The proposed method is referred to as learning individual intrinsic reward (LIIR) in MARL. We compare LIIR with a number of state-of-the-art MARL methods on battle games in StarCraft II. The results demonstrate the effectiveness of LIIR, and we show LIIR can assign each individual agent an insightful intrinsic reward per time step.
Tasks Multi-agent Reinforcement Learning, Starcraft, Starcraft II
Published 2019-12-01
URL http://papers.nips.cc/paper/8691-liir-learning-individual-intrinsic-reward-in-multi-agent-reinforcement-learning
PDF http://papers.nips.cc/paper/8691-liir-learning-individual-intrinsic-reward-in-multi-agent-reinforcement-learning.pdf
PWC https://paperswithcode.com/paper/liir-learning-individual-intrinsic-reward-in
Repo https://github.com/yalidu/liir
Framework pytorch

Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees

Title Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees
Authors Muhammad Osama, Dave Zachariah, Peter Stoica
Abstract A spatial point process can be characterized by an intensity function which predicts the number of events that occur across space. In this paper, we develop a method to infer predictive intensity intervals by learning a spatial model using a regularized criterion. We prove that the proposed method exhibits out-of-sample prediction performance guarantees which, unlike standard estimators, are valid even when the spatial model is misspecified. The method is demonstrated using synthetic as well as real spatial data.
Tasks Point Processes
Published 2019-12-01
URL http://papers.nips.cc/paper/9363-prediction-of-spatial-point-processes-regularized-method-with-out-of-sample-guarantees
PDF http://papers.nips.cc/paper/9363-prediction-of-spatial-point-processes-regularized-method-with-out-of-sample-guarantees.pdf
PWC https://paperswithcode.com/paper/prediction-of-spatial-point-processes
Repo https://github.com/Muhammad-Osama/uncertainty_spatial_point_process
Framework none

Convolutional Neural Networks Can Be Deceived by Visual Illusions

Title Convolutional Neural Networks Can Be Deceived by Visual Illusions
Authors Alexander Gomez-Villa, Adrian Martin, Javier Vazquez-Corral, Marcelo Bertalmio
Abstract Visual illusions teach us that what we see is not always what is represented in the physical world. Their special nature makes them a fascinating tool to test and validate any new vision model proposed. In general, current vision models are based on the concatenation of linear and non-linear operations. The similarity of this structure with the operations present in Convolutional Neural Networks (CNNs) has motivated us to study if CNNs trained for low-level visual tasks are deceived by visual illusions. In particular, we show that CNNs trained for image denoising, image deblurring, and computational color constancy are able to replicate the human response to visual illusions, and that the extent of this replication varies with respect to variation in architecture and spatial pattern size. These results suggest that in order to obtain CNNs that better replicate human behaviour, we may need to start aiming for them to better replicate visual illusions.
Tasks Color Constancy, Deblurring, Denoising, Image Denoising
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Gomez-Villa_Convolutional_Neural_Networks_Can_Be_Deceived_by_Visual_Illusions_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Gomez-Villa_Convolutional_Neural_Networks_Can_Be_Deceived_by_Visual_Illusions_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/convolutional-neural-networks-can-be-deceived
Repo https://github.com/alviur/convnets_vs_vi
Framework none

Fixing Implicit Derivatives: Trust-Region Based Learning of Continuous Energy Functions

Title Fixing Implicit Derivatives: Trust-Region Based Learning of Continuous Energy Functions
Authors Chris Russell, Matteo Toso, Neill Campbell
Abstract We present a new technique for the learning of continuous energy functions that we refer to as Wibergian Learning. One common approach to inverse problems is to cast them as an energy minimisation problem, where the minimum cost solution found is used as an estimator of hidden parameters. Our new approach formally characterises the dependency between weights that control the shape of the energy function, and the location of minima, by describing minima as fixed points of optimisation methods. This allows for the use of gradient-based end-to-end training to integrate deep-learning and the classical inverse problem methods. We show how our approach can be applied to obtain state-of-the-art results in the diverse applications of tracker fusion and multiview 3D reconstruction.
Tasks 3D Reconstruction
Published 2019-12-01
URL http://papers.nips.cc/paper/8427-fixing-implicit-derivatives-trust-region-based-learning-of-continuous-energy-functions
PDF http://papers.nips.cc/paper/8427-fixing-implicit-derivatives-trust-region-based-learning-of-continuous-energy-functions.pdf
PWC https://paperswithcode.com/paper/fixing-implicit-derivatives-trust-region
Repo https://github.com/MatteoT90/WibergianLearning
Framework tf

Attribute Attention for Semantic Disambiguation in Zero-Shot Learning

Title Attribute Attention for Semantic Disambiguation in Zero-Shot Learning
Authors Yang Liu, Jishun Guo, Deng Cai, Xiaofei He
Abstract Zero-shot learning (ZSL) aims to accurately recognize unseen objects by learning mapping matrices that bridge the gap between visual information and semantic attributes. Previous works implicitly treat attributes equally in compatibility score while ignoring that they have different importance for discrimination, which leads to severe semantic ambiguity. Considering both low-level visual information and global class-level features that relate to this ambiguity, we propose a practical Latent Feature Guided Attribute Attention (LFGAA) framework to perform object-based attribute attention for semantic disambiguation. By distracting semantic activation in dimensions that cause ambiguity, our method outperforms existing state-of-the-art methods on AwA2, CUB and SUN datasets in both inductive and transductive settings.
Tasks Zero-Shot Learning
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Liu_Attribute_Attention_for_Semantic_Disambiguation_in_Zero-Shot_Learning_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Liu_Attribute_Attention_for_Semantic_Disambiguation_in_Zero-Shot_Learning_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/attribute-attention-for-semantic
Repo https://github.com/ZJULearning/AttentionZSL
Framework pytorch

Graph-Based Semi-Supervised Learning with Non-ignorable Non-response

Title Graph-Based Semi-Supervised Learning with Non-ignorable Non-response
Authors Fan Zhou, Tengfei Li, Haibo Zhou, Hongtu Zhu, Ye Jieping
Abstract Graph-based semi-supervised learning is a very powerful tool in classification tasks, while in most existing literature the labelled nodes are assumed to be randomly sampled. When the labelling status depends on the unobserved node response, ignoring the missingness can lead to significant estimation bias and handicap the classifiers. This situation is called non-ignorable non-response. To solve the problem, we propose a Graph-based joint model with Non-ignorable Non-response (GNN), followed by a joint inverse weighting estimation procedure incorporated with sampling imputation approach. Our method is proved to outperform some state-of-the-art models in both regression and classification problems, by simulations and real analysis on the Cora dataset.
Tasks Imputation
Published 2019-12-01
URL http://papers.nips.cc/paper/8924-graph-based-semi-supervised-learning-with-non-ignorable-non-response
PDF http://papers.nips.cc/paper/8924-graph-based-semi-supervised-learning-with-non-ignorable-non-response.pdf
PWC https://paperswithcode.com/paper/graph-based-semi-supervised-learning-with-non
Repo https://github.com/mlzxzhou/keras-gnm
Framework none

What Does BERT Learn about the Structure of Language?

Title What Does BERT Learn about the Structure of Language?
Authors Ganesh Jawahar, Benoît Sagot, Djamé Seddah
Abstract BERT is a recent language representation model that has surprisingly performed well in diverse language understanding benchmarks. This result indicates the possibility that BERT networks capture structural information about language. In this work, we provide novel support for this claim by performing a series of experiments to unpack the elements of English language structure learned by BERT. Our findings are fourfold. BERT's phrasal representation captures the phrase-level information in the lower layers. The intermediate layers of BERT compose a rich hierarchy of linguistic information, starting with surface features at the bottom, syntactic features in the middle followed by semantic features at the top. BERT requires deeper layers while tracking subject-verb agreement to handle long-term dependency problem. Finally, the compositional scheme underlying BERT mimics classical, tree-like structures.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1356/
PDF https://www.aclweb.org/anthology/P19-1356
PWC https://paperswithcode.com/paper/what-does-bert-learn-about-the-structure-of
Repo https://github.com/ganeshjawahar/interpret_bert
Framework pytorch

Deep Video Frame Interpolation using Cyclic Frame Generation

Title Deep Video Frame Interpolation using Cyclic Frame Generation
Authors Yu-Lun Liu, Yi-Tung Liao, Yen-Yu Lin, Yung-Yu Chuang
Abstract Video frame interpolation algorithms predict intermediate frames to produce videos with higher frame rates and smooth view transitions given two consecutive frames as inputs. We propose that: synthesized frames are more reliable if they can be used to reconstruct the input frames with high quality. Based on this idea, we introduce a new loss term, the cycle consistency loss. The cycle consistency loss can better utilize the training data to not only enhance the interpolation results, but also maintain the performance better with less training data. It can be integrated into any frame interpolation network and trained in an end-to-end manner. In addition to the cycle consistency loss, we propose two extensions: motion linearity loss and edge-guided training. The motion linearity loss approximates the motion between two input frames to be linear and regularizes the training. By applying edge-guided training, we further improve results by integrating edge information into training. Both qualitative and quantitative experiments demonstrate that our model outperforms the state-of-the-art methods.
Tasks Video Frame Interpolation
Published 2019-01-27
URL https://www.cmlab.csie.ntu.edu.tw/~yulunliu/CyclicGen
PDF https://www.cmlab.csie.ntu.edu.tw/~yulunliu/CyclicGen_/liu.pdf
PWC https://paperswithcode.com/paper/deep-video-frame-interpolation-using-cyclic
Repo https://github.com/alex04072000/CyclicGen
Framework tf
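
The cycle consistency idea in the abstract above, that synthesized frames should be able to reconstruct an input frame, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: frames are flat lists of intensities, `interpolate` is a hypothetical stand-in (a per-pixel mean) for the interpolation network, and the loss is a plain mean absolute difference.

```python
from typing import List

Frame = List[float]  # toy "frame": a flat list of pixel intensities


def interpolate(frame_a: Frame, frame_b: Frame) -> Frame:
    """Hypothetical stand-in for an interpolation network: the per-pixel mean."""
    return [(a + b) / 2.0 for a, b in zip(frame_a, frame_b)]


def l1_distance(frame_a: Frame, frame_b: Frame) -> float:
    """Mean absolute difference between two frames of equal length."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def cycle_consistency_loss(f0: Frame, f1: Frame, f2: Frame) -> float:
    """Interpolate between (f0, f1) and (f1, f2), then rebuild f1 from the results."""
    mid_01 = interpolate(f0, f1)
    mid_12 = interpolate(f1, f2)
    reconstructed_f1 = interpolate(mid_01, mid_12)
    return l1_distance(reconstructed_f1, f1)
```

With this stand-in interpolator, three frames that change linearly over time give a loss of zero, while a middle frame that breaks the linear trend gives a positive loss.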

Sentiment Classification Using Document Embeddings Trained with Cosine Similarity

Title Sentiment Classification Using Document Embeddings Trained with Cosine Similarity
Authors Tan Thongtan, Tanasanee Phienthrakul
Abstract In document-level sentiment classification, each document must be mapped to a fixed length vector. Document embedding models map each document to a dense, low-dimensional vector in continuous vector space. This paper proposes training document embeddings using cosine similarity instead of dot product. Experiments on the IMDB dataset show that accuracy is improved when using cosine similarity compared to using dot product, while using feature combination with Naive Bayes weighted bag of n-grams achieves a new state-of-the-art accuracy of 97.42%. Code to reproduce all experiments is available at https://github.com/tanthongtan/dv-cosine
Tasks Document Embedding, Sentiment Analysis
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-2057/
PDF https://www.aclweb.org/anthology/P19-2057
PWC https://paperswithcode.com/paper/sentiment-classification-using-document
Repo https://github.com/tanthongtan/dv-cosine
Framework none
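
The difference between the two similarity measures named in the abstract above can be seen in a few lines. This sketch only contrasts the measures themselves and is not the training procedure from the paper; the function names are chosen for the example.

```python
import math
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product: grows with the magnitude of either vector."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of the length-normalised vectors, bounded in [-1, 1]."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors, chosen for this sketch
    return dot(u, v) / (norm_u * norm_v)
```

Scaling one embedding by ten multiplies the dot product by ten but leaves the cosine similarity unchanged, so the cosine score depends only on direction.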

Linking convolutional neural networks with graph convolutional networks: application in pulmonary artery-vein separation

Title Linking convolutional neural networks with graph convolutional networks: application in pulmonary artery-vein separation
Authors Zhiwei Zhai, Marius Staring, Xuhui Zhou, Qiuxia Xie, Xiaojuan Xiao, M. Els Bakker, Lucia J. Kroft, Boudewijn P.F. Lelieveldt, Gudula J.A.M. Boon, Frederikus A. Klok, Berend C. Stoel
Abstract Graph Convolutional Networks (GCNs) are a novel and powerful method for dealing with non-Euclidean data, while Convolutional Neural Networks (CNNs) can learn features from Euclidean data such as images. In this work, we propose a novel method to combine CNNs with GCNs (CNN-GCN), that can consider both Euclidean and non-Euclidean features and can be trained end-to-end. We applied this method to separate the pulmonary vascular trees into arteries and veins (A/V). Chest CT scans were pre-processed by vessel segmentation and skeletonization, from which a graph was constructed: voxels on the skeletons resulting in a vertex set and their connections in an adjacency matrix. 3D patches centered around each vertex were extracted from the CT scans, oriented perpendicularly to the vessel. The proposed CNN-GCN classifier was trained and applied on the constructed vessel graphs, where each node is then labeled as artery or vein. The proposed method was trained and validated on data from one hospital (11 patients, 22 lungs), and tested on independent data from a different hospital (10 patients, 10 lungs). A baseline CNN method and human observer performance were used for comparison. The CNN-GCN method obtained a median accuracy of 0.773 (0.738) in the validation (test) set, compared to a median accuracy of 0.817 by the observers, and 0.727 (0.693) by the CNN. In conclusion, the proposed CNN-GCN method combines local image information with graph connectivity information, improving pulmonary A/V separation over a baseline CNN method, approaching the performance of human observers.
Tasks 3D Medical Imaging Segmentation, Medical Image Segmentation, Pulmonary Artery–Vein Classification, Pulmonary Vessel Segmentation
Published 2019-09-01
URL https://www.researchgate.net/publication/335620542_Linking_convolutional_neural_networks_with_graph_convolutional_networks_application_in_pulmonary_artery-vein_separation
PDF https://bit.ly/2kNpbdv
PWC https://paperswithcode.com/paper/linking-convolutional-neural-networks-with
Repo https://github.com/chushan89/Linking-CNN-GCN
Framework tf
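
The graph construction step mentioned in the abstract above (skeleton voxels as vertices, their connections as an adjacency matrix) might look roughly like this. The 26-neighbourhood connectivity rule and the name `build_adjacency` are assumptions for illustration; the abstract does not say how connections are defined.

```python
from itertools import product
from typing import List, Sequence, Tuple

Voxel = Tuple[int, int, int]


def build_adjacency(skeleton: Sequence[Voxel]) -> List[List[int]]:
    """Adjacency matrix linking skeleton voxels that touch in a 26-neighbourhood.

    Assumes the voxel coordinates in skeleton are unique.
    """
    index = {voxel: i for i, voxel in enumerate(skeleton)}
    size = len(skeleton)
    adjacency = [[0] * size for _ in range(size)]
    # All 26 offsets to face, edge, and corner neighbours of a voxel.
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    for (x, y, z), i in index.items():
        for dx, dy, dz in offsets:
            j = index.get((x + dx, y + dy, z + dz))
            if j is not None:
                adjacency[i][j] = 1
    return adjacency
```

The resulting matrix is symmetric, and an isolated voxel produces an all-zero row.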

Dialect Text Normalization to Normative Standard Finnish

Title Dialect Text Normalization to Normative Standard Finnish
Authors Niko Partanen, Mika Hämäläinen, Khalid Alnajjar
Abstract We compare different LSTMs and transformer models in terms of their effectiveness in normalizing dialectal Finnish into the normative standard Finnish. As dialect is the common way of communication for people online in Finnish, such a normalization is a necessary step to improve the accuracy of the existing Finnish NLP tools that are tailored for normative Finnish text. We work on a corpus consisting of dialectal data of 23 distinct Finnish dialects. The best functioning BRNN approach lowers the initial word error rate of the corpus from 52.89 to 5.73.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5519/
PDF https://www.aclweb.org/anthology/D19-5519
PWC https://paperswithcode.com/paper/dialect-text-normalization-to-normative
Repo https://github.com/mikahama/murre
Framework none
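
The word error rate quoted in the abstract above is commonly computed as the word-level edit distance between a reference and a system output, divided by the reference length. The sketch below follows that common definition and reports a percentage; whether the paper computes it in exactly this way is an assumption.

```python
from typing import List


def edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Levenshtein distance over words (insertions, deletions, substitutions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate as a percentage of the reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

One substituted word in a four-word reference gives a rate of 25.0, and an output identical to the reference gives 0.0.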

On Distributed Averaging for Stochastic k-PCA

Title On Distributed Averaging for Stochastic k-PCA
Authors Aditya Bhaskara, Pruthuvi Maheshakya Wijewardena
Abstract In the stochastic k-PCA problem, we are given i.i.d. samples from an unknown distribution over vectors, and the goal is to compute the top k eigenvalues and eigenvectors of the moment matrix. In the simplest distributed variant, we have ‘m’ machines each of which receives ‘n’ samples. Each machine performs some computation and sends an O(k) size summary of the local dataset to a central server. The server performs an aggregation and computes the desired eigenvalues and vectors. The goal is to achieve the same effect as the server computing using m*n samples by itself. The main choices in this framework are the choice of the summary, and the method of aggregation. We consider a slight variant of the well-studied “distributed averaging” approach, and prove that this leads to significantly better bounds on the dependence between ‘n’ and the eigenvalue gaps. Our method can also be applied directly to a setting where the “right” value of the parameter k (i.e., one for which there is a non-trivial eigenvalue gap) is not known exactly. This is a common issue in practice which prior methods were unable to address.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9283-on-distributed-averaging-for-stochastic-k-pca
PDF http://papers.nips.cc/paper/9283-on-distributed-averaging-for-stochastic-k-pca.pdf
PWC https://paperswithcode.com/paper/on-distributed-averaging-for-stochastic-k-pca
Repo https://github.com/maheshakya/dist-averaging-k-pca
Framework none