January 25, 2020

Paper Group NAWR 36

A Composable Specification Language for Reinforcement Learning Tasks


Title	A Composable Specification Language for Reinforcement Learning Tasks
Authors	Kishor Jothimurugan, Rajeev Alur, Osbert Bastani
Abstract	Reinforcement learning is a promising approach for learning control policies for robot tasks. However, specifying complex tasks (e.g., with multiple objectives and safety constraints) can be challenging, since the user must design a reward function that encodes the entire task. Furthermore, the user often needs to manually shape the reward to ensure convergence of the learning algorithm. We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward shaping. We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines.
Tasks
Published	2019-12-01
URL	http://papers.nips.cc/paper/9462-a-composable-specification-language-for-reinforcement-learning-tasks
PDF	http://papers.nips.cc/paper/9462-a-composable-specification-language-for-reinforcement-learning-tasks.pdf
PWC	https://paperswithcode.com/paper/a-composable-specification-language-for
Repo	https://github.com/keyshor/spectrl_tool
Framework	pytorch

Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets


Title	Aiming beyond the Obvious: Identifying Non-Obvious Cases in Semantic Similarity Datasets
Authors	Nicole Peinelt, Maria Liakata, Dong Nguyen
Abstract	Existing datasets for scoring text pairs in terms of semantic similarity contain instances whose resolution differs according to the degree of difficulty. This paper proposes to distinguish obvious from non-obvious text pairs based on superficial lexical overlap and ground-truth labels. We characterise existing datasets in terms of containing difficult cases and find that recently proposed models struggle to capture the non-obvious cases of semantic similarity. We describe metrics that emphasise cases of similarity which require more complex inference and propose that these are used for evaluating systems for semantic similarity.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1268/
PDF	https://www.aclweb.org/anthology/P19-1268
PWC	https://paperswithcode.com/paper/aiming-beyond-the-obvious-identifying-non
Repo	https://github.com/wuningxi/LexSim
Framework	none
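
The obvious versus non-obvious split described in the abstract above can be illustrated with a small sketch. It assumes Jaccard overlap over whitespace tokens and a fixed threshold of 0.5; both are illustrative choices and not necessarily the overlap measure or cut-off used in the paper or in the LexSim repository. The function names are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercased whitespace tokens (an assumed overlap measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_obvious(a: str, b: str, label: int, threshold: float = 0.5) -> bool:
    """A pair is 'obvious' when surface overlap agrees with the gold label:
    high overlap with a positive label, or low overlap with a negative label.
    The threshold is a hypothetical value chosen for illustration."""
    high_overlap = jaccard(a, b) >= threshold
    return high_overlap == bool(label)
```

Under these assumptions, a pair such as "how do I learn python" / "how do I learn java" labelled as dissimilar counts as non-obvious, because its high lexical overlap points the wrong way.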

Beyond Word Attention: Using Segment Attention in Neural Relation Extraction


Title	Beyond Word Attention: Using Segment Attention in Neural Relation Extraction
Authors	Bowen Yu, Zhenyu Zhang, Tingwen Liu, Bin Wang, Sujian Li, Quangang Li
Abstract	Relation extraction studies the issue of predicting semantic relations between pairs of entities in sentences. Attention mechanisms are often used in this task to alleviate the inner-sentence noise by performing soft selections of words independently. Based on the observation that information pertinent to relations is usually contained within segments (continuous words in a sentence), it is possible to make use of this phenomenon for better extraction. In this paper, we aim to incorporate such segment information into a neural relation extractor. Our approach views the attention mechanism as linear-chain conditional random fields over a set of latent variables whose edges encode the desired structure, and regards attention weight as the marginal distribution of each word being selected as a part of the relational expression. Experimental results show that our method can attend to continuous relational expressions without explicit annotations, and achieves state-of-the-art performance on the large-scale TACRED dataset.
Tasks	Relation Extraction
Published	2019-08-10
URL	https://www.ijcai.org/proceedings/2019/750
PDF	https://www.ijcai.org/proceedings/2019/0750.pdf
PWC	https://paperswithcode.com/paper/beyond-word-attention-using-segment-attention
Repo	https://github.com/yubowen-ph/segment
Framework	pytorch

LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning


Title	LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning
Authors	Yali Du, Lei Han, Meng Fang, Ji Liu, Tianhong Dai, Dacheng Tao
Abstract	A great challenge in cooperative decentralized multi-agent reinforcement learning (MARL) is generating diversified behaviors for each individual agent when receiving only a team reward. Prior studies have paid much effort on reward shaping or designing a centralized critic that can discriminatively credit the agents. In this paper, we propose to merge the two directions and learn each agent an intrinsic reward function which diversely stimulates the agents at each time step. Specifically, the intrinsic reward for a specific agent will be involved in computing a distinct proxy critic for the agent to direct the updating of its individual policy. Meanwhile, the parameterized intrinsic reward function will be updated towards maximizing the expected accumulated team reward from the environment so that the objective is consistent with the original MARL problem. The proposed method is referred to as learning individual intrinsic reward (LIIR) in MARL. We compare LIIR with a number of state-of-the-art MARL methods on battle games in StarCraft II. The results demonstrate the effectiveness of LIIR, and we show LIIR can assign each individual agent an insightful intrinsic reward per time step.
Tasks	Multi-agent Reinforcement Learning, Starcraft, Starcraft II
Published	2019-12-01
URL	http://papers.nips.cc/paper/8691-liir-learning-individual-intrinsic-reward-in-multi-agent-reinforcement-learning
PDF	http://papers.nips.cc/paper/8691-liir-learning-individual-intrinsic-reward-in-multi-agent-reinforcement-learning.pdf
PWC	https://paperswithcode.com/paper/liir-learning-individual-intrinsic-reward-in
Repo	https://github.com/yalidu/liir
Framework	pytorch

Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees


Title	Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees
Authors	Muhammad Osama, Dave Zachariah, Peter Stoica
Abstract	A spatial point process can be characterized by an intensity function which predicts the number of events that occur across space. In this paper, we develop a method to infer predictive intensity intervals by learning a spatial model using a regularized criterion. We prove that the proposed method exhibits out-of-sample prediction performance guarantees which, unlike standard estimators, are valid even when the spatial model is misspecified. The method is demonstrated using synthetic as well as real spatial data.
Tasks	Point Processes
Published	2019-12-01
URL	http://papers.nips.cc/paper/9363-prediction-of-spatial-point-processes-regularized-method-with-out-of-sample-guarantees
PDF	http://papers.nips.cc/paper/9363-prediction-of-spatial-point-processes-regularized-method-with-out-of-sample-guarantees.pdf
PWC	https://paperswithcode.com/paper/prediction-of-spatial-point-processes
Repo	https://github.com/Muhammad-Osama/uncertainty_spatial_point_process
Framework	none

Convolutional Neural Networks Can Be Deceived by Visual Illusions


Title	Convolutional Neural Networks Can Be Deceived by Visual Illusions
Authors	Alexander Gomez-Villa, Adrian Martin, Javier Vazquez-Corral, Marcelo Bertalmio
Abstract	Visual illusions teach us that what we see is not always what is represented in the physical world. Their special nature makes them a fascinating tool to test and validate any new vision model proposed. In general, current vision models are based on the concatenation of linear and non-linear operations. The similarity of this structure with the operations present in Convolutional Neural Networks (CNNs) has motivated us to study if CNNs trained for low-level visual tasks are deceived by visual illusions. In particular, we show that CNNs trained for image denoising, image deblurring, and computational color constancy are able to replicate the human response to visual illusions, and that the extent of this replication varies with respect to variation in architecture and spatial pattern size. These results suggest that in order to obtain CNNs that better replicate human behaviour, we may need to start aiming for them to better replicate visual illusions.
Tasks	Color Constancy, Deblurring, Denoising, Image Denoising
Published	2019-06-01
URL	http://openaccess.thecvf.com/content_CVPR_2019/html/Gomez-Villa_Convolutional_Neural_Networks_Can_Be_Deceived_by_Visual_Illusions_CVPR_2019_paper.html
PDF	http://openaccess.thecvf.com/content_CVPR_2019/papers/Gomez-Villa_Convolutional_Neural_Networks_Can_Be_Deceived_by_Visual_Illusions_CVPR_2019_paper.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-can-be-deceived
Repo	https://github.com/alviur/convnets_vs_vi
Framework	none

Fixing Implicit Derivatives: Trust-Region Based Learning of Continuous Energy Functions


Title	Fixing Implicit Derivatives: Trust-Region Based Learning of Continuous Energy Functions
Authors	Chris Russell, Matteo Toso, Neill Campbell
Abstract	We present a new technique for the learning of continuous energy functions that we refer to as Wibergian Learning. One common approach to inverse problems is to cast them as an energy minimisation problem, where the minimum cost solution found is used as an estimator of hidden parameters. Our new approach formally characterises the dependency between weights that control the shape of the energy function, and the location of minima, by describing minima as fixed points of optimisation methods. This allows for the use of gradient-based end-to-end training to integrate deep-learning and the classical inverse problem methods. We show how our approach can be applied to obtain state-of-the-art results in the diverse applications of tracker fusion and multiview 3D reconstruction.
Tasks	3D Reconstruction
Published	2019-12-01
URL	http://papers.nips.cc/paper/8427-fixing-implicit-derivatives-trust-region-based-learning-of-continuous-energy-functions
PDF	http://papers.nips.cc/paper/8427-fixing-implicit-derivatives-trust-region-based-learning-of-continuous-energy-functions.pdf
PWC	https://paperswithcode.com/paper/fixing-implicit-derivatives-trust-region
Repo	https://github.com/MatteoT90/WibergianLearning
Framework	tf

Attribute Attention for Semantic Disambiguation in Zero-Shot Learning


Title	Attribute Attention for Semantic Disambiguation in Zero-Shot Learning
Authors	Yang Liu, Jishun Guo, Deng Cai, Xiaofei He
Abstract	Zero-shot learning (ZSL) aims to accurately recognize unseen objects by learning mapping matrices that bridge the gap between visual information and semantic attributes. Previous works implicitly treat attributes equally in compatibility score while ignoring that they have different importance for discrimination, which leads to severe semantic ambiguity. Considering both low-level visual information and global class-level features that relate to this ambiguity, we propose a practical Latent Feature Guided Attribute Attention (LFGAA) framework to perform object-based attribute attention for semantic disambiguation. By distracting semantic activation in dimensions that cause ambiguity, our method outperforms existing state-of-the-art methods on AwA2, CUB and SUN datasets in both inductive and transductive settings.
Tasks	Zero-Shot Learning
Published	2019-10-01
URL	http://openaccess.thecvf.com/content_ICCV_2019/html/Liu_Attribute_Attention_for_Semantic_Disambiguation_in_Zero-Shot_Learning_ICCV_2019_paper.html
PDF	http://openaccess.thecvf.com/content_ICCV_2019/papers/Liu_Attribute_Attention_for_Semantic_Disambiguation_in_Zero-Shot_Learning_ICCV_2019_paper.pdf
PWC	https://paperswithcode.com/paper/attribute-attention-for-semantic
Repo	https://github.com/ZJULearning/AttentionZSL
Framework	pytorch

Graph-Based Semi-Supervised Learning with Non-ignorable Non-response


Title	Graph-Based Semi-Supervised Learning with Non-ignorable Non-response
Authors	Fan Zhou, Tengfei Li, Haibo Zhou, Hongtu Zhu, Ye Jieping
Abstract	Graph-based semi-supervised learning is a very powerful tool in classification tasks, while in most existing literature the labelled nodes are assumed to be randomly sampled. When the labelling status depends on the unobserved node response, ignoring the missingness can lead to significant estimation bias and handicap the classifiers. This situation is called non-ignorable non-response. To solve the problem, we propose a Graph-based joint model with Non-ignorable Non-response (GNM), followed by a joint inverse weighting estimation procedure incorporated with sampling imputation approach. Our method is proved to outperform some state-of-the-art models in both regression and classification problems, by simulations and real analysis on the Cora dataset.
Tasks	Imputation
Published	2019-12-01
URL	http://papers.nips.cc/paper/8924-graph-based-semi-supervised-learning-with-non-ignorable-non-response
PDF	http://papers.nips.cc/paper/8924-graph-based-semi-supervised-learning-with-non-ignorable-non-response.pdf
PWC	https://paperswithcode.com/paper/graph-based-semi-supervised-learning-with-non
Repo	https://github.com/mlzxzhou/keras-gnm
Framework	none

What Does BERT Learn about the Structure of Language?


Title	What Does BERT Learn about the Structure of Language?
Authors	Ganesh Jawahar, Benoît Sagot, Djamé Seddah
Abstract	BERT is a recent language representation model that has surprisingly performed well in diverse language understanding benchmarks. This result indicates the possibility that BERT networks capture structural information about language. In this work, we provide novel support for this claim by performing a series of experiments to unpack the elements of English language structure learned by BERT. Our findings are fourfold. BERT's phrasal representation captures the phrase-level information in the lower layers. The intermediate layers of BERT compose a rich hierarchy of linguistic information, starting with surface features at the bottom, syntactic features in the middle followed by semantic features at the top. BERT requires deeper layers while tracking subject-verb agreement to handle long-term dependency problem. Finally, the compositional scheme underlying BERT mimics classical, tree-like structures.
Tasks
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-1356/
PDF	https://www.aclweb.org/anthology/P19-1356
PWC	https://paperswithcode.com/paper/what-does-bert-learn-about-the-structure-of
Repo	https://github.com/ganeshjawahar/interpret_bert
Framework	pytorch

Deep Video Frame Interpolation using Cyclic Frame Generation


Title	Deep Video Frame Interpolation using Cyclic Frame Generation
Authors	Yu-Lun Liu, Yi-Tung Liao, Yen-Yu Lin, Yung-Yu Chuang
Abstract	Video frame interpolation algorithms predict intermediate frames to produce videos with higher frame rates and smooth view transitions given two consecutive frames as inputs. We propose that: synthesized frames are more reliable if they can be used to reconstruct the input frames with high quality. Based on this idea, we introduce a new loss term, the cycle consistency loss. The cycle consistency loss can better utilize the training data to not only enhance the interpolation results, but also maintain the performance better with less training data. It can be integrated into any frame interpolation network and trained in an end-to-end manner. In addition to the cycle consistency loss, we propose two extensions: motion linearity loss and edge-guided training. The motion linearity loss approximates the motion between two input frames to be linear and regularizes the training. By applying edge-guided training, we further improve results by integrating edge information into training. Both qualitative and quantitative experiments demonstrate that our model outperforms the state-of-the-art methods.
Tasks	Video Frame Interpolation
Published	2019-01-27
URL	https://www.cmlab.csie.ntu.edu.tw/~yulunliu/CyclicGen
PDF	https://www.cmlab.csie.ntu.edu.tw/~yulunliu/CyclicGen_/liu.pdf
PWC	https://paperswithcode.com/paper/deep-video-frame-interpolation-using-cyclic
Repo	https://github.com/alex04072000/CyclicGen
Framework	tf
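
The cycle consistency idea in the abstract above (interpolated frames should be able to reconstruct an input frame) can be sketched on toy data. This is a minimal illustration under stated assumptions: frames are flat lists of pixel values, the `interpolate` function is a simple average standing in for a learned network, the three-consecutive-frame arrangement and the L1 distance are illustrative choices, and all names are hypothetical rather than taken from the CyclicGen code.

```python
from typing import Callable, List

Frame = List[float]


def interpolate(a: Frame, b: Frame) -> Frame:
    """Stand-in for a learned interpolation network: per-pixel average."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


def cycle_consistency_loss(
    f0: Frame, f1: Frame, f2: Frame,
    interp: Callable[[Frame, Frame], Frame] = interpolate,
) -> float:
    """Interpolate between (f0, f1) and (f1, f2), then interpolate between
    those two synthesized frames and compare the result with the real f1."""
    mid_01 = interp(f0, f1)           # synthesized frame at t = 0.5
    mid_12 = interp(f1, f2)           # synthesized frame at t = 1.5
    reconstructed = interp(mid_01, mid_12)  # should land back at t = 1
    # Mean absolute (L1) difference; the choice of norm is an assumption.
    return sum(abs(r - t) for r, t in zip(reconstructed, f1)) / len(f1)
```

With perfectly linear motion, for example frames `[0, 0]`, `[1, 2]`, `[2, 4]`, the averaging stand-in reconstructs the middle frame exactly and the loss is zero; non-linear motion yields a positive loss.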

Sentiment Classification Using Document Embeddings Trained with Cosine Similarity


Title	Sentiment Classification Using Document Embeddings Trained with Cosine Similarity
Authors	Tan Thongtan, Tanasanee Phienthrakul
Abstract	In document-level sentiment classification, each document must be mapped to a fixed length vector. Document embedding models map each document to a dense, low-dimensional vector in continuous vector space. This paper proposes training document embeddings using cosine similarity instead of dot product. Experiments on the IMDB dataset show that accuracy is improved when using cosine similarity compared to using dot product, while using feature combination with Naive Bayes weighted bag of n-grams achieves a new state of the art accuracy of 97.42%. Code to reproduce all experiments is available at https://github.com/tanthongtan/dv-cosine
Tasks	Document Embedding, Sentiment Analysis
Published	2019-07-01
URL	https://www.aclweb.org/anthology/P19-2057/
PDF	https://www.aclweb.org/anthology/P19-2057
PWC	https://paperswithcode.com/paper/sentiment-classification-using-document
Repo	https://github.com/tanthongtan/dv-cosine
Framework	none
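
The difference between the two similarity measures named in the abstract above can be seen in a few lines. This sketch only compares the measures themselves on plain Python lists; it does not reproduce the paper's training procedure, and the function names are illustrative.

```python
import math
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product; grows with the magnitude of either vector."""
    return sum(x * y for x, y in zip(u, v))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product normalized by both vector lengths; depends on direction only."""
    norm_u = math.sqrt(dot(u, u))
    norm_v = math.sqrt(dot(v, v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot(u, v) / (norm_u * norm_v)
```

Scaling a vector changes its dot product with another vector but leaves the cosine similarity unchanged: `[1, 2]` against `[1, 2]` and against `[10, 20]` gives dot products of 5 and 50, while the cosine similarity is 1 (up to floating-point rounding) in both cases.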

Linking convolutional neural networks with graph convolutional networks: application in pulmonary artery-vein separation


Title	Linking convolutional neural networks with graph convolutional networks: application in pulmonary artery-vein separation
Authors	Zhiwei Zhai, Marius Staring, Xuhui Zhou, Qiuxia Xie, Xiaojuan Xiao, M. Els Bakker, Lucia J. Kroft, Boudewijn P.F. Lelieveldt, Gudula J.A.M. Boon, Frederikus A. Klok, Berend C. Stoel
Abstract	Graph Convolutional Networks (GCNs) are a novel and powerful method for dealing with non-Euclidean data, while Convolutional Neural Networks (CNNs) can learn features from Euclidean data such as images. In this work, we propose a novel method to combine CNNs with GCNs (CNN-GCN), that can consider both Euclidean and non-Euclidean features and can be trained end-to-end. We applied this method to separate the pulmonary vascular trees into arteries and veins (A/V). Chest CT scans were pre-processed by vessel segmentation and skeletonization, from which a graph was constructed: voxels on the skeletons resulting in a vertex set and their connections in an adjacency matrix. 3D patches centered around each vertex were extracted from the CT scans, oriented perpendicularly to the vessel. The proposed CNN-GCN classifier was trained and applied on the constructed vessel graphs, where each node is then labeled as artery or vein. The proposed method was trained and validated on data from one hospital (11 patients, 22 lungs), and tested on independent data from a different hospital (10 patients, 10 lungs). A baseline CNN method and human observer performance were used for comparison. The CNN-GCN method obtained a median accuracy of 0.773 (0.738) in the validation (test) set, compared to a median accuracy of 0.817 by the observers, and 0.727 (0.693) by the CNN. In conclusion, the proposed CNN-GCN method combines local image information with graph connectivity information, improving pulmonary A/V separation over a baseline CNN method, approaching the performance of human observers.
Tasks	3D Medical Imaging Segmentation, Medical Image Segmentation, Pulmonary Artery–Vein Classification, Pulmonary Vessel Segmentation
Published	2019-09-01
URL	https://www.researchgate.net/publication/335620542_Linking_convolutional_neural_networks_with_graph_convolutional_networks_application_in_pulmonary_artery-vein_separation
PDF	https://bit.ly/2kNpbdv
PWC	https://paperswithcode.com/paper/linking-convolutional-neural-networks-with
Repo	https://github.com/chushan89/Linking-CNN-GCN
Framework	tf

Dialect Text Normalization to Normative Standard Finnish


Title	Dialect Text Normalization to Normative Standard Finnish
Authors	Niko Partanen, Mika Hämäläinen, Khalid Alnajjar
Abstract	We compare different LSTMs and transformer models in terms of their effectiveness in normalizing dialectal Finnish into the normative standard Finnish. As dialect is the common way of communication for people online in Finnish, such a normalization is a necessary step to improve the accuracy of the existing Finnish NLP tools that are tailored for normative Finnish text. We work on a corpus consisting of dialectal data of 23 distinct Finnish dialects. The best functioning BRNN approach lowers the initial word error rate of the corpus from 52.89 to 5.73.
Tasks
Published	2019-11-01
URL	https://www.aclweb.org/anthology/D19-5519/
PDF	https://www.aclweb.org/anthology/D19-5519
PWC	https://paperswithcode.com/paper/dialect-text-normalization-to-normative
Repo	https://github.com/mikahama/murre
Framework	none
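
The abstract above reports results as word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: word-level edit distance divided by the number of reference words, expressed as a percentage. Whether the paper computes WER in exactly this way (tokenization, percentage scale) is an assumption, and the function name is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return 100.0 * prev[-1] / len(ref)
```

For a four-word reference, one substituted or one missing word in the hypothesis gives a WER of 25.0, and an identical hypothesis gives 0.0.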

On Distributed Averaging for Stochastic k-PCA


Title	On Distributed Averaging for Stochastic k-PCA
Authors	Aditya Bhaskara, Pruthuvi Maheshakya Wijewardena
Abstract	In the stochastic k-PCA problem, we are given i.i.d. samples from an unknown distribution over vectors, and the goal is to compute the top k eigenvalues and eigenvectors of the moment matrix. In the simplest distributed variant, we have ‘m’ machines each of which receives ‘n’ samples. Each machine performs some computation and sends an O(k) size summary of the local dataset to a central server. The server performs an aggregation and computes the desired eigenvalues and vectors. The goal is to achieve the same effect as the server computing using m*n samples by itself. The main choices in this framework are the choice of the summary, and the method of aggregation. We consider a slight variant of the well-studied “distributed averaging” approach, and prove that this leads to significantly better bounds on the dependence between ‘n’ and the eigenvalue gaps. Our method can also be applied directly to a setting where the “right” value of the parameter k (i.e., one for which there is a non-trivial eigenvalue gap) is not known exactly. This is a common issue in practice which prior methods were unable to address.
Tasks
Published	2019-12-01
URL	http://papers.nips.cc/paper/9283-on-distributed-averaging-for-stochastic-k-pca
PDF	http://papers.nips.cc/paper/9283-on-distributed-averaging-for-stochastic-k-pca.pdf
PWC	https://paperswithcode.com/paper/on-distributed-averaging-for-stochastic-k-pca
Repo	https://github.com/maheshakya/dist-averaging-k-pca
Framework	none