April 3, 2020


Paper Group AWR 43


Multi-label learning for dynamic model type recommendation


Title	Multi-label learning for dynamic model type recommendation
Authors	Mariana A. Souza, Robert Sabourin, George D. C. Cavalcanti, Rafael M. O. Cruz
Abstract	Dynamic selection techniques aim at selecting the local experts around each test sample in particular for performing its classification. While generating the classifier on a local scope may make it easier for singling out the locally competent ones, as in the online local pool (OLP) technique, using the same base-classifier model in uneven distributions may restrict the local level of competence, since each region may have a data distribution that favors one model over the others. Thus, we propose in this work a problem-independent dynamic base-classifier model recommendation for the OLP technique, which uses information regarding the behavior of a portfolio of models over the samples of different problems to recommend one (or several) of them in a per-instance manner. Our proposed framework builds a multi-label meta-classifier responsible for recommending a set of relevant model types based on the local data complexity of the region surrounding each test sample. The OLP technique then produces a local pool with the model that yields the highest probability score of the meta-classifier. Experimental results show that different data distributions favored different model types on a local scope. Moreover, based on the performance of an ideal model type selector, it was observed that there is a clear advantage in choosing a relevant model type for each test instance. Overall, the proposed model type recommender system yielded a statistically similar performance to the original OLP with fixed base-classifier model. Given the novelty of the approach and the gap in performance between the proposed framework and the ideal selector, we regard this as a promising research direction. Code available at github.com/marianaasouza/dynamic-model-recommender.
Tasks	Multi-Label Learning, Recommendation Systems
Published	2020-04-01
URL	https://arxiv.org/abs/2004.00558v1
PDF	https://arxiv.org/pdf/2004.00558v1.pdf
PWC	https://paperswithcode.com/paper/multi-label-learning-for-dynamic-model-type
Repo	https://github.com/marianaasouza/dynamic-model-recommender
Framework	none
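
The per-instance selection idea in the abstract above can be illustrated with a toy sketch. It is not the OLP technique or the paper's meta-classifier: it assumes a Euclidean k-nearest-neighbour region around the test sample and uses plain local accuracy as the competence measure. The candidate "model types" (`threshold_on_x`, `threshold_on_y`) and the data are hypothetical.

```python
import math

def k_nearest(query, samples, k):
    """Return the k labelled samples closest to `query` (Euclidean distance)."""
    return sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]

def local_accuracy(model, region):
    """Fraction of samples in the local region that `model` labels correctly."""
    return sum(model(x) == y for x, y in region) / len(region)

def select_model(query, samples, models, k=3):
    """Pick the name of the candidate model with the highest local accuracy."""
    region = k_nearest(query, samples, k)
    return max(models, key=lambda name: local_accuracy(models[name], region))

# Hypothetical candidate "model types": two simple threshold rules.
models = {
    "threshold_on_x": lambda p: int(p[0] > 0.5),
    "threshold_on_y": lambda p: int(p[1] > 0.5),
}

# Toy labelled data: near the origin the label follows x, far away it follows y.
samples = [
    ((0.1, 0.9), 0), ((0.2, 0.8), 0), ((0.9, 0.1), 1),   # label == (x > 0.5)
    ((5.1, 0.9), 1), ((5.2, 0.8), 1), ((5.9, 0.1), 0),   # label == (y > 0.5)
]

choice_near = select_model((0.3, 0.7), samples, models)
choice_far = select_model((5.3, 0.7), samples, models)
```

In this toy setup a different rule is selected in each region, which mirrors the abstract's observation that different local distributions can favor different model types.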

MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding


Title	MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding
Authors	Xinyu Fu, Jiani Zhang, Ziqiao Meng, Irwin King
Abstract	A large number of real-world graphs or networks are inherently heterogeneous, involving a diversity of node types and relation types. Heterogeneous graph embedding is to embed rich structural and semantic information of a heterogeneous graph into low-dimensional node representations. Existing models usually define multiple metapaths in a heterogeneous graph to capture the composite relations and guide neighbor selection. However, these models either omit node content features, discard intermediate nodes along the metapath, or only consider one metapath. To address these three limitations, we propose a new model named Metapath Aggregated Graph Neural Network (MAGNN) to boost the final performance. Specifically, MAGNN employs three major components, i.e., the node content transformation to encapsulate input node attributes, the intra-metapath aggregation to incorporate intermediate semantic nodes, and the inter-metapath aggregation to combine messages from multiple metapaths. Extensive experiments on three real-world heterogeneous graph datasets for node classification, node clustering, and link prediction show that MAGNN achieves more accurate prediction results than state-of-the-art baselines.
Tasks	Graph Embedding, Link Prediction, Node Classification
Published	2020-02-05
URL	https://arxiv.org/abs/2002.01680v2
PDF	https://arxiv.org/pdf/2002.01680v2.pdf
PWC	https://paperswithcode.com/paper/magnn-metapath-aggregated-graph-neural
Repo	https://github.com/cynricfu/MAGNN
Framework	pytorch
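
The two aggregation stages named in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the MAGNN implementation: it assumes mean pooling where the paper uses learned encoders and attention, and the node names, features, and attention scores are hypothetical.

```python
import math

def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def intra_metapath(instances, features):
    """Encode each metapath instance as the mean of all its node features
    (intermediate nodes included), then average over the instances."""
    encoded = [mean_vectors([features[n] for n in inst]) for inst in instances]
    return mean_vectors(encoded)

def inter_metapath(summaries, scores):
    """Combine the per-metapath vectors using softmax weights over `scores`."""
    exps = [math.exp(s) for s in scores]
    weights = [e / sum(exps) for e in exps]
    dim = len(summaries[0])
    return [sum(w * v[d] for w, v in zip(weights, summaries)) for d in range(dim)]

# Hypothetical node features: authors (a*), papers (p*), venue (v1).
features = {
    "a1": [1.0, 0.0], "a2": [0.0, 1.0],
    "p1": [2.0, 2.0], "p2": [0.0, 3.0],
    "v1": [4.0, 0.0],
}

# Metapath instances for target node a1 under two metapaths.
apa = intra_metapath([("a1", "p1", "a2")], features)
apvpa = intra_metapath([("a1", "p1", "v1", "p2", "a2")], features)

# Equal scores stand in for learned attention scores.
embedding = inter_metapath([apa, apvpa], scores=[0.0, 0.0])
```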

Exact Information Bottleneck with Invertible Neural Networks: Getting the Best of Discriminative and Generative Modeling


Title	Exact Information Bottleneck with Invertible Neural Networks: Getting the Best of Discriminative and Generative Modeling
Authors	Lynton Ardizzone, Radek Mackowiak, Carsten Rother, Ullrich Köthe
Abstract	Generative models are more informative about underlying phenomena than discriminative ones and offer superior uncertainty quantification and out-of-distribution robustness. However, these advantages often come at the expense of reduced classification accuracy. The Information Bottleneck objective (IB) formulates this trade-off in a clean information-theoretic way, but its practical application is hampered by a lack of accurate high-dimensional estimators of mutual information (MI), its main constituent. To overcome this limitation, we develop the theory and methodology of IB-INNs, which optimize the IB objective by means of Invertible Neural Networks (INNs), without the need for approximations of MI. Our experiments show that IB-INNs allow for a precise adjustment of the generative/discriminative trade-off: They learn accurate models of the class conditional likelihoods, generalize well to unseen data and reliably detect out-of-distribution examples, while at the same time exhibiting classification accuracy close to purely discriminative feed-forward networks.
Tasks
Published	2020-01-17
URL	https://arxiv.org/abs/2001.06448v3
PDF	https://arxiv.org/pdf/2001.06448v3.pdf
PWC	https://paperswithcode.com/paper/exact-information-bottleneck-with-invertible
Repo	https://github.com/VLL-HD/FrEIA
Framework	pytorch
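
For orientation, the Information Bottleneck objective mentioned in the abstract above is usually written in the following standard form, where $X$ is the input, $Y$ the label, $Z$ the learned representation, and $\beta$ the trade-off weight. This is the textbook formulation; the exact variant optimized by IB-INNs may differ.

```latex
\mathcal{L}_{\mathrm{IB}} \;=\; I(X; Z) \;-\; \beta \, I(Z; Y)
```

Minimizing the first term compresses the input, while the second term rewards representations that remain informative about the label.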

Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions


Title	Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions
Authors	Matheus Gadelha, Aruni RoyChowdhury, Gopal Sharma, Evangelos Kalogerakis, Liangliang Cao, Erik Learned-Miller, Rui Wang, Subhransu Maji
Abstract	The problems of shape classification and part segmentation from 3D point clouds have garnered increasing attention in the last few years. But both of these problems suffer from relatively small training sets, creating the need for statistically efficient methods to learn 3D shape representations. In this work, we investigate the use of Approximate Convex Decompositions (ACD) as a self-supervisory signal for label-efficient learning of point cloud representations. Decomposing a 3D shape into simpler constituent parts or primitives is a fundamental problem in geometrical shape processing. There has been extensive work on such decompositions, where the criterion for simplicity of a constituent shape is often defined in terms of convexity for solid primitives. In this paper, we show that using the results of ACD to approximate a ground truth segmentation provides excellent self-supervision for learning 3D point cloud representations that are highly effective on downstream tasks. We report improvements over the state-of-the-art in unsupervised representation learning on the ModelNet40 shape classification dataset and significant gains in few-shot part segmentation on the ShapeNetPart dataset. Code available at https://github.com/matheusgadelha/PointCloudLearningACD
Tasks	Representation Learning, Unsupervised Representation Learning
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13834v1
PDF	https://arxiv.org/pdf/2003.13834v1.pdf
PWC	https://paperswithcode.com/paper/label-efficient-learning-on-point-clouds
Repo	https://github.com/matheusgadelha/PointCloudLearningACD
Framework	none

Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study


Title	Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study
Authors	Jinlan Fu, Pengfei Liu, Qi Zhang, Xuanjing Huang
Abstract	While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets (ReCoNLL, PLONER) for future research at our project page: http://pfliu.com/InterpretNER/. As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers and classifies them into different research topics: https://github.com/pfliu-nlp/Named-Entity-Recognition-NER-Papers.
Tasks	Named Entity Recognition
Published	2020-01-12
URL	https://arxiv.org/abs/2001.03844v1
PDF	https://arxiv.org/pdf/2001.03844v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-generalization-of-neural-models-a
Repo	https://github.com/pfliu-nlp/Named-Entity-Recognition-NER-Papers
Framework	none

Deformable Groupwise Image Registration using Low-Rank and Sparse Decomposition


Title	Deformable Groupwise Image Registration using Low-Rank and Sparse Decomposition
Authors	Roland Haase, Stefan Heldmann, Jan Lellmann
Abstract	Low-rank and sparse decompositions and robust PCA (RPCA) are highly successful techniques in image processing and have recently found use in groupwise image registration. In this paper, we investigate the drawbacks of the most common RPCA-dissimilarity metric in image registration and derive an improved version. In particular, this new metric models low-rank requirements through explicit constraints instead of penalties and thus avoids the pitfalls of the established metric. Equipped with total variation regularization, we present a theoretically justified multilevel scheme based on first-order primal-dual optimization to solve the resulting non-parametric registration problem. As confirmed by numerical experiments, our metric especially lends itself to data involving recurring changes in object appearance and potential sparse perturbations. We numerically compare its performance to a number of related approaches.
Tasks	Image Registration
Published	2020-01-10
URL	https://arxiv.org/abs/2001.03509v1
PDF	https://arxiv.org/pdf/2001.03509v1.pdf
PWC	https://paperswithcode.com/paper/deformable-groupwise-image-registration-using
Repo	https://github.com/roland1993/d_RPCA
Framework	none
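
As background for the abstract above, the classical robust PCA problem (principal component pursuit) splits a data matrix $D$ into a low-rank part $L$ and a sparse part $S$. The form below is the standard penalty-based formulation, with $\lambda$ a weighting parameter; it is not the constraint-based metric the paper derives.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad L + S = D
```

Here $\|L\|_{*}$ is the nuclear norm (sum of singular values) and $\|S\|_{1}$ the entrywise $\ell_1$ norm.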

Interpretable & Time-Budget-Constrained Contextualization for Re-Ranking


Title	Interpretable & Time-Budget-Constrained Contextualization for Re-Ranking
Authors	Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury
Abstract	Search engines operate under a strict time constraint as a fast response is paramount to user satisfaction. Thus, neural re-ranking models have a limited time-budget to re-rank documents. Given the same amount of time, a faster re-ranking model can incorporate more documents than a less efficient one, leading to a higher effectiveness. To utilize this property, we propose TK (Transformer-Kernel): a neural re-ranking model for ad-hoc search using an efficient contextualization mechanism. TK employs a very small number of Transformer layers (up to three) to contextualize query and document word embeddings. To score individual term interactions, we use a document-length enhanced kernel-pooling, which enables users to gain insight into the model. TK offers an optimal ratio between effectiveness and efficiency: under realistic time constraints (max. 200 ms per query) TK achieves the highest effectiveness in comparison to BERT and other re-ranking models. We demonstrate this on three large-scale ranking collections: MSMARCO-Passage, MSMARCO-Document, and TREC CAR. In addition, to gain insight into TK, we perform a clustered query analysis of TK’s results, highlighting its strengths and weaknesses on queries with different types of information need and we show how to interpret the cause of ranking differences of two documents by comparing their internal scores.
Tasks	Word Embeddings
Published	2020-02-04
URL	https://arxiv.org/abs/2002.01854v1
PDF	https://arxiv.org/pdf/2002.01854v1.pdf
PWC	https://paperswithcode.com/paper/interpretable-time-budget-constrained
Repo	https://github.com/sebastian-hofstaetter/transformer-kernel-ranking
Framework	pytorch
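
Kernel pooling over term interactions, as mentioned in the abstract above, can be sketched as follows. This is a minimal illustration in the style of Gaussian kernel pooling, not the TK implementation: the contextualization by Transformer layers and the learned scoring layers are omitted, and the length-normalized feature is an assumed reading of "document-length enhanced". The vectors, kernel centers `mus`, and `sigma` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def kernel_pooling(query_vecs, doc_vecs, mus, sigma=0.1):
    """Soft-count, per kernel, how many document terms match each query term.

    Returns two feature lists with one value per kernel:
    log-scaled soft counts and document-length-normalized soft counts.
    """
    log_feats, len_feats = [], []
    for mu in mus:
        log_sum, len_sum = 0.0, 0.0
        for q in query_vecs:
            soft_count = sum(
                math.exp(-((cosine(q, d) - mu) ** 2) / (2 * sigma ** 2))
                for d in doc_vecs
            )
            log_sum += math.log(soft_count + 1e-10)   # small constant avoids log(0)
            len_sum += soft_count / len(doc_vecs)
        log_feats.append(log_sum)
        len_feats.append(len_sum)
    return log_feats, len_feats

# One query term; a two-term document with one exact match and one unrelated term.
query = [(1.0, 0.0)]
doc = [(1.0, 0.0), (0.0, 1.0)]
log_feats, len_feats = kernel_pooling(query, doc, mus=[1.0, 0.0])
```

Because each kernel reports how many document terms fall near a given similarity level, the per-kernel values can be inspected directly, which is the kind of insight into the model the abstract refers to.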

Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach


Title	Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach
Authors	Lei Chen, Le Wu, Richang Hong, Kun Zhang, Meng Wang
Abstract	Graph Convolutional Networks (GCNs) are state-of-the-art graph based representation learning models by iteratively stacking multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering (CF) based Recommender Systems (RS), by treating the user-item interaction behavior as a bipartite graph, some researchers model higher-layer collaborative signals with GCNs. These GCN based recommender models show superior performance compared to traditional works. However, these models suffer from training difficulty with non-linear activations for large user-item graphs. Besides, most GCN based models could not model deeper layers due to the over smoothing effect with the graph convolution operation. In this paper, we revisit GCN based CF models from two aspects. First, we empirically show that removing non-linearities would enhance recommendation performance, which is consistent with the theories in simple graph convolutional networks. Second, we propose a residual network structure that is specifically designed for CF with user-item interaction modeling, which alleviates the over smoothing problem in graph convolution aggregation operation with sparse user-item interaction data. The proposed model is a linear model and it is easy to train, scale to large datasets, and yield better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LRGCCF.
Tasks	Recommendation Systems, Representation Learning
Published	2020-01-28
URL	https://arxiv.org/abs/2001.10167v1
PDF	https://arxiv.org/pdf/2001.10167v1.pdf
PWC	https://paperswithcode.com/paper/revisiting-graph-based-collaborative
Repo	https://github.com/newlei/LR-GCCF
Framework	pytorch
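
The two ideas in the abstract above, removing non-linearities and adding a residual structure, can be sketched on a toy user-item graph. This is an illustrative simplification and not the released LR-GCCF code: it assumes row normalization with self-loops, omits the per-layer weight matrices, and uses hypothetical embeddings.

```python
def normalize_with_self_loops(adj):
    """Add self-loops, then scale each row to sum to 1 (a simple normalization)."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in a]

def propagate(s, emb):
    """One linear graph-convolution layer: S @ E, with no activation function."""
    dim = len(emb[0])
    return [
        [sum(s[i][j] * emb[j][d] for j in range(len(emb))) for d in range(dim)]
        for i in range(len(s))
    ]

def residual_score(layers, user, item):
    """Sum the user-item dot products over all layers (residual prediction)."""
    return sum(
        sum(a * b for a, b in zip(layer[user], layer[item])) for layer in layers
    )

# Bipartite toy graph: nodes 0-1 are users, nodes 2-3 are items.
adj = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
emb0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]

s = normalize_with_self_loops(adj)
layers = [emb0]
for _ in range(2):                      # two linear propagation layers
    layers.append(propagate(s, layers[-1]))

score = residual_score(layers, user=0, item=2)
```

Summing the per-layer scores keeps the contribution of the shallow layers, which is one way to limit the effect of over-smoothing in deeper layers.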

A Framework for End-to-End Learning on Semantic Tree-Structured Data


Title	A Framework for End-to-End Learning on Semantic Tree-Structured Data
Authors	William Woof, Ke Chen
Abstract	While learning models are typically studied for inputs in the form of a fixed dimensional feature vector, real world data is rarely found in this form. In order to meet the basic requirement of traditional learning models, structural data generally have to be converted into fixed-length vectors in a handcrafted manner, which is tedious and may even incur information loss. A common form of structured data is what we term “semantic tree-structures”, corresponding to data where rich semantic information is encoded in a compositional manner, such as those expressed in JavaScript Object Notation (JSON) and eXtensible Markup Language (XML). For tree-structured data, several learning models have been studied to allow for working directly on raw tree-structured data. However, such learning models are limited to either a specific tree-topology or a specific tree-structured data format, e.g., synthetic parse trees. In this paper, we propose a novel framework for end-to-end learning on generic semantic tree-structured data of arbitrary topology and heterogeneous data types, such as data expressed in JSON, XML and so on. Motivated by the works in recursive and recurrent neural networks, we develop exemplar neural implementations of our framework for the JSON format. We evaluate our approach on several UCI benchmark datasets, including ablation and data-efficiency studies, and on a toy reinforcement learning task. Experimental results suggest that our framework yields comparable performance to use of standard models with dedicated feature-vectors in general, and even exceeds baseline performance in cases where compositional nature of the data is particularly important. The source code for a JSON-based implementation of our framework along with experiments can be downloaded at https://github.com/EndingCredits/json2vec.
Tasks
Published	2020-02-13
URL	https://arxiv.org/abs/2002.05707v1
PDF	https://arxiv.org/pdf/2002.05707v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-end-to-end-learning-on
Repo	https://github.com/EndingCredits/json2vec
Framework	pytorch
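
The recursive treatment of JSON trees described in the abstract above can be illustrated with a toy encoder. This sketch is not the json2vec code: it assumes fixed, hand-written rules (character hashing and averaging) in place of the learned recursive and recurrent networks, and `DIM`, `embed_string`, and `embed` are hypothetical names.

```python
import json

DIM = 4  # size of every node embedding (arbitrary choice for the sketch)

def embed_string(text):
    """Hash characters into a fixed-length count vector (stand-in for a learned encoder)."""
    vec = [0.0] * DIM
    for ch in text:
        vec[ord(ch) % DIM] += 1.0
    return vec

def embed(node):
    """Recursively map any JSON value to a vector of length DIM."""
    if isinstance(node, bool) or node is None:
        return [1.0 if node else 0.0] + [0.0] * (DIM - 1)
    if isinstance(node, (int, float)):
        return [float(node)] + [0.0] * (DIM - 1)
    if isinstance(node, str):
        return embed_string(node)
    if isinstance(node, dict):
        # Combine each key embedding with its child embedding.
        children = [
            [k + v for k, v in zip(embed_string(key), embed(value))]
            for key, value in node.items()
        ]
    else:  # list
        children = [embed(item) for item in node]
    if not children:
        return [0.0] * DIM
    # Average the children so the output size does not depend on the tree shape.
    return [sum(col) / len(children) for col in zip(*children)]

record = json.loads('{"name": "ada", "tags": ["x", "yz"], "age": 36, "meta": {}}')
vector = embed(record)
```

Whatever the nesting depth or mix of types, the result has the same length, so no handcrafted conversion to a feature vector is needed before a downstream model.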


Ultrasound-Guided Robotic Navigation with Deep Reinforcement Learning


Title	Ultrasound-Guided Robotic Navigation with Deep Reinforcement Learning
Authors	Hannes Hase, Mohammad Farid Azampour, Maria Tirindelli, Magdalini Paschali, Walter Simson, Emad Fatemizadeh, Nassir Navab
Abstract	In this paper we introduce the first reinforcement learning (RL) based robotic navigation method which utilizes ultrasound (US) images as an input. Our approach combines state-of-the-art RL techniques, specifically deep Q-networks (DQN) with memory buffers and a binary classifier for deciding when to terminate the task. Our method is trained and evaluated on an in-house collected data-set of 34 volunteers and when compared to pure RL and supervised learning (SL) techniques, it performs substantially better, which highlights the suitability of RL navigation for US-guided procedures. When testing our proposed model, we obtained an 82.91% chance of navigating correctly to the sacrum from 165 different starting positions on 5 different unseen simulated environments.
Tasks
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13321v1
PDF	https://arxiv.org/pdf/2003.13321v1.pdf
PWC	https://paperswithcode.com/paper/ultrasound-guided-robotic-navigation-with
Repo	https://github.com/hhase/spinal-navigation-rl
Framework	tf
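
Two of the building blocks named in the abstract above, a memory (replay) buffer and the Q-learning target used to train a DQN, can be sketched in a few lines. This is a generic illustration rather than the paper's code: the neural network, the ultrasound inputs, and the termination classifier are omitted, and the transitions are hypothetical placeholders.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size memory of past transitions, sampled uniformly for training."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.memory), batch_size)

def td_target(reward, next_q_values, done, gamma=0.99):
    """Q-learning target: r if the episode ended, else r + gamma * max Q(s', a')."""
    return reward if done else reward + gamma * max(next_q_values)

buffer = ReplayBuffer(capacity=3)
for step in range(5):                       # oldest transitions are evicted
    buffer.push(step, 0, 1.0, step + 1, step == 4)

batch = buffer.sample(2)
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], done=False, gamma=0.9)
```

Sampling from the buffer breaks the correlation between consecutive transitions, which is the usual motivation for pairing DQN with a replay memory.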

Deep Image Compression using Decoder Side Information


Title	Deep Image Compression using Decoder Side Information
Authors	Sharon Ayzik, Shai Avidan
Abstract	We present a Deep Image Compression neural network that relies on side information, which is only available to the decoder. We base our algorithm on the assumption that the image available to the encoder and the image available to the decoder are correlated, and we let the network learn these correlations in the training phase. Then, at run time, the encoder side encodes the input image without knowing anything about the decoder side image and sends it to the decoder. The decoder then uses the encoded input image and the side information image to reconstruct the original image. This problem is known as Distributed Source Coding in Information Theory, and we discuss several use cases for this technology. We compare our algorithm to several image compression algorithms and show that adding decoder-only side information does indeed improve results. Our code is publicly available at https://github.com/ayziksha/DSIN.
Tasks	Image Compression
Published	2020-01-14
URL	https://arxiv.org/abs/2001.04753v1
PDF	https://arxiv.org/pdf/2001.04753v1.pdf
PWC	https://paperswithcode.com/paper/deep-image-compression-using-decoder-side
Repo	https://github.com/ayziksha/DSIN
Framework	tf

Show, Edit and Tell: A Framework for Editing Image Captions


Title	Show, Edit and Tell: A Framework for Editing Image Captions
Authors	Fawaz Sammani, Luke Melas-Kyriazi
Abstract	Most image captioning frameworks generate captions directly from images, learning a mapping from visual features to natural language. However, editing existing captions can be easier than generating new ones from scratch. Intuitively, when editing captions, a model is not required to learn information that is already present in the caption (i.e. sentence structure), enabling it to focus on fixing details (e.g. replacing repetitive words). This paper proposes a novel approach to image captioning based on iterative adaptive refinement of an existing caption. Specifically, our caption-editing model consists of two sub-modules: (1) EditNet, a language module with an adaptive copy mechanism (Copy-LSTM) and a Selective Copy Memory Attention mechanism (SCMA), and (2) DCNet, an LSTM-based denoising auto-encoder. These components enable our model to directly copy from and modify existing captions. Experiments demonstrate that our new approach achieves state-of-the-art performance on the MS COCO dataset both with and without sequence-level training.
Tasks	Denoising, Image Captioning
Published	2020-03-06
URL	https://arxiv.org/abs/2003.03107v1
PDF	https://arxiv.org/pdf/2003.03107v1.pdf
PWC	https://paperswithcode.com/paper/show-edit-and-tell-a-framework-for-editing-1
Repo	https://github.com/fawazsammani/show-edit-tell
Framework	pytorch

Evolving Neural Networks through a Reverse Encoding Tree


Title	Evolving Neural Networks through a Reverse Encoding Tree
Authors	Haoling Zhang, Chao-Han Huck Yang, Hector Zenil, Narsis A. Kiani, Yue Shen, Jesper N. Tegner
Abstract	NeuroEvolution is one of the most competitive evolutionary learning frameworks for designing novel neural networks for use in specific tasks, such as logic circuit design and digital gaming. However, the application of benchmark methods such as the NeuroEvolution of Augmenting Topologies (NEAT) remains a challenge, in terms of their computational cost and search time inefficiency. This paper advances a method which incorporates a type of topological edge coding, named Reverse Encoding Tree (RET), for evolving scalable neural networks efficiently. Using RET, two types of approaches – NEAT with Binary search encoding (Bi-NEAT) and NEAT with Golden-Section search encoding (GS-NEAT) – have been designed to solve problems in benchmark continuous learning environments such as logic gates, Cartpole, and Lunar Lander, and tested against classical NEAT and FS-NEAT as baselines. Additionally, we conduct a robustness test to evaluate the resilience of the proposed NEAT algorithms. The results show that the two proposed strategies deliver improved performance, characterized by (1) a higher accumulated reward within a finite number of time steps; (2) using fewer episodes to solve problems in targeted environments, and (3) maintaining adaptive robustness under noisy perturbations, which outperform the baselines in all tested cases. Our analysis also demonstrates that RET expands potential future research directions in dynamic environments. Code is available from https://github.com/HaolingZHANG/ReverseEncodingTree.
Tasks
Published	2020-02-03
URL	https://arxiv.org/abs/2002.00539v2
PDF	https://arxiv.org/pdf/2002.00539v2.pdf
PWC	https://paperswithcode.com/paper/evolving-neural-networks-through-a-reverse
Repo	https://github.com/HaolingZHANG/ReverseEncodingTree
Framework	none
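
GS-NEAT takes its name from golden-section search. As background, the classical one-dimensional version of that search is sketched below; how RET applies the idea to network topologies is not reproduced here, and the test function is a hypothetical example.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-6):
    """Locate the minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2          # 1 / golden ratio, about 0.618
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                            # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                                  # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

x_min = golden_section_min(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Each iteration reuses one of the two previous function evaluations, so the interval shrinks by a constant factor at the cost of a single new evaluation.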

A Clustering Framework for Lexical Normalization of Roman Urdu


Title	A Clustering Framework for Lexical Normalization of Roman Urdu
Authors	Abdul Rafae Khan, Asim Karim, Hassan Sajjad, Faisal Kamiran, Jia Xu
Abstract	Roman Urdu is an informal form of the Urdu language written in Roman script, which is widely used in South Asia for online textual content. It lacks standard spelling and hence poses several normalization challenges during automatic language processing. In this article, we present a feature-based clustering framework for the lexical normalization of Roman Urdu corpora, which includes a phonetic algorithm UrduPhone, a string matching component, a feature-based similarity function, and a clustering algorithm Lex-Var. UrduPhone encodes Roman Urdu strings to their pronunciation-based representations. The string matching component handles character-level variations that occur when writing Urdu using Roman script.
Tasks	Lexical Normalization
Published	2020-03-31
URL	https://arxiv.org/abs/2004.00088v1
PDF	https://arxiv.org/pdf/2004.00088v1.pdf
PWC	https://paperswithcode.com/paper/a-clustering-framework-for-lexical
Repo	https://github.com/abdulrafae/normalization
Framework	none
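
The role of a phonetic encoding in clustering spelling variants, as described in the abstract above, can be illustrated with a toy key function. This is not UrduPhone or Lex-Var: the letter groups in `GROUPS` are hypothetical and chosen only for the example, and the clustering simply groups words with identical keys.

```python
from collections import defaultdict

# Hypothetical letter groups: letters in one group are treated as sounding alike.
GROUPS = {
    "bp": "1", "ckqg": "2", "dt": "3", "sz": "4",
    "fvw": "5", "mn": "6", "lr": "7", "jy": "8",
}
CODE = {ch: digit for letters, digit in GROUPS.items() for ch in letters}

def phonetic_key(word):
    """Encode a word as its consonant-class sequence, dropping vowels and repeats."""
    key = []
    for ch in word.lower():
        digit = CODE.get(ch)          # vowels, 'h' and unknown characters map to None
        if digit and (not key or key[-1] != digit):
            key.append(digit)
    return "".join(key)

def cluster(words):
    """Group spelling variants that share the same phonetic key."""
    groups = defaultdict(list)
    for w in words:
        groups[phonetic_key(w)].append(w)
    return dict(groups)

variants = ["zindagi", "zindagee", "zindgi", "kitab", "kitaab"]
clusters = cluster(variants)
```

Exact key matching is the crudest possible grouping; the framework in the abstract instead combines the phonetic encoding with string matching and a feature-based similarity function.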

QRMine: A python package for triangulation in Grounded Theory


Title	QRMine: A python package for triangulation in Grounded Theory
Authors	Bell Raj Eapen, Norm Archer, Kamran Sartipi
Abstract	Grounded theory (GT) is a qualitative research method for building theory grounded in data. GT uses textual and numeric data and follows various stages of coding or tagging data for sense-making, such as open coding and selective coding. Machine Learning (ML) techniques, including natural language processing (NLP), can assist the researchers in the coding process. Triangulation is the process of combining various types of data. ML can facilitate deriving insights from numerical data for corroborating findings from the textual interview transcripts. We present an open-source python package (QRMine) that encapsulates various ML and NLP libraries to support coding and triangulation in GT. QRMine enables researchers to use these methods on their data with minimal effort. Researchers can install QRMine from the python package index (PyPI) and can contribute to its development. We believe that the concept of computational triangulation will make GT relevant in the realm of big data.
Tasks
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13519v1
PDF	https://arxiv.org/pdf/2003.13519v1.pdf
PWC	https://paperswithcode.com/paper/qrmine-a-python-package-for-triangulation-in
Repo	https://github.com/dermatologist/nlp-qrmine
Framework	none