February 1, 2020


Paper Group AWR 140

Attributed Network Embedding via Subspace Discovery. Recursive Cascaded Networks for Unsupervised Medical Image Registration. Learning Logistic Circuits. On the number of k-skip-n-grams. RANet: Ranking Attention Network for Fast Video Object Segmentation. Diversity with Cooperation: Ensemble Methods for Few-Shot Classification. Deep convolutional n …

Attributed Network Embedding via Subspace Discovery


Title	Attributed Network Embedding via Subspace Discovery
Authors	Daokun Zhang, Jie Yin, Xingquan Zhu, Chengqi Zhang
Abstract	Network embedding aims to learn latent, low-dimensional vector representations of network nodes that are effective in supporting various network analytic tasks. While prior work on network embedding focuses primarily on preserving network topology structure to learn node representations, recently proposed attributed network embedding algorithms attempt to integrate rich node content information with network topological structure to enhance the quality of network embedding. In reality, networks often have sparse content, incomplete node attributes, and a discrepancy between the node attribute feature space and the network structure space, which severely deteriorates the performance of existing methods. In this paper, we propose a unified framework for attributed network embedding, attri2vec, that learns node embeddings by discovering a latent node attribute subspace via a network structure guided transformation performed on the original attribute space. The resultant latent subspace can respect network structure in a more consistent way towards learning high-quality node representations. We formulate an optimization problem which is solved by an efficient stochastic gradient descent algorithm, with time complexity linear in the number of nodes. We investigate a series of linear and non-linear transformations performed on node attributes and empirically validate their effectiveness on various types of networks. Another advantage of attri2vec is its ability to solve out-of-sample problems, where embeddings of newly arriving nodes can be inferred from their node attributes through the learned mapping function. Experiments on various types of networks confirm that attri2vec is superior to state-of-the-art baselines for node classification, node clustering, and out-of-sample link prediction tasks. The source code of this paper is available at https://github.com/daokunzhang/attri2vec.
Tasks	Link Prediction, Network Embedding, Node Classification
Published	2019-01-14
URL	https://arxiv.org/abs/1901.04095v2
PDF	https://arxiv.org/pdf/1901.04095v2.pdf
PWC	https://paperswithcode.com/paper/attributed-network-embedding-via-subspace
Repo	https://github.com/daokunzhang/attri2vec
Framework	none

Recursive Cascaded Networks for Unsupervised Medical Image Registration


Title	Recursive Cascaded Networks for Unsupervised Medical Image Registration
Authors	Shengyu Zhao, Yue Dong, Eric I-Chao Chang, Yan Xu
Abstract	We present recursive cascaded networks, a general architecture that enables learning deep cascades, for deformable image registration. The proposed architecture is simple in design and can be built on any base network. The moving image is warped successively by each cascade and finally aligned to the fixed image; this procedure is recursive in that every cascade learns to perform a progressive deformation for the current warped image. The entire system is end-to-end and jointly trained in an unsupervised manner. In addition, enabled by the recursive architecture, one cascade can be applied iteratively multiple times during testing, which approaches a better fit for each image pair. We evaluate our method on 3D medical images, where deformable registration is most commonly applied. We demonstrate that recursive cascaded networks achieve consistent, significant gains and outperform state-of-the-art methods. Performance keeps improving as more cascades are trained, and no limit has been observed. Code is available at https://github.com/microsoft/Recursive-Cascaded-Networks.
Tasks	Image Registration, Medical Image Registration
Published	2019-07-29
URL	https://arxiv.org/abs/1907.12353v3
PDF	https://arxiv.org/pdf/1907.12353v3.pdf
PWC	https://paperswithcode.com/paper/recursive-cascaded-networks-for-unsupervised
Repo	https://github.com/microsoft/Recursive-Cascaded-Networks
Framework	tf

Learning Logistic Circuits


Title	Learning Logistic Circuits
Authors	Yitao Liang, Guy Van den Broeck
Abstract	This paper proposes a new classification model called logistic circuits. On MNIST and Fashion datasets, our learning algorithm outperforms neural networks that have an order of magnitude more parameters. Yet, logistic circuits have a distinct origin in symbolic AI, forming a discriminative counterpart to probabilistic-logical circuits such as ACs, SPNs, and PSDDs. We show that parameter learning for logistic circuits is convex optimization, and that a simple local search algorithm can induce strong model structures from data.
Tasks
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10798v1
PDF	http://arxiv.org/pdf/1902.10798v1.pdf
PWC	https://paperswithcode.com/paper/learning-logistic-circuits
Repo	https://github.com/UCLA-StarAI/LogisticCircuit
Framework	none

On the number of k-skip-n-grams


Title	On the number of k-skip-n-grams
Authors	Dmytro Krasnoshtan
Abstract	The paper proves that the number of k-skip-n-grams for a corpus of size $L$ is $$\frac{Ln + n + k' - n^2 - nk'}{n} \cdot \binom{n-1+k'}{n-1}$$ where $k' = \min(L - n + 1, k)$.
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05407v1
PDF	https://arxiv.org/pdf/1905.05407v1.pdf
PWC	https://paperswithcode.com/paper/on-the-number-of-k-skip-n-grams
Repo	https://github.com/salvador-dali/k-skip-n-gram
Framework	none
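
The closed form in the abstract above can be checked numerically. The following is a minimal sketch, not the author's code: it assumes that a k-skip-n-gram is an ordered choice of n token positions with at most k skipped tokens in total between the first and last position, and that positions are counted rather than distinct token strings. The function names `count_closed_form` and `count_brute_force` are hypothetical.

```python
from itertools import combinations
from math import comb


def count_closed_form(L: int, n: int, k: int) -> int:
    """Closed-form count from the abstract; assumes L >= n >= 1 and k >= 0."""
    if not (L >= n >= 1 and k >= 0):
        raise ValueError("expected L >= n >= 1 and k >= 0")
    kp = min(L - n + 1, k)  # k' in the abstract
    numerator = L * n + n + kp - n * n - n * kp
    # The product is a count, so the division by n is exact.
    return numerator * comb(n - 1 + kp, n - 1) // n


def count_brute_force(L: int, n: int, k: int) -> int:
    """Count position tuples whose total number of skipped tokens is at most k."""
    total = 0
    for positions in combinations(range(L), n):
        # Tokens between the first and last position that were not selected.
        skipped = (positions[-1] - positions[0] + 1) - n
        if skipped <= k:
            total += 1
    return total


if __name__ == "__main__":
    print(count_closed_form(4, 2, 1), count_brute_force(4, 2, 1))
```

With $k = 0$ the formula reduces to $L - n + 1$, the usual number of contiguous n-grams, which is a quick sanity check on the assumed definition.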

RANet: Ranking Attention Network for Fast Video Object Segmentation


Title	RANet: Ranking Attention Network for Fast Video Object Segmentation
Authors	Ziqin Wang, Jun Xu, Li Liu, Fan Zhu, Ling Shao
Abstract	Although online learning (OL) techniques have boosted the performance of semi-supervised video object segmentation (VOS) methods, the huge time costs of OL greatly restrict their practicality. Matching based and propagation based methods run at a faster speed by avoiding OL techniques. However, they are limited by sub-optimal accuracy, due to mismatching and drifting problems. In this paper, we develop a real-time yet very accurate Ranking Attention Network (RANet) for VOS. Specifically, to integrate the insights of matching based and propagation based methods, we employ an encoder-decoder framework to learn pixel-level similarity and segmentation in an end-to-end manner. To better utilize the similarity maps, we propose a novel ranking attention module, which automatically ranks and selects these maps for fine-grained VOS performance. Experiments on DAVIS-16 and DAVIS-17 datasets show that our RANet achieves the best speed-accuracy trade-off, e.g., with 33 milliseconds per frame and J&F=85.5% on DAVIS-16. With OL, our RANet reaches J&F=87.1% on DAVIS-16, exceeding state-of-the-art VOS methods. The code can be found at https://github.com/Storife/RANet.
Tasks	Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06647v4
PDF	https://arxiv.org/pdf/1908.06647v4.pdf
PWC	https://paperswithcode.com/paper/ranet-ranking-attention-network-for-fast
Repo	https://github.com/Storife/RANet
Framework	pytorch

Diversity with Cooperation: Ensemble Methods for Few-Shot Classification


Title	Diversity with Cooperation: Ensemble Methods for Few-Shot Classification
Authors	Nikita Dvornik, Cordelia Schmid, Julien Mairal
Abstract	Few-shot classification consists of learning a predictive model that is able to effectively adapt to a new class, given only a few annotated samples. To solve this challenging problem, meta-learning has become a popular paradigm that advocates the ability to “learn to adapt”. Recent works have shown, however, that simple learning strategies without meta-learning could be competitive. In this paper, we go a step further and show that by addressing the fundamental high-variance issue of few-shot learning classifiers, it is possible to significantly outperform current meta-learning techniques. Our approach consists of designing an ensemble of deep networks to leverage the variance of the classifiers, and introducing new strategies to encourage the networks to cooperate, while encouraging prediction diversity. Evaluation is conducted on the mini-ImageNet and CUB datasets, where we show that even a single network obtained by distillation yields state-of-the-art results.
Tasks	Few-Shot Learning, Meta-Learning
Published	2019-03-27
URL	https://arxiv.org/abs/1903.11341v2
PDF	https://arxiv.org/pdf/1903.11341v2.pdf
PWC	https://paperswithcode.com/paper/diversity-with-cooperation-ensemble-methods
Repo	https://github.com/dvornikita/fewshot_ensemble
Framework	pytorch

Deep convolutional neural networks for multi-scale time-series classification and application to disruption prediction in fusion devices


Title	Deep convolutional neural networks for multi-scale time-series classification and application to disruption prediction in fusion devices
Authors	R. M. Churchill, the DIII-D team
Abstract	The multi-scale, multi-physics nature of fusion plasmas makes predicting plasma events challenging. Recent advances in deep convolutional neural network architectures (CNN) utilizing dilated convolutions enable accurate predictions on sequences which have long-range, multi-scale characteristics, such as the time-series generated by diagnostic instruments observing fusion plasmas. Here we apply this neural network architecture to the popular problem of disruption prediction in fusion tokamaks, utilizing raw data from a single diagnostic, the Electron Cyclotron Emission imaging (ECEi) diagnostic from the DIII-D tokamak. ECEi measures a fundamental plasma quantity (electron temperature) with high temporal resolution over the entire plasma discharge, making it sensitive to a number of potential pre-disruption markers with different temporal and spatial scales. Promising initial disruption prediction results are obtained by training a deep CNN with a large receptive field (~30k), achieving an $F_1$-score of ~91% on individual time-slices using only the ECEi data.
Tasks	Time Series, Time Series Classification
Published	2019-10-31
URL	https://arxiv.org/abs/1911.00149v2
PDF	https://arxiv.org/pdf/1911.00149v2.pdf
PWC	https://paperswithcode.com/paper/deep-convolutional-neural-networks-for-multi
Repo	https://github.com/rmchurch/disruptCNN
Framework	pytorch
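
The abstract above relies on dilated convolutions to reach a large receptive field. As a rough illustration of why dilation helps, the sketch below computes the receptive field of a stack of stride-1 1D convolutions with a shared kernel size. The layer configuration is an assumption chosen for illustration and is not taken from the paper; `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size: int, dilations: list[int]) -> int:
    """Receptive field of stacked stride-1 1D convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


if __name__ == "__main__":
    # Assumed example: kernel size 2 with dilations doubling at each layer.
    doubling = [2 ** i for i in range(10)]
    print(receptive_field(2, doubling))   # exponential growth with depth
    print(receptive_field(2, [1] * 10))   # linear growth without dilation
```

With doubling dilations the field grows exponentially in the number of layers, whereas undilated layers only add a constant per layer, which is the usual motivation for dilated stacks on long sequences.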

Attention Guided Graph Convolutional Networks for Relation Extraction


Title	Attention Guided Graph Convolutional Networks for Relation Extraction
Authors	Yan Zhang, Zhijiang Guo, Wei Lu
Abstract	Dependency trees convey rich structural information that is proven useful for extracting relations among entities in text. However, how to effectively make use of relevant information while ignoring irrelevant information from the dependency trees remains a challenging research question. Existing approaches employing rule based hard-pruning strategies for selecting relevant partial dependency structures may not always yield optimal results. In this work, we propose Attention Guided Graph Convolutional Networks (AGGCNs), a novel model which directly takes full dependency trees as inputs. Our model can be understood as a soft-pruning approach that automatically learns how to selectively attend to the relevant sub-structures useful for the relation extraction task. Extensive results on various tasks including cross-sentence n-ary relation extraction and large-scale sentence-level relation extraction show that our model is able to better leverage the structural information of the full dependency trees, giving significantly better results than previous approaches.
Tasks	Relation Extraction
Published	2019-06-18
URL	https://arxiv.org/abs/1906.07510v6
PDF	https://arxiv.org/pdf/1906.07510v6.pdf
PWC	https://paperswithcode.com/paper/attention-guided-graph-convolutional-networks
Repo	https://github.com/Cartus/AGGCN_TACRED
Framework	pytorch

WALL-E: An Efficient Reinforcement Learning Research Framework


Title	WALL-E: An Efficient Reinforcement Learning Research Framework
Authors	Tianbing Xu, Andrew Zhang, Liang Zhao
Abstract	There are two halves to RL systems: experience collection time and policy learning time. For a large number of samples in rollouts, experience collection time is the major bottleneck. Thus, it is necessary to speed up the rollout generation time with multi-process architecture support. Our work, dubbed WALL-E, utilizes multiple rollout samplers running in parallel to rapidly generate experience. Due to our parallel samplers, we experience not only faster convergence times, but also higher average reward thresholds. For example, on the MuJoCo HalfCheetah-v2 task, with $N = 10$ parallel sampler processes, we are able to achieve much higher average return than those from using only a single process architecture.
Tasks
Published	2019-01-18
URL	http://arxiv.org/abs/1901.06086v2
PDF	http://arxiv.org/pdf/1901.06086v2.pdf
PWC	https://paperswithcode.com/paper/wall-e-an-efficient-reinforcement-learning
Repo	https://github.com/harrybraviner/self_directed_rl
Framework	tf

Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function


Title	Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function
Authors	Yusu Qian, Urwa Muaz, Ben Zhang, Jae Won Hyun
Abstract	Gender bias exists in natural language datasets, which neural language models tend to learn, resulting in biased text generation. In this research, we propose a debiasing approach based on modifying the loss function. We introduce a new term to the loss function which attempts to equalize the probabilities of male and female words in the output. Using an array of bias evaluation metrics, we provide empirical evidence that our approach successfully mitigates gender bias in language models without increasing perplexity. In comparison to existing debiasing strategies (data augmentation and word embedding debiasing), our method performs better in several aspects, especially in reducing gender bias in occupation words. Finally, we introduce a combination of data augmentation and our approach, and show that it outperforms existing strategies in all bias evaluation metrics.
Tasks	Data Augmentation, Text Generation
Published	2019-05-30
URL	https://arxiv.org/abs/1905.12801v2
PDF	https://arxiv.org/pdf/1905.12801v2.pdf
PWC	https://paperswithcode.com/paper/reducing-gender-bias-in-word-level-language
Repo	https://github.com/sueqian6/Reducing-Gender-Bias-in-Word-Level-Language-Models-Using-A-Gender-Equalizing-Loss-Function
Framework	pytorch
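
The abstract above describes a loss term that equalizes the probabilities of male and female words. A minimal sketch of one such term follows. The abstract does not give the formula, so the mean absolute log-ratio over gendered word pairs, the weight `lam`, and the function names are all assumptions made for illustration.

```python
import math


def gender_equalizing_term(probs: dict[str, float],
                           gender_pairs: list[tuple[str, str]]) -> float:
    """Mean absolute log-ratio of predicted probabilities over (female, male) pairs.

    Assumed form: the term is zero exactly when each pair is equally likely.
    """
    total = 0.0
    for female, male in gender_pairs:
        total += abs(math.log(probs[female] / probs[male]))
    return total / len(gender_pairs)


def total_loss(cross_entropy: float, probs: dict[str, float],
               gender_pairs: list[tuple[str, str]], lam: float) -> float:
    """Language-model loss plus the weighted equalizing term (lam is hypothetical)."""
    return cross_entropy + lam * gender_equalizing_term(probs, gender_pairs)


if __name__ == "__main__":
    # Toy next-word probabilities, not real model output.
    probs = {"she": 0.10, "he": 0.20, "woman": 0.05, "man": 0.05}
    pairs = [("she", "he"), ("woman", "man")]
    print(gender_equalizing_term(probs, pairs))
    print(total_loss(3.2, probs, pairs, lam=0.5))
```

In this sketch, a larger `lam` trades language-modeling fit for more balanced probabilities on the listed pairs.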

Training language GANs from Scratch


Title	Training language GANs from Scratch
Authors	Cyprien de Masson d’Autume, Mihaela Rosca, Jack Rae, Shakir Mohamed
Abstract	Generative Adversarial Networks (GANs) enjoy great success at image generation, but have proven difficult to train in the domain of natural language. Challenges with gradient estimation, optimization instability, and mode collapse have led practitioners to resort to maximum likelihood pre-training, followed by small amounts of adversarial fine-tuning. The benefits of GAN fine-tuning for language generation are unclear, as the resulting models produce comparable or worse samples than traditional language models. We show it is in fact possible to train a language GAN from scratch – without maximum likelihood pre-training. We combine existing techniques such as large batch sizes, dense rewards and discriminator regularization to stabilize and improve language GANs. The resulting model, ScratchGAN, performs comparably to maximum likelihood training on EMNLP2017 News and WikiText-103 corpora according to quality and diversity metrics.
Tasks	Image Generation, Text Generation
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09922v2
PDF	https://arxiv.org/pdf/1905.09922v2.pdf
PWC	https://paperswithcode.com/paper/training-language-gans-from-scratch
Repo	https://github.com/yaushian/Unparalleled-Text-Summarization-using-GAN
Framework	tf

Relation Embedding with Dihedral Group in Knowledge Graph


Title	Relation Embedding with Dihedral Group in Knowledge Graph
Authors	Canran Xu, Ruijiang Li
Abstract	Link prediction is critical for the application of incomplete knowledge graphs (KGs) in downstream tasks. As a family of effective approaches for link prediction, embedding methods try to learn low-rank representations for both entities and relations such that the bilinear form defined therein is a well-behaved scoring function. Despite their successful performance, existing bilinear forms overlook the modeling of relation compositions, resulting in a lack of interpretability for reasoning on KGs. To fill this gap, we propose a new model called DihEdral, named after the dihedral symmetry group. This new model learns knowledge graph embeddings that can capture relation compositions by nature. Furthermore, our approach models the relation embeddings parametrized by discrete values, thereby decreasing the solution space drastically. Our experiments show that DihEdral is able to capture all desired properties such as (skew-) symmetry, inversion and (non-) Abelian composition, outperforms existing bilinear form based approaches, and is comparable to or better than deep learning models such as ConvE.
Tasks	Knowledge Graph Embeddings, Link Prediction
Published	2019-06-03
URL	https://arxiv.org/abs/1906.00687v1
PDF	https://arxiv.org/pdf/1906.00687v1.pdf
PWC	https://paperswithcode.com/paper/190600687
Repo	https://github.com/Chandrahasd/KnowledgeGraphEmbeddings
Framework	none
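
The abstract above names the dihedral group as the source of non-Abelian relation composition. The sketch below illustrates only that group-theoretic property, using the standard 2x2 matrix representation of rotations and reflections of a regular K-gon. How the model actually places these elements inside relation embeddings is not described in the abstract and is not shown here; all function names are hypothetical.

```python
import math


def rotation(m: int, K: int) -> list[list[float]]:
    """Rotation by 2*pi*m/K, one of the K rotations in the dihedral group."""
    t = 2 * math.pi * m / K
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


def reflection(m: int, K: int) -> list[list[float]]:
    """One of the K reflections in the dihedral group."""
    t = 2 * math.pi * m / K
    return [[math.cos(t), math.sin(t)], [math.sin(t), -math.cos(t)]]


def matmul(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Product of two 2x2 matrices, i.e. composition of two group elements."""
    return [[sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]


def close(a: list[list[float]], b: list[list[float]], tol: float = 1e-9) -> bool:
    """Entrywise comparison with a tolerance for floating-point error."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))


if __name__ == "__main__":
    K = 4
    r1, f0 = rotation(1, K), reflection(0, K)
    # Rotations commute with each other, but a rotation and a reflection do not.
    print(close(matmul(r1, rotation(2, K)), matmul(rotation(2, K), r1)))
    print(close(matmul(r1, f0), matmul(f0, r1)))
```

A reflection composed with itself gives the identity, which is one way a group element can represent a symmetric relation; order-dependent products correspond to the non-Abelian compositions mentioned in the abstract.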

AutoGrow: Automatic Layer Growing in Deep Convolutional Networks


Title	AutoGrow: Automatic Layer Growing in Deep Convolutional Networks
Authors	Wei Wen, Feng Yan, Hai Li
Abstract	Depth is a key component of Deep Neural Networks (DNNs); however, designing depth is heuristic and requires much human effort. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers if the growth improves the accuracy; otherwise, it stops growing and thus discovers the depth. We propose robust growing and stopping policies to generalize to different network architectures and datasets. Our experiments show that by applying the same policy to different network architectures, AutoGrow can always discover near-optimal depth on various datasets of MNIST, FashionMNIST, SVHN, CIFAR10, CIFAR100 and ImageNet. For example, in terms of accuracy-computation trade-off, AutoGrow discovers a better depth combination in ResNets than human experts. AutoGrow is also efficient: it discovers depth in a time similar to that of training a single DNN.
Tasks	Neural Architecture Search
Published	2019-06-07
URL	https://arxiv.org/abs/1906.02909v2
PDF	https://arxiv.org/pdf/1906.02909v2.pdf
PWC	https://paperswithcode.com/paper/autogrow-automatic-layer-growing-in-deep
Repo	https://github.com/wenwei202/autogrow
Framework	pytorch
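
The grow-then-stop idea in the abstract above can be sketched as a greedy loop. This is a simplified illustration under stated assumptions, not the paper's growing or stopping policy: `evaluate` stands in for training and validating a network of a given depth, and `min_gain`, `max_depth`, and the toy accuracy curve are hypothetical.

```python
from typing import Callable


def grow_depth(evaluate: Callable[[int], float], seed_depth: int = 1,
               max_depth: int = 50, min_gain: float = 0.0) -> tuple[int, float]:
    """Add one layer at a time while accuracy improves by more than min_gain."""
    depth = seed_depth
    best = evaluate(depth)
    while depth < max_depth:
        candidate = evaluate(depth + 1)
        if candidate - best <= min_gain:
            break  # growth no longer helps, so the current depth is kept
        depth += 1
        best = candidate
    return depth, best


def toy_accuracy(depth: int) -> float:
    """Stand-in for validation accuracy: peaks at depth 6 (made-up curve)."""
    return 0.9 - 0.01 * (depth - 6) ** 2


if __name__ == "__main__":
    print(grow_depth(toy_accuracy))
```

A greedy loop like this stops at the first depth where an extra layer does not help, so it can miss a better depth beyond a plateau; the abstract's "robust growing and stopping policies" presumably address such cases, though the details are not given here.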

A Flexible Framework for Anomaly Detection via Dimensionality Reduction


Title	A Flexible Framework for Anomaly Detection via Dimensionality Reduction
Authors	Alireza Vafaei Sadr, Bruce A. Bassett, Martin Kunz
Abstract	Anomaly detection is challenging, especially for large datasets in high dimensions. Here we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. We release DRAMA, a general Python package that implements this framework with a wide range of built-in options. We test DRAMA on a wide variety of simulated and real datasets, in up to 3000 dimensions, and find it robust and highly competitive with commonly-used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning and highly unbalanced datasets.
Tasks	Active Learning, Anomaly Detection, Dimensionality Reduction
Published	2019-09-09
URL	https://arxiv.org/abs/1909.04060v1
PDF	https://arxiv.org/pdf/1909.04060v1.pdf
PWC	https://paperswithcode.com/paper/a-flexible-framework-for-anomaly-detection
Repo	https://github.com/vafaei-ar/drama
Framework	none

A Question-Entailment Approach to Question Answering


Title	A Question-Entailment Approach to Question Answering
Authors	Asma Ben Abacha, Dina Demner-Fushman
Abstract	One of the challenges in large-scale information retrieval (IR) is to develop fine-grained and domain-specific methods to answer natural language questions. Despite the availability of numerous sources and datasets for answer retrieval, Question Answering (QA) remains a challenging problem due to the difficulty of the question understanding and answer extraction tasks. One of the promising tracks investigated in QA is to map new questions to formerly answered questions that are 'similar'. In this paper, we propose a novel QA approach based on Recognizing Question Entailment (RQE) and we describe the QA system and resources that we built and evaluated on real medical questions. First, we compare machine learning and deep learning methods for RQE using different kinds of datasets, including textual inference, question similarity and entailment in both the open and clinical domains. Second, we combine IR models with the best RQE method to select entailed questions and rank the retrieved answers. To study the end-to-end QA approach, we built the MedQuAD collection of 47,457 question-answer pairs from trusted medical sources, which we introduce and share in the scope of this paper. Following the evaluation process used in TREC 2017 LiveQA, we find that our approach exceeds the best results of the medical task with a 29.8% increase over the best official score. The evaluation results also support the relevance of question entailment for QA and highlight the effectiveness of combining IR and RQE for future QA efforts. Our findings also show that relying on a restricted set of reliable answer sources can bring a substantial improvement in medical QA.
Tasks	Information Retrieval, Question Answering, Question Similarity
Published	2019-01-23
URL	http://arxiv.org/abs/1901.08079v1
PDF	http://arxiv.org/pdf/1901.08079v1.pdf
PWC	https://paperswithcode.com/paper/a-question-entailment-approach-to-question
Repo	https://github.com/abachaa/MedQuAD
Framework	none