Paper Group AWR 140
Attributed Network Embedding via Subspace Discovery
Title | Attributed Network Embedding via Subspace Discovery |
Authors | Daokun Zhang, Jie Yin, Xingquan Zhu, Chengqi Zhang |
Abstract | Network embedding aims to learn latent, low-dimensional vector representations of network nodes that are effective in supporting various network analytic tasks. While prior work on network embedding focuses primarily on preserving network topology structure to learn node representations, recently proposed attributed network embedding algorithms attempt to integrate rich node content information with network topological structure to enhance the quality of network embedding. In reality, networks often have sparse content, incomplete node attributes, and a discrepancy between the node attribute feature space and the network structure space, which severely degrades the performance of existing methods. In this paper, we propose a unified framework for attributed network embedding, attri2vec, that learns node embeddings by discovering a latent node attribute subspace via a network structure guided transformation performed on the original attribute space. The resultant latent subspace can respect network structure in a more consistent way towards learning high-quality node representations. We formulate an optimization problem which is solved by an efficient stochastic gradient descent algorithm, with time complexity linear in the number of nodes. We investigate a series of linear and non-linear transformations performed on node attributes and empirically validate their effectiveness on various types of networks. Another advantage of attri2vec is its ability to solve out-of-sample problems, where embeddings of newly arriving nodes can be inferred from their node attributes through the learned mapping function. Experiments on various types of networks confirm that attri2vec is superior to state-of-the-art baselines for node classification, node clustering, as well as out-of-sample link prediction tasks. The source code of this paper is available at https://github.com/daokunzhang/attri2vec. |
Tasks | Link Prediction, Network Embedding, Node Classification |
Published | 2019-01-14 |
URL | https://arxiv.org/abs/1901.04095v2, https://arxiv.org/pdf/1901.04095v2.pdf |
PWC | https://paperswithcode.com/paper/attributed-network-embedding-via-subspace |
Repo | https://github.com/daokunzhang/attri2vec |
Framework | none |
Recursive Cascaded Networks for Unsupervised Medical Image Registration
Title | Recursive Cascaded Networks for Unsupervised Medical Image Registration |
Authors | Shengyu Zhao, Yue Dong, Eric I-Chao Chang, Yan Xu |
Abstract | We present recursive cascaded networks, a general architecture that enables learning deep cascades, for deformable image registration. The proposed architecture is simple in design and can be built on any base network. The moving image is warped successively by each cascade and finally aligned to the fixed image; this procedure is recursive in that every cascade learns to perform a progressive deformation for the current warped image. The entire system is end-to-end and jointly trained in an unsupervised manner. In addition, enabled by the recursive architecture, one cascade can be applied iteratively multiple times during testing, which approaches a better fit for each image pair. We evaluate our method on 3D medical images, where deformable registration is most commonly applied. We demonstrate that recursive cascaded networks achieve consistent, significant gains and outperform state-of-the-art methods. Performance keeps improving as more cascades are trained, and no limit is observed. Code is available at https://github.com/microsoft/Recursive-Cascaded-Networks. |
Tasks | Image Registration, Medical Image Registration |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12353v3, https://arxiv.org/pdf/1907.12353v3.pdf |
PWC | https://paperswithcode.com/paper/recursive-cascaded-networks-for-unsupervised |
Repo | https://github.com/microsoft/Recursive-Cascaded-Networks |
Framework | tf |
Learning Logistic Circuits
Title | Learning Logistic Circuits |
Authors | Yitao Liang, Guy Van den Broeck |
Abstract | This paper proposes a new classification model called logistic circuits. On MNIST and Fashion datasets, our learning algorithm outperforms neural networks that have an order of magnitude more parameters. Yet, logistic circuits have a distinct origin in symbolic AI, forming a discriminative counterpart to probabilistic-logical circuits such as ACs, SPNs, and PSDDs. We show that parameter learning for logistic circuits is convex optimization, and that a simple local search algorithm can induce strong model structures from data. |
Tasks | |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10798v1, http://arxiv.org/pdf/1902.10798v1.pdf |
PWC | https://paperswithcode.com/paper/learning-logistic-circuits |
Repo | https://github.com/UCLA-StarAI/LogisticCircuit |
Framework | none |
On the number of k-skip-n-grams
Title | On the number of k-skip-n-grams |
Authors | Dmytro Krasnoshtan |
Abstract | The paper proves that the number of k-skip-n-grams for a corpus of size $L$ is $$\frac{Ln + n + k' - n^2 - nk'}{n} \cdot \binom{n-1+k'}{n-1}$$ where $k' = \min(L - n + 1, k)$. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05407v1, https://arxiv.org/pdf/1905.05407v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-number-of-k-skip-n-grams |
Repo | https://github.com/salvador-dali/k-skip-n-gram |
Framework | none |
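The closed form quoted in the k-skip-n-gram abstract above can be checked numerically. The sketch below is not taken from the linked repository; it assumes that the count is over position tuples in a corpus of $L$ tokens whose total number of skipped tokens is at most $k$, and the function names `count_closed_form` and `count_brute_force` are illustrative only.

```python
from itertools import combinations
from math import comb


def count_closed_form(L: int, n: int, k: int) -> int:
    """Evaluate the closed-form count quoted in the abstract."""
    if not (1 <= n <= L) or k < 0:
        raise ValueError("expected 1 <= n <= L and k >= 0")
    k_eff = min(L - n + 1, k)  # k' in the abstract
    numerator = L * n + n + k_eff - n * n - n * k_eff
    # Multiply before dividing so the arithmetic stays in exact integers.
    return numerator * comb(n - 1 + k_eff, n - 1) // n


def count_brute_force(L: int, n: int, k: int) -> int:
    """Count position tuples whose total number of skipped tokens is <= k.

    Assumption: an n-gram is identified by its token positions, so repeated
    words at different positions are counted separately.
    """
    total = 0
    for positions in combinations(range(L), n):
        skipped = positions[-1] - positions[0] - (n - 1)
        if skipped <= k:
            total += 1
    return total


print(count_closed_form(4, 2, 1))  # 5: three adjacent pairs plus two 1-skip pairs
```

With $k = 0$ the expression reduces to $L - n + 1$, the usual number of contiguous n-grams, and once $k \ge L - n$ it reduces to $\binom{L}{n}$.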
RANet: Ranking Attention Network for Fast Video Object Segmentation
Title | RANet: Ranking Attention Network for Fast Video Object Segmentation |
Authors | Ziqin Wang, Jun Xu, Li Liu, Fan Zhu, Ling Shao |
Abstract | Although online learning (OL) techniques have boosted the performance of semi-supervised video object segmentation (VOS) methods, the huge time costs of OL greatly restrict their practicality. Matching based and propagation based methods run at a faster speed by avoiding OL techniques. However, they are limited by sub-optimal accuracy, due to mismatching and drifting problems. In this paper, we develop a real-time yet very accurate Ranking Attention Network (RANet) for VOS. Specifically, to integrate the insights of matching based and propagation based methods, we employ an encoder-decoder framework to learn pixel-level similarity and segmentation in an end-to-end manner. To better utilize the similarity maps, we propose a novel ranking attention module, which automatically ranks and selects these maps for fine-grained VOS performance. Experiments on DAVIS-16 and DAVIS-17 datasets show that our RANet achieves the best speed-accuracy trade-off, e.g., with 33 milliseconds per frame and J&F=85.5% on DAVIS-16. With OL, our RANet reaches J&F=87.1% on DAVIS-16, exceeding state-of-the-art VOS methods. The code can be found at https://github.com/Storife/RANet. |
Tasks | Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06647v4, https://arxiv.org/pdf/1908.06647v4.pdf |
PWC | https://paperswithcode.com/paper/ranet-ranking-attention-network-for-fast |
Repo | https://github.com/Storife/RANet |
Framework | pytorch |
Diversity with Cooperation: Ensemble Methods for Few-Shot Classification
Title | Diversity with Cooperation: Ensemble Methods for Few-Shot Classification |
Authors | Nikita Dvornik, Cordelia Schmid, Julien Mairal |
Abstract | Few-shot classification consists of learning a predictive model that is able to effectively adapt to a new class, given only a few annotated samples. To solve this challenging problem, meta-learning has become a popular paradigm that advocates the ability to “learn to adapt”. Recent works have shown, however, that simple learning strategies without meta-learning could be competitive. In this paper, we go a step further and show that by addressing the fundamental high-variance issue of few-shot learning classifiers, it is possible to significantly outperform current meta-learning techniques. Our approach consists of designing an ensemble of deep networks to leverage the variance of the classifiers, and introducing new strategies to encourage the networks to cooperate, while encouraging prediction diversity. Evaluation is conducted on the mini-ImageNet and CUB datasets, where we show that even a single network obtained by distillation yields state-of-the-art results. |
Tasks | Few-Shot Learning, Meta-Learning |
Published | 2019-03-27 |
URL | https://arxiv.org/abs/1903.11341v2, https://arxiv.org/pdf/1903.11341v2.pdf |
PWC | https://paperswithcode.com/paper/diversity-with-cooperation-ensemble-methods |
Repo | https://github.com/dvornikita/fewshot_ensemble |
Framework | pytorch |
Deep convolutional neural networks for multi-scale time-series classification and application to disruption prediction in fusion devices
Title | Deep convolutional neural networks for multi-scale time-series classification and application to disruption prediction in fusion devices |
Authors | R. M. Churchill, the DIII-D team |
Abstract | The multi-scale, multi-physics nature of fusion plasmas makes predicting plasma events challenging. Recent advances in deep convolutional neural network architectures (CNN) utilizing dilated convolutions enable accurate predictions on sequences which have long-range, multi-scale characteristics, such as the time-series generated by diagnostic instruments observing fusion plasmas. Here we apply this neural network architecture to the popular problem of disruption prediction in fusion tokamaks, utilizing raw data from a single diagnostic, the Electron Cyclotron Emission imaging (ECEi) diagnostic from the DIII-D tokamak. ECEi measures a fundamental plasma quantity (electron temperature) with high temporal resolution over the entire plasma discharge, making it sensitive to a number of potential pre-disruption markers with different temporal and spatial scales. Promising initial disruption prediction results are obtained by training a deep CNN with a large receptive field (~30k), achieving an $F_1$-score of ~91% on individual time-slices using only the ECEi data. |
Tasks | Time Series, Time Series Classification |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1911.00149v2, https://arxiv.org/pdf/1911.00149v2.pdf |
PWC | https://paperswithcode.com/paper/deep-convolutional-neural-networks-for-multi |
Repo | https://github.com/rmchurch/disruptCNN |
Framework | pytorch |
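The disruption-prediction abstract above attributes its large receptive field (~30k) to dilated convolutions. As a rough illustration of how dilation produces fields of that size, the sketch below computes the receptive field of a stack of stride-1 1D convolutions. The kernel size and dilation schedule are hypothetical, chosen only to land near the quoted order of magnitude; the abstract does not state the configuration actually used.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked stride-1 1D convolutions.

    Each layer with kernel size K and dilation d widens the field by (K - 1) * d.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Hypothetical schedule: kernel size 2, dilation doubling over 15 layers.
doubling = [2 ** i for i in range(15)]
print(receptive_field(2, doubling))  # 32768

# The same 15 layers without dilation cover only 16 time steps.
print(receptive_field(2, [1] * 15))  # 16
```

The comparison shows why dilation matters for long sequences: the field grows exponentially with depth under a doubling schedule, but only linearly without it.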
Attention Guided Graph Convolutional Networks for Relation Extraction
Title | Attention Guided Graph Convolutional Networks for Relation Extraction |
Authors | Yan Zhang, Zhijiang Guo, Wei Lu |
Abstract | Dependency trees convey rich structural information that is proven useful for extracting relations among entities in text. However, how to effectively make use of relevant information while ignoring irrelevant information from the dependency trees remains a challenging research question. Existing approaches employing rule based hard-pruning strategies for selecting relevant partial dependency structures may not always yield optimal results. In this work, we propose Attention Guided Graph Convolutional Networks (AGGCNs), a novel model which directly takes full dependency trees as inputs. Our model can be understood as a soft-pruning approach that automatically learns how to selectively attend to the relevant sub-structures useful for the relation extraction task. Extensive results on various tasks including cross-sentence n-ary relation extraction and large-scale sentence-level relation extraction show that our model is able to better leverage the structural information of the full dependency trees, giving significantly better results than previous approaches. |
Tasks | Relation Extraction |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07510v6, https://arxiv.org/pdf/1906.07510v6.pdf |
PWC | https://paperswithcode.com/paper/attention-guided-graph-convolutional-networks |
Repo | https://github.com/Cartus/AGGCN_TACRED |
Framework | pytorch |
WALL-E: An Efficient Reinforcement Learning Research Framework
Title | WALL-E: An Efficient Reinforcement Learning Research Framework |
Authors | Tianbing Xu, Andrew Zhang, Liang Zhao |
Abstract | There are two halves to RL systems: experience collection time and policy learning time. For a large number of samples in rollouts, experience collection time is the major bottleneck. Thus, it is necessary to speed up the rollout generation time with multi-process architecture support. Our work, dubbed WALL-E, utilizes multiple rollout samplers running in parallel to rapidly generate experience. Due to our parallel samplers, we experience not only faster convergence times, but also higher average reward thresholds. For example, on the MuJoCo HalfCheetah-v2 task, with $N = 10$ parallel sampler processes, we are able to achieve much higher average return than those from using only a single process architecture. |
Tasks | |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06086v2, http://arxiv.org/pdf/1901.06086v2.pdf |
PWC | https://paperswithcode.com/paper/wall-e-an-efficient-reinforcement-learning |
Repo | https://github.com/harrybraviner/self_directed_rl |
Framework | tf |
Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function
Title | Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function |
Authors | Yusu Qian, Urwa Muaz, Ben Zhang, Jae Won Hyun |
Abstract | Gender bias exists in natural language datasets, and neural language models tend to learn it, resulting in biased text generation. In this research, we propose a debiasing approach based on modifying the loss function. We introduce a new term to the loss function which attempts to equalize the probabilities of male and female words in the output. Using an array of bias evaluation metrics, we provide empirical evidence that our approach successfully mitigates gender bias in language models without increasing perplexity. In comparison to existing debiasing strategies, data augmentation, and word embedding debiasing, our method performs better in several aspects, especially in reducing gender bias in occupation words. Finally, we introduce a combination of data augmentation and our approach, and show that it outperforms existing strategies in all bias evaluation metrics. |
Tasks | Data Augmentation, Text Generation |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.12801v2, https://arxiv.org/pdf/1905.12801v2.pdf |
PWC | https://paperswithcode.com/paper/reducing-gender-bias-in-word-level-language |
Repo | https://github.com/sueqian6/Reducing-Gender-Bias-in-Word-Level-Language-Models-Using-A-Gender-Equalizing-Loss-Function |
Framework | pytorch |
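The gender-bias abstract above describes adding a loss term that tries to equalize the probabilities of male and female words. One plausible shape for such a term is sketched below. The exact form (a mean absolute log-ratio over gendered word pairs, scaled by a weight and added to the cross-entropy) is an assumption made for illustration and is not confirmed by this listing; the word pairs, probabilities, and function names are likewise hypothetical.

```python
import math
from typing import Dict, List, Tuple


def equalizing_penalty(probs: Dict[str, float], pairs: List[Tuple[str, str]]) -> float:
    """Mean absolute log-ratio of output probabilities over (female, male) pairs.

    The penalty is zero exactly when each pair receives equal probability.
    """
    ratios = [abs(math.log(probs[female] / probs[male])) for female, male in pairs]
    return sum(ratios) / len(ratios)


def total_loss(cross_entropy: float, probs: Dict[str, float],
               pairs: List[Tuple[str, str]], weight: float = 1.0) -> float:
    """Assumed combination: language-model loss plus weighted penalty."""
    return cross_entropy + weight * equalizing_penalty(probs, pairs)


# Hypothetical next-word probabilities for one context.
probs = {"she": 0.1, "he": 0.2, "woman": 0.05, "man": 0.05}
pairs = [("she", "he"), ("woman", "man")]
print(equalizing_penalty(probs, pairs))  # (log 2 + 0) / 2, about 0.347
```

A symmetric penalty like this one does not favor either word in a pair; it only pushes the two probabilities toward each other, which leaves the model free to assign both a high or a low value depending on context.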
Training language GANs from Scratch
Title | Training language GANs from Scratch |
Authors | Cyprien de Masson d’Autume, Mihaela Rosca, Jack Rae, Shakir Mohamed |
Abstract | Generative Adversarial Networks (GANs) enjoy great success at image generation, but have proven difficult to train in the domain of natural language. Challenges with gradient estimation, optimization instability, and mode collapse have led practitioners to resort to maximum likelihood pre-training, followed by small amounts of adversarial fine-tuning. The benefits of GAN fine-tuning for language generation are unclear, as the resulting models produce comparable or worse samples than traditional language models. We show it is in fact possible to train a language GAN from scratch – without maximum likelihood pre-training. We combine existing techniques such as large batch sizes, dense rewards and discriminator regularization to stabilize and improve language GANs. The resulting model, ScratchGAN, performs comparably to maximum likelihood training on EMNLP2017 News and WikiText-103 corpora according to quality and diversity metrics. |
Tasks | Image Generation, Text Generation |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09922v2, https://arxiv.org/pdf/1905.09922v2.pdf |
PWC | https://paperswithcode.com/paper/training-language-gans-from-scratch |
Repo | https://github.com/yaushian/Unparalleled-Text-Summarization-using-GAN |
Framework | tf |
Relation Embedding with Dihedral Group in Knowledge Graph
Title | Relation Embedding with Dihedral Group in Knowledge Graph |
Authors | Canran Xu, Ruijiang Li |
Abstract | Link prediction is critical for the application of incomplete knowledge graphs (KGs) in downstream tasks. As a family of effective approaches for link prediction, embedding methods try to learn low-rank representations for both entities and relations such that the bilinear form defined therein is a well-behaved scoring function. Despite their successful performance, existing bilinear forms overlook the modeling of relation compositions, resulting in a lack of interpretability for reasoning on KGs. To fill this gap, we propose a new model called DihEdral, named after the dihedral symmetry group. This new model learns knowledge graph embeddings that can capture relation compositions by nature. Furthermore, our approach models the relation embeddings parametrized by discrete values, thereby decreasing the solution space drastically. Our experiments show that DihEdral is able to capture all desired properties such as (skew-) symmetry, inversion and (non-) Abelian composition, and that it outperforms existing bilinear-form-based approaches and is comparable to or better than deep learning models such as ConvE. |
Tasks | Knowledge Graph Embeddings, Link Prediction |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00687v1, https://arxiv.org/pdf/1906.00687v1.pdf |
PWC | https://paperswithcode.com/paper/190600687 |
Repo | https://github.com/Chandrahasd/KnowledgeGraphEmbeddings |
Framework | none |
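The DihEdral abstract above relies on two properties of dihedral groups: the elements form a small discrete set, and composition is in general non-Abelian. The sketch below illustrates only those group properties, using the eight symmetries of a square written as 2x2 integer matrices. It is not the paper's model; the choice of the square's symmetry group and all names are illustrative assumptions.

```python
from typing import Iterable, Set, Tuple

Matrix = Tuple[Tuple[int, int], Tuple[int, int]]

IDENTITY: Matrix = ((1, 0), (0, 1))
ROT90: Matrix = ((0, -1), (1, 0))    # rotation by 90 degrees
REFLECT: Matrix = ((1, 0), (0, -1))  # reflection across the x-axis


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2))
        for i in range(2)
    )


def generate_group(generators: Iterable[Matrix]) -> Set[Matrix]:
    """Close the generators under multiplication, starting from the identity."""
    generators = list(generators)
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        current = frontier.pop()
        for g in generators:
            product = matmul(current, g)
            if product not in elements:
                elements.add(product)
                frontier.append(product)
    return elements


group = generate_group([ROT90, REFLECT])
print(len(group))  # 8: four rotations and four reflections

# Composition order matters, so the group is non-Abelian.
print(matmul(ROT90, REFLECT) == matmul(REFLECT, ROT90))  # False
```

Because the group is finite, a relation represented by one of its elements can take only a handful of values, which is the sense in which the abstract speaks of relation embeddings parametrized by discrete values.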
AutoGrow: Automatic Layer Growing in Deep Convolutional Networks
Title | AutoGrow: Automatic Layer Growing in Deep Convolutional Networks |
Authors | Wei Wen, Feng Yan, Hai Li |
Abstract | Depth is a key component of Deep Neural Networks (DNNs); however, designing depth is heuristic and requires much human effort. We propose AutoGrow to automate depth discovery in DNNs: starting from a shallow seed architecture, AutoGrow grows new layers if the growth improves the accuracy; otherwise, it stops growing and thus discovers the depth. We propose robust growing and stopping policies to generalize to different network architectures and datasets. Our experiments show that by applying the same policy to different network architectures, AutoGrow can always discover near-optimal depth on various datasets including MNIST, FashionMNIST, SVHN, CIFAR10, CIFAR100 and ImageNet. For example, in terms of accuracy-computation trade-off, AutoGrow discovers a better depth combination in ResNets than human experts. AutoGrow is also efficient: it discovers depth in a time similar to that of training a single DNN. |
Tasks | Neural Architecture Search |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.02909v2, https://arxiv.org/pdf/1906.02909v2.pdf |
PWC | https://paperswithcode.com/paper/autogrow-automatic-layer-growing-in-deep |
Repo | https://github.com/wenwei202/autogrow |
Framework | pytorch |
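The AutoGrow abstract above describes a grow-then-stop loop: add layers while accuracy improves, and stop otherwise. The sketch below shows that control flow in its simplest greedy form. It is a stand-in under stated assumptions, not the paper's growing and stopping policies, which the abstract does not detail; `evaluate` is a hypothetical callback that would train and score a network of the given depth, replaced here by a toy function.

```python
from typing import Callable, Tuple


def grow_depth(evaluate: Callable[[int], float], seed_depth: int = 1,
               max_depth: int = 50, min_gain: float = 0.0) -> Tuple[int, float]:
    """Greedily deepen a network while one more layer improves accuracy.

    Stops as soon as the gain from an extra layer is not above min_gain,
    or when max_depth is reached.
    """
    depth = seed_depth
    best = evaluate(depth)
    while depth < max_depth:
        candidate = evaluate(depth + 1)
        if candidate - best <= min_gain:
            break  # growth no longer helps, so the current depth is kept
        depth += 1
        best = candidate
    return depth, best


def toy_accuracy(depth: int) -> float:
    """Toy stand-in for training: accuracy peaks at depth 6, then declines."""
    return 0.9 - 0.02 * abs(depth - 6)


print(grow_depth(toy_accuracy)[0])  # 6
```

A greedy rule like this can stop early if accuracy dips temporarily before rising again, which is one reason a practical stopping policy may need more tolerance than a single comparison.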
A Flexible Framework for Anomaly Detection via Dimensionality Reduction
Title | A Flexible Framework for Anomaly Detection via Dimensionality Reduction |
Authors | Alireza Vafaei Sadr, Bruce A. Bassett, Martin Kunz |
Abstract | Anomaly detection is challenging, especially for large datasets in high dimensions. Here we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. We release DRAMA, a general python package that implements the general framework with a wide range of built-in options. We test DRAMA on a wide variety of simulated and real datasets, in up to 3000 dimensions, and find it robust and highly competitive with commonly-used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning and highly unbalanced datasets. |
Tasks | Active Learning, Anomaly Detection, Dimensionality Reduction |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04060v1, https://arxiv.org/pdf/1909.04060v1.pdf |
PWC | https://paperswithcode.com/paper/a-flexible-framework-for-anomaly-detection |
Repo | https://github.com/vafaei-ar/drama |
Framework | none |
A Question-Entailment Approach to Question Answering
Title | A Question-Entailment Approach to Question Answering |
Authors | Asma Ben Abacha, Dina Demner-Fushman |
Abstract | One of the challenges in large-scale information retrieval (IR) is to develop fine-grained and domain-specific methods to answer natural language questions. Despite the availability of numerous sources and datasets for answer retrieval, Question Answering (QA) remains a challenging problem due to the difficulty of the question understanding and answer extraction tasks. One of the promising tracks investigated in QA is to map new questions to formerly answered questions that are 'similar'. In this paper, we propose a novel QA approach based on Recognizing Question Entailment (RQE) and we describe the QA system and resources that we built and evaluated on real medical questions. First, we compare machine learning and deep learning methods for RQE using different kinds of datasets, including textual inference, question similarity and entailment in both the open and clinical domains. Second, we combine IR models with the best RQE method to select entailed questions and rank the retrieved answers. To study the end-to-end QA approach, we built the MedQuAD collection of 47,457 question-answer pairs from trusted medical sources, that we introduce and share in the scope of this paper. Following the evaluation process used in TREC 2017 LiveQA, we find that our approach exceeds the best results of the medical task with a 29.8% increase over the best official score. The evaluation results also support the relevance of question entailment for QA and highlight the effectiveness of combining IR and RQE for future QA efforts. Our findings also show that relying on a restricted set of reliable answer sources can bring a substantial improvement in medical QA. |
Tasks | Information Retrieval, Question Answering, Question Similarity |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.08079v1, http://arxiv.org/pdf/1901.08079v1.pdf |
PWC | https://paperswithcode.com/paper/a-question-entailment-approach-to-question |
Repo | https://github.com/abachaa/MedQuAD |
Framework | none |