October 21, 2019

3290 words 16 mins read

Paper Group AWR 13

A Novel Online Stacked Ensemble for Multi-Label Stream Classification. CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images. Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection. DeepType: Multilingual Entity Linking by Neural Type System Evolution. Robust Blind Deconvolution via Mirror Descent. …

A Novel Online Stacked Ensemble for Multi-Label Stream Classification

Title A Novel Online Stacked Ensemble for Multi-Label Stream Classification
Authors Alican Büyükçakır, Hamed Bonab, Fazli Can
Abstract As data streams become more prevalent, the necessity for online algorithms that mine this transient and dynamic data becomes clearer. Multi-label data stream classification is a supervised learning problem where each instance in the data stream is classified into one or more of a pre-defined set of labels. Many methods have been proposed to tackle this problem, including but not limited to ensemble-based methods. Some of these ensemble-based methods are specifically designed to work with certain multi-label base classifiers; some others employ online bagging schemes to build their ensembles. In this study, we introduce a novel online and dynamically-weighted stacked ensemble for multi-label classification, called GOOWE-ML, that utilizes spatial modeling to assign optimal weights to its component classifiers. Our model can be used with any existing incremental multi-label classification algorithm as its base classifier. We conduct experiments with 4 GOOWE-ML-based multi-label ensembles and 7 baseline models on 7 real-world datasets from diverse areas of interest. Our experiments show that GOOWE-ML ensembles yield consistently better results in terms of predictive performance in almost all of the datasets, compared to the other prominent ensemble models.
Tasks Multi-Label Classification
Published 2018-09-26
URL http://arxiv.org/abs/1809.09994v1
PDF http://arxiv.org/pdf/1809.09994v1.pdf
PWC https://paperswithcode.com/paper/a-novel-online-stacked-ensemble-for-multi
Repo https://github.com/abuyukcakir/gooweml
Framework none
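
The core of GOOWE-ML is treating component score vectors as points in a label space and choosing ensemble weights by a small least-squares fit over a sliding window. A minimal sketch of that idea (not the authors' code; the base classifiers with predict_proba/partial_fit, the window size, and the ridge term are assumptions):

```python
# A sketch of an online, dynamically-weighted stacked ensemble in the spirit
# of GOOWE-ML (see the repo above for the real implementation).
import numpy as np

class OnlineWeightedEnsemble:
    def __init__(self, components, window=100):
        self.components = components        # incremental multi-label classifiers
        self.w = np.ones(len(components)) / len(components)
        self.scores, self.truths = [], []   # sliding window of (scores, truth)
        self.window = window

    def predict(self, x):
        # Stack each component's label-confidence vector and mix with weights.
        S = np.vstack([c.predict_proba(x) for c in self.components])  # (K, L)
        return self.w @ S                   # combined score vector, length L

    def partial_fit(self, x, y):
        S = np.vstack([c.predict_proba(x) for c in self.components])
        self.scores.append(S); self.truths.append(y)
        if len(self.scores) > self.window:
            self.scores.pop(0); self.truths.pop(0)
        # Optimal weights minimize ||w @ S - y||^2 over the window: solve the
        # normal equations A w = d with A = sum S S^T and d = sum S y.
        A = sum(S @ S.T for S in self.scores)
        d = sum(S @ y for S, y in zip(self.scores, self.truths))
        self.w = np.linalg.solve(A + 1e-6 * np.eye(len(self.w)), d)
        for c in self.components:           # keep the base learners incremental
            c.partial_fit(x, y)
```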

CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images

Title CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images
Authors Sheng Guo, Weilin Huang, Haozhi Zhang, Chenfan Zhuang, Dengke Dong, Matthew R. Scott, Dinglong Huang
Abstract We present a simple yet efficient approach capable of training deep neural networks on large-scale weakly-supervised web images, which are crawled raw from the Internet by using text queries, without any human annotation. We develop a principled learning strategy by leveraging curriculum learning, with the goal of handling a massive amount of noisy labels and data imbalance effectively. We design a new learning curriculum by measuring the complexity of data using its distribution density in a feature space, and rank the complexity in an unsupervised manner. This allows for an efficient implementation of curriculum learning on large-scale web images, resulting in a high-performance CNN model, where the negative impact of noisy labels is reduced substantially. Importantly, we show by experiments that those images with highly noisy labels can surprisingly improve the generalization capability of the model, by serving as a manner of regularization. Our approaches obtain state-of-the-art performance on four benchmarks: WebVision, ImageNet, Clothing-1M and Food-101. With an ensemble of multiple models, we achieved a top-5 error rate of 5.2% on the WebVision challenge for 1000-category classification. This result was the top performance by a wide margin, outperforming second place by a nearly 50% relative error rate. Code and models are available at: https://github.com/MalongTech/CurriculumNet .
Tasks
Published 2018-08-03
URL http://arxiv.org/abs/1808.01097v4
PDF http://arxiv.org/pdf/1808.01097v4.pdf
PWC https://paperswithcode.com/paper/curriculumnet-weakly-supervised-learning-from
Repo https://github.com/guoshengcv/CurriculumNet
Framework none
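
The curriculum is driven by unsupervised density in feature space: dense regions of a class are treated as clean, sparse ones as noisy. A rough sketch of that ranking step (illustrative only, not the released code; the feature source, distance cutoff, and three-way split are assumptions):

```python
import numpy as np

def curriculum_subsets(features, n_subsets=3):
    """Rank one class's images by local density and split into subsets,
    from dense (likely clean labels) to sparse (likely noisy labels)."""
    # Pairwise Euclidean distances in the (e.g. CNN) feature space.
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    eps = np.percentile(d, 10)               # cutoff distance (assumed heuristic)
    density = (d < eps).sum(axis=1)          # neighbours within eps
    order = np.argsort(-density)             # densest (easiest) first
    return np.array_split(order, n_subsets)  # curriculum stages: easy -> hard

# Training then proceeds on subsets[0], then subsets[0] + subsets[1], etc.
```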

Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection

Title Learning Motion in Feature Space: Locally-Consistent Deformable Convolution Networks for Fine-Grained Action Detection
Authors Khoi-Nguyen C. Mac, Dhiraj Joshi, Raymond A. Yeh, Jinjun Xiong, Rogerio S. Feris, Minh N. Do
Abstract Fine-grained action detection is an important task with numerous applications in robotics and human-computer interaction. Existing methods typically utilize a two-stage approach including extraction of local spatio-temporal features followed by temporal modeling to capture long-term dependencies. While most recent papers have focused on the latter (long-temporal modeling), here, we focus on producing features capable of modeling fine-grained motion more efficiently. We propose a novel locally-consistent deformable convolution, which utilizes the change in receptive fields and enforces a local coherency constraint to capture motion information effectively. Our model jointly learns spatio-temporal features (instead of using independent spatial and temporal streams). The temporal component is learned from the feature space instead of pixel space, e.g. optical flow. The produced features can be flexibly used in conjunction with other long-temporal modeling networks, e.g. ST-CNN, DilatedTCN, and ED-TCN. Overall, our proposed approach robustly outperforms the original long-temporal models on two fine-grained action datasets: 50 Salads and GTEA, achieving F1 scores of 80.22% and 75.39% respectively.
Tasks Action Detection, Fine-Grained Action Detection, Optical Flow Estimation
Published 2018-11-21
URL https://arxiv.org/abs/1811.08815v5
PDF https://arxiv.org/pdf/1811.08815v5.pdf
PWC https://paperswithcode.com/paper/locally-consistent-deformable-convolution
Repo https://github.com/bkvie/Locally-Consistent-Deformable-Convolution
Framework pytorch

DeepType: Multilingual Entity Linking by Neural Type System Evolution

Title DeepType: Multilingual Entity Linking by Neural Type System Evolution
Authors Jonathan Raiman, Olivier Raiman
Abstract The wealth of structured (e.g. Wikidata) and unstructured data about the world available today presents an incredible opportunity for tomorrow’s Artificial Intelligence. So far, integration of these two different modalities is a difficult process, involving many decisions concerning how best to represent the information so that it will be captured or useful, and hand-labeling large amounts of data. DeepType overcomes this challenge by explicitly integrating symbolic information into the reasoning process of a neural network with a type system. First we construct a type system, and second, we use it to constrain the outputs of a neural network to respect the symbolic structure. We achieve this by reformulating the design problem into a mixed integer problem: create a type system and subsequently train a neural network with it. In this reformulation discrete variables select which parent-child relations from an ontology are types within the type system, while continuous variables control a classifier fit to the type system. The original problem cannot be solved exactly, so we propose a 2-step algorithm: 1) heuristic search or stochastic optimization over discrete variables that define a type system informed by an Oracle and a Learnability heuristic, 2) gradient descent to fit classifier parameters. We apply DeepType to the problem of Entity Linking on three standard datasets (i.e. WikiDisamb30, CoNLL (YAGO), TAC KBP 2010) and find that it outperforms all existing solutions by a wide margin, including approaches that rely on a human-designed type system or recent deep learning-based entity embeddings, while explicitly using symbolic information lets it integrate new entities without retraining.
Tasks Entity Embeddings, Entity Linking, Stochastic Optimization
Published 2018-02-03
URL http://arxiv.org/abs/1802.01021v1
PDF http://arxiv.org/pdf/1802.01021v1.pdf
PWC https://paperswithcode.com/paper/deeptype-multilingual-entity-linking-by
Repo https://github.com/openai/deeptype
Framework none
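
Step (1) of the two-step algorithm searches the discrete type choices guided by an Oracle. A toy greedy variant of such a search, to make the template concrete (a hedged illustration: the oracle, candidate relations, and stopping rule are placeholders, and the paper also considers stochastic optimization):

```python
def greedy_type_system(candidates, oracle_accuracy, max_types=20):
    """Greedily add the ontology relation whose inclusion most improves the
    oracle's disambiguation accuracy on held-out data."""
    chosen, best = [], oracle_accuracy(())
    while len(chosen) < max_types:
        gains = {r: oracle_accuracy(tuple(chosen) + (r,)) - best
                 for r in candidates if r not in chosen}
        if not gains:
            break
        r, g = max(gains.items(), key=lambda kv: kv[1])
        if g <= 0:               # no remaining relation helps any more
            break
        chosen.append(r)
        best += g
    return chosen                # step (2) then fits classifier parameters to it
```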

Robust Blind Deconvolution via Mirror Descent

Title Robust Blind Deconvolution via Mirror Descent
Authors Sathya N. Ravi, Ronak Mehta, Vikas Singh
Abstract We revisit the Blind Deconvolution problem with a focus on understanding its robustness and convergence properties. Provable robustness to noise and other perturbations is receiving recent interest in vision, from obtaining immunity to adversarial attacks to assessing and describing failure modes of algorithms in mission critical applications. Further, many blind deconvolution methods based on deep architectures internally make use of or optimize the basic formulation, so a clearer understanding of how this sub-module behaves, when it can be solved, and what noise injection it can tolerate is a first order requirement. We derive new insights into the theoretical underpinnings of blind deconvolution. The algorithm that emerges has nice convergence guarantees and is provably robust in a sense we formalize in the paper. Interestingly, these technical results play out very well in practice, where on standard datasets our algorithm yields results competitive with or superior to the state of the art. Keywords: blind deconvolution, robust continuous optimization
Tasks
Published 2018-03-21
URL http://arxiv.org/abs/1803.08137v1
PDF http://arxiv.org/pdf/1803.08137v1.pdf
PWC https://paperswithcode.com/paper/robust-blind-deconvolution-via-mirror-descent
Repo https://github.com/tianyishan/Blind_Deconvolution
Framework none
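
For readers unfamiliar with the template, a generic mirror descent update looks as follows (a sketch of the update family the paper builds on, not its exact algorithm; the mirror maps shown are standard textbook choices):

```python
import numpy as np

def mirror_descent(grad, x0, steps=100, lr=0.1,
                   mirror=lambda x: x, mirror_inv=lambda z: z):
    """Iterate x <- mirror_inv(mirror(x) - lr * grad(x)).
    With the identity mirror map this reduces to plain gradient descent;
    other maps adapt the geometry to the constraint set."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = mirror_inv(mirror(x) - lr * grad(x))
    return x

# The entropic mirror map on the probability simplex gives the exponentiated
# gradient update x <- x * exp(-lr * g) / Z:
softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()
eg_step = lambda x, g, lr=0.1: softmax(np.log(np.maximum(x, 1e-12)) - lr * g)
```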

McTorch, a manifold optimization library for deep learning

Title McTorch, a manifold optimization library for deep learning
Authors Mayank Meghwanshi, Pratik Jawanpuria, Anoop Kunchukuttan, Hiroyuki Kasai, Bamdev Mishra
Abstract In this paper, we introduce McTorch, a manifold optimization library for deep learning that extends PyTorch. It aims to lower the barrier for users wishing to use manifold constraints in deep learning applications, i.e., when the parameters are constrained to lie on a manifold. Such constraints include the popular orthogonality and rank constraints, and have been recently used in a number of applications in deep learning. McTorch follows PyTorch’s architecture and decouples manifold definitions and optimizers, i.e., once a new manifold is added it can be used with any existing optimizer and vice-versa. McTorch is available at https://github.com/mctorch .
Tasks
Published 2018-10-03
URL http://arxiv.org/abs/1810.01811v2
PDF http://arxiv.org/pdf/1810.01811v2.pdf
PWC https://paperswithcode.com/paper/mctorch-a-manifold-optimization-library-for
Repo https://github.com/mctorch/mctorch
Framework pytorch
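
To see what a manifold optimizer automates, here is a hand-rolled Riemannian-style step with a QR retraction that keeps a parameter on the Stiefel (orthogonality) manifold, in plain PyTorch (an illustration of the concept only, not McTorch's actual API; see the repo for that):

```python
import torch

# A point on the Stiefel manifold St(5, 2): a 5x2 matrix with orthonormal columns.
W = torch.linalg.qr(torch.randn(5, 2))[0].requires_grad_()

def retract(W, G, lr=0.1):
    """One step: move against the gradient, then map back onto the manifold
    with a QR retraction so the columns stay orthonormal."""
    with torch.no_grad():
        Q, R = torch.linalg.qr(W - lr * G)
        # Sign correction keeps the retraction well-defined.
        return (Q * torch.sign(torch.diagonal(R))).requires_grad_()

loss = (W.T @ W).trace()            # toy objective
loss.backward()
W = retract(W, W.grad)
assert torch.allclose(W.T @ W, torch.eye(2), atol=1e-5)  # constraint preserved
```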

ML-Net: multi-label classification of biomedical texts with deep neural networks

Title ML-Net: multi-label classification of biomedical texts with deep neural networks
Authors Jingcheng Du, Qingyu Chen, Yifan Peng, Yang Xiang, Cui Tao, Zhiyong Lu
Abstract In multi-label text classification, each textual document can be assigned one or more labels. Due to this nature, the multi-label text classification task is often considered more challenging than binary or multi-class text classification. As an important task with broad applications in biomedicine, such as assigning diagnosis codes, a number of different computational methods (e.g. training and combining binary classifiers for each label) have been proposed in recent years. However, many suffered from modest accuracy and efficiency, with only limited success in practical use. We propose ML-Net, a novel deep learning framework, for multi-label classification of biomedical texts. As an end-to-end system, ML-Net combines a label prediction network with an automated label count prediction mechanism to output an optimal set of labels, by leveraging both the predicted confidence score of each label and the contextual information in the target document. We evaluate ML-Net on three independent, publicly available corpora in two kinds of text genres: biomedical literature and clinical notes. For evaluation, example-based measures such as precision, recall and F-measure are used. ML-Net is compared with several competitive machine learning baseline models. Our benchmarking results show that ML-Net compares favorably to state-of-the-art methods in multi-label classification of biomedical texts. ML-Net is also shown to be robust when evaluated on different text genres in biomedicine. Unlike traditional machine learning methods, ML-Net does not require human effort in feature engineering, and it is a highly efficient and scalable approach to tasks with a large set of labels (there is no need to build individual classifiers for each separate label). Finally, ML-Net is able to dynamically estimate the label count based on the document context in a more systematic and accurate manner.
Tasks Feature Engineering, Multi-Label Classification, Multi-Label Classification Of Biomedical Texts, Multi-Label Text Classification, Text Classification
Published 2018-11-13
URL http://arxiv.org/abs/1811.05475v2
PDF http://arxiv.org/pdf/1811.05475v2.pdf
PWC https://paperswithcode.com/paper/ml-net-multi-label-classification-of
Repo https://github.com/ncbi-nlp/ML_Net
Framework tf
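
The decision rule combines per-label confidences with a predicted label count. A minimal sketch with stubbed-out networks (the names and shapes are assumptions, not the released code):

```python
import numpy as np

def predict_label_set(doc_vector, label_net, count_net):
    scores = label_net(doc_vector)        # confidence per label, shape (L,)
    k = int(count_net(doc_vector))        # predicted number of labels
    top = np.argsort(-scores)[:k]         # keep the k most confident labels
    return sorted(top.tolist())

# e.g. with stub networks for a 5-label problem:
labels = predict_label_set(np.zeros(10),
                           label_net=lambda d: np.array([.9, .2, .7, .1, .4]),
                           count_net=lambda d: 2)
print(labels)  # [0, 2]
```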

Part-based Graph Convolutional Network for Action Recognition

Title Part-based Graph Convolutional Network for Action Recognition
Authors Kalpit Thakkar, P J Narayanan
Abstract Human actions comprise joint motions of articulated body parts, or ‘gestures’. The human skeleton is intuitively represented as a sparse graph with joints as nodes and natural connections between them as edges. Graph convolutional networks have been used to recognize actions from skeletal videos. We introduce a part-based graph convolutional network (PB-GCN) for this task, inspired by Deformable Part-based Models (DPMs). We divide the skeleton graph into four subgraphs with joints shared across them and learn a recognition model using a part-based graph convolutional network. We show that such a model improves recognition performance compared to a model using the entire skeleton graph. Instead of using 3D joint coordinates as node features, we show that using relative coordinates and temporal displacements boosts performance. Our model achieves state-of-the-art performance on two challenging benchmark datasets, NTURGB+D and HDM05, for skeletal action recognition.
Tasks Action Recognition In Videos, Skeleton Based Action Recognition, Temporal Action Localization
Published 2018-09-13
URL http://arxiv.org/abs/1809.04983v1
PDF http://arxiv.org/pdf/1809.04983v1.pdf
PWC https://paperswithcode.com/paper/part-based-graph-convolutional-network-for
Repo https://github.com/kalpitthakkar/pb-gcn
Framework pytorch
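
The feature change is easy to state concretely: replace raw 3D joint coordinates with coordinates relative to a part anchor plus frame-to-frame displacements. A sketch, with the anchor assignment and array shapes as illustrative assumptions:

```python
import numpy as np

def node_features(joints, part_anchor):
    """joints: (T, J, 3) joint positions over T frames; part_anchor: length-J
    index of each joint's part anchor (e.g. the shoulder for arm joints)."""
    relative = joints - joints[:, part_anchor, :]        # coords w.r.t. anchor
    displacement = np.diff(joints, axis=0,
                           prepend=joints[:1])           # motion between frames
    return np.concatenate([relative, displacement], -1)  # (T, J, 6) features
```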

ROUGE 2.0: Updated and Improved Measures for Evaluation of Summarization Tasks

Title ROUGE 2.0: Updated and Improved Measures for Evaluation of Summarization Tasks
Authors Kavita Ganesan
Abstract Evaluation of summarization tasks is crucial for determining the quality of machine-generated summaries. Over the last decade, ROUGE has become the standard automatic evaluation measure for summarization tasks. While ROUGE has been shown to be effective in capturing n-gram overlap between system and human-composed summaries, the existing ROUGE measures have several limitations in capturing synonymous concepts and coverage of topics. Thus, oftentimes ROUGE scores do not reflect the true quality of summaries and prevent multi-faceted evaluation of summaries (i.e. by topic, by overall content coverage, etc.). In this paper, we introduce ROUGE 2.0, which has several updated measures of ROUGE: ROUGE-N+Synonyms, ROUGE-Topic, ROUGE-Topic+Synonyms, ROUGE-TopicUniq and ROUGE-TopicUniq+Synonyms; all of which are improvements over the core ROUGE measures.
Tasks
Published 2018-03-05
URL http://arxiv.org/abs/1803.01937v1
PDF http://arxiv.org/pdf/1803.01937v1.pdf
PWC https://paperswithcode.com/paper/rouge-20-updated-and-improved-measures-for
Repo https://github.com/kavgan/ROUGE-2.0
Framework none
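
The flavor of ROUGE-N+Synonyms can be shown with a toy unigram scorer that accepts a match through a synonym set (a toy illustration only, not the released tool's implementation):

```python
def rouge1_syn(system, reference, synonyms):
    """synonyms: dict mapping a token to its set of accepted alternatives."""
    match = lambda s, r: s == r or r in synonyms.get(s, set())
    ref = list(reference)
    hits = 0
    for tok in system:
        for i, r in enumerate(ref):
            if match(tok, r):
                hits += 1; del ref[i]; break   # each reference token used once
    return hits / max(len(reference), 1)       # recall-oriented, like ROUGE

score = rouge1_syn("the car is quick".split(), "the automobile is fast".split(),
                   {"car": {"automobile"}, "quick": {"fast"}})
print(score)  # 1.0, where plain ROUGE-1 would score 0.5
```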

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

Title YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
Authors Ning Xu, Linjie Yang, Yuchen Fan, Jianchao Yang, Dingcheng Yue, Yuchen Liang, Brian Price, Scott Cohen, Thomas Huang
Abstract Learning long-term spatial-temporal features is critical for many video analysis tasks. However, existing video segmentation methods predominantly rely on static image segmentation techniques, and methods capturing temporal dependency for segmentation have to depend on pretrained optical flow models, leading to suboptimal solutions for the problem. End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips. To solve this problem, we build a new large-scale video object segmentation dataset called the YouTube Video Object Segmentation dataset (YouTube-VOS). Our dataset contains 3,252 YouTube video clips and 78 categories including common objects and human activities. This is by far the largest video object segmentation dataset to our knowledge, and we have released it at https://youtube-vos.org. Based on this dataset, we propose a novel sequence-to-sequence network to fully exploit long-term spatial-temporal information in videos for segmentation. We demonstrate that our method is able to achieve the best results on our YouTube-VOS test set and comparable results on DAVIS 2016 compared to the current state-of-the-art methods. Experiments show that the large-scale dataset is indeed a key factor in the success of our model.
Tasks Optical Flow Estimation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2018-09-03
URL http://arxiv.org/abs/1809.00461v1
PDF http://arxiv.org/pdf/1809.00461v1.pdf
PWC https://paperswithcode.com/paper/youtube-vos-sequence-to-sequence-video-object
Repo https://github.com/BehradToghi/ECCV_Youtube_VOS
Framework tf

Unsupervised Online Video Object Segmentation with Motion Property Understanding

Title Unsupervised Online Video Object Segmentation with Motion Property Understanding
Authors Tao Zhuo, Zhiyong Cheng, Peng Zhang, Yongkang Wong, Mohan Kankanhalli
Abstract Unsupervised video object segmentation aims to automatically segment moving objects over an unconstrained video without any user annotation. So far, only a few unsupervised online methods have been reported in the literature, and their performance is still far from satisfactory, because the complementary information from future frames cannot be processed in an online setting. To solve this challenging problem, in this paper, we propose a novel Unsupervised Online Video Object Segmentation (UOVOS) framework by construing the motion property to mean moving in concurrence with a generic object for segmented regions. By incorporating salient motion detection and object proposals, a pixel-wise fusion strategy is developed to effectively remove detection noise such as dynamic background and stationary objects. Furthermore, by leveraging the segmentation obtained from immediately preceding frames, a forward propagation algorithm is employed to deal with unreliable motion detection and object proposals. Experimental results on several benchmark datasets demonstrate the efficacy of the proposed method. Compared to state-of-the-art unsupervised online segmentation algorithms, the proposed method achieves an absolute gain of 6.2%. Moreover, our method achieves better performance than the best unsupervised offline algorithm on the DAVIS-2016 benchmark dataset. Our code is available on the project website: https://github.com/visiontao/uovos.
Tasks Motion Detection, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2018-10-09
URL https://arxiv.org/abs/1810.03783v2
PDF https://arxiv.org/pdf/1810.03783v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-online-video-object-segmentation
Repo https://github.com/VisionTao/UOVOS
Framework none
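
The fusion strategy is essentially pixel-wise agreement between motion saliency and objectness, with the previous frame's mask propagated forward. A sketch under assumed inputs and thresholds (not the released MATLAB code):

```python
import numpy as np

def fuse_masks(motion_saliency, objectness, prev_mask=None, tau=0.5):
    """motion_saliency, objectness: (H, W) maps in [0, 1]; prev_mask: boolean
    (H, W) segmentation of the previous frame. Returns a boolean mask."""
    moving = motion_saliency > tau
    objectlike = objectness > tau
    # Agreement suppresses dynamic background (motion without objectness)
    # and stationary objects (objectness without motion).
    mask = moving & objectlike
    if prev_mask is not None:
        # Forward propagation: trust the previous segmentation on object regions
        # where the current motion estimate is unreliable.
        mask |= objectlike & prev_mask
    return mask
```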

Encoding Robust Representation for Graph Generation

Title Encoding Robust Representation for Graph Generation
Authors Dongmian Zou, Gilad Lerman
Abstract Generative networks have made it possible to generate meaningful signals such as images and texts from simple noise. Recently, generative methods based on GAN and VAE were developed for graphs and graph signals. However, the mathematical properties of these methods are unclear, and training good generative models is difficult. This work proposes a graph generation model that uses a recent adaptation of Mallat’s scattering transform to graphs. The proposed model is naturally composed of an encoder and a decoder. The encoder is a Gaussianized graph scattering transform, which is robust to signal and graph manipulation. The decoder is a simple fully connected network that is adapted to specific tasks, such as link prediction, signal generation on graphs and full graph and signal generation. The training of our proposed system is efficient since it is only applied to the decoder and the hardware requirements are moderate. Numerical results demonstrate state-of-the-art performance of the proposed system for both link prediction and graph and signal generation.
Tasks Graph Generation, Link Prediction
Published 2018-09-28
URL http://arxiv.org/abs/1809.10851v2
PDF http://arxiv.org/pdf/1809.10851v2.pdf
PWC https://paperswithcode.com/paper/encoding-robust-representation-for-graph
Repo https://github.com/dmzou/SCAT
Framework tf

Unsupervised Keyphrase Extraction with Multipartite Graphs

Title Unsupervised Keyphrase Extraction with Multipartite Graphs
Authors Florian Boudin
Abstract We propose an unsupervised keyphrase extraction model that encodes topical information within a multipartite graph structure. Our model represents keyphrase candidates and topics in a single graph and exploits their mutually reinforcing relationship to improve candidate ranking. We further introduce a novel mechanism to incorporate keyphrase selection preferences into the model. Experiments conducted on three widely used datasets show significant improvements over state-of-the-art graph-based models.
Tasks
Published 2018-03-23
URL http://arxiv.org/abs/1803.08721v2
PDF http://arxiv.org/pdf/1803.08721v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-keyphrase-extraction-with
Repo https://github.com/boudinfl/pke
Framework none
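
The linked repo is the `pke` toolkit, where this model ships as MultipartiteRank. A usage sketch following the repo's documented example (argument values mirror the docs; check the README for the current API):

```python
import pke

extractor = pke.unsupervised.MultipartiteRank()
extractor.load_document(input='document.txt', language='en')
# Keyphrase candidates are noun-phrase-like sequences.
extractor.candidate_selection(pos={'NOUN', 'PROPN', 'ADJ'})
# alpha controls the keyphrase-selection-preference adjustment, threshold the
# topic clustering cut-off.
extractor.candidate_weighting(alpha=1.1, threshold=0.74, method='average')
print(extractor.get_n_best(n=10))  # [(keyphrase, score), ...]
```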

Multi-hop assortativities for networks classification

Title Multi-hop assortativities for networks classification
Authors Leonardo Gutierrez Gomez, Jean-Charles Delvenne
Abstract Several social, medical, engineering and biological challenges rely on discovering the functionality of networks from their structure and node metadata, when it is available. For example, in chemoinformatics one might want to detect whether a molecule is toxic based on structure and atomic types, or discover the research field of a scientific collaboration network. Existing techniques rely on counting or measuring structural patterns that are known to show large variations from network to network, such as the number of triangles, or the assortativity of node metadata. We introduce the concept of multi-hop assortativity, which captures the similarity of the nodes situated at the extremities of a randomly selected path of a given length. We show that multi-hop assortativity unifies various existing concepts and offers a versatile family of ‘fingerprints’ to characterize networks. These fingerprints in turn allow us to recover the functionality of a network with the help of the machine learning toolbox. Our method is evaluated empirically on established social and chemoinformatic network benchmarks. Results reveal that our assortativity-based features are competitive, providing highly accurate results that often outperform state-of-the-art methods on the network classification task.
Tasks
Published 2018-09-14
URL http://arxiv.org/abs/1809.06253v2
PDF http://arxiv.org/pdf/1809.06253v2.pdf
PWC https://paperswithcode.com/paper/multi-hop-assortativities-for-networks
Repo https://github.com/leoguti85/MaF
Framework none
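
The fingerprint itself is simple to estimate: correlate a node attribute across the endpoints of length-k walks, for several k. A sketch of such an estimator (an illustration of the concept using uniform random walks, not the authors' exact formulation):

```python
import random
import numpy as np
import networkx as nx

def multi_hop_assortativity(G, attr, hops, n_walks=10000, seed=0):
    """Correlation of a numeric node attribute between the two endpoints of
    random walks of length `hops`."""
    rng = random.Random(seed)
    nodes = list(G.nodes)
    xs, ys = [], []
    for _ in range(n_walks):
        u = v = rng.choice(nodes)
        for _ in range(hops):
            nbrs = list(G.neighbors(v))
            if not nbrs:               # dead end on isolated nodes
                break
            v = rng.choice(nbrs)
        xs.append(G.nodes[u][attr])
        ys.append(G.nodes[v][attr])
    return float(np.corrcoef(xs, ys)[0, 1])

# A network's 'fingerprint' is the vector over several hop lengths, which is
# then fed to a standard classifier:
G = nx.erdos_renyi_graph(100, 0.05, seed=0)
nx.set_node_attributes(G, {n: float(G.degree(n)) for n in G}, 'value')
print([round(multi_hop_assortativity(G, 'value', k), 3) for k in (1, 2, 3)])
```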

Self-produced Guidance for Weakly-supervised Object Localization

Title Self-produced Guidance for Weakly-supervised Object Localization
Authors Xiaolin Zhang, Yunchao Wei, Guoliang Kang, Yi Yang, Thomas Huang
Abstract Weakly supervised methods usually generate localization results based on attention maps produced by classification networks. However, the attention maps exhibit only the most discriminative parts of the object, which are small and sparse. We propose to generate Self-produced Guidance (SPG) masks which separate the foreground (the object of interest) from the background, to provide the classification networks with spatial correlation information about pixels. A stagewise approach is proposed in which the highly confident object regions within attention maps are used to progressively learn the SPG masks. The masks are then used as an auxiliary pixel-level supervision to facilitate the training of classification networks. Extensive experiments on ILSVRC demonstrate that SPG is effective in producing high-quality object localization maps. In particular, the proposed SPG achieves a Top-1 localization error rate of 43.83% on the ILSVRC validation set, which is a new state-of-the-art error rate.
Tasks Object Localization, Weakly-Supervised Object Localization
Published 2018-07-24
URL http://arxiv.org/abs/1807.08902v2
PDF http://arxiv.org/pdf/1807.08902v2.pdf
PWC https://paperswithcode.com/paper/self-produced-guidance-for-weakly-supervised
Repo https://github.com/xiaomengyc/SPG
Framework pytorch
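
Producing an SPG-style mask from an attention map amounts to keeping only the confident extremes. A sketch, with the two thresholds as illustrative assumptions (see the repo for the authors' implementation):

```python
import numpy as np

def spg_mask(attention, fg_thresh=0.7, bg_thresh=0.1):
    """attention: (H, W) map in [0, 1]. Returns 1=foreground, 0=background,
    255=undefined (ignored by the auxiliary pixel-level loss)."""
    mask = np.full(attention.shape, 255, dtype=np.uint8)  # uncertain regions
    mask[attention >= fg_thresh] = 1                      # confident object
    mask[attention <= bg_thresh] = 0                      # confident background
    return mask   # used as auxiliary supervision for the next SPG stage
```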