January 30, 2020

Paper Group ANR 339

Interaction-and-Aggregation Network for Person Re-identification. UrbanRhythm: Revealing Urban Dynamics Hidden in Mobility Data. K-BERT: Enabling Language Representation with Knowledge Graph. Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference. Attentive Spatio-Temporal Representation Learning for Diving Classification. …

Interaction-and-Aggregation Network for Person Re-identification

Title Interaction-and-Aggregation Network for Person Re-identification
Authors Ruibing Hou, Bingpeng Ma, Hong Chang, Xinqian Gu, Shiguang Shan, Xilin Chen
Abstract Person re-identification (reID) benefits greatly from deep convolutional neural networks (CNNs) which learn robust feature embeddings. However, CNNs are inherently limited in modeling the large variations in person pose and scale due to their fixed geometric structures. In this paper, we propose a novel network structure, Interaction-and-Aggregation (IA), to enhance the feature representation capability of CNNs. First, a Spatial IA (SIA) module is introduced. It models the interdependencies between spatial features and then aggregates the correlated features corresponding to the same body parts. Unlike CNNs, which extract features from fixed rectangular regions, SIA can adaptively determine the receptive fields according to the input person pose and scale. Second, we introduce a Channel IA (CIA) module, which selectively aggregates channel features to enhance the feature representation, especially for small-scale visual cues. Further, an IA network can be constructed by inserting IA blocks into CNNs at any depth. We validate the effectiveness of our model for person reID by demonstrating its superiority over state-of-the-art methods on three benchmark datasets.
Tasks Person Re-Identification
Published 2019-07-19
URL https://arxiv.org/abs/1907.08435v1
PDF https://arxiv.org/pdf/1907.08435v1.pdf
PWC https://paperswithcode.com/paper/interaction-and-aggregation-network-for-1
Repo
Framework

UrbanRhythm: Revealing Urban Dynamics Hidden in Mobility Data

Title UrbanRhythm: Revealing Urban Dynamics Hidden in Mobility Data
Authors Sirui Song, Tong Xia, Depeng Jin, Pan Hui, Yong Li
Abstract Understanding urban dynamics, i.e., how the types and intensity of urban residents’ activities in the city change over time, is urgently needed for building an efficient and livable city. Nonetheless, this is challenging due to the expanding urban population and the complicated spatial distribution of residents. In this paper, we propose a novel system, UrbanRhythm, to reveal the urban dynamics hidden in human mobility data. UrbanRhythm addresses three questions: 1) What mobility feature should be used to represent residents’ high-dimensional activities in the city? 2) What are the basic components of urban dynamics? 3) What are the long-term periodicity and short-term regularity of urban dynamics? In UrbanRhythm, we extract three mobility attributes (staying, leaving, and arriving) and use an image processing method, the Saak transform, to calculate the mobility distribution feature. For the second question, several city states, such as sleeping states and working states, are identified by hierarchical clustering as the basic components of urban dynamics. We further characterize the urban dynamics as the transformation of city states along the time axis. For the third question, we directly observe the long-term periodicity of urban dynamics from visualization. Then, for the short-term regularity, we design a novel motif analysis method to discover motifs as well as their hierarchical relationships. We evaluate our proposed system on two real-life datasets and validate the results according to App usage records. This study sheds light on urban dynamics hidden in human mobility and can further pave the way for more complicated mobility behavior modeling and deeper urban understanding.
Tasks
Published 2019-11-03
URL https://arxiv.org/abs/1911.05493v1
PDF https://arxiv.org/pdf/1911.05493v1.pdf
PWC https://paperswithcode.com/paper/urbanrhythm-revealing-urban-dynamics-hidden
Repo
Framework

K-BERT: Enabling Language Representation with Knowledge Graph

Title K-BERT: Enabling Language Representation with Knowledge Graph
Authors Weijie Liu, Peng Zhou, Zhe Zhao, Zhiruo Wang, Qi Ju, Haotang Deng, Ping Wang
Abstract Pre-trained language representation models, such as BERT, capture a general language representation from large-scale corpora, but lack domain-specific knowledge. When reading a domain text, experts make inferences with relevant knowledge. For machines to achieve this capability, we propose a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into the sentences as domain knowledge. However, incorporating too much knowledge may divert the sentence from its correct meaning, which is called the knowledge noise (KN) issue. To overcome KN, K-BERT introduces soft-position and a visible matrix to limit the impact of knowledge. Because it can load model parameters from the pre-trained BERT, K-BERT can easily inject domain knowledge into the model by being equipped with a KG, without pre-training by itself. Our investigation reveals promising results in twelve NLP tasks. Especially in domain-specific tasks (including finance, law, and medicine), K-BERT significantly outperforms BERT, which demonstrates that K-BERT is an excellent choice for solving the knowledge-driven problems that require experts.
Tasks Knowledge Graphs
Published 2019-09-17
URL https://arxiv.org/abs/1909.07606v1
PDF https://arxiv.org/pdf/1909.07606v1.pdf
PWC https://paperswithcode.com/paper/k-bert-enabling-language-representation-with
Repo
Framework
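
The abstract names two mechanisms, soft-position and a visible matrix, without spelling them out. The following is a minimal sketch of one plausible reading, not the authors' implementation: injected triple tokens continue the position count of the entity they attach to, and a token may only attend to tokens in the original sentence or in its own injected branch. The function name `inject`, the data layout, the one-branch-per-entity restriction, and the example triples are all assumptions made for illustration.

```python
def inject(sentence, branches):
    """Flatten a sentence plus injected knowledge into one token sequence.

    sentence: list of tokens.
    branches: dict mapping a sentence index to the tokens of one injected
              triple (relation, object) hanging off that entity.
              (Hypothetical layout; one branch per entity for simplicity.)
    Returns (tokens, soft_positions, visible) where visible[i][j] is True
    when token i is allowed to attend to token j.
    """
    tokens, soft_pos, branch_of = [], [], []
    for i, tok in enumerate(sentence):
        tokens.append(tok)
        soft_pos.append(i)          # sentence tokens keep their own index
        branch_of.append(None)      # None marks "part of the original sentence"
        entity_slot = len(tokens) - 1
        for offset, btok in enumerate(branches.get(i, []), start=1):
            tokens.append(btok)
            soft_pos.append(i + offset)     # continue counting from the entity
            branch_of.append(entity_slot)   # remember which entity owns it

    n = len(tokens)
    visible = [[False] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if branch_of[a] is None and branch_of[b] is None:
                visible[a][b] = True        # both in the original sentence
            elif branch_of[a] == branch_of[b]:
                visible[a][b] = True        # same injected branch
            elif branch_of[a] == b or branch_of[b] == a:
                visible[a][b] = True        # branch token and its own entity
    return tokens, soft_pos, visible


# Illustrative input; the triples are made up for the example.
tokens, soft_pos, visible = inject(
    ["Cook", "visits", "Beijing"],
    {0: ["CEO", "Apple"], 2: ["capital", "China"]},
)
```

In this toy input, "Apple" can see "Cook" but not "visits" or "China", so the injected facts cannot alter how unrelated words in the sentence are read. Several tokens share a soft position (for example "CEO" and "visits"), which is why the visibility mask is needed alongside the positions.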

Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference

Title Doubly Sparse: Sparse Mixture of Sparse Experts for Efficient Softmax Inference
Authors Shun Liao, Ting Chen, Tian Lin, Denny Zhou, Chong Wang
Abstract Computations for the softmax function are significantly expensive when the number of output classes is large. In this paper, we present a novel softmax inference speedup method, Doubly Sparse Softmax (DS-Softmax), that leverages a sparse mixture of sparse experts to efficiently retrieve top-k classes. Different from most existing methods that require and approximate a fixed softmax, our method is learning-based and can adapt softmax weights for a better inference speedup. In particular, our method learns a two-level hierarchy which divides the entire output class space into several partially overlapping experts. Each expert is sparse and only contains a subset of output classes. To find top-k classes, a sparse mixture enables us to find the most probable expert quickly, and the sparse expert enables us to search within a small-scale softmax. We empirically conduct evaluation on several real-world tasks, including neural machine translation, language modeling and image classification, and demonstrate that significant computation reductions can be achieved at no performance loss.
Tasks Image Classification, Language Modelling, Machine Translation
Published 2019-01-30
URL https://arxiv.org/abs/1901.10668v2
PDF https://arxiv.org/pdf/1901.10668v2.pdf
PWC https://paperswithcode.com/paper/doubly-sparse-sparse-mixture-of-sparse
Repo
Framework
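
The two-level lookup described in the abstract can be sketched as follows. This is an illustrative inference-time sketch only, assuming the gate and expert weights have already been learned; it does not cover how DS-Softmax trains or sparsifies them. The names `two_level_top_k`, `gate_weights`, and `experts`, and the toy numbers, are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    m = max(scores)                       # subtract the max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def two_level_top_k(hidden, gate_weights, experts, k):
    """Pick one expert, then rank only the classes that expert retains.

    gate_weights: one weight vector per expert (level 1).
    experts: one dict per expert mapping class id -> weight vector;
             experts may overlap and each holds a subset of all classes.
    """
    # Level 1: sparse mixture, keep only the most probable expert.
    gate_scores = [dot(w, hidden) for w in gate_weights]
    best = max(range(len(experts)), key=lambda i: gate_scores[i])

    # Level 2: small softmax over the chosen expert's classes only.
    class_ids = sorted(experts[best])
    probs = softmax([dot(experts[best][c], hidden) for c in class_ids])
    ranked = sorted(zip(class_ids, probs), key=lambda cp: cp[1], reverse=True)
    return best, ranked[:k]


# Toy example: 4 classes split across 2 partially overlapping experts.
hidden = [1.0, 0.0]
gate_weights = [[1.0, 0.0], [-1.0, 0.0]]
experts = [
    {0: [2.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]},
    {2: [0.0, 1.0], 3: [1.0, 1.0]},
]
best_expert, top = two_level_top_k(hidden, gate_weights, experts, k=2)
```

The cost of a query is one dot product per expert plus one per class in the selected expert, rather than one per class in the full output space.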

Attentive Spatio-Temporal Representation Learning for Diving Classification

Title Attentive Spatio-Temporal Representation Learning for Diving Classification
Authors Gagan Kanojia, Sudhakar Kumawat, Shanmuganathan Raman
Abstract Competitive diving is a well-recognized aquatic sport in which a person dives from a platform or a springboard into the water. Based on the acrobatics performed during the dive, diving is classified into a finite set of action classes which are standardized by FINA. In this work, we propose an attention-guided LSTM-based neural network architecture for the task of diving classification. The network takes the frames of a diving video as input and determines its class. We evaluate the performance of the proposed model on a recently introduced competitive diving dataset, Diving48. It contains over 18000 video clips which cover 48 classes of diving. The proposed model outperforms the classification accuracy of the state-of-the-art models in both 2D and 3D frameworks by 11.54% and 4.24%, respectively. We show that the network is able to localize the diver in the video frames during the dive without being trained with such supervision.
Tasks Representation Learning
Published 2019-04-30
URL http://arxiv.org/abs/1905.00050v1
PDF http://arxiv.org/pdf/1905.00050v1.pdf
PWC https://paperswithcode.com/paper/attentive-spatio-temporal-representation
Repo
Framework

Combination of Unified Embedding Model and Observed Features for Knowledge Graph Completion

Title Combination of Unified Embedding Model and Observed Features for Knowledge Graph Completion
Authors Takuma Ebisu, Ryutaro Ichise
Abstract Knowledge graphs are useful for many artificial intelligence tasks but often have missing data. Hence, a method for completing knowledge graphs is required. Existing approaches include embedding models, the Path Ranking Algorithm, and rule evaluation models. However, these approaches have limitations. For example, all the information is mixed and difficult to interpret in embedding models, and traditional rule evaluation models are basically slow. In this paper, we provide an integrated view of various approaches and combine them to compensate for their limitations. We first unify state-of-the-art embedding models, such as ComplEx and TorusE, reinterpreting them as a variant of translation-based models. Then, we show that these models utilize paths for link prediction and propose a method for evaluating rules based on this idea. Finally, we combine an embedding model and observed feature models to predict missing triples. This is possible because all of these models utilize paths. We also conduct experiments, including link prediction tasks, with standard datasets to evaluate our method and framework. The experiments show that our method can evaluate rules faster than traditional methods and that our framework outperforms state-of-the-art models in terms of link prediction.
Tasks Knowledge Graph Completion, Knowledge Graphs, Link Prediction
Published 2019-09-09
URL https://arxiv.org/abs/1909.03821v2
PDF https://arxiv.org/pdf/1909.03821v2.pdf
PWC https://paperswithcode.com/paper/combination-of-embedding-models-and
Repo
Framework

Resolution-independent meshes of super pixels

Title Resolution-independent meshes of super pixels
Authors Vitaliy Kurlin, Philip Smith
Abstract The over-segmentation into superpixels is an important preprocessing step to smartly compress the input size and speed up higher level tasks. A superpixel was traditionally considered as a small cluster of square-based pixels that have similar color intensities and are closely located to each other. In this discrete model the boundaries of superpixels often have irregular zigzags consisting of horizontal or vertical edges from a given pixel grid. However, digital images represent a continuous world, hence the following continuous model in the resolution-independent formulation can be more suitable for the reconstruction problem. Instead of uniting squares in a grid, a resolution-independent superpixel is defined as a polygon that has straight edges with any possible slope at subpixel resolution. The harder continuous version of the over-segmentation problem is to split an image into polygons and find a best (say, constant) color of each polygon so that the resulting colored mesh well approximates the given image. Such a mesh of polygons can be rendered at any higher resolution with all edges kept straight. We propose a fast conversion of any traditional superpixels into polygons and guarantee that their straight edges do not intersect. The meshes based on the superpixels SEEDS (Superpixels Extracted via Energy-Driven Sampling) and SLIC (Simple Linear Iterative Clustering) are compared with past meshes based on the Line Segment Detector. The experiments on the Berkeley Segmentation Database confirm that the new superpixels have more compact shapes than pixel-based superpixels.
Tasks
Published 2019-10-29
URL https://arxiv.org/abs/1910.13323v2
PDF https://arxiv.org/pdf/1910.13323v2.pdf
PWC https://paperswithcode.com/paper/resolution-independent-meshes-of-super-pixels
Repo
Framework

Effectiveness of Data-Driven Induction of Semantic Spaces and Traditional Classifiers for Sarcasm Detection

Title Effectiveness of Data-Driven Induction of Semantic Spaces and Traditional Classifiers for Sarcasm Detection
Authors Mattia Antonino Di Gangi, Giosué Lo Bosco, Giovanni Pilato
Abstract Irony and sarcasm are two complex linguistic phenomena that are widely used in everyday language and especially on social media, but they represent two serious issues for automated text understanding. Many labeled corpora have been extracted from several sources to accomplish this task, and it seems that sarcasm is conveyed in different ways for different domains. Nonetheless, very little work has been done on comparing different methods across the available corpora. Furthermore, each author usually collects and uses their own datasets to evaluate their own method. In this paper, we show that sarcasm detection can be tackled by applying classical machine learning algorithms to input texts sub-symbolically represented in a Latent Semantic space. The main consequence is that our studies establish both reference datasets and baselines for the sarcasm detection problem that could serve the scientific community to test newly proposed methods.
Tasks Sarcasm Detection
Published 2019-04-02
URL https://arxiv.org/abs/1904.04019v4
PDF https://arxiv.org/pdf/1904.04019v4.pdf
PWC https://paperswithcode.com/paper/effectiveness-of-data-driven-induction-of
Repo
Framework

Parametic Classification of Handvein Patterns Based on Texture Features

Title Parametic Classification of Handvein Patterns Based on Texture Features
Authors Harbi AlMahafzah, Mohammad Imran, Supreetha Gowda H. D.
Abstract In this paper, we have developed a biometric recognition system adopting the hand-based modality Handvein, which has a unique pattern for each individual and is impossible to counterfeit or fabricate because it is an internal feature. We have chosen feature extraction algorithms such as LBP (a visual descriptor), LPQ (a blur-insensitive texture operator), and Log-Gabor (a texture descriptor). We have chosen well-known classifiers such as KNN and SVM for classification. We have experimented and tabulated results of single-algorithm recognition rates for Handvein under different distance measures and kernel options. Feature-level fusion is then carried out, which increases the performance level.
Tasks
Published 2019-03-21
URL http://arxiv.org/abs/1903.08847v1
PDF http://arxiv.org/pdf/1903.08847v1.pdf
PWC https://paperswithcode.com/paper/parametic-classification-of-handvein-patterns
Repo
Framework

SPDA: Superpixel-based Data Augmentation for Biomedical Image Segmentation

Title SPDA: Superpixel-based Data Augmentation for Biomedical Image Segmentation
Authors Yizhe Zhang, Lin Yang, Hao Zheng, Peixian Liang, Colleen Mangold, Raquel G. Loreto, David P. Hughes, Danny Z. Chen
Abstract Supervised training of a deep neural network aims to “teach” the network to mimic human visual perception that is represented by image-and-label pairs in the training data. Superpixelized (SP) images are visually perceivable to humans, but a conventionally trained deep learning model often performs poorly when working on SP images. To better mimic human visual perception, we think it is desirable for the deep learning model to be able to perceive not only raw images but also SP images. In this paper, we propose a new superpixel-based data augmentation (SPDA) method for training deep learning models for biomedical image segmentation. Our method applies a superpixel generation scheme to all the original training images to generate superpixelized images. The SP images thus obtained are then jointly used with the original training images to train a deep learning model. Our experiments of SPDA on four biomedical image datasets show that SPDA is effective and can consistently improve the performance of state-of-the-art fully convolutional networks for biomedical image segmentation in 2D and 3D images. Additional studies also demonstrate that SPDA can practically reduce the generalization gap.
Tasks Data Augmentation, Semantic Segmentation
Published 2019-02-28
URL http://arxiv.org/abs/1903.00035v1
PDF http://arxiv.org/pdf/1903.00035v1.pdf
PWC https://paperswithcode.com/paper/spda-superpixel-based-data-augmentation-for
Repo
Framework

Image Captioning using Facial Expression and Attention

Title Image Captioning using Facial Expression and Attention
Authors Omid Mohamad Nezami, Mark Dras, Stephen Wan, Cecile Paris
Abstract Benefiting from advances in machine vision and natural language processing techniques, current image captioning systems are able to generate detailed visual descriptions. For the most part, these descriptions represent an objective characterisation of the image, although some models do incorporate subjective aspects related to the observer’s view of the image, such as sentiment; current models, however, usually do not consider the emotional content of images during the caption generation process. This paper addresses this issue by proposing novel image captioning models which use facial expression features to generate image captions. The models generate image captions using long short-term memory networks applying facial features in addition to other visual features at different time steps. We compare a comprehensive collection of image captioning models with and without facial features using all standard evaluation metrics. The evaluation metrics indicate that applying facial features with an attention mechanism achieves the best performance, showing more expressive and more correlated image captions, on an image caption dataset extracted from the standard Flickr 30K dataset, consisting of around 11K images containing faces. An analysis of the generated captions finds that, perhaps unexpectedly, the improvement in caption quality appears to come not from the addition of adjectives linked to emotional aspects of the images, but from more variety in the actions described in the captions.
Tasks Image Captioning
Published 2019-08-08
URL https://arxiv.org/abs/1908.02923v2
PDF https://arxiv.org/pdf/1908.02923v2.pdf
PWC https://paperswithcode.com/paper/image-captioning-using-facial-expression-and
Repo
Framework

A Semi-Supervised Maximum Margin Metric Learning Approach for Small Scale Person Re-identification

Title A Semi-Supervised Maximum Margin Metric Learning Approach for Small Scale Person Re-identification
Authors T M Feroz Ali, Subhasis Chaudhuri
Abstract In video surveillance, person re-identification is the task of searching person images in non-overlapping cameras. Though supervised methods for person re-identification have attained impressive performance, obtaining large-scale cross-view labeled training data is very expensive. However, unlabelled data is available in abundance. In this paper, we propose a semi-supervised metric learning approach that can utilize information in unlabelled data with the help of a few labelled training samples. We also address the small sample size problem that inherently occurs due to the few labeled training data. Our method learns a discriminative space where within-class samples collapse to singular points, achieving the least within-class variance, and then uses a maximum margin criterion over a high-dimensional kernel space to maximally separate the distinct class samples. A maximum margin criterion with two levels of high-dimensional mappings to kernel space is used to obtain better cross-view discrimination of the identities. Cross-view affinity learning with reciprocal nearest neighbor constraints is used to mine new pseudo-classes from the unlabelled data and update the distance metric iteratively. We attain state-of-the-art performance on four challenging datasets with a large margin.
Tasks Metric Learning, Person Re-Identification
Published 2019-10-09
URL https://arxiv.org/abs/1910.03905v1
PDF https://arxiv.org/pdf/1910.03905v1.pdf
PWC https://paperswithcode.com/paper/a-semi-supervised-maximum-margin-metric
Repo
Framework
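
The reciprocal nearest neighbor constraint mentioned in the abstract can be illustrated in isolation. The sketch below only shows the mutual-neighbor test between two camera views using plain Euclidean distance; it is not the paper's affinity learning or its learned metric, and the function names and toy feature vectors are hypothetical.

```python
import math


def k_nearest(query, gallery, k):
    """Indices of the k gallery vectors closest to query (Euclidean)."""
    order = sorted(range(len(gallery)), key=lambda j: math.dist(query, gallery[j]))
    return set(order[:k])


def reciprocal_pairs(view_a, view_b, k=1):
    """Pairs (i, j) where b_j is among a_i's k nearest in view B
    and a_i is among b_j's k nearest in view A."""
    pairs = []
    for i, a in enumerate(view_a):
        for j in k_nearest(a, view_b, k):
            if i in k_nearest(view_b[j], view_a, k):
                pairs.append((i, j))
    return sorted(pairs)


# Toy unlabelled features from two cameras.
view_a = [(0.0, 0.0), (5.0, 5.0), (2.4, 2.4)]
view_b = [(0.1, 0.0), (5.0, 5.2)]
pseudo_matches = reciprocal_pairs(view_a, view_b, k=1)
```

Here the third sample in view A has a nearest neighbor in view B, but that neighbor prefers a different sample in view A, so the pair is rejected. Requiring agreement in both directions is what makes such pairs plausible candidates for pseudo-classes.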

Healthcare NER Models Using Language Model Pretraining

Title Healthcare NER Models Using Language Model Pretraining
Authors Amogh Kamat Tarcar, Aashis Tiwari, Vineet Naique Dhaimodker, Penjo Rebelo, Rahul Desai, Dattaraj Rao
Abstract In this paper, we present our approach to extracting structured information from unstructured Electronic Health Records (EHR) [2] which can be used to, for example, study adverse drug reactions in patients due to chemicals in their products. Our solution uses a combination of Natural Language Processing (NLP) techniques and a web-based annotation tool to optimize the performance of a custom Named Entity Recognition (NER) [1] model trained on a limited amount of EHR training data. This work was presented at the first Health Search and Data Mining Workshop (HSDM 2020) [26]. We showcase a combination of tools and techniques leveraging the recent advancements in NLP aimed at targeting domain shifts by applying transfer learning and language model pre-training techniques [3]. We present a comparison of our technique to the current popular approaches and show the effective increase in performance of the NER model and the reduction in time to annotate data. A key observation of the results presented is that the F1 score of the model (0.734) trained with our approach with just 50% of the available training data outperforms the F1 score of the blank spaCy model without a language model component (0.704) trained with 100% of the available training data. We also demonstrate an annotation tool to minimize domain expert time and the manual effort required to generate such a training dataset. Further, we plan to release the annotated dataset as well as the pre-trained model to the community to further research in medical health records.
Tasks Language Modelling, Named Entity Recognition, Transfer Learning
Published 2019-10-23
URL https://arxiv.org/abs/1910.11241v2
PDF https://arxiv.org/pdf/1910.11241v2.pdf
PWC https://paperswithcode.com/paper/ner-models-using-pre-training-and-transfer
Repo
Framework

A Gated Hypernet Decoder for Polar Codes

Title A Gated Hypernet Decoder for Polar Codes
Authors Eliya Nachmani, Lior Wolf
Abstract Hypernetworks were recently shown to improve the performance of message passing algorithms for decoding error correcting codes. In this work, we demonstrate how hypernetworks can be applied to decode polar codes by employing a new formalization of the polar belief propagation decoding scheme. We demonstrate that our method improves the previous results of neural polar decoders and achieves, for large SNRs, the same bit-error-rate performances as the successive list cancellation method, which is known to be better than any belief propagation decoders and very close to the maximum likelihood decoder.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.03229v2
PDF https://arxiv.org/pdf/1911.03229v2.pdf
PWC https://paperswithcode.com/paper/a-gated-hypernet-decoder-for-polar-codes
Repo
Framework

Generating Random Parameters in Feedforward Neural Networks with Random Hidden Nodes: Drawbacks of the Standard Method and How to Improve It

Title Generating Random Parameters in Feedforward Neural Networks with Random Hidden Nodes: Drawbacks of the Standard Method and How to Improve It
Authors Grzegorz Dudek
Abstract The standard method of generating random weights and biases in feedforward neural networks with random hidden nodes selects them both from the uniform distribution over the same fixed interval. In this work, we show the drawbacks of this approach and propose a new method of generating random parameters. This method ensures the most nonlinear fragments of sigmoids, which are most useful in modeling target function nonlinearity, are kept in the input hypercube. In addition, we show how to generate activation functions with uniformly distributed slope angles.
Tasks
Published 2019-08-16
URL https://arxiv.org/abs/1908.05864v2
PDF https://arxiv.org/pdf/1908.05864v2.pdf
PWC https://paperswithcode.com/paper/generating-random-parameters-in-feedforward
Repo
Framework
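
A small experiment makes the abstract's concern about the standard method concrete. The sketch below covers only the standard scheme in one dimension and does not implement the paper's proposed method. It assumes a sigmoid node h(x) = 1 / (1 + exp(-(a*x + b))), inputs normalized to [0, 1], and weights and biases drawn from the same interval [-1, 1]; these choices and the function name are assumptions for illustration. The steepest, most nonlinear point of such a sigmoid lies where a*x + b = 0.

```python
import random


def inflection_inside_fraction(n_nodes, bound=1.0, seed=0):
    """Fraction of random 1-D sigmoid nodes whose inflection point
    x* = -b / a falls inside the input interval [0, 1], when weight a
    and bias b are both drawn uniformly from [-bound, bound]."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_nodes):
        a = rng.uniform(-bound, bound)
        b = rng.uniform(-bound, bound)
        if a == 0.0:
            continue  # constant node, no inflection point to place
        x_star = -b / a
        if 0.0 <= x_star <= 1.0:
            inside += 1
    return inside / n_nodes


fraction = inflection_inside_fraction(10_000)
```

Under these assumptions the inflection point lands in [0, 1] only when a and b have opposite signs and |b| <= |a|, which happens with probability 1/4, so roughly three quarters of the nodes place their most nonlinear fragment outside the input interval.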