January 25, 2020

Paper Group ANR 1747

Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning. Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity. Learning Priors for Adversarial Autoencoders. Mixed pooling of seasonality in time series pallet forecasting. A Survey on GANs for Anomaly Detection. Ensemble Quantile Classifier. …

Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning

Title Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning
Authors Rong-Cheng Tu, Xian-Ling Mao, Bing Ma, Yong Hu, Tan Yan, Wei Wei, Heyan Huang
Abstract Due to their high retrieval efficiency and low storage cost, cross-modal hashing methods have attracted considerable attention. Generally, compared with shallow cross-modal hashing methods, deep cross-modal hashing methods can achieve more satisfactory performance by integrating feature learning and hash code optimization into the same framework. However, most existing deep cross-modal hashing methods either cannot learn a unified hash code for the two correlated data-points of different modalities in a database instance or cannot use feedback from the hashing function learning procedure to guide the learning of unified hash codes and so enhance retrieval accuracy. To address the issues above, in this paper, we propose a novel end-to-end Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning (DCHUC). Specifically, by an iterative optimization algorithm, DCHUC jointly learns unified hash codes for image-text pairs in a database and a pair of hash functions for unseen query image-text pairs. With the iterative optimization algorithm, the learned unified hash codes can be used to guide the hashing function learning procedure; meanwhile, the learned hashing functions can feed back to guide the unified hash code optimization procedure. Extensive experiments on three public datasets demonstrate that the proposed method outperforms the state-of-the-art cross-modal hashing methods.
Tasks
Published 2019-07-29
URL https://arxiv.org/abs/1907.12490v1
PDF https://arxiv.org/pdf/1907.12490v1.pdf
PWC https://paperswithcode.com/paper/deep-cross-modal-hashing-with-hashing
Repo
Framework
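
The retrieval step that makes hashing methods efficient can be illustrated with a minimal sketch. It is not the DCHUC algorithm and does not learn anything: it only assumes that each database image-text pair already has one unified binary code and that a hash function has already produced a code for the query. The 8-bit codes, the `retrieve` helper, and the `top_k` parameter are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, database_codes: List[int], top_k: int = 3) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs for the top_k closest database codes."""
    scored = [(idx, hamming(query_code, code)) for idx, code in enumerate(database_codes)]
    # Sort by distance first, then by index so ties are broken deterministically.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical 8-bit unified codes, one per image-text pair in the database.
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
# Hypothetical code that a learned hash function might emit for an unseen query.
query = 0b10110110

results = retrieve(query, database, top_k=2)
print(results)
```

Storing codes as integers keeps the comparison to an XOR and a bit count, which is where the low storage cost and fast lookup mentioned in the abstract come from.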

Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity

Title Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity
Authors Yang Hu, Giovanni Montana
Abstract Transfer learning methods for reinforcement learning (RL) domains facilitate the acquisition of new skills using previously acquired knowledge. The vast majority of existing approaches assume that the agents have the same design, e.g. same shape and action spaces. In this paper we address the problem of transferring previously acquired skills amongst morphologically different agents (MDAs). For instance, assuming that a bipedal agent has been trained to move forward, could this skill be transferred on to a one-leg hopper so as to make its training process for the same task more sample efficient? We frame this problem as one of subspace learning whereby we aim to infer latent factors representing the control mechanism that is common between MDAs. We propose a novel paired variational encoder-decoder model, PVED, that disentangles the control of MDAs into shared and agent-specific factors. The shared factors are then leveraged for skill transfer using RL. Theoretically, we derive a theorem indicating how the performance of PVED depends on the shared factors and agent morphologies. Experimentally, PVED has been extensively validated on four MuJoCo environments. We demonstrate its performance compared to a state-of-the-art approach and several ablation cases, visualize and interpret the hidden factors, and identify avenues for future improvements.
Tasks Transfer Learning
Published 2019-08-14
URL https://arxiv.org/abs/1908.05265v2
PDF https://arxiv.org/pdf/1908.05265v2.pdf
PWC https://paperswithcode.com/paper/skill-transfer-in-deep-reinforcement-learning
Repo
Framework

Learning Priors for Adversarial Autoencoders

Title Learning Priors for Adversarial Autoencoders
Authors Hui-Po Wang, Wen-Hsiao Peng, Wei-Jan Ko
Abstract Most deep latent factor models choose simple priors for simplicity or tractability, or because it is unclear what prior to use. Recent studies show that the choice of the prior may have a profound effect on the expressiveness of the model, especially when its generative network has limited capacity. In this paper, we propose to learn a proper prior from data for adversarial autoencoders (AAEs). We introduce the notion of code generators to transform manually selected simple priors into ones that can better characterize the data distribution. Experimental results show that the proposed model can generate better image quality and learn better disentangled representations than AAEs in both supervised and unsupervised settings. Lastly, we present its ability to do cross-domain translation in a text-to-image synthesis task.
Tasks Image Generation
Published 2019-09-10
URL https://arxiv.org/abs/1909.04443v1
PDF https://arxiv.org/pdf/1909.04443v1.pdf
PWC https://paperswithcode.com/paper/learning-priors-for-adversarial-autoencoders-1
Repo
Framework

Mixed pooling of seasonality in time series pallet forecasting

Title Mixed pooling of seasonality in time series pallet forecasting
Authors Hyunji Moon, Hyeonseop Lee
Abstract Multiple seasonal patterns play a key role in time series forecasting, especially for business time series where seasonal effects are often dramatic. Previous approaches including Fourier decomposition, exponential smoothing, and seasonal autoregressive integrated moving average (SARIMA) models do not reflect the distinct characteristics of each period in seasonal patterns, such as the unique behavior of specific days of the week in business data. We propose a multi-dimensional hierarchical model. Intermediate parameters for each seasonal period are first estimated, and a mixture of intermediate parameters is then taken, resulting in a model that successfully reflects the interactions between multiple seasonal patterns. Although this process reduces the data available for each parameter, a robust estimation can be obtained through a hierarchical Bayesian model implemented in Stan. Through this model, it becomes possible to consider both the characteristics of each seasonal period and the interactions among characteristics from multiple seasonal periods. Our new model achieved considerable improvements in prediction accuracy compared to previous models, including Fourier decomposition, which Prophet uses to model seasonality patterns. A comparison was performed on a real-world dataset of pallet transport from a national-scale logistic network.
Tasks Time Series, Time Series Forecasting
Published 2019-08-14
URL https://arxiv.org/abs/1908.05339v1
PDF https://arxiv.org/pdf/1908.05339v1.pdf
PWC https://paperswithcode.com/paper/mixed-pooling-of-seasonality-in-time-series
Repo
Framework

A Survey on GANs for Anomaly Detection

Title A Survey on GANs for Anomaly Detection
Authors Federico Di Mattia, Paolo Galeone, Michele De Simoni, Emanuele Ghelfi
Abstract Anomaly detection is a significant problem faced in several research areas. Detecting and correctly classifying something unseen as anomalous is a challenging problem that has been tackled in many different ways over the years. Generative Adversarial Networks (GANs) and the adversarial training process have recently been employed for this task, yielding remarkable results. In this paper we survey the principal GAN-based anomaly detection methods, highlighting their pros and cons. Our contributions are the empirical validation of the main GAN models for anomaly detection, the extension of the experimental results to different datasets, and the public release of a complete open-source toolbox for anomaly detection using GANs.
Tasks Anomaly Detection
Published 2019-06-27
URL https://arxiv.org/abs/1906.11632v1
PDF https://arxiv.org/pdf/1906.11632v1.pdf
PWC https://paperswithcode.com/paper/a-survey-on-gans-for-anomaly-detection
Repo
Framework

Ensemble Quantile Classifier

Title Ensemble Quantile Classifier
Authors Yuanhao Lai, Ian McLeod
Abstract Both the median-based classifier and the quantile-based classifier are useful for discriminating high-dimensional data with heavy-tailed or skewed inputs. But these methods are restricted as they assign equal weight to each variable in an unregularized way. The ensemble quantile classifier is a more flexible regularized classifier that provides better performance with high-dimensional data, asymmetric data or when there are many irrelevant extraneous inputs. The improved performance is demonstrated by a simulation study as well as an application to text categorization. It is proven that the estimated parameters of the ensemble quantile classifier consistently estimate the minimal population loss under suitable general model assumptions. It is also shown that the ensemble quantile classifier is Bayes optimal under suitable assumptions with asymmetric Laplace distribution inputs.
Tasks Text Categorization
Published 2019-10-28
URL https://arxiv.org/abs/1910.12960v1
PDF https://arxiv.org/pdf/1910.12960v1.pdf
PWC https://paperswithcode.com/paper/ensemble-quantile-classifier
Repo
Framework
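
To make the starting point of this abstract concrete, here is a minimal sketch of an equal-weight quantile-based classifier, the baseline the abstract describes as restricted. It is not the ensemble quantile classifier itself, which the abstract says replaces the equal weights with a regularized weighting. The helper names, the nearest-rank quantile convention, and the toy data are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def empirical_quantile(values: Sequence[float], theta: float) -> float:
    """Empirical theta-quantile using a simple nearest-rank convention."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, int(theta * len(ordered))))
    return ordered[index]


def check_loss(u: float, theta: float) -> float:
    """Quantile (check) loss: theta * u if u >= 0, else (theta - 1) * u."""
    return u * (theta - (1.0 if u < 0 else 0.0))


def fit(train: Dict[str, List[List[float]]], theta: float) -> Dict[str, List[float]]:
    """Compute the theta-quantile of every variable within every class."""
    quantiles = {}
    for label, rows in train.items():
        columns = list(zip(*rows))
        quantiles[label] = [empirical_quantile(col, theta) for col in columns]
    return quantiles


def predict(x: Sequence[float], quantiles: Dict[str, List[float]], theta: float) -> str:
    """Assign x to the class with the smallest equal-weight sum of check losses."""
    def total(label: str) -> float:
        return sum(check_loss(xj - qj, theta) for xj, qj in zip(x, quantiles[label]))
    return min(sorted(quantiles), key=total)


# Toy data with one heavy-tailed observation per class (last row of each).
train = {
    "a": [[1, 10], [2, 11], [3, 12], [2, 50]],
    "b": [[8, 20], [9, 21], [10, 22], [9, 90]],
}
theta = 0.5  # theta = 0.5 gives a median-based classifier
model = fit(train, theta)
print(predict([2.5, 13], model, theta), predict([9, 25], model, theta))
```

Because every variable contributes with the same weight in `total`, a large number of irrelevant inputs would dilute the informative ones, which is the limitation the abstract attributes to the unregularized classifiers.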

Community detection over a heterogeneous population of non-aligned networks

Title Community detection over a heterogeneous population of non-aligned networks
Authors Guilherme Gomes, Vinayak Rao, Jennifer Neville
Abstract Clustering and community detection with multiple graphs have typically focused on aligned graphs, where there is a mapping between nodes across the graphs (e.g., multi-view, multi-layer, temporal graphs). However, there are numerous application areas with multiple graphs that are only partially aligned, or even unaligned. These graphs are often drawn from the same population, with communities of potentially different sizes that exhibit similar structure. In this paper, we develop a joint stochastic blockmodel (Joint SBM) to estimate shared communities across sets of heterogeneous non-aligned graphs. We derive an efficient spectral clustering approach to learn the parameters of the joint SBM. We evaluate the model on both synthetic and real-world datasets and show that the joint model is able to exploit cross-graph information to better estimate the communities compared to learning separate SBMs on each individual graph.
Tasks Community Detection
Published 2019-04-04
URL http://arxiv.org/abs/1904.05332v1
PDF http://arxiv.org/pdf/1904.05332v1.pdf
PWC https://paperswithcode.com/paper/community-detection-over-a-heterogeneous
Repo
Framework

HorNet: A Hierarchical Offshoot Recurrent Network for Improving Person Re-ID via Image Captioning

Title HorNet: A Hierarchical Offshoot Recurrent Network for Improving Person Re-ID via Image Captioning
Authors Shiyang Yan, Jun Xu, Yuai Liu, Lin Xu
Abstract Person re-identification (re-ID) aims to recognize a person-of-interest across different cameras with notable appearance variance. Existing research has focused on the capability and robustness of visual representation. In this paper, instead, we propose a novel hierarchical offshoot recurrent network (HorNet) for improving person re-ID via image captioning. Image captions are semantically richer and more consistent than visual attributes, which could significantly alleviate the variance. We use the similarity preserving generative adversarial network (SPGAN) and an image captioner to perform domain transfer and language description generation. The proposed HorNet can then learn the visual and language representations jointly from both the images and captions, and thus enhance the performance of person re-ID. Extensive experiments are conducted on several benchmark datasets with or without image captions, i.e., CUHK03, Market-1501, and Duke-MTMC, demonstrating the superiority of the proposed method. Our method can generate and extract meaningful image captions while achieving state-of-the-art performance.
Tasks Image Captioning, Person Re-Identification
Published 2019-08-14
URL https://arxiv.org/abs/1908.04915v1
PDF https://arxiv.org/pdf/1908.04915v1.pdf
PWC https://paperswithcode.com/paper/hornet-a-hierarchical-offshoot-recurrent
Repo
Framework

Scene-based Factored Attention for Image Captioning

Title Scene-based Factored Attention for Image Captioning
Authors Chen Shen, Rongrong Ji, Fuhai Chen, Xiaoshuai Sun, Xiangming Li
Abstract Image captioning has attracted ever-increasing research attention in the multimedia community. To this end, most cutting-edge works rely on an encoder-decoder framework with attention mechanisms, which has achieved remarkable progress. However, such a framework does not consider scene concepts when attending to visual information, which leads to sentence bias in caption generation and degrades performance accordingly. We argue that such scene concepts capture higher-level visual semantics and serve as an important cue in describing images. In this paper, we propose a novel scene-based factored attention module for image captioning. Specifically, the proposed module first embeds the scene concepts into factored weights explicitly and attends to the visual information extracted from the input image. Then, an adaptive LSTM is used to generate captions for specific scene types. Experimental results on the Microsoft COCO benchmark show that the proposed scene-based attention module substantially improves model performance and outperforms the state-of-the-art approaches under various evaluation metrics.
Tasks Image Captioning
Published 2019-08-07
URL https://arxiv.org/abs/1908.02632v3
PDF https://arxiv.org/pdf/1908.02632v3.pdf
PWC https://paperswithcode.com/paper/scene-based-factored-attention-for-image
Repo
Framework

Operational Framework for Recent Advances in Backtracking Search Optimisation Algorithm: A Systematic Review and Performance Evaluation

Title Operational Framework for Recent Advances in Backtracking Search Optimisation Algorithm: A Systematic Review and Performance Evaluation
Authors Bryar A. Hassan, Tarik A. Rashid
Abstract The experiments conducted in previous studies demonstrated the successful performance of BSA and its insensitivity to several types of optimisation problems. This success of BSA motivated researchers to work on expanding it, e.g., developing its improved versions or employing it for different applications and problem domains. However, there is a lack of literature review on BSA; therefore, reviewing the aforementioned modifications and applications systematically will aid further development of the algorithm. This paper provides a systematic review and meta-analysis that emphasise reviewing the related studies and recent developments on BSA. Hence, the objectives of this work are two-fold: (i) First, two frameworks for depicting the main extensions and the uses of BSA are proposed. The first framework is a general framework to depict the main extensions of BSA, whereas the second is an operational framework to present the expansion procedures of BSA to guide the researchers who are working on improving it. (ii) Second, the experiments conducted in this study fairly compare the analytical performance of BSA with four other competitive algorithms: differential evolution (DE), particle swarm optimisation (PSO), artificial bee colony (ABC), and firefly (FF) on 16 different hardness scores of the benchmark functions with different initial control parameters such as problem dimensions and search space. The experimental results indicate that BSA is statistically superior to the aforementioned algorithms in solving different cohorts of numerical optimisation problems such as problems with different levels of hardness score, problem dimensions, and search spaces.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1911.13011v2
PDF https://arxiv.org/pdf/1911.13011v2.pdf
PWC https://paperswithcode.com/paper/operational-framework-for-recent-advances-in
Repo
Framework

Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment

Title Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment
Authors Jianbo Yuan, Haofu Liao, Rui Luo, Jiebo Luo
Abstract Generating radiology reports is time-consuming and requires extensive expertise in practice. Therefore, reliable automatic radiology report generation is highly desired to alleviate the workload. Although deep learning techniques have been successfully applied to image classification and image captioning tasks, radiology report generation remains challenging in regards to understanding and linking complicated medical visual contents with accurate natural language descriptions. In addition, the data scales of open-access datasets that contain paired medical images and reports remain very limited. To cope with these practical challenges, we propose a generative encoder-decoder model and focus on chest x-ray images and reports with the following improvements. First, we pretrain the encoder with a large number of chest x-ray images to accurately recognize 14 common radiographic observations, while taking advantage of the multi-view images by enforcing the cross-view consistency. Second, we synthesize multi-view visual features based on a sentence-level attention mechanism in a late fusion fashion. In addition, in order to enrich the decoder with descriptive semantics and enforce the correctness of the deterministic medical-related contents such as mentions of organs or diagnoses, we extract medical concepts based on the radiology reports in the training data and fine-tune the encoder to extract the most frequent medical concepts from the x-ray images. Such concepts are fused with each decoding step by a word-level attention model. The experimental results conducted on the Indiana University Chest X-Ray dataset demonstrate that the proposed model achieves the state-of-the-art performance compared with other baseline approaches.
Tasks Image Captioning, Image Classification
Published 2019-07-22
URL https://arxiv.org/abs/1907.09085v2
PDF https://arxiv.org/pdf/1907.09085v2.pdf
PWC https://paperswithcode.com/paper/automatic-radiology-report-generation-based
Repo
Framework

Short Term Prediction of Parking Area states Using Real Time Data and Machine Learning Techniques

Title Short Term Prediction of Parking Area states Using Real Time Data and Machine Learning Techniques
Authors Jesper Provoost, Luc Wismans, Sander Van der Drift, Andreas Kamilaris, Maurice Van Keulen
Abstract Public road authorities and private mobility service providers need information derived from the current and predicted traffic states to act upon the daily urban system and its spatial and temporal dynamics. In this research, a real-time parking area state (occupancy, in- and outflux) prediction model (up to 60 minutes ahead) has been developed using publicly available historic and real-time data sources. Based on a case study in a real-life scenario in the city of Arnhem, a Neural Network-based approach outperforms a Random Forest-based one on all assessed performance measures, although the differences are small. Both outperform a naive seasonal random walk model. Although the performance degrades with increasing prediction horizon, the model shows a performance gain of over 150% at a prediction horizon of 60 minutes compared with the naive model. Furthermore, it is shown that predicting the in- and outflux is a far more difficult task (i.e. performance gains of 30%) which needs more training data, not based exclusively on occupancy rate. However, the performance of predicting in- and outflux is less sensitive to the prediction horizon. In addition, it is shown that real-time information on the current occupancy rate is the independent variable with the highest contribution to the performance, although time, traffic flow and weather variables also deliver a significant contribution. During real-time deployment, the model performs three times better than the naive model on average. As a result, it can provide valuable information for proactive traffic management as well as mobility service providers.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1911.13178v1
PDF https://arxiv.org/pdf/1911.13178v1.pdf
PWC https://paperswithcode.com/paper/short-term-prediction-of-parking-area-states
Repo
Framework
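
The naive seasonal random walk that serves as the baseline in this abstract can be sketched in a few lines. This is a generic version that repeats the value observed one season earlier; the abstract does not state the season length, the error metric, or the data layout the authors used, so `season_length`, the mean absolute error helper, and the toy occupancy numbers are assumptions.

```python
from typing import List, Sequence


def seasonal_naive_forecast(history: Sequence[float], season_length: int, horizon: int) -> List[float]:
    """Forecast each future step with the value seen at the same phase of the last season."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season_start = len(history) - season_length
    forecasts = []
    for step in range(1, horizon + 1):
        # Wrap around when the horizon is longer than one season.
        offset = (step - 1) % season_length
        forecasts.append(history[last_season_start + offset])
    return forecasts


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between forecasts and observations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy occupancy counts with a hypothetical season of three time steps.
history = [10, 20, 30, 12, 22, 32]
forecast = seasonal_naive_forecast(history, season_length=3, horizon=4)
observed = [13, 21, 35, 12]
print(forecast, mean_absolute_error(forecast, observed))
```

A learned model is then judged by how much it reduces the error of this baseline at each prediction horizon.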

On Sample Complexity Upper and Lower Bounds for Exact Ranking from Noisy Comparisons

Title On Sample Complexity Upper and Lower Bounds for Exact Ranking from Noisy Comparisons
Authors Wenbo Ren, Jia Liu, Ness B. Shroff
Abstract This paper studies the problem of finding the exact ranking from noisy comparisons. A comparison over a set of $m$ items produces a noisy outcome about the most preferred item, and reveals some information about the ranking. By repeatedly and adaptively choosing items to compare, we want to fully rank the items with a certain confidence, and use as few comparisons as possible. Different from most previous works, in this paper, we have three main novelties: (i) compared to prior works, our upper bounds (algorithms) and lower bounds on the sample complexity (aka number of comparisons) require the minimal assumptions on the instances, and are not restricted to specific models; (ii) we give lower bounds and upper bounds on instances with unequal noise levels; and (iii) this paper aims at the exact ranking without knowledge on the instances, while most of the previous works either focus on approximate rankings or study exact ranking but require prior knowledge. We first derive lower bounds for pairwise ranking (i.e., compare two items each time), and then propose (nearly) optimal pairwise ranking algorithms. We further make extensions to listwise ranking (i.e., comparing multiple items each time). Numerical results also show our improvements against the state of the art.
Tasks
Published 2019-09-07
URL https://arxiv.org/abs/1909.03194v1
PDF https://arxiv.org/pdf/1909.03194v1.pdf
PWC https://paperswithcode.com/paper/on-sample-complexity-upper-and-lower-bounds
Repo
Framework
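
The pairwise setting in this abstract can be illustrated with a deliberately simple sketch: each pair is compared a fixed number of times and the majority outcome drives an ordinary sort. This is not one of the paper's algorithms, which choose comparisons adaptively and come with sample complexity guarantees. The oracle, the `p_correct` noise level, the `repeats` count, and the item scores are all hypothetical.

```python
import random
from functools import cmp_to_key
from typing import Callable, Dict, List


def make_noisy_oracle(true_scores: Dict[str, float], p_correct: float,
                      rng: random.Random) -> Callable[[str, str], str]:
    """Build a comparison oracle that names the truly better item with probability p_correct."""
    def compare(i: str, j: str) -> str:
        better, worse = (i, j) if true_scores[i] > true_scores[j] else (j, i)
        return better if rng.random() < p_correct else worse
    return compare


def majority_prefers(i: str, j: str, compare: Callable[[str, str], str], repeats: int) -> bool:
    """True if i wins a strict majority of `repeats` noisy comparisons against j."""
    wins_i = sum(1 for _ in range(repeats) if compare(i, j) == i)
    return wins_i * 2 > repeats


def rank_items(items: List[str], compare: Callable[[str, str], str], repeats: int = 101) -> List[str]:
    """Sort items from most to least preferred using majority-vote comparisons."""
    def cmp(i: str, j: str) -> int:
        return -1 if majority_prefers(i, j, compare, repeats) else 1
    return sorted(items, key=cmp_to_key(cmp))


rng = random.Random(0)  # fixed seed so the run is repeatable
scores = {"a": 0.9, "b": 0.7, "c": 0.4, "d": 0.1}
oracle = make_noisy_oracle(scores, p_correct=0.8, rng=rng)
ranking = rank_items(list(scores), oracle)
print(ranking)
```

Using the same number of repeats for every pair wastes comparisons on easy pairs and may be too few for hard ones, which is the kind of inefficiency that adaptive algorithms for instances with unequal noise levels aim to avoid.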

Super-Trajectories: A Compact Yet Rich Video Representation

Title Super-Trajectories: A Compact Yet Rich Video Representation
Authors Ijaz Akhter, Cheong Loong Fah, Richard Hartley
Abstract We propose a new video representation in terms of an over-segmentation of dense trajectories covering the whole video. Trajectories are often used to encode long-temporal information in several computer vision applications. As with temporal superpixels, a temporal slice of super-trajectories yields superpixels, but super-trajectories contain more information because they also maintain long, dense pixel-wise tracking information. The main challenge in using trajectories for any application is the accumulation of tracking error in the trajectory construction. For our problem, this results in disconnected superpixels. We exploit constraints for edges in addition to trajectory-based color and position similarity. Analogous to superpixels as a preprocessing tool for images, the proposed representation has its applications for videos, especially in trajectory-based video analysis.
Tasks
Published 2019-01-22
URL http://arxiv.org/abs/1901.07273v1
PDF http://arxiv.org/pdf/1901.07273v1.pdf
PWC https://paperswithcode.com/paper/super-trajectories-a-compact-yet-rich-video
Repo
Framework

Learning morphological operators for skin detection

Title Learning morphological operators for skin detection
Authors Alessandra Lumini, Loris Nanni, Alice Codogno, Filippo Berno
Abstract In this work we propose a novel post-processing approach for skin detectors based on trained morphological operators. The first step, skin segmentation, is performed according to an existing skin detection approach; a second step then applies a set of morphological operators to refine the resulting mask. An extensive experimental evaluation, considering two different detection approaches (one based on deep learning and a handcrafted one) and carried out on 10 different datasets, confirms the quality of the proposed method.
Tasks
Published 2019-08-09
URL https://arxiv.org/abs/1908.03630v2
PDF https://arxiv.org/pdf/1908.03630v2.pdf
PWC https://paperswithcode.com/paper/learning-morphological-operators-for-skin
Repo
Framework
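
The second step described in this abstract, refining a binary mask with morphological operators, can be sketched with a fixed 3x3 opening (erosion followed by dilation). The paper trains its operators; this sketch does not, and the square structuring element, the zero padding at the border, and the toy mask are assumptions made for illustration.

```python
from typing import Callable, List

Mask = List[List[int]]


def _filter(mask: Mask, reducer: Callable[[List[int]], int]) -> Mask:
    """Apply a reducer (min or max) over each 3x3 neighbourhood, padding with zeros."""
    height, width = len(mask), len(mask[0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            window = [
                mask[rr][cc] if 0 <= rr < height and 0 <= cc < width else 0
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
            ]
            row.append(reducer(window))
        result.append(row)
    return result


def erode(mask: Mask) -> Mask:
    """Keep a pixel only if its whole 3x3 neighbourhood is set."""
    return _filter(mask, min)


def dilate(mask: Mask) -> Mask:
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return _filter(mask, max)


def opening(mask: Mask) -> Mask:
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))


# Toy skin mask: a 3x3 region plus one isolated false-positive pixel at the top left.
noisy = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned = opening(noisy)
for row in cleaned:
    print(row)
```

The opening drops the isolated pixel while leaving the 3x3 region intact; a trained variant would instead select which operators and parameters to apply from data.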