January 25, 2020

Paper Group ANR 1747

Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning. Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity. Learning Priors for Adversarial Autoencoders. Mixed pooling of seasonality in time series pallet forecasting. A Survey on GANs for Anomaly Detection. Ensemble Quantile Classifier. …


Title	Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning
Authors	Rong-Cheng Tu, Xian-Ling Mao, Bing Ma, Yong Hu, Tan Yan, Wei Wei, Heyan Huang
Abstract	Due to their high retrieval efficiency and low storage cost, cross-modal hashing methods have attracted considerable attention. Generally, compared with shallow cross-modal hashing methods, deep cross-modal hashing methods achieve more satisfactory performance by integrating feature learning and hash code optimization into the same framework. However, most existing deep cross-modal hashing methods either cannot learn a unified hash code for the two correlated data points of different modalities in a database instance, or cannot use feedback from the hashing function learning procedure to guide the learning of unified hash codes and thereby enhance retrieval accuracy. To address these issues, we propose a novel end-to-end method, Deep Cross-Modal Hashing with Hashing Functions and Unified Hash Codes Jointly Learning (DCHUC). Specifically, using an iterative optimization algorithm, DCHUC jointly learns unified hash codes for image-text pairs in a database and a pair of hash functions for unseen query image-text pairs. Under this iterative scheme, the learned unified hash codes guide the hashing function learning procedure; meanwhile, the learned hashing functions feed back to guide the optimization of the unified hash codes. Extensive experiments on three public datasets demonstrate that the proposed method outperforms state-of-the-art cross-modal hashing methods.
Tasks
Published	2019-07-29
URL	https://arxiv.org/abs/1907.12490v1
PDF	https://arxiv.org/pdf/1907.12490v1.pdf
PWC	https://paperswithcode.com/paper/deep-cross-modal-hashing-with-hashing
Repo
Framework
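
The retrieval step that hashing methods such as the one above rely on can be sketched as a nearest-neighbour search in Hamming space. This is a minimal illustration only, not the DCHUC training procedure: the 8-bit codes, the `pair_*` ids, and the function names are hypothetical, and the query code stands in for whatever a learned hash function would emit.

```python
from typing import Dict, List


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, database: Dict[str, int], top_k: int = 3) -> List[str]:
    """Return ids of the top_k database entries closest to the query in Hamming distance."""
    # Ties are broken by id so the ordering is deterministic.
    ranked = sorted(database, key=lambda key: (hamming(query_code, database[key]), key))
    return ranked[:top_k]


# Hypothetical 8-bit unified codes, one per image-text pair in the database.
database = {
    "pair_a": 0b10110010,
    "pair_b": 0b10110011,
    "pair_c": 0b01001101,
}
query = 0b10110110  # assumed output of a hash function for an unseen query

print(retrieve(query, database, top_k=2))  # prints ['pair_a', 'pair_b']
```

Storing codes as integers keeps the distance computation to one XOR and a bit count, which is where the low storage cost and high retrieval efficiency mentioned in the abstract come from.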

Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity


Title	Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity
Authors	Yang Hu, Giovanni Montana
Abstract	Transfer learning methods for reinforcement learning (RL) domains facilitate the acquisition of new skills using previously acquired knowledge. The vast majority of existing approaches assume that the agents have the same design, e.g. same shape and action spaces. In this paper we address the problem of transferring previously acquired skills amongst morphologically different agents (MDAs). For instance, assuming that a bipedal agent has been trained to move forward, could this skill be transferred on to a one-leg hopper so as to make its training process for the same task more sample efficient? We frame this problem as one of subspace learning whereby we aim to infer latent factors representing the control mechanism that is common between MDAs. We propose a novel paired variational encoder-decoder model, PVED, that disentangles the control of MDAs into shared and agent-specific factors. The shared factors are then leveraged for skill transfer using RL. Theoretically, we derive a theorem indicating how the performance of PVED depends on the shared factors and agent morphologies. Experimentally, PVED has been extensively validated on four MuJoCo environments. We demonstrate its performance compared to a state-of-the-art approach and several ablation cases, visualize and interpret the hidden factors, and identify avenues for future improvements.
Tasks	Transfer Learning
Published	2019-08-14
URL	https://arxiv.org/abs/1908.05265v2
PDF	https://arxiv.org/pdf/1908.05265v2.pdf
PWC	https://paperswithcode.com/paper/skill-transfer-in-deep-reinforcement-learning
Repo
Framework

Learning Priors for Adversarial Autoencoders


Title	Learning Priors for Adversarial Autoencoders
Authors	Hui-Po Wang, Wen-Hsiao Peng, Wei-Jan Ko
Abstract	Most deep latent factor models choose simple priors, whether for simplicity, for tractability, or because it is unclear what prior to use. Recent studies show that the choice of prior may have a profound effect on the expressiveness of the model, especially when its generative network has limited capacity. In this paper, we propose to learn a proper prior from data for adversarial autoencoders (AAEs). We introduce the notion of code generators to transform manually selected simple priors into ones that can better characterize the data distribution. Experimental results show that the proposed model generates images of better quality and learns better disentangled representations than AAEs in both supervised and unsupervised settings. Lastly, we present its ability to do cross-domain translation in a text-to-image synthesis task.
Tasks	Image Generation
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04443v1
PDF	https://arxiv.org/pdf/1909.04443v1.pdf
PWC	https://paperswithcode.com/paper/learning-priors-for-adversarial-autoencoders-1
Repo
Framework

Mixed pooling of seasonality in time series pallet forecasting


Title	Mixed pooling of seasonality in time series pallet forecasting
Authors	Hyunji Moon, Hyeonseop Lee
Abstract	Multiple seasonal patterns play a key role in time series forecasting, especially for business time series where seasonal effects are often dramatic. Previous approaches including Fourier decomposition, exponential smoothing, and seasonal autoregressive integrated moving average (SARIMA) models do not reflect the distinct characteristics of each period in seasonal patterns, such as the unique behavior of specific days of the week in business data. We propose a multi-dimensional hierarchical model. Intermediate parameters for each seasonal period are first estimated, and a mixture of intermediate parameters is then taken, resulting in a model that successfully reflects the interactions between multiple seasonal patterns. Although this process reduces the data available for each parameter, a robust estimation can be obtained through a hierarchical Bayesian model implemented in Stan. Through this model, it becomes possible to consider both the characteristics of each seasonal period and the interactions among characteristics from multiple seasonal periods. Our new model achieved considerable improvements in prediction accuracy compared to previous models, including Fourier decomposition, which Prophet uses to model seasonality patterns. A comparison was performed on a real-world dataset of pallet transport from a national-scale logistic network.
Tasks	Time Series, Time Series Forecasting
Published	2019-08-14
URL	https://arxiv.org/abs/1908.05339v1
PDF	https://arxiv.org/pdf/1908.05339v1.pdf
PWC	https://paperswithcode.com/paper/mixed-pooling-of-seasonality-in-time-series
Repo
Framework

A Survey on GANs for Anomaly Detection


Title	A Survey on GANs for Anomaly Detection
Authors	Federico Di Mattia, Paolo Galeone, Michele De Simoni, Emanuele Ghelfi
Abstract	Anomaly detection is a significant problem faced in several research areas. Detecting and correctly classifying something unseen as anomalous is a challenging problem that has been tackled in many different manners over the years. Generative Adversarial Networks (GANs) and the adversarial training process have been recently employed to face this task yielding remarkable results. In this paper we survey the principal GAN-based anomaly detection methods, highlighting their pros and cons. Our contributions are the empirical validation of the main GAN models for anomaly detection, the increase of the experimental results on different datasets and the public release of a complete Open Source toolbox for Anomaly Detection using GANs.
Tasks	Anomaly Detection
Published	2019-06-27
URL	https://arxiv.org/abs/1906.11632v1
PDF	https://arxiv.org/pdf/1906.11632v1.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-gans-for-anomaly-detection
Repo
Framework

Ensemble Quantile Classifier


Title	Ensemble Quantile Classifier
Authors	Yuanhao Lai, Ian McLeod
Abstract	Both the median-based classifier and the quantile-based classifier are useful for discriminating high-dimensional data with heavy-tailed or skewed inputs. But these methods are restricted as they assign equal weight to each variable in an unregularized way. The ensemble quantile classifier is a more flexible regularized classifier that provides better performance with high-dimensional data, asymmetric data or when there are many irrelevant extraneous inputs. The improved performance is demonstrated by a simulation study as well as an application to text categorization. It is proven that the estimated parameters of the ensemble quantile classifier consistently estimate the minimal population loss under suitable general model assumptions. It is also shown that the ensemble quantile classifier is Bayes optimal under suitable assumptions with asymmetric Laplace distribution inputs.
Tasks	Text Categorization
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12960v1
PDF	https://arxiv.org/pdf/1910.12960v1.pdf
PWC	https://paperswithcode.com/paper/ensemble-quantile-classifier
Repo
Framework
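
For context, the median-based classifier that the abstract above takes as a starting point can be sketched as follows: each class is summarised by its component-wise medians, and a point is assigned to the class with the smallest L1 distance to that summary, with every variable weighted equally. This is a sketch of that baseline under those assumptions, not of the ensemble quantile classifier itself; the function names and toy data are hypothetical.

```python
from statistics import median


def fit_class_medians(X, y):
    """Compute the component-wise median vector of each class."""
    rows_by_class = {}
    for row, label in zip(X, y):
        rows_by_class.setdefault(label, []).append(row)
    return {
        label: [median(column) for column in zip(*rows)]
        for label, rows in rows_by_class.items()
    }


def predict(x, class_medians):
    """Assign x to the class whose median vector is nearest in L1 distance."""
    def l1_distance(label):
        # Every variable contributes with equal weight and no regularization.
        return sum(abs(value - m) for value, m in zip(x, class_medians[label]))

    return min(sorted(class_medians), key=l1_distance)


# Toy data: two well-separated classes in two dimensions.
X = [[0, 0], [1, 1], [2, 0], [10, 10], [11, 9], [12, 11]]
y = [0, 0, 0, 1, 1, 1]
medians = fit_class_medians(X, y)

print(predict([1.5, 0.5], medians))  # prints 0
print(predict([9, 12], medians))     # prints 1
```

The equal weighting inside `l1_distance` is the restriction the abstract points to: an irrelevant variable adds as much to the distance as an informative one.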

Community detection over a heterogeneous population of non-aligned networks


Title	Community detection over a heterogeneous population of non-aligned networks
Authors	Guilherme Gomes, Vinayak Rao, Jennifer Neville
Abstract	Clustering and community detection with multiple graphs have typically focused on aligned graphs, where there is a mapping between nodes across the graphs (e.g., multi-view, multi-layer, temporal graphs). However, there are numerous application areas with multiple graphs that are only partially aligned, or even unaligned. These graphs are often drawn from the same population, with communities of potentially different sizes that exhibit similar structure. In this paper, we develop a joint stochastic blockmodel (Joint SBM) to estimate shared communities across sets of heterogeneous non-aligned graphs. We derive an efficient spectral clustering approach to learn the parameters of the joint SBM. We evaluate the model on both synthetic and real-world datasets and show that the joint model is able to exploit cross-graph information to better estimate the communities compared to learning separate SBMs on each individual graph.
Tasks	Community Detection
Published	2019-04-04
URL	http://arxiv.org/abs/1904.05332v1
PDF	http://arxiv.org/pdf/1904.05332v1.pdf
PWC	https://paperswithcode.com/paper/community-detection-over-a-heterogeneous
Repo
Framework

HorNet: A Hierarchical Offshoot Recurrent Network for Improving Person Re-ID via Image Captioning


Title	HorNet: A Hierarchical Offshoot Recurrent Network for Improving Person Re-ID via Image Captioning
Authors	Shiyang Yan, Jun Xu, Yuai Liu, Lin Xu
Abstract	Person re-identification (re-ID) aims to recognize a person-of-interest across different cameras with notable appearance variance. Existing work has focused on the capability and robustness of visual representation. In this paper, instead, we propose a novel hierarchical offshoot recurrent network (HorNet) for improving person re-ID via image captioning. Image captions are semantically richer and more consistent than visual attributes, which could significantly alleviate the variance. We use the similarity preserving generative adversarial network (SPGAN) and an image captioner to perform domain transfer and language description generation. The proposed HorNet can then learn the visual and language representations jointly from both the images and the captions, and thus enhance the performance of person re-ID. Extensive experiments are conducted on several benchmark datasets with or without image captions, i.e., CUHK03, Market-1501, and Duke-MTMC, demonstrating the superiority of the proposed method. Our method can generate and extract meaningful image captions while achieving state-of-the-art performance.
Tasks	Image Captioning, Person Re-Identification
Published	2019-08-14
URL	https://arxiv.org/abs/1908.04915v1
PDF	https://arxiv.org/pdf/1908.04915v1.pdf
PWC	https://paperswithcode.com/paper/hornet-a-hierarchical-offshoot-recurrent
Repo
Framework

Scene-based Factored Attention for Image Captioning


Title	Scene-based Factored Attention for Image Captioning
Authors	Chen Shen, Rongrong Ji, Fuhai Chen, Xiaoshuai Sun, Xiangming Li
Abstract	Image captioning has attracted ever-increasing research attention in the multimedia community. Most cutting-edge works rely on an encoder-decoder framework with attention mechanisms and have achieved remarkable progress. However, such a framework does not use scene concepts when attending to visual information, which leads to sentence bias in caption generation and degrades performance accordingly. We argue that such scene concepts capture higher-level visual semantics and serve as an important cue in describing images. In this paper, we propose a novel scene-based factored attention module for image captioning. Specifically, the proposed module first embeds the scene concepts into factored weights explicitly and attends to the visual information extracted from the input image. Then, an adaptive LSTM is used to generate captions for specific scene types. Experimental results on the Microsoft COCO benchmark show that the proposed scene-based attention module substantially improves model performance and outperforms state-of-the-art approaches under various evaluation metrics.
Tasks	Image Captioning
Published	2019-08-07
URL	https://arxiv.org/abs/1908.02632v3
PDF	https://arxiv.org/pdf/1908.02632v3.pdf
PWC	https://paperswithcode.com/paper/scene-based-factored-attention-for-image
Repo
Framework

Operational Framework for Recent Advances in Backtracking Search Optimisation Algorithm: A Systematic Review and Performance Evaluation


Title	Operational Framework for Recent Advances in Backtracking Search Optimisation Algorithm: A Systematic Review and Performance Evaluation
Authors	Bryar A. Hassan, Tarik A. Rashid
Abstract	The experiments conducted in previous studies demonstrated the successful performance of the backtracking search optimisation algorithm (BSA) and its insensitivity to several types of optimisation problems. This success motivated researchers to extend BSA, e.g., by developing improved versions or applying it to different applications and problem domains. However, there is a lack of literature reviews on BSA; therefore, systematically reviewing these modifications and applications will aid further development of the algorithm. This paper provides a systematic review and meta-analysis that emphasises related studies and recent developments on BSA. The objectives of this work are two-fold: (i) First, two frameworks for depicting the main extensions and the uses of BSA are proposed. The first framework is a general framework to depict the main extensions of BSA, whereas the second is an operational framework to present the expansion procedures of BSA to guide researchers who are working on improving it. (ii) Second, the experiments conducted in this study fairly compare the analytical performance of BSA with four other competitive algorithms: differential evolution (DE), particle swarm optimisation (PSO), artificial bee colony (ABC), and firefly (FF) on 16 different hardness scores of the benchmark functions with different initial control parameters such as problem dimensions and search space. The experimental results indicate that BSA is statistically superior to the aforementioned algorithms in solving different cohorts of numerical optimisation problems, such as problems with different levels of hardness score, problem dimensions, and search spaces.
Tasks
Published	2019-11-29
URL	https://arxiv.org/abs/1911.13011v2
PDF	https://arxiv.org/pdf/1911.13011v2.pdf
PWC	https://paperswithcode.com/paper/operational-framework-for-recent-advances-in
Repo
Framework

Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment


Title	Automatic Radiology Report Generation based on Multi-view Image Fusion and Medical Concept Enrichment
Authors	Jianbo Yuan, Haofu Liao, Rui Luo, Jiebo Luo
Abstract	Generating radiology reports is time-consuming and requires extensive expertise in practice. Therefore, reliable automatic radiology report generation is highly desired to alleviate the workload. Although deep learning techniques have been successfully applied to image classification and image captioning tasks, radiology report generation remains challenging with regard to understanding and linking complicated medical visual contents with accurate natural language descriptions. In addition, the data scales of open-access datasets that contain paired medical images and reports remain very limited. To cope with these practical challenges, we propose a generative encoder-decoder model and focus on chest x-ray images and reports with the following improvements. First, we pretrain the encoder with a large number of chest x-ray images to accurately recognize 14 common radiographic observations, while taking advantage of the multi-view images by enforcing cross-view consistency. Second, we synthesize multi-view visual features based on a sentence-level attention mechanism in a late fusion fashion. In addition, in order to enrich the decoder with descriptive semantics and enforce the correctness of the deterministic medical-related contents such as mentions of organs or diagnoses, we extract medical concepts based on the radiology reports in the training data and fine-tune the encoder to extract the most frequent medical concepts from the x-ray images. Such concepts are fused with each decoding step by a word-level attention model. Experimental results on the Indiana University Chest X-Ray dataset demonstrate that the proposed model achieves state-of-the-art performance compared with other baseline approaches.
Tasks	Image Captioning, Image Classification
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09085v2
PDF	https://arxiv.org/pdf/1907.09085v2.pdf
PWC	https://paperswithcode.com/paper/automatic-radiology-report-generation-based
Repo
Framework

Short Term Prediction of Parking Area states Using Real Time Data and Machine Learning Techniques


Title	Short Term Prediction of Parking Area states Using Real Time Data and Machine Learning Techniques
Authors	Jesper Provoost, Luc Wismans, Sander Van der Drift, Andreas Kamilaris, Maurice Van Keulen
Abstract	Public road authorities and private mobility service providers need information derived from the current and predicted traffic states to act upon the daily urban system and its spatial and temporal dynamics. In this research, a real-time parking area state (occupancy, in- and outflux) prediction model (up to 60 minutes ahead) has been developed using publicly available historic and real-time data sources. Based on a case study in a real-life scenario in the city of Arnhem, a Neural Network-based approach outperforms a Random Forest-based one on all assessed performance measures, although the differences are small. Both outperform a naive seasonal random walk model. Although the performance degrades with increasing prediction horizon, the model shows a performance gain of over 150% at a prediction horizon of 60 minutes compared with the naive model. Furthermore, it is shown that predicting the in- and outflux is a far more difficult task (i.e. performance gains of 30%) which needs more training data, not based exclusively on occupancy rate. However, the performance of predicting in- and outflux is less sensitive to the prediction horizon. In addition, it is shown that real-time information on the current occupancy rate is the independent variable with the highest contribution to the performance, although time, traffic flow and weather variables also deliver a significant contribution. During real-time deployment, the model performs three times better than the naive model on average. As a result, it can provide valuable information for proactive traffic management as well as mobility service providers.
Tasks
Published	2019-11-29
URL	https://arxiv.org/abs/1911.13178v1
PDF	https://arxiv.org/pdf/1911.13178v1.pdf
PWC	https://paperswithcode.com/paper/short-term-prediction-of-parking-area-states
Repo
Framework
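
The naive seasonal random walk baseline mentioned in the abstract above is commonly implemented as a seasonal naive forecast: each future value is predicted to equal the value observed one season earlier. The sketch below assumes that reading; the season length, the occupancy numbers, and the function name are hypothetical, since the abstract does not specify them.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast `horizon` steps ahead by repeating the last observed season."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    # Step h (0-based) reuses the observation at the same position in the last season.
    return [last_season[h % season_length] for h in range(horizon)]


# Hypothetical occupancy counts with a season of four time steps.
occupancy = [10, 20, 30, 40, 12, 22, 32, 42]

print(seasonal_naive_forecast(occupancy, season_length=4, horizon=6))
# prints [12, 22, 32, 42, 12, 22]
```

A baseline like this needs no training, which is why it is a useful reference point when judging how much a learned model gains at longer horizons.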

On Sample Complexity Upper and Lower Bounds for Exact Ranking from Noisy Comparisons


Title	On Sample Complexity Upper and Lower Bounds for Exact Ranking from Noisy Comparisons
Authors	Wenbo Ren, Jia Liu, Ness B. Shroff
Abstract	This paper studies the problem of finding the exact ranking from noisy comparisons. A comparison over a set of $m$ items produces a noisy outcome about the most preferred item, and reveals some information about the ranking. By repeatedly and adaptively choosing items to compare, we want to fully rank the items with a certain confidence, and use as few comparisons as possible. Different from most previous works, in this paper, we have three main novelties: (i) compared to prior works, our upper bounds (algorithms) and lower bounds on the sample complexity (aka number of comparisons) require the minimal assumptions on the instances, and are not restricted to specific models; (ii) we give lower bounds and upper bounds on instances with unequal noise levels; and (iii) this paper aims at the exact ranking without knowledge on the instances, while most of the previous works either focus on approximate rankings or study exact ranking but require prior knowledge. We first derive lower bounds for pairwise ranking (i.e., compare two items each time), and then propose (nearly) optimal pairwise ranking algorithms. We further make extensions to listwise ranking (i.e., comparing multiple items each time). Numerical results also show our improvements against the state of the art.
Tasks
Published	2019-09-07
URL	https://arxiv.org/abs/1909.03194v1
PDF	https://arxiv.org/pdf/1909.03194v1.pdf
PWC	https://paperswithcode.com/paper/on-sample-complexity-upper-and-lower-bounds
Repo
Framework
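
The pairwise setting described in the abstract above can be illustrated with a much simpler, non-adaptive scheme: repeat each noisy comparison a fixed number of times, take the majority outcome, and sort with that comparator. This is not the (nearly) optimal algorithm from the paper; the noise model (a single `p_correct` for every pair), the fixed `repeats`, and all names are assumptions made for illustration.

```python
import random
from functools import cmp_to_key


def noisy_compare(i, j, true_rank, p_correct, rng):
    """Return True if the noisy oracle reports item i as preferred over item j."""
    i_truly_better = true_rank[i] < true_rank[j]
    # With probability p_correct the oracle tells the truth, otherwise it flips.
    return i_truly_better if rng.random() < p_correct else not i_truly_better


def majority_prefers(i, j, true_rank, p_correct, rng, repeats):
    """Repeat the comparison `repeats` times and return the majority outcome."""
    wins = sum(noisy_compare(i, j, true_rank, p_correct, rng) for _ in range(repeats))
    return 2 * wins > repeats


def rank_items(items, true_rank, p_correct, repeats=15, seed=0):
    """Sort items using majority-voted noisy comparisons (use an odd `repeats`)."""
    rng = random.Random(seed)

    def compare(i, j):
        return -1 if majority_prefers(i, j, true_rank, p_correct, rng, repeats) else 1

    return sorted(items, key=cmp_to_key(compare))


# Hypothetical ground truth: lower rank value means more preferred.
true_rank = {"a": 0, "b": 1, "c": 2}
print(rank_items(["c", "a", "b"], true_rank, p_correct=1.0))  # prints ['a', 'b', 'c']
```

A fixed `repeats` spends the same number of comparisons on easy and hard pairs; adaptive schemes of the kind studied in the paper aim to spend fewer samples where the noise level allows it.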

Super-Trajectories: A Compact Yet Rich Video Representation


Title	Super-Trajectories: A Compact Yet Rich Video Representation
Authors	Ijaz Akhter, Cheong Loong Fah, Richard Hartley
Abstract	We propose a new video representation in terms of an over-segmentation of dense trajectories covering the whole video. Trajectories are often used to encode long-temporal information in several computer vision applications. As with temporal superpixels, a temporal slice of super-trajectories gives superpixels, but super-trajectories contain more information because they also maintain long, dense pixel-wise tracking information. The main challenge in using trajectories for any application is the accumulation of tracking error during trajectory construction. For our problem, this results in disconnected superpixels. We exploit edge constraints in addition to trajectory-based color and position similarity. Analogous to superpixels as a preprocessing tool for images, the proposed representation has applications for videos, especially in trajectory-based video analysis.
Tasks
Published	2019-01-22
URL	http://arxiv.org/abs/1901.07273v1
PDF	http://arxiv.org/pdf/1901.07273v1.pdf
PWC	https://paperswithcode.com/paper/super-trajectories-a-compact-yet-rich-video
Repo
Framework

Learning morphological operators for skin detection


Title	Learning morphological operators for skin detection
Authors	Alessandra Lumini, Loris Nanni, Alice Codogno, Filippo Berno
Abstract	In this work we propose a novel post-processing approach for skin detectors based on trained morphological operators. In the first step, skin segmentation is performed according to an existing skin detection approach; in the second step, a set of morphological operators is applied to refine the resulting mask. Extensive experimental evaluation, considering two different detection approaches (one based on deep learning and a handcrafted one) and carried out on 10 different datasets, confirms the quality of the proposed method.
Tasks
Published	2019-08-09
URL	https://arxiv.org/abs/1908.03630v2
PDF	https://arxiv.org/pdf/1908.03630v2.pdf
PWC	https://paperswithcode.com/paper/learning-morphological-operators-for-skin
Repo
Framework
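
The second step described in the abstract above, refining a binary skin mask with morphological operators, can be sketched with a fixed morphological opening (erosion followed by dilation). The paper learns which operators to apply; the fixed 3x3 structuring element, the choice of opening, and the toy mask here are assumptions for illustration only.

```python
def _apply(mask, combine):
    """Apply `combine` (all or any) over each pixel's 3x3 neighbourhood."""
    height, width = len(mask), len(mask[0])
    return [
        [
            int(combine(
                # Pixels outside the image are treated as background (0).
                0 <= r + dr < height and 0 <= c + dc < width and mask[r + dr][c + dc] == 1
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ))
            for c in range(width)
        ]
        for r in range(height)
    ]


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return _apply(mask, all)


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return _apply(mask, any)


def opening(mask):
    """Erosion followed by dilation: removes small isolated detections."""
    return dilate(erode(mask))


# Hypothetical detector output: a 3x3 skin region plus one spurious pixel.
mask = [
    [0, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

for row in opening(mask):
    print(row)
```

On this toy mask the opening drops the isolated pixel in the top-right corner and leaves the 3x3 region intact.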