January 31, 2020

Paper Group ANR 164

Multiple Riemannian Manifold-valued Descriptors based Image Set Classification with Multi-Kernel Metric Learning. A Hierarchical Mixture Density Network. Video Skimming: Taxonomy and Comprehensive Survey. Air Quality Measurement Based on Double-Channel Convolutional Neural Network Ensemble Learning. Good-Enough Compositional Data Augmentation. A Li …

Multiple Riemannian Manifold-valued Descriptors based Image Set Classification with Multi-Kernel Metric Learning

Title Multiple Riemannian Manifold-valued Descriptors based Image Set Classification with Multi-Kernel Metric Learning
Authors Rui Wang, XiaoJun Wu, Josef Kittler
Abstract Image set recognition from videos captured in the wild is becoming increasingly important. However, the contents of these collected videos are often complicated, and how to efficiently perform set modeling and feature extraction is a big challenge for set-based classification algorithms. In recent years, some proposed image set classification methods have made considerable advances by modeling the original image set with a covariance matrix, a linear subspace, or a Gaussian distribution. However, most of them adopt only a single geometric model to describe each given image set, which may lose other information that is useful for classification. To tackle this problem, we propose a novel algorithm to model each image set from a multi-geometric perspective. Specifically, the covariance matrix, linear subspace, and Gaussian distribution are applied for set representation simultaneously. In order to fuse these multiple heterogeneous Riemannian manifold-valued features, the well-equipped Riemannian kernel functions are first utilized to map them into high dimensional Hilbert spaces. Then, a multi-kernel metric learning framework is devised to embed the learned hybrid kernels into a lower dimensional common subspace for classification. We conduct experiments on four widely used datasets corresponding to four different classification tasks (video-based face recognition, set-based object categorization, video-based emotion recognition, and dynamic scene classification) to evaluate the classification performance of the proposed algorithm. Extensive experimental results demonstrate its superiority over the state of the art.
Tasks Emotion Recognition, Face Recognition, Metric Learning, Scene Classification
Published 2019-08-06
URL https://arxiv.org/abs/1908.01950v1
PDF https://arxiv.org/pdf/1908.01950v1.pdf
PWC https://paperswithcode.com/paper/multiple-riemannian-manifold-valued
Repo
Framework
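
As a rough illustration of the kind of Riemannian kernel the abstract above refers to, the sketch below represents an image set by a regularized covariance matrix and compares two such matrices with a Gaussian kernel on the log-Euclidean distance. The abstract does not say which kernels the authors use, so the kernel choice, the function names, and the `sigma` and `eps` parameters are all assumptions made for illustration.

```python
import numpy as np

def set_covariance(features, eps=1e-6):
    """Describe an image set (rows = per-image feature vectors) by a covariance matrix.
    A small ridge (eps, an assumed value) keeps the matrix strictly positive definite."""
    c = np.cov(features, rowvar=False)
    return c + eps * np.eye(c.shape[0])

def spd_log(m):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.T

def log_euclidean_kernel(a, b, sigma=1.0):
    """Gaussian kernel on the log-Euclidean distance between two SPD matrices."""
    d = spd_log(a) - spd_log(b)
    return float(np.exp(-np.sum(d * d) / (2.0 * sigma ** 2)))
```

In a multi-kernel setting, a kernel of this kind would be computed per descriptor (covariance, subspace, Gaussian) and the resulting kernel matrices combined; that combination step is not sketched here.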

A Hierarchical Mixture Density Network

Title A Hierarchical Mixture Density Network
Authors Fan Yang, Jaymar Soriano, Takatomi Kubo, Kazushi Ikeda
Abstract The relationship among three correlated variables can be very sophisticated; as a result, we may not be able to find their hidden causality and model their relationship explicitly. However, we can still make our best guess at possible mappings among these variables, based on the observed relationship. One of the complicated relationships among three correlated variables could be a two-layer hierarchical many-to-many mapping. In this paper, we propose a Hierarchical Mixture Density Network (HMDN) to model the two-layer hierarchical many-to-many mapping. We apply HMDN to an indoor positioning problem and show its benefit.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.13523v1
PDF https://arxiv.org/pdf/1910.13523v1.pdf
PWC https://paperswithcode.com/paper/a-hierarchical-mixture-density-network
Repo
Framework
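
For readers unfamiliar with mixture density networks, the sketch below shows the standard negative log-likelihood of a scalar target under a one-dimensional Gaussian mixture, which is the usual training loss for a plain (non-hierarchical) MDN. It is not the authors' HMDN; the function name and the parameterization by logits and log standard deviations are assumptions.

```python
import numpy as np

def mdn_nll(logits, means, log_sigmas, y):
    """Negative log-likelihood of scalar y under a 1-D Gaussian mixture.

    logits, means, log_sigmas: arrays of shape (K,), as a network head would emit.
    """
    # Log mixture weights via a numerically stable log-softmax.
    log_pi = logits - np.logaddexp.reduce(logits)
    sigmas = np.exp(log_sigmas)
    # Log density of y under each Gaussian component.
    log_comp = (-0.5 * np.log(2.0 * np.pi) - log_sigmas
                - 0.5 * ((y - means) / sigmas) ** 2)
    # Log of the weighted sum of component densities, then negate.
    return float(-np.logaddexp.reduce(log_pi + log_comp))
```

Because the output is a full mixture rather than a single point estimate, one input can map to several plausible outputs, which is what makes this family of models suitable for many-to-many mappings.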

Video Skimming: Taxonomy and Comprehensive Survey

Title Video Skimming: Taxonomy and Comprehensive Survey
Authors Vivekraj V. K., Debashis Sen, Balasubramanian Raman
Abstract Video skimming, also known as dynamic video summarization, generates a temporally abridged version of a given video. Skimming can be achieved by identifying significant components either in uni-modal or multi-modal features extracted from the video. Being dynamic in nature, video skimming, through temporal connectivity, allows better understanding of the video from its summary. Having this obvious advantage, recently, video skimming has drawn the focus of many researchers benefiting from the easy availability of the required computing resources. In this paper, we provide a comprehensive survey on video skimming focusing on the substantial amount of literature from the past decade. We present a taxonomy of video skimming approaches, and discuss their evolution highlighting key advances. We also provide a study on the components required for the evaluation of a video skimming performance.
Tasks Video Summarization
Published 2019-09-21
URL https://arxiv.org/abs/1909.12948v1
PDF https://arxiv.org/pdf/1909.12948v1.pdf
PWC https://paperswithcode.com/paper/video-skimming-taxonomy-and-comprehensive
Repo
Framework

Air Quality Measurement Based on Double-Channel Convolutional Neural Network Ensemble Learning

Title Air Quality Measurement Based on Double-Channel Convolutional Neural Network Ensemble Learning
Authors Zhenyu Wang, Wei Zheng, Chunfeng Song
Abstract Environmental air quality affects people’s lives, and obtaining real-time and accurate air quality measurements has profound guiding significance for the development of social activities. At present, environmental air quality measurement mainly relies on placing air quality detectors at specific monitoring points in cities and performing timed sampling and analysis, which is easily restricted by time and space factors. Air quality measurement algorithms based on deep learning mostly adopt a single convolutional neural network trained on the whole image, which ignores the differences between different parts of the image. In this paper, we propose a method for air quality measurement based on double-channel convolutional neural network ensemble learning to solve the problem of feature extraction for different parts of environmental images. Our method mainly includes two aspects: ensemble learning of a double-channel convolutional neural network and self-learning weighted feature fusion. We construct a double-channel convolutional neural network and use each channel to train on different parts of the environmental images for feature extraction. We propose a feature weight self-learning method, which weights and concatenates the extracted feature vectors, and uses the fused feature vectors to measure air quality. Our method can be applied to the two tasks of air quality grade measurement and air quality index (AQI) measurement. Moreover, we build an environmental image dataset captured at random times and locations. The experiments show that our method can achieve nearly 82% accuracy and a small mean absolute error (MAE) on our test dataset. At the same time, through comparative experiments, we show that our proposed method gains a considerable improvement in performance compared with single-channel convolutional neural network air quality measurement.
Tasks
Published 2019-02-19
URL http://arxiv.org/abs/1902.06942v3
PDF http://arxiv.org/pdf/1902.06942v3.pdf
PWC https://paperswithcode.com/paper/air-quality-measurement-based-on-double
Repo
Framework

Good-Enough Compositional Data Augmentation

Title Good-Enough Compositional Data Augmentation
Authors Jacob Andreas
Abstract We propose a simple data augmentation protocol aimed at providing a compositional inductive bias in conditional and unconditional sequence models. Under this protocol, synthetic training examples are constructed by taking real training examples and replacing (possibly discontinuous) fragments with other fragments that appear in at least one similar environment. The protocol is model-agnostic and useful for a variety of tasks. Applied to neural sequence-to-sequence models, it reduces relative error rate by up to 87% on problems from the diagnostic SCAN tasks and 16% on a semantic parsing task. Applied to n-gram language modeling, it reduces perplexity by roughly 1% on small datasets in several languages.
Tasks Data Augmentation, Language Modelling, Semantic Parsing
Published 2019-04-21
URL https://arxiv.org/abs/1904.09545v2
PDF https://arxiv.org/pdf/1904.09545v2.pdf
PWC https://paperswithcode.com/paper/good-enough-compositional-data-augmentation
Repo
Framework
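
The substitution protocol described in the abstract above can be illustrated with a heavily simplified sketch. Here a "fragment" is restricted to a single token and an "environment" is the rest of the sentence with that token masked; the paper allows multi-token and possibly discontinuous fragments, which this sketch does not attempt. The function name `augment` and the toy corpus are hypothetical.

```python
from collections import defaultdict

def augment(sentences):
    """Build synthetic sentences by swapping tokens that share an environment."""
    # Map each environment (sentence with one token masked) to the tokens seen in it.
    env_to_tokens = defaultdict(set)
    for s in sentences:
        toks = s.split()
        for i, t in enumerate(toks):
            env = tuple(toks[:i] + ["_"] + toks[i + 1:])
            env_to_tokens[env].add(t)

    # Tokens that appear in at least one common environment are interchangeable.
    swaps = defaultdict(set)
    for toks in env_to_tokens.values():
        for a in toks:
            swaps[a] |= toks - {a}

    # Substitute interchangeable tokens everywhere, keeping only unseen sentences.
    seen = set(sentences)
    out = set()
    for s in sentences:
        toks = s.split()
        for i, t in enumerate(toks):
            for r in swaps.get(t, ()):
                new = " ".join(toks[:i] + [r] + toks[i + 1:])
                if new not in seen:
                    out.add(new)
    return sorted(out)

corpus = ["she picks the wug up", "she picks the cat up", "the wug sleeps"]
print(augment(corpus))  # ['the cat sleeps']
```

Since "wug" and "cat" both occur in the environment "she picks the _ up", the sketch licenses replacing "wug" with "cat" in "the wug sleeps", producing a sentence that never appeared in the training data.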

A Link Between the Multiplicative and Additive Functional Asplund’s Metrics

Title A Link Between the Multiplicative and Additive Functional Asplund’s Metrics
Authors Guillaume Noyel
Abstract Functional Asplund’s metrics were recently introduced to perform pattern matching robust to lighting changes thanks to double-sided probing in the Logarithmic Image Processing (LIP) framework. Two metrics were defined, namely the LIP-multiplicative Asplund’s metric, which is robust to variations of object thickness (or opacity), and the LIP-additive Asplund’s metric, which is robust to variations of camera exposure-time (or light intensity). Maps of distances (i.e., maps of these metric values) were also computed between a reference template and an image. Recently, it was proven that the map of LIP-multiplicative Asplund’s distances corresponds to mathematical morphology operations. In this paper, the link between both metrics and between their corresponding distance maps will be demonstrated. It will be shown that the map of LIP-additive Asplund’s distances of an image can be computed from the map of the LIP-multiplicative Asplund’s distance of a transform of this image and vice-versa. Both maps will be related by the LIP isomorphism, which allows passing from the image space of the LIP-additive distance map to the positive real function space of the LIP-multiplicative distance map. Experiments will illustrate this relation and the robustness of the LIP-additive Asplund’s metric to lighting changes.
Tasks
Published 2019-07-17
URL https://arxiv.org/abs/1907.07509v1
PDF https://arxiv.org/pdf/1907.07509v1.pdf
PWC https://paperswithcode.com/paper/a-link-between-the-multiplicative-and
Repo
Framework
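
For context, the LIP operations and the isomorphism mentioned in the abstract above are usually written as follows in the LIP literature. These are the standard textbook definitions, not formulas taken from the abstract; $M$ denotes the upper bound of the gray-level range and $f, g$ take values in $[0, M)$.

```latex
% LIP addition and LIP scalar multiplication
f \overset{\triangle}{+} g = f + g - \frac{f\,g}{M},
\qquad
\lambda \overset{\triangle}{\times} f = M - M\left(1 - \frac{f}{M}\right)^{\lambda}

% LIP isomorphism onto the positive real functions
\xi(f) = -M \ln\!\left(1 - \frac{f}{M}\right)

% It turns the LIP operations into ordinary ones
\xi\!\left(f \overset{\triangle}{+} g\right) = \xi(f) + \xi(g),
\qquad
\xi\!\left(\lambda \overset{\triangle}{\times} f\right) = \lambda\,\xi(f)
```

The last line follows because $1 - (f \overset{\triangle}{+} g)/M = (1 - f/M)(1 - g/M)$, so the logarithm converts the LIP sum into an ordinary sum.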

Regional based query in graph active learning

Title Regional based query in graph active learning
Authors Roy Abel, Yoram Louzoun
Abstract Graph convolution networks (GCN) have emerged as the leading method to classify node classes in networks, and have reached the highest accuracy in multiple node classification tasks. In the absence of available tagged samples, active learning methods have been developed to obtain the highest accuracy using the minimal number of queries to an oracle. The current best active learning methods use the sample class uncertainty as the selection criterion. However, in graph based classification, the class of each node is often related to the class of its neighbors. As such, the uncertainty in the class of a node’s neighbors may be a more appropriate selection criterion. We here propose two such criteria, one extending the classical uncertainty measure, and the other extending the page-rank algorithm. We show that the latter is optimal when the fraction of tagged nodes is low, and when this fraction grows to one over the average degree, the regional uncertainty performs better than all existing methods. While we have tested these methods on graphs, they can be extended to any classification problem where a distance metric can be defined between the input samples. All the code used can be accessed at https://github.com/louzounlab/graph-al and all the datasets used at https://github.com/louzounlab/DataSets
Tasks Active Learning, Node Classification
Published 2019-06-20
URL https://arxiv.org/abs/1906.08541v1
PDF https://arxiv.org/pdf/1906.08541v1.pdf
PWC https://paperswithcode.com/paper/regional-based-query-in-graph-active-learning
Repo
Framework
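
One way to picture a neighborhood-based selection criterion like the one described above is to average the predictive entropy over a node and its neighbors. The abstract does not give the authors' exact formula, so the averaging rule below, the function names, and the toy graph are assumptions; the linked repository is the reference for the real criteria.

```python
import numpy as np

def entropy(probs):
    """Shannon entropy of each row of a class-probability matrix."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def regional_uncertainty(probs, adjacency):
    """Mean entropy over each node and its neighbors (assumed aggregation rule)."""
    h = entropy(probs)
    a = adjacency + np.eye(len(adjacency))  # include the node itself
    return (a @ h) / a.sum(axis=1)

# Toy star graph: node 0 is confidently classified, its three leaves are not.
probs = np.array([[1.0, 0.0], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])
adjacency = np.array([[0, 1, 1, 1],
                      [1, 0, 0, 0],
                      [1, 0, 0, 0],
                      [1, 0, 0, 0]], dtype=float)
scores = regional_uncertainty(probs, adjacency)
query = int(np.argmax(scores))  # node to send to the oracle
```

In this toy graph, plain per-node uncertainty would query one of the leaves, while the regional score prefers the hub, since labeling it informs the most uncertain neighbors.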

Learning Interpretable Features via Adversarially Robust Optimization

Title Learning Interpretable Features via Adversarially Robust Optimization
Authors Ashkan Khakzar, Shadi Albarqouni, Nassir Navab
Abstract Neural networks have proven to be remarkably successful for classification and diagnosis in medical applications. However, the ambiguity in the decision-making process and the interpretability of the learned features are a matter of concern. In this work, we propose a method for improving the feature interpretability of neural network classifiers. Initially, we propose a baseline convolutional neural network with state-of-the-art performance in terms of accuracy and weakly supervised localization. Subsequently, the loss is modified to integrate robustness to adversarial examples into the training process. In this work, feature interpretability is quantified by evaluating the weakly supervised localization using the ground truth bounding boxes. Interpretability is also visually assessed using class activation maps and saliency maps. The method is applied to NIH ChestX-ray14, the largest publicly available chest x-ray dataset. We demonstrate that the adversarially robust optimization paradigm improves feature interpretability both quantitatively and visually.
Tasks Decision Making
Published 2019-05-09
URL https://arxiv.org/abs/1905.03767v2
PDF https://arxiv.org/pdf/1905.03767v2.pdf
PWC https://paperswithcode.com/paper/190503767
Repo
Framework

Multi-style Generative Reading Comprehension

Title Multi-style Generative Reading Comprehension
Authors Kyosuke Nishida, Itsumi Saito, Kosuke Nishida, Kazutoshi Shinoda, Atsushi Otsuka, Hisako Asano, Junji Tomita
Abstract This study tackles generative reading comprehension (RC), which consists of answering questions based on textual evidence and natural language generation (NLG). We propose a multi-style abstractive summarization model for question answering, called Masque. The proposed model has two key characteristics. First, unlike most studies on RC that have focused on extracting an answer span from the provided passages, our model instead focuses on generating a summary from the question and multiple passages. This serves to cover various answer styles required for real-world applications. Second, whereas previous studies built a specific model for each answer style because of the difficulty of acquiring one general model, our approach learns multi-style answers within a model to improve the NLG capability for all styles involved. This also enables our model to give an answer in the target style. Experiments show that our model achieves state-of-the-art performance on the Q&A task and the Q&A + NLG task of MS MARCO 2.1 and the summary task of NarrativeQA. We observe that the transfer of the style-independent NLG capability to the target style is the key to its success.
Tasks Abstractive Text Summarization, Question Answering, Reading Comprehension, Text Generation
Published 2019-01-08
URL https://arxiv.org/abs/1901.02262v2
PDF https://arxiv.org/pdf/1901.02262v2.pdf
PWC https://paperswithcode.com/paper/multi-style-generative-reading-comprehension
Repo
Framework

UU-Nets Connecting Discriminator and Generator for Image to Image Translation

Title UU-Nets Connecting Discriminator and Generator for Image to Image Translation
Authors Wu Jionghao
Abstract Adversarial generative models have proven successful in image synthesis. However, their performance deteriorates and becomes unstable because the discriminator is far more stable than the generator, and it is hard to control the game between the two modules. Various methods have been introduced to tackle the problem, such as WGAN, Relativistic GAN, and their successors, by adding to or restricting the loss function. These certainly help balance the min-max game, but they all focus on the loss function and ignore the intrinsic structural limitation. We present a UU-Net architecture inspired by U-Net, which bridges the encoder and the decoder. UU-Net is composed of two U-Net-like modules that serve as generator and discriminator, respectively. Because the modules in U-Net are symmetrical, weights can easily be shared between all four components. Thanks to the identical and symmetric structure of UU-Net’s modules, we can carry features from the inner generator’s encoder not only to its decoder, but also to the discriminator’s encoder and decoder. This design gives us more control and conditioning flexibility to intervene in the process between the generator and the discriminator.
Tasks Image Generation, Image-to-Image Translation
Published 2019-04-04
URL http://arxiv.org/abs/1904.02675v1
PDF http://arxiv.org/pdf/1904.02675v1.pdf
PWC https://paperswithcode.com/paper/uu-nets-connecting-discriminator-and
Repo
Framework

Pyramid: Machine Learning Framework to Estimate the Optimal Timing and Resource Usage of a High-Level Synthesis Design

Title Pyramid: Machine Learning Framework to Estimate the Optimal Timing and Resource Usage of a High-Level Synthesis Design
Authors Hosein Mohammadi Makrani, Farnoud Farahmand, Hossein Sayadi, Sara Bondi, Sai Manoj Pudukotai Dinakarrao, Liang Zhao, Avesta Sasan, Houman Homayoun, Setareh Rafatirad
Abstract The emergence of High-Level Synthesis (HLS) tools shifted the paradigm of hardware design by making the process of mapping high-level programming languages to hardware design such as C to VHDL/Verilog feasible. HLS tools offer a plethora of techniques to optimize designs for both area and performance, but resource usage and timing reports of HLS tools mostly deviate from the post-implementation results. In addition, to evaluate a hardware design performance, it is critical to determine the maximum achievable clock frequency. Obtaining such information using static timing analysis provided by CAD tools is difficult, due to the multitude of tool options. Moreover, a binary search to find the maximum frequency is tedious, time-consuming, and often does not obtain the optimal result. To address these challenges, we propose a framework, called Pyramid, that uses machine learning to accurately estimate the optimal performance and resource utilization of an HLS design. For this purpose, we first create a database of C-to-FPGA results from a diverse set of benchmarks. To find the achievable maximum clock frequency, we use Minerva, which is an automated hardware optimization tool. Minerva determines the close-to-optimal settings of tools, using static timing analysis and a heuristic algorithm, and targets either optimal throughput or throughput-to-area. Pyramid uses the database to train an ensemble machine learning model to map the HLS-reported features to the results of Minerva. To this end, Pyramid re-calibrates the results of HLS to bridge the accuracy gap and enable developers to estimate the throughput or throughput-to-area of hardware design with more than 95% accuracy and alleviates the need to perform actual implementation for estimation.
Tasks
Published 2019-07-29
URL https://arxiv.org/abs/1907.12952v1
PDF https://arxiv.org/pdf/1907.12952v1.pdf
PWC https://paperswithcode.com/paper/pyramid-machine-learning-framework-to
Repo
Framework

AI-Powered Text Generation for Harmonious Human-Machine Interaction: Current State and Future Directions

Title AI-Powered Text Generation for Harmonious Human-Machine Interaction: Current State and Future Directions
Authors Qiuyun Zhang, Bin Guo, Hao Wang, Yunji Liang, Shaoyang Hao, Zhiwen Yu
Abstract In the last two decades, the landscape of text generation has undergone tremendous changes and is being reshaped by the success of deep learning. New technologies for text generation, ranging from template-based methods to neural network-based methods, have emerged. Meanwhile, the research objectives have also changed from generating smooth and coherent sentences to infusing personalized traits to enrich the diversity of newly generated content. With the rapid development of text generation solutions, a comprehensive survey is urgently needed to summarize the achievements and track the state of the art. In this survey paper, we present a general systematic framework, illustrate the widely utilized models, and summarize the classic applications of text generation.
Tasks Text Generation
Published 2019-05-01
URL http://arxiv.org/abs/1905.01984v1
PDF http://arxiv.org/pdf/1905.01984v1.pdf
PWC https://paperswithcode.com/paper/ai-powered-text-generation-for-harmonious
Repo
Framework

Unsupervised Contextual Anomaly Detection using Joint Deep Variational Generative Models

Title Unsupervised Contextual Anomaly Detection using Joint Deep Variational Generative Models
Authors Yaniv Shulman
Abstract A method for unsupervised contextual anomaly detection is proposed using a cross-linked pair of Variational Auto-Encoders for assigning a normality score to an observation. The method enables a distinct separation of contextual from behavioral attributes and is robust to the presence of anomalous or novel contextual attributes. The method can be trained with data sets that contain anomalies without any special pre-processing.
Tasks Anomaly Detection
Published 2019-04-01
URL http://arxiv.org/abs/1904.00548v1
PDF http://arxiv.org/pdf/1904.00548v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-contextual-anomaly-detection
Repo
Framework

A Deep DUAL-PATH Network for Improved Mammogram Image Processing

Title A Deep DUAL-PATH Network for Improved Mammogram Image Processing
Authors Heyi Li, Dongdong Chen, William H. Nailon, Mike E. Davies, Dave Laurenson
Abstract We present, for the first time, a novel deep neural network architecture with a dual-path connection between the input image and output class label for mammogram image processing. This architecture is built upon U-Net, which non-linearly maps the input data into a deep latent space. One path of the network, the locality preserving learner, is devoted to hierarchically extracting and exploiting intrinsic features of the input, while the other path, called the conditional graph learner, focuses on modeling the input-mask correlations. The learned mask is further used to improve classification results, and the two learning paths complement each other. By integrating the two learners, our new architecture provides a simple but effective way to jointly learn the segmentation and predict the class label. Benefiting from the powerful expressive capacity of deep neural networks, a more discriminative representation can be learned, in which both the semantics and structure are well preserved. Experimental results show that the proposed network achieves the best mammography segmentation and classification simultaneously, outperforming recent state-of-the-art models.
Tasks
Published 2019-03-01
URL http://arxiv.org/abs/1903.00001v1
PDF http://arxiv.org/pdf/1903.00001v1.pdf
PWC https://paperswithcode.com/paper/a-deep-dual-path-network-for-improved
Repo
Framework

MSD-Kmeans: A Novel Algorithm for Efficient Detection of Global and Local Outliers

Title MSD-Kmeans: A Novel Algorithm for Efficient Detection of Global and Local Outliers
Authors Yuanyuan Wei, Julian Jang-Jaccard, Fariza Sabrina, Timothy McIntosh
Abstract Outlier detection is a technique in data mining that aims to detect unusual or unexpected records in a dataset. Existing outlier detection algorithms have different pros and cons and exhibit different sensitivity to noisy data such as extreme values. In this paper, we propose a novel cluster-based outlier detection algorithm named MSD-Kmeans that combines the statistical method of Mean and Standard Deviation (MSD) and the machine learning clustering algorithm K-means to detect outliers more accurately with better control of extreme values. There are two phases in the MSD-Kmeans combination method: (1) applying the MSD algorithm to eliminate as much noisy data as possible and so minimize its interference with the clusters, and (2) applying the K-means algorithm to obtain locally optimal clusters. We evaluate our algorithm and demonstrate its effectiveness in the context of detecting possible overcharging of taxi fares, as greedy dishonest drivers may attempt to charge high fares by detouring. We compare the performance indicators of MSD-Kmeans with those of other outlier detection algorithms, such as MSD, K-means, Z-score, MIQR and LOF, and show that the proposed MSD-Kmeans algorithm achieves the highest precision, accuracy, and F-measure. We conclude that MSD-Kmeans can be used for effective and efficient outlier detection on data of varying quality on IoT devices.
Tasks Outlier Detection
Published 2019-10-15
URL https://arxiv.org/abs/1910.06588v1
PDF https://arxiv.org/pdf/1910.06588v1.pdf
PWC https://paperswithcode.com/paper/msd-kmeans-a-novel-algorithm-for-efficient
Repo
Framework
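
The two phases named in the abstract above can be sketched on one-dimensional fare data as follows. This is a minimal illustration, not the authors' implementation: the three-standard-deviation threshold, the quantile-based seeding, the function names, and the toy fares are all assumptions, and the abstract does not specify how the resulting clusters are then used to score local outliers, so that step is left out.

```python
import numpy as np

def msd_filter(x, n_std=3.0):
    """Phase 1 (MSD): keep values within n_std standard deviations of the mean."""
    mu, sd = x.mean(), x.std()
    return np.abs(x - mu) <= n_std * sd

def kmeans_1d(x, k, iters=50):
    """Phase 2: plain Lloyd's k-means on 1-D data, seeded at evenly spaced quantiles."""
    centers = np.quantile(x, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            x[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the converged centers.
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    return centers, labels

# Toy fares: short trips near 10, long trips near 50, and one extreme value.
fares = np.array([9, 10, 11, 10, 9, 11, 10, 10,
                  49, 50, 51, 50, 49, 51, 50, 50,
                  500], dtype=float)
keep = msd_filter(fares)                       # extreme value is dropped here
centers, labels = kmeans_1d(fares[keep], k=2)  # clusters are fit on the cleaned data
```

Removing the extreme fare first keeps it from dragging a centroid toward itself, which is the interference the first phase is meant to limit.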