Paper Group ANR 547
Honey Authentication with Machine Learning Augmented Bright-Field Microscopy
Title | Honey Authentication with Machine Learning Augmented Bright-Field Microscopy |
Authors | Peter He, Alexis Gkantiragas, Gerard Glowacki |
Abstract | Honey has been collected and used by humankind as both a food and a medicine for thousands of years. However, in the modern economy, honey has become subject to mislabelling and adulteration, making it the third most faked food product in the world. The international scale of fraudulent honey has had both economic and environmental ramifications. In this paper, we propose a novel method of identifying fraudulent honey using machine learning augmented microscopy. |
Tasks | |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1901.00516v1 |
PDF | http://arxiv.org/pdf/1901.00516v1.pdf |
PWC | https://paperswithcode.com/paper/honey-authentication-with-machine-learning |
Repo | |
Framework | |
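The abstract above does not describe the model or data pipeline, so the snippet below is only a hypothetical sketch of how a small classifier might be trained on labelled bright-field microscopy patches; the architecture, patch size, and label scheme are assumptions, not the authors' method.

```python
# Hypothetical sketch: binary classifier for bright-field microscopy patches.
# Architecture and data handling are illustrative assumptions only.
import torch
import torch.nn as nn

class TinyMicroscopyCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        h = self.features(x)               # (B, 32, 16, 16) for 64x64 inputs
        return self.classifier(h.flatten(1))

model = TinyMicroscopyCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random tensors stand in for labelled 64x64 RGB microscopy patches.
patches = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 2, (8,))         # 0 = authentic, 1 = adulterated
loss = loss_fn(model(patches), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```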
Distribution Networks for Open Set Learning
Title | Distribution Networks for Open Set Learning |
Authors | Chengsheng Mao, Liang Yao, Yuan Luo |
Abstract | In open set learning, a model must be able to generalize to novel classes when it encounters a sample that does not belong to any of the classes it has seen before. Open set learning poses a realistic learning scenario that is receiving growing attention. Existing studies on open set learning have mainly focused on detecting novel classes, but few have tried to model them so that novel classes can be differentiated from one another. In this paper, we recognize that novel classes should be different from each other, and propose distribution networks for open set learning that can model different novel classes based on probability distributions. We hypothesize that, through a certain mapping, samples from different classes with the same classification criterion should follow different probability distributions from the same distribution family. A deep neural network is learned to map the samples in the original feature space to a latent space where the distributions of known classes can be jointly learned with the network. We additionally propose a distribution parameter transfer and updating strategy for novel class modeling when a novel class is detected in the latent space. Through novel class modeling, the detected novel classes can serve as known classes for subsequent classification. Our experimental results on the image datasets MNIST and CIFAR10 show that distribution networks can detect novel classes accurately and model them well for subsequent classification tasks. |
Tasks | Open Set Learning |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.08106v2 |
PDF | http://arxiv.org/pdf/1809.08106v2.pdf |
PWC | https://paperswithcode.com/paper/distribution-networks-for-open-set-learning |
Repo | |
Framework | |
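A minimal sketch of the general idea in the abstract above: fit one class-conditional Gaussian per known class in a latent space and flag samples whose best log-density falls below a threshold as novel. The identity encoder, shared diagonal covariances, and threshold value are illustrative assumptions; the paper learns the mapping jointly with the distributions and also transfers parameters to newly detected classes.

```python
# Minimal sketch: class-conditional Gaussians in a latent space with a
# log-density threshold for flagging novel classes (illustrative assumptions).
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

def encode(x):
    # Stand-in for the learned deep mapping into the latent space.
    return x

# Fit one Gaussian per known class from its latent codes.
latents = {0: rng.normal(0.0, 1.0, (100, 2)), 1: rng.normal(5.0, 1.0, (100, 2))}
class_dists = {
    c: multivariate_normal(mean=z.mean(axis=0), cov=np.diag(z.var(axis=0) + 1e-6))
    for c, z in latents.items()
}

def classify(x, log_density_threshold=-10.0):
    z = encode(x)
    scores = {c: d.logpdf(z) for c, d in class_dists.items()}
    best = max(scores, key=scores.get)
    if scores[best] < log_density_threshold:
        return "novel"            # a new Gaussian could be spawned here
    return best

print(classify(np.array([0.2, -0.1])))    # likely class 0
print(classify(np.array([20.0, 20.0])))   # likely "novel"
```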
Learning Light Field Reconstruction from a Single Coded Image
Title | Learning Light Field Reconstruction from a Single Coded Image |
Authors | Anil Kumar Vadathya, Saikiran Cholleti, Gautham Ramajayam, Vijayalakshmi Kanchana, Kaushik Mitra |
Abstract | Light field imaging is a rich way of representing the 3D world around us. However, due to limited sensor resolution, capturing light field data inherently poses a spatio-angular resolution trade-off. In this paper, we propose a deep learning based solution to tackle this resolution trade-off. Specifically, we reconstruct the full sensor resolution light field from a single coded image. We propose to do this in three stages: 1) reconstruction of the center view from the coded image; 2) estimation of the disparity map from the coded image and the center view; 3) warping of the center view using the disparity to generate the light field. We propose three neural networks for these stages. Our disparity estimation network is trained in an unsupervised manner, alleviating the need for ground truth disparity. Our results demonstrate better recovery of parallax from the coded image. We also obtain better results than dictionary learning based approaches, both qualitatively and quantitatively. |
Tasks | Dictionary Learning, Disparity Estimation |
Published | 2018-01-20 |
URL | http://arxiv.org/abs/1801.06710v2 |
PDF | http://arxiv.org/pdf/1801.06710v2.pdf |
PWC | https://paperswithcode.com/paper/learning-light-field-reconstruction-from-a |
Repo | |
Framework | |
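The third stage described above, warping the center view with the estimated disparity, can be sketched as follows; bilinear warping via grid_sample is one common implementation choice and is not necessarily the authors' exact layer.

```python
# Sketch of the warping stage: synthesize one sub-aperture view by shifting the
# center view according to the disparity map (illustrative implementation).
import torch
import torch.nn.functional as F

def warp_center_view(center, disparity, du, dv):
    """center: (1, 3, H, W), disparity: (1, 1, H, W) in pixels,
    (du, dv): angular offset of the target view from the center view."""
    _, _, H, W = center.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    xs = xs.float() + du * disparity[0, 0]
    ys = ys.float() + dv * disparity[0, 0]
    # Normalize pixel coordinates to [-1, 1] as grid_sample expects.
    grid_x = 2.0 * xs / (W - 1) - 1.0
    grid_y = 2.0 * ys / (H - 1) - 1.0
    grid = torch.stack([grid_x, grid_y], dim=-1).unsqueeze(0)  # (1, H, W, 2)
    return F.grid_sample(center, grid, align_corners=True)

center = torch.rand(1, 3, 64, 64)
disp = torch.zeros(1, 1, 64, 64)           # zero disparity -> identity warp
view = warp_center_view(center, disp, du=1.0, dv=-1.0)
print(view.shape)                           # torch.Size([1, 3, 64, 64])
```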
TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network
Title | TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network |
Authors | Yipeng Sun, Chengquan Zhang, Zuming Huang, Jiaming Liu, Junyu Han, Errui Ding |
Abstract | Reading text from images remains challenging due to multi-orientation, perspective distortion and, especially, the curved nature of irregular text. Most existing approaches attempt to solve the problem in two or more stages, which is considered a bottleneck for optimizing overall performance. To address this issue, we propose an end-to-end trainable network architecture, named TextNet, which is able to simultaneously localize and recognize irregular text in images. Specifically, we develop a scale-aware attention mechanism to learn multi-scale image features as a backbone network, sharing fully convolutional features and computation for localization and recognition. In the text detection branch, we directly generate text proposals as quadrangles, covering oriented, perspective and curved text regions. To preserve text features for recognition, we introduce a perspective RoI transform layer, which can align quadrangle proposals into small feature maps. Furthermore, in order to extract effective features for recognition, we propose to encode the aligned RoI features into context information with an RNN, combined with a spatial attention mechanism to generate text sequences. This overall pipeline is capable of handling both regular and irregular cases. Finally, the text localization and recognition tasks can be jointly trained in an end-to-end fashion with the designed multi-task loss. Experiments on standard benchmarks show that the proposed TextNet achieves state-of-the-art performance and outperforms existing approaches on irregular datasets by a large margin. |
Tasks | Optical Character Recognition |
Published | 2018-12-24 |
URL | http://arxiv.org/abs/1812.09900v1 |
PDF | http://arxiv.org/pdf/1812.09900v1.pdf |
PWC | https://paperswithcode.com/paper/textnet-irregular-text-reading-from-images |
Repo | |
Framework | |
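A rough sketch of the perspective RoI transform idea: a quadrangle proposal is aligned into a small axis-aligned map with a homography. For simplicity this is shown on image pixels with OpenCV, whereas the paper applies the transform to shared feature maps.

```python
# Sketch of the perspective RoI transform idea on image pixels (illustrative).
import cv2
import numpy as np

def perspective_roi(image, quad, out_w=64, out_h=16):
    """quad: 4x2 float32 array of corners ordered tl, tr, br, bl."""
    dst = np.array([[0, 0], [out_w - 1, 0],
                    [out_w - 1, out_h - 1], [0, out_h - 1]], dtype=np.float32)
    H = cv2.getPerspectiveTransform(quad.astype(np.float32), dst)
    return cv2.warpPerspective(image, H, (out_w, out_h))

image = np.random.randint(0, 255, (100, 200, 3), dtype=np.uint8)
quad = np.array([[30, 20], [150, 35], [145, 60], [25, 45]], dtype=np.float32)
aligned = perspective_roi(image, quad)
print(aligned.shape)   # (16, 64, 3)
```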
The Laplacian in RL: Learning Representations with Efficient Approximations
Title | The Laplacian in RL: Learning Representations with Efficient Approximations |
Authors | Yifan Wu, George Tucker, Ofir Nachum |
Abstract | The smallest eigenvectors of the graph Laplacian are well-known to provide a succinct representation of the geometry of a weighted graph. In reinforcement learning (RL), where the weighted graph may be interpreted as the state transition process induced by a behavior policy acting on the environment, approximating the eigenvectors of the Laplacian provides a promising approach to state representation learning. However, existing methods for performing this approximation are ill-suited in general RL settings for two main reasons: First, they are computationally expensive, often requiring operations on large matrices. Second, these methods lack adequate justification beyond simple, tabular, finite-state settings. In this paper, we present a fully general and scalable method for approximating the eigenvectors of the Laplacian in a model-free RL context. We systematically evaluate our approach and empirically show that it generalizes beyond the tabular, finite-state setting. Even in tabular, finite-state settings, its ability to approximate the eigenvectors outperforms previous proposals. Finally, we show the potential benefits of using a Laplacian representation learned using our method in goal-achieving RL tasks, providing evidence that our technique can be used to significantly improve the performance of an RL agent. |
Tasks | Representation Learning |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04586v1 |
PDF | http://arxiv.org/pdf/1810.04586v1.pdf |
PWC | https://paperswithcode.com/paper/the-laplacian-in-rl-learning-representations |
Repo | |
Framework | |
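For intuition, the snippet below computes the quantity the paper approximates: the smallest eigenvectors of the graph Laplacian of a small tabular state graph. The paper's contribution is a scalable, model-free objective that avoids forming and decomposing this matrix explicitly.

```python
# Toy sketch of the target quantity: smallest eigenvectors of the graph
# Laplacian of a tiny state graph, used as a state representation.
import numpy as np

# Adjacency of a 5-state chain (transitions induced by some behavior policy).
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

D = np.diag(A.sum(axis=1))
L = D - A                                   # combinatorial graph Laplacian

eigvals, eigvecs = np.linalg.eigh(L)        # eigenvalues in ascending order
d = 2                                        # representation dimension
state_repr = eigvecs[:, :d]                  # smallest-d eigenvectors
print(np.round(state_repr, 3))              # one d-dim embedding per state
```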
3D Scanning: A Comprehensive Survey
Title | 3D Scanning: A Comprehensive Survey |
Authors | Morteza Daneshmand, Ahmed Helmi, Egils Avots, Fatemeh Noroozi, Fatih Alisinanoglu, Hasan Sait Arslan, Jelena Gorbova, Rain Eric Haamer, Cagri Ozcinar, Gholamreza Anbarjafari |
Abstract | This paper provides an overview of 3D scanning methodologies and technologies proposed in the existing scientific and industrial literature. Throughout the paper, various types of related techniques are reviewed, consisting mainly of close-range, aerial, structure-from-motion and terrestrial photogrammetry, and mobile, terrestrial and airborne laser scanning, as well as time-of-flight, structured-light and phase-comparison methods, along with comparative and combinational studies, the latter being intended to help make a clearer assessment of the relevance and reliability of the possible choices. Moreover, outlier detection and surface fitting procedures, which are necessary post-processing stages, are discussed concisely. |
Tasks | Outlier Detection |
Published | 2018-01-24 |
URL | http://arxiv.org/abs/1801.08863v1 |
PDF | http://arxiv.org/pdf/1801.08863v1.pdf |
PWC | https://paperswithcode.com/paper/3d-scanning-a-comprehensive-survey |
Repo | |
Framework | |
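As a small illustration of the outlier detection and surface fitting post-processing stages mentioned above, the following is a minimal RANSAC plane fit on a synthetic point cloud; the thresholds and iteration count are arbitrary example values.

```python
# Minimal RANSAC plane fit: keep points close to the best plane, discard the
# rest as outliers (illustrative parameters, synthetic data).
import numpy as np

rng = np.random.default_rng(0)
plane_pts = np.c_[rng.uniform(-1, 1, (200, 2)), rng.normal(0, 0.01, 200)]
outliers = rng.uniform(-1, 1, (20, 3))
points = np.vstack([plane_pts, outliers])

best_inliers = np.array([], dtype=int)
for _ in range(100):
    sample = points[rng.choice(len(points), 3, replace=False)]
    normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
    if np.linalg.norm(normal) < 1e-9:
        continue
    normal /= np.linalg.norm(normal)
    dist = np.abs((points - sample[0]) @ normal)
    inliers = np.where(dist < 0.02)[0]
    if len(inliers) > len(best_inliers):
        best_inliers = inliers

print(f"{len(best_inliers)} of {len(points)} points kept as plane inliers")
```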
Applications of Graph Integration to Function Comparison and Malware Classification
Title | Applications of Graph Integration to Function Comparison and Malware Classification |
Authors | Michael A. Slawinski, Andy Wortman |
Abstract | We classify .NET files as either benign or malicious by examining directed graphs derived from the set of functions comprising the given file. Each graph is viewed probabilistically as a Markov chain where each node represents a code block of the corresponding function, and by computing the PageRank vector (Perron vector with transport), a probability measure can be defined over the nodes of the given graph. Each graph is vectorized by computing Lebesgue antiderivatives of hand-engineered functions defined on the vertex set of the given graph against the PageRank measure. Files are subsequently vectorized by aggregating the set of vectors corresponding to the set of graphs resulting from decompiling the given file. The result is a fast, intuitive, and easy-to-compute glass-box vectorization scheme, which can be leveraged for training a standalone classifier or to augment an existing feature space. We refer to this vectorization technique as PageRank Measure Integration Vectorization (PMIV). We demonstrate the efficacy of PMIV by training a vanilla random forest on 2.5 million samples of decompiled .NET, evenly split between benign and malicious, from our in-house corpus and compare this model to a baseline model which leverages a text-only feature space. The median time needed for decompilation and scoring was 24ms. |
Tasks | Malware Classification |
Published | 2018-10-11 |
URL | https://arxiv.org/abs/1810.04789v6 |
PDF | https://arxiv.org/pdf/1810.04789v6.pdf |
PWC | https://paperswithcode.com/paper/applications-of-pagerank-to-function |
Repo | |
Framework | |
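A toy sketch of the PMIV vectorization described above: treat a function's block graph as a Markov chain, take its PageRank vector as a probability measure, and evaluate the antiderivative of a node-level function against that measure on a grid of thresholds. The out-degree feature and threshold grid are illustrative stand-ins for the paper's hand-engineered functions.

```python
# Sketch of PageRank Measure Integration Vectorization on a toy block graph.
import networkx as nx
import numpy as np

# Toy directed graph standing in for the code blocks of one decompiled function.
G = nx.DiGraph([(0, 1), (1, 2), (2, 0), (1, 3), (3, 3)])

pr = nx.pagerank(G, alpha=0.85)             # PageRank measure over code blocks
feature = {n: G.out_degree(n) for n in G}   # stand-in hand-engineered function

# "Antiderivative" of the feature against the PageRank measure, evaluated on a
# small grid of thresholds: F(t) = PageRank mass on nodes with feature(n) <= t.
thresholds = [0, 1, 2]
vector = [sum(pr[n] for n in G if feature[n] <= t) for t in thresholds]
print(np.round(vector, 3))                  # per-graph feature vector
```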
A syllable based model for handwriting recognition
Title | A syllable based model for handwriting recognition |
Authors | Wassim Swaileh, Thierry Paquet |
Abstract | In this paper, we introduce a new modeling approach for texts in handwriting recognition, based on syllables. We propose a supervised syllabification approach for the French and English languages for building a vocabulary of syllables. Statistical n-gram language models of syllables are trained on French and English Wikipedia corpora. The handwriting recognition system, based on optical HMM context-independent character models, performs two-pass decoding, integrating the proposed syllabic models. Evaluation is carried out on the French RIMES dataset and the English IAM dataset by analyzing the performance for various coverages of the syllable models. We also compare the syllable models with lexicon and character n-gram models. The proposed approach achieves promising performance thanks to its capacity to cover a large number of out-of-vocabulary words using a limited number of syllables combined with statistical n-grams of reasonable order. |
Tasks | |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07277v1 |
PDF | http://arxiv.org/pdf/1808.07277v1.pdf |
PWC | https://paperswithcode.com/paper/a-syllable-based-model-for-handwriting |
Repo | |
Framework | |
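A minimal sketch of the syllable language-model idea: estimate smoothed bigram probabilities over syllables from a (here, toy and pre-syllabified) corpus. The syllabification and add-one smoothing are illustrative assumptions, not the paper's supervised syllabifier or its n-gram estimation setup.

```python
# Minimal syllable bigram language model with add-one smoothing (toy data).
from collections import Counter

corpus = [["han", "d", "wri", "ting"], ["re", "cog", "ni", "tion"],
          ["han", "d", "some"]]             # toy, pre-syllabified "sentences"

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = ["<s>"] + sent + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

vocab = len(unigrams)

def p_bigram(prev, cur):
    # Add-one smoothed conditional probability P(cur | prev).
    return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)

print(round(p_bigram("han", "d"), 3))
```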
Taming Convergence for Asynchronous Stochastic Gradient Descent with Unbounded Delay in Non-Convex Learning
Title | Taming Convergence for Asynchronous Stochastic Gradient Descent with Unbounded Delay in Non-Convex Learning |
Authors | Xin Zhang, Jia Liu, Zhengyuan Zhu |
Abstract | Understanding the convergence performance of the asynchronous stochastic gradient descent method (Async-SGD) has received increasing attention in recent years due to its foundational role in machine learning. To date, however, most existing work is restricted to either bounded gradient delays or convex settings. In this paper, we focus on Async-SGD and its variant Async-SGDI (which uses increasing batch sizes) for non-convex optimization problems with unbounded gradient delays. We prove an $o(1/\sqrt{k})$ convergence rate for Async-SGD and $o(1/k)$ for Async-SGDI. Also, a unifying sufficient condition for Async-SGD's convergence is established, which includes the two major gradient delay models in the literature as special cases and yields a new delay model not considered thus far. |
Tasks | |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09470v1 |
PDF | http://arxiv.org/pdf/1805.09470v1.pdf |
PWC | https://paperswithcode.com/paper/taming-convergence-for-asynchronous |
Repo | |
Framework | |
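To make the delayed-gradient setting concrete, the toy simulation below runs SGD on a simple quadratic where each update uses a gradient computed at a randomly stale iterate; the geometric delay distribution and step-size schedule are illustrative choices, not the delay models analyzed in the paper.

```python
# Toy simulation of asynchronous SGD with stale gradients on f(x) = 0.5*||x||^2.
import numpy as np

rng = np.random.default_rng(0)
x = np.array([5.0, -3.0])
history = [x.copy()]                        # parameter snapshots for staleness

for k in range(1, 200):
    delay = min(rng.geometric(0.5) - 1, len(history) - 1)  # random staleness
    stale_x = history[-(delay + 1)]         # gradient computed on an old iterate
    grad = stale_x + rng.normal(0, 0.1, 2)  # stochastic gradient of 0.5*||x||^2
    x = x - (0.5 / np.sqrt(k)) * grad       # diminishing step size
    history.append(x.copy())

print(np.round(x, 4))                       # close to the optimum at the origin
```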
MPDCompress - Matrix Permutation Decomposition Algorithm for Deep Neural Network Compression
Title | MPDCompress - Matrix Permutation Decomposition Algorithm for Deep Neural Network Compression |
Authors | Lazar Supic, Rawan Naous, Ranko Sredojevic, Aleksandra Faust, Vladimir Stojanovic |
Abstract | Deep neural networks (DNNs) have become the state-of-the-art technique for machine learning tasks in various applications. However, due to their size and computational complexity, large DNNs are not readily deployable on edge devices in real time. To manage complexity and accelerate computation, network compression techniques based on pruning and quantization have been proposed and shown to be effective in reducing network size. However, such network compression can result in irregular matrix structures that are mismatched with modern hardware-accelerated platforms, such as graphics processing units (GPUs), which are designed to perform DNN matrix multiplications in a structured (block-based) way. We propose MPDCompress, a DNN compression algorithm based on matrix permutation decomposition via random mask generation. In-training application of the masks molds the synaptic weight connection matrix into a sub-graph separation format. Aided by the random permutations, a hardware-desirable block matrix is generated, allowing for a more efficient implementation and compression of the network. To show versatility, we empirically verify MPDCompress on several network models, compression rates, and image datasets. On the LeNet 300-100 model (MNIST dataset), Deep MNIST, and CIFAR10, we achieve 10x network compression with less than 1% accuracy loss compared to the uncompressed network. On AlexNet for the full ImageNet ILSVRC-2012 dataset, we achieve 8x network compression with less than 1% accuracy loss, with top-5 and top-1 accuracies of 79.6% and 56.4%, respectively. Finally, we observe that the algorithm can offer inference speedups across various hardware platforms, with 4x faster operation achieved on several mobile GPUs. |
Tasks | Neural Network Compression, Quantization |
Published | 2018-05-30 |
URL | http://arxiv.org/abs/1805.12085v1 |
PDF | http://arxiv.org/pdf/1805.12085v1.pdf |
PWC | https://paperswithcode.com/paper/mpdcompress-matrix-permutation-decomposition |
Repo | |
Framework | |
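A small sketch of the masking idea described above: start from a hardware-friendly block-diagonal mask, scramble it with random row and column permutations to obtain the in-training mask, and recover the block structure by undoing the permutations. Matrix size and block size are arbitrary example values.

```python
# Sketch: block-diagonal mask + random row/column permutations, and recovery
# of the block structure by inverting the permutations (illustrative sizes).
import numpy as np

rng = np.random.default_rng(0)
n, block = 8, 4
block_mask = np.kron(np.eye(n // block), np.ones((block, block)))  # (8, 8)

row_perm = rng.permutation(n)
col_perm = rng.permutation(n)
train_mask = block_mask[np.ix_(row_perm, col_perm)]    # mask applied in training

# Applying the inverse permutations restores the block-diagonal layout.
inv_rows, inv_cols = np.argsort(row_perm), np.argsort(col_perm)
restored = train_mask[np.ix_(inv_rows, inv_cols)]
print(np.array_equal(restored, block_mask))            # True

weights = rng.normal(size=(n, n))
masked_weights = weights * train_mask                   # pruned connections
```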
The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach
Title | The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach |
Authors | Iulian Vlad Serban, Chinnadhurai Sankar, Michael Pieper, Joelle Pineau, Yoshua Bengio |
Abstract | Deep reinforcement learning has recently shown many impressive successes. However, one major obstacle towards applying such methods to real-world problems is their lack of data efficiency. To this end, we propose the Bottleneck Simulator: a model-based reinforcement learning method which combines a learned, factorized transition model of the environment with rollout simulations to learn an effective policy from few examples. The learned transition model employs an abstract, discrete (bottleneck) state, which increases sample efficiency by reducing the number of model parameters and by exploiting structural properties of the environment. We provide a mathematical analysis of the Bottleneck Simulator in terms of fixed points of the learned policy, which reveals how performance is affected by four distinct sources of error: an error related to the abstract space structure, an error related to the transition model estimation variance, an error related to the transition model estimation bias, and an error related to the transition model class bias. Finally, we evaluate the Bottleneck Simulator on two natural language processing tasks: a text adventure game and a real-world, complex dialogue response selection task. On both tasks, the Bottleneck Simulator yields excellent performance, beating competing approaches. |
Tasks | |
Published | 2018-07-12 |
URL | http://arxiv.org/abs/1807.04723v1 |
PDF | http://arxiv.org/pdf/1807.04723v1.pdf |
PWC | https://paperswithcode.com/paper/the-bottleneck-simulator-a-model-based-deep |
Repo | |
Framework | |
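A toy sketch of the rollout component: given a learned transition and reward model over a handful of abstract (bottleneck) states, simulate short trajectories to score candidate actions. The tabular model, random rollout policy, and horizon are illustrative stand-ins for the paper's learned, factorized model and policy learning.

```python
# Toy rollout simulation from a small tabular model over abstract states.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 2
# Stand-in learned model: P[s, a] is a distribution over next abstract states.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))             # stand-in reward model

def rollout_value(state, action, horizon=10, n_rollouts=50, gamma=0.95):
    total = 0.0
    for _ in range(n_rollouts):
        s, a, ret, discount = state, action, 0.0, 1.0
        for _ in range(horizon):
            ret += discount * R[s, a]
            s = rng.choice(n_states, p=P[s, a])
            a = rng.integers(n_actions)                # stand-in rollout policy
            discount *= gamma
        total += ret
    return total / n_rollouts

best = max(range(n_actions), key=lambda a: rollout_value(0, a))
print("preferred action in abstract state 0:", best)
```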
Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition
Title | Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition |
Authors | Wei-Ning Hsu, Hao Tang, James Glass |
Abstract | The current trend in automatic speech recognition is to leverage large amounts of labeled data to train supervised neural network models. Unfortunately, obtaining data for a wide range of domains to train robust models can be costly. However, it is relatively inexpensive to collect large amounts of unlabeled data from domains that we want the models to generalize to. In this paper, we propose a novel unsupervised adaptation method that learns to synthesize labeled data for the target domain from unlabeled in-domain data and labeled out-of-domain data. We first learn without supervision an interpretable latent representation of speech that encodes linguistic and nuisance factors (e.g., speaker and channel) using different latent variables. To transform a labeled out-of-domain utterance without altering its transcript, we transform the latent nuisance variables while maintaining the linguistic variables. To demonstrate our approach, we focus on a channel mismatch setting, where the domain of interest is distant conversational speech, and labels are only available for close-talking speech. Our proposed method is evaluated on the AMI dataset, outperforming all baselines and bridging the gap between unadapted and in-domain models by over 77% without using any parallel data. |
Tasks | Speech Recognition |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.04872v1 |
PDF | http://arxiv.org/pdf/1806.04872v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-adaptation-with-interpretable |
Repo | |
Framework | |
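Conceptually, the adaptation step keeps the linguistic latent variables of a labelled close-talking utterance and replaces its nuisance latent variables with statistics from the unlabeled distant-talking domain before decoding. The sketch below uses hypothetical stand-in encode/decode functions purely to illustrate the swap; it is not the paper's generative model.

```python
# Conceptual sketch of the nuisance-latent swap with hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)

def encode(utterance):
    # Hypothetical: returns (linguistic latent, nuisance latent).
    return utterance[:16], utterance[16:]

def decode(z_linguistic, z_nuisance):
    # Hypothetical inverse mapping back to a feature sequence.
    return np.concatenate([z_linguistic, z_nuisance])

source_utt = rng.normal(size=32)                   # labelled close-talk features
target_utts = rng.normal(loc=2.0, size=(100, 32))  # unlabeled distant speech

z_ling, z_nui = encode(source_utt)
target_nuisance_mean = np.mean([encode(u)[1] for u in target_utts], axis=0)

# Synthesized "in-domain" utterance that keeps the original transcript.
adapted = decode(z_ling, target_nuisance_mean)
print(adapted.shape)
```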
Exploratory Data Analysis of a Network Telescope Traffic and Prediction of Port Probing Rates
Title | Exploratory Data Analysis of a Network Telescope Traffic and Prediction of Port Probing Rates |
Authors | Mehdi Zakroum, Abdellah Houmz, Mounir Ghogho, Ghita Mezzour, Abdelkader Lahmadi, Jérôme François, Mohammed El Koutbi |
Abstract | Understanding the properties exhibited by large-scale network probing traffic would improve cyber threat intelligence. In addition, the prediction of probing rates is a key feature for security practitioners in their endeavors to make better operational decisions and to enhance their defense strategies. In this work, we study different aspects of the traffic captured by a /20 network telescope. First, we perform an exploratory data analysis of the collected probing activities. The investigation includes probing rates at the port level, the services of most interest to top network probers, and the distribution of probing rates by geolocation. Second, we extract the network probers' exploration patterns. We model these behaviors using transition graphs decorated with probabilities of switching from one port to another. Finally, we assess the capacity of non-stationary autoregressive and vector autoregressive models to predict port probing rates, as a first step towards using more robust models for better forecasting performance. |
Tasks | |
Published | 2018-12-23 |
URL | http://arxiv.org/abs/1812.09790v2 |
PDF | http://arxiv.org/pdf/1812.09790v2.pdf |
PWC | https://paperswithcode.com/paper/exploratory-data-analysis-of-a-network |
Repo | |
Framework | |
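The forecasting step mentioned above can be sketched with statsmodels: fit a vector autoregressive (VAR) model to per-port probing-rate series and forecast the next few steps. The synthetic Poisson series and the lag settings are placeholders for the telescope data and the paper's model selection.

```python
# Sketch: VAR forecast of per-port probing rates on synthetic stand-in data.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
T = 200
# Synthetic daily probing rates for three ports (stand-ins for telescope data).
data = pd.DataFrame(
    rng.poisson(lam=[50, 20, 5], size=(T, 3)).astype(float),
    columns=["port_23", "port_445", "port_3389"],
)

model = VAR(data)
results = model.fit(maxlags=7, ic="aic")         # lag order chosen by AIC
forecast = results.forecast(data.values[-results.k_ar:], steps=3)
print(np.round(forecast, 1))                     # predicted rates, next 3 steps
```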
Social Vehicle Swarms: A Novel Perspective on Social-aware Vehicular Communication Architecture
Title | Social Vehicle Swarms: A Novel Perspective on Social-aware Vehicular Communication Architecture |
Authors | Yue Zhang, Fang Tian, Bin Song, Xiaojiang Du |
Abstract | The Internet of Vehicles is a promising area related to D2D communication and the Internet of Things. We present a novel perspective on vehicular communications, social vehicle swarms, to study and analyze the socially aware Internet of Vehicles with the assistance of an agent-based model intended to reveal hidden patterns behind superficial data. After discussing its components, namely its agents, environments, and rules, we introduce supporting technologies and methods, namely deep reinforcement learning, privacy-preserving data mining and sub-cloud computing, in order to detect the most significant and interesting information for each individual effectively, which is the key objective. Finally, several relevant research topics and challenges are discussed. |
Tasks | |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.11947v1 |
PDF | http://arxiv.org/pdf/1810.11947v1.pdf |
PWC | https://paperswithcode.com/paper/social-vehicle-swarms-a-novel-perspective-on |
Repo | |
Framework | |
From Classical to Generalized Zero-Shot Learning: a Simple Adaptation Process
Title | From Classical to Generalized Zero-Shot Learning: a Simple Adaptation Process |
Authors | Yannick Le Cacheux, Hervé Le Borgne, Michel Crucianu |
Abstract | Zero-shot learning (ZSL) is concerned with the recognition of previously unseen classes. It relies on additional semantic knowledge for which a mapping can be learned with training examples of seen classes. While classical ZSL considers the recognition performance on unseen classes only, generalized zero-shot learning (GZSL) aims at maximizing performance on both seen and unseen classes. In this paper, we propose a new process for training and evaluation in the GZSL setting; this process addresses the gap in performance between samples from unseen and seen classes by penalizing the latter, and enables the selection of hyper-parameters well suited to the GZSL task. It can be applied to any existing ZSL approach and leads to a significant performance boost: the experimental evaluation shows that GZSL performance, averaged over eight state-of-the-art methods, is improved from 28.5 to 42.2 on CUB and from 28.2 to 57.1 on AwA2. |
Tasks | Zero-Shot Learning |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.10120v1 |
PDF | http://arxiv.org/pdf/1809.10120v1.pdf |
PWC | https://paperswithcode.com/paper/from-classical-to-generalized-zero-shot |
Repo | |
Framework | |
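A minimal sketch of the core mechanism: before taking the argmax over the union of seen and unseen classes, subtract a penalty from the seen-class scores, with the penalty chosen on validation data. The random scores and fixed penalty below are illustrative; the paper's process also covers hyper-parameter selection for the underlying ZSL method.

```python
# Sketch: penalize seen-class scores before the GZSL argmax (illustrative data).
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_seen, n_unseen = 5, 10, 5
scores = rng.normal(size=(n_samples, n_seen + n_unseen))  # any ZSL scores
seen_mask = np.arange(n_seen + n_unseen) < n_seen

def gzsl_predict(scores, penalty):
    adjusted = scores - penalty * seen_mask     # penalize seen-class scores
    return adjusted.argmax(axis=1)

print(gzsl_predict(scores, penalty=0.0))        # biased toward seen classes
print(gzsl_predict(scores, penalty=1.5))        # penalty shifts some predictions
```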