July 27, 2019


Paper Group ANR 695


R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections


Title	R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections
Authors	TonTon Hsien-De Huang, Hung-Yu Kao
Abstract	The influence of deep learning on image identification and natural language processing has attracted enormous attention globally. A convolutional neural network, which can learn without prior feature extraction, is well suited to the rapid iteration of Android malware. Traditional solutions for detecting Android malware require continuous learning on pre-extracted features to maintain high detection performance. To reduce the manual effort of feature engineering by avoiding pre-selected features altogether, we have developed a coloR-inspired convolutional neuRal network (CNN)-based AndroiD malware Detection (R2-D2) system. The system converts the bytecode of classes.dex from an Android archive file to RGB color codes and stores it as a fixed-size color image. The color image is fed to the convolutional neural network for automatic feature extraction and training. The data was collected from January 2017 to August 2017. During this period, we collected approximately 2 million benign and malicious Android apps for our experiments with help from our research partner Leopard Mobile Inc. Our experiment results demonstrate that the proposed system has accurate security analysis on contracts. Furthermore, we keep our research results and experiment materials at http://R2D2.TWMAN.ORG.
Tasks	Android Malware Detection, Feature Engineering, Malware Detection
Published	2017-05-12
URL	http://arxiv.org/abs/1705.04448v5
PDF	http://arxiv.org/pdf/1705.04448v5.pdf
PWC	https://paperswithcode.com/paper/r2-d2-color-inspired-convolutional-neural
Repo
Framework
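
The byte-to-image conversion described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `bytes_to_rgb_image`, the choice of three consecutive bytes per pixel, and the zero-padding/truncation used to reach a fixed size are all assumptions, and the sample bytes are a hypothetical stand-in for a real classes.dex file.

```python
def bytes_to_rgb_image(data: bytes, side: int):
    """Map raw bytes to a side x side grid of (R, G, B) tuples.

    Assumption: every three consecutive bytes form one pixel, and the
    stream is truncated or zero-padded to exactly side * side pixels.
    """
    needed = side * side * 3
    padded = data[:needed].ljust(needed, b"\x00")
    pixels = [tuple(padded[i:i + 3]) for i in range(0, needed, 3)]
    # Split the flat pixel list into rows of equal width.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical stand-in for the contents of a classes.dex file (18 bytes).
sample = b"dex\n035\x00" + bytes(range(10))
image = bytes_to_rgb_image(sample, side=3)
print(image[0][0])  # first pixel, built from the bytes 'd', 'e', 'x'
```

A real system would read the bytes from the APK archive and hand the resulting image to a CNN; both steps are outside this sketch.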

Semi-supervised classification for dynamic Android malware detection


Title	Semi-supervised classification for dynamic Android malware detection
Authors	Li Chen, Mingwei Zhang, Chih-Yuan Yang, Ravi Sahita
Abstract	A growing number of threats to Android phones creates challenges for malware detection. Manually labeling samples as benign or as members of different malicious families requires tremendous human effort, while it is comparatively easy and cheap to obtain a large number of unlabeled APKs from various sources. Moreover, the fast-paced evolution of Android malware continuously generates derivative malware families. These families often contain new signatures, which can escape detection when using static analysis. These practical challenges can also cause traditional supervised machine learning algorithms to degrade in performance. In this paper, we propose a framework that uses a model-based semi-supervised (MBSS) classification scheme on dynamic Android API call logs. The semi-supervised approach efficiently uses the labeled and unlabeled APKs to estimate a finite mixture model of Gaussian distributions via conditional expectation-maximization and efficiently detects malware during out-of-sample testing. We compare MBSS with popular malware detection classifiers such as support vector machine (SVM), $k$-nearest neighbor ($k$NN) and linear discriminant analysis (LDA). Under the ideal classification setting, MBSS has competitive performance with 98% accuracy and a very low false positive rate for in-sample classification. For out-of-sample testing, the out-of-sample test data exhibit behavior similar to the in-sample training set in retrieving phone information and sending it to the network. When this similarity is strong, MBSS and SVM with linear kernel maintain a 90% detection rate while $k$NN and LDA suffer great performance degradation. When this similarity is slightly weaker, all classifiers degrade in performance, but MBSS still performs significantly better than the other classifiers.
Tasks	Android Malware Detection, Malware Detection
Published	2017-04-19
URL	http://arxiv.org/abs/1704.05948v1
PDF	http://arxiv.org/pdf/1704.05948v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-classification-for-dynamic
Repo
Framework
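
To make the idea of fitting a Gaussian mixture from labeled and unlabeled samples concrete, here is a small one-dimensional sketch. It uses plain EM with the labeled points clamped to their known class, which is a simplification and not necessarily the conditional EM procedure of the paper; the function names, the variance floor, and the toy data are all assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_semi_supervised_gmm(labeled, unlabeled, n_classes=2, iterations=50):
    """Fit one 1-D Gaussian per class from (x, y) pairs plus unlabeled x values."""
    means, variances, weights = [], [], []
    # Initialise each component from its labeled points only.
    for c in range(n_classes):
        xs_c = [x for x, y in labeled if y == c]
        m = sum(xs_c) / len(xs_c)
        means.append(m)
        variances.append(max(sum((x - m) ** 2 for x in xs_c) / len(xs_c), 1e-3))
        weights.append(len(xs_c) / len(labeled))
    xs = [x for x, _ in labeled] + list(unlabeled)
    for _ in range(iterations):
        # E-step: labeled points keep one-hot responsibilities,
        # unlabeled points get posterior class probabilities.
        resp = [[1.0 if y == c else 0.0 for c in range(n_classes)] for _, y in labeled]
        for x in unlabeled:
            joint = [weights[c] * gaussian_pdf(x, means[c], variances[c])
                     for c in range(n_classes)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: responsibility-weighted means, variances and mixing weights.
        for c in range(n_classes):
            nc = sum(r[c] for r in resp)
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            variances[c] = max(
                sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nc, 1e-3)
            weights[c] = nc / len(xs)
    return means, variances, weights


def classify(x, means, variances, weights):
    scores = [w * gaussian_pdf(x, m, v) for m, v, w in zip(means, variances, weights)]
    return scores.index(max(scores))


# Toy data: class 0 near 0.5, class 1 near 9.5 (hypothetical feature values).
labeled = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)]
unlabeled = [0.5, 1.5, -0.5, 8.5, 9.5, 10.5]
means, variances, weights = fit_semi_supervised_gmm(labeled, unlabeled)
print(means, classify(1.2, means, variances, weights))
```

In the paper the features come from dynamic API call logs and are multivariate; the scalar feature here only keeps the example short.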

The Convergence of Machine Learning and Communications


Title	The Convergence of Machine Learning and Communications
Authors	Wojciech Samek, Slawomir Stanczak, Thomas Wiegand
Abstract	The areas of machine learning and communication technology are converging. Today’s communications systems generate a huge amount of traffic data, which can help to significantly enhance the design and management of networks and communication components when combined with advanced machine learning methods. Furthermore, recently developed end-to-end training procedures offer new ways to jointly optimize the components of a communication system. Machine learning methods are also of central importance in many emerging application fields of communication technology, e.g., smart cities or the internet of things. This paper gives an overview of the use of machine learning in different areas of communications and discusses two exemplar applications in wireless networking. Furthermore, it identifies promising future research topics and discusses their potential impact.
Tasks
Published	2017-08-28
URL	http://arxiv.org/abs/1708.08299v1
PDF	http://arxiv.org/pdf/1708.08299v1.pdf
PWC	https://paperswithcode.com/paper/the-convergence-of-machine-learning-and
Repo
Framework

A Comparison of Resampling and Recursive Partitioning Methods in Random Forest for Estimating the Asymptotic Variance Using the Infinitesimal Jackknife


Title	A Comparison of Resampling and Recursive Partitioning Methods in Random Forest for Estimating the Asymptotic Variance Using the Infinitesimal Jackknife
Authors	Cole Brokamp, MB Rao, Patrick Ryan, Roman Jandarov
Abstract	The infinitesimal jackknife (IJ) has recently been applied to the random forest to estimate its prediction variance. These theorems were verified under a traditional random forest framework which uses classification and regression trees (CART) and bootstrap resampling. However, random forests using conditional inference (CI) trees and subsampling have been found not to be prone to variable selection bias. Here, we conduct simulation experiments using a novel approach to explore the applicability of the IJ to random forests using variations on the resampling method and base learner. Test data points were simulated and each trained using random forest on one hundred simulated training data sets using different combinations of resampling and base learners. Using CI trees instead of traditional CART trees, as well as using subsampling instead of bootstrap sampling, resulted in a much more accurate estimation of prediction variance when using the IJ. The random forest variations here have been incorporated into an open source software package for the R programming language.
Tasks
Published	2017-06-19
URL	http://arxiv.org/abs/1706.06150v2
PDF	http://arxiv.org/pdf/1706.06150v2.pdf
PWC	https://paperswithcode.com/paper/a-comparison-of-resampling-and-recursive
Repo
Framework
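
For reference, the infinitesimal jackknife variance estimate for a bagged predictor under bootstrap resampling is commonly written as below. The notation is an assumption, since the abstract does not define symbols: $n$ is the number of training observations, $B$ the number of trees, $N_{bi}$ the number of times observation $i$ appears in the resample for tree $b$, and $t_b(x)$ the prediction of tree $b$ at test point $x$.

```latex
\hat{V}_{IJ}(x) = \sum_{i=1}^{n} \widehat{\mathrm{Cov}}_i^{\,2},
\qquad
\widehat{\mathrm{Cov}}_i = \frac{1}{B} \sum_{b=1}^{B} \bigl(N_{bi} - 1\bigr)\bigl(t_b(x) - \bar{t}(x)\bigr),
\qquad
\bar{t}(x) = \frac{1}{B} \sum_{b=1}^{B} t_b(x)
```

The term $N_{bi} - 1$ centers the inclusion count at its bootstrap expectation of 1; under subsampling the counts would be centered at their own expected value instead.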

A Predictive Account of Cafe Wall Illusions Using a Quantitative Model


Title	A Predictive Account of Cafe Wall Illusions Using a Quantitative Model
Authors	Nasim Nematzadeh, David M. W. Powers
Abstract	This paper explores the tilt illusion effect in the Cafe Wall pattern using a classical Gaussian Receptive Field model. In this illusion, the mortar lines are misperceived as diverging or converging rather than horizontal. We examine the capability of a simple bioplausible filtering model to recognize different degrees of tilt effect in the Cafe Wall illusion based on different characteristics of the pattern. Our study employed a Difference of Gaussians model of retinal to cortical ON center and/or OFF center receptive fields. A wide range of stimulus parameters, for example mortar thickness, luminance, tile contrast, and phase of the tile displacement, have been studied. Our model constructs an edge map representation at multiple scales that reveals tilt cues and clues involved in the illusory perception of the Cafe Wall pattern. We show that our model not only detects the tilt in this pattern, but can also predict the strength of the illusion and quantify the degree of tilt. For the first time, quantitative predictions of a model are reported for this stimulus. The results of our simulations are consistent with previous psychophysical findings across the full range of Cafe Wall variations tested. Our results also suggest that the Difference of Gaussians mechanism is at the heart of the effects explained by, and the mechanisms proposed for, the Irradiation, Brightness Induction, and Bandpass Filtering models.
Tasks
Published	2017-05-19
URL	http://arxiv.org/abs/1705.06846v4
PDF	http://arxiv.org/pdf/1705.06846v4.pdf
PWC	https://paperswithcode.com/paper/a-predictive-account-of-cafe-wall-illusions
Repo
Framework
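
A Difference of Gaussians filter of the kind the abstract refers to can be illustrated in one dimension. This is a simplified sketch under stated assumptions: the paper works on 2-D images at multiple scales, whereas the code below applies a single 1-D center-minus-surround kernel to a luminance step; the sigma values, the radius, and the edge-replication rule are arbitrary illustrative choices.

```python
import math


def gaussian_kernel_1d(sigma, radius):
    """Normalised Gaussian weights sampled at integer offsets -radius..radius."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def dog_kernel_1d(sigma_center, sigma_surround, radius):
    """Narrow 'center' Gaussian minus wide 'surround' Gaussian (ON-center style)."""
    center = gaussian_kernel_1d(sigma_center, radius)
    surround = gaussian_kernel_1d(sigma_surround, radius)
    return [c - s for c, s in zip(center, surround)]


def convolve_same(signal, kernel):
    """Filter a 1-D signal, replicating the edge values at the borders."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), len(signal) - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


kernel = dog_kernel_1d(sigma_center=1.0, sigma_surround=2.0, radius=6)
step = [0.0] * 10 + [1.0] * 10  # dark region followed by a bright region
response = convolve_same(step, kernel)
print([round(r, 3) for r in response[7:13]])
```

Because the kernel sums to zero, uniform regions give no response and the output is concentrated around the luminance edge, which is the basis of an edge map.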

Time-lagged autoencoders: Deep learning of slow collective variables for molecular kinetics


Title	Time-lagged autoencoders: Deep learning of slow collective variables for molecular kinetics
Authors	Christoph Wehmeyer, Frank Noé
Abstract	Inspired by the success of deep learning techniques in the physical and chemical sciences, we apply a modification of an autoencoder type deep neural network to the task of dimension reduction of molecular dynamics data. We can show that our time-lagged autoencoder reliably finds low-dimensional embeddings for high-dimensional feature spaces which capture the slow dynamics of the underlying stochastic processes - beyond the capabilities of linear dimension reduction techniques.
Tasks	Dimensionality Reduction
Published	2017-10-30
URL	http://arxiv.org/abs/1710.11239v1
PDF	http://arxiv.org/pdf/1710.11239v1.pdf
PWC	https://paperswithcode.com/paper/time-lagged-autoencoders-deep-learning-of
Repo
Framework
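
One ingredient of a time-lagged model is the pairing of each frame with the frame a fixed lag later, so that a network can be trained to map the earlier frame to the later one. The sketch below shows only this data preparation step, assuming a trajectory stored as a list of frames; the function name and the toy trajectory are hypothetical, and the network itself is not shown.

```python
def time_lagged_pairs(trajectory, lag):
    """Return (x_t, x_{t+lag}) pairs from a sequence of frames."""
    if lag < 1 or lag >= len(trajectory):
        raise ValueError("lag must satisfy 1 <= lag < len(trajectory)")
    return list(zip(trajectory[:-lag], trajectory[lag:]))


# Toy trajectory: each "frame" is just a number here.
pairs = time_lagged_pairs([0, 1, 2, 3, 4], lag=2)
print(pairs)
```

For molecular dynamics data each frame would be a feature vector, and the pairs would serve as input and target for the encoder-decoder network.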

Towards Automatic Construction of Diverse, High-quality Image Dataset


Title	Towards Automatic Construction of Diverse, High-quality Image Dataset
Authors	Yazhou Yao, Jian Zhang, Fumin Shen, Li Liu, Fan Zhu, Dongxiang Zhang, Heng-Tao Shen
Abstract	The availability of labeled image datasets has been shown to be critical for high-level image understanding, and it continuously drives progress in feature design and model development. However, constructing labeled image datasets is laborious and monotonous. To eliminate manual annotation, in this work, we propose a novel image dataset construction framework that employs multiple textual queries. We aim to collect diverse and accurate images for given queries from the Web. Specifically, we formulate the removal of noisy textual queries and the filtering of noisy images as a multi-view and a multi-instance learning problem, respectively. Our proposed approach not only improves the accuracy but also enhances the diversity of the selected images. To verify the effectiveness of our proposed approach, we construct an image dataset with 100 categories. The experiments show significant performance gains from using the data generated by our approach on several tasks, such as image classification, cross-dataset generalization, and object detection. The proposed method also consistently outperforms existing weakly supervised and web-supervised approaches.
Tasks	Image Classification, Object Detection
Published	2017-08-22
URL	http://arxiv.org/abs/1708.06495v2
PDF	http://arxiv.org/pdf/1708.06495v2.pdf
PWC	https://paperswithcode.com/paper/towards-automatic-construction-of-diverse
Repo
Framework

Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning


Title	Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-in-the-Blank Image Captioning
Authors	Qing Sun, Stefan Lee, Dhruv Batra
Abstract	We develop the first approximate inference algorithm for 1-Best (and M-Best) decoding in bidirectional neural sequence models by extending Beam Search (BS) to reason about both forward and backward time dependencies. Beam Search (BS) is a widely used approximate inference algorithm for decoding sequences from unidirectional neural sequence models. Interestingly, approximate inference in bidirectional models remains an open problem, despite their significant advantage in modeling information from both the past and future. To enable the use of bidirectional models, we present Bidirectional Beam Search (BiBS), an efficient algorithm for approximate bidirectional inference. To evaluate our method and as an interesting problem in its own right, we introduce a novel Fill-in-the-Blank Image Captioning task which requires reasoning about both past and future sentence structure to reconstruct sensible image descriptions. We use this task as well as the Visual Madlibs dataset to demonstrate the effectiveness of our approach, consistently outperforming all baseline methods.
Tasks	Image Captioning
Published	2017-05-24
URL	http://arxiv.org/abs/1705.08759v1
PDF	http://arxiv.org/pdf/1705.08759v1.pdf
PWC	https://paperswithcode.com/paper/bidirectional-beam-search-forward-backward
Repo
Framework
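
For readers unfamiliar with the baseline, the following is a sketch of ordinary unidirectional beam search, the algorithm the abstract says BiBS extends. It is not BiBS itself. The scoring interface (`step_log_probs` returning a token-to-log-probability dictionary) and the toy bigram table are assumptions made for illustration.

```python
import math


def beam_search(step_log_probs, beam_width, length, start="<s>"):
    """Keep the beam_width highest-scoring partial sequences at each step.

    step_log_probs(prefix) must return {token: log-probability} for the
    next token given the prefix (a hypothetical model interface).
    """
    beams = [(0.0, [start])]
    for _ in range(length):
        candidates = []
        for score, seq in beams:
            for token, logp in step_log_probs(seq).items():
                candidates.append((score + logp, seq + [token]))
        # Prune to the best beam_width candidates by cumulative log-probability.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy bigram "model": the next-token distribution depends only on the last token.
TABLE = {
    "<s>": {"a": math.log(0.6), "b": math.log(0.4)},
    "a": {"a": math.log(0.1), "b": math.log(0.9)},
    "b": {"a": math.log(0.5), "b": math.log(0.5)},
}


def toy_model(prefix):
    return TABLE[prefix[-1]]


best = beam_search(toy_model, beam_width=2, length=2)
print(best[0])
```

The scores here depend only on tokens to the left of the current position; the difficulty the paper addresses is decoding when the model also conditions on tokens to the right.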

Visual Graph Mining


Title	Visual Graph Mining
Authors	Quanshi Zhang, Xuan Song, Ryosuke Shibasaki
Abstract	In this study, we formulate the concept of “mining maximal-size frequent subgraphs” in the challenging domain of visual data (images and videos). In general, visual knowledge can usually be modeled as attributed relational graphs (ARGs) with local attributes representing local parts and pairwise attributes describing the spatial relationship between parts. Thus, from a practical perspective, such mining of maximal-size subgraphs can be regarded as a general platform for discovering and modeling the common objects within cluttered and unlabeled visual data. Then, from a theoretical perspective, visual graph mining should encode and overcome the great fuzziness of messy data collected from complex real-world situations, which conflicts with the conventional theoretical basis of graph mining designed for tabular data. Common subgraphs hidden in these ARGs usually have soft attributes, with considerable inter-graph variation. More importantly, we should also discover the latent pattern space, including similarity metrics for the pattern and hidden node relations, during the mining process. In this study, we redefine the visual subgraph pattern that encodes all of these challenges in a general way, and propose an approximate but efficient solution to graph mining. We conduct five experiments to evaluate our method with different kinds of visual data, including videos and RGB/RGB-D images. These experiments demonstrate the generality of the proposed method.
Tasks
Published	2017-08-13
URL	http://arxiv.org/abs/1708.03921v1
PDF	http://arxiv.org/pdf/1708.03921v1.pdf
PWC	https://paperswithcode.com/paper/visual-graph-mining
Repo
Framework

Derivate-based Component-Trees for Multi-Channel Image Segmentation


Title	Derivate-based Component-Trees for Multi-Channel Image Segmentation
Authors	Tobias Böttger, Dominik Gutermuth
Abstract	We introduce the concept of derivate-based component-trees for images with an arbitrary number of channels. The approach is a natural extension of the classical component-tree devoted to gray-scale images. The similar structure enables the translation of many gray-level image processing techniques based on the component-tree to hyperspectral and color images. As an example application, we present an image segmentation approach that extracts Maximally Stable Homogeneous Regions (MSHR). The approach is very similar to MSER but can be applied to images with an arbitrary number of channels. As opposed to MSER, our approach implicitly segments regions that are both lighter and darker than their background for gray-scale images and can be used in OCR applications where MSER will fail. We introduce a local flooding-based immersion for the derivate-based component-tree construction which is linear in the number of pixels. In the experiments, we show that the runtime scales favorably with an increasing number of channels and may improve algorithms which build on MSER.
Tasks	Optical Character Recognition, Semantic Segmentation
Published	2017-05-04
URL	http://arxiv.org/abs/1705.01906v2
PDF	http://arxiv.org/pdf/1705.01906v2.pdf
PWC	https://paperswithcode.com/paper/derivate-based-component-trees-for-multi
Repo
Framework

Cascade Ranking for Operational E-commerce Search


Title	Cascade Ranking for Operational E-commerce Search
Authors	Shichen Liu, Fei Xiao, Wenwu Ou, Luo Si
Abstract	In the ‘Big Data’ era, many real-world applications like search involve the ranking problem for a large number of items. It is important to obtain effective ranking results and, at the same time, to obtain them efficiently and in a timely manner in order to provide a good user experience and save computational costs. Valuable prior research has been conducted on learning to rank efficiently, such as the cascade ranking (learning) model, which uses a sequence of ranking functions to progressively filter some items and rank the remaining items. However, most existing research on learning to rank efficiently in search has been conducted in relatively small computing environments with simulated user queries. This paper presents novel research and a thorough study of designing and deploying a Cascade model in a Large-scale Operational E-commerce Search application (CLOES), which deals with hundreds of millions of user queries per day with hundreds of servers. The challenge of the real-world application provides new insights for research: 1) real-world search applications often involve multiple factors of preferences or constraints with respect to user experience and computational costs, such as search accuracy, search latency, size of search results and total CPU cost, while most existing search solutions only address one or two factors; 2) the effectiveness of e-commerce search involves multiple types of user behavior such as click and purchase, while most existing cascade ranking in search only models the click behavior. Based on these observations, a novel cascade ranking model is designed and deployed in an operational e-commerce search application. An extensive set of experiments demonstrates the advantage of the proposed work in addressing multiple factors of effectiveness, efficiency and user experience in the real-world application.
Tasks
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02093v1
PDF	http://arxiv.org/pdf/1706.02093v1.pdf
PWC	https://paperswithcode.com/paper/cascade-ranking-for-operational-e-commerce
Repo
Framework
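
The general cascade idea mentioned in the abstract, a sequence of ranking functions that progressively filters items, can be sketched as below. This is a generic illustration rather than the CLOES model: the stage definitions, the field names `text_match` and `predicted_purchase`, and the cut-off sizes are all hypothetical.

```python
def cascade_rank(items, stages):
    """Apply ranking stages in order; each stage sorts and keeps the top `keep` items.

    stages is a list of (score_fn, keep) pairs, intended to run from the
    cheapest scoring function to the most expensive one.
    """
    survivors = list(items)
    for score_fn, keep in stages:
        survivors = sorted(survivors, key=score_fn, reverse=True)[:keep]
    return survivors


# Hypothetical candidate items with precomputed feature values.
items = [
    {"id": "A", "text_match": 0.9, "predicted_purchase": 0.2},
    {"id": "B", "text_match": 0.8, "predicted_purchase": 0.7},
    {"id": "C", "text_match": 0.3, "predicted_purchase": 0.9},
    {"id": "D", "text_match": 0.7, "predicted_purchase": 0.5},
]
stages = [
    (lambda it: it["text_match"], 3),          # cheap first-pass filter
    (lambda it: it["predicted_purchase"], 2),  # costlier final ranking
]
result = [it["id"] for it in cascade_rank(items, stages)]
print(result)
```

Item C has the highest purchase score but is dropped by the first stage, which shows the accuracy-for-cost trade-off that cascade designs have to manage.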

No Reference Stereoscopic Video Quality Assessment Using Joint Motion and Depth Statistics


Title	No Reference Stereoscopic Video Quality Assessment Using Joint Motion and Depth Statistics
Authors	Appina Balasubramanyam, Jalli Akshith, Battula Shanmukh Srinivas, Channappayya S Sumohana
Abstract	We present a no reference (NR) quality assessment algorithm for assessing the perceptual quality of natural stereoscopic 3D (S3D) videos. This work is inspired by our finding that the joint statistics of the subband coefficients of motion (optical flow or motion vector magnitude) and depth (disparity map) of natural S3D videos possess a unique signature. Specifically, we empirically show that the joint statistics of the motion and depth subband coefficients of S3D video frames can be modeled accurately using a Bivariate Generalized Gaussian Distribution (BGGD). We then demonstrate that the parameters of the BGGD model possess the ability to discern quality variations in S3D videos. Therefore, the BGGD model parameters are employed as motion and depth quality features. In addition to these features, we rely on a frame level spatial quality feature that is computed using a robust off the shelf NR image quality assessment (IQA) algorithm. These frame level motion, depth and spatial features are consolidated and used with the corresponding S3D video’s difference mean opinion score (DMOS) labels for supervised learning using support vector regression (SVR). The overall quality of an S3D video is computed by averaging the frame level quality predictions of the constituent video frames. The proposed algorithm, dubbed Video QUality Evaluation using MOtion and DEpth Statistics (VQUEMODES) is shown to outperform the state of the art methods when evaluated over the IRCCYN and LFOVIA S3D subjective quality assessment databases.
Tasks	Image Quality Assessment, Optical Flow Estimation, Video Quality Assessment
Published	2017-11-15
URL	http://arxiv.org/abs/1711.05480v1
PDF	http://arxiv.org/pdf/1711.05480v1.pdf
PWC	https://paperswithcode.com/paper/no-reference-stereoscopic-video-quality
Repo
Framework

Lexical Resources for Hindi Marathi MT


Title	Lexical Resources for Hindi Marathi MT
Authors	Sreelekha S, Pushpak Bhattacharyya
Abstract	In this paper we describe some ways to utilize various lexical resources to improve the quality of a statistical machine translation system. We have augmented the training corpus with various lexical resources such as the IndoWordnet semantic relation set, function words, kridanta pairs and verb phrases. Our research on the usage of lexical resources focused mainly on two approaches: augmenting the parallel corpus with more vocabulary and augmenting it with various word forms. We describe case studies, evaluations and a detailed error analysis for both Marathi to Hindi and Hindi to Marathi machine translation systems. From the evaluations we observed an incremental growth in the quality of machine translation as the usage of various lexical resources increases. Moreover, the usage of various lexical resources helps to improve the coverage and quality of machine translation where only a limited parallel corpus is available.
Tasks	Machine Translation
Published	2017-03-04
URL	http://arxiv.org/abs/1703.01485v1
PDF	http://arxiv.org/pdf/1703.01485v1.pdf
PWC	https://paperswithcode.com/paper/lexical-resources-for-hindi-marathi-mt
Repo
Framework

Training an adaptive dialogue policy for interactive learning of visually grounded word meanings


Title	Training an adaptive dialogue policy for interactive learning of visually grounded word meanings
Authors	Yanchao Yu, Arash Eshghi, Oliver Lemon
Abstract	We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic parsing/generation framework - Dynamic Syntax and Type Theory with Records (DS-TTR) - with a set of visual classifiers that are learned throughout the interaction and which ground the meaning representations that it produces. We use this system in interaction with a simulated human tutor to study the effects of different dialogue policies and capabilities on the accuracy of learned meanings, learning rates, and efforts/costs to the tutor. We show that the overall performance of the learning agent is affected by (1) who takes initiative in the dialogues; (2) the ability to express/use their confidence level about visual attributes; and (3) the ability to process elliptical and incrementally constructed dialogue turns. Ultimately, we train an adaptive dialogue policy which optimises the trade-off between classifier accuracy and tutoring costs.
Tasks	Semantic Parsing
Published	2017-09-29
URL	http://arxiv.org/abs/1709.10426v1
PDF	http://arxiv.org/pdf/1709.10426v1.pdf
PWC	https://paperswithcode.com/paper/training-an-adaptive-dialogue-policy-for
Repo
Framework

An Ensemble Classifier for Predicting the Onset of Type II Diabetes


Title	An Ensemble Classifier for Predicting the Onset of Type II Diabetes
Authors	John Semerdjian, Spencer Frank
Abstract	Prediction of disease onset from patient survey and lifestyle data is quickly becoming an important tool for diagnosing a disease before it progresses. In this study, data from the National Health and Nutrition Examination Survey (NHANES) questionnaire is used to predict the onset of type II diabetes. An ensemble model using the output of five classification algorithms was developed to predict the onset of diabetes based on 16 features. The ensemble model had an AUC of 0.834, indicating high performance.
Tasks
Published	2017-08-24
URL	http://arxiv.org/abs/1708.07480v1
PDF	http://arxiv.org/pdf/1708.07480v1.pdf
PWC	https://paperswithcode.com/paper/an-ensemble-classifier-for-predicting-the
Repo
Framework
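
As an illustration of combining classifier outputs and scoring the result with AUC, here is a minimal sketch. The abstract does not say how the five model outputs are combined, so simple probability averaging is an assumption; the two toy models and their predicted probabilities are made up for the example.

```python
def average_probabilities(per_model_probs):
    """Average the predicted positive-class probabilities across models."""
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: four patients, two hypothetical models.
labels = [1, 0, 1, 0]
model_probs = [
    [0.9, 0.7, 0.4, 0.2],
    [0.7, 0.5, 0.4, 0.2],
]
ensemble = average_probabilities(model_probs)
auc = roc_auc(labels, ensemble)
print(auc)
```

The pairwise definition is quadratic in the number of samples, which is fine for a small example; rank-based formulas are the usual choice for larger data.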