October 18, 2019

Paper Group ANR 525

Malicious Web Domain Identification using Online Credibility and Performance Data by Considering the Class Imbalance Issue. Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting. Non-ergodic Complexity of Convex Proximal Inertial Gradient Descents. Genre-Agnostic Key Classificat …

Malicious Web Domain Identification using Online Credibility and Performance Data by Considering the Class Imbalance Issue

Title Malicious Web Domain Identification using Online Credibility and Performance Data by Considering the Class Imbalance Issue
Authors Zhongyi Hu, Raymond Chiong, Ilung Pranata, Yukun Bao, Yuqing Lin
Abstract Purpose: Malicious web domain identification is of significant importance to the security protection of Internet users. With online credibility and performance data, this paper aims to investigate the use of machine learning techniques for malicious web domain identification by considering the class imbalance issue (i.e., there are more benign web domains than malicious ones). Design/methodology/approach: We propose an integrated resampling approach to handle class imbalance by combining the Synthetic Minority Over-sampling TEchnique (SMOTE) and Particle Swarm Optimisation (PSO), a population-based meta-heuristic algorithm. We use the SMOTE for over-sampling and PSO for under-sampling. Findings: By applying eight well-known machine learning classifiers, the proposed integrated resampling approach is comprehensively examined using several imbalanced web domain datasets with different imbalance ratios. Compared to five other well-known resampling approaches, experimental results confirm that the proposed approach is highly effective. Practical implications: This study not only inspires the practical use of online credibility and performance data for identifying malicious web domains, but also provides an effective resampling approach for handling the class imbalance issue in the area of malicious web domain identification. Originality/value: Online credibility and performance data is applied to build malicious web domain identification models using machine learning techniques. An integrated resampling approach is proposed to address the class imbalance issue. The performance of the proposed approach is confirmed based on real-world datasets with different imbalance ratios.
Tasks
Published 2018-10-19
URL http://arxiv.org/abs/1810.08359v1
PDF http://arxiv.org/pdf/1810.08359v1.pdf
PWC https://paperswithcode.com/paper/malicious-web-domain-identification-using
Repo
Framework
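
The over-sampling half of the approach above relies on SMOTE, which builds synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that interpolation step only; the function name `smote_sample`, the parameters `k`, `n_new`, and `seed`, and the toy data are illustrative assumptions, and the PSO-based under-sampling described in the abstract is not shown.

```python
import random

def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Toy minority class: the four corners of the unit square (hypothetical data).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(minority)
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region already occupied by the minority class.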

Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting

Title Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting
Authors Yanwei Cui, Rogatien Tobossi, Olivia Vigouroux
Abstract In this paper, we apply neural networks to digital marketing in order to better target potential customers. To do so, we model customer online behaviours using dedicated neural network architectures. Starting from the keywords a user enters in a search engine, through the landing page and the pages that follow, until the user leaves the site, we model the whole visit journey with a Recurrent Neural Network (RNN), together with Convolutional Neural Networks (CNN) that can take into account the semantic meaning of the searched keywords and the names of the visited pages. With such a model, we use Monte Carlo simulation to estimate the conversion rate of each potential customer in future visits. We believe our concept and the promising preliminary results in this paper enable the use of widely available customer online behaviour data for advanced digital marketing analysis.
Tasks
Published 2018-04-20
URL http://arxiv.org/abs/1804.07669v1
PDF http://arxiv.org/pdf/1804.07669v1.pdf
PWC https://paperswithcode.com/paper/modelling-customer-online-behaviours-with
Repo
Framework

Non-ergodic Complexity of Convex Proximal Inertial Gradient Descents

Title Non-ergodic Complexity of Convex Proximal Inertial Gradient Descents
Authors Tao Sun, Linbo Qiao, Dongsheng Li
Abstract Proximal inertial gradient descent is efficient for composite minimization and applicable to a broad range of machine learning problems. In this paper, we revisit the computational complexity of this algorithm and present other novel results, especially on the convergence rates of the objective function values. The non-ergodic O(1/k) rate is proved for proximal inertial gradient descent with constant stepsize when the objective function is coercive. When the objective function fails to promise coercivity, we prove the sublinear rate with diminishing inertial parameters. In the case that the objective function satisfies the optimal strong convexity condition (which is much weaker than strong convexity), linear convergence is proved with a much larger and more general stepsize than in the previous literature. We also extend our results to the multi-block version and present the computational complexity. Both cyclic and stochastic index selection strategies are considered.
Tasks
Published 2018-01-23
URL https://arxiv.org/abs/1801.07389v3
PDF https://arxiv.org/pdf/1801.07389v3.pdf
PWC https://paperswithcode.com/paper/on-the-complexity-of-convex-inertial-proximal
Repo
Framework
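
To make the iteration discussed above concrete, here is a small sketch of one common form of proximal inertial gradient descent, x_{k+1} = prox(x_k - step * grad f(x_k) + beta * (x_k - x_{k-1})), applied to a separable toy problem: minimise 0.5 * (x - b)^2 + lam * |x| per coordinate. The problem, the function names, and the values of `step` and `beta` are assumptions chosen for illustration; they are not taken from the paper.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |x| (soft-thresholding)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def inertial_prox_grad(b, lam, step=0.5, beta=0.3, iters=200):
    """Minimise sum_i 0.5 * (x_i - b_i)^2 + lam * |x_i| with an inertial term."""
    x_prev = [0.0] * len(b)
    x = [0.0] * len(b)
    for _ in range(iters):
        x_next = []
        for xi, xpi, bi in zip(x, x_prev, b):
            grad = xi - bi  # gradient of the smooth part 0.5 * (x - b)^2
            # Gradient step plus inertial (momentum) term, then the prox.
            v = xi - step * grad + beta * (xi - xpi)
            x_next.append(soft_threshold(v, step * lam))
        x_prev, x = x, x_next
    return x

result = inertial_prox_grad([3.0, 0.5, -2.0], lam=1.0)
```

For this toy objective the exact minimiser is the soft-thresholding of `b` by `lam`, which gives a simple way to check that the iterates settle at the right point.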

Genre-Agnostic Key Classification With Convolutional Neural Networks

Title Genre-Agnostic Key Classification With Convolutional Neural Networks
Authors Filip Korzeniowski, Gerhard Widmer
Abstract We propose modifications to the model structure and training procedure to a recently introduced Convolutional Neural Network for musical key classification. These modifications enable the network to learn a genre-independent model that performs better than models trained for specific music styles, which has not been the case in existing work. We analyse this generalisation capability on three datasets comprising distinct genres. We then evaluate the model on a number of unseen data sets, and show its superior performance compared to the state of the art. Finally, we investigate the model’s performance on short excerpts of audio. From these experiments, we conclude that models need to consider the harmonic coherence of the whole piece when classifying the local key of short segments of audio.
Tasks
Published 2018-08-16
URL http://arxiv.org/abs/1808.05340v1
PDF http://arxiv.org/pdf/1808.05340v1.pdf
PWC https://paperswithcode.com/paper/genre-agnostic-key-classification-with
Repo
Framework

Unsupervised Segmentation of 3D Medical Images Based on Clustering and Deep Representation Learning

Title Unsupervised Segmentation of 3D Medical Images Based on Clustering and Deep Representation Learning
Authors Takayasu Moriya, Holger R. Roth, Shota Nakamura, Hirohisa Oda, Kai Nagara, Masahiro Oda, Kensaku Mori
Abstract This paper presents a novel unsupervised segmentation method for 3D medical images. Convolutional neural networks (CNNs) have brought significant advances in image segmentation. However, most of the recent methods rely on supervised learning, which requires large amounts of manually annotated data. Thus, it is challenging for these methods to cope with the growing amount of medical images. This paper proposes a unified approach to unsupervised deep representation learning and clustering for segmentation. Our proposed method consists of two phases. In the first phase, we learn deep feature representations of training patches from a target image using joint unsupervised learning (JULE) that alternately clusters representations generated by a CNN and updates the CNN parameters using cluster labels as supervisory signals. We extend JULE to 3D medical images by utilizing 3D convolutions throughout the CNN architecture. In the second phase, we apply k-means to the deep representations from the trained CNN and then project cluster labels to the target image in order to obtain the fully segmented image. We evaluated our methods on three images of lung cancer specimens scanned with micro-computed tomography (micro-CT). The automatic segmentation of pathological regions in micro-CT could further contribute to the pathological examination process. Hence, we aim to automatically divide each image into the regions of invasive carcinoma, noninvasive carcinoma, and normal tissue. Our experiments show the potential abilities of unsupervised deep representation learning for medical image segmentation.
Tasks Medical Image Segmentation, Representation Learning, Semantic Segmentation
Published 2018-04-11
URL http://arxiv.org/abs/1804.03830v1
PDF http://arxiv.org/pdf/1804.03830v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-segmentation-of-3d-medical
Repo
Framework

Multi-objective Analysis of MAP-Elites Performance

Title Multi-objective Analysis of MAP-Elites Performance
Authors Eivind Samuelsen, Kyrre Glette
Abstract In certain complex optimization tasks, it becomes necessary to use multiple measures to characterize the performance of different algorithms. This paper presents a method that combines ordinal effect sizes with Pareto dominance to analyze such cases. Since the method is ordinal, it can also generalize across different optimization tasks even when the performance measurements are differently scaled. Through a case study, we show that this method can discover and quantify relations that would be difficult to deduce using a conventional measure-by-measure analysis. This case study applies the method to the evolution of robot controller repertoires using the MAP-Elites algorithm. Here, we analyze the search performance across a large set of parametrizations; varying mutation size and operator type, as well as map resolution, across four different robot morphologies. We show that the average magnitude of mutations has a bigger effect on outcomes than their precise distributions.
Tasks
Published 2018-03-14
URL http://arxiv.org/abs/1803.05174v2
PDF http://arxiv.org/pdf/1803.05174v2.pdf
PWC https://paperswithcode.com/paper/multi-objective-analysis-of-map-elites
Repo
Framework
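
The two ingredients named in the abstract above, an ordinal effect size and Pareto dominance, can each be sketched in a few lines. The code below is an illustrative assumption rather than the authors' procedure: it uses the Vargha-Delaney A statistic as one example of an ordinal effect size (the abstract does not say which one is used), treats higher values as better, and runs on made-up data.

```python
def a_measure(xs, ys):
    """Vargha-Delaney A: probability that a value drawn from xs exceeds one
    drawn from ys, counting ties as one half. Depends only on ordering."""
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)
    return wins / (len(xs) * len(ys))

def dominates(a, b):
    """True if a is at least as good as b on every measure and strictly
    better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

effect = a_measure([3, 4, 5], [1, 2, 3])
scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]  # hypothetical measures
front = pareto_front(scores)
```

Both functions use only comparisons between values, which is why an analysis of this kind is insensitive to how the individual performance measures are scaled.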

Comprehending Real Numbers: Development of Bengali Real Number Speech Corpus

Title Comprehending Real Numbers: Development of Bengali Real Number Speech Corpus
Authors Md Mahadi Hasan Nahid, Md. Ashraful Islam, Bishwajit Purkaystha, Md Saiful Islam
Abstract Speech recognition has received less attention in the Bengali literature due to the lack of a comprehensive dataset. In this paper, we describe the development process of the first comprehensive Bengali speech dataset on real numbers. It covers all the possible words that may arise in uttering any Bengali real number. The corpus has ten speakers from different regions of native Bengali speakers. It comprises more than two thousand speech samples with a total duration of close to four hours. We also provide a deep analysis of our corpus, highlight some of its notable features, and finally evaluate the performance of two notable Bengali speech recognizers on it.
Tasks Speech Recognition
Published 2018-03-27
URL http://arxiv.org/abs/1803.10136v1
PDF http://arxiv.org/pdf/1803.10136v1.pdf
PWC https://paperswithcode.com/paper/comprehending-real-numbers-development-of
Repo
Framework

Optimized Hidden Markov Model based on Constrained Particle Swarm Optimization

Title Optimized Hidden Markov Model based on Constrained Particle Swarm Optimization
Authors L. Chang, Yacine Ouzrout, Antoine Nongaillard, Abdelaziz Bouras
Abstract As a Bayesian analysis tool, the Hidden Markov Model (HMM) has been used in a wide range of applications. Most HMMs are solved with the Baum-Welch algorithm (BWHMM) to estimate the model parameters, which makes it difficult to find globally optimal solutions. This paper proposes an optimized Hidden Markov Model based on the Particle Swarm Optimization (PSO) algorithm, called PSOHMM. To satisfy the statistical constraints of the HMM, the paper develops re-normalization and re-mapping mechanisms. The experiments show that PSOHMM finds better solutions than BWHMM and converges faster.
Tasks
Published 2018-11-07
URL http://arxiv.org/abs/1811.03450v1
PDF http://arxiv.org/pdf/1811.03450v1.pdf
PWC https://paperswithcode.com/paper/optimized-hidden-markov-model-based-on
Repo
Framework
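
The constraint in question is that every row of an HMM transition or emission matrix must be a probability distribution, which an unconstrained PSO position update does not guarantee. The abstract does not spell out the re-mapping and re-normalization mechanisms, so the sketch below is only one plausible reading, stated as an assumption: negative entries are mapped to a small positive value `eps`, then each row is divided by its sum. The names `repair_row` and `repair_matrix` are hypothetical.

```python
def repair_row(row, eps=1e-9):
    """Map invalid (non-positive) entries to eps, then rescale to sum to 1."""
    clipped = [max(v, eps) for v in row]   # assumed re-mapping step
    total = sum(clipped)
    return [v / total for v in clipped]    # re-normalization step

def repair_matrix(matrix):
    """Apply the row repair to every row of a candidate HMM matrix."""
    return [repair_row(row) for row in matrix]

# A particle position after an update, with entries outside [0, 1] (toy values).
candidate = [[0.7, -0.2, 0.6], [1.4, 0.3, 0.3]]
repaired = repair_matrix(candidate)
```

After the repair, each row can be used directly as a transition or emission distribution when evaluating the particle's likelihood.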

Generalized Coarse-to-Fine Visual Recognition with Progressive Training

Title Generalized Coarse-to-Fine Visual Recognition with Progressive Training
Authors Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille
Abstract Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn. Coarse-to-fine (C2F) learning is a promising direction, but it remains unclear how it is applied to a wide range of vision problems. This paper presents a generalized C2F framework by making two technical contributions. First, we provide a unified way of C2F propagation, in which the coarse prediction (a class vector, a detected box, a segmentation mask, etc.) is encoded into a dense (pixel-level) matrix and concatenated to the original input, so that the fine model takes the same design of the coarse model but sees additional information. Second, we present a progressive training strategy which starts with feeding the ground-truth instead of the coarse output into the fine model, and gradually increases the fraction of coarse output, so that at the end of training the fine model is ready for testing. We also relate our approach to curriculum learning by showing that data difficulty keeps increasing during the training process. We apply our framework to three vision tasks including image classification, object localization and semantic segmentation, and demonstrate consistent accuracy gain compared to the baseline training strategy.
Tasks Image Classification, Object Localization, Semantic Segmentation
Published 2018-11-29
URL http://arxiv.org/abs/1811.12047v2
PDF http://arxiv.org/pdf/1811.12047v2.pdf
PWC https://paperswithcode.com/paper/progressive-recurrent-learning-for-visual
Repo
Framework
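
The progressive training strategy described above can be pictured as a schedule that decides, per training sample, whether the fine model is fed the ground truth or the coarse output. The sketch below assumes a linear schedule and random per-sample selection; the abstract only says the fraction of coarse output is increased gradually, so the schedule shape and the names `coarse_fraction` and `pick_guidance` are illustrative assumptions.

```python
import random

def coarse_fraction(epoch, total_epochs):
    """Fraction of samples that receive the coarse output (assumed linear ramp)."""
    return min(1.0, epoch / max(1, total_epochs - 1))

def pick_guidance(ground_truth, coarse_output, epoch, total_epochs, rng):
    """Choose the extra input for the fine model for one training sample."""
    use_coarse = rng.random() < coarse_fraction(epoch, total_epochs)
    return coarse_output if use_coarse else ground_truth

rng = random.Random(0)
first = pick_guidance("gt_mask", "coarse_mask", epoch=0, total_epochs=10, rng=rng)
last = pick_guidance("gt_mask", "coarse_mask", epoch=9, total_epochs=10, rng=rng)
```

At the start of training the fine model sees only ground truth, and by the final epoch it sees only coarse predictions, which matches the input it will receive at test time.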

Capturing human category representations by sampling in deep feature spaces

Title Capturing human category representations by sampling in deep feature spaces
Authors Joshua C. Peterson, Jordan W. Suchow, Krisha Aghi, Alexander Y. Ku, Thomas L. Griffiths
Abstract Understanding how people represent categories is a core problem in cognitive science. Decades of research have yielded a variety of formal theories of categories, but validating them with naturalistic stimuli is difficult. The challenge is that human category representations cannot be directly observed and running informative experiments with naturalistic stimuli such as images requires a workable representation of these stimuli. Deep neural networks have recently been successful in solving a range of computer vision tasks and provide a way to compactly represent image features. Here, we introduce a method to estimate the structure of human categories that combines ideas from cognitive science and machine learning, blending human-based algorithms with state-of-the-art deep image generators. We provide qualitative and quantitative results as a proof-of-concept for the method’s feasibility. Samples drawn from human distributions rival those from state-of-the-art generative models in quality and outperform alternative methods for estimating the structure of human categories.
Tasks
Published 2018-05-19
URL http://arxiv.org/abs/1805.07644v1
PDF http://arxiv.org/pdf/1805.07644v1.pdf
PWC https://paperswithcode.com/paper/capturing-human-category-representations-by
Repo
Framework

Instance Segmentation and Tracking with Cosine Embeddings and Recurrent Hourglass Networks

Title Instance Segmentation and Tracking with Cosine Embeddings and Recurrent Hourglass Networks
Authors Christian Payer, Darko Štern, Thomas Neff, Horst Bischof, Martin Urschler
Abstract Different to semantic segmentation, instance segmentation assigns unique labels to each individual instance of the same class. In this work, we propose a novel recurrent fully convolutional network architecture for tracking such instance segmentations over time. The network architecture incorporates convolutional gated recurrent units (ConvGRU) into a stacked hourglass network to utilize temporal video information. Furthermore, we train the network with a novel embedding loss based on cosine similarities, such that the network predicts unique embeddings for every instance throughout videos. Afterwards, these embeddings are clustered among subsequent video frames to create the final tracked instance segmentations. We evaluate the recurrent hourglass network by segmenting left ventricles in MR videos of the heart, where it outperforms a network that does not incorporate video information. Furthermore, we show applicability of the cosine embedding loss for segmenting leaf instances on still images of plants. Finally, we evaluate the framework for instance segmentation and tracking on six datasets of the ISBI celltracking challenge, where it shows state-of-the-art performance.
Tasks Instance Segmentation, Semantic Segmentation
Published 2018-06-06
URL http://arxiv.org/abs/1806.02070v3
PDF http://arxiv.org/pdf/1806.02070v3.pdf
PWC https://paperswithcode.com/paper/instance-segmentation-and-tracking-with
Repo
Framework

Cluster Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters

Title Cluster Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters
Authors Qidi Peng, Nan Rao, Ran Zhao
Abstract We conduct cluster analysis on a class of locally asymptotically self-similar stochastic processes, which includes multifractional Brownian motion as a representative. When the true number of clusters is supposed to be known, a new covariance-based dissimilarity measure is introduced, from which we obtain the approximately asymptotically consistent clustering algorithms. In simulation studies, clustering data sampled from multifractional Brownian motions with distinct functional Hurst parameters illustrates the approximated asymptotic consistency of the proposed algorithms. Clustering global financial markets’ equity indexes returns and sovereign CDS spreads provides a successful real world application.
Tasks
Published 2018-04-13
URL https://arxiv.org/abs/1804.06234v6
PDF https://arxiv.org/pdf/1804.06234v6.pdf
PWC https://paperswithcode.com/paper/clustering-analysis-on-locally-asymptotically
Repo
Framework

Explaining Neural Networks Semantically and Quantitatively

Title Explaining Neural Networks Semantically and Quantitatively
Authors Runjin Chen, Hao Chen, Ge Huang, Jie Ren, Quanshi Zhang
Abstract This paper presents a method to explain the knowledge encoded in a convolutional neural network (CNN) quantitatively and semantically. Analysing the specific rationale of each prediction made by the CNN is a key issue in understanding neural networks, and it is also of significant practical value in certain applications. In this study, we propose to distill knowledge from the CNN into an explainable additive model, so that we can use the explainable model to provide a quantitative explanation for the CNN prediction. We analyze the typical bias-interpreting problem of the explainable model and develop prior losses to guide the learning of the explainable additive model. Experimental results have demonstrated the effectiveness of our method.
Tasks
Published 2018-12-18
URL http://arxiv.org/abs/1812.07169v1
PDF http://arxiv.org/pdf/1812.07169v1.pdf
PWC https://paperswithcode.com/paper/explaining-neural-networks-semantically-and
Repo
Framework

Machine-learning inference of fluid variables from data using reservoir computing

Title Machine-learning inference of fluid variables from data using reservoir computing
Authors Kengo Nakai, Yoshitaka Saiki
Abstract We infer both microscopic and macroscopic behaviors of a three-dimensional chaotic fluid flow using reservoir computing. In our inference procedure, we assume no prior knowledge of the physical process of the fluid flow except that its behavior is complex but deterministic. We present two ways of inferring the complex behavior: the first, called partial-inference, requires continued knowledge of partial time-series data during the inference as well as past time-series data, while the second, called full-inference, requires only past time-series data as training data. For the first case, we are able to infer the long-time motion of microscopic fluid variables. For the second case, we show that the reservoir dynamics constructed from only past data of energy functions can infer the future behavior of energy functions and reproduce the energy spectrum. It is also shown that we can infer time-series data from only one measurement by using delay coordinates. These results imply that the two reservoir systems obtained without knowledge of microscopic data are equivalent to the dynamical systems describing the macroscopic behavior of energy functions.
Tasks Time Series
Published 2018-05-23
URL http://arxiv.org/abs/1805.09917v3
PDF http://arxiv.org/pdf/1805.09917v3.pdf
PWC https://paperswithcode.com/paper/machine-learning-inference-of-fluid-variables
Repo
Framework
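
For readers unfamiliar with reservoir computing, the core of the method is a fixed random recurrent network whose state is driven by the input time series; only a linear readout on that state is trained. The sketch below shows a leaky tanh state update that is commonly used for such reservoirs. It is an assumption for illustration, not the authors' configuration: the matrix names `A` and `W_in`, the reservoir size, the leak rate, and the sine input are all hypothetical, and the readout training is omitted.

```python
import math
import random

def make_matrix(rows, cols, scale, rng):
    """Random matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def reservoir_step(state, u, A, W_in, leak=0.5):
    """One leaky update: r <- (1 - leak) * r + leak * tanh(A r + W_in u)."""
    new_state = []
    for i, r_i in enumerate(state):
        pre = sum(a * r for a, r in zip(A[i], state))
        pre += sum(w * x for w, x in zip(W_in[i], u))
        new_state.append((1.0 - leak) * r_i + leak * math.tanh(pre))
    return new_state

rng = random.Random(0)
n, d = 20, 1                       # reservoir size and input dimension
A = make_matrix(n, n, 0.1, rng)    # fixed recurrent weights (never trained)
W_in = make_matrix(n, d, 1.0, rng) # fixed input weights (never trained)
state = [0.0] * n
for t in range(100):
    state = reservoir_step(state, [math.sin(0.1 * t)], A, W_in)
```

In the full-inference setting described in the abstract, the trained readout's prediction would be fed back as the next input `u` once the training data ends, so the reservoir runs autonomously.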

Bitstream-Based JPEG Image Encryption with File-Size Preserving

Title Bitstream-Based JPEG Image Encryption with File-Size Preserving
Authors Hiroyuki Kobayashi, Hitoshi Kiya
Abstract An encryption scheme for JPEG images in the bitstream domain is proposed. The proposed scheme preserves the JPEG format even after encrypting the images, and the file size of the encrypted images is exactly the same as that of the original JPEG images. Several methods for encrypting JPEG images in the bitstream domain have been proposed. However, since some marker codes are generated or lost in the encryption process, the file size of JPEG bitstreams generally changes due to the encryption operations. The proposed method takes JPEG bitstreams as input and selectively encrypts the additional bit components of the Huffman code in the bitstreams. This feature allows us to have encrypted images with the same data size as that recoded in the image transmission process, when JPEG images are replaced with the encrypted ones by the hooking, so that the image transmission is successfully carried out after the hooking.
Tasks
Published 2018-08-17
URL http://arxiv.org/abs/1808.06495v1
PDF http://arxiv.org/pdf/1808.06495v1.pdf
PWC https://paperswithcode.com/paper/bitstream-based-jpeg-image-encryption-with
Repo
Framework