Paper Group ANR 525
Malicious Web Domain Identification using Online Credibility and Performance Data by Considering the Class Imbalance Issue
Title | Malicious Web Domain Identification using Online Credibility and Performance Data by Considering the Class Imbalance Issue |
Authors | Zhongyi Hu, Raymond Chiong, Ilung Pranata, Yukun Bao, Yuqing Lin |
Abstract | Purpose: Malicious web domain identification is of significant importance to the security protection of Internet users. With online credibility and performance data, this paper aims to investigate the use of machine learning techniques for malicious web domain identification by considering the class imbalance issue (i.e., there are more benign web domains than malicious ones). Design/methodology/approach: We propose an integrated resampling approach to handle class imbalance by combining the Synthetic Minority Over-sampling TEchnique (SMOTE) and Particle Swarm Optimisation (PSO), a population-based meta-heuristic algorithm. We use the SMOTE for over-sampling and PSO for under-sampling. Findings: By applying eight well-known machine learning classifiers, the proposed integrated resampling approach is comprehensively examined using several imbalanced web domain datasets with different imbalance ratios. Compared to five other well-known resampling approaches, experimental results confirm that the proposed approach is highly effective. Practical implications: This study not only inspires the practical use of online credibility and performance data for identifying malicious web domains, but also provides an effective resampling approach for handling the class imbalance issue in the area of malicious web domain identification. Originality/value: Online credibility and performance data is applied to build malicious web domain identification models using machine learning techniques. An integrated resampling approach is proposed to address the class imbalance issue. The performance of the proposed approach is confirmed based on real-world datasets with different imbalance ratios. |
Tasks | |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08359v1 |
http://arxiv.org/pdf/1810.08359v1.pdf | |
PWC | https://paperswithcode.com/paper/malicious-web-domain-identification-using |
Repo | |
Framework | |
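The over-sampling half of the approach described in the abstract above can be illustrated with a minimal SMOTE-style sketch. This is not the authors' implementation: the function name `smote_sketch`, its parameters, and the brute-force neighbour search are assumptions made for illustration, and the PSO-based under-sampling step is not shown.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (hypothetical helper; needs at least two minority samples)."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    # Pick a base sample and one of its neighbours for each new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.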
Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting
Title | Modelling customer online behaviours with neural networks: applications to conversion prediction and advertising retargeting |
Authors | Yanwei Cui, Rogatien Tobossi, Olivia Vigouroux |
Abstract | In this paper, we apply neural networks to the digital marketing world for the purpose of better targeting potential customers. To do so, we model customer online behaviours using dedicated neural network architectures. Starting from the keywords a user searched in a search engine, through the landing page and the pages that follow, until the user leaves the site, we model the whole visit journey with a Recurrent Neural Network (RNN), together with Convolutional Neural Networks (CNN) that can take into account the semantic meaning of the searched keywords and the names of the visited pages. With such a model, we use Monte Carlo simulation to estimate the conversion rate of each potential customer in future visits. We believe our concept and the preliminary promising results in this paper enable the use of widely available customer online behaviour data for advanced digital marketing analysis. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07669v1 |
http://arxiv.org/pdf/1804.07669v1.pdf | |
PWC | https://paperswithcode.com/paper/modelling-customer-online-behaviours-with |
Repo | |
Framework | |
Non-ergodic Complexity of Convex Proximal Inertial Gradient Descents
Title | Non-ergodic Complexity of Convex Proximal Inertial Gradient Descents |
Authors | Tao Sun, Linbo Qiao, Dongsheng Li |
Abstract | The proximal inertial gradient descent is efficient for composite minimization and applicable to a broad range of machine learning problems. In this paper, we revisit the computational complexity of this algorithm and present other novel results, especially on the convergence rates of the objective function values. The non-ergodic O(1/k) rate is proved for proximal inertial gradient descent with constant stepsize when the objective function is coercive. When the objective function fails to promise coercivity, we prove the sublinear rate with diminishing inertial parameters. In the case that the objective function satisfies the optimal strong convexity condition (which is much weaker than strong convexity), linear convergence is proved with a much larger and more general stepsize than in previous literature. We also extend our results to the multi-block version and present the computational complexity. Both cyclic and stochastic index selection strategies are considered. |
Tasks | |
Published | 2018-01-23 |
URL | https://arxiv.org/abs/1801.07389v3 |
https://arxiv.org/pdf/1801.07389v3.pdf | |
PWC | https://paperswithcode.com/paper/on-the-complexity-of-convex-inertial-proximal |
Repo | |
Framework | |
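For orientation, a proximal inertial (heavy-ball style) gradient step on a composite objective f(x) + g(x) is commonly written as x_{k+1} = prox_{γg}(x_k − γ∇f(x_k) + β(x_k − x_{k−1})). The sketch below applies that update to a lasso problem. The problem choice, the function names, the inertial parameter `beta`, and the conservative stepsize (1 − β)/L are all assumptions for illustration; they are not taken from the paper, whose exact scheme and stepsize conditions may differ.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_inertial_gd(A, b, lam, beta=0.3, n_iter=500):
    """Minimise 0.5 * ||A x - b||^2 + lam * ||x||_1 with a constant
    stepsize and a constant inertial parameter beta (illustrative values)."""
    L = np.linalg.norm(A, 2) ** 2   # Lipschitz constant of the smooth gradient
    step = (1.0 - beta) / L         # conservative choice, assumed for this sketch
    x_prev = x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        # Gradient step plus inertial (momentum) term, then the proximal map.
        y = x - step * grad + beta * (x - x_prev)
        x_prev, x = x, soft_threshold(y, step * lam)
    return x
```

With `A` set to the identity, the minimiser is simply `soft_threshold(b, lam)`, which gives a quick sanity check of the iteration.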
Genre-Agnostic Key Classification With Convolutional Neural Networks
Title | Genre-Agnostic Key Classification With Convolutional Neural Networks |
Authors | Filip Korzeniowski, Gerhard Widmer |
Abstract | We propose modifications to the model structure and training procedure of a recently introduced Convolutional Neural Network for musical key classification. These modifications enable the network to learn a genre-independent model that performs better than models trained for specific music styles, which has not been the case in existing work. We analyse this generalisation capability on three datasets comprising distinct genres. We then evaluate the model on a number of unseen datasets, and show its superior performance compared to the state of the art. Finally, we investigate the model’s performance on short excerpts of audio. From these experiments, we conclude that models need to consider the harmonic coherence of the whole piece when classifying the local key of short segments of audio. |
Tasks | |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05340v1 |
http://arxiv.org/pdf/1808.05340v1.pdf | |
PWC | https://paperswithcode.com/paper/genre-agnostic-key-classification-with |
Repo | |
Framework | |
Unsupervised Segmentation of 3D Medical Images Based on Clustering and Deep Representation Learning
Title | Unsupervised Segmentation of 3D Medical Images Based on Clustering and Deep Representation Learning |
Authors | Takayasu Moriya, Holger R. Roth, Shota Nakamura, Hirohisa Oda, Kai Nagara, Masahiro Oda, Kensaku Mori |
Abstract | This paper presents a novel unsupervised segmentation method for 3D medical images. Convolutional neural networks (CNNs) have brought significant advances in image segmentation. However, most of the recent methods rely on supervised learning, which requires large amounts of manually annotated data. Thus, it is challenging for these methods to cope with the growing amount of medical images. This paper proposes a unified approach to unsupervised deep representation learning and clustering for segmentation. Our proposed method consists of two phases. In the first phase, we learn deep feature representations of training patches from a target image using joint unsupervised learning (JULE) that alternately clusters representations generated by a CNN and updates the CNN parameters using cluster labels as supervisory signals. We extend JULE to 3D medical images by utilizing 3D convolutions throughout the CNN architecture. In the second phase, we apply k-means to the deep representations from the trained CNN and then project cluster labels to the target image in order to obtain the fully segmented image. We evaluated our methods on three images of lung cancer specimens scanned with micro-computed tomography (micro-CT). The automatic segmentation of pathological regions in micro-CT could further contribute to the pathological examination process. Hence, we aim to automatically divide each image into the regions of invasive carcinoma, noninvasive carcinoma, and normal tissue. Our experiments show the potential abilities of unsupervised deep representation learning for medical image segmentation. |
Tasks | Medical Image Segmentation, Representation Learning, Semantic Segmentation |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.03830v1 |
http://arxiv.org/pdf/1804.03830v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-segmentation-of-3d-medical |
Repo | |
Framework | |
Multi-objective Analysis of MAP-Elites Performance
Title | Multi-objective Analysis of MAP-Elites Performance |
Authors | Eivind Samuelsen, Kyrre Glette |
Abstract | In certain complex optimization tasks, it becomes necessary to use multiple measures to characterize the performance of different algorithms. This paper presents a method that combines ordinal effect sizes with Pareto dominance to analyze such cases. Since the method is ordinal, it can also generalize across different optimization tasks even when the performance measurements are differently scaled. Through a case study, we show that this method can discover and quantify relations that would be difficult to deduce using a conventional measure-by-measure analysis. This case study applies the method to the evolution of robot controller repertoires using the MAP-Elites algorithm. Here, we analyze the search performance across a large set of parametrizations, varying mutation size and operator type, as well as map resolution, across four different robot morphologies. We show that the average magnitude of mutations has a bigger effect on outcomes than their precise distributions. |
Tasks | |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05174v2 |
http://arxiv.org/pdf/1803.05174v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-objective-analysis-of-map-elites |
Repo | |
Framework | |
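One plausible way to combine an ordinal effect size with Pareto dominance, as the abstract above describes in general terms, is sketched below. The abstract does not name the effect size or the dominance rule, so both choices here (Cliff's delta, and "no worse on every measure, strictly better on at least one") are assumptions, as are the function names and the dictionary layout of the inputs.

```python
import numpy as np

def cliffs_delta(a, b):
    """Ordinal effect size: P(a > b) - P(a < b) over all pairs of runs.
    It depends only on the ordering of values, not on their scale."""
    a = np.asarray(a, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    return float(np.mean(a > b) - np.mean(a < b))

def dominates(runs_a, runs_b):
    """runs_a and runs_b map a measure name to the values observed over
    repeated runs (higher is better). A dominates B if its effect size is
    non-negative on every measure and positive on at least one."""
    deltas = [cliffs_delta(runs_a[m], runs_b[m]) for m in runs_a]
    return all(d >= 0 for d in deltas) and any(d > 0 for d in deltas)
```

Since the comparison uses only orderings, the same rule can be applied to measures with very different units or ranges.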
Comprehending Real Numbers: Development of Bengali Real Number Speech Corpus
Title | Comprehending Real Numbers: Development of Bengali Real Number Speech Corpus |
Authors | Md Mahadi Hasan Nahid, Md. Ashraful Islam, Bishwajit Purkaystha, Md Saiful Islam |
Abstract | Speech recognition has received less attention in Bengali literature due to the lack of a comprehensive dataset. In this paper, we describe the development process of the first comprehensive Bengali speech dataset on real numbers. It covers all the possible words that may arise in uttering any Bengali real number. The corpus has ten native Bengali speakers from different regions. It comprises more than two thousand speech samples with a total duration of close to four hours. We also provide a deep analysis of our corpus, highlight some of its notable features, and finally evaluate the performance of two notable Bengali speech recognizers on it. |
Tasks | Speech Recognition |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10136v1 |
http://arxiv.org/pdf/1803.10136v1.pdf | |
PWC | https://paperswithcode.com/paper/comprehending-real-numbers-development-of |
Repo | |
Framework | |
Optimized Hidden Markov Model based on Constrained Particle Swarm Optimization
Title | Optimized Hidden Markov Model based on Constrained Particle Swarm Optimization |
Authors | L. Chang, Yacine Ouzrout, Antoine Nongaillard, Abdelaziz Bouras |
Abstract | As a Bayesian analysis tool, the Hidden Markov Model (HMM) has been used in a wide range of applications. Most HMMs are solved with the Baum-Welch algorithm (BWHMM) to estimate the model parameters, which has difficulty finding globally optimal solutions. This paper proposes an optimized Hidden Markov Model based on the Particle Swarm Optimization (PSO) algorithm, called PSOHMM. To handle the statistical constraints of the HMM, the paper develops re-normalization and re-mapping mechanisms that ensure the constraints are satisfied. The experiments show that PSOHMM can find better solutions than BWHMM and converges faster. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.03450v1 |
http://arxiv.org/pdf/1811.03450v1.pdf | |
PWC | https://paperswithcode.com/paper/optimized-hidden-markov-model-based-on |
Repo | |
Framework | |
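The constraint in question is that every row of an HMM transition or emission matrix must be a probability distribution, which an unconstrained PSO position update does not preserve. A minimal sketch of a re-normalization step follows. The clipping of negative entries and the uniform fallback for all-zero rows are assumptions standing in for the paper's re-mapping mechanism, which the abstract does not specify; the function name is hypothetical.

```python
import numpy as np

def renormalize_rows(M, eps=1e-12):
    """Map an arbitrary real matrix to a row-stochastic one: clip negative
    entries to zero, then rescale each row to sum to 1. Rows that are all
    zero after clipping fall back to a uniform distribution."""
    M = np.clip(np.asarray(M, dtype=float), 0.0, None)
    sums = M.sum(axis=1, keepdims=True)
    uniform = np.full_like(M, 1.0 / M.shape[1])
    return np.where(sums > eps, M / np.maximum(sums, eps), uniform)
```

Applying such a step after each particle update keeps every candidate solution a valid HMM parameter set, so its likelihood can be evaluated directly.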
Generalized Coarse-to-Fine Visual Recognition with Progressive Training
Title | Generalized Coarse-to-Fine Visual Recognition with Progressive Training |
Authors | Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille |
Abstract | Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn. Coarse-to-fine (C2F) learning is a promising direction, but it remains unclear how it can be applied to a wide range of vision problems. This paper presents a generalized C2F framework by making two technical contributions. First, we provide a unified way of C2F propagation, in which the coarse prediction (a class vector, a detected box, a segmentation mask, etc.) is encoded into a dense (pixel-level) matrix and concatenated to the original input, so that the fine model has the same design as the coarse model but sees additional information. Second, we present a progressive training strategy which starts with feeding the ground-truth instead of the coarse output into the fine model, and gradually increases the fraction of coarse output, so that at the end of training the fine model is ready for testing. We also relate our approach to curriculum learning by showing that data difficulty keeps increasing during the training process. We apply our framework to three vision tasks including image classification, object localization and semantic segmentation, and demonstrate consistent accuracy gain compared to the baseline training strategy. |
Tasks | Image Classification, Object Localization, Semantic Segmentation |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12047v2 |
http://arxiv.org/pdf/1811.12047v2.pdf | |
PWC | https://paperswithcode.com/paper/progressive-recurrent-learning-for-visual |
Repo | |
Framework | |
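The progressive training strategy in the abstract above can be pictured as a schedule that decides, per training sample, whether the fine model receives the ground truth or the coarse model's output. The sketch below assumes a linear ramp and per-sample random selection; the abstract only says the fraction of coarse output is increased gradually, so the schedule shape and both function names are hypothetical.

```python
import random

def coarse_fraction(step, total_steps):
    """Fraction of samples that use the coarse output at a given step:
    0 at the start of training, 1 at the end (linear ramp, assumed)."""
    return min(1.0, max(0.0, step / total_steps))

def pick_fine_input(ground_truth, coarse_output, step, total_steps, rng=random):
    """Choose the extra input fed to the fine model for one sample."""
    use_coarse = rng.random() < coarse_fraction(step, total_steps)
    return coarse_output if use_coarse else ground_truth
```

At step 0 the fine model always sees the ground truth, and by the final step it always sees the coarse output, which matches the condition it faces at test time.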
Capturing human category representations by sampling in deep feature spaces
Title | Capturing human category representations by sampling in deep feature spaces |
Authors | Joshua C. Peterson, Jordan W. Suchow, Krisha Aghi, Alexander Y. Ku, Thomas L. Griffiths |
Abstract | Understanding how people represent categories is a core problem in cognitive science. Decades of research have yielded a variety of formal theories of categories, but validating them with naturalistic stimuli is difficult. The challenge is that human category representations cannot be directly observed and running informative experiments with naturalistic stimuli such as images requires a workable representation of these stimuli. Deep neural networks have recently been successful in solving a range of computer vision tasks and provide a way to compactly represent image features. Here, we introduce a method to estimate the structure of human categories that combines ideas from cognitive science and machine learning, blending human-based algorithms with state-of-the-art deep image generators. We provide qualitative and quantitative results as a proof-of-concept for the method’s feasibility. Samples drawn from human distributions rival those from state-of-the-art generative models in quality and outperform alternative methods for estimating the structure of human categories. |
Tasks | |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07644v1 |
http://arxiv.org/pdf/1805.07644v1.pdf | |
PWC | https://paperswithcode.com/paper/capturing-human-category-representations-by |
Repo | |
Framework | |
Instance Segmentation and Tracking with Cosine Embeddings and Recurrent Hourglass Networks
Title | Instance Segmentation and Tracking with Cosine Embeddings and Recurrent Hourglass Networks |
Authors | Christian Payer, Darko Štern, Thomas Neff, Horst Bischof, Martin Urschler |
Abstract | In contrast to semantic segmentation, instance segmentation assigns unique labels to each individual instance of the same class. In this work, we propose a novel recurrent fully convolutional network architecture for tracking such instance segmentations over time. The network architecture incorporates convolutional gated recurrent units (ConvGRU) into a stacked hourglass network to utilize temporal video information. Furthermore, we train the network with a novel embedding loss based on cosine similarities, such that the network predicts unique embeddings for every instance throughout videos. Afterwards, these embeddings are clustered among subsequent video frames to create the final tracked instance segmentations. We evaluate the recurrent hourglass network by segmenting left ventricles in MR videos of the heart, where it outperforms a network that does not incorporate video information. Furthermore, we show applicability of the cosine embedding loss for segmenting leaf instances on still images of plants. Finally, we evaluate the framework for instance segmentation and tracking on six datasets of the ISBI cell tracking challenge, where it shows state-of-the-art performance. |
Tasks | Instance Segmentation, Semantic Segmentation |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02070v3 |
http://arxiv.org/pdf/1806.02070v3.pdf | |
PWC | https://paperswithcode.com/paper/instance-segmentation-and-tracking-with |
Repo | |
Framework | |
Cluster Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters
Title | Cluster Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters |
Authors | Qidi Peng, Nan Rao, Ran Zhao |
Abstract | We conduct cluster analysis on a class of locally asymptotically self-similar stochastic processes, which includes multifractional Brownian motion as a representative. When the true number of clusters is supposed to be known, a new covariance-based dissimilarity measure is introduced, from which we obtain approximately asymptotically consistent clustering algorithms. In simulation studies, clustering data sampled from multifractional Brownian motions with distinct functional Hurst parameters illustrates the approximate asymptotic consistency of the proposed algorithms. Clustering global financial markets’ equity index returns and sovereign CDS spreads provides a successful real-world application. |
Tasks | |
Published | 2018-04-13 |
URL | https://arxiv.org/abs/1804.06234v6 |
https://arxiv.org/pdf/1804.06234v6.pdf | |
PWC | https://paperswithcode.com/paper/clustering-analysis-on-locally-asymptotically |
Repo | |
Framework | |
Explaining Neural Networks Semantically and Quantitatively
Title | Explaining Neural Networks Semantically and Quantitatively |
Authors | Runjin Chen, Hao Chen, Ge Huang, Jie Ren, Quanshi Zhang |
Abstract | This paper presents a method to explain the knowledge encoded in a convolutional neural network (CNN) quantitatively and semantically. The analysis of the specific rationale of each prediction made by the CNN is a key issue in understanding neural networks, and it is also of significant practical value in certain applications. In this study, we propose to distill knowledge from the CNN into an explainable additive model, so that we can use the explainable model to provide a quantitative explanation for the CNN prediction. We analyze the typical bias-interpreting problem of the explainable model and develop prior losses to guide the learning of the explainable additive model. Experimental results have demonstrated the effectiveness of our method. |
Tasks | |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07169v1 |
http://arxiv.org/pdf/1812.07169v1.pdf | |
PWC | https://paperswithcode.com/paper/explaining-neural-networks-semantically-and |
Repo | |
Framework | |
Machine-learning inference of fluid variables from data using reservoir computing
Title | Machine-learning inference of fluid variables from data using reservoir computing |
Authors | Kengo Nakai, Yoshitaka Saiki |
Abstract | We infer both microscopic and macroscopic behaviors of a three-dimensional chaotic fluid flow using reservoir computing. In our inference procedure, we assume no prior knowledge of the physical process of the fluid flow except that its behavior is complex but deterministic. We present two ways of inferring the complex behavior: the first, called partial-inference, requires continued knowledge of partial time-series data during the inference as well as past time-series data, while the second, called full-inference, requires only past time-series data as training data. For the first case, we are able to infer long-time motion of microscopic fluid variables. For the second case, we show that the reservoir dynamics constructed from only past data of energy functions can infer the future behavior of energy functions and reproduce the energy spectrum. It is also shown that we can infer time-series data from only one measurement by using delay coordinates. These results imply that the two reservoir systems obtained without knowledge of microscopic data are equivalent to the dynamical systems describing the macroscopic behavior of energy functions. |
Tasks | Time Series |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09917v3 |
http://arxiv.org/pdf/1805.09917v3.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-inference-of-fluid-variables |
Repo | |
Framework | |
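The delay-coordinate idea mentioned in the abstract above, building a multi-dimensional input from a single measured variable, can be sketched in a few lines. The function name and the embedding dimension `dim` and lag `tau` are illustrative parameters, not values from the paper, and the reservoir itself is not shown.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Turn a scalar time series into delay-coordinate vectors.
    Row t is [x[t], x[t + tau], ..., x[t + (dim - 1) * tau]]."""
    x = np.asarray(series, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    # Each column is the series shifted by a multiple of the lag.
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)
```

For example, `delay_embed(range(6), dim=3, tau=2)` yields the two rows `[0, 2, 4]` and `[1, 3, 5]`; each row could then serve as one input vector to a reservoir.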
Bitstream-Based JPEG Image Encryption with File-Size Preserving
Title | Bitstream-Based JPEG Image Encryption with File-Size Preserving |
Authors | Hiroyuki Kobayashi, Hitoshi Kiya |
Abstract | An encryption scheme for JPEG images in the bitstream domain is proposed. The proposed scheme preserves the JPEG format even after encrypting the images, and the file size of the encrypted images is exactly the same as that of the original JPEG images. Several methods for encrypting JPEG images in the bitstream domain have been proposed. However, since some marker codes are generated or lost in the encryption process, the file size of JPEG bitstreams is generally changed by the encryption operations. The proposed method takes JPEG bitstreams as input and selectively encrypts the additional bit components of the Huffman code in the bitstreams. This feature allows us to have encrypted images with the same data size as that recoded in the image transmission process, when JPEG images are replaced with the encrypted ones by the hooking, so that the image transmission is successfully carried out after the hooking. |
Tasks | |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.06495v1 |
http://arxiv.org/pdf/1808.06495v1.pdf | |
PWC | https://paperswithcode.com/paper/bitstream-based-jpeg-image-encryption-with |
Repo | |
Framework | |