January 31, 2020

Paper Group ANR 134

Catching the Phish: Detecting Phishing Attacks using Recurrent Neural Networks (RNNs)


Title	Catching the Phish: Detecting Phishing Attacks using Recurrent Neural Networks (RNNs)
Authors	Lukas Halgas, Ioannis Agrafiotis, Jason R. C. Nurse
Abstract	The emergence of online services in our daily lives has been accompanied by a range of malicious attempts to trick individuals into performing undesired actions, often to the benefit of the adversary. The most popular medium of these attempts is phishing attacks, particularly through emails and websites. In order to defend against such attacks, there is an urgent need for automated mechanisms to identify this malevolent content before it reaches users. Machine learning techniques have gradually become the standard for such classification problems. However, identifying common measurable features of phishing content (e.g., in emails) is notoriously difficult. To address this problem, we engage in a novel study into a phishing content classifier based on a recurrent neural network (RNN), which identifies such features without human input. At this stage, we scope our research to emails, but our approach can be extended to apply to websites. Our results show that the proposed system outperforms state-of-the-art tools. Furthermore, our classifier is efficient and takes into account only the text and, in particular, the textual structure of the email. Since these features are rarely considered in email classification, we argue that our classifier can complement existing classifiers with high information gain.
Tasks
Published	2019-08-09
URL	https://arxiv.org/abs/1908.03640v1
PDF	https://arxiv.org/pdf/1908.03640v1.pdf
PWC	https://paperswithcode.com/paper/catching-the-phish-detecting-phishing-attacks
Repo
Framework

A Promotion Method for Generation Error Based Video Anomaly Detection


Title	A Promotion Method for Generation Error Based Video Anomaly Detection
Authors	Zhiguo Wang, Zhongliang Yang, Yu-Jin Zhang
Abstract	Surveillance video anomaly detection aims to detect events that rarely or never happen in a given scene. Generation error (GE)-based methods perform very well on this task. They first train a generative neural network (GNN) to generate normal samples, then judge samples with large GEs to be anomalies. Almost all GE-based methods use frame-level GEs to detect anomalies. However, anomalies generally occur in local areas, so the frame-level GE mixes the GEs of normal areas into the anomaly decision, which causes two problems: i) the GE of normal areas reduces the anomaly saliency of the anomalous frame; ii) different videos have different normal GE levels, so it is hard to set a uniform threshold for all videos. To address these problems, we propose a promotion method: use the maximum of the block-level GEs on the frame to detect anomalies. First, we calculate the block-level GEs at each position on the frame. Then, we use the maximum of these block-level GEs to detect anomalies. Experiments based on existing GNN models are carried out on multiple datasets. The results demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance.
Tasks	Anomaly Detection
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08402v4
PDF	https://arxiv.org/pdf/1911.08402v4.pdf
PWC	https://paperswithcode.com/paper/a-boost-strategy-to-the-generative-error
Repo
Framework
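
The block-level idea in the abstract above can be illustrated with a small sketch. It assumes a per-pixel generation error map is already available and uses the mean error inside a square block as the block-level GE; the function names, block size, stride, and toy error map are all hypothetical and are not taken from the paper.

```python
def frame_level_score(error_map):
    """Mean per-pixel generation error over the whole frame."""
    total = sum(sum(row) for row in error_map)
    return total / (len(error_map) * len(error_map[0]))


def max_block_score(error_map, block=2, stride=1):
    """Maximum of the mean block-level errors over all block positions."""
    height, width = len(error_map), len(error_map[0])
    best = float("-inf")
    for top in range(0, height - block + 1, stride):
        for left in range(0, width - block + 1, stride):
            block_sum = sum(
                error_map[r][c]
                for r in range(top, top + block)
                for c in range(left, left + block)
            )
            best = max(best, block_sum / (block * block))
    return best


# Hypothetical 4x4 error map: low error everywhere except one 2x2 region.
error_map = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]

print(frame_level_score(error_map))  # diluted by the normal area
print(max_block_score(error_map))    # dominated by the anomalous block
```

In this toy case the frame-level mean is pulled down by the twelve normal pixels, while the block maximum reflects only the anomalous region, which is the saliency problem the abstract describes.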

Context-Aware Saliency Detection for Image Retargeting Using Convolutional Neural Networks


Title	Context-Aware Saliency Detection for Image Retargeting Using Convolutional Neural Networks
Authors	Mahdi Ahmadi, Nader Karimi, Shadrokh Samavi
Abstract	Image retargeting is the task of adapting images for display on screens of different sizes. It should be done so that high-level visual information and low-level features such as texture remain as intact as possible to the human visual system, even though the output image may have different dimensions. Thus, simple methods such as scaling and cropping are not adequate for this purpose. In recent years, researchers have tried to improve the existing retargeting methods and introduce new ones. However, no single method can be used to retarget all types of images. In other words, different images require different retargeting methods. Image retargeting is closely related to image saliency detection, which is a relatively new image processing task. Earlier saliency detection methods were based on local and global but low-level image information. These methods are called bottom-up methods. Newer approaches, on the other hand, are top-down and mixed methods that also consider the high-level and semantic information of the image. In this paper, we propose methods for both saliency detection and retargeting. For saliency detection, the use of image context and semantic segmentation is examined, and a novel mixed bottom-up and top-down saliency detection method is introduced. After saliency detection, a modified version of an existing retargeting method is used to retarget the images. The results suggest that the proposed image retargeting pipeline performs very well compared to the other tested methods. Also, the subjective evaluations on the Pascal dataset can be used as a retargeting quality assessment dataset for further research.
Tasks	Saliency Detection, Semantic Segmentation
Published	2019-10-17
URL	https://arxiv.org/abs/1910.08071v1
PDF	https://arxiv.org/pdf/1910.08071v1.pdf
PWC	https://paperswithcode.com/paper/context-aware-saliency-detection-for-image
Repo
Framework

Sentiment Analysis Challenges in Persian Language


Title	Sentiment Analysis Challenges in Persian Language
Authors	Mohammad Heydari
Abstract	The rapid growth of data on the internet calls for data mining processes that turn it into insight for decision support. The Persian language has strong potential for in-depth research in every aspect of natural language processing, especially sentiment analysis. Thousands of websites and blogs, containing millions of pieces of Persian content, are updated and modified by Persian users around the world. This range of applications requires a comprehensive, structured framework for extracting useful information that helps enterprises improve their business and initiate customer-centric management processes by building effective recommender systems. Sentiment analysis is an intelligent approach to extracting useful information from huge amounts of data to support smart enterprise management. Machine learning and deep learning techniques are very helpful here, but they face a number of challenges. This paper presents the most important challenges of sentiment analysis in the Persian language. Persian is an Indo-European language spoken by over 110 million people around the world and is an official language in Iran, Tajikistan, and Afghanistan. It is also widely used in Uzbekistan, Pakistan, and Turkey, in that order.
Tasks	Recommendation Systems, Sentiment Analysis
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04407v2
PDF	https://arxiv.org/pdf/1907.04407v2.pdf
PWC	https://paperswithcode.com/paper/sentiment-analysis-challenges-in-persian
Repo
Framework

Rectified Decision Trees: Towards Interpretability, Compression and Empirical Soundness


Title	Rectified Decision Trees: Towards Interpretability, Compression and Empirical Soundness
Authors	Jiawang Bai, Yiming Li, Jiawei Li, Yong Jiang, Shutao Xia
Abstract	How to obtain a model with both good interpretability and good performance has always been an important research topic. In this paper, we propose rectified decision trees (ReDT), a knowledge-distillation-based rectification of decision trees with high interpretability, small model size, and empirical soundness. Specifically, we extend the impurity calculation and the pure ending condition of the classical decision tree to obtain a decision tree extension that allows the use of soft labels generated by a well-trained teacher model in the training and prediction processes. It is worth noting that, for the acquisition of soft labels, we propose a new method based on multiple cross-validation to reduce the effects of randomness and overfitting. These approaches ensure that ReDT retains excellent interpretability and, in terms of compression, even uses fewer nodes than the decision tree while achieving relatively good performance. Besides, in contrast to traditional knowledge distillation, back propagation of the student model is not necessarily required in ReDT, which makes it an attempt at a new knowledge distillation approach. Extensive experiments are conducted, which demonstrate the superiority of ReDT in interpretability, compression, and empirical soundness.
Tasks
Published	2019-03-14
URL	http://arxiv.org/abs/1903.05965v1
PDF	http://arxiv.org/pdf/1903.05965v1.pdf
PWC	https://paperswithcode.com/paper/rectified-decision-trees-towards
Repo
Framework
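
As a rough illustration of what an impurity calculation over soft labels could look like, the sketch below computes a Gini impurity from the average of the teacher's class-probability vectors at a node. This is only one plausible reading of the abstract above, not the paper's definition; the function name `soft_gini` and the sample values are hypothetical.

```python
def soft_gini(soft_labels):
    """Gini impurity of a node whose samples carry soft (probabilistic) labels.

    Each row is a class-probability vector, e.g. produced by a teacher model.
    Assumption: class proportions are taken as the mean of these vectors.
    """
    n = len(soft_labels)
    num_classes = len(soft_labels[0])
    mean = [sum(row[c] for row in soft_labels) / n for c in range(num_classes)]
    return 1.0 - sum(p * p for p in mean)


# With one-hot labels this reduces to the classical Gini impurity.
print(soft_gini([[1.0, 0.0], [1.0, 0.0]]))  # pure node
print(soft_gini([[1.0, 0.0], [0.0, 1.0]]))  # evenly mixed node

# Soft labels from a hypothetical teacher: confident but not certain.
print(soft_gini([[0.9, 0.1], [0.8, 0.2]]))
```

Under this reading, a node that looks pure with hard labels can still have non-zero impurity when the teacher is uncertain, which is why a modified "pure ending condition" would be needed.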

Dynamic Time Warp Convolutional Networks


Title	Dynamic Time Warp Convolutional Networks
Authors	Yaniv Shulman
Abstract	When dealing with temporal sequences, it is fair to assume that the same kind of deformations that motivated the development of the Dynamic Time Warp algorithm could also be relevant in the calculation of the dot product (“convolution”) in a 1-D convolution layer. In this work, a method is proposed for aligning the convolution filter and the input where they are locally out of phase, utilising an algorithm similar to the Dynamic Time Warp. The proposed method enables embedding a non-parametric warping of temporal sequences for increasing similarity directly in deep networks, and can expand on the generalisation capabilities and the capacity of the standard 1-D convolution layer where local sequential deformations are present in the input. Experimental results demonstrate that the proposed method exceeds or matches the standard 1-D convolution layer in terms of the maximum accuracy achieved on a number of time series classification tasks. In addition, the impact of different hyperparameter settings is investigated on different datasets, and the results support the conclusions of previous work on the choice of DTW parameter values. The proposed layer can be freely integrated with other typical layers to compose deep artificial neural networks of an arbitrary architecture that are trained using standard stochastic gradient descent.
Tasks	Time Series, Time Series Classification
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01944v1
PDF	https://arxiv.org/pdf/1911.01944v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-time-warp-convolutional-networks
Repo
Framework
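
For readers unfamiliar with the algorithm the abstract above builds on, here is a minimal sketch of the classical Dynamic Time Warp distance between two 1-D sequences. It is the textbook dynamic program with an absolute-difference local cost, not the convolution layer proposed in the paper; the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Classical DTW distance between two 1-D sequences of numbers."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched to an already used b element
                cost[i][j - 1],      # b[j-1] matched to an already used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# A locally stretched copy of a sequence aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
# Sequences that differ in value cannot be warped into agreement.
print(dtw_distance([0, 0], [1, 1]))
```

The first call shows why warping helps with local deformations: a plain element-wise comparison would not even be defined for sequences of different lengths, while DTW finds the alignment that makes them match.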

Prediction of Reaction Time and Vigilance Variability from Spatiospectral Features of Resting-State EEG in a Long Sustained Attention Task


Title	Prediction of Reaction Time and Vigilance Variability from Spatiospectral Features of Resting-State EEG in a Long Sustained Attention Task
Authors	Mastaneh Torkamani-Azar, Sumeyra Demir Kanik, Serap Aydin, Mujdat Cetin
Abstract	Resting-state brain networks represent the intrinsic state of the brain during the majority of cognitive and sensorimotor tasks. However, no study has yet presented concise predictors of task-induced vigilance variability from spectrospatial features of the pre-task, resting-state electroencephalograms (EEG). We asked ten healthy volunteers (6 females, 4 males) to participate in 105-minute fixed-sequence-varying-duration sessions of sustained attention to response task (SART). A novel and adaptive vigilance scoring scheme was designed based on the performance and response time in consecutive trials, and demonstrated large inter-participant variability in terms of maintaining consistent tonic performance. Multiple linear regression using feature relevance analysis obtained significant predictors of the mean cumulative vigilance score (CVS), mean response time, and variabilities of these scores from the resting-state, band-power ratios of EEG signals, p<0.05. Single-layer neural networks trained with cross-validation also captured different associations for the beta sub-bands. Increase in the gamma (28-48 Hz) and upper beta ratios from the left central and temporal regions predicted slower reactions and more inconsistent vigilance as explained by the increased activation of default mode network (DMN) and differences between the high- and low-attention networks at temporal regions. Higher ratios of parietal alpha from the Brodmann’s areas 18, 19, and 37 during the eyes-open states predicted slower responses but more consistent CVS and reactions associated with the superior ability in vigilance maintenance. The proposed framework and these findings on the most stable and significant attention predictors from the intrinsic EEG power ratios can be used to model attention variations during the calibration sessions of BCI applications and vigilance monitoring systems.
Tasks	Calibration, EEG
Published	2019-10-21
URL	https://arxiv.org/abs/1910.10076v1
PDF	https://arxiv.org/pdf/1910.10076v1.pdf
PWC	https://paperswithcode.com/paper/prediction-of-reaction-time-and-vigilance
Repo
Framework

Online Hyper-parameter Learning for Auto-Augmentation Strategy


Title	Online Hyper-parameter Learning for Auto-Augmentation Strategy
Authors	Chen Lin, Minghao Guo, Chuming Li, Yuan Xin, Wei Wu, Dahua Lin, Wanli Ouyang, Junjie Yan
Abstract	Data augmentation is critical to the success of modern deep learning techniques. In this paper, we propose Online Hyper-parameter Learning for Auto-Augmentation (OHL-Auto-Aug), an economical solution that learns the augmentation policy distribution along with network training. Unlike previous auto-augmentation methods, which search for augmentation strategies offline, our method formulates the augmentation policy as a parameterized probability distribution, thus allowing its parameters to be optimized jointly with the network parameters. Our proposed OHL-Auto-Aug eliminates the need for re-training and dramatically reduces the cost of the overall search process, while establishing significant accuracy improvements over baseline models. On both CIFAR-10 and ImageNet, our method searches remarkably faster, 60x on CIFAR-10 and 24x on ImageNet, while maintaining competitive accuracies.
Tasks	Data Augmentation
Published	2019-05-17
URL	https://arxiv.org/abs/1905.07373v2
PDF	https://arxiv.org/pdf/1905.07373v2.pdf
PWC	https://paperswithcode.com/paper/online-hyper-parameter-learning-for-auto
Repo
Framework

Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues


Title	Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues
Authors	Sungjoon Park, Donghyun Kim, Alice Oh
Abstract	The recent surge of text-based online counseling applications enables us to collect and analyze interactions between counselors and clients. A dataset of those interactions can be used to learn to automatically classify the client utterances into categories that help counselors in diagnosing client status and predicting counseling outcome. With proper anonymization, we collect counselor-client dialogues, define meaningful categories of client utterances with professional counselors, and develop a novel neural network model for classifying the client utterances. The central idea of our model, ConvMFiT, is a pre-trained conversation model which consists of a general language model built from an out-of-domain corpus and two role-specific language models built from unlabeled in-domain dialogues. The classification result shows that ConvMFiT outperforms state-of-the-art comparison models. Further, the attention weights in the learned model confirm that the model finds expected linguistic patterns for each category.
Tasks	Language Modelling
Published	2019-03-31
URL	http://arxiv.org/abs/1904.00350v1
PDF	http://arxiv.org/pdf/1904.00350v1.pdf
PWC	https://paperswithcode.com/paper/conversation-model-fine-tuning-for
Repo
Framework

Correlation Congruence for Knowledge Distillation


Title	Correlation Congruence for Knowledge Distillation
Authors	Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang
Abstract	Most teacher-student frameworks based on knowledge distillation (KD) depend on a strong congruence constraint at the instance level. However, they usually ignore the correlation between multiple instances, which is also valuable for knowledge transfer. In this work, we propose a new framework named correlation congruence for knowledge distillation (CCKD), which transfers not only the instance-level information but also the correlation between instances. Furthermore, a generalized kernel method based on Taylor series expansion is proposed to better capture the correlation between instances. Empirical experiments and ablation studies on image classification tasks (including CIFAR-100 and ImageNet-1K) and metric learning tasks (including ReID and Face Recognition) show that the proposed CCKD substantially outperforms the original KD and achieves state-of-the-art accuracy compared with other SOTA KD-based methods. CCKD can be easily deployed in most teacher-student frameworks, such as KD and hint-based learning methods.
Tasks	Face Recognition, Image Classification, Metric Learning, Transfer Learning
Published	2019-04-03
URL	http://arxiv.org/abs/1904.01802v1
PDF	http://arxiv.org/pdf/1904.01802v1.pdf
PWC	https://paperswithcode.com/paper/correlation-congruence-for-knowledge
Repo
Framework
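
The notion of matching correlations between instances, rather than the instances themselves, can be sketched as follows. This simplified version measures correlation with plain dot products between the embeddings in a batch and penalises the mean squared difference between the teacher's and the student's correlation matrices; the Taylor-series kernel mentioned in the abstract above is not reproduced, and all names and values are hypothetical.

```python
def correlation_matrix(embeddings):
    """Pairwise dot products between all instance embeddings in a batch."""
    return [
        [sum(x * y for x, y in zip(u, v)) for v in embeddings]
        for u in embeddings
    ]


def correlation_congruence(teacher, student):
    """Mean squared difference between teacher and student correlation matrices."""
    n = len(teacher)
    corr_t = correlation_matrix(teacher)
    corr_s = correlation_matrix(student)
    return sum(
        (corr_t[i][j] - corr_s[i][j]) ** 2 for i in range(n) for j in range(n)
    ) / (n * n)


teacher = [[1.0, 0.0], [0.0, 1.0]]

# A rotated copy of the teacher embeddings: every instance differs,
# yet the relations between instances are preserved.
student_rotated = [[0.0, 1.0], [-1.0, 0.0]]
print(correlation_congruence(teacher, student_rotated))

# Collapsed embeddings: the two instances are no longer distinguishable.
student_collapsed = [[1.0, 0.0], [1.0, 0.0]]
print(correlation_congruence(teacher, student_collapsed))
```

The rotated student incurs no penalty even though no single embedding equals the teacher's, which shows that this term carries information an instance-level constraint alone does not.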

Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features


Title	Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features
Authors	Siddharth Gururani, Kilol Gupta, Dhaval Shah, Zahra Shakeri, Jervis Pinto
Abstract	This paper presents a simple yet effective method to achieve prosody transfer from a reference speech signal to synthesized speech. The main idea is to incorporate well-known acoustic correlates of prosody such as pitch and loudness contours of the reference speech into a modern neural text-to-speech (TTS) synthesizer such as Tacotron2 (TC2). More specifically, a small set of acoustic features are extracted from the reference audio and then used to condition a TC2 synthesizer. The trained model is evaluated using subjective listening tests and novel objective evaluations of prosody transfer are proposed. Listening tests show that the synthesized speech is rated as highly natural and that prosody is successfully transferred from the reference speech signal to the synthesized signal.
Tasks
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09645v1
PDF	https://arxiv.org/pdf/1911.09645v1.pdf
PWC	https://paperswithcode.com/paper/prosody-transfer-in-neural-text-to-speech
Repo
Framework

Logarithmic Regret for parameter-free Online Logistic Regression


Title	Logarithmic Regret for parameter-free Online Logistic Regression
Authors	Joseph De Vilmarest, Olivier Wintenberger
Abstract	We consider online optimization procedures in the context of logistic regression, focusing on the Extended Kalman Filter (EKF). We introduce a second-order algorithm close to the EKF, named Semi-Online Step (SOS), for which we prove an O(log(n)) regret in the adversarial setting, paving the way to similar results for the EKF. This regret bound on SOS is the first for such a parameter-free algorithm in adversarial logistic regression. For the EKF in constant dynamics, we prove an O(log(n)) regret in expectation in the well-specified logistic regression model.
Tasks
Published	2019-02-26
URL	http://arxiv.org/abs/1902.09803v1
PDF	http://arxiv.org/pdf/1902.09803v1.pdf
PWC	https://paperswithcode.com/paper/logarithmic-regret-for-parameter-free-online
Repo
Framework

Audio-Conditioned U-Net for Position Estimation in Full Sheet Images


Title	Audio-Conditioned U-Net for Position Estimation in Full Sheet Images
Authors	Florian Henkel, Rainer Kelz, Gerhard Widmer
Abstract	The goal of score following is to track a musical performance, usually in the form of audio, in a corresponding score representation. Established methods mainly rely on computer-readable scores in the form of MIDI or MusicXML and achieve robust and reliable tracking results. Recently, multimodal deep learning methods have been used to follow musical performances in raw sheet images. One current limitation of these systems is that they require a non-trivial number of preprocessing steps that unravel the raw sheet image into a single long system of staves. The current work is an attempt at removing this particular limitation. We propose an architecture capable of estimating matching score positions directly within entire unprocessed sheet images. We argue that this is a necessary first step towards a fully integrated score following system that does not rely on any preprocessing steps such as optical music recognition.
Tasks
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07254v1
PDF	https://arxiv.org/pdf/1910.07254v1.pdf
PWC	https://paperswithcode.com/paper/audio-conditioned-u-net-for-position
Repo
Framework

Spectral Analysis of Latent Representations


Title	Spectral Analysis of Latent Representations
Authors	Justin Shenk, Mats L. Richter, Anders Arpteg, Mikael Huss
Abstract	We propose a metric, Layer Saturation, defined as the proportion of the number of eigenvalues needed to explain 99% of the variance of the latent representations, for analyzing the learned representations of neural network layers. Saturation is based on spectral analysis and can be computed efficiently, making live analysis of the representations practical during training. We provide an outlook for future applications of this metric by outlining the behaviour of layer saturation in different neural architectures and problems. We further show that saturation is related to the generalization and predictive performance of neural networks.
Tasks
Published	2019-07-19
URL	https://arxiv.org/abs/1907.08589v1
PDF	https://arxiv.org/pdf/1907.08589v1.pdf
PWC	https://paperswithcode.com/paper/spectral-analysis-of-latent-representations
Repo
Framework

Paying Attention to Function Words


Title	Paying Attention to Function Words
Authors	Shane Steinert-Threlkeld
Abstract	All natural languages exhibit a distinction between content words (like nouns and adjectives) and function words (like determiners, auxiliaries, prepositions). Yet surprisingly little has been said about the emergence of this universal architectural feature of natural languages. Why have human languages evolved to exhibit this division of labor between content and function words? How could such a distinction have emerged in the first place? This paper takes steps towards answering these questions by showing how the distinction can emerge through reinforcement learning in agents playing a signaling game across contexts which contain multiple objects that possess multiple perceptually salient gradable properties.
Tasks
Published	2019-09-24
URL	https://arxiv.org/abs/1909.11060v1
PDF	https://arxiv.org/pdf/1909.11060v1.pdf
PWC	https://paperswithcode.com/paper/paying-attention-to-function-words
Repo
Framework