Paper Group ANR 1063
Papers in this group:
Random CapsNet Forest Model for Imbalanced Malware Type Classification Task
Towards automatic construction of multi-network models for heterogeneous multi-task learning
Sensor Fusion using Backward Shortcut Connections for Sleep Apnea Detection in Multi-Modal Data
Reinforcement Learning with Dynamic Boltzmann Softmax Updates
A Geometric Approach to Online Streaming Feature Selection
Application Inference using Machine Learning based Side Channel Analysis
A Contextual-Bandit Approach to Online Learning to Rank for Relevance and Diversity
Spatially-weighted Anomaly Detection with Regression Model
Shallow Syntax in Deep Water
A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity
Exploring Uncertainty Measures for Image-Caption Embedding-and-Retrieval Task
$α^α$-Rank: Practically Scaling $α$-Rank through Stochastic Optimisation
Automatic Calibration of Artificial Neural Networks for Zebrafish Collective Behaviours using a Quality Diversity Algorithm
Driver Identification via the Steering Wheel
QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection
Random CapsNet Forest Model for Imbalanced Malware Type Classification Task
Title | Random CapsNet Forest Model for Imbalanced Malware Type Classification Task |
Authors | Aykut Çayır, Uğur Ünal, Hasan Dağ |
Abstract | The behavior of malware varies with its type; therefore, knowing the type of a malware informs the protection strategy of security software. Many malware type classification models empowered by machine and deep learning achieve superior accuracy in predicting malware types. Machine learning based models require heavy feature engineering, and feature engineering dominantly affects model performance. Deep learning based models, on the other hand, require less feature engineering than machine learning based models, but traditional deep learning architectures and components lead to very complex and data-sensitive models. The capsule network architecture minimizes this complexity and data sensitivity, unlike classical convolutional neural network architectures. This paper proposes an ensemble capsule network model based on the bootstrap aggregating (bagging) technique. The proposed method is tested on two malware datasets whose state-of-the-art results are well known. |
Tasks | Feature Engineering |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.10836v3 |
https://arxiv.org/pdf/1912.10836v3.pdf | |
PWC | https://paperswithcode.com/paper/random-capsnet-forest-model-for-imbalanced |
Repo | |
Framework | |
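A minimal sketch of the bootstrap-aggregating step behind the Random CapsNet Forest idea. The paper's base learner is a capsule network; here a scikit-learn classifier stands in for it purely to illustrate the ensembling mechanics (an assumption, not the authors' code), and all helper names are hypothetical.

```python
# Bagging sketch: train several base models on bootstrap resamples, vote at test time.
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # stand-in for a CapsNet base learner

def train_bagged_ensemble(X, y, n_members=10, make_base=DecisionTreeClassifier, rng=None):
    """Train n_members base models, each on a bootstrap resample of (X, y)."""
    rng = np.random.default_rng(rng)
    members, n = [], len(X)
    for _ in range(n_members):
        idx = rng.integers(0, n, size=n)          # sample with replacement
        model = make_base()
        model.fit(X[idx], y[idx])
        members.append(model)
    return members

def predict_majority(members, X):
    """Combine member predictions by majority vote."""
    votes = np.stack([m.predict(X) for m in members])        # (n_members, n_samples)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

# Toy usage with random data; real inputs would be malware feature tensors.
X = np.random.rand(200, 64)
y = np.random.randint(0, 5, size=200)
ensemble = train_bagged_ensemble(X, y, n_members=7, rng=0)
print(predict_majority(ensemble, X[:3]))
```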
Towards automatic construction of multi-network models for heterogeneous multi-task learning
Title | Towards automatic construction of multi-network models for heterogeneous multi-task learning |
Authors | Unai Garciarena, Alexander Mendiburu, Roberto Santana |
Abstract | Multi-task learning, as it is understood nowadays, consists of using one single model to carry out several similar tasks. From classifying hand-written characters of different alphabets to figuring out how to play several Atari games using reinforcement learning, multi-task models have been able to widen their performance range across different tasks, although these tasks are usually of a similar nature. In this work, we attempt to widen this range even further, by including heterogeneous tasks in a single learning procedure. To do so, we firstly formally define a multi-network model, identifying the necessary components and characteristics to allow different adaptations of said model depending on the tasks it is required to fulfill. Secondly, employing the formal definition as a starting point, we develop an illustrative model example consisting of three different tasks (classification, regression and data sampling). The performance of this model implementation is then analyzed, showing its capabilities. Motivated by the results of the analysis, we enumerate a set of open challenges and future research lines over which the full potential of the proposed model definition can be exploited. |
Tasks | Atari Games, Multi-Task Learning |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09171v1 |
http://arxiv.org/pdf/1903.09171v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-automatic-construction-of-multi |
Repo | |
Framework | |
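To make the multi-network idea concrete, here is a minimal sketch of one shared trunk feeding heterogeneous task heads (classification plus regression). The paper's formal model also covers data sampling via a generative component, which is omitted here; the layer sizes, loss weighting, and data are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class MultiHeadNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, n_classes=10):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.cls_head = nn.Linear(hidden, n_classes)   # classification task
        self.reg_head = nn.Linear(hidden, 1)           # regression task

    def forward(self, x):
        h = self.trunk(x)
        return self.cls_head(h), self.reg_head(h).squeeze(-1)

net = MultiHeadNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.randn(16, 32)
y_cls = torch.randint(0, 10, (16,))
y_reg = torch.randn(16)

logits, pred = net(x)
# Joint loss: one term per task, summed (equal weighting is an assumption).
loss = nn.functional.cross_entropy(logits, y_cls) + nn.functional.mse_loss(pred, y_reg)
opt.zero_grad(); loss.backward(); opt.step()
```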
Sensor Fusion using Backward Shortcut Connections for Sleep Apnea Detection in Multi-Modal Data
Title | Sensor Fusion using Backward Shortcut Connections for Sleep Apnea Detection in Multi-Modal Data |
Authors | Tom Van Steenkiste, Dirk Deschrijver, Tom Dhaene |
Abstract | Sleep apnea is a common respiratory disorder characterized by breathing pauses during the night. Consequences of untreated sleep apnea can be severe. Still, many people remain undiagnosed due to shortages of hospital beds and trained sleep technicians. To assist in the diagnosis process, automated detection methods are being developed. Recent works have demonstrated that deep learning models can extract useful information from raw respiratory data and that such models can be used as a robust sleep apnea detector. However, trained sleep technicians take into account multiple sensor signals when annotating sleep recordings instead of relying on a single respiratory estimate. To improve the predictive performance and reliability of the models, early and late sensor fusion methods are explored in this work. In addition, a novel late sensor fusion method is proposed which uses backward shortcut connections to improve the learning of the first stages of the models. The performance of these fusion methods is analyzed using both CNN and LSTM deep learning base models. The results demonstrate that the proposed sensor fusion method with backward shortcut connections yields a significant and consistent improvement in predictive performance over the single-sensor methods and over the other explored sensor fusion methods. |
Tasks | Sensor Fusion, Sleep apnea detection |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06879v1 |
https://arxiv.org/pdf/1912.06879v1.pdf | |
PWC | https://paperswithcode.com/paper/sensor-fusion-using-backward-shortcut |
Repo | |
Framework | |
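For context on the fusion setup above, the sketch below shows a plain late-fusion baseline over two respiratory channels, each with its own small 1D-CNN encoder whose outputs are concatenated before the classifier. The paper's actual contribution, the backward shortcut connections that feed fused information back into the per-sensor stages, is not reproduced here; channel count, layer sizes, and window length are all assumptions.

```python
import torch
import torch.nn as nn

def make_encoder():
    return nn.Sequential(
        nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),     # -> (batch, 8)
    )

class LateFusionApneaNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc_a = make_encoder()   # e.g. abdominal respiration belt
        self.enc_b = make_encoder()   # e.g. thoracic respiration belt
        self.head = nn.Linear(16, 2)  # apnea vs. normal

    def forward(self, sig_a, sig_b):
        z = torch.cat([self.enc_a(sig_a), self.enc_b(sig_b)], dim=1)  # late fusion
        return self.head(z)

model = LateFusionApneaNet()
a = torch.randn(4, 1, 3000)   # 4 windows, 1 channel, 3000 samples each
b = torch.randn(4, 1, 3000)
print(model(a, b).shape)      # torch.Size([4, 2])
```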
Reinforcement Learning with Dynamic Boltzmann Softmax Updates
Title | Reinforcement Learning with Dynamic Boltzmann Softmax Updates |
Authors | Ling Pan, Qingpeng Cai, Qi Meng, Wei Chen, Longbo Huang, Tie-Yan Liu |
Abstract | Value function estimation is an important task in reinforcement learning, i.e., prediction. The Boltzmann softmax operator is a natural value estimator and can provide several benefits. However, it does not satisfy the non-expansion property, and its direct use may fail to converge even in value iteration. In this paper, we propose to update the value function with the dynamic Boltzmann softmax (DBS) operator, which has good convergence properties in the settings of planning and learning. Experimental results on GridWorld show that the DBS operator enables better estimation of the value function, which rectifies the convergence issue of the softmax operator. Finally, we propose the DBS-DQN algorithm by applying dynamic Boltzmann softmax updates in the deep Q-network, which outperforms DQN substantially in 40 out of 49 Atari games. |
Tasks | Atari Games, Q-Learning |
Published | 2019-03-14 |
URL | https://arxiv.org/abs/1903.05926v4 |
https://arxiv.org/pdf/1903.05926v4.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-with-dynamic-boltzmann |
Repo | |
Framework | |
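A small sketch of value iteration with a dynamic Boltzmann softmax backup on a random finite MDP. The Boltzmann softmax of a vector of action values is a weighted average, sum_a softmax(beta*q)_a * q_a; letting beta grow over iterations makes the operator approach the max. The schedule beta_t = t below is one simple increasing choice made for illustration (the paper analyzes specific schedules), and the random MDP stands in for GridWorld.

```python
import numpy as np

def boltzmann_softmax(q, beta):
    """sum_a softmax(beta * q)_a * q_a, computed stably."""
    z = beta * (q - q.max())
    w = np.exp(z)
    return float((w * q).sum() / w.sum())

def dbs_value_iteration(P, R, gamma=0.9, iters=200):
    """P: (S, A, S) transition probabilities, R: (S, A) rewards."""
    S, A, _ = P.shape
    V = np.zeros(S)
    for t in range(1, iters + 1):
        beta_t = float(t)                      # dynamic (increasing) temperature
        Q = R + gamma * P @ V                  # (S, A) one-step lookahead
        V = np.array([boltzmann_softmax(Q[s], beta_t) for s in range(S)])
    return V

rng = np.random.default_rng(0)
S, A = 5, 3
P = rng.random((S, A, S)); P /= P.sum(axis=2, keepdims=True)
R = rng.random((S, A))
print(dbs_value_iteration(P, R))
```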
A Geometric Approach to Online Streaming Feature Selection
Title | A Geometric Approach to Online Streaming Feature Selection |
Authors | Salimeh Yasaei Sekeh, Madan Ravi Ganesh, Shurjo Banerjee, Jason J. Corso, Alfred O. Hero |
Abstract | Online Streaming Feature Selection (OSFS) is a sequential learning problem where individual features across all samples are made available to algorithms in a streaming fashion. In this work, firstly, we assert that OSFS's main assumption of having data from all the samples available at runtime is unrealistic, and we introduce a new setting where features and samples are streamed concurrently, called OSFS with Streaming Samples (OSFS-SS). Secondly, the primary OSFS method, SAOLA, utilizes an unbounded mutual information measure and requires multiple comparison steps between the stored and incoming feature sets to evaluate a feature's importance. We introduce Geometric Online Adaptation (GOA), an algorithm that requires relatively fewer feature comparison steps and uses a bounded conditional geometric dependency measure. Our algorithm outperforms several OSFS baselines, including SAOLA, on a variety of datasets. We also extend SAOLA to work in the OSFS-SS setting and show that GOA continues to achieve the best results. Thirdly, the current paradigm of OSFS algorithm comparison is flawed. Algorithms are measured by comparing the number of features used and the accuracy obtained by the learner, two properties that are fundamentally at odds with one another. Without fixing a limit on either of these properties, the qualities of features obtained by different algorithms are incomparable. We try to rectify this inconsistency by fixing the maximum number of features available to the learner and comparing algorithms in terms of their accuracy. Additionally, we characterize the behaviour of SAOLA and GOA on feature sets derived from popular deep convolutional featurizers. |
Tasks | Feature Selection |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01182v2 |
https://arxiv.org/pdf/1910.01182v2.pdf | |
PWC | https://paperswithcode.com/paper/geometric-online-adaptation-graph-based-osfs |
Repo | |
Framework | |
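A generic online-streaming-feature-selection skeleton, to make the setting above concrete: features arrive one at a time and are kept only if they are relevant to the label and not redundant given what is already selected. The mutual-information relevance test and correlation-based redundancy check are stand-ins for GOA's bounded conditional geometric dependency measure (an assumption for illustration; thresholds are arbitrary).

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def osfs_stream(feature_stream, y, rel_thresh=0.05, red_thresh=0.95):
    selected = []                              # list of (index, column) pairs
    for j, f in feature_stream:
        f = f.reshape(-1, 1)
        relevance = mutual_info_classif(f, y, random_state=0)[0]
        if relevance < rel_thresh:
            continue                           # not informative enough, discard
        # crude redundancy check against already-selected features
        redundant = any(abs(np.corrcoef(f.ravel(), g.ravel())[0, 1]) > red_thresh
                        for _, g in selected)
        if not redundant:
            selected.append((j, f))
    return [j for j, _ in selected]

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 300)
X = rng.normal(size=(300, 10))
X[:, 3] = y + 0.1 * rng.normal(size=300)       # one genuinely predictive feature
stream = ((j, X[:, j]) for j in range(X.shape[1]))
print(osfs_stream(stream, y))                  # expected to keep feature 3
```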
Application Inference using Machine Learning based Side Channel Analysis
Title | Application Inference using Machine Learning based Side Channel Analysis |
Authors | Nikhil Chawla, Arvind Singh, Monodeep Kar, Saibal Mukhopadhyay |
Abstract | The proliferation of ubiquitous computing requires energy-efficient as well as secure operation of modern processors. Side channel attacks are becoming a critical threat to the security and privacy of devices embedded in modern computing infrastructures. Unintended information leakage via physical signatures such as power consumption, electromagnetic (EM) emission, and execution time has emerged as a key security consideration for SoCs. In addition, information published intentionally at the user privilege level and accessible through software interfaces enables software-only attacks. In this paper, we use a supervised learning based approach for inferring applications executing on the Android platform, based on features extracted from EM side-channel emissions and software-exposed dynamic voltage and frequency scaling (DVFS) states. We highlight the importance of a machine learning based approach in utilizing these multi-dimensional features on a complex SoC, compared with profiling-based approaches. We also show that learning the instantaneous frequency states polled from the onboard frequency driver (cpufreq) is adequate to identify a known application and flag a potentially malicious unknown application. The experimental results on benchmark applications running on an ARMv8 processor in a Snapdragon 820 board demonstrate early detection of these apps and at least 85% accuracy in detecting unknown applications. Overall, the highlight is a low-complexity path to application inference attacks through learning the pattern of instantaneous frequency states of the CPU core. |
Tasks | |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04428v1 |
https://arxiv.org/pdf/1907.04428v1.pdf | |
PWC | https://paperswithcode.com/paper/application-inference-using-machine-learning |
Repo | |
Framework | |
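A hedged sketch of the software-only side of the idea above: summarize a trace of CPU frequency states into simple features and classify which app produced it. The feature set, classifier, and simulated traces are assumptions; on a Linux/Android device the raw trace would come from polling the cpufreq driver (e.g. /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq), not from random data as below.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trace_features(freq_trace):
    """Simple summary statistics of an instantaneous-frequency trace (in kHz)."""
    diffs = np.diff(freq_trace)
    return [freq_trace.mean(), freq_trace.std(), freq_trace.max(),
            freq_trace.min(), (diffs != 0).mean()]   # fraction of state switches

# Simulated traces for three hypothetical apps with different frequency profiles.
rng = np.random.default_rng(0)
levels = np.array([300_000, 800_000, 1_600_000, 2_100_000])
X, y = [], []
for app_id, bias in enumerate([0.2, 0.5, 0.9]):
    for _ in range(100):
        p = np.array([1 - bias, 0.3, 0.3, bias]); p /= p.sum()
        trace = rng.choice(levels, size=200, p=p)
        X.append(trace_features(trace)); y.append(app_id)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.score(X, y))   # training accuracy on the toy data
```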
A Contextual-Bandit Approach to Online Learning to Rank for Relevance and Diversity
Title | A Contextual-Bandit Approach to Online Learning to Rank for Relevance and Diversity |
Authors | Chang Li, Haoyun Feng, Maarten de Rijke |
Abstract | Online learning to rank (LTR) focuses on learning a policy from user interactions that builds a list of items sorted in decreasing order of the item utility. It is a core area in modern interactive systems, such as search engines, recommender systems, or conversational assistants. Previous online LTR approaches either assume the relevance of an item in the list to be independent of other items in the list or the relevance of an item to be a submodular function of the utility of the list. The former type of approach may result in a list of low diversity that has relevant items covering the same aspects, while the latter approaches may lead to a highly diversified list but with some non-relevant items. In this paper, we study an online LTR problem that considers both item relevance and topical diversity. We assume cascading user behavior, where a user browses the displayed list of items from top to bottom and clicks the first attractive item and stops browsing the rest. We propose a hybrid contextual bandit approach, called CascadeHybrid, for solving this problem. CascadeHybrid models item relevance and topical diversity using two independent functions and simultaneously learns those functions from user click feedback. We derive a gap-free bound on the n-step regret of CascadeHybrid. We conduct experiments to evaluate CascadeHybrid on the MovieLens and Yahoo music datasets. Our experimental results show that CascadeHybrid outperforms the baselines on both datasets. |
Tasks | Learning-To-Rank, Recommendation Systems |
Published | 2019-12-01 |
URL | https://arxiv.org/abs/1912.00508v2 |
https://arxiv.org/pdf/1912.00508v2.pdf | |
PWC | https://paperswithcode.com/paper/a-contextual-bandit-approach-to-online |
Repo | |
Framework | |
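To illustrate the user model that the entry above assumes, here is a minimal simulation of cascading click feedback: the user scans the ranked list top-down, clicks the first attractive item, and stops. Attraction probabilities and the item list are made up for illustration; the hybrid relevance/diversity scoring of CascadeHybrid itself is not reproduced here.

```python
import numpy as np

def cascade_feedback(ranked_items, attract_prob, rng):
    """Return (clicked position or None, last examined position)."""
    for pos, item in enumerate(ranked_items):
        if rng.random() < attract_prob[item]:
            return pos, pos                      # click observed, browsing stops
    return None, len(ranked_items) - 1           # no click; the whole list was examined

rng = np.random.default_rng(0)
attract_prob = {"a": 0.8, "b": 0.3, "c": 0.1}
clicks = [cascade_feedback(["b", "a", "c"], attract_prob, rng)[0] for _ in range(1000)]
print(sum(c is not None for c in clicks) / 1000)  # empirical click-through rate
```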
Spatially-weighted Anomaly Detection with Regression Model
Title | Spatially-weighted Anomaly Detection with Regression Model |
Authors | Daiki Kimura, Minori Narita, Asim Munawar, Ryuki Tachibana |
Abstract | Visual anomaly detection is common in several applications, including medical screening and production quality checks. Although an anomaly is by definition an unknown trend in the data, in many cases some hints or samples of the anomaly class can be given in advance. Conventional methods cannot use such available anomaly data and are also not robust to noise. In this paper, we propose a novel spatially-weighted reconstruction-loss-based anomaly detection method that uses a likelihood value from a regression model trained on all known data. The spatial weights are calculated from a region of interest generated by visualizing the regression model. We introduce several ways to combine these components with various strategies and propose a state-of-the-art method. Comparing with other methods on three different datasets, we empirically verify that the proposed method performs better than the others. |
Tasks | Anomaly Detection |
Published | 2019-03-23 |
URL | http://arxiv.org/abs/1903.09798v2 |
http://arxiv.org/pdf/1903.09798v2.pdf | |
PWC | https://paperswithcode.com/paper/spatially-weighted-anomaly-detection-with |
Repo | |
Framework | |
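A sketch of the spatially weighted reconstruction score described above: per-pixel squared reconstruction error is multiplied by a spatial weight map before aggregation. Here the weight map is a made-up Gaussian "region of interest"; in the paper it is derived from visualizing a regression model, which is not reproduced here.

```python
import numpy as np

def weighted_recon_score(image, reconstruction, weights):
    err = (image - reconstruction) ** 2          # per-pixel reconstruction error
    return float((weights * err).sum() / weights.sum())

h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
roi = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / (2 * 10.0 ** 2))   # toy ROI map

rng = np.random.default_rng(0)
img = rng.random((h, w))
recon = img + 0.05 * rng.normal(size=(h, w))
recon[28:36, 28:36] += 0.5                        # simulated defect inside the ROI
print(weighted_recon_score(img, recon, roi),      # ROI-weighted score
      weighted_recon_score(img, recon, np.ones((h, w))))   # unweighted baseline
```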
Shallow Syntax in Deep Water
Title | Shallow Syntax in Deep Water |
Authors | Swabha Swayamdipta, Matthew Peters, Brendan Roof, Chris Dyer, Noah A. Smith |
Abstract | Shallow syntax provides an approximation of phrase-syntactic structure of sentences; it can be produced with high accuracy, and is computationally cheap to obtain. We investigate the role of shallow syntax-aware representations for NLP tasks using two techniques. First, we enhance the ELMo architecture to allow pretraining on predicted shallow syntactic parses, instead of just raw text, so that contextual embeddings make use of shallow syntactic context. Our second method involves shallow syntactic features obtained automatically on downstream task data. Neither approach leads to a significant gain on any of the four downstream tasks we considered relative to ELMo-only baselines. Further analysis using black-box probes confirms that our shallow-syntax-aware contextual embeddings do not transfer to linguistic tasks any more easily than ELMo’s embeddings. We take these findings as evidence that ELMo-style pretraining discovers representations which make additional awareness of shallow syntax redundant. |
Tasks | |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11047v1 |
https://arxiv.org/pdf/1908.11047v1.pdf | |
PWC | https://paperswithcode.com/paper/shallow-syntax-in-deep-water |
Repo | |
Framework | |
A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity
Title | A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity |
Authors | Yoshinari Fujinuma, Jordan Boyd-Graber, Michael J. Paul |
Abstract | Cross-lingual word embeddings encode the meaning of words from different languages into a shared low-dimensional space. An important requirement for many downstream tasks is that word similarity should be independent of language - i.e., word vectors within one language should not be more similar to each other than to words in another language. We measure this characteristic using modularity, a network measure of the strength of clusters in a graph. Modularity has a moderate to strong correlation with three downstream tasks, even though modularity is based only on the structure of embeddings and does not require any external resources. We show through experiments that modularity can serve as an intrinsic validation metric to improve unsupervised cross-lingual word embeddings, particularly on distant language pairs in low-resource settings. |
Tasks | Word Embeddings |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01926v1 |
https://arxiv.org/pdf/1906.01926v1.pdf | |
PWC | https://paperswithcode.com/paper/a-resource-free-evaluation-metric-for-cross |
Repo | |
Framework | |
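A sketch of the modularity diagnostic above: build a k-nearest-neighbour graph over the combined cross-lingual embeddings and compute the modularity of the partition by language. High modularity means vectors cluster by language, which is the undesirable case. The toy embeddings and the value of k are assumptions.

```python
import numpy as np
import networkx as nx
from sklearn.neighbors import NearestNeighbors

def language_modularity(emb, lang_ids, k=3):
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(emb)
    _, idx = nbrs.kneighbors(emb)                  # idx[:, 0] is the point itself
    G = nx.Graph()
    G.add_nodes_from(range(len(emb)))
    for i, row in enumerate(idx):
        G.add_edges_from((i, int(j)) for j in row[1:])
    communities = [set(np.flatnonzero(lang_ids == l)) for l in np.unique(lang_ids)]
    return nx.algorithms.community.modularity(G, communities)

rng = np.random.default_rng(0)
# Two "languages": poorly aligned embeddings occupy separated regions of the space.
emb = np.vstack([rng.normal(0, 1, (50, 16)), rng.normal(4, 1, (50, 16))])
lang = np.array([0] * 50 + [1] * 50)
print(language_modularity(emb, lang))              # roughly 0.5 for this toy case
```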
Exploring Uncertainty Measures for Image-Caption Embedding-and-Retrieval Task
Title | Exploring Uncertainty Measures for Image-Caption Embedding-and-Retrieval Task |
Authors | Kenta Hama, Takashi Matsubara, Kuniaki Uehara, Jianfei Cai |
Abstract | With the wide development of black-box machine learning algorithms, particularly deep neural network (DNN), the practical demand for the reliability assessment is rapidly rising. On the basis of the concept that "Bayesian deep learning knows what it does not know," the uncertainty of DNN outputs has been investigated as a reliability measure for the classification and regression tasks. However, in the image-caption retrieval task, well-known samples are not always easy-to-retrieve samples. This study investigates two aspects of image-caption embedding-and-retrieval systems. On one hand, we quantify feature uncertainty by considering image-caption embedding as a regression task, and use it for model averaging, which can improve the retrieval performance. On the other hand, we further quantify posterior uncertainty by considering the retrieval as a classification task, and use it as a reliability measure, which can greatly improve the retrieval performance by rejecting uncertain queries. The consistent performance of two uncertainty measures is observed with different datasets (MS COCO and Flickr30k), different deep learning architectures (dropout and batch normalization), and different similarity functions. |
Tasks | |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.08504v1 |
http://arxiv.org/pdf/1904.08504v1.pdf | |
PWC | https://paperswithcode.com/paper/190408504 |
Repo | |
Framework | |
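A sketch of one common way to obtain the kind of uncertainty discussed above: keep dropout active at test time, embed the same query several times, and use the variance across passes as the uncertainty estimate, rejecting the most uncertain queries. The toy embedding network, number of passes, and rejection rule are assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Dropout(p=0.3), nn.Linear(256, 64))

def mc_dropout_embed(x, T=20):
    embed.train()                       # keep dropout stochastic at inference time
    with torch.no_grad():
        samples = torch.stack([embed(x) for _ in range(T)])   # (T, batch, dim)
    return samples.mean(0), samples.var(0).mean(-1)           # embedding, per-query uncertainty

x = torch.randn(8, 128)                 # 8 query feature vectors
mean_emb, uncertainty = mc_dropout_embed(x)
keep = uncertainty < uncertainty.median()   # e.g. reject the most uncertain half
print(uncertainty, keep)
```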
$α^α$-Rank: Practically Scaling $α$-Rank through Stochastic Optimisation
Title | $α^α$-Rank: Practically Scaling $α$-Rank through Stochastic Optimisation |
Authors | Yaodong Yang, Rasul Tutunov, Phu Sakulwongtana, Haitham Bou Ammar |
Abstract | Recently, $\alpha$-Rank, a graph-based algorithm, has been proposed as a solution to ranking joint policy profiles in large-scale multi-agent systems. $\alpha$-Rank claimed tractability through a polynomial-time implementation with respect to the total number of pure strategy profiles. Here, we note that inputs to the algorithm were not clearly specified in the original presentation; as such, we deem the complexity claims not grounded, and conjecture that solving $\alpha$-Rank is NP-hard. The authors of $\alpha$-Rank suggested that the input to $\alpha$-Rank can be an exponentially-sized payoff matrix, a claim promised to be clarified in subsequent manuscripts. Even though $\alpha$-Rank exhibits a polynomial-time solution with respect to such an input, we identify additional critical problems. We demonstrate that, due to the need to construct an exponentially large Markov chain, $\alpha$-Rank is infeasible beyond a small finite number of agents. We ground these claims by adopting the amount of dollars spent as a non-refutable evaluation metric. Recognizing this scalability issue, we present a stochastic implementation of $\alpha$-Rank with a double-oracle mechanism allowing for reductions in joint strategy spaces. Our method, $\alpha^\alpha$-Rank, does not need to store the exponentially large transition matrix and can terminate early under a required precision. Although our method theoretically exhibits similar worst-case complexity guarantees to $\alpha$-Rank, it allows us, for the first time, to practically conduct large-scale multi-agent evaluations. On $10^4 \times 10^4$ random matrices, we achieve a 1000x reduction in running time. Furthermore, we also show successful results on large joint strategy profiles with a maximum size on the order of $\mathcal{O}(2^{25})$ ($\approx 33$ million joint strategies) – a setting not evaluable using $\alpha$-Rank with a reasonable computational budget. |
Tasks | Stochastic Optimization |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11628v6 |
https://arxiv.org/pdf/1909.11628v6.pdf | |
PWC | https://paperswithcode.com/paper/-rank-scalable-multi-agent-evaluation-through |
Repo | |
Framework | |
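For context on why scale matters here: $\alpha$-Rank scores strategy profiles by the stationary distribution of a Markov chain whose states are joint profiles, and that state space grows exponentially with the number of agents. The toy power-iteration sketch below shows only that core stationary-distribution computation on a random, tiny chain; constructing the real chain from payoffs, and the paper's stochastic double-oracle machinery, are omitted.

```python
import numpy as np

def stationary_distribution(T, iters=10_000, tol=1e-12):
    """Power iteration for the stationary distribution of a row-stochastic matrix T."""
    pi = np.full(T.shape[0], 1.0 / T.shape[0])
    for _ in range(iters):
        new = pi @ T
        if np.abs(new - pi).sum() < tol:
            break
        pi = new
    return pi

rng = np.random.default_rng(0)
n_profiles = 6                          # in alpha-Rank this grows exponentially with agents
T = rng.random((n_profiles, n_profiles)); T /= T.sum(axis=1, keepdims=True)
pi = stationary_distribution(T)
print(np.argsort(-pi), pi.round(3))     # profiles ranked by stationary mass
```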
Automatic Calibration of Artificial Neural Networks for Zebrafish Collective Behaviours using a Quality Diversity Algorithm
Title | Automatic Calibration of Artificial Neural Networks for Zebrafish Collective Behaviours using a Quality Diversity Algorithm |
Authors | Leo Cazenille, Nicolas Bredeche, José Halloy |
Abstract | During the last two decades, various models have been proposed for fish collective motion. These models are mainly developed to decipher the biological mechanisms of social interaction between animals. They consider very simple homogeneous unbounded environments, and it is not clear that they can accurately simulate collective trajectories. Moreover, when the models are more accurate, the question of their scalability to either larger groups or more elaborate environments remains open. This study deals with learning how to simulate realistic collective motion of zebrafish collectives, using real-world tracking data. The objective is to devise an agent-based model that can be implemented on an artificial robotic fish able to blend into a collective of real fish. We present a novel approach that uses Quality Diversity algorithms, a class of algorithms that emphasise exploration over pure optimisation. In particular, we use CVT-MAP-Elites, a variant of the state-of-the-art MAP-Elites algorithm for high-dimensional search spaces. Results show that Quality Diversity algorithms not only outperform classic evolutionary reinforcement learning methods at the macroscopic level (i.e. group behaviour), but are also able to generate more realistic biomimetic behaviours at the microscopic level (i.e. individual behaviour). |
Tasks | Calibration |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09209v1 |
https://arxiv.org/pdf/1907.09209v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-calibration-of-artificial-neural |
Repo | |
Framework | |
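A minimal MAP-Elites loop, to make the Quality Diversity idea above concrete (CVT-MAP-Elites replaces the regular grid below with centroidal-Voronoi cells). The fitness function, behaviour descriptor, and genome size are toy stand-ins, not the zebrafish simulator or calibration objective used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
fitness = lambda x: -np.sum((x - 0.5) ** 2)         # toy objective
behaviour = lambda x: (x[0], x[1])                   # 2-D behaviour descriptor in [0, 1]^2
bins = 10
archive = {}                                         # cell -> (fitness, genome)

def cell_of(desc):
    return tuple(np.clip((np.array(desc) * bins).astype(int), 0, bins - 1))

for it in range(5000):
    if archive and it > 100:
        parent = archive[list(archive)[rng.integers(len(archive))]][1]
        child = np.clip(parent + 0.1 * rng.normal(size=parent.shape), 0, 1)   # mutate an elite
    else:
        child = rng.random(8)                        # random bootstrap phase
    f, c = fitness(child), cell_of(behaviour(child))
    if c not in archive or f > archive[c][0]:        # keep the best genome per cell
        archive[c] = (f, child)

print(len(archive), "cells filled; best fitness:", max(f for f, _ in archive.values()))
```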
Driver Identification via the Steering Wheel
Title | Driver Identification via the Steering Wheel |
Authors | Bernhard Gahr, Shu Liu, Kevin Koch, Filipe Barata, André Dahlinger, Benjamin Ryder, Elgar Fleisch, Felix Wortmann |
Abstract | Driver identification has emerged as a vital research field, where both practitioners and researchers investigate the potential of driver identification to enable a personalized driving experience. Within recent years, a selection of studies have reported that individuals could be perfectly identified based on their driving behavior under controlled conditions. However, research investigating the potential of driver identification under naturalistic conditions claims accuracies only marginally higher than a random guess. The paper at hand provides a comprehensive summary of the recent work, highlighting the main discrepancies in the design of the machine learning approaches, primarily the window length parameter that was considered. Key findings further indicate that longitudinal vehicle control information is particularly useful for driver identification, leaving open the question of the extent to which lateral vehicle control can be used for reliable identification. Building upon existing work, we propose a novel approach to the design of the window length parameter and provide evidence that reliable driver identification can be achieved with data limited to the steering wheel only. The results and insights in this paper are based on data collected from the largest naturalistic driving study conducted in this field. Overall, a neural network based on GRUs was found to provide better identification performance than traditional methods, increasing the prediction accuracy from under 15% to over 65% for 15 drivers. When leveraging the full field-study dataset, comprising 72 drivers, the identification accuracy of the approach improved over random guessing by a factor of 25. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03953v1 |
https://arxiv.org/pdf/1909.03953v1.pdf | |
PWC | https://paperswithcode.com/paper/driver-identification-via-the-steering-wheel |
Repo | |
Framework | |
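A sketch of the windowing-plus-recurrent-classifier idea from the entry above: split the steering-angle signal into fixed-length windows and classify the driver with a GRU. Window length, sampling rate, and network sizes are assumptions for illustration only, not the paper's tuned configuration.

```python
import torch
import torch.nn as nn

class DriverGRU(nn.Module):
    def __init__(self, n_drivers=15, hidden=32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_drivers)

    def forward(self, windows):          # windows: (batch, window_len, 1)
        _, h = self.gru(windows)         # h: (1, batch, hidden), last hidden state
        return self.head(h.squeeze(0))

def make_windows(signal, window_len=300):
    """Chop a 1-D steering-angle signal into non-overlapping windows."""
    n = len(signal) // window_len
    return signal[: n * window_len].reshape(n, window_len, 1)

model = DriverGRU()
steering = torch.randn(3000)             # placeholder for a recorded steering trace
logits = model(make_windows(steering))
print(logits.shape)                      # (10, 15): 10 windows scored over 15 drivers
```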
QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection
Title | QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection |
Authors | Honghao Wei, Xiaohan Kang, Weina Wang, Lei Ying |
Abstract | This paper combines data-driven and model-driven methods for real-time misinformation detection. Our algorithm, named QuickStop, is an optimal stopping algorithm based on a probabilistic information spreading model obtained from labeled data. The algorithm consists of an offline machine learning algorithm for learning the probabilistic information spreading model and an online optimal stopping algorithm to detect misinformation. The online detection algorithm has both low computational and memory complexities. Our numerical evaluations with a real-world dataset show that QuickStop outperforms existing misinformation detection algorithms in terms of both accuracy and detection time (number of observations needed for detection). Our evaluations with synthetic data further show that QuickStop is robust to (offline) learning errors. |
Tasks | |
Published | 2019-03-04 |
URL | https://arxiv.org/abs/1903.04887v2 |
https://arxiv.org/pdf/1903.04887v2.pdf | |
PWC | https://paperswithcode.com/paper/quickstop-a-markov-optimal-stopping-approach |
Repo | |
Framework | |
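To give a feel for the sequential-detection flavour of QuickStop, the sketch below uses a classic sequential probability ratio test (SPRT): accumulate log-likelihood ratios of observations under a "misinformation" model versus a "legitimate" model and stop at the first threshold crossing. This is a generic stand-in, not the paper's learned Markov information-spreading model or its exact optimal-stopping policy; all probabilities and thresholds are assumptions.

```python
import math

def sprt(observations, p_mis, p_leg, upper=2.2, lower=-2.2):
    """Return ('misinformation' | 'legitimate' | 'undecided', #observations used)."""
    llr = 0.0
    for t, x in enumerate(observations, start=1):
        llr += math.log(p_mis[x] / p_leg[x])       # x is an observed share/edge type
        if llr >= upper:
            return "misinformation", t
        if llr <= lower:
            return "legitimate", t
    return "undecided", len(observations)

# Observation alphabet: who shared the post (0 = low-credibility user, 1 = high).
p_mis = {0: 0.7, 1: 0.3}        # assumed share-type distribution under misinformation
p_leg = {0: 0.3, 1: 0.7}        # and under legitimate news
print(sprt([0, 0, 1, 0, 0, 0], p_mis, p_leg))   # stops after 5 observations
```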