Paper Group ANR 384
CryptoSPN: Privacy-preserving Sum-Product Network Inference
Title | CryptoSPN: Privacy-preserving Sum-Product Network Inference |
Authors | Amos Treiber, Alejandro Molina, Christian Weinert, Thomas Schneider, Kristian Kersting |
Abstract | AI algorithms, and machine learning (ML) techniques in particular, are increasingly important to individuals’ lives, but have caused a range of privacy concerns addressed by, e.g., the European GDPR. Using cryptographic techniques, it is possible to perform inference tasks remotely on sensitive client data in a privacy-preserving way: the server learns nothing about the input data and the model predictions, while the client learns nothing about the ML model (which is often considered intellectual property and might contain traces of sensitive data). While such privacy-preserving solutions are relatively efficient, they are mostly targeted at neural networks, can degrade the predictive accuracy, and usually reveal the network’s topology. Furthermore, existing solutions are not readily accessible to ML experts, as prototype implementations are not well-integrated into ML frameworks and require extensive cryptographic knowledge. In this paper, we present CryptoSPN, a framework for privacy-preserving inference of sum-product networks (SPNs). SPNs are a tractable probabilistic graphical model that allows a range of exact inference queries in linear time. Specifically, we show how to efficiently perform SPN inference via secure multi-party computation (SMPC) without accuracy degradation while hiding sensitive client and training information with provable security guarantees. Next to foundations, CryptoSPN encompasses tools to easily transform existing SPNs into privacy-preserving executables. Our empirical results demonstrate that CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00801v1 |
https://arxiv.org/pdf/2002.00801v1.pdf | |
PWC | https://paperswithcode.com/paper/cryptospn-privacy-preserving-sum-product |
Repo | |
Framework | |
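
The abstract above hinges on the fact that SPN inference is a single bottom-up pass over an arithmetic circuit, which is what makes it amenable to secure evaluation. The sketch below shows plain, plaintext SPN log-likelihood inference; it is not the CryptoSPN SMPC protocol, and the node structure, Gaussian leaves, and toy network are assumptions for illustration only.

```python
# Plaintext sketch of linear-time SPN inference in log-space. This is NOT the
# CryptoSPN SMPC protocol; it only shows the arithmetic circuit that a secure
# protocol would evaluate obliviously. The node encoding is hypothetical.
import numpy as np

def gaussian_logpdf(x, mu, sigma):
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def eval_spn(node, x):
    """Recursively evaluate the log-density of an SPN at input vector x."""
    kind = node["type"]
    if kind == "leaf":                       # univariate Gaussian leaf
        return gaussian_logpdf(x[node["var"]], node["mu"], node["sigma"])
    child_lls = np.array([eval_spn(c, x) for c in node["children"]])
    if kind == "product":                    # product node: sum of child log-likelihoods
        return child_lls.sum()
    if kind == "sum":                        # sum node: log of weighted mixture
        logw = np.log(np.array(node["weights"]))
        return np.logaddexp.reduce(logw + child_lls)
    raise ValueError(kind)

# Toy SPN over two variables x0, x1.
spn = {"type": "sum", "weights": [0.6, 0.4], "children": [
    {"type": "product", "children": [
        {"type": "leaf", "var": 0, "mu": 0.0, "sigma": 1.0},
        {"type": "leaf", "var": 1, "mu": 1.0, "sigma": 0.5}]},
    {"type": "product", "children": [
        {"type": "leaf", "var": 0, "mu": 3.0, "sigma": 1.0},
        {"type": "leaf", "var": 1, "mu": -1.0, "sigma": 0.5}]}]}

print(eval_spn(spn, np.array([0.2, 0.9])))   # exact log-likelihood in linear time
```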
Classification of Traffic Using Neural Networks by Rejecting: a Novel Approach in Classifying VPN Traffic
Title | Classification of Traffic Using Neural Networks by Rejecting: a Novel Approach in Classifying VPN Traffic |
Authors | Ali Parchekani, Salar Nouri Naghadeh, Vahid Shah-Mansouri |
Abstract | A traffic flow is a set of packets transferred between a client and a server with the same source and destination IP addresses and port numbers. Traffic classification refers to the task of categorizing traffic flows into application-aware classes such as chat, streaming, VoIP, etc. Classification can be used for several purposes, including policy enforcement and control or QoS management. In this paper, we introduce a novel end-to-end traffic classification method to distinguish between traffic classes, including VPN traffic. Classifying VPN traffic is not trivial with traditional approaches due to its encrypted nature. We utilize two well-known neural networks, namely a multi-layer perceptron and a recurrent neural network, focused on two metrics: class scores and distance from the centers of the classes. These approaches combine feature extraction, selection, and classification into a single end-to-end system that systematically learns the non-linear relationship between the input and the predicted performance. We can therefore distinguish VPN traffic from non-VPN traffic by rejecting samples that do not match any known application class, while simultaneously obtaining the application class of non-VPN traffic. The approach is evaluated using the public ISCX VPN-nonVPN traffic dataset and an acquired real dataset. The results of the analysis demonstrate that our proposed model meets the precision requirements of a realistic project. |
Tasks | |
Published | 2020-01-10 |
URL | https://arxiv.org/abs/2001.03665v1 |
https://arxiv.org/pdf/2001.03665v1.pdf | |
PWC | https://paperswithcode.com/paper/classification-of-traffic-using-neural |
Repo | |
Framework | |
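
The rejection idea described above — assign a flow to a class only if the classifier is confident and the sample lies close to a known class center, otherwise reject it (e.g., as VPN) — can be illustrated with a small decision rule. This is a hedged sketch; the thresholds, feature space, and class centers are placeholders, not the authors' trained MLP/RNN models.

```python
# Hedged sketch of score-and-distance-based rejection in the spirit of the
# approach above. All numbers and names here are illustrative assumptions.
import numpy as np

def classify_with_reject(features, class_centers, class_scores,
                         score_thresh=0.8, dist_thresh=2.0):
    """Return a predicted class index, or -1 ('rejected', e.g. VPN) when the
    classifier is unconfident or the sample is far from the winning class center."""
    best = int(np.argmax(class_scores))
    dist = np.linalg.norm(features - class_centers[best])
    if class_scores[best] < score_thresh or dist > dist_thresh:
        return -1
    return best

centers = np.array([[0.0, 0.0], [5.0, 5.0]])                         # per-class centers
print(classify_with_reject(np.array([4.8, 5.1]), centers, np.array([0.1, 0.9])))   # 1
print(classify_with_reject(np.array([10.0, -3.0]), centers, np.array([0.5, 0.5]))) # -1 (rejected)
```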
Transferable Task Execution from Pixels through Deep Planning Domain Learning
Title | Transferable Task Execution from Pixels through Deep Planning Domain Learning |
Authors | Kei Kase, Chris Paxton, Hammad Mazhar, Tetsuya Ogata, Dieter Fox |
Abstract | While robots can learn models to solve many manipulation tasks from raw visual input, they cannot usually use these models to solve new problems. On the other hand, symbolic planning methods such as STRIPS have long been able to solve new problems given only a domain definition and a symbolic goal, but these approaches often struggle on real-world robotic tasks due to the challenges of grounding these symbols from sensor data in a partially observable world. We propose Deep Planning Domain Learning (DPDL), an approach that combines the strengths of both methods to learn a hierarchical model. DPDL learns a high-level model which predicts values for a large set of logical predicates constituting the current symbolic world state, and separately learns a low-level policy which translates symbolic operators into executable actions on the robot. This allows us to perform complex, multi-step tasks even when the robot has not been explicitly trained on them. We demonstrate our method on manipulation tasks in a photorealistic kitchen scenario. |
Tasks | |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.03726v1 |
https://arxiv.org/pdf/2003.03726v1.pdf | |
PWC | https://paperswithcode.com/paper/transferable-task-execution-from-pixels |
Repo | |
Framework | |
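
The DPDL pipeline above combines a learned predicate predictor, classical symbolic planning, and a learned low-level policy. The sketch below illustrates only the middle, symbolic-planning step with a toy STRIPS-style breadth-first search; the perceived predicates and operators are hypothetical stand-ins for the outputs of the learned components.

```python
# Hedged sketch of the planning step in a DPDL-style loop: predicates grounded
# from pixels would feed the initial state, and each planned operator would be
# handed to a learned low-level policy. Everything below is a toy stand-in.
from collections import deque

def plan(state, goal, operators):
    """Breadth-first search over frozensets of true predicates.
    operators: list of (name, preconditions, add_effects, del_effects)."""
    frontier, seen = deque([(frozenset(state), [])]), set()
    while frontier:
        s, path = frontier.popleft()
        if goal <= s:
            return path
        if s in seen:
            continue
        seen.add(s)
        for name, pre, add, delete in operators:
            if pre <= s:
                frontier.append(((s - delete) | add, path + [name]))
    return None

# Toy domain: pick up a block and place it on a shelf.
ops = [("pick(block)", {"on_table(block)", "hand_empty"},
        {"holding(block)"}, {"on_table(block)", "hand_empty"}),
       ("place(block,shelf)", {"holding(block)"},
        {"on_shelf(block)", "hand_empty"}, {"holding(block)"})]

perceived = {"on_table(block)", "hand_empty"}      # would come from the vision model
print(plan(perceived, {"on_shelf(block)"}, ops))   # ['pick(block)', 'place(block,shelf)']
```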
Natural Language Processing Advancements By Deep Learning: A Survey
Title | Natural Language Processing Advancements By Deep Learning: A Survey |
Authors | Amirsina Torfi, Rouzbeh A. Shirvani, Yaser Keneshloo, Nader Tavvaf, Edward A. Fox |
Abstract | Natural Language Processing (NLP) helps empower intelligent machines by enabling a better understanding of human language for linguistics-based human-computer communication. Recent developments in computational power and the advent of large amounts of linguistic data have heightened the need and demand for automating semantic analysis using data-driven approaches. The utilization of data-driven strategies is pervasive now due to the significant improvements demonstrated through the use of deep learning methods in areas such as Computer Vision, Automatic Speech Recognition, and in particular, NLP. This survey categorizes and addresses the different aspects and applications of NLP that have benefited from deep learning. It covers core NLP tasks and applications and describes how deep learning methods and models advance these areas. We further analyze and compare different approaches and state-of-the-art models. |
Tasks | Speech Recognition |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.01200v2 |
https://arxiv.org/pdf/2003.01200v2.pdf | |
PWC | https://paperswithcode.com/paper/natural-language-processing-advancements-by |
Repo | |
Framework | |
Physically Plausible Spectral Reconstruction from RGB Images
Title | Physically Plausible Spectral Reconstruction from RGB Images |
Authors | Yi-Tun Lin, Graham D. Finlayson |
Abstract | Recently, Convolutional Neural Networks (CNN) have been used to reconstruct hyperspectral information from RGB images. Moreover, this spectral reconstruction (SR) problem can often be solved with good (low) error. However, these methods are not physically plausible: that is, when the recovered spectra are reintegrated with the underlying camera sensitivities, the resulting predicted RGB is not the same as the actual RGB, and sometimes this discrepancy can be large. The problem is further compounded by exposure change. Indeed, most learning-based SR models train for a fixed exposure setting, and we show that this can result in poor performance when exposure varies. In this paper, we show how CNN learning can be extended so that physical plausibility is enforced and the problem resulting from changing exposures is mitigated. Our SR solution improves the state-of-the-art spectral recovery performance under varying exposure conditions while simultaneously ensuring physical plausibility (the recovered spectra reintegrate to the input RGBs exactly). |
Tasks | |
Published | 2020-01-02 |
URL | https://arxiv.org/abs/2001.00558v1 |
https://arxiv.org/pdf/2001.00558v1.pdf | |
PWC | https://paperswithcode.com/paper/physically-plausible-spectral-reconstruction |
Repo | |
Framework | |
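
Physical plausibility, as described above, means the recovered spectrum must reintegrate to the observed RGB exactly under the camera sensitivities. One simple way to enforce this constraint after the fact is a minimal-norm projection onto the feasible set, sketched below; this mirrors the constraint in the abstract but is not necessarily the authors' exact network formulation.

```python
# Hedged sketch: project a recovered spectrum onto the set of spectra that
# reintegrate exactly to the observed RGB. Data shapes and values are assumed.
import numpy as np

def make_plausible(spectrum, sensitivities, rgb):
    """spectrum: (K,) recovered spectrum; sensitivities: (3, K) camera matrix S;
    rgb: (3,) observed RGB. Returns the minimal-norm correction so S @ out == rgb."""
    S = sensitivities
    residual = rgb - S @ spectrum
    correction = S.T @ np.linalg.solve(S @ S.T, residual)
    return spectrum + correction

rng = np.random.default_rng(0)
S = rng.random((3, 31))                    # 31 spectral bands, hypothetical sensitivities
r = rng.random(31)                         # CNN-recovered spectrum (placeholder)
rgb = rng.random(3)                        # observed RGB
r_plausible = make_plausible(r, S, rgb)
print(np.allclose(S @ r_plausible, rgb))   # True: exact reintegration
```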
Using a thousand optimization tasks to learn hyperparameter search strategies
Title | Using a thousand optimization tasks to learn hyperparameter search strategies |
Authors | Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein |
Abstract | We present TaskSet, a dataset of tasks for use in training and evaluating optimizers. TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional neural networks, to variational autoencoders, to non-volume-preserving flows on a variety of datasets. As an example application of such a dataset, we explore meta-learning an ordered list of hyperparameters to try sequentially. By learning this hyperparameter list from data generated using TaskSet, we achieve large speedups in sample efficiency over random search. Next, we use the diversity of TaskSet and our method for learning hyperparameter lists to empirically explore the generalization of these lists to new optimization tasks in a variety of settings, including ImageNet classification with ResNet50 and LM1B language modeling with Transformers. As part of this work we have open-sourced code for all tasks, as well as ~29 million training curves for these problems and the corresponding hyperparameters. |
Tasks | Image Classification, Language Modelling, Meta-Learning |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11887v3 |
https://arxiv.org/pdf/2002.11887v3.pdf | |
PWC | https://paperswithcode.com/paper/using-a-thousand-optimization-tasks-to-learn |
Repo | |
Framework | |
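
One concrete use of TaskSet mentioned above is meta-learning an ordered list of hyperparameter configurations to try sequentially. The sketch below shows a greedy construction of such a list from a matrix of per-task, per-configuration losses; the greedy mean-loss criterion is an assumption for illustration and may differ from the paper's exact objective.

```python
# Hedged sketch of learning an ordered hyperparameter list: greedily add the
# configuration that most improves the best-so-far loss averaged over tasks.
import numpy as np

def greedy_hparam_list(loss, k):
    """loss: (n_tasks, n_configs) final losses on TaskSet-style data.
    Returns indices of k configurations to try sequentially."""
    n_tasks, n_configs = loss.shape
    best_so_far = np.full(n_tasks, np.inf)
    chosen = []
    for _ in range(k):
        # score each remaining config by the mean loss if it were added to the list
        scores = [np.mean(np.minimum(best_so_far, loss[:, j])) if j not in chosen else np.inf
                  for j in range(n_configs)]
        j_star = int(np.argmin(scores))
        chosen.append(j_star)
        best_so_far = np.minimum(best_so_far, loss[:, j_star])
    return chosen

loss = np.random.rand(1000, 64)          # toy stand-in for TaskSet results
print(greedy_hparam_list(loss, k=5))     # ordered list of 5 configs to try
```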
SEERL: Sample Efficient Ensemble Reinforcement Learning
Title | SEERL: Sample Efficient Ensemble Reinforcement Learning |
Authors | Rohan Saphal, Balaraman Ravindran, Dheevatsa Mudigere, Sasikanth Avancha, Bharat Kaul |
Abstract | Ensemble learning is a very prevalent method employed in machine learning. The relative success of ensemble methods is attributed to their ability to tackle a wide range of instances and complex problems that require different low-level approaches. However, ensemble methods are relatively less popular in reinforcement learning owing to the high sample complexity and computational expense involved. We present a new training and evaluation framework for model-free algorithms that uses ensembles of policies obtained from a single training instance. These policies are diverse in nature and are learned through directed perturbation of the model parameters at regular intervals. We show that learning an adequately diverse set of policies is required for a good ensemble, while extreme diversity can prove detrimental to overall performance. We evaluate our approach on challenging discrete and continuous control tasks and also discuss various ensembling strategies. Our framework is substantially sample efficient, computationally inexpensive, and is seen to outperform state-of-the-art (SOTA) scores in Atari 2600 and MuJoCo. Video results can be found at https://www.youtube.com/channel/UC95Kctu9Mp8BlFmtGD2TGTA |
Tasks | Continuous Control |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05209v1 |
https://arxiv.org/pdf/2001.05209v1.pdf | |
PWC | https://paperswithcode.com/paper/seerl-sample-efficient-ensemble-reinforcement-1 |
Repo | |
Framework | |
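
SEERL builds an ensemble from policy snapshots obtained during a single training run. A minimal sketch of one possible evaluation-time ensembling strategy — majority voting over discrete actions — is shown below; the paper discusses several strategies, and the stub policies here are placeholders.

```python
# Hedged sketch of evaluating an ensemble of policies saved from one training
# run (e.g., snapshots taken at regular intervals). Majority voting is just one
# simple strategy for discrete actions.
import numpy as np

def ensemble_action(policies, state):
    """policies: list of callables mapping state -> discrete action."""
    votes = [p(state) for p in policies]
    values, counts = np.unique(votes, return_counts=True)
    return int(values[np.argmax(counts)])

policies = [lambda s, a=a: a for a in (0, 1, 1)]   # three stub snapshot policies
print(ensemble_action(policies, state=None))       # 1 (majority vote)
```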
Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition
Title | Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition |
Authors | Xiaodong Cui, Wei Zhang, Ulrich Finkler, George Saon, Michael Picheny, David Kung |
Abstract | The past decade has witnessed great progress in Automatic Speech Recognition (ASR) due to advances in deep learning. The improvements in performance can be attributed to both improved models and large-scale training data. Key to training such models is the employment of efficient distributed learning techniques. In this article, we provide an overview of distributed training techniques for deep neural network acoustic models for ASR. Starting with the fundamentals of data parallel stochastic gradient descent (SGD) and ASR acoustic modeling, we will investigate various distributed training strategies and their realizations in high performance computing (HPC) environments with an emphasis on striking the balance between communication and computation. Experiments are carried out on a popular public benchmark to study the convergence, speedup and recognition performance of the investigated strategies. |
Tasks | Speech Recognition |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10502v1 |
https://arxiv.org/pdf/2002.10502v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-training-of-deep-neural-network |
Repo | |
Framework | |
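
The core primitive discussed above is data-parallel SGD: each worker computes a gradient on its own shard, the gradients are averaged (the all-reduce communication step), and a single update is applied. The sketch below illustrates that pattern with plain NumPy and a toy loss; it is not the authors' HPC implementation.

```python
# Hedged sketch of one synchronous data-parallel SGD step. The loss, sharding,
# and learning rate are illustrative assumptions only.
import numpy as np

def data_parallel_sgd_step(params, shards, grad_fn, lr=0.1):
    """shards: one minibatch per worker; grad_fn(params, batch) -> gradient."""
    worker_grads = [grad_fn(params, batch) for batch in shards]  # per-worker computation
    avg_grad = np.mean(worker_grads, axis=0)                     # all-reduce (average)
    return params - lr * avg_grad                                # single shared update

grad_fn = lambda w, batch: 2 * (w - batch.mean(axis=0))          # toy quadratic loss gradient
shards = [np.random.randn(32, 4) for _ in range(8)]              # 8 workers, 32 samples each
w = data_parallel_sgd_step(np.zeros(4), shards, grad_fn)
print(w)
```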
Dynamic Epistemic Logic Games with Epistemic Temporal Goals
Title | Dynamic Epistemic Logic Games with Epistemic Temporal Goals |
Authors | Bastien Maubert, Aniello Murano, Sophie Pinchinat, François Schwarzentruber, Silvia Stranieri |
Abstract | Dynamic Epistemic Logic (DEL) is a logical framework in which one can describe in great detail how actions are perceived by the agents, and how they affect the world. DEL games were recently introduced as a way to define classes of games with imperfect information where the actions available to the players are described very precisely. This framework makes it possible to define easily, for instance, classes of games where players can only use public actions or public announcements. These games have been studied for reachability objectives, where the aim is to reach a situation satisfying some epistemic property expressed in epistemic logic; several (un)decidability results have been established. In this work we show that the decidability results obtained for reachability objectives extend to a much more general class of winning conditions, namely those expressible in the epistemic temporal logic LTLK. To do so we establish that the infinite game structures generated by DEL public actions are regular, and we describe how to obtain finite representations on which we rely to solve them. |
Tasks | |
Published | 2020-01-20 |
URL | https://arxiv.org/abs/2001.07141v1 |
https://arxiv.org/pdf/2001.07141v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-epistemic-logic-games-with-epistemic |
Repo | |
Framework | |
Learn to Schedule (LEASCH): A Deep reinforcement learning approach for radio resource scheduling in the 5G MAC layer
Title | Learn to Schedule (LEASCH): A Deep reinforcement learning approach for radio resource scheduling in the 5G MAC layer |
Authors | F. AL-Tam, N. Correia, J. Rodriguez |
Abstract | Network management tools are usually inherited from one generation to another. This was successful since these tools have been kept in check and updated regularly to fit new networking goals and service requirements. Unfortunately, new networking services will render this approach obsolete, and handcrafting new tools or upgrading the current ones may lead to complicated systems that will be extremely difficult to maintain and improve. Fortunately, recent advances in AI have provided new promising tools that can help solve many network management problems. Following this interesting trend, the current article presents LEASCH, a deep reinforcement learning model able to solve the radio resource scheduling problem in the MAC layer of 5G networks. LEASCH is developed and trained in a sandbox and then deployed in a 5G network. The experimental results validate the effectiveness of LEASCH compared to conventional baseline methods on many key performance indicators. |
Tasks | |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.11003v1 |
https://arxiv.org/pdf/2003.11003v1.pdf | |
PWC | https://paperswithcode.com/paper/learn-to-schedule-leasch-a-deep-reinforcement |
Repo | |
Framework | |
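
The abstract frames radio resource scheduling as a reinforcement learning problem: at each transmission opportunity, the agent decides which user gets the next resource block. The sketch below is a hedged, linear-Q stand-in for such an agent; LEASCH's actual observation space, reward design, and deep network are not reproduced here.

```python
# Hedged sketch of an RL scheduling agent choosing which user gets the next
# radio resource block. Linear Q-learning is used here as a simple stand-in
# for a deep Q-network; features and rewards are illustrative assumptions.
import numpy as np

class SchedulerAgent:
    def __init__(self, n_users, n_features, lr=0.01, eps=0.1):
        self.W = np.zeros((n_users, n_features))    # one linear Q-row per user (action)
        self.lr, self.eps = lr, eps

    def act(self, state):
        if np.random.rand() < self.eps:             # epsilon-greedy exploration
            return np.random.randint(self.W.shape[0])
        return int(np.argmax(self.W @ state))       # greedy user selection

    def update(self, state, action, reward, next_state, gamma=0.95):
        target = reward + gamma * np.max(self.W @ next_state)
        td_error = target - self.W[action] @ state
        self.W[action] += self.lr * td_error * state

agent = SchedulerAgent(n_users=4, n_features=3)
state = np.array([0.2, 0.7, 0.1])                   # e.g., buffer/channel/fairness features
user = agent.act(state)
agent.update(state, user, reward=1.0, next_state=state)
```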
Characterizing Reading Time on Enterprise Emails
Title | Characterizing Reading Time on Enterprise Emails |
Authors | Xinyi Li, Chia-Jung Lee, Milad Shokouhi, Susan Dumais |
Abstract | Email is an integral part of people’s work and life, enabling them to perform activities such as communicating, searching, managing tasks and storing information. Modern email clients take a step forward and help improve users’ productivity by automatically creating reminders, tasks or responses. The act of reading is arguably the only activity that is common to most – if not all – of the interactions that users have with their emails. In this paper, we characterize how users read their enterprise emails, and reveal the various contextual factors that impact reading time. Our approach starts with a reading time analysis based on reading events from a major email platform, followed by a user study to provide explanations for some discoveries. We identify multiple temporal and user contextual factors that are correlated with reading time. For instance, email reading time is correlated with user devices: on desktop, reading time increases through the morning and peaks at noon, but on mobile it increases through the evening until midnight. Reading time is also negatively correlated with screen size. We establish a connection between user status and reading time: users spend more time reading emails when they have fewer meetings and busy hours during the day. In addition, we find that users also reread emails across devices. Among the cross-device reading events, 76% of reread emails are first visited on mobile and then on desktop. Overall, our study is the first to characterize enterprise email reading time on a very large scale. The findings provide insights to develop better metrics and user models for understanding and improving email interactions. |
Tasks | |
Published | 2020-01-03 |
URL | https://arxiv.org/abs/2001.00802v1 |
https://arxiv.org/pdf/2001.00802v1.pdf | |
PWC | https://paperswithcode.com/paper/characterizing-reading-time-on-enterprise |
Repo | |
Framework | |
Treatment effect estimation with disentangled latent factors
Title | Treatment effect estimation with disentangled latent factors |
Authors | Weijia Zhang, Lin Liu, Jiuyong Li |
Abstract | A pressing concern faced by cancer patients is their prognosis under different treatment options. Considering a binary treatment, e.g., to receive radiotherapy or not, the problem can be characterized as estimating the treatment effect of radiotherapy on the survival outcome of the patients. Estimating treatment effects from observational studies is a fundamental problem, yet it remains especially challenging due to the counterfactual and confounding problems. In this work, we show the importance of differentiating confounding factors from factors that only affect the treatment or the outcome, and propose a data-driven approach to learn and disentangle the latent factors into three disjoint sets in order to build a more accurate treatment effect estimator. Empirical validations on semi-synthetic benchmark and real-world datasets demonstrate the effectiveness of the proposed method. |
Tasks | |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2001.10652v1 |
https://arxiv.org/pdf/2001.10652v1.pdf | |
PWC | https://paperswithcode.com/paper/treatment-effect-estimation-with-disentangled |
Repo | |
Framework | |
Nonlinear system identification with regularized Tensor Network B-splines
Title | Nonlinear system identification with regularized Tensor Network B-splines |
Authors | Ridvan Karagoz, Kim Batselier |
Abstract | This article introduces the Tensor Network B-spline model for the regularized identification of nonlinear systems using a nonlinear autoregressive exogenous (NARX) approach. Tensor network theory is used to alleviate the curse of dimensionality of multivariate B-splines by representing the high-dimensional weight tensor as a low-rank approximation. An iterative algorithm based on the alternating linear scheme is developed to directly estimate the low-rank tensor network approximation, removing the need to ever explicitly construct the exponentially large weight tensor. This reduces the computational and storage complexity significantly, allowing the identification of NARX systems with a large number of inputs and lags. The proposed algorithm is numerically stable, robust to noise, guaranteed to monotonically converge, and allows the straightforward incorporation of regularization. The TNBS-NARX model is validated through the identification of the cascaded watertank benchmark nonlinear system, on which it achieves state-of-the-art performance while identifying a 16-dimensional B-spline surface in 4 seconds on a standard desktop computer. An open-source MATLAB implementation is available on GitHub. |
Tasks | |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07594v1 |
https://arxiv.org/pdf/2003.07594v1.pdf | |
PWC | https://paperswithcode.com/paper/nonlinear-system-identification-with |
Repo | |
Framework | |
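
The NARX structure underlying the TNBS model builds a regressor from lagged inputs and outputs and learns a map from that regressor to the next output. The sketch below constructs the lagged regressors and fits a plain linear map as a stand-in; the paper's contribution is replacing this map with a low-rank Tensor Network B-spline surface fitted by an alternating linear scheme, which is not shown here.

```python
# Hedged sketch of the NARX setup that the TNBS model plugs into. The toy
# system, lag orders, and the linear stand-in map are illustrative assumptions.
import numpy as np

def narx_regressors(u, y, nu, ny):
    """u, y: input/output sequences; nu, ny: numbers of input/output lags.
    Returns the regression matrix X and the aligned targets."""
    start = max(nu, ny)
    X = np.array([np.concatenate([u[t - nu:t], y[t - ny:t]])
                  for t in range(start, len(y))])
    return X, y[start:]

u = np.random.randn(200)
y = np.convolve(u, [0.5, 0.3, 0.1])[:200]        # toy linear system as placeholder data
X, targets = narx_regressors(u, y, nu=3, ny=2)
w, *_ = np.linalg.lstsq(X, targets, rcond=None)  # linear stand-in for the TNBS surface
print(w)
```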
Surrogate Assisted Evolutionary Algorithm for Medium Scale Expensive Multi-Objective Optimisation Problems
Title | Surrogate Assisted Evolutionary Algorithm for Medium Scale Expensive Multi-Objective Optimisation Problems |
Authors | Xiaoran Ruan, Ke Li, Bilel Derbel, Arnaud Liefooghe |
Abstract | Building a surrogate model of an objective function has been shown to be effective in assisting evolutionary algorithms (EAs) to solve real-world complex optimisation problems that involve either computationally expensive numerical simulations or costly physical experiments. However, their effectiveness has mostly been demonstrated on small-scale problems with fewer than 10 decision variables. The scalability of surrogate-assisted EAs (SAEAs) has not been well studied yet. In this paper, we propose a Gaussian process surrogate model assisted EA for medium-scale expensive multi-objective optimisation problems with up to 50 decision variables. There are three distinctive features of our proposed SAEA. First, instead of using all decision variables in surrogate model building, we only use the correlated ones to build the surrogate model for each objective function. Second, rather than directly optimising the surrogate objective functions, the original multi-objective optimisation problem is transformed into a new one based on the surrogate models. Last but not least, a subset selection method is developed to choose a couple of promising candidate solutions for actual objective function evaluations, thus updating the training dataset. The effectiveness of our proposed algorithm is validated on benchmark problems with 10, 20, and 50 variables, in comparison with three state-of-the-art SAEAs. |
Tasks | |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03150v1 |
https://arxiv.org/pdf/2002.03150v1.pdf | |
PWC | https://paperswithcode.com/paper/surrogate-assisted-evolutionary-algorithm-for |
Repo | |
Framework | |
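
The surrogate-assisted loop described above fits a cheap model to the solutions evaluated so far and uses it to pre-screen candidates before spending expensive true evaluations. The sketch below shows one such step with a Gaussian-process surrogate and a lower-confidence-bound pre-screen; the paper's correlated-variable selection, problem transformation, and subset-selection method are simplified away.

```python
# Hedged sketch of one surrogate-assisted pre-screening step (single objective,
# minimization). The GP kernel, acquisition rule, and data are assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def surrogate_assisted_step(X_eval, y_eval, candidates, n_select=2):
    """Fit a GP to evaluated points and return the most promising candidates
    to send to the expensive true objective."""
    gp = GaussianProcessRegressor().fit(X_eval, y_eval)
    mean, std = gp.predict(candidates, return_std=True)
    lcb = mean - std                          # lower confidence bound
    best = np.argsort(lcb)[:n_select]         # candidates worth a true evaluation
    return candidates[best]

rng = np.random.default_rng(1)
X_eval = rng.random((20, 5))                              # solutions evaluated so far
y_eval = np.sum((X_eval - 0.5) ** 2, axis=1)              # their (toy) objective values
candidates = rng.random((100, 5))                         # offspring pool from the EA
print(surrogate_assisted_step(X_eval, y_eval, candidates))
```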
A Hybrid Approach for Tracking Individual Players in Broadcast Match Videos
Title | A Hybrid Approach for Tracking Individual Players in Broadcast Match Videos |
Authors | Roberto L. Castro, Diego Andrade, Basilio Fraguela |
Abstract | Tracking people in a video sequence is a challenging task that has been approached from many perspectives. This task becomes even more complicated when the person to track is a player in a broadcast sports event, due to difficulties such as frequent camera movements or switches, total and partial occlusions between players, and blurry frames caused by the video codification algorithm. This paper introduces a player tracking solution that is both fast and accurate, allowing a player to be tracked precisely in real time. The approach combines several models that are executed concurrently on relatively modest hardware, and its accuracy has been validated against hand-labeled broadcast video sequences. Regarding accuracy, the tests show that the area under curve (AUC) of our approach is around 0.6, which is similar to generic state-of-the-art solutions. As for performance, our proposal can process high-definition videos (1920x1080 px) at 80 fps. |
Tasks | |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03271v2 |
https://arxiv.org/pdf/2003.03271v2.pdf | |
PWC | https://paperswithcode.com/paper/a-hybrid-approach-for-tracking-individual |
Repo | |
Framework | |