Paper Group ANR 1508
Kernel Density Estimation Bias under Minimal Assumptions
Title | Kernel Density Estimation Bias under Minimal Assumptions |
Authors | Maciej Skorski |
Abstract | Kernel Density Estimation is a very popular technique of approximating a density function from samples. The accuracy is generally well-understood and depends, roughly speaking, on the kernel decay and local smoothness of the true density. However, concrete statements in the literature are often invoked in very specific settings (simplified or overly conservative assumptions) or miss important but subtle points (e.g. it is common to heuristically apply Taylor’s expansion globally without referring to compactness). The contribution of this paper is twofold: (a) we demonstrate that, when the bandwidth is an arbitrary invertible matrix going to zero, it is necessary to keep a certain balance between the kernel decay and the magnitudes of bandwidth eigenvalues; in fact, without sufficient decay the estimates may not even be bounded; (b) we give a rigorous derivation of bounds with explicit constants for the bias, under possibly minimal assumptions. This connects the kernel decay, bandwidth norm, bandwidth determinant and density smoothness. It has been folklore that the issue with Taylor’s formula can be fixed with more complicated assumptions on the density (for example p. 95 of “Kernel Smoothing” by Wand and Jones); we show that this is actually not necessary and can be handled by the kernel decay alone. |
Tasks | Density Estimation |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00331v1 |
http://arxiv.org/pdf/1901.00331v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-density-estimation-bias-under-minimal |
Repo | |
Framework | |
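As a point of reference for the abstract above, the estimator with a matrix bandwidth H is commonly written as f_hat(x) = (1 / (n |det H|)) * sum_i K(H^{-1}(x - x_i)). The following is a minimal sketch of that formula, restricted to two dimensions and a Gaussian kernel; both restrictions, and the name `kde_2d`, are assumptions made for illustration and are not taken from the paper.

```python
import math

def kde_2d(x, samples, H):
    """Gaussian KDE at point x with a 2x2 bandwidth matrix H.

    Computes (1 / (n * |det H|)) * sum_i K(H^{-1} (x - x_i)),
    where K is the standard 2-D Gaussian density (an illustrative choice).
    """
    (a, b), (c, d) = H
    det = a * d - b * c
    if det == 0:
        raise ValueError("bandwidth matrix must be invertible")
    total = 0.0
    for sx, sy in samples:
        dx, dy = x[0] - sx, x[1] - sy
        # u = H^{-1} (x - x_i), using the closed-form inverse of a 2x2 matrix
        u0 = (d * dx - b * dy) / det
        u1 = (-c * dx + a * dy) / det
        total += math.exp(-0.5 * (u0 * u0 + u1 * u1)) / (2.0 * math.pi)
    return total / (len(samples) * abs(det))
```

The determinant in the normalisation and the inverse applied to the argument are the two places where the bandwidth enters, which is why the abstract relates the bias to both the bandwidth norm and the bandwidth determinant.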
Empirical Analysis of Session-Based Recommendation Algorithms
Title | Empirical Analysis of Session-Based Recommendation Algorithms |
Authors | Malte Ludewig, Noemi Mauro, Sara Latifi, Dietmar Jannach |
Abstract | Recommender systems are tools that support online users by pointing them to potential items of interest in situations of information overload. In recent years, the class of session-based recommendation algorithms received more attention in the research literature. These algorithms base their recommendations solely on the observed interactions with the user in an ongoing session and do not require the existence of long-term preference profiles. Most recently, a number of deep learning based (“neural”) approaches to session-based recommendations were proposed. However, previous research indicates that today’s complex neural recommendation methods are not always better than comparably simple algorithms in terms of prediction accuracy. With this work, our goal is to shed light on the state-of-the-art in the area of session-based recommendation and on the progress that is made with neural approaches. For this purpose, we compare twelve algorithmic approaches, among them six recent neural methods, under identical conditions on various datasets. We find that the progress in terms of prediction accuracy that is achieved with neural methods is still limited. In most cases, our experiments show that simple heuristic methods based on nearest-neighbors schemes are preferable over conceptually and computationally more complex methods. Observations from a user study furthermore indicate that recommendations based on heuristic methods were also well accepted by the study participants. To support future progress and reproducibility in this area, we publicly share the session-rec evaluation framework that was used in our research. |
Tasks | Recommendation Systems, Session-Based Recommendations |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12781v1 |
https://arxiv.org/pdf/1910.12781v1.pdf | |
PWC | https://paperswithcode.com/paper/empirical-analysis-of-session-based |
Repo | |
Framework | |
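The abstract above refers to simple heuristic methods based on nearest-neighbor schemes without describing them. The sketch below shows one generic session-based nearest-neighbor recommender; the Jaccard similarity, the scoring rule, and the function name `session_knn_scores` are assumptions chosen for illustration and are not necessarily the variants evaluated in the paper or implemented in session-rec.

```python
from collections import defaultdict

def session_knn_scores(current_session, past_sessions, k=2):
    """Score candidate items using the k past sessions most similar to the current one."""
    current = set(current_session)
    sims = []
    for past in past_sessions:
        items = set(past)
        union = current | items
        # Jaccard similarity between the current session and a past session
        sim = len(current & items) / len(union) if union else 0.0
        if sim > 0:
            sims.append((sim, items))
    # Keep only the k most similar neighbour sessions
    neighbours = sorted(sims, key=lambda pair: pair[0], reverse=True)[:k]
    scores = defaultdict(float)
    for sim, items in neighbours:
        for item in items - current:  # skip items already seen in this session
            scores[item] += sim
    # Highest score first; ties broken by item id for determinism
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

No long-term user profile is needed: the only input besides the ongoing session is a pool of past sessions.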
Two-Stage Session-based Recommendations with Candidate Rank Embeddings
Title | Two-Stage Session-based Recommendations with Candidate Rank Embeddings |
Authors | José Antonio Sánchez Rodríguez, Jui-Chieh Wu, Mustafa Khandwawala |
Abstract | Recent advances in session-based recommender systems have gained attention due to their potential of providing real-time personalized recommendations with high recall, especially when compared to traditional methods like matrix factorization and item-based collaborative filtering. Two of the most recent methods are the Short-Term Attention/Memory Priority Model for Session-based Recommendation (STAMP) and Neural Attentive Session-based Recommendation (NARM). However, when these two methods were applied to the similar-item recommendation dataset of Zalando (Fashion-Similar), they did not work out-of-the-box compared to a simple Collaborative-Filtering approach. Aiming to improve similar-item recommendation, we propose to concentrate efforts on enhancing the rank of the few most relevant items from the original recommendations, by employing the information of the session of the user encoded by an attention network. The efficacy of this strategy was confirmed when using a novel Candidate Rank Embedding that encodes the global ranking information of each candidate in the re-ranking process. Experimental results on Fashion-Similar show significant improvements over the baseline on Recall and MRR at 20, as well as improvements in Click Through Rate based on an online test. Additionally, the evaluation points to the potential of this method for the next-click prediction problem: when applied to STAMP and NARM, it improves the Recall and MRR at 20 on two publicly available real-world datasets. |
Tasks | Recommendation Systems, Session-Based Recommendations |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08284v1 |
https://arxiv.org/pdf/1908.08284v1.pdf | |
PWC | https://paperswithcode.com/paper/two-stage-session-based-recommendations-with |
Repo | |
Framework | |
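The abstract above reports Recall and MRR at 20. As a reminder of how these cutoff metrics are usually computed for next-click prediction with a single target item per test case, here is a minimal sketch; the function name and input layout are hypothetical, and the paper's own evaluation code may differ in detail.

```python
def recall_and_mrr_at_k(ranked_lists, targets, k=20):
    """Return (Recall@k, MRR@k) over test cases with one target item each."""
    hits = 0
    reciprocal_rank_sum = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            # Rank is 1-based; targets outside the top k contribute 0
            reciprocal_rank_sum += 1.0 / (top_k.index(target) + 1)
    n = len(targets)
    return hits / n, reciprocal_rank_sum / n
```

Recall@k only asks whether the target appears in the top k, while MRR@k also rewards placing it near the top, which is why a re-ranking stage can improve MRR even when Recall is unchanged.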
Augmentation Scheme for Dealing with Imbalanced Network Traffic Classification Using Deep Learning
Title | Augmentation Scheme for Dealing with Imbalanced Network Traffic Classification Using Deep Learning |
Authors | Ramin Hasibi, Matin Shokri, Mehdi Dehghan |
Abstract | One of the most important tasks in network management is identifying different types of traffic flows. As a result, a type of management service, called Network Traffic Classifier (NTC), has been introduced. One type of NTC that has gained huge attention in recent years applies deep learning on packets in order to classify flows. The Internet is an imbalanced environment, i.e., some classes of applications (e.g., HTTP) are a lot more populated than others. Additionally, one of the challenges in deep learning methods is that they do not perform well in imbalanced environments in terms of evaluation metrics such as precision, recall, and $\mathrm{F_1}$ measure. In order to solve this problem, we recommend the use of augmentation methods to balance the dataset. In this paper, we propose a novel data augmentation approach based on the use of Long Short Term Memory (LSTM) networks for generating traffic flow patterns and Kernel Density Estimation (KDE) for replicating the numerical features of each class. First, we use the LSTM network in order to learn and generate the sequence of packets in a flow for classes with less population. Then, we complete the features of the sequence by generating random values based on the distribution of a certain feature, which is estimated using KDE. Finally, we compare the training of a Convolutional Recurrent Neural Network (CRNN) on large-scale imbalanced, sampled, and augmented datasets. The contribution of our augmentation scheme is then evaluated on all of the datasets through measurements of precision, recall, and $\mathrm{F_1}$ measure for every class of application. The results demonstrate that our scheme is well suited for network traffic flow datasets and improves the performance of deep learning algorithms when it comes to the above-mentioned metrics. |
Tasks | Data Augmentation, Density Estimation |
Published | 2019-01-01 |
URL | http://arxiv.org/abs/1901.00204v1 |
http://arxiv.org/pdf/1901.00204v1.pdf | |
PWC | https://paperswithcode.com/paper/augmentation-scheme-for-dealing-with |
Repo | |
Framework | |
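The second step of the abstract above, generating random feature values from a KDE-estimated distribution, can be sketched as follows. This assumes a one-dimensional Gaussian kernel with a caller-supplied bandwidth; the kernel, the bandwidth choice, and the name `sample_from_kde` are illustrative assumptions, and the LSTM sequence generator is not shown.

```python
import random

def sample_from_kde(values, bandwidth, n_samples, seed=None):
    """Draw synthetic values from a 1-D Gaussian KDE fitted to `values`."""
    rng = random.Random(seed)
    # A Gaussian KDE is an equal-weight mixture of Gaussians centred on the
    # observations, so sampling means: pick an observation uniformly at
    # random, then add Gaussian noise with standard deviation `bandwidth`.
    return [rng.gauss(rng.choice(values), bandwidth) for _ in range(n_samples)]
```

For a minority class, `values` would hold the observed values of one numerical feature, and the returned samples would fill that feature in the generated flows.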
The many faces of deep learning
Title | The many faces of deep learning |
Authors | Raul Vicente |
Abstract | Deep learning has sparked a network of mutual interactions between different disciplines and AI. Naturally, each discipline focuses and interprets the workings of deep learning in different ways. This diversity of perspectives on deep learning, from neuroscience to statistical physics, is a rich source of inspiration that fuels novel developments in the theory and applications of machine learning. In this perspective, we collect and synthesize different intuitions scattered across several communities as for how deep learning works. In particular, we will briefly discuss the different perspectives that disciplines across mathematics, physics, computation, and neuroscience take on how deep learning does its tricks. Our discussion on each perspective is necessarily shallow due to the multiple views that had to be covered. The deepness in this case should come from putting all these faces of deep learning together in the reader’s mind, so that one can look at the same problem from different angles. |
Tasks | |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.10206v1 |
https://arxiv.org/pdf/1908.10206v1.pdf | |
PWC | https://paperswithcode.com/paper/the-many-faces-of-deep-learning |
Repo | |
Framework | |
CUPCF: Combining Users Preferences in Collaborative Filtering for Better Recommendation
Title | CUPCF: Combining Users Preferences in Collaborative Filtering for Better Recommendation |
Authors | Mostafa Khalaji, Nilufar Mohammadnejad |
Abstract | How do you make the best decision given the opinions and tastes of your friends and acquaintances? Recommender systems are used to solve such issues. The common algorithms use a similarity measure to predict active users’ tastes over a particular item. Because of the cold start and data sparsity problems, these systems cannot predict and suggest particular items to users. In this paper, we introduce a new recommender system that is able to find user preferences and provides recommendations based on them. Our proposed system, called CUPCF, combines two similarity measures in collaborative filtering to address the data sparsity and poor prediction (high prediction error rate) problems for better recommendation. The experimental results based on the MovieLens dataset show that, combined with the preferences of the user’s nearest neighbor, the error rate of the proposed system improves on that of a number of state-of-the-art recommendation methods. Furthermore, the results indicate the efficiency of CUPCF. The maximum improved error rate of the system is 15.5% and the maximum values of Accuracy, Precision and Recall of CUPCF are 0.91402, 0.91436 and 0.9974 respectively. |
Tasks | Recommendation Systems |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.05609v1 |
https://arxiv.org/pdf/1908.05609v1.pdf | |
PWC | https://paperswithcode.com/paper/cupcf-combining-users-preferences-in |
Repo | |
Framework | |
Minimax Testing of Identity to a Reference Ergodic Markov Chain
Title | Minimax Testing of Identity to a Reference Ergodic Markov Chain |
Authors | Geoffrey Wolfer, Aryeh Kontorovich |
Abstract | We exhibit an efficient procedure for testing, based on a single long state sequence, whether an unknown Markov chain is identical to or $\varepsilon$-far from a given reference chain. We obtain nearly matching (up to logarithmic factors) upper and lower sample complexity bounds for our notion of distance, which is based on total variation. Perhaps surprisingly, we discover that the sample complexity depends solely on the properties of the known reference chain and does not involve the unknown chain at all, which is not even assumed to be ergodic. |
Tasks | |
Published | 2019-01-31 |
URL | https://arxiv.org/abs/1902.00080v3 |
https://arxiv.org/pdf/1902.00080v3.pdf | |
PWC | https://paperswithcode.com/paper/minimax-testing-of-identity-to-a-reference |
Repo | |
Framework | |
Faster and Accurate Classification for JPEG2000 Compressed Images in Networked Applications
Title | Faster and Accurate Classification for JPEG2000 Compressed Images in Networked Applications |
Authors | Lahiru D. Chamain, Zhi Ding |
Abstract | JPEG2000 (j2k) is a highly popular format for image and video compression. With the rapidly growing applications of cloud-based image classification, most existing j2k-compatible schemes would stream compressed color images from the source before reconstruction at the processing center as inputs to deep CNNs. We propose to remove the computationally costly reconstruction step by training a deep CNN image classifier using the CDF 9/7 Discrete Wavelet Transformed (DWT) coefficients directly extracted from j2k-compressed images. We demonstrate additional computation savings by utilizing a shallower CNN to achieve classification of good accuracy in the DWT domain. Furthermore, we show that traditional augmentation transforms such as flipping/shifting are ineffective in the DWT domain and present different augmentation transformations to achieve more accurate classification without any additional cost. This way, faster and more accurate classification is possible for j2k encoded images without image reconstruction. Through experiments on CIFAR-10 and Tiny ImageNet data sets, we show that the performance of the proposed solution is consistent for image transmission over limited channel bandwidth. |
Tasks | Image Classification, Image Reconstruction |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.05638v1 |
https://arxiv.org/pdf/1909.05638v1.pdf | |
PWC | https://paperswithcode.com/paper/faster-and-accurate-classification-for |
Repo | |
Framework | |
Medical image super-resolution method based on dense blended attention network
Title | Medical image super-resolution method based on dense blended attention network |
Authors | Kewen Liu, Yuan Ma, Hongxia Xiong, Zejun Yan, Zhijun Zhou, Panpan Fang, Chaoyang Liu |
Abstract | In order to address the issue that medical images suffer from severe blurring caused by the lack of high-frequency details in the process of image super-resolution reconstruction, a novel medical image super-resolution method based on a dense neural network and a blended attention mechanism is proposed. The proposed method adds blended attention blocks to a dense neural network (DenseNet), so that the neural network can concentrate more attention on the regions and channels with sufficient high-frequency details. Batch normalization layers are removed to avoid loss of high-frequency texture details. The final high-resolution medical images are obtained using deconvolutional layers at the very end of the network as up-sampling operators. Experimental results show that the proposed method has an improvement of 0.05 dB to 11.25 dB and 0.6% to 14.04% on the peak signal-to-noise ratio (PSNR) metric and structural similarity index (SSIM) metric, respectively, compared with the mainstream image super-resolution methods. This work provides a new idea for theoretical studies of medical image super-resolution reconstruction. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05084v1 |
https://arxiv.org/pdf/1905.05084v1.pdf | |
PWC | https://paperswithcode.com/paper/medical-image-super-resolution-method-based |
Repo | |
Framework | |
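The improvements in the abstract above are stated in decibels of PSNR. For readers unfamiliar with the metric, the standard definition is PSNR = 10 * log10(MAX^2 / MSE); a minimal sketch over flat pixel sequences follows. The flat-list input and the default peak value of 255 (8-bit images) are assumptions for illustration, not details from the paper.

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the reconstruction
    mse = sum((r - s) ** 2 for r, s in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of a fraction of a decibel corresponds to a small relative reduction in mean squared error, while a gain of around 10 dB corresponds to roughly a tenfold reduction.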
SPLINE-Net: Sparse Photometric Stereo through Lighting Interpolation and Normal Estimation Networks
Title | SPLINE-Net: Sparse Photometric Stereo through Lighting Interpolation and Normal Estimation Networks |
Authors | Qian Zheng, Yiming Jia, Boxin Shi, Xudong Jiang, Ling-Yu Duan, Alex C. Kot |
Abstract | This paper solves the Sparse Photometric stereo through Lighting Interpolation and Normal Estimation using a generative Network (SPLINE-Net). SPLINE-Net contains a lighting interpolation network to generate dense lighting observations given a sparse set of lights as inputs, followed by a normal estimation network to estimate surface normals. Both networks are jointly constrained by the proposed symmetric and asymmetric loss functions to enforce the isotropic constraint and perform outlier rejection of global illumination effects. SPLINE-Net is verified to outperform existing methods for photometric stereo of general BRDFs by using only ten images of different lights instead of using nearly one hundred images. |
Tasks | |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04088v2 |
https://arxiv.org/pdf/1905.04088v2.pdf | |
PWC | https://paperswithcode.com/paper/spline-net-sparse-photometric-stereo-through |
Repo | |
Framework | |
Testing Preferential Domains Using Sampling
Title | Testing Preferential Domains Using Sampling |
Authors | Palash Dey, Swaprava Nath, Garima Shakya |
Abstract | A preferential domain is a collection of sets of preferences which are linear orders over a set of alternatives. These domains have been studied extensively in social choice theory due to both their practical importance and theoretical elegance. Examples of some extensively studied preferential domains include single peaked, single crossing, Euclidean, etc. In this paper, we study the sample complexity of testing whether a given preference profile is close to some specific domain. We consider two notions of closeness: (a) closeness via preferences, and (b) closeness via alternatives. We further explore the effect of assuming the outlier preferences/alternatives to be random (instead of arbitrary) on the sample complexity of the testing problem. In most cases, we show that the above testing problem can be solved with high probability for all commonly used domains by observing only a small number of samples (independent of the number of preferences, $n$, and often the number of alternatives, $m$). In the remaining few cases, we prove either impossibility results or an $\Omega(n)$ lower bound on the sample complexity. We complement our theoretical findings with extensive simulations to figure out the actual constant factors of our asymptotic sample complexity bounds. |
Tasks | |
Published | 2019-02-24 |
URL | http://arxiv.org/abs/1902.08930v1 |
http://arxiv.org/pdf/1902.08930v1.pdf | |
PWC | https://paperswithcode.com/paper/testing-preferential-domains-using-sampling |
Repo | |
Framework | |
Adversarial Generation and Encoding of Nested Texts
Title | Adversarial Generation and Encoding of Nested Texts |
Authors | Alon Rozental |
Abstract | In this paper we propose a new language model called AGENT, which stands for Adversarial Generation and Encoding of Nested Texts. AGENT is designed for encoding, generating and refining documents that consist of a long and coherent text, such as an entire book, provided they are hierarchically annotated (nested), i.e. divided into sentences, paragraphs and chapters. The core idea of our system is learning vector representations for each level of the text hierarchy (sentences, paragraphs, etc.), and training each such representation to perform 3 tasks: the task of reconstructing the sequence of vectors from a lower level that was used to create the representation, and generalized versions of the Masked Language Modeling (MLM) and “Next Sentence Prediction” tasks from BERT (Devlin et al. [2018]). Additionally we present a new adversarial model for long text generation and suggest a way to improve the coherence of the generated text by traversing its vector representation tree. |
Tasks | Language Modelling, Text Generation |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00238v1 |
https://arxiv.org/pdf/1906.00238v1.pdf | |
PWC | https://paperswithcode.com/paper/190600238 |
Repo | |
Framework | |
Facial Feature Embedded CycleGAN for VIS-NIR Translation
Title | Facial Feature Embedded CycleGAN for VIS-NIR Translation |
Authors | Huijiao Wang, Li Wang, Xulei Yang, Lei Yu, Haijian Zhang |
Abstract | VIS-NIR face recognition remains a challenging task due to the distinction between spectral components of two modalities and insufficient paired training data. Inspired by the CycleGAN, this paper presents a method aiming to translate VIS face images into fake NIR images whose distributions are intended to approximate those of true NIR images, which is achieved by proposing a new facial feature embedded CycleGAN. Firstly, to learn the particular feature of NIR domain while preserving common facial representation between VIS and NIR domains, we employ a general facial feature extractor (FFE) to replace the encoder in the original generator of CycleGAN. For implementing the facial feature extractor, herein the MobileFaceNet is pretrained on a VIS face database, and is able to extract effective features. Secondly, the domain-invariant feature learning is enhanced by considering a new pixel consistency loss. Lastly, we establish a new WHU VIS-NIR database which varies in face rotation and expressions to enrich the training data. Experimental results on the Oulu-CASIA NIR-VIS database and the WHU VIS-NIR database show that the proposed FFE-based CycleGAN (FFE-CycleGAN) outperforms state-of-the-art VIS-NIR face recognition methods and achieves 96.5% accuracy. |
Tasks | Face Recognition |
Published | 2019-04-20 |
URL | http://arxiv.org/abs/1904.09464v2 |
http://arxiv.org/pdf/1904.09464v2.pdf | |
PWC | https://paperswithcode.com/paper/facial-feature-embedded-cyclegan-for-vis-nir |
Repo | |
Framework | |
Long-Term Planning and Situational Awareness in OpenAI Five
Title | Long-Term Planning and Situational Awareness in OpenAI Five |
Authors | Jonathan Raiman, Susan Zhang, Filip Wolski |
Abstract | Understanding how knowledge about the world is represented within model-free deep reinforcement learning methods is a major challenge given the black box nature of its learning process within high-dimensional observation and action spaces. AlphaStar and OpenAI Five have shown that agents can be trained without any explicit hierarchical macro-actions to reach superhuman skill in games that require taking thousands of actions before reaching the final goal. Assessing the agent’s plans and game understanding becomes challenging given the lack of hierarchy or explicit representations of macro-actions in these models, coupled with the incomprehensible nature of the internal representations. In this paper, we study the distributed representations learned by OpenAI Five to investigate how game knowledge is gradually obtained over the course of training. We also introduce a general technique for learning a model from the agent’s hidden states to identify the formation of plans and subgoals. We show that the agent can learn situational similarity across actions, and find evidence of planning towards accomplishing subgoals minutes before they are executed. We perform a qualitative analysis of these predictions during the games against the DotA 2 world champions OG in April 2019. |
Tasks | Dota 2 |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06721v1 |
https://arxiv.org/pdf/1912.06721v1.pdf | |
PWC | https://paperswithcode.com/paper/long-term-planning-and-situational-awareness |
Repo | |
Framework | |
Future Data Helps Training: Modeling Future Contexts for Session-based Recommendation
Title | Future Data Helps Training: Modeling Future Contexts for Session-based Recommendation |
Authors | Fajie Yuan, Xiangnan He, Haochuan Jiang, Guibing Guo, Jian Xiong, Zhezhao Xu, Yilin Xiong |
Abstract | Session-based recommender systems have attracted much attention recently. To capture the sequential dependencies, existing methods resort either to data augmentation techniques or left-to-right style autoregressive training. Since these methods are aimed to model the sequential nature of user behaviors, they ignore the future data of a target interaction when constructing the prediction model for it. However, we argue that the future interactions after a target interaction, which are also available during training, provide valuable signal on user preference and can be used to enhance the recommendation quality. Properly integrating future data into model training, however, is non-trivial to achieve, since it disobeys machine learning principles and can easily cause data leakage. To this end, we propose a new encoder-decoder framework named Gap-filling based Recommender (GRec), which trains the encoder and decoder by a gap-filling mechanism. Specifically, the encoder takes a partially-complete session sequence (where some items are masked on purpose) as input, and the decoder predicts these masked items conditioned on the encoded representation. We instantiate the general GRec framework using a convolutional neural network with sparse kernels, giving consideration to both accuracy and efficiency. We conduct experiments on two real-world datasets covering short-, medium-, and long-range user sessions, showing that GRec significantly outperforms the state-of-the-art sequential recommendation methods. More empirical studies verify the high utility of modeling future contexts under our GRec framework. |
Tasks | Data Augmentation, Recommendation Systems, Session-Based Recommendations |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04473v4 |
https://arxiv.org/pdf/1906.04473v4.pdf | |
PWC | https://paperswithcode.com/paper/modeling-the-past-and-future-contexts-for |
Repo | |
Framework | |
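The gap-filling mechanism described in the abstract above starts from a session in which some items are masked. The sketch below shows only that data-preparation step: it builds the partially complete encoder input and the masked items to be predicted. The mask token, the mask ratio, and the function name are hypothetical; the abstract does not specify them, and the encoder and decoder networks themselves are not shown.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, not taken from the paper

def make_gap_filling_example(session, mask_ratio=0.3, seed=None):
    """Return (encoder_input, targets) for one gap-filling training example.

    encoder_input is the session with some items replaced by MASK;
    targets maps each masked position to the original item at that position.
    """
    if not session:
        raise ValueError("session must contain at least one item")
    rng = random.Random(seed)
    n_masked = max(1, int(len(session) * mask_ratio))
    masked_positions = sorted(rng.sample(range(len(session)), n_masked))
    encoder_input = list(session)
    targets = {}
    for pos in masked_positions:
        targets[pos] = session[pos]
        encoder_input[pos] = MASK
    return encoder_input, targets
```

Items on both sides of a masked position remain visible in the encoder input, so later interactions can inform the prediction while the masked item itself is never shown to the encoder.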