January 28, 2020

Paper Group ANR 872

Neuroscience-inspired online unsupervised learning algorithms. Deep geometric knowledge distillation with graphs. Robust contrastive learning and nonlinear ICA in the presence of outliers. Privacy-Preserving Tensor Factorization for Collaborative Health Data Analysis. Exploring Graph Neural Networks for Stock Market Predictions with Rolling Window …

Neuroscience-inspired online unsupervised learning algorithms

Title Neuroscience-inspired online unsupervised learning algorithms
Authors Cengiz Pehlevan, Dmitri B. Chklovskii
Abstract Although the currently popular deep learning networks achieve unprecedented performance on some tasks, the human brain still has a monopoly on general intelligence. Motivated by this and biological implausibility of deep learning networks, we developed a family of biologically plausible artificial neural networks (NNs) for unsupervised learning. Our approach is based on optimizing principled objective functions containing a term that matches the pairwise similarity of outputs to the similarity of inputs, hence the name - similarity-based. Gradient-based online optimization of such similarity-based objective functions can be implemented by NNs with biologically plausible local learning rules. Similarity-based cost functions and associated NNs solve unsupervised learning tasks such as linear dimensionality reduction, sparse and/or nonnegative feature extraction, blind nonnegative source separation, clustering and manifold learning.
Tasks Dimensionality Reduction
Published 2019-08-05
URL https://arxiv.org/abs/1908.01867v2
PDF https://arxiv.org/pdf/1908.01867v2.pdf
PWC https://paperswithcode.com/paper/neuroscience-inspired-online-unsupervised
Repo
Framework
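
The abstract describes objectives that match the pairwise similarity of outputs to the similarity of inputs but does not spell out a formula. The sketch below shows one way such a cost could be written, assuming dot-product similarity and a squared mismatch averaged over all pairs. The function names and the normalization are illustrative assumptions, not taken from the paper.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def similarity_matching_cost(inputs, outputs):
    """Mean squared mismatch between input and output pairwise similarities.

    Assumed form: (1 / T^2) * sum over t, s of (x_t . x_s - y_t . y_s)^2,
    where x_t are input vectors and y_t the corresponding output vectors.
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    T = len(inputs)
    total = 0.0
    for t in range(T):
        for s in range(T):
            diff = dot(inputs[t], inputs[s]) - dot(outputs[t], outputs[s])
            total += diff * diff
    return total / (T * T)


# Two 3-D inputs projected to 2-D: similarities are preserved, so the cost is 0.
xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
ys = [[1.0, 0.0], [0.0, 1.0]]
print(similarity_matching_cost(xs, ys))
```

An output that collapses every input to the zero vector loses all similarity structure and is penalized accordingly, which is the behavior a similarity-based objective is meant to have.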

Deep geometric knowledge distillation with graphs

Title Deep geometric knowledge distillation with graphs
Authors Carlos Lassance, Myriam Bontonou, Ghouthi Boukli Hacene, Vincent Gripon, Jian Tang, Antonio Ortega
Abstract In most cases deep learning architectures are trained disregarding the number of operations and energy consumption. However, some applications, like embedded systems, can be resource-constrained during inference. A popular approach to reduce the size of a deep learning architecture consists in distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space dimensions. In this work, we focus instead on relative knowledge distillation (RKD), which considers the geometry of the respective latent spaces, allowing for dimension-agnostic transfer of knowledge. Specifically, we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces. Using classical computer vision benchmarks, we demonstrate the ability of the proposed method to efficiently distill knowledge from the teacher to the student, leading to better accuracy for the same budget as compared to existing RKD alternatives.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.03080v1
PDF https://arxiv.org/pdf/1911.03080v1.pdf
PWC https://paperswithcode.com/paper/deep-geometric-knowledge-distillation-with
Repo
Framework

Robust contrastive learning and nonlinear ICA in the presence of outliers

Title Robust contrastive learning and nonlinear ICA in the presence of outliers
Authors Hiroaki Sasaki, Takashi Takenouchi, Ricardo Monti, Aapo Hyvärinen
Abstract Nonlinear independent component analysis (ICA) is a general framework for unsupervised representation learning, aimed at recovering the latent variables in data. Recent practical methods perform nonlinear ICA by solving a series of classification problems based on logistic regression. However, it is well-known that logistic regression is vulnerable to outliers, and thus the performance can be strongly weakened by outliers. In this paper, we first theoretically analyze nonlinear ICA models in the presence of outliers. Our analysis implies that estimation in nonlinear ICA can be seriously hampered when outliers exist on the tails of the (noncontaminated) target density, which happens in a typical case of contamination by outliers. We develop two robust nonlinear ICA methods based on the $\gamma$-divergence, which is a robust alternative to the KL-divergence in logistic regression. The proposed methods are shown to have desired robustness properties in the context of nonlinear ICA. We also experimentally demonstrate that the proposed methods are very robust and outperform existing methods in the presence of outliers. Finally, the proposed method is applied to ICA-based causal discovery and shown to find a plausible causal relationship on fMRI data.
Tasks Causal Discovery, Representation Learning, Unsupervised Representation Learning
Published 2019-11-01
URL https://arxiv.org/abs/1911.00265v1
PDF https://arxiv.org/pdf/1911.00265v1.pdf
PWC https://paperswithcode.com/paper/robust-contrastive-learning-and-nonlinear-ica
Repo
Framework

Privacy-Preserving Tensor Factorization for Collaborative Health Data Analysis

Title Privacy-Preserving Tensor Factorization for Collaborative Health Data Analysis
Authors Jing Ma, Qiuchen Zhang, Jian Lou, Joyce C. Ho, Li Xiong, Xiaoqian Jiang
Abstract Tensor factorization has been demonstrated as an efficient approach for computational phenotyping, where massive electronic health records (EHRs) are converted to concise and meaningful clinical concepts. While distributing the tensor factorization tasks to local sites can avoid direct data sharing, it still requires the exchange of intermediary results which could reveal sensitive patient information. Therefore, the challenge is how to jointly decompose the tensor under rigorous and principled privacy constraints, while still supporting the model’s interpretability. We propose DPFact, a privacy-preserving collaborative tensor factorization method for computational phenotyping using EHR. It embeds advanced privacy-preserving mechanisms with collaborative learning. Hospitals can keep their EHR database private but also collaboratively learn meaningful clinical concepts by sharing differentially private intermediary results. Moreover, DPFact handles the heterogeneous patient population using a structured sparsity term. In our framework, each hospital decomposes its local tensors, and sends the updated intermediary results with output perturbation every several iterations to a semi-trusted server which generates the phenotypes. The evaluation on both real-world and synthetic datasets demonstrated that under strict privacy constraints, our method is more accurate and communication-efficient than state-of-the-art baseline methods.
Tasks Computational Phenotyping
Published 2019-08-26
URL https://arxiv.org/abs/1908.09888v2
PDF https://arxiv.org/pdf/1908.09888v2.pdf
PWC https://paperswithcode.com/paper/privacy-preserving-tensor-factorization-for
Repo
Framework

Exploring Graph Neural Networks for Stock Market Predictions with Rolling Window Analysis

Title Exploring Graph Neural Networks for Stock Market Predictions with Rolling Window Analysis
Authors Daiki Matsunaga, Toyotaro Suzumura, Toshihiro Takahashi
Abstract Recently, there has been a surge of interest in the use of machine learning to help aid in the accurate predictions of financial markets. Despite the exciting advances in this cross-section of finance and AI, many of the current approaches are limited to using technical analysis to capture historical trends of each stock price and thus limited to certain experimental setups to obtain good prediction results. On the other hand, professional investors additionally use their rich knowledge of inter-market and inter-company relations to map the connectivity of companies and events, and use this map to make better market predictions. For instance, they would predict the movement of a certain company’s stock price based not only on its former stock price trends but also on the performance of its suppliers or customers, the overall industry, macroeconomic factors and trade policies. This paper investigates the effectiveness of work at the intersection of market predictions and graph neural networks, which hold the potential to mimic the ways in which investors make decisions by incorporating company knowledge graphs directly into the predictive model. The main goal of this work is to test the validity of this approach across different markets and longer time horizons for backtesting using rolling window analysis. In this work, we concentrate on the prediction of individual stock prices in the Japanese Nikkei 225 market over a period of roughly 20 years. For the knowledge graph, we use the Nikkei Value Search data, which is a rich dataset showing mainly supplier relations among Japanese and foreign companies. Our preliminary results show a 29.5% increase and a 2.2-fold increase in the return ratio and Sharpe ratio, respectively, when compared to the market benchmark, as well as a 6.32% increase and 1.3-fold increase, respectively, compared to the baseline LSTM model.
Tasks Knowledge Graphs
Published 2019-09-24
URL https://arxiv.org/abs/1909.10660v3
PDF https://arxiv.org/pdf/1909.10660v3.pdf
PWC https://paperswithcode.com/paper/exploring-graph-neural-networks-for-stock
Repo
Framework
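
The abstract reports results in terms of return ratio and Sharpe ratio under rolling window backtesting, without giving the exact protocol. The sketch below shows one common way to set this up, assuming fixed-size train and test windows that slide forward by one test period and a Sharpe ratio computed as mean excess return over its sample standard deviation. The window sizes, the `risk_free` parameter, and both function names are hypothetical.

```python
import statistics


def rolling_windows(n, train, test):
    """Yield (train_indices, test_indices) ranges that slide forward by `test`."""
    start = 0
    while start + train + test <= n:
        yield (range(start, start + train),
               range(start + train, start + train + test))
        start += test


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation."""
    excess = [r - risk_free for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / spread


# Ten periods, train on four, test on the next two, then slide by two.
for train_idx, test_idx in rolling_windows(10, train=4, test=2):
    print(list(train_idx), list(test_idx))
```

Each test window only ever sees a model fitted on strictly earlier data, which is the point of rolling window analysis for backtesting over long horizons.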

Learning to Scaffold the Development of Robotic Manipulation Skills

Title Learning to Scaffold the Development of Robotic Manipulation Skills
Authors Lin Shao, Toki Migimatsu, Jeannette Bohg
Abstract Learning contact-rich, robotic manipulation skills is a challenging problem due to the high-dimensionality of the state and action space as well as uncertainty from noisy sensors and inaccurate motor control. To combat these factors and achieve more robust manipulation, humans actively exploit contact constraints in the environment. By adopting a similar strategy, robots can also achieve more robust manipulation. In this paper, we enable a robot to autonomously modify its environment and thereby discover how to ease manipulation skill learning. Specifically, we provide the robot with fixtures that it can freely place within the environment. These fixtures provide hard constraints that limit the outcome of robot actions. Thereby, they funnel uncertainty from perception and motor control and scaffold manipulation skill learning. We propose a learning system that consists of two learning loops. In the outer loop, the robot positions the fixture in the workspace. In the inner loop, the robot learns a manipulation skill and after a fixed number of episodes, returns the reward to the outer loop. Thereby, the robot is incentivised to place the fixture such that the inner loop quickly achieves a high reward. We demonstrate our framework both in simulation and in the real world on three tasks: peg insertion, wrench manipulation and shallow-depth insertion. We show that manipulation skill learning is dramatically sped up through this way of scaffolding.
Tasks
Published 2019-11-03
URL https://arxiv.org/abs/1911.00969v1
PDF https://arxiv.org/pdf/1911.00969v1.pdf
PWC https://paperswithcode.com/paper/learning-to-scaffold-the-development-of
Repo
Framework

Processing Tweets for Cybersecurity Threat Awareness

Title Processing Tweets for Cybersecurity Threat Awareness
Authors Fernando Alves, Aurélien Bettini, Pedro M. Ferreira, Alysson Bessani
Abstract Receiving timely and relevant security information is crucial for maintaining a high-security level on an IT infrastructure. This information can be extracted from Open Source Intelligence published daily by users, security organisations, and researchers. In particular, Twitter has become an information hub for obtaining cutting-edge information about many subjects, including cybersecurity. This work proposes SYNAPSE, a Twitter-based streaming threat monitor that generates a continuously updated summary of the threat landscape related to a monitored infrastructure. Its tweet-processing pipeline is composed of filtering, feature extraction, binary classification, an innovative clustering strategy, and generation of Indicators of Compromise (IoCs). A quantitative evaluation considering all tweets from 80 accounts over more than 8 months (over 195,000 tweets) shows that our approach finds the majority of security-related tweets concerning an example IT infrastructure in a timely manner (true positive rate above 90%), incorrectly selects a small number of tweets as relevant (false positive rate under 10%), and summarises the results to very few IoCs per day. A qualitative evaluation of the IoCs generated by SYNAPSE demonstrates their relevance (based on the CVSS score and the availability of patches or exploits), and timeliness (based on threat disclosure dates from NVD).
Tasks
Published 2019-04-03
URL http://arxiv.org/abs/1904.02072v1
PDF http://arxiv.org/pdf/1904.02072v1.pdf
PWC https://paperswithcode.com/paper/processing-tweets-for-cybersecurity-threat
Repo
Framework

Deep learning on butterfly phenotypes tests evolution’s oldest mathematical model

Title Deep learning on butterfly phenotypes tests evolution’s oldest mathematical model
Authors Jennifer F. Hoyal Cuthill, Nicholas Guttenberg, Sophie Ledger, Robyn Crowther, Blanca Huertas
Abstract Traditional anatomical analyses captured only a fraction of real phenomic information. Here, we apply deep learning to quantify total phenotypic similarity across 2468 butterfly photographs, covering 38 subspecies from the polymorphic mimicry complex of $\textit{Heliconius erato}$ and $\textit{Heliconius melpomene}$. Euclidean phenotypic distances, calculated using a deep convolutional triplet network, demonstrate significant convergence between interspecies co-mimics. This quantitatively validates a key prediction of Müllerian mimicry theory, evolutionary biology’s oldest mathematical model. Phenotypic neighbor-joining trees are significantly correlated with wing pattern gene phylogenies, demonstrating objective, phylogenetically informative phenome capture. Comparative analyses indicate frequency-dependent, mutual convergence with coevolutionary exchange of wing pattern features. Therefore, phenotypic analysis supports reciprocal coevolution, predicted by classical mimicry theory but since disputed, and reveals mutual convergence as an intrinsic generator for the surprising diversity of Müllerian mimicry. This demonstrates that deep learning can generate phenomic spatial embeddings which enable quantitative tests of evolutionary hypotheses previously only testable subjectively.
Tasks
Published 2019-08-15
URL https://arxiv.org/abs/1908.05635v1
PDF https://arxiv.org/pdf/1908.05635v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-on-butterfly-phenotypes-tests
Repo
Framework
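
The abstract mentions Euclidean distances computed with a triplet network but not the loss used to train it. The sketch below shows the standard margin-based triplet loss on Euclidean distances as one plausible formulation. The margin value and function names are assumptions, and the paper may use a different variant (for example, squared distances).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that asks the positive to be closer to the anchor than the negative.

    The loss is zero once d(anchor, positive) + margin <= d(anchor, negative).
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)


# The positive is far closer to the anchor than the negative, so no penalty applies.
print(triplet_loss([0.0, 0.0], [1.0, 0.0], [10.0, 0.0]))
```

Training on such a loss pulls embeddings of images from the same group together and pushes others apart, which is what makes distances in the embedding space usable as a similarity measure.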

Real-time Convolutional Networks for Depth-based Human Pose Estimation

Title Real-time Convolutional Networks for Depth-based Human Pose Estimation
Authors Angel Martínez-González, Michael Villamizar, Olivier Canévet, Jean-Marc Odobez
Abstract We propose to combine recent Convolutional Neural Networks (CNN) models with depth imaging to obtain a reliable and fast multi-person pose estimation algorithm applicable to Human Robot Interaction (HRI) scenarios. Our hypothesis is that depth images contain fewer structures and are easier to process than RGB images while keeping the required information for human detection and pose inference, thus allowing the use of simpler networks for the task. Our contributions are threefold: (i) we propose a fast and efficient network based on residual blocks (called RPM) for body landmark localization from depth images; (ii) we created a public dataset DIH comprising more than 170k synthetic images of human bodies with various shapes and viewpoints as well as real (annotated) data for evaluation; (iii) we show that our model trained on synthetic data from scratch can perform well on real data, obtaining similar results to larger models initialized with pre-trained networks. It thus provides a good trade-off between performance and computation. Experiments on real data demonstrate the validity of our approach.
Tasks Human Detection, Multi-Person Pose Estimation, Pose Estimation
Published 2019-10-30
URL https://arxiv.org/abs/1910.13911v1
PDF https://arxiv.org/pdf/1910.13911v1.pdf
PWC https://paperswithcode.com/paper/real-time-convolutional-networks-for-depth
Repo
Framework

Unsupervised Natural Question Answering with a Small Model

Title Unsupervised Natural Question Answering with a Small Model
Authors Martin Andrews, Sam Witteveen
Abstract The recent (2019-02) demonstration of the power of huge language models such as GPT-2 to memorise the answers to factoid questions raises questions about the extent to which knowledge is being embedded directly within these large models. This short paper describes an architecture through which much smaller models can also answer such questions - by making use of ‘raw’ external knowledge. The contribution of this work is that the methods presented here rely on unsupervised learning techniques, complementing the unsupervised training of the Language Model. The goal of this line of research is to be able to add knowledge explicitly, without extensive training.
Tasks Language Modelling, Question Answering
Published 2019-11-19
URL https://arxiv.org/abs/1911.08340v1
PDF https://arxiv.org/pdf/1911.08340v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-natural-question-answering-with-1
Repo
Framework

Cascade Size Distributions and Why They Matter

Title Cascade Size Distributions and Why They Matter
Authors Rebekka Burkholz, John Quackenbush
Abstract How likely is it that a few initial node activations are amplified to produce large response cascades that span a considerable part of an entire network? Our answer to this question relies on the Independent Cascade Model for weighted directed networks. In using this model, most of our insights have been derived from the study of average effects. Here, we shift the focus to the full probability distribution of the final cascade size. This shift allows us to explore both typical cascade outcomes and improbable but relevant extreme events. We present an efficient message passing algorithm to compute the final cascade size distribution and activation probabilities of nodes conditional on the final cascade size. Our approach is exact on trees but can be applied to any network topology. It approximates locally tree-like networks well and can lead to surprisingly good performance on more dense networks, as we show using real-world data, including a miRNA-miRNA probabilistic interaction network for gastrointestinal cancer. We demonstrate the utility of our algorithms for clustering of nodes according to their functionality and influence maximization.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.05416v1
PDF https://arxiv.org/pdf/1909.05416v1.pdf
PWC https://paperswithcode.com/paper/cascade-size-distributions-and-why-they
Repo
Framework
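
To make the notion of a cascade size distribution concrete, the sketch below estimates it by plain Monte Carlo simulation of the Independent Cascade Model on a weighted directed graph. This is not the message passing algorithm the paper proposes; it is a simple sampling baseline, and the graph encoding and function names are assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_cascade(graph, seeds, rng):
    """Run one Independent Cascade realization and return the final cascade size.

    `graph` maps each node to a list of (neighbor, activation_probability) pairs.
    Each newly activated node gets a single chance to activate each inactive neighbor.
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def cascade_size_distribution(graph, seeds, runs=10000, seed=0):
    """Estimate P(final cascade size = k) from repeated simulations."""
    rng = random.Random(seed)
    counts = Counter(simulate_cascade(graph, seeds, rng) for _ in range(runs))
    return {size: count / runs for size, count in sorted(counts.items())}


# A small directed graph with a hypothetical edge probability of 0.5 everywhere.
toy = {"a": [("b", 0.5), ("c", 0.5)], "b": [("d", 0.5)], "c": [("d", 0.5)]}
print(cascade_size_distribution(toy, ["a"], runs=2000))
```

Sampling like this recovers the full distribution rather than only the mean, but rare extreme cascades need many runs to show up, which is one motivation for an analytical approach.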

Enabling Value Sensitive AI Systems through Participatory Design Fictions

Title Enabling Value Sensitive AI Systems through Participatory Design Fictions
Authors Q. Vera Liao, Michael Muller
Abstract Two general routes have been followed to develop artificial agents that are sensitive to human values: a top-down approach to encode values into the agents, and a bottom-up approach to learn from human actions, whether from real-world interactions or stories. Although both approaches have made exciting scientific progress, they may face challenges when applied to the current development practices of AI systems, which require the understanding of the specific domains and specific stakeholders involved. In this work, we bring together perspectives from the human-computer interaction (HCI) community, where designing technologies sensitive to user values has been a longstanding focus. We highlight several well-established areas focusing on developing empirical methods for inquiring into user values. Based on these methods, we propose participatory design fictions to study user values involved in AI systems and present preliminary results from a case study. With this paper, we invite the consideration of user-centered value inquiry and value learning.
Tasks
Published 2019-12-13
URL https://arxiv.org/abs/1912.07381v1
PDF https://arxiv.org/pdf/1912.07381v1.pdf
PWC https://paperswithcode.com/paper/enabling-value-sensitive-ai-systems-through
Repo
Framework

Recycled ADMM: Improving the Privacy and Accuracy of Distributed Algorithms

Title Recycled ADMM: Improving the Privacy and Accuracy of Distributed Algorithms
Authors Xueru Zhang, Mohammad Mahdi Khalili, Mingyan Liu
Abstract The alternating direction method of multipliers (ADMM) is a powerful method to solve decentralized convex optimization problems. In distributed settings, each node performs computation with its local data and the local results are exchanged among neighboring nodes in an iterative fashion. During this iterative process the leakage of data privacy arises and can accumulate significantly over many iterations, making it difficult to balance the privacy-accuracy tradeoff. We propose Recycled ADMM (R-ADMM), where a linear approximation is applied to every even iteration, its solution directly calculated using only results from the previous, odd iteration. It turns out that under such a scheme, half of the updates incur no privacy loss and require much less computation compared to the conventional ADMM. Moreover, R-ADMM can be further modified (MR-ADMM) such that each node independently determines its own penalty parameter over iterations. We obtain a sufficient condition for the convergence of both algorithms and provide the privacy analysis based on objective perturbation. It can be shown that the privacy-accuracy tradeoff can be improved significantly compared with conventional ADMM.
Tasks
Published 2019-10-08
URL https://arxiv.org/abs/1910.04581v1
PDF https://arxiv.org/pdf/1910.04581v1.pdf
PWC https://paperswithcode.com/paper/recycled-admm-improving-the-privacy-and
Repo
Framework

Parallel Split-Join Networks for Shared-account Cross-domain Sequential Recommendations

Title Parallel Split-Join Networks for Shared-account Cross-domain Sequential Recommendations
Authors Pengjie Ren, Yujie Lin, Muyang Ma, Zhumin Chen, Zhaochun Ren, Jun Ma, Maarten de Rijke
Abstract Sequential Recommendation (SR) has been attracting growing attention for its superiority in modeling sequential information of user behaviors. We study SR in a particularly challenging context, in which multiple individual users share a single account (shared-account) and in which user behaviors are available in multiple domains (cross-domain). These characteristics bring new challenges on top of those of the traditional SR task. On the one hand, we need to identify the behaviors by different user roles under the same account in order to recommend the right item to the right user role at the right time. On the other hand, we need to discriminate the behaviors from one domain that might be helpful to improve recommendations in the other domains. In this work, we formulate Shared-account Cross-domain Sequential Recommendation (SCSR) and propose a parallel modeling network to address the two challenges above, namely Parallel Split-Join Network (PSJNet). We present two variants of PSJNet, PSJNet-I and PSJNet-II. PSJNet-I is a “Split-by-Join” framework that splits the mixed representations to get role-specific representations and joins them to get cross-domain representations at each timestamp simultaneously. PSJNet-II is a “Split-and-Join” framework that first splits role-specific representations at each timestamp, and then joins the representations from all timestamps and all roles to get cross-domain representations. We use two datasets to assess the effectiveness of PSJNet. The first dataset is a simulated SCSR dataset obtained by randomly merging the Amazon logs from different users in movie and book domains. The second dataset is a real-world SCSR dataset built from smart TV watching logs of a commercial company. Our experimental results demonstrate that PSJNet outperforms state-of-the-art baselines in terms of MRR and Recall.
Tasks
Published 2019-10-06
URL https://arxiv.org/abs/1910.02448v3
PDF https://arxiv.org/pdf/1910.02448v3.pdf
PWC https://paperswithcode.com/paper/parallel-segregation-integration-networks-for
Repo
Framework

An Elastic Energy Minimization Framework for Mean Surface Calculation

Title An Elastic Energy Minimization Framework for Mean Surface Calculation
Authors Jozsef Molnar, Peter Horvath
Abstract As the continuation of the contour mean calculation - designed for averaging the manual delineations of 3D layer stack images - in this paper, the most important equations: a) the reparameterization equations to determine the minimizing diffeomorphism and b) the proper centroid calculation for the surface mean calculation are presented. The chosen representation space, Rescaled Position by Square root Normal (RPSN), is a real-valued vector space, invariant under the action of the reparameterization group, and the imposed L2 metric (used to define the distance function) has a well-defined meaning: the sum of the central second moments of the coordinate functions. For comparison purposes, the reparameterization equations for elastic surface matching, using the Square Root Normal Function (SRNF), are also provided. The reparameterization equations for these cases have formal similarity, albeit the targeted applications differ: the SRNF representation is suitable for shape analysis, whereas RPSN is more fit for the cases where all contextual information - including the relative translation between the constituent surfaces - is to be retained (but for the sake of theoretical completeness, the possibility of consistent relative displacement removal in the RPSN case is also addressed).
Tasks
Published 2019-07-31
URL https://arxiv.org/abs/1907.13557v1
PDF https://arxiv.org/pdf/1907.13557v1.pdf
PWC https://paperswithcode.com/paper/an-elastic-energy-minimization-framework-for
Repo
Framework