January 28, 2020

Paper Group ANR 968

Deep learning control of artificial avatars in group coordination tasks. On Learning Meaningful Code Changes via Neural Machine Translation. Sparse Solutions of a Class of Constrained Optimization Problems. Discourse Tagging for Scientific Evidence Extraction. Learning Landmark-Based Ensembles with Random Fourier Features and Gradient Boosting. Col …

Deep learning control of artificial avatars in group coordination tasks


Title	Deep learning control of artificial avatars in group coordination tasks
Authors	Maria Lombardi, Davide Liuzza, Mario di Bernardo
Abstract	In many joint-action scenarios, humans and robots have to coordinate their movements to accomplish a given shared task. Lifting an object together, sawing a wood log, transferring objects from a point to another are all examples where motor coordination between humans and machines is a crucial requirement. While the dyadic coordination between a human and a robot has been studied in previous investigations, the multi-agent scenario in which a robot has to be integrated into a human group still remains a less explored field of research. In this paper we discuss how to synthesise an artificial agent able to coordinate its motion in human ensembles. Driven by a control architecture based on deep reinforcement learning, such an artificial agent will be able to autonomously move itself in order to synchronise its motion with that of the group while exhibiting human-like kinematic features. As a paradigmatic coordination task we take a group version of the so-called mirror-game which is highlighted as a good benchmark in the human movement literature.
Tasks
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04656v1
PDF	https://arxiv.org/pdf/1906.04656v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-control-of-artificial-avatars
Repo
Framework

On Learning Meaningful Code Changes via Neural Machine Translation


Title	On Learning Meaningful Code Changes via Neural Machine Translation
Authors	Michele Tufano, Jevgenija Pantiuchina, Cody Watson, Gabriele Bavota, Denys Poshyvanyk
Abstract	Recent years have seen the rise of Deep Learning (DL) techniques applied to source code. Researchers have exploited DL to automate several development and maintenance tasks, such as writing commit messages, generating comments and detecting vulnerabilities among others. One of the long lasting dreams of applying DL to source code is the possibility to automate non-trivial coding activities. While some steps in this direction have been taken (e.g., learning how to fix bugs), there is still a glaring lack of empirical evidence on the types of code changes that can be learned and automatically applied by DL. Our goal is to make this first important step by quantitatively and qualitatively investigating the ability of a Neural Machine Translation (NMT) model to learn how to automatically apply code changes implemented by developers during pull requests. We train and experiment with the NMT model on a set of 236k pairs of code components before and after the implementation of the changes provided in the pull requests. We show that, when applied in a narrow enough context (i.e., small/medium-sized pairs of methods before/after the pull request changes), NMT can automatically replicate the changes implemented by developers during pull requests in up to 36% of the cases. Moreover, our qualitative analysis shows that the model is capable of learning and replicating a wide variety of meaningful code changes, especially refactorings and bug-fixing activities. Our results pave the way for novel research in the area of DL on code, such as the automatic learning and applications of refactoring.
Tasks	Machine Translation
Published	2019-01-25
URL	http://arxiv.org/abs/1901.09102v1
PDF	http://arxiv.org/pdf/1901.09102v1.pdf
PWC	https://paperswithcode.com/paper/on-learning-meaningful-code-changes-via
Repo
Framework

Sparse Solutions of a Class of Constrained Optimization Problems


Title	Sparse Solutions of a Class of Constrained Optimization Problems
Authors	Lei Yang, Xiaojun Chen, Shuhuang Xiang
Abstract	In this paper, we consider a well-known sparse optimization problem that aims to find a sparse solution of a possibly noisy underdetermined system of linear equations. Mathematically, it can be modeled in a unified manner by minimizing $\|\mathbf{x}\|_p^p$ subject to $\|A\mathbf{x}-\mathbf{b}\|_q\leq\sigma$ for given $A \in \mathbb{R}^{m \times n}$, $\mathbf{b}\in\mathbb{R}^m$, $\sigma \geq 0$, $0\leq p\leq 1$ and $q \geq 1$. We then study various properties of the optimal solutions of this problem. Specifically, without any condition on the matrix $A$, we provide upper bounds in cardinality and infinity norm for the optimal solutions, and show that all optimal solutions must be on the boundary of the feasible set when $0<p<1$. Moreover, for $q \in \{1,\infty\}$, we show that the problem with $0<p<1$ has a finite number of optimal solutions and prove that there exists $0<p^*<1$ such that the solution set of the problem with any $0<p<p^*$ is contained in the solution set of the problem with $p=0$ and there further exists $0<\bar{p}<p^*$ such that the solution set of the problem with any $0<p\leq\bar{p}$ remains unchanged. An estimation of such $p^*$ is also provided. In addition, to solve the constrained nonconvex non-Lipschitz $L_p$-$L_1$ problem ($0<p<1$ and $q=1$), we propose a smoothing penalty method and show that, under some mild conditions, any cluster point of the sequence generated is a KKT point of our problem. Some numerical examples are given to implicitly illustrate the theoretical results and show the efficiency of the proposed algorithm for the constrained $L_p$-$L_1$ problem under different noises.
Tasks	Denoising
Published	2019-07-01
URL	https://arxiv.org/abs/1907.00880v2
PDF	https://arxiv.org/pdf/1907.00880v2.pdf
PWC	https://paperswithcode.com/paper/the-constrained-l_p-l_q-basis-pursuit
Repo
Framework
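
To make the problem statement in the abstract above concrete, the following is a minimal sketch that evaluates the objective $\|\mathbf{x}\|_p^p$ and the constraint $\|A\mathbf{x}-\mathbf{b}\|_q\leq\sigma$ on a toy system. It is not the authors' smoothing penalty method; the function names (`lp_objective`, `residual_norm`, `is_feasible`) and the toy data are assumptions made for illustration only.

```python
import math


def lp_objective(x, p):
    """Return ||x||_p^p for 0 <= p <= 1; p = 0 counts the nonzero entries."""
    if p == 0:
        return float(sum(1 for v in x if v != 0))
    return sum(abs(v) ** p for v in x)


def residual_norm(A, x, b, q):
    """Return ||Ax - b||_q for q >= 1 (q = math.inf gives the max norm)."""
    r = [sum(a * v for a, v in zip(row, x)) - bi for row, bi in zip(A, b)]
    if q == math.inf:
        return max(abs(v) for v in r)
    return sum(abs(v) ** q for v in r) ** (1.0 / q)


def is_feasible(A, x, b, q, sigma, tol=1e-12):
    """Check the constraint ||Ax - b||_q <= sigma up to a small tolerance."""
    return residual_norm(A, x, b, q) <= sigma + tol


# Toy underdetermined system (one equation, two unknowns): x1 + x2 = 1.
A = [[1.0, 1.0]]
b = [1.0]
sparse = [1.0, 0.0]   # one nonzero entry
dense = [0.5, 0.5]    # two nonzero entries

for name, x in (("sparse", sparse), ("dense", dense)):
    print(name,
          "feasible:", is_feasible(A, x, b, q=1, sigma=0.0),
          "p=1:", lp_objective(x, 1.0),
          "p=0.5:", lp_objective(x, 0.5),
          "p=0:", lp_objective(x, 0))
```

Both points are feasible and tie under $p=1$, but for $p=0.5$ and $p=0$ the sparse point has the smaller objective, which is the behaviour that motivates choosing $p<1$.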

Discourse Tagging for Scientific Evidence Extraction


Title	Discourse Tagging for Scientific Evidence Extraction
Authors	Xiangci Li, Gully Burns, Nanyun Peng
Abstract	The biomedical scientific literature comprises a crucial, sometimes life-saving, natural language resource whose size is accelerating over time. The information in this resource tends to follow a style of discourse that is intended to provide scientific explanations for various pieces of evidence derived from experimental findings. Studying the rhetorical structure of the narrative discourse could enable more powerful information extraction methods to automatically construct models of scientific argument from full-text papers. In this paper, we apply richly contextualized deep representation learning to the analysis of scientific discourse structures as a clause-tagging task. We improve the current state-of-the-art clause-level sequence tagging over text clauses for a set of discourse types (e.g. “hypothesis”, “result”, “implication”, etc.) on scientific paragraphs. Our model uses contextualized embeddings, word-to-clause encoder, and clause-level sequence tagging models and achieves F1 performance of 0.784.
Tasks	Representation Learning
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04758v1
PDF	https://arxiv.org/pdf/1909.04758v1.pdf
PWC	https://paperswithcode.com/paper/discourse-tagging-for-scientific-evidence
Repo
Framework

Learning Landmark-Based Ensembles with Random Fourier Features and Gradient Boosting


Title	Learning Landmark-Based Ensembles with Random Fourier Features and Gradient Boosting
Authors	Léo Gautheron, Pascal Germain, Amaury Habrard, Emilie Morvant, Marc Sebban, Valentina Zantedeschi
Abstract	We propose a Gradient Boosting algorithm for learning an ensemble of kernel functions adapted to the task at hand. Unlike state-of-the-art Multiple Kernel Learning techniques that make use of a pre-computed dictionary of kernel functions to select from, at each iteration we fit a kernel by approximating it as a weighted sum of Random Fourier Features (RFF) and by optimizing their barycenter. This allows us to obtain a more versatile method, easier to setup and likely to have better performance. Our study builds on a recent result showing one can learn a kernel from RFF by computing the minimum of a PAC-Bayesian bound on the kernel alignment generalization loss, which is obtained efficiently from a closed-form solution. We conduct an experimental analysis to highlight the advantages of our method w.r.t. both Boosting-based and kernel-learning state-of-the-art methods.
Tasks
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06203v1
PDF	https://arxiv.org/pdf/1906.06203v1.pdf
PWC	https://paperswithcode.com/paper/learning-landmark-based-ensembles-with-random
Repo
Framework
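
The abstract above builds on Random Fourier Features (RFF). As background, here is a minimal sketch of the standard RFF approximation of a Gaussian kernel $k(x,y)=\exp(-\gamma\|x-y\|^2)$. It does not implement the paper's landmark-based gradient boosting or its PAC-Bayesian kernel learning; the helper names (`make_rff`, `rbf_kernel`, `rff_kernel`) and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def make_rff(dim, n_features, gamma, seed=0):
    """Build a random feature map phi with phi(x).phi(y) ~ exp(-gamma*||x-y||^2)."""
    rng = random.Random(seed)
    # The spectral density of this Gaussian kernel is N(0, 2*gamma*I).
    std = math.sqrt(2.0 * gamma)
    W = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    B = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(W, B)]

    return phi


def rbf_kernel(x, y, gamma):
    """Exact Gaussian kernel value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def rff_kernel(phi, x, y):
    """Approximate kernel value as an inner product of random features."""
    return sum(a * b for a, b in zip(phi(x), phi(y)))


phi = make_rff(dim=2, n_features=2000, gamma=0.5, seed=0)
x, y = [0.0, 0.0], [1.0, 0.5]
print("exact :", rbf_kernel(x, y, 0.5))
print("approx:", rff_kernel(phi, x, y))
```

The approximation error shrinks roughly like one over the square root of the number of features, so a few thousand features already track the exact kernel closely in this toy case.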

Coloring Big Graphs with AlphaGoZero


Title	Coloring Big Graphs with AlphaGoZero
Authors	Jiayi Huang, Mostofa Patwary, Gregory Diamos
Abstract	We show that recent innovations in deep reinforcement learning can effectively color very large graphs – a well-known NP-hard problem with clear commercial applications. Because the Monte Carlo Tree Search with Upper Confidence Bound algorithm used in AlphaGoZero can improve the performance of a given heuristic, our approach allows deep neural networks trained using high performance computing (HPC) technologies to transform computation into improved heuristics with zero prior knowledge. Key to our approach is the introduction of a novel deep neural network architecture (FastColorNet) that has access to the full graph context and requires $O(V)$ time and space to color a graph with $V$ vertices, which enables scaling to very large graphs that arise in real applications like parallel computing, compilers, numerical solvers, and design automation, among others. As a result, we are able to learn new state of the art heuristics for graph coloring.
Tasks
Published	2019-02-26
URL	https://arxiv.org/abs/1902.10162v3
PDF	https://arxiv.org/pdf/1902.10162v3.pdf
PWC	https://paperswithcode.com/paper/coloring-big-graphs-with-alphagozero
Repo
Framework
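
For readers unfamiliar with graph coloring heuristics of the kind the abstract above aims to improve on, the following is a minimal sketch of a classical greedy coloring baseline. It is not FastColorNet and involves no learning or tree search; the function names and the example graph are assumptions made for illustration.

```python
def greedy_coloring(adjacency, order=None):
    """Color vertices one by one with the smallest color unused by neighbors."""
    colors = {}
    for v in (order if order is not None else adjacency):
        used = {colors[u] for u in adjacency[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


def is_proper(adjacency, colors):
    """Return True if no edge joins two vertices of the same color."""
    return all(colors[u] != colors[v] for u in adjacency for v in adjacency[u])


# A 5-cycle: an odd cycle, so it needs exactly three colors.
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(cycle5)
print(coloring, "colors used:", max(coloring.values()) + 1)
```

The result of a greedy pass depends on the vertex order, which is the kind of decision a learned heuristic could, in principle, make better than a fixed rule.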

Minimax Semiparametric Learning With Approximate Sparsity


Title	Minimax Semiparametric Learning With Approximate Sparsity
Authors	Jelena Bradic, Victor Chernozhukov, Whitney K. Newey, Yinchu Zhu
Abstract	Many objects of interest can be expressed as a linear, mean square continuous functional of a least squares projection (regression). Often the regression may be high dimensional, depending on many variables. This paper gives minimal conditions for root-n consistent and efficient estimation of such objects when the regression and the Riesz representer of the functional are approximately sparse and the sum of the absolute value of the coefficients is bounded. The approximately sparse functions we consider are those where an approximation by some $t$ regressors has root mean square error less than or equal to $Ct^{-\xi}$ for $C, \xi>0$. We show that a necessary condition for efficient estimation is that the sparse approximation rate $\xi_{1}$ for the regression and the rate $\xi_{2}$ for the Riesz representer satisfy $\max\{\xi_{1},\xi_{2}\}>1/2$. This condition is stronger than the corresponding condition $\xi_{1}+\xi_{2}>1/2$ for Hölder classes of functions. We also show that Lasso based, cross-fit, debiased machine learning estimators are asymptotically efficient under these conditions. In addition we show efficiency of an estimator without cross-fitting when the functional depends on the regressors and the regression sparse approximation rate satisfies $\xi_{1}>1/2$.
Tasks
Published	2019-12-27
URL	https://arxiv.org/abs/1912.12213v1
PDF	https://arxiv.org/pdf/1912.12213v1.pdf
PWC	https://paperswithcode.com/paper/minimax-semiparametric-learning-with
Repo
Framework

A Nonparametric Bayesian Model for Sparse Temporal Multigraphs


Title	A Nonparametric Bayesian Model for Sparse Temporal Multigraphs
Authors	Elahe Ghalebi, Hamidreza Mahyar, Radu Grosu, Graham W. Taylor, Sinead A. Williamson
Abstract	As the availability and importance of temporal interaction data–such as email communication–increases, it becomes increasingly important to understand the underlying structure that underpins these interactions. Often these interactions form a multigraph, where we might have multiple interactions between two entities. Such multigraphs tend to be sparse yet structured, and their distribution often evolves over time. Existing statistical models with interpretable parameters can capture some, but not all, of these properties. We propose a dynamic nonparametric model for interaction multigraphs that combines the sparsity of edge-exchangeable multigraphs with dynamic clustering patterns that tend to reinforce recent behavioral patterns. We show that our method yields improved held-out likelihood over stationary variants, and impressive predictive performance against a range of state-of-the-art dynamic graph models.
Tasks
Published	2019-10-11
URL	https://arxiv.org/abs/1910.05098v1
PDF	https://arxiv.org/pdf/1910.05098v1.pdf
PWC	https://paperswithcode.com/paper/a-nonparametric-bayesian-model-for-sparse
Repo
Framework

Deep Learning-Based Intrusion Detection System for Advanced Metering Infrastructure


Title	Deep Learning-Based Intrusion Detection System for Advanced Metering Infrastructure
Authors	Zakaria El Mrabet, Mehdi Ezzari, Hassan Elghazi, Badr Abou El Majd
Abstract	Smart grid is an alternative solution to the conventional power grid which harnesses the power of information technology to save energy and meet today’s environmental requirements. Due to the inherent vulnerabilities in information technology, the smart grid is exposed to a wide variety of threats that could be translated into cyber-attacks. In this paper, we develop a deep learning-based intrusion detection system to defend against cyber-attacks in the advanced metering infrastructure network. The proposed machine learning approach is trained and tested extensively on an empirical industrial dataset which is composed of several attack categories including scanning, buffer overflow, and denial of service attacks. Then, an experimental comparison in terms of detection accuracy is conducted to evaluate the performance of the proposed approach against Naive Bayes, Support Vector Machine, and Random Forest. The obtained results suggest that the proposed approach produces optimal results compared to the other algorithms. Finally, we propose a network architecture to deploy the proposed anomaly-based intrusion detection system across the Advanced Metering Infrastructure network. In addition, we propose a network security architecture composed of two types of intrusion detection system, host-based and network-based, deployed across the Advanced Metering Infrastructure network to inspect the traffic and detect malicious traffic at all levels.
Tasks	Intrusion Detection
Published	2019-12-31
URL	https://arxiv.org/abs/2001.00916v1
PDF	https://arxiv.org/pdf/2001.00916v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-intrusion-detection
Repo
Framework

Learning Dense Voxel Embeddings for 3D Neuron Reconstruction


Title	Learning Dense Voxel Embeddings for 3D Neuron Reconstruction
Authors	Kisuk Lee, Ran Lu, Kyle Luther, H. Sebastian Seung
Abstract	We show dense voxel embeddings learned via deep metric learning can be employed to produce a highly accurate segmentation of neurons from 3D electron microscopy images. A metric graph on an arbitrary set of short and long-range edges can be constructed from the dense embeddings generated by a convolutional network. Partitioning the metric graph with long-range affinities as repulsive constraints can produce an initial segmentation with high precision, with substantial improvements on very thin objects. The convolutional embedding net is reused without any modification to agglomerate the systematic splits caused by complex “self-touching” objects. Our proposed method achieves state-of-the-art accuracy on the challenging problem of 3D neuron reconstruction from the brain images acquired by serial section electron microscopy. Our alternative, object-centered representation could be more generally useful for other computational tasks in automated neural circuit reconstruction.
Tasks	Metric Learning
Published	2019-09-21
URL	https://arxiv.org/abs/1909.09872v1
PDF	https://arxiv.org/pdf/1909.09872v1.pdf
PWC	https://paperswithcode.com/paper/190909872
Repo
Framework

Deep Slice Interpolation via Marginal Super-Resolution, Fusion and Refinement


Title	Deep Slice Interpolation via Marginal Super-Resolution, Fusion and Refinement
Authors	Cheng Peng, Wei-An Lin, Haofu Liao, Rama Chellappa, S. Kevin Zhou
Abstract	We propose a marginal super-resolution (MSR) approach based on 2D convolutional neural networks (CNNs) for interpolating an anisotropic brain magnetic resonance scan along the highly under-sampled direction, which is assumed to be axial without loss of generality. Previous methods for slice interpolation only consider data from pairs of adjacent 2D slices. The possibility of fusing information from the direction orthogonal to the 2D slices remains unexplored. Our approach performs MSR in both sagittal and coronal directions, which provides an initial estimate for slice interpolation. The interpolated slices are then fused and refined in the axial direction for improved consistency. Since MSR consists of only 2D operations, it is more feasible in terms of GPU memory consumption and requires fewer training samples compared to 3D CNNs. Our experiments demonstrate that the proposed method outperforms traditional linear interpolation and baseline 2D/3D CNN-based approaches. We conclude by showcasing the method’s practical utility in estimating brain volumes from under-sampled brain MR scans through semantic segmentation.
Tasks	Semantic Segmentation, Super-Resolution
Published	2019-08-15
URL	https://arxiv.org/abs/1908.05599v1
PDF	https://arxiv.org/pdf/1908.05599v1.pdf
PWC	https://paperswithcode.com/paper/deep-slice-interpolation-via-marginal-super
Repo
Framework

DeepRICH: Learning Deeply Cherenkov Detectors


Title	DeepRICH: Learning Deeply Cherenkov Detectors
Authors	Cristiano Fanelli, Jary Pomponi
Abstract	Imaging Cherenkov detectors are widely used for particle identification (PID) in nuclear and particle physics experiments, where developing fast reconstruction algorithms is becoming of paramount importance to allow for near real time calibration and data quality control, as well as to speed up offline analysis of large amounts of data. In this paper we present DeepRICH, a novel deep learning algorithm for fast reconstruction which can be applied to different imaging Cherenkov detectors. The core of our architecture is a generative model which leverages a custom Variational Auto-encoder (VAE) combined with Maximum Mean Discrepancy (MMD), with a Convolutional Neural Network (CNN) extracting features from the space of the latent variables for classification. A thorough comparison with the simulation/reconstruction package FastDIRC is discussed in the text. DeepRICH has the advantage of bypassing low-level details needed to build a likelihood, allowing for a sensible improvement in computation time at potentially the same reconstruction performance as other established reconstruction algorithms. In the conclusions, we address the implications and potential of this work, discussing possible future extensions and generalization.
Tasks	Calibration
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11717v2
PDF	https://arxiv.org/pdf/1911.11717v2.pdf
PWC	https://paperswithcode.com/paper/deeprich-learning-deeply-cherenkov-detectors
Repo
Framework
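
The abstract above mentions Maximum Mean Discrepancy (MMD) as one ingredient of the model. As background, here is a minimal sketch of the standard biased empirical estimate of squared MMD with a Gaussian kernel. How DeepRICH combines MMD with its VAE, and which kernel or bandwidth it uses, is not stated in the abstract; the function names, the bandwidth and the sample data below are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def mmd2_biased(xs, ys, bandwidth=1.0):
    """Biased estimate of MMD^2: mean k(x,x') + mean k(y,y') - 2 * mean k(x,y)."""
    def mean_k(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


# Two toy one-dimensional samples: identical versus clearly shifted.
same = [[0.0], [1.0], [2.0]]
shifted = [[5.0], [6.0], [7.0]]
print("identical samples:", mmd2_biased(same, same))
print("shifted samples  :", mmd2_biased(same, shifted))
```

The estimate is zero for identical samples and grows as the two samples move apart, which is why it can serve as a penalty that pulls one distribution towards another.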

Learning to Localize: A 3D CNN Approach to User Positioning in Massive MIMO-OFDM Systems


Title	Learning to Localize: A 3D CNN Approach to User Positioning in Massive MIMO-OFDM Systems
Authors	Chi Wu, Xinping Yi, Wenjin Wang, Li You, Qing Huang, Xiqi Gao
Abstract	In this paper, we consider the user positioning problem in the massive multiple-input multiple-output (MIMO) orthogonal frequency-division multiplexing (OFDM) system with a uniform planar antenna (UPA) array. Taking advantage of the UPA array geometry and wide bandwidth, we advocate the use of the angle-delay channel power matrix (ADCPM) as a new type of fingerprint to replace the traditional ones. The ADCPM embeds the stable and stationary multipath characteristics, e.g. delay, power, and angle in the vertical and horizontal directions, which are beneficial to positioning. Taking ADCPM fingerprints as the inputs, we propose a novel three-dimensional (3D) convolutional neural network (CNN) enabled learning method to localize users’ 3D positions. In particular, such a 3D CNN model consists of a convolution refinement module to refine the elementary feature maps from the ADCPM fingerprints, three extended Inception modules to extract the advanced feature maps, and a regression module to estimate the 3D positions. By intensive simulations, the proposed 3D CNN-enabled positioning method is demonstrated to achieve higher positioning accuracy than the traditional searching-based ones, with reduced computational complexity and storage overhead, and the ADCPM fingerprints are more robust to noise contamination.
Tasks
Published	2019-10-27
URL	https://arxiv.org/abs/1910.12378v2
PDF	https://arxiv.org/pdf/1910.12378v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-localize-a-3d-cnn-approach-to
Repo
Framework

Building Deep, Equivariant Capsule Networks


Title	Building Deep, Equivariant Capsule Networks
Authors	Sairaam Venkatraman, S. Balasubramanian, R. Raghunatha Sarma
Abstract	Capsule networks are constrained by the parameter-expensive nature of their layers, and the general lack of provable equivariance guarantees. We present a variation of capsule networks that aims to remedy this. We identify that learning all pair-wise part-whole relationships between capsules of successive layers is inefficient. Further, we also realise that the choice of prediction networks and the routing mechanism are both key to equivariance. Based on these, we propose an alternative framework for capsule networks that learns to projectively encode the manifold of pose-variations, termed the space-of-variation (SOV), for every capsule-type of each layer. This is done using a trainable, equivariant function defined over a grid of group-transformations. Thus, the prediction-phase of routing involves projection into the SOV of a deeper capsule using the corresponding function. As a specific instantiation of this idea, and also in order to reap the benefits of increased parameter-sharing, we use type-homogeneous group-equivariant convolutions of shallower capsules in this phase. We also introduce an equivariant routing mechanism based on degree-centrality. We show that this particular instance of our general model is equivariant, and hence preserves the compositional representation of an input under transformations. We conduct several experiments on standard object-classification datasets that showcase the increased transformation-robustness, as well as general performance, of our model to several capsule baselines.
Tasks	Object Classification
Published	2019-08-04
URL	https://arxiv.org/abs/1908.01300v3
PDF	https://arxiv.org/pdf/1908.01300v3.pdf
PWC	https://paperswithcode.com/paper/building-deep-equivariant-capsule-networks
Repo
Framework

SGVAE: Sequential Graph Variational Autoencoder


Title	SGVAE: Sequential Graph Variational Autoencoder
Authors	Bowen Jing, Ethan A. Chi, Jillian Tang
Abstract	Generative models of graphs are well-known, but many existing models are limited in scalability and expressivity. We present a novel sequential graphical variational autoencoder operating directly on graphical representations of data. In our model, the encoding and decoding of a graph is framed as a sequential deconstruction and construction process, respectively, enabling the learning of a latent space. Experiments on a cycle dataset show promise, but highlight the need for a relaxation of the distribution over node permutations.
Tasks
Published	2019-12-17
URL	https://arxiv.org/abs/1912.07800v1
PDF	https://arxiv.org/pdf/1912.07800v1.pdf
PWC	https://paperswithcode.com/paper/sgvae-sequential-graph-variational
Repo
Framework