January 28, 2020

Paper Group ANR 895

3D point cloud registration with shape constraint


Title	3D point cloud registration with shape constraint
Authors	Swapna Agarwal, Brojeshwar Bhowmick
Abstract	In this paper, a shape-constrained iterative algorithm is proposed to register a rigid template point-cloud to a given reference point-cloud. The algorithm embeds a shape-based similarity constraint into the principle of gravitation. The shape-constrained gravitation, as induced by the reference, controls the movement of the template such that at each iteration, the template better aligns with the reference in terms of shape. This constraint enables alignment under difficult conditions introduced by change (presence of outliers and/or missing parts), translation, rotation and scaling. We discuss efficient implementation techniques that need minimal manual intervention. The registration is shown to be useful for change detection in the 3D point-cloud. The algorithm is compared with three state-of-the-art registration approaches. The experiments are done on both synthetic and real-world data. The proposed algorithm is shown to perform better in the presence of large rotations, structured and unstructured outliers, and missing data.
Tasks	Point Cloud Registration
Published	2019-02-04
URL	http://arxiv.org/abs/1902.01061v1
PDF	http://arxiv.org/pdf/1902.01061v1.pdf
PWC	https://paperswithcode.com/paper/3d-point-cloud-registration-with-shape
Repo
Framework

ATTACK2VEC: Leveraging Temporal Word Embeddings to Understand the Evolution of Cyberattacks


Title	ATTACK2VEC: Leveraging Temporal Word Embeddings to Understand the Evolution of Cyberattacks
Authors	Yun Shen, Gianluca Stringhini
Abstract	Despite the fact that cyberattacks are constantly growing in complexity, the research community still lacks effective tools to easily monitor and understand them. In particular, there is a need for techniques that are able to not only track how prominently certain malicious actions, such as the exploitation of specific vulnerabilities, are exploited in the wild, but also (and more importantly) how these malicious actions factor in as attack steps in more complex cyberattacks. In this paper we present ATTACK2VEC, a system that uses temporal word embeddings to model how attack steps are exploited in the wild, and track how they evolve. We test ATTACK2VEC on a dataset of billions of security events collected from the customers of a commercial Intrusion Prevention System over a period of two years, and show that our approach is effective in monitoring the emergence of new attack strategies in the wild and in flagging which attack steps are often used together by attackers (e.g., vulnerabilities that are frequently exploited together). ATTACK2VEC provides a useful tool for researchers and practitioners to better understand cyberattacks and their evolution, and use this knowledge to improve situational awareness and develop proactive defenses.
Tasks	Word Embeddings
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12590v1
PDF	https://arxiv.org/pdf/1905.12590v1.pdf
PWC	https://paperswithcode.com/paper/attack2vec-leveraging-temporal-word
Repo
Framework

Multitask Soft Option Learning


Title	Multitask Soft Option Learning
Authors	Maximilian Igl, Andrew Gambardella, Jinke He, Nantas Nardelli, N. Siddharth, Wendelin Böhmer, Shimon Whiteson
Abstract	We present Multitask Soft Option Learning (MSOL), a hierarchical multitask framework based on Planning as Inference. MSOL extends the concept of options, using separate variational posteriors for each task, regularized by a shared prior. This allows fine-tuning of options for new tasks without forgetting their learned policies, leading to faster training without reducing the expressiveness of the hierarchical policy. MSOL avoids several instabilities during training in a multitask setting and provides a natural way to learn both intra-option policies and their terminations. We demonstrate empirically that MSOL significantly outperforms both hierarchical and flat transfer-learning baselines in challenging multi-task environments.
Tasks	Transfer Learning
Published	2019-04-01
URL	https://arxiv.org/abs/1904.01033v2
PDF	https://arxiv.org/pdf/1904.01033v2.pdf
PWC	https://paperswithcode.com/paper/multitask-soft-option-learning
Repo
Framework

An Adaptive Weighted Deep Forest Classifier


Title	An Adaptive Weighted Deep Forest Classifier
Authors	Lev V. Utkin, Andrei V. Konstantinov, Viacheslav S. Chukanov, Mikhail V. Kots, Anna A. Meldo
Abstract	A modification of the confidence screening mechanism based on adaptive weighting of every training instance at each cascade level of the Deep Forest is proposed. The idea underlying the modification is very simple and stems from the confidence screening mechanism proposed by Pang et al. to simplify the Deep Forest classifier by updating the training set at each level in accordance with the classification accuracy of every training instance. However, whereas the confidence screening mechanism just removes instances from the training and testing processes, the proposed modification is more flexible and assigns weights by taking into account the classification accuracy. The modification is similar to AdaBoost to some extent. Numerical experiments illustrate good performance of the proposed modification in comparison with the original Deep Forest proposed by Zhou and Feng.
Tasks
Published	2019-01-04
URL	http://arxiv.org/abs/1901.01334v1
PDF	http://arxiv.org/pdf/1901.01334v1.pdf
PWC	https://paperswithcode.com/paper/an-adaptive-weighted-deep-forest-classifier
Repo
Framework
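
The abstract likens the instance weighting to AdaBoost. To illustrate that analogy only, the sketch below applies one classic AdaBoost reweighting step to a toy set of instances. The function name `adaboost_reweight` and the toy data are assumptions made here; this is not the weighting rule of the paper, which the abstract does not spell out.

```python
import math


def adaboost_reweight(weights, correct):
    """One AdaBoost-style update: misclassified instances gain weight.

    weights: current instance weights (positive, summing to 1).
    correct: booleans, True where the current learner classified correctly.
    """
    # Weighted error of the current learner, clipped away from 0 and 1.
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    err = min(max(err, 1e-12), 1 - 1e-12)
    # Learner confidence (alpha) from classic discrete AdaBoost.
    alpha = 0.5 * math.log((1 - err) / err)
    # Shrink weights of correct instances, grow weights of mistakes.
    raw = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(raw)
    return [r / total for r in raw]


# Toy data (assumed): four equally weighted instances, the last one misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
new_weights = adaboost_reweight(weights, correct)
```

After the update the single misclassified instance carries more weight than before, whereas a screening mechanism would instead drop instances outright.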

Uniform Error Bounds for Gaussian Process Regression with Application to Safe Control


Title	Uniform Error Bounds for Gaussian Process Regression with Application to Safe Control
Authors	Armin Lederer, Jonas Umlauft, Sandra Hirche
Abstract	Data-driven models are subject to model errors due to limited and noisy training data. Key to the application of such models in safety-critical domains is the quantification of their model error. Gaussian processes provide such a measure and uniform error bounds have been derived, which allow safe control based on these models. However, existing error bounds require restrictive assumptions. In this paper, we employ the Gaussian process distribution and continuity arguments to derive a novel uniform error bound under weaker assumptions. Furthermore, we demonstrate how this distribution can be used to derive probabilistic Lipschitz constants and analyze the asymptotic behavior of our bound. Finally, we derive safety conditions for the control of unknown dynamical systems based on Gaussian process models and evaluate them in simulations of a robotic manipulator.
Tasks	Gaussian Processes
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01376v2
PDF	https://arxiv.org/pdf/1906.01376v2.pdf
PWC	https://paperswithcode.com/paper/uniform-error-bounds-for-gaussian-process
Repo
Framework

DEAM: Adaptive Momentum with Discriminative Weight for Stochastic Optimization


Title	DEAM: Adaptive Momentum with Discriminative Weight for Stochastic Optimization
Authors	Jiyang Bai, Yuxiang Ren, Jiawei Zhang
Abstract	Optimization algorithms with momentum, e.g., ADAM, have been widely used for building deep learning models due to their faster convergence rates compared with stochastic gradient descent (SGD). Momentum helps accelerate SGD in the relevant directions during parameter updates, which can reduce oscillations along the parameter update path. However, there exist errors in some update steps in optimization algorithms with momentum like ADAM. The fixed momentum weight (e.g., $\beta_1$ in ADAM) will propagate errors in the momentum computation. In this paper, we introduce a novel optimization algorithm, namely Discriminative wEight on Adaptive Momentum (DEAM). Instead of assigning the momentum term weight with a fixed hyperparameter, DEAM proposes to compute the momentum weight automatically based on the discriminative angle. In this way, DEAM involves fewer hyperparameters. DEAM also contains a novel backtrack term, which restricts redundant updates when the correction of the last step is needed. Extensive experiments demonstrate that DEAM can achieve a faster convergence rate than existing optimization algorithms when training deep learning models in both convex and non-convex situations.
Tasks	Stochastic Optimization
Published	2019-07-25
URL	https://arxiv.org/abs/1907.11307v2
PDF	https://arxiv.org/pdf/1907.11307v2.pdf
PWC	https://paperswithcode.com/paper/deam-accumulated-momentum-with-discriminative
Repo
Framework
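
The abstract contrasts ADAM's fixed momentum weight $\beta_1$ with a weight derived from an angle. As a minimal sketch of the two ingredients, the code below shows the standard ADAM-style first-moment update with a fixed `beta1` and the cosine of the angle between the previous momentum and the new gradient. The helper names and the toy vectors are assumptions; DEAM's actual weight rule and backtrack term are not given in the abstract and are not reproduced here.

```python
import math


def ema_momentum(m_prev, grad, beta1=0.9):
    """ADAM-style first-moment update with a fixed weight beta1."""
    return [beta1 * m + (1.0 - beta1) * g for m, g in zip(m_prev, grad)]


def cos_angle(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


m_prev = [1.0, 0.0]   # momentum accumulated so far (toy value)
grad = [-1.0, 0.0]    # new gradient pointing the opposite way (toy value)
m_new = ema_momentum(m_prev, grad)
agreement = cos_angle(m_prev, grad)
```

With a fixed weight, the updated momentum still points mostly in the old direction even though the new gradient fully disagrees with it (cosine of -1). That is the kind of situation an angle-based weight could react to.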

KnowIT VQA: Answering Knowledge-Based Questions about Videos


Title	KnowIT VQA: Answering Knowledge-Based Questions about Videos
Authors	Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima
Abstract	We propose a novel video understanding task by fusing knowledge-based and video question answering. First, we introduce KnowIT VQA, a video dataset with 24,282 human-generated question-answer pairs about a popular sitcom. The dataset combines visual, textual and temporal coherence reasoning together with knowledge-based questions, which can only be answered with the experience gained from viewing the series. Second, we propose a video understanding model by combining the visual and textual video content with specific knowledge about the show. Our main findings are: (i) the incorporation of knowledge produces outstanding improvements for VQA in video, and (ii) the performance on KnowIT VQA still lags well behind human accuracy, indicating its usefulness for studying current video modelling limitations.
Tasks	Question Answering, Video Question Answering, Video Understanding, Visual Question Answering
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10706v3
PDF	https://arxiv.org/pdf/1910.10706v3.pdf
PWC	https://paperswithcode.com/paper/knowit-vqa-answering-knowledge-based
Repo
Framework

Stochastic Lipschitz Q-Learning


Title	Stochastic Lipschitz Q-Learning
Authors	Xu Zhu, David Dunson
Abstract	In an episodic Markov Decision Process (MDP) problem, an online algorithm chooses from a set of actions in a sequence of $H$ trials, where $H$ is the episode length, in order to maximize the total payoff of the chosen actions. Q-learning, as the most popular model-free reinforcement learning (RL) algorithm, directly parameterizes and updates value functions without explicitly modeling the environment. Recently, [Jin et al. 2018] studied the sample complexity of Q-learning with finite states and actions. Their algorithm achieves nearly optimal regret, which shows that Q-learning can be made sample efficient. However, MDPs with large discrete state and action spaces [Silver et al. 2016] or continuous spaces [Mnih et al. 2013] cannot be learned efficiently in this way. Hence, it is critical to develop new algorithms that solve this dilemma with a provable guarantee on the sample complexity. With this motivation, we propose a novel algorithm that works for MDPs in a more general setting, which has infinitely many states and actions and assumes that the payoff function and transition kernel are Lipschitz continuous. We also provide the corresponding theoretical justification for our algorithm. It achieves the regret $\tilde{\mathcal{O}}(K^{\frac{d+1}{d+2}}\sqrt{H^3})$, where $K$ denotes the number of episodes and $d$ denotes the dimension of the joint space. To the best of our knowledge, this is the first analysis in the model-free setting whose established regret matches the lower bound up to a logarithmic factor.
Tasks	Q-Learning
Published	2019-04-24
URL	https://arxiv.org/abs/1904.10653v2
PDF	https://arxiv.org/pdf/1904.10653v2.pdf
PWC	https://paperswithcode.com/paper/stochastic-lipschitz-q-learning
Repo
Framework
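
To read the regret bound from the abstract, it can help to evaluate its leading term numerically. The sketch below drops all constants and logarithmic factors (an assumption, since the abstract only states the $\tilde{\mathcal{O}}$ rate) and uses function names chosen here for illustration.

```python
def regret_bound(K, H, d):
    """Leading term K^((d+1)/(d+2)) * sqrt(H^3), ignoring constants and log factors."""
    return K ** ((d + 1) / (d + 2)) * H ** 1.5


def average_regret(K, H, d):
    """Per-episode regret implied by the bound; it decays like K^(-1/(d+2))."""
    return regret_bound(K, H, d) / K


# Compare the per-episode rate after 10^3 and 10^6 episodes for a few dimensions.
for d in (1, 2, 8):
    print(d, average_regret(10**3, 10, d), average_regret(10**6, 10, d))
```

Because the exponent $(d+1)/(d+2)$ is below 1, the per-episode regret shrinks as $K$ grows, but the decay slows down as the dimension $d$ of the joint space increases.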

Left Ventricle Quantification Using Direct Regression with Segmentation Regularization and Ensembles of Pretrained 2D and 3D CNNs


Title	Left Ventricle Quantification Using Direct Regression with Segmentation Regularization and Ensembles of Pretrained 2D and 3D CNNs
Authors	Nils Gessert, Alexander Schlaefer
Abstract	Cardiac left ventricle (LV) quantification provides a tool for diagnosing cardiac diseases. Automatic calculation of all relevant LV indices from cardiac MR images is an intricate task due to large variations among patients and deformation during the cardiac cycle. Typical methods are based on segmentation of the myocardium or direct regression from MR images. To consider cardiac motion and deformation, recurrent neural networks and spatio-temporal convolutional neural networks (CNNs) have been proposed. We study an approach combining state-of-the-art models and emphasizing transfer learning to account for the small dataset provided for the LVQuan19 challenge. We compare 2D spatial and 3D spatio-temporal CNNs for LV indices regression and cardiac phase classification. To incorporate segmentation information, we propose an architecture-independent segmentation-based regularization. To improve the robustness further, we employ a search scheme that identifies the optimal ensemble from a set of architecture variants. Evaluating on the LVQuan19 Challenge training dataset with 5-fold cross-validation, we achieve mean absolute errors of 111 ± 76 mm^2, 1.84 ± 0.9 mm and 1.22 ± 0.6 mm for area, dimension and regional wall thickness regression, respectively. The error rate for cardiac phase classification is 6.7%.
Tasks	Transfer Learning
Published	2019-08-12
URL	https://arxiv.org/abs/1908.04181v1
PDF	https://arxiv.org/pdf/1908.04181v1.pdf
PWC	https://paperswithcode.com/paper/left-ventricle-quantification-using-direct
Repo
Framework

WSLLN: Weakly Supervised Natural Language Localization Networks


Title	WSLLN: Weakly Supervised Natural Language Localization Networks
Authors	Mingfei Gao, Larry S. Davis, Richard Socher, Caiming Xiong
Abstract	We propose weakly supervised language localization networks (WSLLN) to detect events in long, untrimmed videos given language queries. To learn the correspondence between visual segments and texts, most previous methods require temporal coordinates (start and end times) of events for training, which leads to high annotation costs. WSLLN relieves the annotation burden by training with only video-sentence pairs, without access to the temporal locations of events. With a simple end-to-end structure, WSLLN measures segment-text consistency and conducts segment selection (conditioned on the text) simultaneously. Results from both are merged and optimized as a video-sentence matching problem. Experiments on ActivityNet Captions and DiDeMo demonstrate that WSLLN achieves state-of-the-art performance.
Tasks
Published	2019-08-31
URL	https://arxiv.org/abs/1909.00239v1
PDF	https://arxiv.org/pdf/1909.00239v1.pdf
PWC	https://paperswithcode.com/paper/wslln-weakly-supervised-natural-language
Repo
Framework

Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks


Title	Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks
Authors	Zhu Zhang, Zhou Zhao, Zhijie Lin, Jingkuan Song, Xiaofei He
Abstract	Open-ended video question answering aims to automatically generate a natural-language answer from the referenced video contents according to the given question. Currently, most existing approaches focus on short-form video question answering with multi-modal recurrent encoder-decoder networks. Although these works have achieved promising performance, they may still be ineffective for long-form video question answering because they lack long-range dependency modeling and suffer from heavy computational cost. To tackle these problems, we propose a fast Hierarchical Convolutional Self-Attention encoder-decoder network (HCSA). Concretely, we first develop a hierarchical convolutional self-attention encoder to efficiently model long-form video contents, which builds the hierarchical structure for video sequences and captures question-aware long-range dependencies from video context. We then devise a multi-scale attentive decoder to incorporate multi-layer video representations for answer generation, which avoids the loss of information at the top encoder layer. Extensive experiments show the effectiveness and efficiency of our method.
Tasks	Question Answering, Video Question Answering
Published	2019-06-28
URL	https://arxiv.org/abs/1906.12158v1
PDF	https://arxiv.org/pdf/1906.12158v1.pdf
PWC	https://paperswithcode.com/paper/open-ended-long-form-video-question-answering
Repo
Framework

Building an Application Independent Natural Language Interface


Title	Building an Application Independent Natural Language Interface
Authors	Sahisnu Mazumder, Bing Liu, Shuai Wang, Sepideh Esmaeilpour
Abstract	Traditional approaches to building natural language (NL) interfaces typically use a semantic parser to parse the user command and convert it to a logical form, which is then translated to an executable action in an application. However, it is still challenging for a semantic parser to correctly parse natural language. For a different domain, the parser may need to be retrained or tuned, and a new translator also needs to be written to convert the logical forms to executable actions. In this work, we propose a novel and application-independent approach to building NL interfaces that does not need a semantic parser or a translator. It is based on natural language to natural language matching and learning, where each action and each user command are both represented in natural language. To perform a user-intended action, the system only needs to match the user command with the correct action representation, and then execute the corresponding action. The system also interactively learns new (paraphrased) commands for actions to expand the action representations over time. Our experimental results show the effectiveness of the proposed approach.
Tasks
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14084v1
PDF	https://arxiv.org/pdf/1910.14084v1.pdf
PWC	https://paperswithcode.com/paper/building-an-application-independent-natural
Repo
Framework
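
The abstract describes matching a user command against natural-language action representations and learning new paraphrases over time. The sketch below mimics that loop with a deliberately crude token-overlap (Jaccard) score standing in for the learned matcher. The action names, phrasings, and helper functions are all hypothetical and are not taken from the paper.

```python
def tokens(text):
    """Lowercase word set; a crude stand-in for a learned matcher."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_action(command, actions):
    """Return the action name whose stored phrasings best overlap the command."""
    cmd = tokens(command)
    return max(
        actions,
        key=lambda name: max(jaccard(cmd, tokens(p)) for p in actions[name]),
    )


def learn_paraphrase(actions, name, command):
    """Store a confirmed paraphrase so later matches can use it."""
    actions[name].append(command)


# Hypothetical action representations, written in natural language.
actions = {
    "lights_on": ["turn on the lights"],
    "lights_off": ["turn off the lights"],
}
chosen = match_action("please turn the lights on", actions)
learn_paraphrase(actions, "lights_on", "please turn the lights on")
```

Because both sides are plain text, supporting a new application in this sketch only means supplying a new `actions` dictionary; no parser or translator has to be rewritten.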

Improving Natural Language Interaction with Robots Using Advice


Title	Improving Natural Language Interaction with Robots Using Advice
Authors	Nikhil Mehta, Dan Goldwasser
Abstract	Over the last few years, there has been growing interest in learning models for physically grounded language understanding tasks, such as the popular blocks world domain. These works typically view this problem as a single-step process, in which a human operator gives an instruction and an automated agent is evaluated on its ability to execute it. In this paper we take the first step towards increasing the bandwidth of this interaction, and suggest a protocol for including advice, high-level observations about the task, which can help constrain the agent’s prediction. We evaluate our approach on the blocks world task, and show that even simple advice can help lead to significant performance improvements. To help reduce the effort involved in supplying the advice, we also explore model self-generated advice which can still improve results.
Tasks
Published	2019-05-12
URL	https://arxiv.org/abs/1905.04655v1
PDF	https://arxiv.org/pdf/1905.04655v1.pdf
PWC	https://paperswithcode.com/paper/improving-natural-language-interaction-with
Repo
Framework

An Efficient Sampling Algorithm for Non-smooth Composite Potentials


Title	An Efficient Sampling Algorithm for Non-smooth Composite Potentials
Authors	Wenlong Mou, Nicolas Flammarion, Martin J. Wainwright, Peter L. Bartlett
Abstract	We consider the problem of sampling from a density of the form $p(x) \propto \exp(-f(x)- g(x))$, where $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a smooth and strongly convex function and $g: \mathbb{R}^d \rightarrow \mathbb{R}$ is a convex and Lipschitz function. We propose a new algorithm based on the Metropolis-Hastings framework, and prove that it mixes to within TV distance $\varepsilon$ of the target density in at most $O(d \log (d/\varepsilon))$ iterations. This guarantee extends previous results on sampling from distributions with smooth log densities ($g = 0$) to the more general composite non-smooth case, with the same mixing time up to a multiple of the condition number. Our method is based on a novel proximal-based proposal distribution that can be efficiently computed for a large class of non-smooth functions $g$.
Tasks
Published	2019-10-01
URL	https://arxiv.org/abs/1910.00551v1
PDF	https://arxiv.org/pdf/1910.00551v1.pdf
PWC	https://paperswithcode.com/paper/an-efficient-sampling-algorithm-for-non
Repo
Framework
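
For orientation, the sketch below runs a plain random-walk Metropolis-Hastings sampler on a one-dimensional instance of the target $p(x) \propto \exp(-f(x) - g(x))$. The choices $f(x) = x^2/2$ and $g(x) = |x|$, the step size, and the function names are assumptions made for illustration. The Gaussian random-walk proposal used here is a generic substitute and is not the proximal-based proposal of the paper.

```python
import math
import random


def potential(x):
    """U(x) = f(x) + g(x), with f(x) = x^2/2 (smooth, strongly convex)
    and g(x) = |x| (convex, Lipschitz, non-smooth at 0)."""
    return 0.5 * x * x + abs(x)


def random_walk_mh(n_samples, step=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis-Hastings targeting p(x) proportional to exp(-U(x))."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # The proposal is symmetric, so the acceptance ratio is p(proposal) / p(x).
        log_ratio = potential(x) - potential(proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


samples = random_walk_mh(20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The target is symmetric about zero, so the sample mean should sit close to 0, and the added $|x|$ term pulls the variance below that of the standard Gaussian obtained with $g = 0$.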

Introspection Learning


Title	Introspection Learning
Authors	Chris R. Serrano, Michael A. Warren
Abstract	Traditional reinforcement learning agents learn from experience, past or present, gained through interaction with their environment. Our approach synthesizes experience, without requiring an agent to interact with its environment, by asking the policy directly “Are there situations X, Y, and Z, such that in these situations you would select actions A, B, and C?” In this paper we present Introspection Learning, an algorithm that allows these types of questions to be asked of neural network policies. Introspection Learning is agnostic to the reinforcement learning algorithm, and the states returned may be used as an indicator of the health of the policy or to shape the policy in a myriad of ways. We demonstrate the usefulness of this algorithm both in the context of speeding up training and improving robustness with respect to safety constraints.
Tasks
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10754v1
PDF	http://arxiv.org/pdf/1902.10754v1.pdf
PWC	https://paperswithcode.com/paper/introspection-learning
Repo
Framework