January 26, 2020

Paper Group ANR 1521

GLA-Net: An Attention Network with Guided Loss for Mismatch Removal

Title GLA-Net: An Attention Network with Guided Loss for Mismatch Removal
Authors Zhi Chen, Fan Yang, Wenbing Tao
Abstract Mismatch removal is a critical prerequisite in many feature-based tasks. Recent attempts cast the mismatch removal task as a binary classification problem and solve it through deep learning based methods. In these methods, the imbalance between positive and negative classes is important, as it affects network performance, i.e., Fn-score. To establish the link between Fn-score and loss, we propose to guide the loss with the Fn-score directly. We theoretically demonstrate the direct link between our Guided Loss and Fn-score during training. Moreover, we discover that outliers often impair global context in mismatch removal networks. To address this issue, we introduce the attention mechanism to the mismatch removal task and propose a novel Inlier Attention Block (IA Block). To evaluate the effectiveness of our loss and IA Block, we design an end-to-end network for mismatch removal, called GLA-Net (our code will be available on GitHub later). Experiments have shown that our network achieves state-of-the-art performance on benchmark datasets.
Tasks
Published 2019-09-28
URL https://arxiv.org/abs/1909.13092v1
PDF https://arxiv.org/pdf/1909.13092v1.pdf
PWC https://paperswithcode.com/paper/gla-net-an-attention-network-with-guided-loss
Repo
Framework
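
The class-imbalance point in the GLA-Net abstract can be illustrated with a small sketch. It assumes that "Fn-score" refers to the usual F-measure with weight beta = n; the helper `f_beta` and the toy labels are hypothetical, and the paper's Guided Loss itself is not reproduced here.

```python
def f_beta(labels, preds, beta=1.0):
    """F-beta score for binary labels and predictions (1 = inlier, 0 = outlier)."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy, imbalanced data (an assumption for illustration): 2 inliers out of 10 matches.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
all_negative = [0] * 10  # a classifier that rejects every match

accuracy = sum(y == p for y, p in zip(labels, all_negative)) / len(labels)
score = f_beta(labels, all_negative)
print(accuracy, score)
```

The all-negative classifier looks acceptable by accuracy but has an F-score of zero, which is one way to see why a loss that ignores the imbalance may not track the evaluation metric.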

Effectiveness Assessment of Cyber-Physical Systems

Title Effectiveness Assessment of Cyber-Physical Systems
Authors Gérald Rocher, Jean-Yves Tigli, Stéphane Lavirotte, Nhan Le Thanh
Abstract By achieving their purposes through interactions with the physical world, Cyber-Physical Systems (CPS) pose new challenges in terms of dependability. Indeed, the evolution of the physical systems they control with transducers can be affected by surrounding physical processes over which they have no control and which may potentially hamper the achievement of their purposes. While it is illusory to hope for a comprehensive model of the physical environment at design time to anticipate and remove faults that may occur once these systems are deployed, it becomes necessary to evaluate their degree of effectiveness in vivo. In this paper, the degree of effectiveness is formally defined and generalized in the context of the measure theory. The measure is developed in the context of the Transferable Belief Model (TBM), an elaboration on the Dempster-Shafer Theory (DST) of evidence so as to handle epistemic and aleatory uncertainties respectively pertaining to the users’ expectations and the natural variability of the physical environment. The TBM is used in conjunction with the Input/Output Hidden Markov Modeling framework (we denote by Ev-IOHMM) to specify the expected evolution of the physical system controlled by the CPS and the tolerances towards uncertainties. The measure of effectiveness is then obtained from the forward algorithm, leveraging the conflict entailed by the successive combinations of the beliefs obtained from observations of the physical system and the beliefs corresponding to its expected evolution. The proposed approach is applied to autonomous vehicles and shows how the degree of effectiveness can be used for bench-marking their controller relative to the highway code speed limitations and passengers’ well-being constraints, both modeled through an Ev-IOHMM.
Tasks Autonomous Vehicles
Published 2019-01-10
URL https://arxiv.org/abs/1901.06343v4
PDF https://arxiv.org/pdf/1901.06343v4.pdf
PWC https://paperswithcode.com/paper/effectiveness-assessment-of-cyber-physical
Repo
Framework
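
The "conflict" mentioned in the abstract above can be sketched with the unnormalized conjunctive combination rule used in the Transferable Belief Model, where the mass left on the empty set measures how strongly two belief sources disagree. The frame `{"ok", "over"}` and the mass values are hypothetical, and this is not the paper's Ev-IOHMM forward algorithm.

```python
from itertools import product


def conjunctive_combine(m1, m2):
    """Unnormalized conjunctive rule: m(A) = sum of m1(B) * m2(C) over B & C == A."""
    combined = {}
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c  # set intersection of the two focal elements
        combined[a] = combined.get(a, 0.0) + mass_b * mass_c
    return combined


# Hypothetical example: is the vehicle speed "ok" or "over" the limit?
expected = {frozenset({"ok"}): 0.9, frozenset({"ok", "over"}): 0.1}
observed = {frozenset({"over"}): 0.6, frozenset({"ok", "over"}): 0.4}

combined = conjunctive_combine(expected, observed)
conflict = combined.get(frozenset(), 0.0)  # mass assigned to the empty set
print(round(conflict, 2))
```

A conflict near zero would mean the observations agree with the expected behaviour; here most of the mass is contradictory because the observation mostly supports "over" while the expectation mostly supports "ok".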

Learning Generalizable Physical Dynamics of 3D Rigid Objects

Title Learning Generalizable Physical Dynamics of 3D Rigid Objects
Authors Davis Rempe, Srinath Sridhar, He Wang, Leonidas J. Guibas
Abstract Humans have a remarkable ability to predict the effect of physical interactions on the dynamics of objects. Endowing machines with this ability would allow important applications in areas like robotics and autonomous vehicles. In this work, we focus on predicting the dynamics of 3D rigid objects, in particular an object’s final resting position and total rotation when subjected to an impulsive force. Different from previous work, our approach is capable of generalizing to unseen object shapes - an important requirement for real-world applications. To achieve this, we represent object shape as a 3D point cloud that is used as input to a neural network, making our approach agnostic to appearance variation. The design of our network is informed by an understanding of physical laws. We train our model with data from a physics engine that simulates the dynamics of a large number of shapes. Experiments show that we can accurately predict the resting position and total rotation for unseen object geometries.
Tasks Autonomous Vehicles
Published 2019-01-02
URL http://arxiv.org/abs/1901.00466v1
PDF http://arxiv.org/pdf/1901.00466v1.pdf
PWC https://paperswithcode.com/paper/learning-generalizable-physical-dynamics-of
Repo
Framework

No-Regret Bayesian Optimization with Unknown Hyperparameters

Title No-Regret Bayesian Optimization with Unknown Hyperparameters
Authors Felix Berkenkamp, Angela P. Schoellig, Andreas Krause
Abstract Bayesian optimization (BO) based on Gaussian process models is a powerful paradigm to optimize black-box functions that are expensive to evaluate. While several BO algorithms provably converge to the global optimum of the unknown function, they assume that the hyperparameters of the kernel are known in advance. This is not the case in practice and misspecification often causes these algorithms to converge to poor local optima. In this paper, we present the first BO algorithm that is provably no-regret and converges to the optimum without knowledge of the hyperparameters. During optimization we slowly adapt the hyperparameters of stationary kernels and thereby expand the associated function class over time, so that the BO algorithm considers more complex function candidates. Based on the theoretical insights, we propose several practical algorithms that achieve the empirical sample efficiency of BO with online hyperparameter estimation, but retain theoretical convergence guarantees. We evaluate our method on several benchmark problems.
Tasks
Published 2019-01-10
URL http://arxiv.org/abs/1901.03357v2
PDF http://arxiv.org/pdf/1901.03357v2.pdf
PWC https://paperswithcode.com/paper/no-regret-bayesian-optimization-with-unknown
Repo
Framework

Towards a Hypothesis on Visual Transformation based Self-Supervision

Title Towards a Hypothesis on Visual Transformation based Self-Supervision
Authors Dipan K. Pal, Sreena Nallamothu, Marios Savvides
Abstract We propose the first qualitative hypothesis characterizing the behavior of visual transformation based self-supervision, called the VTSS hypothesis. Given a dataset upon which a self-supervised task is performed while predicting instantiations of a transformation, the hypothesis states that if the predicted instantiations of the transformations are already present in the dataset, then the representation learned will be less useful. The hypothesis was derived by observing a key constraint in the application of self-supervision using a particular transformation. This constraint, which we term the transformation conflict for this paper, forces a network to learn degenerative features, thereby reducing the usefulness of the representation. The VTSS hypothesis helps us identify transformations that have the potential to be effective as a self-supervision task. Further, it helps to generally predict whether a particular transformation based self-supervision technique would be effective or not for a particular dataset. We provide extensive evaluations on CIFAR 10, CIFAR 100, SVHN and FMNIST confirming the hypothesis and the trends it predicts. We also propose novel cost-effective self-supervision techniques based on translation and scale, which, when combined with rotation, outperform all transformations applied individually. Overall, this paper aims to shed light on the phenomenon of visual transformation based self-supervision.
Tasks
Published 2019-11-24
URL https://arxiv.org/abs/1911.10594v2
PDF https://arxiv.org/pdf/1911.10594v2.pdf
PWC https://paperswithcode.com/paper/towards-a-hypothesis-on-visual-transformation
Repo
Framework

A Study of State Aliasing in Structured Prediction with RNNs

Title A Study of State Aliasing in Structured Prediction with RNNs
Authors Layla El Asri, Adam Trischler
Abstract End-to-end reinforcement learning agents learn a state representation and a policy at the same time. Recurrent neural networks (RNNs) have been trained successfully as reinforcement learning agents in settings like dialogue that require structured prediction. In this paper, we investigate the representations learned by RNN-based agents when trained with both policy gradient and value-based methods. We show through extensive experiments and analysis that, when trained with policy gradient, recurrent neural networks often fail to learn a state representation that leads to an optimal policy in settings where the same action should be taken at different states. To explain this failure, we highlight the problem of state aliasing, which entails conflating two or more distinct states in the representation space. We demonstrate that state aliasing occurs when several states share the same optimal action and the agent is trained via policy gradient. We characterize this phenomenon through experiments on a simple maze setting and a more complex text-based game, and make recommendations for training RNNs with reinforcement learning.
Tasks Structured Prediction
Published 2019-06-21
URL https://arxiv.org/abs/1906.09310v1
PDF https://arxiv.org/pdf/1906.09310v1.pdf
PWC https://paperswithcode.com/paper/a-study-of-state-aliasing-in-structured
Repo
Framework

CodeSwitch-Reddit: Exploration of Written Multilingual Discourse in Online Discussion Forums

Title CodeSwitch-Reddit: Exploration of Written Multilingual Discourse in Online Discussion Forums
Authors Ella Rabinovich, Masih Sultani, Suzanne Stevenson
Abstract In contrast to many decades of research on oral code-switching, the study of written multilingual productions has only recently enjoyed a surge of interest. Many open questions remain regarding the sociolinguistic underpinnings of written code-switching, and progress has been limited by a lack of suitable resources. We introduce a novel, large, and diverse dataset of written code-switched productions, curated from topical threads of multiple bilingual communities on the Reddit discussion platform, and explore questions that were mainly addressed in the context of spoken language thus far. We investigate whether findings in oral code-switching concerning content and style, as well as speaker proficiency, are carried over into written code-switching in discussion forums. The released dataset can further facilitate a range of research and practical activities.
Tasks
Published 2019-08-30
URL https://arxiv.org/abs/1908.11841v1
PDF https://arxiv.org/pdf/1908.11841v1.pdf
PWC https://paperswithcode.com/paper/codeswitch-reddit-exploration-of-written
Repo
Framework

The gradient complexity of linear regression

Title The gradient complexity of linear regression
Authors Mark Braverman, Elad Hazan, Max Simchowitz, Blake Woodworth
Abstract We investigate the computational complexity of several basic linear algebra primitives, including largest eigenvector computation and linear regression, in the computational model that allows access to the data via a matrix-vector product oracle. We show that for polynomial accuracy, $\Theta(d)$ calls to the oracle are necessary and sufficient even for a randomized algorithm. Our lower bound is based on a reduction to estimating the least eigenvalue of a random Wishart matrix. This simple distribution enables a concise proof, leveraging a few key properties of the random Wishart ensemble.
Tasks
Published 2019-11-06
URL https://arxiv.org/abs/1911.02212v1
PDF https://arxiv.org/pdf/1911.02212v1.pdf
PWC https://paperswithcode.com/paper/the-gradient-complexity-of-linear-regression
Repo
Framework
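
The access model in the abstract above, where the data can only be touched through a matrix-vector product oracle, can be sketched as follows. Power iteration is used here only as a familiar algorithm that fits this model; it is not the paper's construction, and `make_oracle`, `power_iteration`, and the 2x2 matrix are hypothetical.

```python
import math


def make_oracle(matrix):
    """Hide a matrix behind v -> A v and count how often the oracle is called."""
    calls = {"n": 0}

    def matvec(v):
        calls["n"] += 1
        return [sum(a * x for a, x in zip(row, v)) for row in matrix]

    return matvec, calls


def power_iteration(matvec, dim, iters):
    """Estimate the largest eigenvalue of a symmetric PSD matrix via the oracle only."""
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        w = matvec(v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = matvec(v)  # one extra call for the Rayleigh quotient v^T A v
    return sum(x * y for x, y in zip(v, av)), v


A = [[3.0, 1.0], [1.0, 2.0]]  # eigenvalues (5 +/- sqrt(5)) / 2
matvec, calls = make_oracle(A)
eigenvalue, vector = power_iteration(matvec, dim=2, iters=20)
print(eigenvalue, calls["n"])
```

The quantity the paper bounds corresponds to the call counter here: how many oracle queries any algorithm needs, as a function of the dimension $d$, to reach a given accuracy.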

Statistical Analysis of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization

Title Statistical Analysis of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization
Authors Zhengling Qi, Ying Cui, Yufeng Liu, Jong-Shi Pang
Abstract This paper has two main goals: (a) establish several statistical properties (consistency, asymptotic distributions, and convergence rates) of stationary solutions and values of a class of coupled nonconvex and nonsmooth empirical risk minimization problems, and (b) validate these properties by a noisy amplitude-based phase retrieval problem, the latter being of much topical interest. Derived from available data via sampling, these empirical risk minimization problems are the computational workhorse of a population risk model which involves the minimization of an expected value of a random functional. When these minimization problems are nonconvex, the computation of their globally optimal solutions is elusive. Together with the fact that the expectation operator cannot be evaluated for general probability distributions, it becomes necessary to justify whether the stationary solutions of the empirical problems are practical approximations of the stationary solution of the population problem. When these two features, general distribution and nonconvexity, are coupled with nondifferentiability that often renders the problems “non-Clarke regular”, the task of the justification becomes challenging. Our work aims to address such a challenge within an algorithm-free setting. The resulting analysis is therefore different from much of the analysis in the recent literature that is based on local search algorithms. Furthermore, supplementing the classical minimizer-centric analysis, our results offer a first step to close the gap between computational optimization and asymptotic analysis of coupled nonconvex nonsmooth statistical estimation problems, expanding the former with statistical properties of the practically obtained solution and providing the latter with a more practical focus pertaining to computational tractability.
Tasks
Published 2019-10-06
URL https://arxiv.org/abs/1910.02488v1
PDF https://arxiv.org/pdf/1910.02488v1.pdf
PWC https://paperswithcode.com/paper/statistical-analysis-of-stationary-solutions
Repo
Framework

Real-Time Driver State Monitoring Using a CNN Based Spatio-Temporal Approach

Title Real-Time Driver State Monitoring Using a CNN Based Spatio-Temporal Approach
Authors Neslihan Kose, Okan Kopuklu, Alexander Unnervik, Gerhard Rigoll
Abstract Many road accidents occur due to distracted drivers. Today, driver monitoring is essential even for the latest autonomous vehicles to alert distracted drivers in order to take over control of the vehicle in case of emergency. In this paper, a spatio-temporal approach is applied to classify drivers’ distraction level and movement decisions using convolutional neural networks (CNNs). We approach this problem as action recognition to benefit from temporal information in addition to spatial information. Our approach relies on features extracted from sparsely selected frames of an action using a pre-trained BN-Inception network. Experiments show that our approach outperforms the state-of-the-art results on the Distracted Driver Dataset (96.31%), with an accuracy of 99.10% for 10-class classification while providing real-time performance. We also analyzed the impact of fusion using RGB and optical flow modalities with a very recent data level fusion strategy. The results on the Distracted Driver and Brain4Cars datasets show that fusion of these modalities further increases the accuracy.
Tasks Autonomous Vehicles, Optical Flow Estimation
Published 2019-07-18
URL https://arxiv.org/abs/1907.08009v1
PDF https://arxiv.org/pdf/1907.08009v1.pdf
PWC https://paperswithcode.com/paper/real-time-driver-state-monitoring-using-a-cnn
Repo
Framework

A Survey of Game Theoretic Approaches for Adversarial Machine Learning in Cybersecurity Tasks

Title A Survey of Game Theoretic Approaches for Adversarial Machine Learning in Cybersecurity Tasks
Authors Prithviraj Dasgupta, Joseph B. Collins
Abstract Machine learning techniques are currently used extensively for automating various cybersecurity tasks. Most of these techniques utilize supervised learning algorithms that rely on training the algorithm to classify incoming data into different categories, using data encountered in the relevant domain. A critical vulnerability of these algorithms is that they are susceptible to adversarial attacks where a malicious entity called an adversary deliberately alters the training data to misguide the learning algorithm into making classification errors. Adversarial attacks could render the learning algorithm unsuitable to use and leave critical systems vulnerable to cybersecurity attacks. Our paper provides a detailed survey of the state-of-the-art techniques that are used to make a machine learning algorithm robust against adversarial attacks using the computational framework of game theory. We also discuss open problems and challenges and possible directions for further research that would make deep machine learning-based systems more robust and reliable for cybersecurity tasks.
Tasks
Published 2019-12-04
URL https://arxiv.org/abs/1912.02258v1
PDF https://arxiv.org/pdf/1912.02258v1.pdf
PWC https://paperswithcode.com/paper/a-survey-of-game-theoretic-approaches-for
Repo
Framework

ConfigTron: Tackling network diversity with heterogeneous configurations

Title ConfigTron: Tackling network diversity with heterogeneous configurations
Authors Usama Naseer, Theophilus Benson
Abstract The web serving protocol stack is constantly changing and evolving to tackle technological shifts in networking infrastructure and website complexity. As a result of this evolution, the web serving stack includes a plethora of protocols and configuration parameters that enable the web serving stack to address a variety of realistic network conditions. Yet, today, most content providers have adopted a “one-size-fits-all” approach to configuring the networking stack of their user-facing web servers (or at best employ moderate tuning), despite the significant diversity in end-user networks and devices. In this paper, we revisit this problem and ask a more fundamental question: Are there benefits to tuning the network stack? If so, what system design choices and algorithmic ensembles are required to enable modern content providers to dynamically and flexibly tune their protocol stacks? We demonstrate through substantial empirical evidence that this “one-size-fits-all” approach results in sub-optimal performance and argue for a novel framework that extends existing CDN architectures to provide programmatic control over the configuration options of the CDN serving stack. We designed ConfigTron, a data-driven framework that leverages data from all connections to identify their network characteristics and learn the optimal configuration parameters to improve end-user performance. ConfigTron uses a contextual multi-armed bandit-based learning algorithm to find optimal configurations in minimal time, enabling content providers to systematically explore heterogeneous configurations while improving end-user page load time by as much as 19% (up to 750 ms) on median.
Tasks
Published 2019-08-13
URL https://arxiv.org/abs/1908.04518v1
PDF https://arxiv.org/pdf/1908.04518v1.pdf
PWC https://paperswithcode.com/paper/configtron-tackling-network-diversity-with
Repo
Framework

An Improved Approach for Semantic Graph Composition with CCG

Title An Improved Approach for Semantic Graph Composition with CCG
Authors Austin Blodgett, Nathan Schneider
Abstract This paper builds on previous work using Combinatory Categorial Grammar (CCG) to derive a transparent syntax-semantics interface for Abstract Meaning Representation (AMR) parsing. We define new semantics for the CCG combinators that is better suited to deriving AMR graphs. In particular, we define relation-wise alternatives for the application and composition combinators: these require that the two constituents being combined overlap in one AMR relation. We also provide a new semantics for type raising, which is necessary for certain constructions. Using these mechanisms, we suggest an analysis of eventive nouns, which present a challenge for deriving AMR graphs. Our theoretical analysis will facilitate future work on robust and transparent AMR parsing using CCG.
Tasks Amr Parsing
Published 2019-03-28
URL http://arxiv.org/abs/1903.11770v2
PDF http://arxiv.org/pdf/1903.11770v2.pdf
PWC https://paperswithcode.com/paper/an-improved-approach-for-semantic-graph
Repo
Framework

Dimension independent bounds for general shallow networks

Title Dimension independent bounds for general shallow networks
Authors Hrushikesh N. Mhaskar
Abstract This paper proves an abstract theorem addressing in a unified manner two important problems in function approximation: avoiding the curse of dimensionality and estimating the degree of approximation for out-of-sample extension in manifold learning. We consider an abstract (shallow) network that includes, for example, neural networks, radial basis function networks, and kernels on data-defined manifolds used for function approximation in various settings. A deep network is obtained by a composition of the shallow networks according to a directed acyclic graph, representing the architecture of the deep network. In this paper, we prove dimension independent bounds for approximation by shallow networks in the very general setting of what we have called $G$-networks on a compact metric measure space, where the notion of dimension is defined in terms of the cardinality of maximal distinguishable sets, generalizing the notion of dimension of a cube or a manifold. Our techniques give bounds that improve without saturation with the smoothness of the kernel involved in an integral representation of the target function. In the context of manifold learning, our bounds provide estimates on the degree of approximation for an out-of-sample extension of the target function to the ambient space. One consequence of our theorem is that without the requirement of robust parameter selection, deep networks using a non-smooth activation function such as the ReLU do not provide any significant advantage over shallow networks in terms of the degree of approximation alone.
Tasks
Published 2019-08-26
URL https://arxiv.org/abs/1908.09880v2
PDF https://arxiv.org/pdf/1908.09880v2.pdf
PWC https://paperswithcode.com/paper/dimension-independent-bounds-for-general
Repo
Framework

Switched linear projections for neural network interpretability

Title Switched linear projections for neural network interpretability
Authors Lech Szymanski, Brendan McCane, Craig Atkinson
Abstract We introduce switched linear projections for expressing the activity of a neuron in a deep neural network in terms of a single linear projection in the input space. The method works by isolating the active subnetwork, a series of linear transformations that determines the entire computation of the network for a given input instance. With these projections we can decompose activity in any hidden layer into patterns detected in a given input instance. We also propose that in ReLU networks it is instructive and meaningful to examine patterns that deactivate the neurons in a hidden layer, something that is implicitly ignored by the existing interpretability methods tracking solely the active aspect of the network’s computation.
Tasks
Published 2019-09-25
URL https://arxiv.org/abs/1909.11275v3
PDF https://arxiv.org/pdf/1909.11275v3.pdf
PWC https://paperswithcode.com/paper/switched-linear-projections-and-inactive
Repo
Framework
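
The idea behind the abstract above, that for a fixed input a ReLU network reduces to a single affine map through its active subnetwork, can be checked on a toy example. This is a minimal sketch under that general piecewise-affine property, not the authors' implementation; the weights and the helper names `forward` and `effective_linear_map` are hypothetical.

```python
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, b1, W2, b2, x):
    """One hidden ReLU layer; also returns the 0/1 activation pattern for x."""
    pre = [p + b for p, b in zip(matvec(W1, x), b1)]
    hidden = [max(0.0, p) for p in pre]
    out = [o + b for o, b in zip(matvec(W2, hidden), b2)]
    mask = [1.0 if p > 0 else 0.0 for p in pre]
    return out, mask


def effective_linear_map(W1, b1, W2, b2, mask):
    """Collapse the active subnetwork into W_eff x + b_eff for this activation pattern."""
    hidden_units = range(len(mask))
    W_eff = [[sum(W2[k][j] * mask[j] * W1[j][i] for j in hidden_units)
              for i in range(len(W1[0]))] for k in range(len(W2))]
    b_eff = [sum(W2[k][j] * mask[j] * b1[j] for j in hidden_units) + b2[k]
             for k in range(len(W2))]
    return W_eff, b_eff


# Hypothetical 2-3-1 network and input.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
b1 = [0.0, -1.0, 0.5]
W2 = [[1.0, -2.0, 0.5]]
b2 = [0.25]
x = [2.0, 1.0]

output, mask = forward(W1, b1, W2, b2, x)
W_eff, b_eff = effective_linear_map(W1, b1, W2, b2, mask)
projected = [p + b for p, b in zip(matvec(W_eff, x), b_eff)]
print(output, projected, mask)
```

The third hidden unit is inactive for this input, so its weights drop out of `W_eff`; the rows of `W1` belonging to such switched-off units are the kind of deactivating patterns the abstract suggests examining.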