April 3, 2020


Paper Group ANR 76



Fusion of Camera Model and Source Device Specific Forensic Methods for Improved Tamper Detection

Title Fusion of Camera Model and Source Device Specific Forensic Methods for Improved Tamper Detection
Authors Ahmet Gökhan Poyraz, Ahmet Emir Dirik, Ahmet Karaküçük, Nasir Memon
Abstract The PRNU-based camera recognition method is widely studied in the image forensics literature. In recent years, CNN-based camera model recognition methods have also been developed. Both approaches offer solutions to the tamper localization problem. In this paper, we propose combining them via a neural network to achieve better small-scale tamper detection performance. According to the results, the fusion method performs better than the underlying methods even under high JPEG compression. For forgeries as small as 100$\times$100 pixels, the proposed method outperforms the state of the art, which validates the usefulness of fusion for localizing small image forgeries. We believe the proposed approach is feasible for any tamper-detection pipeline using the PRNU-based methodology.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10123v1
PDF https://arxiv.org/pdf/2002.10123v1.pdf
PWC https://paperswithcode.com/paper/fusion-of-camera-model-and-source-device
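As a rough illustration of the fusion idea (the paper trains a neural network on the two methods' outputs; its exact architecture and weights are not given here), a single logistic unit combining a PRNU-based tamper score and a CNN-based camera-model inconsistency score per image block might look like this, with placeholder weights:

```python
import numpy as np

def fuse_tamper_scores(prnu_map, cnn_map, w=(4.0, 4.0), b=-4.0):
    """Fuse two per-block tamper scores with one logistic unit.

    prnu_map: PRNU-correlation-based tamper likelihood per block (0..1)
    cnn_map:  CNN camera-model inconsistency score per block (0..1)
    The weights w and bias b are illustrative placeholders, not the
    trained network from the paper.
    """
    z = w[0] * np.asarray(prnu_map) + w[1] * np.asarray(cnn_map) + b
    return 1.0 / (1.0 + np.exp(-z))  # fused tamper probability per block

# one suspicious block and one clean block
scores = fuse_tamper_scores([0.9, 0.1], [0.8, 0.2])
```

In a real pipeline the fusion weights would be learned from labeled tampered/pristine blocks rather than fixed by hand.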

Quantized Push-sum for Gossip and Decentralized Optimization over Directed Graphs

Title Quantized Push-sum for Gossip and Decentralized Optimization over Directed Graphs
Authors Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani
Abstract We consider a decentralized stochastic learning problem where data points are distributed among computing nodes communicating over a directed graph. As model size grows, decentralized learning faces a major bottleneck: the heavy communication load caused by each node transmitting large messages (model updates) to its neighbors. To tackle this bottleneck, we propose a quantized decentralized stochastic learning algorithm over directed graphs based on the push-sum algorithm from decentralized consensus optimization. Importantly, we prove that our algorithm achieves the same convergence rates as exact-communication decentralized stochastic learning for both convex and non-convex losses. A key technical challenge of this work is proving exact convergence of the proposed decentralized learning algorithm in the presence of quantization noise with unbounded variance over directed graphs. We provide numerical evaluations that corroborate our main theoretical results and illustrate significant speed-ups over exact-communication methods.
Tasks Quantization
Published 2020-02-23
URL https://arxiv.org/abs/2002.09964v2
PDF https://arxiv.org/pdf/2002.09964v2.pdf
PWC https://paperswithcode.com/paper/quantized-push-sum-for-gossip-and
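To make the push-sum mechanism concrete, here is a toy sketch of quantized push-sum averaging on a directed ring. The uniform rounding quantizer and the fixed "forward half to the successor" rule are simplifying assumptions, not the paper's scheme, but they show why the ratio x/w still converges to the average over one-directional links:

```python
import numpy as np

def quantized_push_sum(values, steps=150, levels=1024):
    """Push-sum averaging over a directed ring with quantized messages.

    Each node keeps a value x_i and a weight w_i and forwards half of
    both to its ring successor; the ratio x_i / w_i converges to the
    global average even though communication is one-directional.
    Outgoing values are rounded to a uniform grid as a toy stand-in
    for the quantization scheme analysed in the paper.
    """
    x = np.array(values, dtype=float)
    w = np.ones_like(x)
    q = lambda v: np.round(v * levels) / levels  # uniform quantizer

    for _ in range(steps):
        x_out, w_out = q(x / 2), w / 2   # quantized message to successor
        x = x / 2 + np.roll(x_out, 1)    # node i receives from node i-1
        w = w / 2 + np.roll(w_out, 1)
    return x / w

est = quantized_push_sum([1.0, 2.0, 3.0, 4.0])  # true average is 2.5
```

The quantization error perturbs the conserved mass slightly, which is exactly the kind of noise the paper's analysis must control to prove exact convergence.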

Generalisation error in learning with random features and the hidden manifold model

Title Generalisation error in learning with random features and the hidden manifold model
Authors Federica Gerace, Bruno Loureiro, Florent Krzakala, Marc Mézard, Lenka Zdeborová
Abstract We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden manifold model. We consider the high-dimensional regime and using the replica method from statistical physics, we provide a closed-form expression for the asymptotic generalisation performance in these problems, valid in both the under- and over-parametrised regimes and for a broad choice of generalised linear model loss functions. In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression with a peak at the interpolation threshold, we illustrate the superiority of orthogonal against random Gaussian projections in learning with random features, and discuss the role played by correlations in the data generated by the hidden manifold model. Beyond the interest in these particular problems, the theoretical formalism introduced in this manuscript provides a path to further extensions to more complex tasks.
Published 2020-02-21
URL https://arxiv.org/abs/2002.09339v1
PDF https://arxiv.org/pdf/2002.09339v1.pdf
PWC https://paperswithcode.com/paper/generalisation-error-in-learning-with-random
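The double-descent behaviour the abstract mentions is easy to reproduce empirically in a toy random-features experiment. The sketch below (all sizes, the ReLU feature map, and the noise level are illustrative choices, not the paper's setup, which is an asymptotic replica computation) fits a minimum-norm interpolant and shows that test error is far worse at the interpolation threshold p = n than deep in the over-parametrised regime:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20
w_star = rng.normal(size=d) / np.sqrt(d)   # linear teacher

def data(m):
    X = rng.normal(size=(m, d))
    y = X @ w_star + 0.5 * rng.normal(size=m)   # noisy labels
    return X, y

Xtr, ytr = data(n)
Xte, yte = data(2000)

def rf_test_error(p):
    W = rng.normal(size=(d, p)) / np.sqrt(d)    # random projection
    F = lambda X: np.maximum(X @ W, 0.0)        # ReLU random features
    a = np.linalg.pinv(F(Xtr)) @ ytr            # min-norm interpolant
    return float(np.mean((F(Xte) @ a - yte) ** 2))

err_at_threshold = rf_test_error(n)       # p = n: interpolation peak
err_overparam = rf_test_error(10 * n)     # heavily over-parametrised
```

At p = n the feature matrix is square and near-singular, so the interpolant has a huge norm and the test error spikes; the over-parametrised min-norm solution is far better behaved.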

Markov Logic Networks with Complex Weights: Expressivity, Liftability and Fourier Transforms

Title Markov Logic Networks with Complex Weights: Expressivity, Liftability and Fourier Transforms
Authors Ondrej Kuzelka
Abstract We study the expressivity of Markov logic networks (MLNs). We introduce complex MLNs, which use complex-valued weights, and show that, unlike standard MLNs with real-valued weights, complex MLNs are fully expressive. We then observe that the discrete Fourier transform can be computed using weighted first-order model counting (WFOMC) with complex weights, and use this observation to design an algorithm for computing relational marginal polytopes that needs substantially fewer calls to a WFOMC oracle than a recent algorithm.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10259v1
PDF https://arxiv.org/pdf/2002.10259v1.pdf
PWC https://paperswithcode.com/paper/markov-logic-networks-with-complex-weights
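The key observation that a discrete Fourier transform is just a complex-weighted sum can be sanity-checked in a few lines. This sketch treats each index n as a "model" and weights it by a complex root of unity, loosely mirroring how the paper pushes the DFT through weighted model counting (the actual WFOMC machinery for first-order formulas is far more involved):

```python
import numpy as np

def dft_as_weighted_sum(signal):
    """DFT written as a complex-weighted sum: component k of the
    spectrum is sum_n signal[n] * exp(-2*pi*i*k*n/N), i.e. every
    'model' n counted with a complex weight."""
    N = len(signal)
    n = np.arange(N)
    weights = np.exp(-2j * np.pi * np.outer(n, n) / N)  # weights[k, n]
    return weights @ np.asarray(signal, dtype=complex)

x = [1.0, 2.0, 3.0, 4.0]
spectrum = dft_as_weighted_sum(x)   # matches np.fft.fft(x)
```

The point is that nothing about the weighted-sum formulation requires real weights, which is what lets WFOMC with complex weights express the transform.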

On Feature Interactions Identified by Shapley Values of Binary Classification Games

Title On Feature Interactions Identified by Shapley Values of Binary Classification Games
Authors Sandhya Tripathi, N. Hemachandra, Prashant Trivedi
Abstract For feature selection and related problems, we introduce the notion of a classification game, a cooperative game with features as players and a hinge-loss-based characteristic function, and relate a feature’s contribution to Shapley-value-based error apportioning (SVEA) of the total training error. Our major contribution is ($\star$) to show that for any dataset, thresholding the SVEA value at 0 identifies a feature subset whose joint interactions are significant for label prediction, or those features that span a subspace where the data predominantly lies. In addition, our scheme ($\star$) identifies the features on which the Bayes classifier does not depend but on which any surrogate-loss-based finite-sample classifier does, which contributes to the excess $0$-$1$ risk of such a classifier, ($\star$) estimates the unknown true hinge risk of a feature, and ($\star$) relates the stability property of an allocation to negative-valued SVEA by designing the analogue of the core of the classification game. Because Shapley values are computationally expensive, we build on a known Monte Carlo approximation algorithm that evaluates the characteristic function (linear programs) only when needed. We address the potential sample-bias problem in feature selection by providing interval estimates for SVEA values obtained from multiple sub-samples. We illustrate all of the above on various synthetic and real datasets and show that our scheme achieves better results than the existing recursive feature elimination technique and ReliefF in most cases. Our theoretically grounded classification game, in terms of a well-defined characteristic function, offers interpretability and explainability of our framework, including the identification of important features.
Tasks Feature Selection
Published 2020-01-12
URL https://arxiv.org/abs/2001.03956v1
PDF https://arxiv.org/pdf/2001.03956v1.pdf
PWC https://paperswithcode.com/paper/on-feature-interactions-identified-by-shapley
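The Monte Carlo Shapley approximation the abstract refers to is standard permutation sampling: average each player's marginal contribution over random orderings, evaluating the characteristic function lazily. A minimal sketch (the paper's characteristic function solves a hinge-loss linear program per coalition; here `v` is any callable, verified on an additive toy game whose Shapley values are known exactly):

```python
import numpy as np

def monte_carlo_shapley(v, n_players, n_perms=2000, seed=0):
    """Estimate Shapley values of the game v by permutation sampling.
    v maps a frozenset of player indices to that coalition's value and
    is evaluated only when a new coalition is actually needed."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_players)
    for _ in range(n_perms):
        perm = rng.permutation(n_players)
        coalition, prev = set(), v(frozenset())
        for i in perm:
            coalition.add(i)
            cur = v(frozenset(coalition))
            phi[i] += cur - prev   # marginal contribution of player i
            prev = cur
    return phi / n_perms

# additive toy game: exact Shapley values are the weights themselves
weights = [1.0, 2.0, 3.0]
game = lambda S: sum(weights[i] for i in S)
phi = monte_carlo_shapley(game, 3)
```

For an additive game every marginal contribution equals the player's weight, so the estimate is exact regardless of how many permutations are sampled; for the classification game the estimate converges as `n_perms` grows.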

Extended Batch Normalization

Title Extended Batch Normalization
Authors Chunjie Luo, Jianfeng Zhan, Lei Wang, Wanling Gao
Abstract Batch normalization (BN) has become a standard technique for training modern deep networks. However, its effectiveness diminishes when the batch size becomes small, since the batch statistics become inaccurate. This hinders batch normalization’s use for 1) training larger models, which requires small batches due to memory constraints, and 2) training on mobile or embedded devices, where memory is limited. In this paper, we propose a simple but effective method called extended batch normalization (EBN). For NCHW-format feature maps, extended batch normalization computes the mean along the (N, H, W) dimensions, just as batch normalization does, to retain batch normalization’s advantages. To alleviate the problem caused by small batch sizes, extended batch normalization computes the standard deviation along the (N, C, H, W) dimensions, thus enlarging the number of samples from which the standard deviation is computed. We compare extended batch normalization with batch normalization and group normalization on MNIST, CIFAR-10/100, STL-10, and ImageNet. The experiments show that extended batch normalization alleviates batch normalization’s small-batch problem while achieving performance close to that of batch normalization with large batch sizes.
Published 2020-03-12
URL https://arxiv.org/abs/2003.05569v1
PDF https://arxiv.org/pdf/2003.05569v1.pdf
PWC https://paperswithcode.com/paper/extended-batch-normalization
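The statistic change EBN makes is small enough to sketch directly. Below is the normalization step only, following the abstract's description (per-channel mean over (N, H, W) as in standard BN, one shared standard deviation over all of (N, C, H, W)); the learnable scale and shift parameters of a full layer are omitted:

```python
import numpy as np

def extended_batch_norm(x, eps=1e-5):
    """Extended batch normalization for an NCHW feature map:
    per-channel mean as in standard BN, but the standard deviation is
    computed over all (N, C, H, W) entries, so small batches still
    give the std estimate many samples."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)  # per-channel, like BN
    std = x.std()                                 # shared over N,C,H,W
    return (x - mean) / (std + eps)

x = np.random.default_rng(0).normal(size=(2, 3, 4, 4))  # tiny batch
y = extended_batch_norm(x)
```

With a batch of 2, standard BN would estimate each channel's std from 2*4*4 = 32 values; EBN's shared std uses all 96, which is the whole point of the method.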

The Voltage Regulation of Boost Converters Using Dual Heuristic Programming

Title The Voltage Regulation of Boost Converters Using Dual Heuristic Programming
Authors Sepehr Saadatmand, Mohammadamir Kavousi, Sima Azizi
Abstract In this paper, a dual heuristic programming (DHP) controller is proposed to control a boost converter. Conventional controllers such as proportional-integral-derivative (PID) or proportional-integral (PI) controllers are designed from the linearized small-signal model near an operating point, so their performance during start-up, load changes, or input voltage variations is suboptimal, since the system model changes as the operating point varies. The dual heuristic programming controller optimally controls the boost converter by following the approximate dynamic programming framework. The advantage of DHP is that the neural-network-based nature of the proposed controller enables boost converters to easily cope with large disturbances. A DHP controller with well-trained critic and action networks can act as an optimal controller for the boost converter. Simulation results are provided to compare the effectiveness of the traditional PI-based controller and the DHP controller.
Published 2020-01-27
URL https://arxiv.org/abs/2001.10863v1
PDF https://arxiv.org/pdf/2001.10863v1.pdf
PWC https://paperswithcode.com/paper/the-voltage-regulation-of-boost-converters

Variance Loss in Variational Autoencoders

Title Variance Loss in Variational Autoencoders
Authors Andrea Asperti
Abstract In this article, we highlight what appears to be a major issue with Variational Autoencoders, evidenced by extensive experimentation with different network architectures and datasets: the variance of generated data is significantly lower than that of the training data. Since generative models are usually evaluated with metrics such as the Fréchet Inception Distance (FID), which compare the distributions of (features of) real versus generated images, this variance loss typically results in degraded scores. The problem is particularly relevant in the two-stage setting, where a second VAE is used to sample in the latent space of the first VAE. The reduced variance creates a mismatch between the actual distribution of latent variables and those generated by the second VAE, which hinders the beneficial effects of the second stage. By renormalizing the output of the second VAE towards the expected standard spherical distribution, we obtain a sudden boost in the quality of generated samples, as also reflected in FID scores.
Published 2020-02-23
URL https://arxiv.org/abs/2002.09860v1
PDF https://arxiv.org/pdf/2002.09860v1.pdf
PWC https://paperswithcode.com/paper/variance-loss-in-variational-autoencoders
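The renormalization fix the abstract describes is essentially per-dimension standardization of the second-stage latent samples toward N(0, I). A minimal sketch of that post-hoc step (the exact normalization the paper uses may differ in detail):

```python
import numpy as np

def renormalize_to_spherical(z, eps=1e-8):
    """Rescale a batch of generated latent codes so their empirical
    per-dimension mean and std match the standard spherical Gaussian
    N(0, I) that the decoder expects at sampling time."""
    z = np.asarray(z, dtype=float)
    return (z - z.mean(axis=0)) / (z.std(axis=0) + eps)

rng = np.random.default_rng(0)
# simulate the variance-loss pathology: shrunken, shifted latents
z = 0.6 * rng.normal(size=(1000, 8)) + 0.3
z_fixed = renormalize_to_spherical(z)
```

Feeding `z_fixed` rather than `z` to the first VAE's decoder restores the match between sampled and expected latent distributions.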

HIN: Hierarchical Inference Network for Document-Level Relation Extraction

Title HIN: Hierarchical Inference Network for Document-Level Relation Extraction
Authors Hengzhu Tang, Yanan Cao, Zhenyu Zhang, Jiangxia Cao, Fang Fang, Shi Wang, Pengfei Yin
Abstract Document-level relation extraction (RE) requires reading, inferring, and aggregating over multiple sentences. In our view, document-level RE must take advantage of inference information at multiple granularities: entity level, sentence level, and document level. How to obtain and aggregate inference information at these different granularities is thus a key challenge for document-level RE, one not considered by previous work. In this paper, we propose a Hierarchical Inference Network (HIN) to make full use of the abundant information at the entity, sentence, and document levels. Translation constraints and bilinear transformations are applied to the target entity pair in multiple subspaces to obtain entity-level inference information. Next, we model the inference between entity-level information and sentence representations to obtain sentence-level inference information. Finally, a hierarchical aggregation approach is adopted to obtain document-level inference information. In this way, our model effectively aggregates inference information across these three granularities. Experimental results show that our method achieves state-of-the-art performance on the large-scale DocRED dataset. We also demonstrate that using BERT representations further boosts performance substantially.
Tasks Relation Extraction
Published 2020-03-28
URL https://arxiv.org/abs/2003.12754v1
PDF https://arxiv.org/pdf/2003.12754v1.pdf
PWC https://paperswithcode.com/paper/hin-hierarchical-inference-network-for

L2B: Learning to Balance the Safety-Efficiency Trade-off in Interactive Crowd-aware Robot Navigation

Title L2B: Learning to Balance the Safety-Efficiency Trade-off in Interactive Crowd-aware Robot Navigation
Authors Mai Nishimura, Ryo Yonetani
Abstract This work presents a deep reinforcement learning framework for interactive navigation in a crowded place. Our proposed approach, the Learning to Balance (L2B) framework, enables mobile robot agents to steer safely towards their destinations by avoiding collisions with a crowd, while actively clearing a path by asking nearby pedestrians to make room when necessary to keep their travel efficient. We observe that the safety and efficiency requirements in crowd-aware navigation involve a trade-off in the presence of social dilemmas between the agent and the crowd. On the one hand, intervening in pedestrian paths too aggressively for instant efficiency disrupts the natural crowd flow and may eventually put everyone, including the agent itself, at risk of collision. On the other hand, remaining passive to avoid every single collision leads to inefficient travel. With this observation, our L2B framework augments the reward function used in learning an interactive navigation policy to penalize both frequent active path clearing and passive collision avoidance, which substantially improves the balance of the safety-efficiency trade-off. We evaluate our L2B framework in a challenging crowd simulation and demonstrate its superiority, in terms of both navigation success and collision rate, over a state-of-the-art navigation approach.
Tasks Robot Navigation
Published 2020-03-20
URL https://arxiv.org/abs/2003.09207v1
PDF https://arxiv.org/pdf/2003.09207v1.pdf
PWC https://paperswithcode.com/paper/l2b-learning-to-balance-the-safety-efficiency
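The reward augmentation described above can be sketched as a shaped reward with separate penalties for the two failure modes. All coefficients below are illustrative placeholders, not the paper's values:

```python
def l2b_reward(reached_goal, collided, asked_to_clear, progress,
               r_goal=1.0, r_collision=-0.25, r_ask=-0.02):
    """Shaped reward in the spirit of L2B: reward progress toward the
    goal, penalize collisions (the passive failure mode), and charge a
    small cost for each 'please make room' request (the active one),
    so the policy must balance safety against efficiency."""
    r = 0.1 * progress             # dense progress-toward-goal term
    if reached_goal:
        r += r_goal
    if collided:
        r += r_collision
    if asked_to_clear:
        r += r_ask                 # discourage over-using path clearing
    return r

# a step that makes some progress but asks the crowd to make room
r = l2b_reward(False, False, True, progress=0.5)
```

Because both over-asking and over-yielding are penalized, the learned policy cannot trivially maximize either safety or efficiency alone.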

Agile Earth observation satellite scheduling over 20 years: formulations, methods and future directions

Title Agile Earth observation satellite scheduling over 20 years: formulations, methods and future directions
Authors Xinwei Wang, Guohua Wu, Lining Xing, Witold Pedrycz
Abstract Agile satellites with advanced attitude maneuvering capability are the new generation of Earth observation satellites (EOSs). Continuous improvements in satellite technology and decreasing launch costs have boosted the development of agile EOSs (AEOSs). To efficiently employ the increasing number of orbiting AEOSs, the AEOS scheduling problem (AEOSSP), which aims to maximize the total observation profit while satisfying all complex operational constraints, has received much attention over the past 20 years. The objectives of this paper are thus to summarize current research on the AEOSSP, identify its main accomplishments, and highlight potential future research directions. To this end, general definitions of the AEOSSP with operational constraints are described first, followed by three typical variations involving different definitions of observation profit, multi-objective formulations, and autonomous models. A detailed literature review from 1997 to 2019 is then presented, organized by four solution methods: exact methods, heuristics, metaheuristics, and machine learning. Finally, we discuss a number of topics worth pursuing in the future.
Published 2020-03-13
URL https://arxiv.org/abs/2003.06169v1
PDF https://arxiv.org/pdf/2003.06169v1.pdf
PWC https://paperswithcode.com/paper/agile-earth-observation-satellite-scheduling

Adaptive Distributed Stochastic Gradient Descent for Minimizing Delay in the Presence of Stragglers

Title Adaptive Distributed Stochastic Gradient Descent for Minimizing Delay in the Presence of Stragglers
Authors Serge Kas Hanna, Rawad Bitar, Parimal Parag, Venkat Dasari, Salim El Rouayheb
Abstract We consider the setting where a master wants to run a distributed stochastic gradient descent (SGD) algorithm on $n$ workers each having a subset of the data. Distributed SGD may suffer from the effect of stragglers, i.e., slow or unresponsive workers who cause delays. One solution studied in the literature is to wait at each iteration for the responses of the fastest $k<n$ workers before updating the model, where $k$ is a fixed parameter. The choice of the value of $k$ presents a trade-off between the runtime (i.e., convergence rate) of SGD and the error of the model. Towards optimizing the error-runtime trade-off, we investigate distributed SGD with adaptive $k$. We first design an adaptive policy for varying $k$ that optimizes this trade-off based on an upper bound on the error as a function of the wall-clock time which we derive. Then, we propose an algorithm for adaptive distributed SGD that is based on a statistical heuristic. We implement our algorithm and provide numerical simulations which confirm our intuition and theoretical analysis.
Published 2020-02-25
URL https://arxiv.org/abs/2002.11005v1
PDF https://arxiv.org/pdf/2002.11005v1.pdf
PWC https://paperswithcode.com/paper/adaptive-distributed-stochastic-gradient
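The wait-for-fastest-k mechanism and an adaptive schedule for k are easy to simulate. The sketch below uses a simple quadratic objective, exponential worker delays, and an illustrative "increase k over time" policy; the paper derives its policy from an error-versus-wall-clock bound rather than this fixed schedule:

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, d, lr = 10, 5, 0.1
w_true = np.ones(d)          # minimizer of f(w) = 0.5 * ||w - w_true||^2
w = np.zeros(d)
wall_clock = 0.0

def worker_grad(w):
    # noisy stochastic gradient returned by one worker
    return (w - w_true) + 0.1 * rng.normal(size=d)

for t in range(200):
    # adaptive policy (illustrative): few workers early for fast noisy
    # steps, more workers later for low-variance accurate steps
    k = min(n_workers, 2 + t // 50)
    delays = np.sort(rng.exponential(size=n_workers))
    wall_clock += delays[k - 1]           # wait only for k-th fastest
    grads = [worker_grad(w) for _ in range(k)]
    w -= lr * np.mean(grads, axis=0)      # update with fastest k grads

final_error = float(np.linalg.norm(w - w_true))
```

Small k buys cheap iterations at the cost of gradient variance; growing k trades iteration speed for accuracy, which is exactly the error-runtime trade-off the paper optimizes.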

An Accurate EEGNet-based Motor-Imagery Brain-Computer Interface for Low-Power Edge Computing

Title An Accurate EEGNet-based Motor-Imagery Brain-Computer Interface for Low-Power Edge Computing
Authors Xiaying Wang, Michael Hersche, Batuhan Tömekce, Burak Kaya, Michele Magno, Luca Benini
Abstract This paper presents an accurate and robust embedded motor-imagery brain-computer interface (MI-BCI). The proposed novel model, based on EEGNet, matches the requirements of memory footprint and computational resources of low-power microcontroller units (MCUs), such as the ARM Cortex-M family. Furthermore, the paper presents a set of methods, including temporal downsampling, channel selection, and narrowing of the classification window, to further scale down the model to relax memory requirements with negligible accuracy degradation. Experimental results on the Physionet EEG Motor Movement/Imagery Dataset show that standard EEGNet achieves 82.43%, 75.07%, and 65.07% classification accuracy on 2-, 3-, and 4-class MI tasks in global validation, outperforming the state-of-the-art (SoA) convolutional neural network (CNN) by 2.05%, 5.25%, and 5.48%. Our novel method further scales down the standard EEGNet at a negligible accuracy loss of 0.31% with 7.6x memory footprint reduction and a small accuracy loss of 2.51% with 15x reduction. The scaled models are deployed on a commercial Cortex-M4F MCU taking 101ms and consuming 1.9mJ per inference for operating the smallest model, and on a Cortex-M7 with 44ms and 6.2mJ per inference for the medium-sized model, enabling a fully autonomous, wearable, and accurate low-power BCI.
Tasks EEG
Published 2020-03-31
URL https://arxiv.org/abs/2004.00077v1
PDF https://arxiv.org/pdf/2004.00077v1.pdf
PWC https://paperswithcode.com/paper/an-accurate-eegnet-based-motor-imagery-brain

Coordinate-wise Armijo’s condition: General case

Title Coordinate-wise Armijo’s condition: General case
Authors Tuyen Trung Truong
Abstract Let $z=(x,y)$ be coordinates for the product space $\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$. Let $f:\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}\rightarrow \mathbb{R}$ be a $C^1$ function, and $\nabla f=(\partial _xf,\partial _yf)$ its gradient. Fix $0<\alpha <1$. For a point $(x,y) \in \mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$, a number $\delta >0$ satisfies Armijo’s condition at $(x,y)$ if the following inequality holds: \begin{eqnarray*} f(x-\delta \partial _xf,y-\delta \partial _yf)-f(x,y)\leq -\alpha \delta (\|\partial _xf\|^2+\|\partial _yf\|^2). \end{eqnarray*} In a previous paper, we proposed the following {\bf coordinate-wise} Armijo’s condition. Fix again $0<\alpha <1$. A pair of positive numbers $\delta _1,\delta _2>0$ satisfies the coordinate-wise variant of Armijo’s condition at $(x,y)$ if the following inequality holds: \begin{eqnarray*} f(x-\delta _1\partial _xf(x,y), y-\delta _2\partial _y f(x,y))-f(x,y)\leq -\alpha (\delta _1\|\partial _xf(x,y)\|^2+\delta _2\|\partial _yf(x,y)\|^2). \end{eqnarray*} Previously we applied this condition to functions of the form $f(x,y)=f(x)+g(y)$ and proved various convergence results for them. For a general function, it is crucial - for being able to do real computations - to have a systematic algorithm for obtaining $\delta _1$ and $\delta _2$ satisfying the coordinate-wise version of Armijo’s condition, much like backtracking for the usual Armijo’s condition. In this paper we propose such an algorithm and prove corresponding convergence results. We then analyse and present experimental results for functions such as $f(x,y)=ax+y$ (given by Asl and Overton in connection with Wolfe’s method), $f(x,y)=x^3 \sin (1/x) + y^3 \sin(1/y)$, and Rosenbrock’s function.
Published 2020-03-11
URL https://arxiv.org/abs/2003.05252v1
PDF https://arxiv.org/pdf/2003.05252v1.pdf
PWC https://paperswithcode.com/paper/coordinate-wise-armijos-condition-general
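A backtracking search for the coordinate-wise condition can be sketched for the scalar case ($m_1 = m_2 = 1$, where the squared norms reduce to squared partial derivatives). Shrinking $\delta_1$ and $\delta_2$ together by a fixed factor is a simplified illustrative policy; the paper's algorithm may adjust them independently:

```python
def coordwise_backtracking(f, grad, xy, alpha=0.5, delta0=1.0,
                           beta=0.5, max_halvings=60):
    """Backtracking for the coordinate-wise Armijo condition (scalar
    case): shrink delta1 and delta2 until
    f(x - d1*gx, y - d2*gy) - f(x, y) <= -alpha*(d1*gx^2 + d2*gy^2)."""
    x, y = xy
    gx, gy = grad(x, y)
    d1 = d2 = delta0
    fxy = f(x, y)
    for _ in range(max_halvings):
        lhs = f(x - d1 * gx, y - d2 * gy) - fxy
        rhs = -alpha * (d1 * gx**2 + d2 * gy**2)
        if lhs <= rhs:
            return d1, d2          # condition satisfied
        d1 *= beta
        d2 *= beta
    return d1, d2

# separable quadratic with very different curvatures per coordinate
f = lambda x, y: 0.5 * x**2 + 50.0 * y**2
grad = lambda x, y: (x, 100.0 * y)
d1, d2 = coordwise_backtracking(f, grad, (1.0, 1.0))
```

For such ill-conditioned objectives, allowing $\delta_1 \ne \delta_2$ is exactly what the coordinate-wise condition is designed to exploit.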

A Self-Supervised Learning Approach to Rapid Path Planning for Car-Like Vehicles Maneuvering in Urban Environment

Title A Self-Supervised Learning Approach to Rapid Path Planning for Car-Like Vehicles Maneuvering in Urban Environment
Authors Piotr Kicki, Tomasz Gawron, Piotr Skrzypczyński
Abstract An efficient path planner for autonomous car-like vehicles should handle strong kinematic constraints, particularly in the confined spaces commonly encountered while maneuvering in city traffic, and should enable rapid planning, as city traffic scenarios are highly dynamic. State-of-the-art planning algorithms handle such difficult cases at high computational cost, often yielding non-deterministic results. However, feasible local paths can be generated quickly by leveraging past planning experience gained in the same or a similar environment. While learning through supervised training is problematic for real traffic scenarios, we introduce in this paper a novel neural-network-based method for path planning, which employs a gradient-based self-supervised learning algorithm to predict feasible paths. This approach strongly exploits past experience and rapidly yields feasible maneuver plans for car-like vehicles with a limited steering angle. The effectiveness of this approach has been confirmed by computational experiments.
Published 2020-03-02
URL https://arxiv.org/abs/2003.00946v1
PDF https://arxiv.org/pdf/2003.00946v1.pdf
PWC https://paperswithcode.com/paper/a-self-supervised-learning-approach-to-rapid