January 27, 2020

Paper Group ANR 1283

An Efficient Intelligent System for the Classification of Electroencephalography (EEG) Brain Signals using Nuclear Features for Human Cognitive Tasks. McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds. A Full-Image Full-Resolution End-to-End-Trainable CNN Framework for Image Forgery Detection. Lower Dimensional Kernels …

An Efficient Intelligent System for the Classification of Electroencephalography (EEG) Brain Signals using Nuclear Features for Human Cognitive Tasks


Title	An Efficient Intelligent System for the Classification of Electroencephalography (EEG) Brain Signals using Nuclear Features for Human Cognitive Tasks
Authors	Emad-ul-Haq Qazi, Muhammad Hussain, Hatim Aboalsamh
Abstract	Representation and classification of Electroencephalography (EEG) brain signals are critical processes for their analysis in cognitive tasks. In particular, extraction of discriminative features from raw EEG signals, without any pre-processing, is a challenging task. Motivated by the nuclear norm, we observed that there is a significant difference between the variances of EEG signals captured from the same brain region when a subject performs different tasks. This observation led us to use singular value decomposition for computing dominant variances of EEG signals captured from a certain brain region while performing a certain task and use them as features (nuclear features). A simple and efficient class means based minimum distance classifier (CMMDC) is enough to predict brain states. This approach results in a feature space of significantly smaller dimension and gives equally good classification results on clean as well as raw data. We validated the effectiveness and robustness of the technique using four datasets of different tasks: fluid intelligence clean data (FICD), fluid intelligence raw data (FIRD), memory recall task (MRT), and eyes open / eyes closed task (EOEC). For each task, we analyzed EEG signals over six different brain regions with 8, 16, 20, 18, 18 and 100 electrodes. The nuclear features from the frontal brain region gave 100% prediction accuracy. The discriminant analysis of the nuclear features has been conducted using intra-class and inter-class variations. Comparisons with the state-of-the-art techniques showed the superiority of the proposed system.
Tasks	EEG
Published	2019-04-30
URL	http://arxiv.org/abs/1904.13228v1
PDF	http://arxiv.org/pdf/1904.13228v1.pdf
PWC	https://paperswithcode.com/paper/an-efficient-intelligent-system-for-the
Repo
Framework
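
The class means based minimum distance classifier named in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the feature vectors (for example, leading singular values of an EEG segment) have already been computed, it assumes Euclidean distance, and the names `fit_class_means` and `predict` as well as the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_class_means(features, labels):
    """Return {label: mean feature vector} computed from the training set."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(class_means, x):
    """Assign x to the class whose mean is nearest (Euclidean distance, an assumption)."""
    return min(class_means, key=lambda y: math.dist(class_means[y], x))


# Toy, made-up feature vectors standing in for nuclear features.
train_x = [[5.0, 1.0], [5.2, 0.8], [1.0, 0.2], [1.2, 0.4]]
train_y = ["task_a", "task_a", "task_b", "task_b"]
means = fit_class_means(train_x, train_y)
```

Calling `predict(means, [4.9, 0.9])` compares the query against one mean per class, so the cost of prediction grows with the number of classes rather than the number of training samples.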

McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds


Title	McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds
Authors	Rui Ray Zhang, Xingwu Liu, Yuyi Wang, Liwei Wang
Abstract	A crucial assumption in most statistical learning theory is that samples are independently and identically distributed (i.i.d.). However, for many real applications, the i.i.d. assumption does not hold. We consider learning problems in which examples are dependent and their dependency relation is characterized by a graph. To establish algorithm-dependent generalization theory for learning with non-i.i.d. data, we first prove novel McDiarmid-type concentration inequalities for Lipschitz functions of graph-dependent random variables. We show that concentration relies on the forest complexity of the graph, which characterizes the strength of the dependency. We demonstrate that for many types of dependent data, the forest complexity is small and thus implies good concentration. Based on our new inequalities we are able to build stability bounds for learning from graph-dependent data.
Tasks
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02330v2
PDF	https://arxiv.org/pdf/1909.02330v2.pdf
PWC	https://paperswithcode.com/paper/mcdiarmid-type-inequalities-for-graph
Repo
Framework
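
For context, the classical McDiarmid inequality for independent variables, which the abstract says is extended to graph-dependent variables, can be stated as follows. The symbols $X_i$, $f$, $c_i$ and $t$ are standard notation assumed here, not taken from the abstract; the graph-dependent version and the precise role of forest complexity are given in the paper.

```latex
% Assumption: X_1, ..., X_n are independent, and f has bounded differences,
% i.e. changing only the i-th argument changes f by at most c_i.
\[
  \sup_{x_1,\dots,x_n,\,x_i'}
  \bigl| f(x_1,\dots,x_i,\dots,x_n) - f(x_1,\dots,x_i',\dots,x_n) \bigr| \le c_i
  \quad \text{for all } i .
\]
% Then, for every t > 0:
\[
  \Pr\bigl( f(X_1,\dots,X_n) - \mathbb{E}\, f(X_1,\dots,X_n) \ge t \bigr)
  \le \exp\!\left( -\frac{2t^2}{\sum_{i=1}^{n} c_i^2} \right).
\]
```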

A Full-Image Full-Resolution End-to-End-Trainable CNN Framework for Image Forgery Detection


Title	A Full-Image Full-Resolution End-to-End-Trainable CNN Framework for Image Forgery Detection
Authors	Francesco Marra, Diego Gragnaniello, Luisa Verdoliva, Giovanni Poggi
Abstract	Due to limited computational and memory resources, current deep learning models accept only rather small images as input, calling for preliminary image resizing. This is not a problem for high-level vision problems, where discriminative features are barely affected by resizing. On the contrary, in image forensics, resizing tends to destroy precious high-frequency details, impacting heavily on performance. One can avoid resizing by means of patch-wise processing, at the cost of renouncing whole-image analysis. In this work, we propose a CNN-based image forgery detection framework which makes decisions based on full-resolution information gathered from the whole image. Thanks to gradient checkpointing, the framework is trainable end-to-end with limited memory resources and weak (image-level) supervision, allowing for the joint optimization of all parameters. Experiments on widespread image forensics datasets prove the good performance of the proposed approach, which largely outperforms all baselines and all reference methods.
Tasks
Published	2019-09-15
URL	https://arxiv.org/abs/1909.06751v1
PDF	https://arxiv.org/pdf/1909.06751v1.pdf
PWC	https://paperswithcode.com/paper/a-full-image-full-resolution-end-to-end
Repo
Framework

Lower Dimensional Kernels for Video Discriminators


Title	Lower Dimensional Kernels for Video Discriminators
Authors	Emmanuel Kahembwe, Subramanian Ramamoorthy
Abstract	This work presents an analysis of the discriminators used in Generative Adversarial Networks (GANs) for Video. We show that unconstrained video discriminator architectures induce a loss surface with high curvature, which makes optimisation difficult. We also show that this curvature becomes more extreme as the maximal kernel dimension of video discriminators increases. With these observations in hand, we propose a family of efficient Lower-Dimensional Video Discriminators for GANs (LDVD GANs). The proposed family of discriminators improves the performance of video GAN models they are applied to and demonstrates good performance on complex and diverse datasets such as UCF-101. In particular, we show that they can double the performance of Temporal-GANs and provide for state-of-the-art performance on a single GPU.
Tasks	Video Generation
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08860v1
PDF	https://arxiv.org/pdf/1912.08860v1.pdf
PWC	https://paperswithcode.com/paper/lower-dimensional-kernels-for-video
Repo
Framework

Attention-based Transfer Learning for Brain-computer Interface


Title	Attention-based Transfer Learning for Brain-computer Interface
Authors	Chuanqi Tan, Fuchun Sun, Tao Kong, Bin Fang, Wenchang Zhang
Abstract	Different functional areas of the human brain play different roles in brain activity, a fact that has not received sufficient research attention in the brain-computer interface (BCI) field. This paper presents a new approach for electroencephalography (EEG) classification that applies attention-based transfer learning. Our approach considers the importance of different brain functional areas to improve the accuracy of EEG classification, and provides an additional way to automatically identify brain functional areas associated with new activities without the involvement of a medical professional. We demonstrate empirically that our approach outperforms state-of-the-art approaches in the task of EEG classification, and the results of visualization indicate that our approach can detect brain functional areas related to a certain task.
Tasks	EEG, Transfer Learning
Published	2019-04-25
URL	http://arxiv.org/abs/1904.11950v1
PDF	http://arxiv.org/pdf/1904.11950v1.pdf
PWC	https://paperswithcode.com/paper/attention-based-transfer-learning-for-brain
Repo
Framework

Matrix-Free Preconditioning in Online Learning


Title	Matrix-Free Preconditioning in Online Learning
Authors	Ashok Cutkosky, Tamas Sarlos
Abstract	We provide an online convex optimization algorithm with regret that interpolates between the regret of an algorithm using an optimal preconditioning matrix and one using a diagonal preconditioning matrix. Our regret bound is never worse than that obtained by diagonal preconditioning, and in certain settings even surpasses that of algorithms with full-matrix preconditioning. Importantly, our algorithm runs in the same time and space complexity as online gradient descent. Along the way we incorporate new techniques that mildly streamline and improve logarithmic factors in prior regret analyses. We conclude by benchmarking our algorithm on synthetic data and deep learning tasks.
Tasks
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12721v1
PDF	https://arxiv.org/pdf/1905.12721v1.pdf
PWC	https://paperswithcode.com/paper/matrix-free-preconditioning-in-online
Repo
Framework

Tripping through time: Efficient Localization of Activities in Videos


Title	Tripping through time: Efficient Localization of Activities in Videos
Authors	Meera Hahn, Asim Kadav, James M. Rehg, Hans Peter Graf
Abstract	Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video. Previous works have approached this task by processing the entire video, often more than once, to localize relevant activities. In the real-world applications that this task lends itself to, such as surveillance, efficiency is a pivotal trait of a system. In this paper, we present TripNet, an end-to-end system that uses a gated attention architecture to model fine-grained textual and visual representations in order to align text and video content. Furthermore, TripNet uses reinforcement learning to efficiently localize relevant activity clips in long videos by learning how to intelligently skip around the video. It extracts visual features for fewer frames to perform activity classification. In our evaluation over Charades-STA, ActivityNet Captions and the TACoS dataset, we find that TripNet achieves high accuracy and saves processing time by only looking at 32-41% of the entire video.
Tasks
Published	2019-04-22
URL	https://arxiv.org/abs/1904.09936v4
PDF	https://arxiv.org/pdf/1904.09936v4.pdf
PWC	https://paperswithcode.com/paper/tripping-through-time-efficient-localization
Repo
Framework

Autonomous Underwater Vehicle: Electronics and Software Implementation of the Proton AUV


Title	Autonomous Underwater Vehicle: Electronics and Software Implementation of the Proton AUV
Authors	Vivek Mange, Priyam Shah, Vishal Kothari
Abstract	The paper deals with the software and the electronics unit for an autonomous underwater vehicle. The electronics unit implements the connection and communication between the SBC, the Pixhawk controller, and other sensory hardware and actuators. The major part of the software unit is the algorithm for object detection based on a Convolutional Neural Network (CNN) and its models. The hyperparameters were tuned for the Odroid XU4 for various models. The maneuvering algorithm uses the MAVLink protocol of the ArduSub project for movement and its simulation.
Tasks	Object Detection
Published	2019-09-08
URL	https://arxiv.org/abs/1909.03472v1
PDF	https://arxiv.org/pdf/1909.03472v1.pdf
PWC	https://paperswithcode.com/paper/autonomous-underwater-vehicle-electronics-and
Repo
Framework

Acceptable Planning: Influencing Individual Behavior to Reduce Transportation Energy Expenditure of a City


Title	Acceptable Planning: Influencing Individual Behavior to Reduce Transportation Energy Expenditure of a City
Authors	Shiwali Mohan, Hesham Rakha, Matthew Klenk
Abstract	Our research aims at developing intelligent systems to reduce the transportation-related energy expenditure of a large city by influencing individual behavior. We introduce COPTER, an intelligent travel assistant that evaluates multi-modal travel alternatives to find a plan that is acceptable to a person given their context and preferences. We propose a formulation for acceptable planning that brings together ideas from AI, machine learning, and economics. This formulation has been incorporated in COPTER, which produces acceptable plans in real time. We adopt a novel empirical evaluation framework that combines human decision data with a high-fidelity multi-modal transportation simulation to demonstrate a 4% energy reduction and 20% delay reduction in a realistic deployment scenario in Los Angeles, California, USA.
Tasks
Published	2019-09-23
URL	https://arxiv.org/abs/1909.10614v1
PDF	https://arxiv.org/pdf/1909.10614v1.pdf
PWC	https://paperswithcode.com/paper/acceptable-planning-influencing-individual
Repo
Framework

Sequential Mode Estimation with Oracle Queries


Title	Sequential Mode Estimation with Oracle Queries
Authors	Dhruti Shah, Tuhinangshu Choudhury, Nikhil Karamchandani, Aditya Gopalan
Abstract	We consider the problem of adaptively PAC-learning a probability distribution $\mathcal{P}$'s mode by querying an oracle for information about a sequence of i.i.d. samples $X_1, X_2, \ldots$ generated from $\mathcal{P}$. We consider two different query models: (a) each query is an index $i$ for which the oracle reveals the value of the sample $X_i$, (b) each query is comprised of two indices $i$ and $j$ for which the oracle reveals if the samples $X_i$ and $X_j$ are the same or not. For these query models, we give sequential mode-estimation algorithms which, at each time $t$, either make a query to the corresponding oracle based on past observations, or decide to stop and output an estimate for the distribution’s mode, required to be correct with a specified confidence. We analyze the query complexity of these algorithms for any underlying distribution $\mathcal{P}$, and derive corresponding lower bounds on the optimal query complexity under the two querying models.
Tasks
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08197v1
PDF	https://arxiv.org/pdf/1911.08197v1.pdf
PWC	https://paperswithcode.com/paper/sequential-mode-estimation-with-oracle
Repo
Framework
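
Query model (b) from the abstract, in which the oracle only reports whether two samples are equal, can be illustrated with a small fixed-sample sketch. This is not the paper's sequential algorithm: it has no stopping rule and gives no confidence guarantee. It only shows how pairwise queries alone can recover the most frequent value among a given set of samples. The function name `mode_by_pairwise_queries` and the toy data are hypothetical.

```python
def mode_by_pairwise_queries(n_samples, same):
    """Group sample indices into equal-value clusters using only same(i, j).

    Returns (index of a representative of the largest cluster, queries used).
    Assumes n_samples >= 1.
    """
    clusters = []  # each cluster is a list of indices holding equal values
    queries = 0
    for i in range(n_samples):
        for cluster in clusters:
            queries += 1
            if same(i, cluster[0]):  # one query against the cluster representative
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no existing cluster matched: start a new one
    largest = max(clusters, key=len)
    return largest[0], queries


# Hidden samples; the procedure only sees the oracle, never the values.
hidden = ["b", "a", "b", "c", "b", "a"]
rep, used = mode_by_pairwise_queries(len(hidden), lambda i, j: hidden[i] == hidden[j])
```

Each new sample is compared against one representative per cluster, so the number of queries in this sketch is at most the number of samples times the number of distinct values seen.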

Efficient Memory Management for GPU-based Deep Learning Systems


Title	Efficient Memory Management for GPU-based Deep Learning Systems
Authors	Junzhe Zhang, Sai Ho Yeung, Yao Shu, Bingsheng He, Wei Wang
Abstract	GPUs (graphics processing units) have been used for many data-intensive applications. Among them, deep learning systems are among the most important consumers of GPUs nowadays. As deep learning applications adopt deeper and larger models in order to achieve higher accuracy, memory management becomes an important research topic for deep learning systems, given that GPUs have limited memory. Many approaches have been proposed to address this issue, e.g., model compression and memory swapping. However, they either degrade the model accuracy or require a lot of manual intervention. In this paper, we propose two orthogonal approaches to reduce the memory cost from the system perspective. Our approaches are transparent to the models, and thus do not affect the model accuracy. They are achieved by exploiting the iterative nature of the training algorithm of deep learning to derive the lifetime and read/write order of all variables. With the lifetime semantics, we are able to implement a memory pool with minimal fragmentation. However, the optimization problem is NP-complete. We propose a heuristic algorithm that reduces memory usage by up to 13.3% compared with Nvidia’s default memory pool with equal time complexity. With the read/write semantics, the variables that are not in use can be swapped out from GPU to CPU to reduce the memory footprint. We propose multiple swapping strategies to automatically decide which variable to swap and when to swap out (in), which reduces the memory cost by up to 34.2% without communication overhead.
Tasks	Model Compression
Published	2019-02-19
URL	http://arxiv.org/abs/1903.06631v1
PDF	http://arxiv.org/pdf/1903.06631v1.pdf
PWC	https://paperswithcode.com/paper/efficient-memory-management-for-gpu-based
Repo
Framework
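
The idea of using variable lifetimes to pack a memory pool can be sketched with a simple greedy heuristic. This is an illustrative stand-in, not the heuristic proposed in the paper: it places larger variables first and gives each one the lowest offset that does not collide with any already-placed variable whose lifetime overlaps. The function name `assign_offsets` and the example variables are hypothetical.

```python
def assign_offsets(variables):
    """variables: list of (name, size, first_use, last_use), lifetimes inclusive.

    Returns ({name: offset}, peak pool size).
    """
    placed = []  # (offset, size, first_use, last_use)
    offsets = {}
    # Place large variables first; ties are broken by name for determinism.
    for name, size, first, last in sorted(variables, key=lambda v: (-v[1], v[0])):
        # Only blocks whose lifetime overlaps this variable constrain its address.
        busy = sorted((o, s) for o, s, f, l in placed if f <= last and first <= l)
        offset = 0
        for o, s in busy:
            if offset + size <= o:
                break  # the gap before this block is large enough
            offset = max(offset, o + s)
        placed.append((offset, size, first, last))
        offsets[name] = offset
    peak = max((o + s for o, s, _, _ in placed), default=0)
    return offsets, peak


# Hypothetical variables: (name, size, first use, last use).
demo = [("a", 4, 0, 1), ("b", 2, 1, 3), ("c", 4, 2, 4), ("d", 2, 4, 5)]
offsets, peak = assign_offsets(demo)
```

In this toy example the variables total 12 units, but because `a` and `c` are never live at the same time they can share addresses, and the same holds for `b` and `d`.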

Robust Federated Learning in a Heterogeneous Environment


Title	Robust Federated Learning in a Heterogeneous Environment
Authors	Avishek Ghosh, Justin Hong, Dong Yin, Kannan Ramchandran
Abstract	We study a recently proposed large-scale distributed learning paradigm, namely Federated Learning, where the worker machines are end users’ own devices. Statistical and computational challenges arise in Federated Learning particularly in the presence of heterogeneous data distribution (i.e., data points on different devices belong to different distributions signifying different clusters) and Byzantine machines (i.e., machines that may behave abnormally, or even exhibit arbitrary and potentially adversarial behavior). To address the aforementioned challenges, first we propose a general statistical model for this problem which takes both the cluster structure of the users and the Byzantine machines into account. Then, leveraging the statistical model, we solve the robust heterogeneous Federated Learning problem optimally; in particular our algorithm matches the lower bound on the estimation error in dimension and the number of data points. Furthermore, as a by-product, we prove statistical guarantees for an outlier-robust clustering algorithm, which can be considered as the Lloyd algorithm with robust estimation. Finally, we show via synthetic as well as real data experiments that the estimation error obtained by our proposed algorithm is significantly better than the non-Byzantine-robust algorithms; in particular, we gain at least 53% and 33% for synthetic and real data experiments, respectively, in typical settings.
Tasks
Published	2019-06-16
URL	https://arxiv.org/abs/1906.06629v2
PDF	https://arxiv.org/pdf/1906.06629v2.pdf
PWC	https://paperswithcode.com/paper/robust-federated-learning-in-a-heterogeneous
Repo
Framework
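
The abstract describes the clustering step as "the Lloyd algorithm with robust estimation". A minimal sketch of that idea follows, assuming the coordinate-wise median as the robust estimator; the paper's actual estimator and guarantees may differ. The function name `robust_lloyd`, the initial centers, and the toy points are all hypothetical.

```python
import math
from statistics import median


def robust_lloyd(points, centers, iterations=10):
    """Lloyd-style clustering whose center update is a coordinate-wise median."""
    centers = [list(c) for c in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Assignment step: nearest center in Euclidean distance.
            nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            groups[nearest].append(p)
        for k, group in enumerate(groups):
            if group:  # keep the old center if no point was assigned
                centers[k] = [median(coord) for coord in zip(*group)]
    return centers


# Made-up 2-D "worker estimates": two clusters plus one wild outlier.
points = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1), (50.0, 50.0),
          (10.0, 10.0), (10.1, 9.9), (9.8, 10.2)]
centers = robust_lloyd(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

With a mean update, the outlier at (50, 50) would drag the second center to roughly (20, 20); with the median update it stays near (10, 10).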

A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition


Title	A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition
Authors	Lilia Simeonova, Kiril Simov, Petya Osenova, Preslav Nakov
Abstract	We propose a morphologically informed model for named entity recognition, which is based on an LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information. While previous work has focused on learning from raw word input, using word and character embeddings only, we show that for morphologically rich languages, such as Bulgarian, access to POS information contributes more to the performance gains than the detailed morphological information. Thus, we show that named entity recognition needs only coarse-grained POS tags, but at the same time it can benefit from simultaneously using some POS information of different granularity. Our evaluation results over a standard dataset show sizable improvements over the state-of-the-art for Bulgarian NER.
Tasks	Named Entity Recognition, Word Embeddings
Published	2019-08-27
URL	https://arxiv.org/abs/1908.10261v1
PDF	https://arxiv.org/pdf/1908.10261v1.pdf
PWC	https://paperswithcode.com/paper/a-morpho-syntactically-informed-lstm-crf
Repo
Framework

Dual affine moment invariants


Title	Dual affine moment invariants
Authors	You Hao, Hanlin Mo, Qi Li, He Zhang, Hua Li
Abstract	Affine transformation is one of the most common transformations in nature, and it is an important issue in the field of computer vision and shape analysis. Affine transformations often occur in both shape and color space simultaneously, which can be termed Dual-Affine Transformation (DAT). In general, we should derive invariants of different data formats separately, such as 2D color images, 3D color objects, or even higher-dimensional data. To the best of our knowledge, there is no general framework to derive invariants for all of these data formats. In this paper, we propose a general framework to derive moment invariants under DAT for objects in M-dimensional space with N channels, which can be called dual-affine moment invariants (DAMI). Following this framework, we present the generating formula of DAMI under DAT for 3D color objects. Then, we instantiate a complete set of DAMI for 3D color objects with orders and degrees no greater than 4. Finally, we analyze the characteristics of these DAMI and conduct classification experiments to evaluate their stability and discriminability. The results prove that DAMI is robust to DAT. Our derivation framework can be applied to data in any dimension with any number of channels.
Tasks
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08233v1
PDF	https://arxiv.org/pdf/1911.08233v1.pdf
PWC	https://paperswithcode.com/paper/dual-affine-moment-invariants
Repo
Framework

The Complexity of Finding Stationary Points with Stochastic Gradient Descent


Title	The Complexity of Finding Stationary Points with Stochastic Gradient Descent
Authors	Yoel Drori, Ohad Shamir
Abstract	We study the iteration complexity of stochastic gradient descent (SGD) for minimizing the gradient norm of smooth, possibly nonconvex functions. We provide several results, implying that the classical $\mathcal{O}(\epsilon^{-4})$ upper bound (for making the average gradient norm less than $\epsilon$) cannot be improved upon, unless a combination of additional assumptions is made. Notably, this holds even if we limit ourselves to convex quadratic functions. We also show that for nonconvex functions, the feasibility of minimizing gradients with SGD is surprisingly sensitive to the choice of optimality criteria.
Tasks
Published	2019-10-04
URL	https://arxiv.org/abs/1910.01845v1
PDF	https://arxiv.org/pdf/1910.01845v1.pdf
PWC	https://paperswithcode.com/paper/the-complexity-of-finding-stationary-points
Repo
Framework
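
For reference, the classical upper bound mentioned in the abstract is usually stated along the following lines. The symbols $L$ (smoothness constant), $\sigma^2$ (bound on the stochastic gradient variance), $\Delta$ (initial suboptimality) and $T$ (number of iterations) are standard notation assumed here, not taken from the abstract, and constants are omitted.

```latex
% Assumptions: f is L-smooth, f(x_1) - \inf f \le \Delta, and the stochastic
% gradients are unbiased with variance at most \sigma^2.
% With a suitably chosen constant step size, SGD satisfies
\[
  \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}\,\bigl\| \nabla f(x_t) \bigr\|^2
  \le \mathcal{O}\!\left( \frac{L\Delta}{T} + \sigma \sqrt{\frac{L\Delta}{T}} \right),
\]
% so the right-hand side is at most \epsilon^2 once
\[
  T = \mathcal{O}\!\left( \frac{L\Delta}{\epsilon^{2}} + \frac{L\Delta\,\sigma^{2}}{\epsilon^{4}} \right),
\]
% which is the \mathcal{O}(\epsilon^{-4}) rate when the noise term dominates.
```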