October 19, 2019

Paper Group ANR 142

Evaluation of Preference of Multimedia Content using Deep Neural Networks for Electroencephalography

Title Evaluation of Preference of Multimedia Content using Deep Neural Networks for Electroencephalography
Authors Seong-Eun Moon, Soobeom Jang, Jong-Seok Lee
Abstract Evaluation of quality of experience (QoE) based on electroencephalography (EEG) has received great attention due to its capability of real-time QoE monitoring of users. However, it still suffers from rather low recognition accuracy. In this paper, we propose a novel method using deep neural networks toward improved modeling of EEG and thereby improved recognition accuracy. In particular, we aim to model spatio-temporal characteristics relevant for QoE analysis within learning models. The results demonstrate the effectiveness of the proposed method.
Tasks EEG
Published 2018-09-11
URL http://arxiv.org/abs/1809.03650v2
PDF http://arxiv.org/pdf/1809.03650v2.pdf
PWC https://paperswithcode.com/paper/evaluation-of-preference-of-multimedia
Repo
Framework

Deep Auxiliary Learning for Visual Localization and Odometry

Title Deep Auxiliary Learning for Visual Localization and Odometry
Authors Abhinav Valada, Noha Radwan, Wolfram Burgard
Abstract Localization is an indispensable component of a robot’s autonomy stack that enables it to determine where it is in the environment, essentially making it a precursor for any action execution or planning. Although convolutional neural networks have shown promising results for visual localization, they are still grossly outperformed by state-of-the-art local feature-based techniques. In this work, we propose VLocNet, a new convolutional neural network architecture for 6-DoF global pose regression and odometry estimation from consecutive monocular images. Our multitask model incorporates hard parameter sharing, thus being compact and enabling real-time inference, in addition to being end-to-end trainable. We propose a novel loss function that utilizes auxiliary learning to leverage relative pose information during training, thereby constraining the search space to obtain consistent pose estimates. We evaluate our proposed VLocNet on indoor as well as outdoor datasets and show that even our single task model exceeds the performance of state-of-the-art deep architectures for global localization, while achieving competitive performance for visual odometry estimation. Furthermore, we present extensive experimental evaluations utilizing our proposed Geometric Consistency Loss that show the effectiveness of multitask learning and demonstrate that our model is the first deep learning technique to be on par with, and in some cases outperforms state-of-the-art SIFT-based approaches.
Tasks Auxiliary Learning, Visual Localization, Visual Odometry
Published 2018-03-09
URL http://arxiv.org/abs/1803.03642v1
PDF http://arxiv.org/pdf/1803.03642v1.pdf
PWC https://paperswithcode.com/paper/deep-auxiliary-learning-for-visual
Repo
Framework

Dial2Desc: End-to-end Dialogue Description Generation

Title Dial2Desc: End-to-end Dialogue Description Generation
Authors Haojie Pan, Junpei Zhou, Zhou Zhao, Yan Liu, Deng Cai, Min Yang
Abstract We first propose a new task named Dialogue Description (Dial2Desc). Unlike other existing dialogue summarization tasks such as meeting summarization, we do not maintain the natural flow of a conversation but describe an object or an action of what people are talking about. The Dial2Desc system takes a dialogue text as input, then outputs a concise description of the object or the action involved in this conversation. After reading this short description, one can quickly extract the main topic of a conversation and build a clear picture in his mind, without reading or listening to the whole conversation. Based on the existing dialogue dataset, we build a new dataset, which has more than one hundred thousand dialogue-description pairs. As a step forward, we demonstrate that one can get more accurate and descriptive results using a new neural attentive model that exploits the interaction between utterances from different speakers, compared with other baselines.
Tasks Meeting Summarization
Published 2018-11-01
URL http://arxiv.org/abs/1811.00185v1
PDF http://arxiv.org/pdf/1811.00185v1.pdf
PWC https://paperswithcode.com/paper/dial2desc-end-to-end-dialogue-description
Repo
Framework

Local Frequency Interpretation and Non-Local Self-Similarity on Graph for Point Cloud Inpainting

Title Local Frequency Interpretation and Non-Local Self-Similarity on Graph for Point Cloud Inpainting
Authors Zeqing Fu, Wei Hu, Zongming Guo
Abstract As 3D scanning devices and depth sensors mature, point clouds have attracted increasing attention as a format for 3D object representation, with applications in various fields such as tele-presence, navigation and heritage reconstruction. However, point clouds usually exhibit holes of missing data, mainly due to the limitation of acquisition techniques and complicated structure. Further, point clouds are defined on irregular non-Euclidean domains, which is challenging to address especially with conventional signal processing tools. Hence, leveraging on recent advances in graph signal processing, we propose an efficient point cloud inpainting method, exploiting both the local smoothness and the non-local self-similarity in point clouds. Specifically, we first propose a frequency interpretation in graph nodal domain, based on which we introduce the local graph-signal smoothness prior in order to describe the local smoothness of point clouds. Secondly, we explore the characteristics of non-local self-similarity, by globally searching for the most similar area to the missing region. The similarity metric between two areas is defined based on the direct component and the anisotropic graph total variation of normals in each area. Finally, we formulate the hole-filling step as an optimization problem based on the selected most similar area and regularized by the graph-signal smoothness prior. Besides, we propose voxelization and automatic hole detection methods for the point cloud prior to inpainting. Experimental results show that the proposed approach outperforms four competing methods significantly, both in objective and subjective quality.
Tasks
Published 2018-09-28
URL http://arxiv.org/abs/1810.03973v1
PDF http://arxiv.org/pdf/1810.03973v1.pdf
PWC https://paperswithcode.com/paper/181003973
Repo
Framework

Data Poisoning Attacks against Online Learning

Title Data Poisoning Attacks against Online Learning
Authors Yizhen Wang, Kamalika Chaudhuri
Abstract We consider data poisoning attacks, a class of adversarial attacks on machine learning where an adversary has the power to alter a small fraction of the training data in order to make the trained classifier satisfy certain objectives. While there has been much prior work on data poisoning, most of it is in the offline setting, and attacks for online learning, where training data arrives in a streaming manner, are not well understood. In this work, we initiate a systematic investigation of data poisoning attacks for online learning. We formalize the problem into two settings, and we propose a general attack strategy, formulated as an optimization problem, that applies to both with some modifications. We propose three solution strategies, and perform extensive experimental evaluation. Finally, we discuss the implications of our findings for building successful defenses.
Tasks data poisoning
Published 2018-08-27
URL http://arxiv.org/abs/1808.08994v1
PDF http://arxiv.org/pdf/1808.08994v1.pdf
PWC https://paperswithcode.com/paper/data-poisoning-attacks-against-online
Repo
Framework
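
The abstract above contrasts offline poisoning with the streaming setting, where points arrive one at a time. The following is a toy sketch of that setting only, not the paper's optimization-based attack: a scalar least-squares learner is updated by online gradient descent, and a hypothetical adversary overwrites the labels at chosen stream positions. The names `online_sgd` and `poison_stream`, the learning rate, and the data are all illustrative assumptions.

```python
def online_sgd(stream, lr=0.1, w0=0.0):
    """Online gradient descent on the squared loss (w*x - y)^2, one point at a time."""
    w = w0
    for x, y in stream:
        w -= lr * 2.0 * (w * x - y) * x
    return w


def poison_stream(stream, positions, target_slope):
    """Return a copy of the stream whose labels at `positions` follow the attacker's slope."""
    positions = set(positions)
    return [
        (x, target_slope * x) if i in positions else (x, y)
        for i, (x, y) in enumerate(stream)
    ]


# Clean stream: 50 points from y = 2x (hypothetical data).
clean = [(1.0, 2.0)] * 50

w_clean = online_sgd(clean)
# Poison 10% of the stream, either at the end or at the start.
w_late = online_sgd(poison_stream(clean, range(45, 50), target_slope=-2.0))
w_early = online_sgd(poison_stream(clean, range(0, 5), target_slope=-2.0))
```

In this toy, the same five poisoned points drag the final weight below zero when they arrive last, but are almost entirely forgotten when they arrive first, which is one reason the streaming case needs separate treatment from the offline one.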

Bayesian parameter estimation of miss-specified models

Title Bayesian parameter estimation of miss-specified models
Authors Johannes Oberpriller, T. A. Enßlin
Abstract Fitting a simplifying model with several parameters to real data of complex objects is a highly nontrivial task, but enables the possibility to get insights into the objects physics. Here, we present a method to infer the parameters of the model, the model error as well as the statistics of the model error. This method relies on the usage of many data sets in a simultaneous analysis in order to overcome the problems caused by the degeneracy between model parameters and model error. Errors in the modeling of the measurement instrument can be absorbed in the model error allowing for applications with complex instruments.
Tasks
Published 2018-12-19
URL http://arxiv.org/abs/1812.08194v1
PDF http://arxiv.org/pdf/1812.08194v1.pdf
PWC https://paperswithcode.com/paper/bayesian-parameter-estimation-of-miss
Repo
Framework

Theoretical Perspective of Convergence Complexity of Evolutionary Algorithms Adopting Optimal Mixing

Title Theoretical Perspective of Convergence Complexity of Evolutionary Algorithms Adopting Optimal Mixing
Authors Yu-Fan Tung, Tian-Li Yu
Abstract The optimal mixing evolutionary algorithms (OMEAs) have recently drawn much attention for their robustness, small size of required population, and efficiency in terms of number of function evaluations (NFE). In this paper, the performances and behaviors of OMEAs are studied by investigating the mechanism of optimal mixing (OM), the variation operator in OMEAs, under two scenarios – one-layer and two-layer masks. For the case of one-layer masks, the required population size is derived from the viewpoint of initial supply, while the convergence time is derived by analyzing the progress of sub-solution growth. NFE is then asymptotically bounded with rational probability by estimating the probability of performing evaluations. For the case of two-layer masks, empirical results indicate that the required population size is proportional to both the degree of cross competition and the results from the one-layer-mask case. The derived models also indicate that population sizing is decided by initial supply when disjoint masks are adopted, that the high selection pressure imposed by OM makes the composition of sub-problems impact little on NFE, and that the population size requirement for two-layer masks increases with the reverse-growth probability.
Tasks
Published 2018-07-24
URL http://arxiv.org/abs/1807.09203v1
PDF http://arxiv.org/pdf/1807.09203v1.pdf
PWC https://paperswithcode.com/paper/theoretical-perspective-of-convergence
Repo
Framework

Structured Disentangled Representations

Title Structured Disentangled Representations
Authors Babak Esmaeili, Hao Wu, Sarthak Jain, Alican Bozkurt, N. Siddharth, Brooks Paige, Dana H. Brooks, Jennifer Dy, Jan-Willem van de Meent
Abstract Deep latent-variable models learn representations of high-dimensional data in an unsupervised manner. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. These approaches generally assume a simple diagonal Gaussian prior and as a result are not able to reliably disentangle discrete factors of variation. We propose a two-level hierarchical objective to control relative degree of statistical independence between blocks of variables and individual variables within blocks. We derive this objective as a generalization of the evidence lower bound, which allows us to explicitly represent the trade-offs between mutual information between data and representation, KL divergence between representation and prior, and coverage of the support of the empirical data distribution. Experiments on a variety of datasets demonstrate that our objective can not only disentangle discrete variables, but that doing so also improves disentanglement of other variables and, importantly, generalization even to unseen combinations of factors.
Tasks Latent Variable Models
Published 2018-04-06
URL http://arxiv.org/abs/1804.02086v4
PDF http://arxiv.org/pdf/1804.02086v4.pdf
PWC https://paperswithcode.com/paper/structured-disentangled-representations
Repo
Framework
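
For reference, the standard evidence lower bound that the abstract says the proposed objective generalizes can be written as below. This is the textbook form for a latent-variable model with encoder $q_\phi(z \mid x)$, decoder $p_\theta(x \mid z)$ and prior $p(z)$; it is not the paper's two-level hierarchical objective, and the symbol names are the usual conventions rather than the authors' notation.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```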

Gradient-Based Low-Light Image Enhancement

Title Gradient-Based Low-Light Image Enhancement
Authors Masayuki Tanaka, Takashi Shibata, Masatoshi Okutomi
Abstract Low-light image enhancement is a highly demanded image processing technique, especially for consumer digital cameras and cameras on mobile phones. In this paper, a gradient-based low-light image enhancement algorithm is proposed. The key is to enhance the gradients of dark regions, because the human visual system is more sensitive to gradients than to absolute values. In addition, we introduce intensity-range constraints for the image integration. By using the intensity-range constraints, we can integrate the output image with enhanced gradients, preserving the given gradient information while keeping the output image within a given intensity range. Experiments demonstrate that the proposed gradient-based low-light image enhancement can effectively enhance low-light images.
Tasks Image Enhancement, Low-Light Image Enhancement
Published 2018-09-25
URL http://arxiv.org/abs/1809.09297v1
PDF http://arxiv.org/pdf/1809.09297v1.pdf
PWC https://paperswithcode.com/paper/gradient-based-low-light-image-enhancement
Repo
Framework
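
The idea in the abstract above (amplify gradients in dark regions, then integrate back to intensities within a fixed range) can be illustrated on a 1D signal. This is a simplified sketch under stated assumptions, not the authors' algorithm: the paper enforces the intensity range as constraints during integration, whereas here a running sum followed by clamping stands in for that step. The function name, `dark_threshold`, and `gain` are hypothetical.

```python
def enhance_low_light_1d(signal, dark_threshold=0.3, gain=3.0, lo=0.0, hi=1.0):
    """Amplify gradients that start at dark samples, re-integrate, then clamp to [lo, hi]."""
    if not signal:
        return []
    out = [signal[0]]
    for prev, cur in zip(signal, signal[1:]):
        grad = cur - prev
        if prev < dark_threshold:
            # Dark region: boost the local gradient (assumed constant gain).
            grad *= gain
        # Integration step: rebuild intensities from the modified gradients.
        out.append(out[-1] + grad)
    # Crude stand-in for the paper's intensity-range constraints.
    return [min(hi, max(lo, v)) for v in out]


dark_row = [0.05, 0.07, 0.10, 0.08]  # hypothetical dark scanline
enhanced = enhance_low_light_1d(dark_row)
```

With the default gain, each step between neighbouring dark samples becomes three times larger in `enhanced` than in `dark_row`, while every output value stays inside the `[lo, hi]` range.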

Action Categorization for Computationally Improved Task Learning and Planning

Title Action Categorization for Computationally Improved Task Learning and Planning
Authors Lakshmi Nair, Sonia Chernova
Abstract This paper explores the problem of task learning and planning, contributing the Action-Category Representation (ACR) to improve computational performance of both Planning and Reinforcement Learning (RL). ACR is an algorithm-agnostic, abstract data representation that maps objects to action categories (groups of actions), inspired by the psychological concept of action codes. We validate our approach in StarCraft and Lightworld domains; our results demonstrate several benefits of ACR relating to improved computational performance of planning and RL, by reducing the action space for the agent.
Tasks Starcraft
Published 2018-04-26
URL http://arxiv.org/abs/1804.09856v1
PDF http://arxiv.org/pdf/1804.09856v1.pdf
PWC https://paperswithcode.com/paper/action-categorization-for-computationally
Repo
Framework

Real-time stereo vision-based lane detection system

Title Real-time stereo vision-based lane detection system
Authors Rui Fan, Naim Dahnoun
Abstract The detection of multiple curved lane markings on a non-flat road surface is still a challenging task for automotive applications. To make an improvement, the depth information can be used to greatly enhance the robustness of the lane detection systems. The proposed system in this paper is developed from our previous work where the dense vanishing point Vp is estimated globally to assist the detection of multiple curved lane markings. However, the outliers in the optimal solution may severely affect the accuracy of the least squares fitting when estimating Vp. Therefore, in this paper we use Random Sample Consensus to update the inliers and outliers iteratively until the fraction of the number of inliers versus the total number exceeds our pre-set threshold. This significantly helps the system to overcome some suddenly changing conditions. Furthermore, we propose a novel lane position validation approach which provides a piecewise weight based on Vp and the gradient to reduce the gradient magnitude of the non-lane candidates. Then, we compute the energy of each possible solution and select all satisfying lane positions for visualisation. The proposed system is implemented on a heterogeneous system which consists of an Intel Core i7-4720HQ CPU and a NVIDIA GTX 970M GPU. A processing speed of 143 fps has been achieved, which is over 38 times faster than our previous work. Also, in order to evaluate the detection precision, we tested 2495 frames with 5361 lanes from the KITTI database (1637 lanes more than our previous experiment). It is shown that the overall successful detection rate is improved from 98.7% to 99.5%.
Tasks Lane Detection
Published 2018-07-08
URL http://arxiv.org/abs/1807.02752v1
PDF http://arxiv.org/pdf/1807.02752v1.pdf
PWC https://paperswithcode.com/paper/real-time-stereo-vision-based-lane-detection
Repo
Framework
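
The abstract above describes running Random Sample Consensus until the fraction of inliers exceeds a pre-set threshold, so that outliers do not distort the least-squares fit. A minimal sketch of that loop follows, assuming a plain line model y = a*x + b rather than the paper's vanishing-point model; the function names, tolerance, threshold, and data are illustrative assumptions.

```python
import random


def fit_line_least_squares(points):
    """Ordinary least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, tol=0.5, min_inlier_fraction=0.8, max_iters=200, seed=0):
    """Sample point pairs until the best consensus set reaches the required fraction."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(max_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical pair cannot define y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            # Refit on the consensus set only, so outliers do not bias the result.
            best_inliers = inliers
            best_model = fit_line_least_squares(inliers)
        if len(best_inliers) / len(points) >= min_inlier_fraction:
            break
    return best_model, best_inliers


# Hypothetical data: 20 points on y = 2x + 1 plus 4 gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
points += [(3.0, 40.0), (7.0, -30.0), (12.0, 55.0), (15.0, -10.0)]
model, inliers = ransac_line(points)
```

Stopping on the inlier fraction rather than after a fixed number of iterations keeps the loop short when the data are clean and lets it run longer when conditions change suddenly.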

Learning to See Forces: Surgical Force Prediction with RGB-Point Cloud Temporal Convolutional Networks

Title Learning to See Forces: Surgical Force Prediction with RGB-Point Cloud Temporal Convolutional Networks
Authors Cong Gao, Xingtong Liu, Michael Peven, Mathias Unberath, Austin Reiter
Abstract Robotic surgery has been proven to offer clear advantages during surgical procedures; however, one of its major limitations is obtaining haptic feedback. Since it is often challenging to devise a hardware solution with accurate force feedback, we propose the use of “visual cues” to infer forces from tissue deformation. Endoscopic video is a passive sensor that is freely available, in the sense that any minimally-invasive procedure already utilizes it. To this end, we employ deep learning to infer forces from video as an attractive low-cost and accurate alternative to typically complex and expensive hardware solutions. First, we demonstrate our approach in a phantom setting using the da Vinci Surgical System affixed with an OptoForce sensor. Second, we validate our method on an ex vivo liver organ. Our method results in a mean absolute error of 0.814 N in the ex vivo study, suggesting that it may be a promising alternative to hardware-based surgical force feedback in endoscopic procedures.
Tasks
Published 2018-07-31
URL http://arxiv.org/abs/1808.00057v1
PDF http://arxiv.org/pdf/1808.00057v1.pdf
PWC https://paperswithcode.com/paper/learning-to-see-forces-surgical-force
Repo
Framework

A Block Coordinate Descent Proximal Method for Simultaneous Filtering and Parameter Estimation

Title A Block Coordinate Descent Proximal Method for Simultaneous Filtering and Parameter Estimation
Authors Ramin Raziperchikolaei, Harish S. Bhat
Abstract We propose and analyze a block coordinate descent proximal algorithm (BCD-prox) for simultaneous filtering and parameter estimation of ODE models. As we show on ODE systems with up to d=40 dimensions, as compared to state-of-the-art methods, BCD-prox exhibits increased robustness (to noise, parameter initialization, and hyperparameters), decreased training times, and improved accuracy of both filtered states and estimated parameters. We show how BCD-prox can be used with multistep numerical discretizations, and we establish convergence of BCD-prox under hypotheses that include real systems of interest.
Tasks
Published 2018-10-16
URL https://arxiv.org/abs/1810.06759v2
PDF https://arxiv.org/pdf/1810.06759v2.pdf
PWC https://paperswithcode.com/paper/a-direct-method-to-learn-states-and
Repo
Framework

SUCAG: Stochastic Unbiased Curvature-aided Gradient Method for Distributed Optimization

Title SUCAG: Stochastic Unbiased Curvature-aided Gradient Method for Distributed Optimization
Authors Hoi-To Wai, Nikolaos M. Freris, Angelia Nedic, Anna Scaglione
Abstract We propose and analyze a new stochastic gradient method, which we call Stochastic Unbiased Curvature-aided Gradient (SUCAG), for finite sum optimization problems. SUCAG constitutes an unbiased total gradient tracking technique that uses Hessian information to accelerate convergence. We analyze our method under the general asynchronous model of computation, in which each function is selected infinitely often with possibly unbounded (but sublinear) delay. For strongly convex problems, we establish linear convergence for the SUCAG method. When the initialization point is sufficiently close to the optimal solution, the established convergence rate is only dependent on the condition number of the problem, making it strictly faster than the known rate for the SAGA method. Furthermore, we describe a Markov-driven approach of implementing the SUCAG method in a distributed asynchronous multi-agent setting, via gossiping along a random walk on an undirected communication graph. We show that our analysis applies as long as the graph is connected and, notably, establishes an asymptotic linear convergence rate that is robust to the graph topology. Numerical results demonstrate the merits of our algorithm over existing methods.
Tasks Distributed Optimization
Published 2018-03-22
URL http://arxiv.org/abs/1803.08198v2
PDF http://arxiv.org/pdf/1803.08198v2.pdf
PWC https://paperswithcode.com/paper/sucag-stochastic-unbiased-curvature-aided
Repo
Framework

Revisiting the Hierarchical Multiscale LSTM

Title Revisiting the Hierarchical Multiscale LSTM
Authors Ákos Kádár, Marc-Alexandre Côté, Grzegorz Chrupała, Afra Alishahi
Abstract Hierarchical Multiscale LSTM (Chung et al., 2016a) is a state-of-the-art language model that learns interpretable structure from character-level input. Such models can provide fertile ground for (cognitive) computational linguistics studies. However, the high complexity of the architecture, training procedure and implementations might hinder its applicability. We provide a detailed reproduction and ablation study of the architecture, shedding light on some of the potential caveats of re-purposing complex deep-learning architectures. We further show that simplifying certain aspects of the architecture can in fact improve its performance. We also investigate the linguistic units (segments) learned by various levels of the model, and argue that their quality does not correlate with the overall performance of the model on language modeling.
Tasks Language Modelling
Published 2018-07-10
URL http://arxiv.org/abs/1807.03595v1
PDF http://arxiv.org/pdf/1807.03595v1.pdf
PWC https://paperswithcode.com/paper/revisiting-the-hierarchical-multiscale-lstm
Repo
Framework