April 3, 2020

Paper Group ANR 53

MODRL/D-AM: Multiobjective Deep Reinforcement Learning Algorithm Using Decomposition and Attention Model for Multiobjective Optimization

Title MODRL/D-AM: Multiobjective Deep Reinforcement Learning Algorithm Using Decomposition and Attention Model for Multiobjective Optimization
Authors Hong Wu, Jiahai Wang, Zizhen Zhang
Abstract Recently, a deep reinforcement learning method was proposed to solve multiobjective optimization problems. In this method, the multiobjective optimization problem is decomposed into a number of single-objective optimization subproblems, and all the subproblems are optimized in a collaborative manner. Each subproblem is modeled with a pointer network and the model is trained with reinforcement learning. However, when the pointer network extracts the features of an instance, it ignores the underlying structure information of the input nodes. Thus, this paper proposes a multiobjective deep reinforcement learning method using decomposition and an attention model to solve multiobjective optimization problems. In our method, each subproblem is solved by an attention model, which can exploit the structure features as well as the node features of the input nodes. The experimental results on the multiobjective travelling salesman problem show that the proposed algorithm achieves better performance than the previous method.
Tasks Multiobjective Optimization
Published 2020-02-13
URL https://arxiv.org/abs/2002.05484v1
PDF https://arxiv.org/pdf/2002.05484v1.pdf
PWC https://paperswithcode.com/paper/modrld-am-multiobjective-deep-reinforcement
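
The decomposition step in the abstract above can be illustrated with a small sketch. It assumes a weighted-sum scalarization of a bi-objective travelling salesman problem, which is one common decomposition choice and not necessarily the one used in the paper. The names `weight_vectors`, `tour_costs`, and `scalarized_cost` are hypothetical, and the learned pointer network or attention model that would actually produce the tours is not shown.

```python
def weight_vectors(n):
    """Evenly spread n weight vectors (w1, w2) with w1 + w2 = 1 (requires n >= 2)."""
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def tour_costs(tour, cost_a, cost_b):
    """Return both objective values of a closed tour over two cost matrices."""
    edges = list(zip(tour, tour[1:] + tour[:1]))
    total_a = sum(cost_a[i][j] for i, j in edges)
    total_b = sum(cost_b[i][j] for i, j in edges)
    return total_a, total_b


def scalarized_cost(tour, cost_a, cost_b, weights):
    """Single-objective cost of one subproblem (assumed weighted-sum form)."""
    total_a, total_b = tour_costs(tour, cost_a, cost_b)
    return weights[0] * total_a + weights[1] * total_b


if __name__ == "__main__":
    cost_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
    cost_b = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
    for w in weight_vectors(3):
        print(w, scalarized_cost([0, 1, 2], cost_a, cost_b, w))
```

Each weight vector defines one single-objective subproblem. Solving all of them yields a set of tours that approximates the trade-off front between the two objectives.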

Minimum entropy production in multipartite processes due to neighborhood constraints

Title Minimum entropy production in multipartite processes due to neighborhood constraints
Authors David H Wolpert
Abstract I derive two lower bounds on the minimal achievable entropy production rate of a multipartite process when there are constraints on how the neighborhoods of the rate matrices of the subsystems can overlap. The first bound is based on constructing counterfactual rate matrices, in which all subsystems outside of a particular neighborhood are held fixed while those inside the neighborhood are allowed to evolve. This bound is related to the “learning rate” of stationary bipartite systems. The second bound is based on applying the inclusion-exclusion principle to the neighborhood overlaps.
Published 2020-01-07
URL https://arxiv.org/abs/2001.02205v2
PDF https://arxiv.org/pdf/2001.02205v2.pdf
PWC https://paperswithcode.com/paper/minimum-entropy-production-in-multipartite

GISNet: Graph-Based Information Sharing Network For Vehicle Trajectory Prediction

Title GISNet: Graph-Based Information Sharing Network For Vehicle Trajectory Prediction
Authors Ziyi Zhao, Haowen Fang, Zhao Jin, Qinru Qiu
Abstract Trajectory prediction is a critical and challenging problem in the design of an autonomous driving system. Many AI-oriented companies, such as Google Waymo, Uber and DiDi, are investigating more accurate vehicle trajectory prediction algorithms. However, the prediction performance is governed by many entangled factors, such as the stochastic behaviors of surrounding vehicles, historical information of the self-trajectory, and relative positions of neighbors. In this paper, we propose a novel graph-based information sharing network (GISNet) that allows information sharing between the target vehicle and its surrounding vehicles. Meanwhile, the model encodes the historical trajectory information of all the vehicles in the scene. Experiments are carried out on the public NGSIM US-101 and I-80 datasets, and the prediction performance is measured by the Root Mean Square Error (RMSE). The quantitative and qualitative experimental results show that our model significantly improves the trajectory prediction accuracy, by up to 50.00%, compared to existing models.
Tasks Autonomous Driving, Trajectory Prediction
Published 2020-03-22
URL https://arxiv.org/abs/2003.11973v1
PDF https://arxiv.org/pdf/2003.11973v1.pdf
PWC https://paperswithcode.com/paper/gisnet-graph-based-information-sharing
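
The abstract above reports accuracy as RMSE. As a minimal sketch of that metric, the function below computes the root mean square of the Euclidean displacement between predicted and ground-truth 2D positions. The exact averaging used in the paper (per horizon, per vehicle, units) is not stated here, so this form and the name `trajectory_rmse` are assumptions.

```python
import math


def trajectory_rmse(predicted, actual):
    """RMSE of Euclidean displacement between two equally long lists of (x, y) points."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("trajectories must be non-empty and of equal length")
    squared_errors = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical positions, for illustration only.
    print(trajectory_rmse([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)]))
```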

A Multi-Modal States based Vehicle Descriptor and Dilated Convolutional Social Pooling for Vehicle Trajectory Prediction

Title A Multi-Modal States based Vehicle Descriptor and Dilated Convolutional Social Pooling for Vehicle Trajectory Prediction
Authors Huimin Zhang, Yafei Wang, Junjia Liu, Chengwei Li, Taiyuan Ma, Chengliang Yin
Abstract Precise trajectory prediction of surrounding vehicles is critical for decision-making of autonomous vehicles, and learning-based approaches are well recognized for their robustness. However, state-of-the-art learning-based methods ignore 1) the feasibility of the vehicle’s multi-modal state information for prediction and 2) the mutually exclusive relationship between the global traffic scene receptive fields and the local position resolution when modeling vehicles’ interactions, which may influence prediction accuracy. Therefore, we propose a vehicle-descriptor based LSTM model with dilated convolutional social pooling (VD+DCS-LSTM) to cope with the above issues. First, each vehicle’s multi-modal state information is employed as our model’s input, and a new vehicle descriptor encoded by stacked sparse auto-encoders is proposed to reflect the deep interactive relationships between various states, achieving the optimal feature extraction and effective use of multi-modal inputs. Second, the LSTM encoder is used to encode the historical sequences composed of the vehicle descriptor, and a novel dilated convolutional social pooling is proposed to improve the modeling of vehicles’ spatial interactions. Third, the LSTM decoder is used to predict the probability distribution of future trajectories based on maneuvers. The validity of the overall model was verified on the NGSIM US-101 and I-80 datasets, and our method outperforms the latest benchmark.
Tasks Autonomous Vehicles, Decision Making, Trajectory Prediction
Published 2020-03-07
URL https://arxiv.org/abs/2003.03480v1
PDF https://arxiv.org/pdf/2003.03480v1.pdf
PWC https://paperswithcode.com/paper/a-multi-modal-states-based-vehicle-descriptor

From W-Net to CDGAN: Bi-temporal Change Detection via Deep Learning Techniques

Title From W-Net to CDGAN: Bi-temporal Change Detection via Deep Learning Techniques
Authors Bin Hou, Qingjie Liu, Heng Wang, Yunhong Wang
Abstract Traditional change detection methods usually follow the image differencing, change feature extraction and classification framework, and their performance is limited by such simple image domain differencing and also by the hand-crafted features. Recently, the success of deep convolutional neural networks (CNNs) has spread widely across the whole field of computer vision because of their powerful representation abilities. In this paper, we therefore address the remote sensing image change detection problem with deep learning techniques. We first propose an end-to-end dual-branch architecture, termed W-Net, with each branch taking as input one of the two bi-temporal images, as in traditional change detection models. In this way, CNN features with more powerful representative abilities can be obtained to boost the final detection performance. Also, W-Net performs differencing in the feature domain rather than in the traditional image domain, which greatly alleviates the loss of information useful for determining the changes. Furthermore, by reformulating change detection as an image translation problem, we apply the recently popular Generative Adversarial Network (GAN), in which our W-Net serves as the Generator, leading to a new GAN architecture for change detection which we call CDGAN. To train our networks and also facilitate future research, we construct a large scale dataset by collecting images from Google Earth and provide carefully manually annotated ground truths. Experiments show that our proposed methods can provide fine-grained change detection results superior to the existing state-of-the-art baselines.
Published 2020-03-14
URL https://arxiv.org/abs/2003.06583v1
PDF https://arxiv.org/pdf/2003.06583v1.pdf
PWC https://paperswithcode.com/paper/from-w-net-to-cdgan-bi-temporal-change

Exactly Computing the Local Lipschitz Constant of ReLU Networks

Title Exactly Computing the Local Lipschitz Constant of ReLU Networks
Authors Matt Jordan, Alexandros G. Dimakis
Abstract The Lipschitz constant of a neural network is a useful metric for provable robustness and generalization. We present a novel analytic result which relates gradient norms to Lipschitz constants for nondifferentiable functions. Next we prove hardness and inapproximability results for computing the local Lipschitz constant of ReLU neural networks. We develop a mixed-integer programming formulation to exactly compute the local Lipschitz constant for scalar and vector-valued networks. Finally, we apply our technique on networks trained on synthetic datasets and MNIST, drawing observations about the tightness of competing Lipschitz estimators and the effects of regularized training on Lipschitz constants.
Published 2020-03-02
URL https://arxiv.org/abs/2003.01219v1
PDF https://arxiv.org/pdf/2003.01219v1.pdf
PWC https://paperswithcode.com/paper/exactly-computing-the-local-lipschitz
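
The link between gradient norms and Lipschitz constants mentioned in the abstract above can be made concrete with a small sketch. It does not implement the paper's mixed-integer programming formulation. Instead it samples points in an axis-aligned box around a center and takes the largest l2 gradient norm found, which gives only a lower bound on the local Lipschitz constant of a scalar one-hidden-layer ReLU network. The network shape, the box region, and the names `relu_net_gradient` and `sampled_lipschitz_lower_bound` are all assumptions made for illustration.

```python
import math
import random


def relu_net_gradient(x, W1, b1, w2):
    """Gradient of f(x) = w2 . relu(W1 x + b1) at x, where f is differentiable."""
    grad = [0.0] * len(x)
    for row, bias, out_weight in zip(W1, b1, w2):
        pre_activation = sum(r * xi for r, xi in zip(row, x)) + bias
        if pre_activation > 0:  # only active ReLU units contribute
            for k, r in enumerate(row):
                grad[k] += out_weight * r
    return grad


def sampled_lipschitz_lower_bound(center, radius, W1, b1, w2, n_samples=1000, seed=0):
    """Largest sampled l2 gradient norm inside the box center +/- radius."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_samples):
        x = [c + rng.uniform(-radius, radius) for c in center]
        grad = relu_net_gradient(x, W1, b1, w2)
        best = max(best, math.sqrt(sum(g * g for g in grad)))
    return best


if __name__ == "__main__":
    W1, b1, w2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [1.0, 1.0]
    print(sampled_lipschitz_lower_bound([1.0, 1.0], 0.5, W1, b1, w2))
```

Sampling can miss the activation pattern with the largest gradient, so the result may underestimate the true constant. An exact method such as the one the abstract describes avoids that gap.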

Interpretable Deep Recurrent Neural Networks via Unfolding Reweighted $\ell_1$-$\ell_1$ Minimization: Architecture Design and Generalization Analysis

Title Interpretable Deep Recurrent Neural Networks via Unfolding Reweighted $\ell_1$-$\ell_1$ Minimization: Architecture Design and Generalization Analysis
Authors Huynh Van Luong, Boris Joukovsky, Nikos Deligiannis
Abstract Deep unfolding methods, for example the learned iterative shrinkage thresholding algorithm (LISTA), design deep neural networks as learned variations of optimization methods. These networks have been shown to achieve faster convergence and higher accuracy than the original optimization methods. In this line of research, this paper develops a novel deep recurrent neural network (coined reweighted-RNN) by unfolding a reweighted $\ell_1$-$\ell_1$ minimization algorithm and applies it to the task of sequential signal reconstruction. To the best of our knowledge, this is the first deep unfolding method that explores reweighted minimization. Due to the underlying reweighted minimization model, our RNN has a different soft-thresholding function (that is, a different activation function) for each hidden unit in each layer. Furthermore, it has higher network expressivity than existing deep unfolding RNN models due to the over-parameterizing weights. Importantly, we establish theoretical generalization error bounds for the proposed reweighted-RNN model by means of Rademacher complexity. The bounds reveal that the parameterization of the proposed reweighted-RNN ensures good generalization. We apply the proposed reweighted-RNN to the problem of video frame reconstruction from low-dimensional measurements, that is, sequential frame reconstruction. The experimental results on the moving MNIST dataset demonstrate that the proposed deep reweighted-RNN significantly outperforms existing RNN models.
Published 2020-03-18
URL https://arxiv.org/abs/2003.08334v1
PDF https://arxiv.org/pdf/2003.08334v1.pdf
PWC https://paperswithcode.com/paper/interpretable-deep-recurrent-neural-networks
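
The idea of a different soft-thresholding function per hidden unit, mentioned in the abstract above, can be sketched as follows. This uses the standard soft-thresholding operator from ISTA-style methods with a per-unit threshold. The paper's actual activation for the $\ell_1$-$\ell_1$ problem is not reproduced here, and the names `soft_threshold` and `reweighted_soft_threshold` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Standard soft-thresholding: shrink value toward zero by threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def reweighted_soft_threshold(values, base_threshold, weights):
    """Apply a per-unit threshold base_threshold * weight to each hidden unit."""
    return [
        soft_threshold(v, base_threshold * w)
        for v, w in zip(values, weights)
    ]


if __name__ == "__main__":
    # Hypothetical hidden-unit values and reweighting factors.
    print(reweighted_soft_threshold([3.0, -3.0, 0.5], 1.0, [1.0, 2.0, 1.0]))
```

A larger weight shrinks its unit more strongly, which is how reweighting lets each unit have its own effective activation.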

Cooperative LIDAR Object Detection via Feature Sharing in Deep Networks

Title Cooperative LIDAR Object Detection via Feature Sharing in Deep Networks
Authors Ehsan Emad Marvasti, Arash Raftari, Amir Emad Marvasti, Yaser P. Fallah, Rui Guo, HongSheng Lu
Abstract The recent advancements in communication and computational systems have led to significant improvement of situational awareness in connected and autonomous vehicles. Computationally efficient neural networks and high speed wireless vehicular networks have been some of the main contributors to this improvement. However, scalability and reliability issues caused by inherent limitations of sensory and communication systems are still challenging problems. In this paper, we aim to mitigate the effects of these limitations by introducing the concept of feature sharing for cooperative object detection (FS-COD). In our proposed approach, a better understanding of the environment is achieved by sharing partially processed data between cooperative vehicles while maintaining a balance between computation and communication load. This approach is different from current methods of map sharing, or sharing of raw data, which are not scalable. The performance of the proposed approach is verified through experiments on the Volony dataset. It is shown that the proposed approach has significant performance superiority over the conventional single-vehicle object detection approaches.
Tasks Autonomous Vehicles, Object Detection
Published 2020-02-19
URL https://arxiv.org/abs/2002.08440v1
PDF https://arxiv.org/pdf/2002.08440v1.pdf
PWC https://paperswithcode.com/paper/cooperative-lidar-object-detection-via

General 3D Room Layout from a Single View by Render-and-Compare

Title General 3D Room Layout from a Single View by Render-and-Compare
Authors Sinisa Stekovic, Friedrich Fraundorfer, Vincent Lepetit
Abstract We present a novel method to reconstruct the 3D layout of a room – walls, floors, ceilings – from a single perspective view, even for the case of general configurations. This input view can consist of a color image only, but considering a depth map will result in a more accurate reconstruction. Our approach is based on solving a constrained discrete optimization problem, which selects the polygons that are part of the layout from a large set of potential polygons. In order to deal with occlusions between components of the layout, which is a problem ignored by previous works, we introduce an analysis-by-synthesis method to iteratively refine the 3D layout estimate. To the best of our knowledge, our method is the first that can estimate a layout in such general conditions from a single view. We additionally introduce a new annotation dataset made of 91 images from the ScanNet dataset and several metrics, in order to evaluate our results quantitatively.
Published 2020-01-07
URL https://arxiv.org/abs/2001.02149v1
PDF https://arxiv.org/pdf/2001.02149v1.pdf
PWC https://paperswithcode.com/paper/general-3d-room-layout-from-a-single-view-by

Empirical Policy Evaluation with Supergraphs

Title Empirical Policy Evaluation with Supergraphs
Authors Daniel Vial, Vijay Subramanian
Abstract We devise and analyze algorithms for the empirical policy evaluation problem in reinforcement learning. Our algorithms explore backward from high-cost states to find high-value ones, in contrast to forward approaches that work forward from all states. While several papers have demonstrated the utility of backward exploration empirically, we conduct rigorous analyses which show that our algorithms can reduce average-case sample complexity from $O(S \log S)$ to as low as $O(\log S)$.
Published 2020-02-18
URL https://arxiv.org/abs/2002.07905v1
PDF https://arxiv.org/pdf/2002.07905v1.pdf
PWC https://paperswithcode.com/paper/empirical-policy-evaluation-with-supergraphs

State Transition Modeling of the Smoking Behavior using LSTM Recurrent Neural Networks

Title State Transition Modeling of the Smoking Behavior using LSTM Recurrent Neural Networks
Authors Chrisogonas O. Odhiambo, Casey A. Cole, Alaleh Torkjazi, Homayoun Valafar
Abstract The use of sensors has pervaded everyday life in several applications including human activity monitoring, healthcare, and social networks. In this study, we focus on the use of smartwatch sensors to recognize smoking activity. More specifically, we have reformulated the previous work on the detection of smoking to include in-context recognition of smoking. Our reformulation of the smoking gesture as a state-transition model that consists of the mini-gestures hand-to-lip, hand-on-lip, and hand-off-lip has demonstrated improvement in detection rates nearing 100% using conventional neural networks. In addition, we have begun to use Long Short-Term Memory (LSTM) neural networks to allow for in-context detection of gestures with accuracy nearing 97%.
Published 2020-01-07
URL https://arxiv.org/abs/2001.02101v1
PDF https://arxiv.org/pdf/2001.02101v1.pdf
PWC https://paperswithcode.com/paper/state-transition-modeling-of-the-smoking
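
The state-transition view of the smoking gesture in the abstract above can be sketched as a small state machine over already classified mini-gestures. This is an illustration only: the paper's classifier and its actual transition rules are not reproduced, and the function `count_puffs` and the rule that a repeated label keeps the current state are assumptions.

```python
PUFF_SEQUENCE = ("hand-to-lip", "hand-on-lip", "hand-off-lip")


def count_puffs(gestures):
    """Count completed hand-to-lip, hand-on-lip, hand-off-lip sequences."""
    position = 0  # index of the next mini-gesture expected
    puffs = 0
    for gesture in gestures:
        if gesture == PUFF_SEQUENCE[position]:
            position += 1
            if position == len(PUFF_SEQUENCE):
                puffs += 1
                position = 0
        elif position > 0 and gesture == PUFF_SEQUENCE[position - 1]:
            continue  # same mini-gesture seen again: stay in the current state
        elif gesture == PUFF_SEQUENCE[0]:
            position = 1  # restart a new candidate sequence
        else:
            position = 0  # out-of-context gesture: reset
    return puffs


if __name__ == "__main__":
    stream = ["hand-to-lip", "hand-on-lip", "hand-on-lip", "hand-off-lip",
              "other", "hand-to-lip", "hand-off-lip"]
    print(count_puffs(stream))
```

Requiring the full ordered sequence is what makes the detection in-context: an isolated hand-off-lip label does not count on its own.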

A Machine Consciousness architecture based on Deep Learning and Gaussian Processes

Title A Machine Consciousness architecture based on Deep Learning and Gaussian Processes
Authors Eduardo C. Garrido Merchán, Martín Molina
Abstract Recent developments in machine learning have pushed the tasks that machines can do beyond the boundaries of what was thought to be possible years ago. Methodologies such as deep learning or generative models have achieved complex tasks such as generating art pictures or literature automatically. On the other hand, symbolic resources have also been developed further and behave well in problems such as the ones proposed by common sense reasoning. Machine Consciousness is a field that has been deeply studied, and several theories based on the functionalist philosophical theory, such as the global workspace theory or information integration, have been proposed that try to explain the emergence of consciousness in machines. In this work, we propose an architecture that may give rise to consciousness in a machine, based on the global workspace theory and on the assumption that consciousness appears in machines that have cognitive processes and exhibit conscious behaviour. This architecture is based on processes that use recent developments in artificial intelligence models whose outputs are these correlated activities. For every one of the modules of this architecture, we provide detailed explanations of the models involved and how they communicate with each other to create the cognitive architecture.
Tasks Common Sense Reasoning, Gaussian Processes
Published 2020-02-02
URL https://arxiv.org/abs/2002.00509v2
PDF https://arxiv.org/pdf/2002.00509v2.pdf
PWC https://paperswithcode.com/paper/a-machine-consciousness-architecture-based-on

A comparison of Vector Symbolic Architectures

Title A comparison of Vector Symbolic Architectures
Authors Kenny Schlegel, Peer Neubert, Peter Protzel
Abstract Vector Symbolic Architectures (VSAs) combine a high-dimensional vector space with a set of carefully designed operators in order to perform symbolic computations with large numerical vectors. Major goals are the exploitation of their representational power and ability to deal with fuzziness and ambiguity. Over the past years, VSAs have been applied to a broad range of tasks and several VSA implementations have been proposed. The available implementations differ in the underlying vector space (e.g., binary vectors or complex-valued vectors) and the particular implementations of the required VSA operators, with important ramifications for the properties of these architectures. For example, not every VSA is equally well suited to every task, and some are completely incompatible with certain tasks. In this paper, we give an overview of eight available VSA implementations and discuss their commonalities and differences in the underlying vector space, bundling, and binding/unbinding operations. We create a taxonomy of available binding/unbinding operations and show an important ramification for non-self-inverse binding operations using an example from analogical reasoning. A main contribution is the experimental comparison of the available implementations regarding (1) the capacity of bundles, (2) the approximation quality of non-exact unbinding operations, and (3) the influence of combined binding and bundling operations on the query answering performance. We expect this systematization and comparison to be relevant for development and evaluation of new VSAs, but most importantly, to support the selection of an appropriate VSA for a particular task.
Published 2020-01-31
URL https://arxiv.org/abs/2001.11797v2
PDF https://arxiv.org/pdf/2001.11797v2.pdf
PWC https://paperswithcode.com/paper/a-comparison-of-vector-symbolic-architectures
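
The bundling and binding operations discussed in the abstract above can be illustrated with one simple VSA variant. The sketch assumes bipolar vectors with entries in {-1, +1}, binding by elementwise multiplication (which is self-inverse), and bundling by elementwise majority sign. It is a generic example and does not correspond to any specific one of the eight implementations the paper compares. All function names are hypothetical.

```python
import math
import random


def random_bipolar(dim, rng):
    """Random vector with entries drawn uniformly from {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bind(a, b):
    """Bind two vectors by elementwise multiplication (self-inverse)."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundle vectors by elementwise majority sign (ties resolve to +1)."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


if __name__ == "__main__":
    rng = random.Random(0)
    a, b, c, unrelated = (random_bipolar(2000, rng) for _ in range(4))
    print(bind(bind(a, b), b) == a)        # unbinding recovers a exactly
    superposition = bundle([a, b, c])
    print(cosine(superposition, a))        # clearly above zero
    print(cosine(superposition, unrelated))  # close to zero
```

Because this binding is its own inverse, unbinding is exact. For non-self-inverse bindings, which the abstract singles out, a separate unbinding operator is needed.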

Controllable Level Blending between Games using Variational Autoencoders

Title Controllable Level Blending between Games using Variational Autoencoders
Authors Anurag Sarkar, Zhihan Yang, Seth Cooper
Abstract Previous work explored blending levels from existing games to create levels for a new game that mixes properties of the original games. In this paper, we use Variational Autoencoders (VAEs) for improving upon such techniques. VAEs are artificial neural networks that learn and use latent representations of datasets to generate novel outputs. We train a VAE on level data from Super Mario Bros. and Kid Icarus, enabling it to capture the latent space spanning both games. We then use this space to generate level segments that combine properties of levels from both games. Moreover, by applying evolutionary search in the latent space, we evolve level segments satisfying specific constraints. We argue that these affordances make the VAE-based approach especially suitable for co-creative level design and compare its performance with similar generative models like the GAN and the VAE-GAN.
Published 2020-02-27
URL https://arxiv.org/abs/2002.11869v1
PDF https://arxiv.org/pdf/2002.11869v1.pdf
PWC https://paperswithcode.com/paper/controllable-level-blending-between-games

Using a Generative Adversarial Network for CT Normalization and its Impact on Radiomic Features

Title Using a Generative Adversarial Network for CT Normalization and its Impact on Radiomic Features
Authors Leihao Wei, Yannan Lin, William Hsu
Abstract Computer-Aided-Diagnosis (CADx) systems assist radiologists with identifying and classifying potentially malignant pulmonary nodules on chest CT scans using morphology and texture-based (radiomic) features. However, radiomic features are sensitive to differences in acquisitions due to variations in dose levels and slice thickness. This study investigates the feasibility of generating a normalized scan from heterogeneous CT scans as input. We obtained projection data from 40 low-dose chest CT scans, simulating acquisitions at 10%, 25% and 50% dose and reconstructing the scans at 1.0mm and 2.0mm slice thickness. A 3D generative adversarial network (GAN) was used to simultaneously normalize reduced-dose, thick-slice (2.0mm) images to normal-dose (100%), thinner-slice (1.0mm) images. We evaluated the normalized image quality using peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) and Learned Perceptual Image Patch Similarity (LPIPS). Our GAN improved perceptual similarity by 35% compared to a baseline CNN method. Our analysis also shows that the GAN-based approach led to a significantly smaller error (p-value < 0.05) in nine studied radiomic features. These results indicate that GANs could be used to normalize heterogeneous CT images and reduce the variability in radiomic feature values.
Published 2020-01-22
URL https://arxiv.org/abs/2001.08741v1
PDF https://arxiv.org/pdf/2001.08741v1.pdf
PWC https://paperswithcode.com/paper/using-a-generative-adversarial-network-for-ct
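
Of the image quality metrics named in the abstract above, PSNR is simple enough to sketch directly. The function below uses the standard definition, 10 * log10(MAX^2 / MSE), on flat lists of intensities. The choice of `max_value` for CT data and the name `psnr` are assumptions, and the sample values are made up for illustration.

```python
import math


def psnr(reference, test, max_value):
    """Peak signal-to-noise ratio in dB between two equally long intensity lists."""
    if not reference or len(reference) != len(test):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical 8-bit intensities; real CT data would use its own range.
    print(psnr([100, 100], [110, 90], max_value=255))
```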