April 3, 2020

3087 words · 15 min read

Paper Group ANR 19


Topologically sensitive metaheuristics

Title Topologically sensitive metaheuristics
Authors Aleksandar Kartelj, Vladimir Filipović, Siniša Vrećica, Rade Živaljević
Abstract This paper proposes topologically sensitive metaheuristics, and describes conceptual design of topologically sensitive Variable Neighborhood Search method (TVNS) and topologically sensitive Electromagnetism Metaheuristic (TEM).
Published 2020-02-25
URL https://arxiv.org/abs/2002.11164v1
PDF https://arxiv.org/pdf/2002.11164v1.pdf
PWC https://paperswithcode.com/paper/topologically-sensitive-metaheuristics

Unsupervised Learning of Camera Pose with Compositional Re-estimation

Title Unsupervised Learning of Camera Pose with Compositional Re-estimation
Authors Seyed Shahabeddin Nabavi, Mehrdad Hosseinzadeh, Ramin Fahimi, Yang Wang
Abstract We consider the problem of unsupervised camera pose estimation. Given an input video sequence, our goal is to estimate the camera pose (i.e. the camera motion) between consecutive frames. Traditionally, this problem is tackled by placing strict constraints on the transformation vector or by incorporating optical flow through a complex pipeline. We propose an alternative approach that utilizes a compositional re-estimation process for camera pose estimation. Given an input, we first estimate a depth map. Our method then iteratively estimates the camera motion based on the estimated depth map. Our approach significantly improves the predicted camera motion both quantitatively and visually. Furthermore, the re-estimation resolves the problem of out-of-boundaries pixels in a novel and simple way. Another advantage of our approach is that it is adaptable to other camera pose estimation approaches. Experimental analysis on KITTI benchmark dataset demonstrates that our method outperforms existing state-of-the-art approaches in unsupervised camera ego-motion estimation.
Tasks Depth And Camera Motion, Motion Estimation
Published 2020-01-17
URL https://arxiv.org/abs/2001.06479v1
PDF https://arxiv.org/pdf/2001.06479v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-of-camera-pose-with
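The abstract describes the core idea only at a high level, but the compositional re-estimation loop can be sketched generically: starting from an initial pose, a per-step estimator predicts a small corrective motion that is composed onto the running estimate. The function names and the planar-rotation parameterization below are illustrative assumptions, not the paper's actual model (which uses a learned depth map and neural pose network):

```python
import numpy as np

def small_motion(rot_z, translation):
    """Build a 4x4 homogeneous transform for a small planar rotation
    plus a 3D translation (a toy stand-in for a predicted motion)."""
    c, s = np.cos(rot_z), np.sin(rot_z)
    T = np.eye(4)
    T[0, 0], T[0, 1] = c, -s
    T[1, 0], T[1, 1] = s, c
    T[:3, 3] = translation
    return T

def compositional_reestimate(initial_pose, step_estimator, n_iters=4):
    """Iteratively refine camera motion: at each step a (hypothetical)
    estimator predicts a small corrective motion given the current
    pose, which is composed onto the running estimate."""
    pose = initial_pose.copy()
    for _ in range(n_iters):
        rot_z, trans = step_estimator(pose)
        pose = small_motion(rot_z, trans) @ pose
    return pose
```

Because each correction is a rigid transform, composing them keeps the running estimate a valid pose at every iteration.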

Real Time Reasoning in OWL2 for GDPR Compliance

Title Real Time Reasoning in OWL2 for GDPR Compliance
Authors P. A. Bonatti, L. Ioffredo, I. Petrova, L. Sauro, I. R. Siahaan
Abstract This paper shows how knowledge representation and reasoning techniques can be used to support organizations in complying with the GDPR, that is, the new European data protection regulation. This work is carried out in a European H2020 project called SPECIAL. Data usage policies, the consent of data subjects, and selected fragments of the GDPR are encoded in a fragment of OWL2 called PL (policy language); compliance checking and policy validation are reduced to subsumption checking and concept consistency checking. This work proposes a satisfactory tradeoff between the expressiveness requirements on PL posed by the GDPR, and the scalability requirements that arise from the use cases provided by SPECIAL’s industrial partners. Real-time compliance checking is achieved by means of a specialized reasoner, called PLR, that leverages knowledge compilation and structural subsumption techniques. The performance of a prototype implementation of PLR is analyzed through systematic experiments, and compared with the performance of other important reasoners. Moreover, we show how PL and PLR can be extended to support richer ontologies, by means of import-by-query techniques. PL and its integration with OWL2’s profiles constitute new tractable fragments of OWL2. We prove also some negative results, concerning the intractability of unrestricted reasoning in PL, and the limitations posed on ontology import.
Published 2020-01-15
URL https://arxiv.org/abs/2001.05390v1
PDF https://arxiv.org/pdf/2001.05390v1.pdf
PWC https://paperswithcode.com/paper/real-time-reasoning-in-owl2-for-gdpr
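The reduction of compliance checking to subsumption can be illustrated with a toy version of the idea: if policies are flattened to attribute/value-set constraints, a usage policy complies when it is at least as restrictive as the consent policy on every constrained attribute. This dict-based encoding is a deliberately simplified sketch; the actual PL reasoner (PLR) works over OWL2 concept expressions with knowledge compilation:

```python
def is_compliant(usage, consent):
    """Structural subsumption check on flattened policies, given as
    plain dicts {attribute: set_of_values}. The usage policy is
    subsumed by (complies with) the consent policy iff, for every
    attribute the consent constrains, the usage states a value set
    that falls within the permitted values."""
    for attr, allowed in consent.items():
        used = usage.get(attr)
        # An unconstrained attribute in the usage policy means
        # "anything", which cannot be subsumed by a restriction.
        if used is None or not used <= allowed:
            return False
    return True
```

The same shape (pairwise containment per attribute, with absence treated as unrestricted) is what makes structural subsumption fast enough for real-time checking.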

Dataset Search In Biodiversity Research: Do Metadata In Data Repositories Reflect Scholarly Information Needs?

Title Dataset Search In Biodiversity Research: Do Metadata In Data Repositories Reflect Scholarly Information Needs?
Authors Felicitas Löffler, Valentin Wesp, Birgitta König-Ries, Friederike Klan
Abstract The increasing amount of research data provides the opportunity to link and integrate data to create novel hypotheses, to repeat experiments or to compare recent data to data collected at a different time or place. However, recent studies have shown that retrieving relevant data for data reuse is a time-consuming task in daily research practice. In this study, we explore what hampers dataset retrieval in biodiversity research, a field that produces a large amount of heterogeneous data. We analyze the primary source in dataset search - metadata - and determine if they reflect scholarly search interests. We examine if metadata standards provide elements corresponding to search interests, we inspect if selected data repositories use metadata standards representing scholarly interests, and we determine how many fields of the metadata standards used are filled. To determine search interests in biodiversity research, we gathered 169 questions that researchers aimed to answer with the help of retrieved data, identified biological entities and grouped them into 13 categories. Our findings indicate that environments, materials and chemicals, species, biological and chemical processes, locations, data parameters and data types are important search interests in biodiversity research. The comparison with existing metadata standards shows that domain-specific standards cover search interests quite well, whereas general standards do not explicitly contain elements that reflect search interests. We inspect metadata from five large data repositories. Our results confirm that metadata currently poorly reflect search interests in biodiversity research. From these findings, we derive recommendations for researchers and data repositories how to bridge the gap between search interest and metadata provided.
Published 2020-02-27
URL https://arxiv.org/abs/2002.12021v1
PDF https://arxiv.org/pdf/2002.12021v1.pdf
PWC https://paperswithcode.com/paper/dataset-search-in-biodiversity-research-do

Capacity of Continuous Channels with Memory via Directed Information Neural Estimator

Title Capacity of Continuous Channels with Memory via Directed Information Neural Estimator
Authors Ziv Aharoni, Dor Tsur, Ziv Goldfeld, Haim Henry Permuter
Abstract Calculating the capacity (with or without feedback) of channels with memory and continuous alphabets is a challenging task. It requires optimizing the directed information rate over all channel input distributions. The objective is a multi-letter expression, whose analytic solution is only known for a few specific cases. When no analytic solution is present or the channel model is unknown, there is no unified framework for calculating or even approximating capacity. This work proposes a novel capacity estimation algorithm that treats the channel as a “black-box”, both when feedback is or is not present. The algorithm has two main ingredients: (i) a neural distribution transformer (NDT) model that shapes a noise variable into the channel input distribution, which we are able to sample, and (ii) the directed information neural estimator (DINE) that estimates the communication rate of the current NDT model. These models are trained by an alternating maximization procedure to both estimate the channel capacity and obtain an NDT for the optimal input distribution. The method is demonstrated on the moving average additive Gaussian noise channel, where it is shown that both the capacity and feedback capacity are estimated without knowledge of the channel transition kernel. The proposed estimation framework opens the door to a myriad of capacity approximation results for continuous alphabet channels that were inaccessible until now.
Published 2020-03-09
URL https://arxiv.org/abs/2003.04179v1
PDF https://arxiv.org/pdf/2003.04179v1.pdf
PWC https://paperswithcode.com/paper/capacity-of-continuous-channels-with-memory

A Freeform Dielectric Metasurface Modeling Approach Based on Deep Neural Networks

Title A Freeform Dielectric Metasurface Modeling Approach Based on Deep Neural Networks
Authors Sensong An, Bowen Zheng, Mikhail Y. Shalaginov, Hong Tang, Hang Li, Li Zhou, Jun Ding, Anuradha Murthy Agarwal, Clara Rivero-Baleine, Myungkoo Kang, Kathleen A. Richardson, Tian Gu, Juejun Hu, Clayton Fowler, Hualiang Zhang
Abstract Metasurfaces have shown promising potentials in shaping optical wavefronts while remaining compact compared to bulky geometric optics devices. Design of meta-atoms, the fundamental building blocks of metasurfaces, relies on trial-and-error method to achieve target electromagnetic responses. This process includes the characterization of an enormous amount of different meta-atom designs with different physical and geometric parameters, which normally demands huge computational resources. In this paper, a deep learning-based metasurface/meta-atom modeling approach is introduced to significantly reduce the characterization time while maintaining accuracy. Based on a convolutional neural network (CNN) structure, the proposed deep learning network is able to model meta-atoms with free-form 2D patterns and different lattice sizes, material refractive indexes and thicknesses. Moreover, the presented approach features the capability to predict meta-atoms’ wide spectrum responses in the timescale of milliseconds, which makes it attractive for applications such as fast meta-atom/metasurface on-demand designs and optimizations.
Published 2020-01-01
URL https://arxiv.org/abs/2001.00121v1
PDF https://arxiv.org/pdf/2001.00121v1.pdf
PWC https://paperswithcode.com/paper/a-freeform-dielectric-metasurface-modeling

3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior

Title 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior
Authors Xiaokang Chen, Kwan-Yee Lin, Chen Qian, Gang Zeng, Hongsheng Li
Abstract The goal of the Semantic Scene Completion (SSC) task is to simultaneously predict a completed 3D voxel representation of volumetric occupancy and semantic labels of objects in the scene from a single-view observation. Since the computational cost generally increases explosively along with the growth of voxel resolution, most current state-of-the-arts have to tailor their framework into a low-resolution representation with the sacrifice of detail prediction. Thus, voxel resolution becomes one of the crucial difficulties that lead to the performance bottleneck. In this paper, we propose to devise a new geometry-based strategy to embed depth information with low-resolution voxel representation, which could still be able to encode sufficient geometric information, e.g., room layout, object’s sizes and shapes, to infer the invisible areas of the scene with well structure-preserving details. To this end, we first propose a novel 3D sketch-aware feature embedding to explicitly encode geometric information effectively and efficiently. With the 3D sketch in hand, we further devise a simple yet effective semantic scene completion framework that incorporates a light-weight 3D Sketch Hallucination module to guide the inference of occupancy and the semantic labels via a semi-supervised structure prior learning strategy. We demonstrate that our proposed geometric embedding works better than the depth feature learning from habitual SSC frameworks. Our final model surpasses state-of-the-arts consistently on three public benchmarks, which only requires 3D volumes of 60 x 36 x 60 resolution for both input and output. The code and the supplementary material will be available at https://charlesCXK.github.io.
Published 2020-03-31
URL https://arxiv.org/abs/2003.14052v1
PDF https://arxiv.org/pdf/2003.14052v1.pdf
PWC https://paperswithcode.com/paper/3d-sketch-aware-semantic-scene-completion-via

Tourism Demand Forecasting with Tourist Attention: An Ensemble Deep Learning Approach

Title Tourism Demand Forecasting with Tourist Attention: An Ensemble Deep Learning Approach
Authors Shaolong Sun, Yanzhao Li, Shouyang Wang, Ju-e Guo
Abstract The large amount of tourism-related data presents a series of challenges for tourism demand forecasting, including data deficiencies, multicollinearity and long calculation times. A bagging-based multivariate ensemble deep learning approach integrating stacked autoencoders and kernel-based extreme learning machines (B-SAKE) is proposed to address these challenges in this study. We forecast tourist arrivals in Beijing from four countries by adopting historical data on tourist arrivals in Beijing, economic indicators and online tourist behavior variables. The results from the cases of four origin countries suggest that our proposed B-SAKE approach outperforms benchmark models in terms of horizontal accuracy, directional accuracy and statistical significance. Both bagging and the stacked autoencoder can improve the forecasting performance of the models. Moreover, the forecasting performance of the models is evaluated with consistent results by means of the multi-step-ahead forecasting scheme.
Published 2020-02-19
URL https://arxiv.org/abs/2002.07964v2
PDF https://arxiv.org/pdf/2002.07964v2.pdf
PWC https://paperswithcode.com/paper/tourism-demand-forecasting-with-tourist
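The bagging half of B-SAKE can be sketched independently of the paper's base learners: train one model per bootstrap resample and average the forecasts. Below, a plain least-squares regressor stands in for the stacked-autoencoder + kernel ELM pipeline, which the abstract names but does not specify in detail:

```python
import numpy as np

def bagging_forecast(X, y, X_new, n_models=25, seed=0):
    """Bagging: fit one linear model per bootstrap resample of the
    training data and average the predictions on new inputs. The base
    learner here is ordinary least squares, a simplifying assumption."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Xb = np.column_stack([np.ones(n), X])              # add bias column
    Xn = np.column_stack([np.ones(len(X_new)), X_new])
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)               # bootstrap sample
        w, *_ = np.linalg.lstsq(Xb[idx], y[idx], rcond=None)
        preds.append(Xn @ w)
    return np.mean(preds, axis=0)
```

Averaging over resamples reduces the variance of the individual forecasts, which is the property the paper exploits against small, noisy tourism series.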

Super Resolution Using Segmentation-Prior Self-Attention Generative Adversarial Network

Title Super Resolution Using Segmentation-Prior Self-Attention Generative Adversarial Network
Authors Yuxin Zhang, Zuquan Zheng, Roland Hu
Abstract Convolutional Neural Network (CNN) is intensively implemented to solve super resolution (SR) tasks because of its superior performance. However, the problem of super resolution is still challenging due to the lack of prior knowledge and small receptive field of CNN. We propose the Segmentation-Prior Self-Attention Generative Adversarial Network (SPSAGAN) to combine segmentation-priors and feature attentions into a unified framework. This combination is led by a carefully designed weighted addition to balance the influence of feature and segmentation attentions, so that the network can emphasize textures in the same segmentation category and meanwhile focus on the long-distance feature relationship. We also propose a lightweight skip connection architecture called Residual-in-Residual Sparse Block (RRSB) to further improve the super-resolution performance and save computation. Extensive experiments show that SPSAGAN can generate more realistic and visually pleasing textures compared to state-of-the-art SFTGAN and ESRGAN on many SR datasets.
Tasks Super-Resolution
Published 2020-03-07
URL https://arxiv.org/abs/2003.03489v1
PDF https://arxiv.org/pdf/2003.03489v1.pdf
PWC https://paperswithcode.com/paper/super-resolution-using-segmentation-prior
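The "carefully designed weighted addition" reduces, at its core, to a convex blend of two attention maps. The sketch below shows only that blending step with a hypothetical scalar weight `w`; in the paper the balance is designed per-architecture rather than being a single free scalar:

```python
import numpy as np

def fuse_attention(feat_att, seg_att, w=0.5):
    """Weighted addition balancing segmentation-prior attention against
    feature self-attention. `w` is a hypothetical balance weight: w=1
    trusts the segmentation prior entirely, w=0 ignores it."""
    return w * seg_att + (1.0 - w) * feat_att
```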

On Emergent Communication in Competitive Multi-Agent Teams

Title On Emergent Communication in Competitive Multi-Agent Teams
Authors Paul Pu Liang, Jeffrey Chen, Ruslan Salakhutdinov, Louis-Philippe Morency, Satwik Kottur
Abstract Several recent works have found the emergence of grounded compositional language in the communication protocols developed by mostly cooperative multi-agent systems when learned end-to-end to maximize performance on a downstream task. However, human populations learn to solve complex tasks involving communicative behaviors not only in fully cooperative settings but also in scenarios where competition acts as an additional external pressure for improvement. In this work, we investigate whether competition for performance from an external, similar agent team could act as a social influence that encourages multi-agent populations to develop better communication protocols for improved performance, compositionality, and convergence speed. We start from Task & Talk, a previously proposed referential game between two cooperative agents as our testbed and extend it into Task, Talk & Compete, a game involving two competitive teams each consisting of two aforementioned cooperative agents. Using this new setting, we provide an empirical study demonstrating the impact of competitive influence on multi-agent teams. Our results show that an external competitive influence leads to improved accuracy and generalization, as well as faster emergence of communicative languages that are more informative and compositional.
Published 2020-03-04
URL https://arxiv.org/abs/2003.01848v1
PDF https://arxiv.org/pdf/2003.01848v1.pdf
PWC https://paperswithcode.com/paper/on-emergent-communication-in-competitive

Gaze-Sensing LEDs for Head Mounted Displays

Title Gaze-Sensing LEDs for Head Mounted Displays
Authors Kaan Akşit, Jan Kautz, David Luebke
Abstract We introduce a new gaze tracker for Head Mounted Displays (HMDs). We modify two off-the-shelf HMDs to be gaze-aware using Light Emitting Diodes (LEDs). Our key contribution is to exploit the sensing capability of LEDs to create low-power gaze tracker for virtual reality (VR) applications. This yields a simple approach using minimal hardware to achieve good accuracy and low latency using light-weight supervised Gaussian Process Regression (GPR) running on a mobile device. With our hardware, we show that Minkowski distance measure based GPR implementation outperforms the commonly used radial basis function-based support vector regression (SVR) without the need to precisely determine free parameters. We show that our gaze estimation method does not require complex dimension reduction techniques, feature extraction, or distortion corrections due to off-axis optical paths. We demonstrate two complete HMD prototypes with a sample eye-tracked application, and report on a series of subjective tests using our prototypes.
Tasks Dimensionality Reduction, Gaze Estimation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08499v1
PDF https://arxiv.org/pdf/2003.08499v1.pdf
PWC https://paperswithcode.com/paper/gaze-sensing-leds-for-head-mounted-displays
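The Minkowski-distance GPR the abstract mentions can be sketched with a standard GP regression mean under an exponential kernel on the L_p distance. The kernel form and hyperparameters below are assumptions for illustration; the paper's exact kernel and preprocessing are not given in the abstract:

```python
import numpy as np

def minkowski_kernel(A, B, p=1.0, length=1.0):
    """Exponential kernel on the Minkowski (L_p) distance between rows
    of A and B; p=1 gives the Laplacian kernel on the L1 distance."""
    d = np.sum(np.abs(A[:, None, :] - B[None, :, :]) ** p, axis=-1) ** (1.0 / p)
    return np.exp(-d / length)

def gpr_predict(X_train, y_train, X_test, p=1.0, noise=1e-6):
    """Standard GP regression posterior mean:
    k(X*, X) (K + sigma^2 I)^{-1} y."""
    K = minkowski_kernel(X_train, X_train, p) + noise * np.eye(len(X_train))
    Ks = minkowski_kernel(X_test, X_train, p)
    return Ks @ np.linalg.solve(K, y_train)
```

A practical appeal, consistent with the abstract's claim, is that this predictor has essentially no free parameters to tune beyond the distance order and a length scale.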

Crowd Scene Analysis by Output Encoding

Title Crowd Scene Analysis by Output Encoding
Authors Yao Xue, Siming Liu, Yonghui Li, Xueming Qian
Abstract Crowd scene analysis receives growing attention due to its wide applications. Grasping the accurate crowd location (rather than merely crowd count) is important for spatially identifying high-risk regions in congested scenes. In this paper, we propose a Compressed Sensing based Output Encoding (CSOE) scheme, which casts detecting pixel coordinates of small objects into a task of signal regression in encoding signal space. CSOE helps to boost localization performance in circumstances where targets are highly crowded without huge scale variation. In addition, proper receptive field sizes are crucial for crowd analysis due to human size variations. We create Multiple Dilated Convolution Branches (MDCB) that offers a set of different receptive field sizes, to improve localization accuracy when objects sizes change drastically in an image. Also, we develop an Adaptive Receptive Field Weighting (ARFW) module, which further deals with scale variation issue by adaptively emphasizing informative channels that have proper receptive field size. Experiments demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance across four mainstream datasets, especially achieves excellent results in highly crowded scenes. More importantly, experiments support our insights that it is crucial to tackle target size variation issue in crowd analysis task, and casting crowd localization as regression in encoding signal space is quite effective for crowd analysis.
Published 2020-01-27
URL https://arxiv.org/abs/2001.09556v1
PDF https://arxiv.org/pdf/2001.09556v1.pdf
PWC https://paperswithcode.com/paper/crowd-scene-analysis-by-output-encoding
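The CSOE idea (casting pixel coordinates into a regression target in an encoding space) can be sketched in its simplest form: object locations form a sparse indicator vector over the grid, which a random sensing matrix compresses into a short code the network can regress. This is a generic compressed-sensing sketch, not the paper's specific encoder, and decoding (e.g. via a sparse-recovery solver) is omitted:

```python
import numpy as np

def encode_locations(coords, grid_size, code_dim, seed=0):
    """Represent object locations as a sparse indicator vector over the
    pixel grid, then compress it with a fixed random Gaussian sensing
    matrix. Returns the short code and the matrix (needed for decoding)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((code_dim, grid_size * grid_size))
    y = np.zeros(grid_size * grid_size)
    for r, c in coords:
        y[r * grid_size + c] = 1.0      # one spike per object centre
    return A @ y, A
```

Because the code is linear in the indicator vector, adding an object simply adds its signature to the code, which is what makes regression in the encoding space well-behaved for crowded scenes.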

Model-theoretic Characterizations of Existential Rule Languages

Title Model-theoretic Characterizations of Existential Rule Languages
Authors Heng Zhang, Yan Zhang, Guifei Jiang
Abstract Existential rules, a.k.a. dependencies in databases, and Datalog+/- in knowledge representation and reasoning recently, are a family of important logical languages widely used in computer science and artificial intelligence. Towards a deep understanding of these languages in model theory, we establish model-theoretic characterizations for a number of existential rule languages such as (disjunctive) embedded dependencies, tuple-generating dependencies (TGDs), (frontier-)guarded TGDs and linear TGDs. All these characterizations hold for arbitrary structures, and most of them also work on the class of finite structures. As a natural application of these characterizations, complexity bounds for the rewritability of above languages are also identified.
Published 2020-01-23
URL https://arxiv.org/abs/2001.08688v1
PDF https://arxiv.org/pdf/2001.08688v1.pdf
PWC https://paperswithcode.com/paper/model-theoretic-characterizations-of

Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement

Title Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement
Authors Sai Bi, Kalyan Sunkavalli, Federico Perazzi, Eli Shechtman, Vladimir Kim, Ravi Ramamoorthi
Abstract We present a method to improve the visual realism of low-quality, synthetic images, e.g. OpenGL renderings. Training an unpaired synthetic-to-real translation network in image space is severely under-constrained and produces visible artifacts. Instead, we propose a semi-supervised approach that operates on the disentangled shading and albedo layers of the image. Our two-stage pipeline first learns to predict accurate shading in a supervised fashion using physically-based renderings as targets, and further increases the realism of the textures and shading with an improved CycleGAN network. Extensive evaluations on the SUNCG indoor scene dataset demonstrate that our approach yields more realistic images compared to other state-of-the-art approaches. Furthermore, networks trained on our generated “real” images predict more accurate depth and normals than domain adaptation approaches, suggesting that improving the visual realism of the images can be more effective than imposing task-specific losses.
Tasks Domain Adaptation, Synthetic-to-Real Translation
Published 2020-03-27
URL https://arxiv.org/abs/2003.12649v1
PDF https://arxiv.org/pdf/2003.12649v1.pdf
PWC https://paperswithcode.com/paper/deep-cg2real-synthetic-to-real-translation-1

How Not to Give a FLOP: Combining Regularization and Pruning for Efficient Inference

Title How Not to Give a FLOP: Combining Regularization and Pruning for Efficient Inference
Authors Tai Vu, Emily Wen, Roy Nehoran
Abstract The challenge of speeding up deep learning models during the deployment phase has been a large, expensive bottleneck in the modern tech industry. In this paper, we examine the use of both regularization and pruning for reduced computational complexity and more efficient inference in Deep Neural Networks (DNNs). In particular, we apply mixup and cutout regularizations and soft filter pruning to the ResNet architecture, focusing on minimizing floating point operations (FLOPs). Furthermore, by using regularization in conjunction with network pruning, we show that such a combination makes a substantial improvement over each of the two techniques individually.
Tasks Network Pruning
Published 2020-03-30
URL https://arxiv.org/abs/2003.13593v1
PDF https://arxiv.org/pdf/2003.13593v1.pdf
PWC https://paperswithcode.com/paper/how-not-to-give-a-flop-combining
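Of the techniques combined in the paper, mixup is the most self-contained to sketch: training proceeds on convex combinations of input pairs and their labels, with the mixing coefficient drawn from a Beta distribution. This follows the standard mixup recipe; the paper's cutout and soft-filter-pruning components are not shown:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """mixup regularization: return a convex combination of two inputs
    and their (one-hot) labels, with lambda ~ Beta(alpha, alpha)."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```

The mixed labels remain a valid probability distribution, so the usual cross-entropy loss applies unchanged.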