April 3, 2020

3485 words 17 mins read

Paper Group AWR 8

Exploring vestibulo-ocular adaptation in a closed-loop neuro-robotic experiment using STDP. A simulation study

Title Exploring vestibulo-ocular adaptation in a closed-loop neuro-robotic experiment using STDP. A simulation study
Authors Francisco Naveros, Jesus A. Garrido, Angelo Arleo, Eduardo Ros, Niceto R. Luque
Abstract Studying and understanding the computational primitives of our neural system requires a diverse and complementary set of techniques. In this work, we use the Neurorobotics Platform (NRP) to evaluate vestibulo-ocular cerebellar adaptation (vestibulo-ocular reflex, VOR) mediated by two STDP mechanisms located at the cerebellar molecular layer and the vestibular nuclei, respectively. This simulation study adopts an experimental setup (rotatory VOR) widely used by neuroscientists to better understand the contribution of specific cerebellar properties (i.e., distributed STDP, neural properties, coding cerebellar topology, etc.) to r-VOR adaptation. The work proposes and describes an embodiment solution in which we endow a simulated humanoid robot (iCub) with a spiking cerebellar model by means of the NRP, and we subject the humanoid to an r-VOR task. The results validate the adaptive capabilities of the spiking cerebellar model (with STDP) in a perception-action closed loop (r-VOR), causing the simulated iCub robot to mimic human behavior.
Tasks
Published 2020-03-03
URL https://arxiv.org/abs/2003.01445v1
PDF https://arxiv.org/pdf/2003.01445v1.pdf
PWC https://paperswithcode.com/paper/exploring-vestibulo-ocular-adaptation-in-a
Repo https://github.com/EduardoRosLab/VOR_in_neurorobotics
Framework none
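
As a rough illustration of the pair-based STDP rule this kind of cerebellar model relies on, here is a minimal Python sketch; the amplitudes and time constants are assumed for illustration, and the paper's actual kernels live in the linked repo.

```python
import numpy as np

# Pair-based STDP: potentiate when the presynaptic spike precedes the
# postsynaptic spike, depress otherwise. Parameters are illustrative.
A_PLUS, A_MINUS = 0.01, 0.012      # learning amplitudes (assumed values)
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # time constants in ms (assumed values)

def stdp_dw(t_pre: float, t_post: float) -> float:
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:   # pre before post -> long-term potentiation
        return A_PLUS * np.exp(-dt / TAU_PLUS)
    else:        # post before pre -> long-term depression
        return -A_MINUS * np.exp(dt / TAU_MINUS)

# Example: a pre spike 5 ms before a post spike strengthens the synapse.
print(stdp_dw(t_pre=0.0, t_post=5.0))   # positive dw
print(stdp_dw(t_pre=5.0, t_post=0.0))   # negative dw
```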

Neuroevolution of Self-Interpretable Agents

Title Neuroevolution of Self-Interpretable Agents
Authors Yujin Tang, Duong Nguyen, David Ha
Abstract Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight. It is a consequence of the selective attention in perception that lets us remain focused on important parts of our world without distraction from irrelevant details. Motivated by selective attention, we study the properties of artificial agents that perceive the world through the lens of a self-attention bottleneck. By constraining access to only a small fraction of the visual input, we show that their policies are directly interpretable in pixel space. We find neuroevolution ideal for training self-attention architectures for vision-based reinforcement learning (RL) tasks, allowing us to incorporate modules that can include discrete, non-differentiable operations which are useful for our agent. We argue that self-attention has similar properties to indirect encoding, in the sense that large implicit weight matrices are generated from a small number of key-query parameters, thus enabling our agent to solve challenging vision-based tasks with at least 1000x fewer parameters than existing methods. Since our agents attend only to task-critical visual hints, they are able to generalize to environments where task-irrelevant elements are modified, while conventional methods fail. Videos of our results and source code are available at https://attentionagent.github.io/
Tasks
Published 2020-03-18
URL https://arxiv.org/abs/2003.08165v1
PDF https://arxiv.org/pdf/2003.08165v1.pdf
PWC https://paperswithcode.com/paper/neuroevolution-of-self-interpretable-agents
Repo https://github.com/google/brain-tokyo-workshop
Framework none
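
The self-attention bottleneck can be sketched as scoring image patches with a small key-query product and letting the agent see only the top-K patch locations. A toy NumPy version, with all sizes and the top-K choice assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_patches, d_patch, d_attn, top_k = 64, 147, 4, 10  # illustrative sizes

patches = rng.normal(size=(n_patches, d_patch))  # flattened image patches
W_k = rng.normal(size=(d_patch, d_attn)) * 0.01  # the only learned weights
W_q = rng.normal(size=(d_patch, d_attn)) * 0.01

# A large implicit attention matrix from few key-query parameters.
scores = (patches @ W_q) @ (patches @ W_k).T / np.sqrt(d_attn)
softmax = np.exp(scores - scores.max(axis=1, keepdims=True))
softmax /= softmax.sum(axis=1, keepdims=True)

# Importance of a patch = how much total attention it receives;
# the downstream controller only "sees" the top-K patch indices.
importance = softmax.sum(axis=0)
visible = np.argsort(importance)[-top_k:]
print(sorted(visible))
```

Because the controller consumes only the selected indices, the selection step may be discrete and non-differentiable, which is exactly why neuroevolution is a natural fit here.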

Teddy: A System for Interactive Review Analysis

Title Teddy: A System for Interactive Review Analysis
Authors Xiong Zhang, Jonathan Engel, Sara Evensen, Yuliang Li, Çağatay Demiralp, Wang-Chiew Tan
Abstract Reviews are integral to e-commerce services and products. They contain a wealth of information about the opinions and experiences of users, which can help better understand consumer decisions and improve user experience with products and services. Today, data scientists analyze reviews by developing rules and models to extract, aggregate, and understand information embedded in the review text. However, working with thousands of reviews, which are typically noisy, incomplete text, can be daunting without proper tools. Here we first contribute results from an interview study that we conducted with fifteen data scientists who work with review text, providing insights into their practices and challenges. Results suggest data scientists need interactive systems for many review analysis tasks. In response, we introduce Teddy, an interactive system that enables data scientists to quickly obtain insights from reviews and improve their extraction and modeling pipelines.
Tasks
Published 2020-01-15
URL https://arxiv.org/abs/2001.05171v1
PDF https://arxiv.org/pdf/2001.05171v1.pdf
PWC https://paperswithcode.com/paper/teddy-a-system-for-interactive-review
Repo https://github.com/megagonlabs/teddy
Framework none

Multilinear Compressive Learning with Prior Knowledge

Title Multilinear Compressive Learning with Prior Knowledge
Authors Dat Thanh Tran, Moncef Gabbouj, Alexandros Iosifidis
Abstract The recently proposed Multilinear Compressive Learning (MCL) framework combines Multilinear Compressive Sensing and Machine Learning into an end-to-end system that takes into account the multidimensional structure of the signals when designing the sensing and feature synthesis components. The key idea behind MCL is the assumption of the existence of a tensor subspace which can capture the essential features from the signal for the downstream learning task. Thus, the ability to find such a discriminative tensor subspace and optimize the system to project the signals onto that data manifold plays an important role in Multilinear Compressive Learning. In this paper, we propose a novel solution to address both of the aforementioned requirements, i.e., how to find tensor subspaces in which the signals of interest are highly separable, and how to optimize the sensing and feature synthesis components to transform the original signals to the data manifold found in the first step. In our proposal, the discovery of a high-quality data manifold is conducted by training a nonlinear compressive learning system on the inference task. Its knowledge of the data manifold of interest is then progressively transferred to the MCL components via multi-stage supervised training, with the supervisory information encoding what the compressed measurements, the synthesized features, and the predictions should look like. The proposed knowledge transfer algorithm also comes with a semi-supervised adaptation that enables compressive learning models to utilize unlabeled data effectively. Extensive experiments demonstrate that the proposed knowledge transfer method can effectively train MCL models to compressively sense and synthesize better features for the learning tasks with improved performance, especially when the complexity of the learning task increases.
Tasks Compressive Sensing, Transfer Learning
Published 2020-02-17
URL https://arxiv.org/abs/2002.07203v1
PDF https://arxiv.org/pdf/2002.07203v1.pdf
PWC https://paperswithcode.com/paper/multilinear-compressive-learning-with-prior
Repo https://github.com/viebboy/MultilinearCompressiveLearningWithPrior
Framework tf
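
A minimal sketch of the multi-stage transfer idea, assuming a simple MSE at each stage: the student's compressed measurements, synthesized features, and predictions are matched against a pretrained teacher's. The stage weights and shapes below are illustrative, not from the paper.

```python
import numpy as np

def transfer_loss(student, teacher, weights=(1.0, 1.0, 1.0)):
    """student/teacher: dicts with 'measurements', 'features', 'logits'.
    The supervisory signal encodes what each intermediate stage of the
    MCL pipeline should look like. Stage weights are assumed values."""
    w_m, w_f, w_p = weights
    mse = lambda a, b: float(np.mean((a - b) ** 2))
    return (w_m * mse(student["measurements"], teacher["measurements"])
            + w_f * mse(student["features"], teacher["features"])
            + w_p * mse(student["logits"], teacher["logits"]))

rng = np.random.default_rng(0)
t = {k: rng.normal(size=(8, 16)) for k in ("measurements", "features", "logits")}
s = {k: v + 0.1 * rng.normal(size=v.shape) for k, v in t.items()}
print(transfer_loss(s, t))   # small loss: student is near the teacher
```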

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

Title Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
Authors Xiaoyu Xiang, Yapeng Tian, Yulun Zhang, Yun Fu, Jan P. Allebach, Chenliang Xu
Abstract In this paper, we explore the space-time video super-resolution task, which aims to generate a high-resolution (HR) slow-motion video from a low frame rate (LFR), low-resolution (LR) video. A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR). However, temporal interpolation and spatial super-resolution are closely interrelated in this task, and two-stage methods cannot fully exploit this natural property. In addition, state-of-the-art VFI and VSR networks require a large frame-synthesis or reconstruction module for predicting high-quality video frames, which gives two-stage methods large model sizes and makes them time-consuming. To overcome these problems, we propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video. Rather than synthesizing missing LR video frames as VFI networks do, we first temporally interpolate LR frame features for the missing LR frames, capturing local temporal contexts, with the proposed feature temporal interpolation network. Then, we propose a deformable ConvLSTM to align and aggregate temporal information simultaneously for better leveraging global temporal contexts. Finally, a deep reconstruction network is adopted to predict HR slow-motion video frames. Extensive experiments on benchmark datasets demonstrate that the proposed method not only achieves better quantitative and qualitative performance but also is more than three times faster than recent two-stage state-of-the-art methods, e.g., DAIN+EDVR and DAIN+RBPN.
Tasks Super-Resolution, Video Frame Interpolation, Video Super-Resolution
Published 2020-02-26
URL https://arxiv.org/abs/2002.11616v1
PDF https://arxiv.org/pdf/2002.11616v1.pdf
PWC https://paperswithcode.com/paper/zooming-slow-mo-fast-and-accurate-one-stage
Repo https://github.com/Mukosame/Zooming-Slow-Mo-CVPR-2020
Framework pytorch
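
To make the feature-space interpolation concrete, here is a toy sketch; the paper aligns neighboring-frame features with a learned deformable interpolation network, for which plain weighted blending stands in below.

```python
import numpy as np

def interp_features(feat_t0, feat_t1, alpha=0.5):
    """Toy stand-in for feature temporal interpolation: the paper aligns
    the neighboring frames' features with learned deformable sampling
    before aggregation; simple blending stands in for that here."""
    return alpha * feat_t0 + (1.0 - alpha) * feat_t1

f0 = np.ones((64, 32, 32))       # feature map of LR frame t (C, H, W)
f1 = np.zeros((64, 32, 32))      # feature map of LR frame t+1
f_mid = interp_features(f0, f1)  # synthesized features of the missing frame
print(f_mid.mean())              # 0.5
```

Interpolating features rather than pixels is the design choice that lets a single reconstruction network produce all HR frames, instead of chaining a VFI model into a VSR model.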

Weak Supervision in Convolutional Neural Network for Semantic Segmentation of Diffuse Lung Diseases Using Partially Annotated Dataset

Title Weak Supervision in Convolutional Neural Network for Semantic Segmentation of Diffuse Lung Diseases Using Partially Annotated Dataset
Authors Yuki Suzuki, Kazuki Yamagata, Masahiro Yanagawa, Shoji Kido, Noriyuki Tomiyama
Abstract A computer-aided diagnosis system for diffuse lung diseases (DLDs) is necessary for the objective assessment of lung diseases. In this paper, we develop a semantic segmentation model for five kinds of DLDs: consolidation, ground glass opacity, honeycombing, emphysema, and normal. Convolutional neural networks (CNNs) are among the most promising machine learning techniques for semantic segmentation. While creating a fully annotated dataset for semantic segmentation is laborious and time-consuming, creating a partially annotated dataset, in which only one chosen class is annotated for each image, is easier, since annotators need to focus on only one class at a time during annotation. In this paper, we propose a new weak supervision technique that effectively utilizes partially annotated datasets. Experiments on a partially annotated dataset composed of 372 CT images demonstrate that our proposed technique significantly improves segmentation accuracy.
Tasks Semantic Segmentation
Published 2020-02-27
URL https://arxiv.org/abs/2002.11936v2
PDF https://arxiv.org/pdf/2002.11936v2.pdf
PWC https://paperswithcode.com/paper/weak-supervision-in-convolutional-neural
Repo https://github.com/yk-szk/SPIE2020
Framework none
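
The partial-annotation setup maps naturally onto a loss evaluated only on pixels whose class label is known. A minimal NumPy cross-entropy with an ignore mask, as a sketch of the idea rather than the paper's exact loss:

```python
import numpy as np

def masked_cross_entropy(logits, labels, ignore_index=-1):
    """logits: (H*W, C); labels: (H*W,) with ignore_index on pixels whose
    class was not annotated for this image. The loss averages only over
    annotated pixels, so partially labeled images still train the model."""
    keep = labels != ignore_index
    z = logits[keep]
    z = z - z.max(axis=1, keepdims=True)            # numerically stable softmax
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(keep.sum()), labels[keep]].mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 5))                  # 5 DLD classes
labels = np.full(100, -1)
labels[:30] = 2                                     # only one class annotated
print(masked_cross_entropy(logits, labels))
```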

Heterogeneous Graph Transformer

Title Heterogeneous Graph Transformer
Authors Ziniu Hu, Yuxiao Dong, Kuansan Wang, Yizhou Sun
Abstract Recent years have witnessed the emerging success of graph neural networks (GNNs) for modeling structured data. However, most GNNs are designed for homogeneous graphs, in which all nodes and edges belong to the same types, making them infeasible to represent heterogeneous structures. In this paper, we present the Heterogeneous Graph Transformer (HGT) architecture for modeling Web-scale heterogeneous graphs. To model heterogeneity, we design node- and edge-type dependent parameters to characterize the heterogeneous attention over each edge, empowering HGT to maintain dedicated representations for different types of nodes and edges. To handle dynamic heterogeneous graphs, we introduce the relative temporal encoding technique into HGT, which is able to capture the dynamic structural dependency with arbitrary durations. To handle Web-scale graph data, we design the heterogeneous mini-batch graph sampling algorithm—HGSampling—for efficient and scalable training. Extensive experiments on the Open Academic Graph of 179 million nodes and 2 billion edges show that the proposed HGT model consistently outperforms all the state-of-the-art GNN baselines by 9%–21% on various downstream tasks.
Tasks
Published 2020-03-03
URL https://arxiv.org/abs/2003.01332v1
PDF https://arxiv.org/pdf/2003.01332v1.pdf
PWC https://paperswithcode.com/paper/heterogeneous-graph-transformer
Repo https://github.com/acbull/pyHGT
Framework pytorch
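
A stripped-down sketch of the type-dependent attention at HGT's core, assuming one scalar score per edge; real HGT adds multi-head attention, message passing, HGSampling, and the relative temporal encoding.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
node_types, edge_types = ["paper", "author"], ["writes"]

# Node- and edge-type dependent parameters, as described in the abstract.
W_key = {t: rng.normal(size=(d, d)) / np.sqrt(d) for t in node_types}
W_query = {t: rng.normal(size=(d, d)) / np.sqrt(d) for t in node_types}
W_edge = {e: rng.normal(size=(d, d)) / np.sqrt(d) for e in edge_types}

def hetero_attention(h_src, src_type, h_dst, dst_type, edge_type):
    """Unnormalized attention score over one heterogeneous edge: the key,
    query, and edge projections all depend on the respective types."""
    k = W_key[src_type] @ h_src
    q = W_query[dst_type] @ h_dst
    return float(q @ W_edge[edge_type] @ k) / np.sqrt(d)

h_author, h_paper = rng.normal(size=d), rng.normal(size=d)
print(hetero_attention(h_author, "author", h_paper, "paper", "writes"))
```

Keeping separate projections per type is what lets a single model maintain dedicated representations for papers, authors, venues, and so on, rather than collapsing them into one homogeneous embedding space.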

Semixup: In- and Out-of-Manifold Regularization for Deep Semi-Supervised Knee Osteoarthritis Severity Grading from Plain Radiographs

Title Semixup: In- and Out-of-Manifold Regularization for Deep Semi-Supervised Knee Osteoarthritis Severity Grading from Plain Radiographs
Authors Huy Hoang Nguyen, Simo Saarakkala, Matthew Blaschko, Aleksei Tiulpin
Abstract Knee osteoarthritis (OA) is one of the leading causes of disability worldwide. This musculoskeletal disorder is assessed from clinical symptoms and typically confirmed via radiographic assessment. This visual assessment, done by a radiologist, requires experience and suffers from high inter-observer variability. Recent developments in the literature have shown that deep learning (DL) methods can reliably perform OA severity assessment according to the gold-standard Kellgren-Lawrence (KL) grading system. However, these methods require large amounts of labeled data, which are costly to obtain. In this study, we propose the Semixup algorithm, a semi-supervised learning (SSL) approach that leverages unlabeled data. Semixup relies on consistency regularization using in- and out-of-manifold samples, together with interpolated consistency. On an independent test set, our method significantly outperformed other state-of-the-art SSL methods in most cases, and even achieved performance comparable to a well-tuned fully supervised learning (SL) model that required over 12 times more labeled data.
Tasks
Published 2020-03-04
URL https://arxiv.org/abs/2003.01944v2
PDF https://arxiv.org/pdf/2003.01944v2.pdf
PWC https://paperswithcode.com/paper/textitsemixup-in-and-out-of-manifold
Repo https://github.com/MIPT-Oulu/semixup
Framework pytorch
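
The consistency idea behind Semixup can be sketched as a mixup-style penalty on unlabeled data: the prediction on an interpolated input should match the interpolation of the predictions. A toy version, leaving out the paper's in/out-of-manifold sample construction and loss weighting:

```python
import numpy as np

def interpolation_consistency(f, x1, x2, lam=0.7):
    """Penalize f(lam*x1 + (1-lam)*x2) deviating from
    lam*f(x1) + (1-lam)*f(x2): a mixup-style consistency term that
    needs no labels, so it can be computed on unlabeled radiographs."""
    x_mix = lam * x1 + (1 - lam) * x2
    p_mix_target = lam * f(x1) + (1 - lam) * f(x2)
    return float(np.mean((f(x_mix) - p_mix_target) ** 2))

f = lambda x: np.tanh(x @ np.full((4, 3), 0.5))   # stand-in classifier
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=(2, 4)), rng.normal(size=(2, 4))
print(interpolation_consistency(f, x1, x2))
```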

Learning Nonparametric Human Mesh Reconstruction from a Single Image without Ground Truth Meshes

Title Learning Nonparametric Human Mesh Reconstruction from a Single Image without Ground Truth Meshes
Authors Kevin Lin, Lijuan Wang, Ying Jin, Zicheng Liu, Ming-Ting Sun
Abstract Nonparametric approaches have shown promising results on reconstructing 3D human mesh from a single monocular image. Unlike previous approaches that use a parametric human model like skinned multi-person linear model (SMPL), and attempt to regress the model parameters, nonparametric approaches relax the heavy reliance on the parametric space. However, existing nonparametric methods require ground truth meshes as their regression target for each vertex, and obtaining ground truth mesh labels is very expensive. In this paper, we propose a novel approach to learn human mesh reconstruction without any ground truth meshes. This is made possible by introducing two new terms into the loss function of a graph convolutional neural network (Graph CNN). The first term is the Laplacian prior that acts as a regularizer on the reconstructed mesh. The second term is the part segmentation loss that forces the projected region of the reconstructed mesh to match the part segmentation. Experimental results on multiple public datasets show that without using 3D ground truth meshes, the proposed approach outperforms the previous state-of-the-art approaches that require ground truth meshes for training.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2003.00052v1
PDF https://arxiv.org/pdf/2003.00052v1.pdf
PWC https://paperswithcode.com/paper/learning-nonparametric-human-mesh
Repo https://github.com/chingswy/HumanPoseMemo
Framework pytorch
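
The Laplacian prior in the first loss term penalizes vertices that drift from the centroid of their neighbors, keeping the predicted mesh smooth without 3D supervision. A small NumPy sketch on a toy mesh graph (the connectivity and uniform weighting are illustrative):

```python
import numpy as np

def laplacian_loss(vertices, neighbors):
    """vertices: (N, 3); neighbors: list of neighbor-index lists per vertex.
    Penalizes each vertex's offset from the centroid of its neighbors,
    acting as a smoothness regularizer on the reconstructed mesh."""
    loss = 0.0
    for i, nbrs in enumerate(neighbors):
        delta = vertices[i] - vertices[nbrs].mean(axis=0)
        loss += float(delta @ delta)
    return loss / len(neighbors)

V = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [2., 2, 2]])
nbrs = [[1, 2], [0, 2], [0, 1], [0, 1, 2]]   # toy connectivity
print(laplacian_loss(V, nbrs))   # the outlier vertex dominates the loss
```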

Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement

Title Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement
Authors Zehao Yu, Shenghua Gao
Abstract Almost all previous deep learning-based multi-view stereo (MVS) approaches focus on improving reconstruction quality. Besides quality, efficiency is also a desirable feature for MVS in real scenarios. Towards this end, this paper presents a Fast-MVSNet, a novel sparse-to-dense coarse-to-fine framework, for fast and accurate depth estimation in MVS. Specifically, in our Fast-MVSNet, we first construct a sparse cost volume for learning a sparse and high-resolution depth map. Then we leverage a small-scale convolutional neural network to encode the depth dependencies for pixels within a local region to densify the sparse high-resolution depth map. At last, a simple but efficient Gauss-Newton layer is proposed to further optimize the depth map. On one hand, the high-resolution depth map, the data-adaptive propagation method and the Gauss-Newton layer jointly guarantee the effectiveness of our method. On the other hand, all modules in our Fast-MVSNet are lightweight and thus guarantee the efficiency of our approach. Besides, our approach is also memory-friendly because of the sparse depth representation. Extensive experimental results show that our method is 5$\times$ and 14$\times$ faster than Point-MVSNet and R-MVSNet, respectively, while achieving comparable or even better results on the challenging Tanks and Temples dataset as well as the DTU dataset. Code is available at https://github.com/svip-lab/FastMVSNet.
Tasks Depth Estimation
Published 2020-03-29
URL https://arxiv.org/abs/2003.13017v1
PDF https://arxiv.org/pdf/2003.13017v1.pdf
PWC https://paperswithcode.com/paper/fast-mvsnet-sparse-to-dense-multi-view-stereo
Repo https://github.com/svip-lab/FastMVSNet
Framework pytorch
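
The Gauss-Newton layer amounts to one classic least-squares update per pixel, moving the depth by -(J^T J)^{-1} J^T r for residuals r between reference and warped source features. A scalar-depth toy version of that update:

```python
import numpy as np

def gauss_newton_step(depth, residual_fn, jacobian_fn):
    """One Gauss-Newton update on a per-pixel depth estimate.
    residual_fn(d) -> (k,) feature residuals; jacobian_fn(d) -> (k,)
    derivatives of those residuals with respect to depth."""
    r = residual_fn(depth)
    J = jacobian_fn(depth)
    return depth - (J @ r) / (J @ J)   # (J^T J)^-1 J^T r in the 1-D case

# Toy problem with ground-truth depth 2.0 and linear residuals.
res = lambda d: np.array([d - 2.0, 2.0 * (d - 2.0)])
jac = lambda d: np.array([1.0, 2.0])
print(gauss_newton_step(1.0, res, jac))   # jumps straight to 2.0
```

Because the update has a closed form, the layer adds almost no parameters, which is consistent with the paper's emphasis on keeping every module lightweight.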

High-Order Residual Network for Light Field Super-Resolution

Title High-Order Residual Network for Light Field Super-Resolution
Authors Nan Meng, Xiaofei Wu, Jianzhuang Liu, Edmund Y. Lam
Abstract Plenoptic cameras usually sacrifice the spatial resolution of their sub-aperture images (SAIs) to acquire geometry information from different viewpoints. Several methods have been proposed to mitigate this spatio-angular trade-off, but they seldom make efficient use of the structural properties of light field (LF) data. In this paper, we propose a novel high-order residual network to learn geometric features hierarchically from the LF for reconstruction. An important component of the proposed network is the high-order residual block (HRB), which learns local geometric features by considering information from all input views. After fully obtaining the local features learned by each HRB, our model extracts representative geometric features for spatio-angular upsampling through global residual learning. Additionally, a refinement network follows to further enhance the spatial details by minimizing a perceptual loss. Compared with previous work, our model is tailored to the rich structure inherent in LF data, and therefore reduces artifacts near non-Lambertian and occlusion regions. Experimental results show that our approach enables high-quality reconstruction even in challenging regions and outperforms state-of-the-art single-image and LF reconstruction methods in both quantitative measurements and visual evaluation.
Tasks Super-Resolution
Published 2020-03-29
URL https://arxiv.org/abs/2003.13094v1
PDF https://arxiv.org/pdf/2003.13094v1.pdf
PWC https://paperswithcode.com/paper/high-order-residual-network-for-light-field
Repo https://github.com/monaen/LightFieldReconstruction
Framework tf
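
An HRB can be sketched as a block whose per-view update mixes information from all input views before a residual connection; the toy NumPy version below stands in for the real learned convolutional block, with all shapes assumed.

```python
import numpy as np

def high_order_residual_block(views, W_self, W_cross):
    """views: (V, D) features, one row per sub-aperture view. Each view's
    update sees a summary of every other view (the 'all input views'
    coupling from the abstract) and is added back residually."""
    mixed = views @ W_self + views.mean(axis=0, keepdims=True) @ W_cross
    return views + np.tanh(mixed)

rng = np.random.default_rng(0)
V, D = 9, 16                      # e.g., 3x3 angular views, toy width
views = rng.normal(size=(V, D))
W_self = rng.normal(size=(D, D)) * 0.1
W_cross = rng.normal(size=(D, D)) * 0.1
print(high_order_residual_block(views, W_self, W_cross).shape)  # (9, 16)
```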

Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning

Title Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
Authors Tianlong Chen, Sijia Liu, Shiyu Chang, Yu Cheng, Lisa Amini, Zhangyang Wang
Abstract Pretrained models from self-supervision are widely used to make fine-tuning on downstream tasks faster or more accurate. However, gaining robustness from pretraining has been left unexplored. We introduce adversarial training into self-supervision to provide, for the first time, general-purpose robust pre-trained models. We find that these robust pre-trained models can benefit subsequent fine-tuning in two ways: i) boosting final model robustness; ii) saving computation cost when proceeding towards adversarial fine-tuning. We conduct extensive experiments to demonstrate that the proposed framework achieves large performance margins (e.g., 3.83% on robust accuracy and 1.3% on standard accuracy on the CIFAR-10 dataset) compared with the conventional end-to-end adversarial training baseline. Moreover, we find that different self-supervised pre-trained models exhibit diverse adversarial vulnerabilities. This inspires us to ensemble several pretraining tasks, which boosts robustness further. Our ensemble strategy contributes a further improvement of 3.59% on robust accuracy, while maintaining a slightly higher standard accuracy on CIFAR-10. Our code is available at https://github.com/TAMU-VITA/Adv-SS-Pretraining.
Tasks
Published 2020-03-28
URL https://arxiv.org/abs/2003.12862v1
PDF https://arxiv.org/pdf/2003.12862v1.pdf
PWC https://paperswithcode.com/paper/adversarial-robustness-from-self-supervised
Repo https://github.com/TAMU-VITA/Adv-SS-Pretraining
Framework pytorch
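
Robust pretraining hinges on generating perturbations against the pretext loss rather than a label loss. A minimal PGD-style inner loop, with gradients supplied by the caller (autograd in practice); the step sizes and the toy loss gradient are assumptions:

```python
import numpy as np

def pgd_attack(x, loss_grad, eps=8 / 255, alpha=2 / 255, steps=7):
    """Projected-gradient-descent perturbation maximizing a loss.
    loss_grad(x_adv) must return d(loss)/d(x_adv); plugging in a
    self-supervised pretext loss yields adversarial pretraining."""
    x_adv = x + np.random.uniform(-eps, eps, size=x.shape)
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(loss_grad(x_adv))   # ascent step
        x_adv = x + np.clip(x_adv - x, -eps, eps)           # project to ball
    return np.clip(x_adv, 0.0, 1.0)                         # valid pixel range

x = np.full((3, 32, 32), 0.5)     # toy image
grad = lambda z: z - x            # gradient of a toy quadratic loss
print(np.abs(pgd_attack(x, grad) - x).max() <= 8 / 255 + 1e-9)  # True
```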

Ontology for Scenarios for the Assessment of Automated Vehicles

Title Ontology for Scenarios for the Assessment of Automated Vehicles
Authors E. de Gelder, J.-P. Paardekooper, A. Khabbaz Saberi, H. Elrofai, O. Op den Camp, J. Ploeg, L. Friedmann, B. De Schutter
Abstract The development of assessment methods for the performance of Automated Vehicles (AVs) is essential to enable and speed up the deployment of automated driving technologies, due to the complex operational domain of AVs. As traditional methods for assessing vehicles are not applicable for AVs, other approaches have been proposed. Among these, real-world scenario-based assessment is widely supported by many players in the automotive field. In this approach, test cases are derived from real-world scenarios that are obtained from driving data. To minimize any ambiguity regarding these test cases and scenarios, a clear definition of the notion of scenario is required. In this paper, we propose a more concrete definition of scenario, compared to what is known to the authors from the literature. This is achieved by proposing an ontology in which the quantitative building blocks of a scenario are defined. An example illustrates that the presented ontology is applicable for scenario-based assessment of AVs.
Tasks
Published 2020-01-30
URL https://arxiv.org/abs/2001.11507v2
PDF https://arxiv.org/pdf/2001.11507v2.pdf
PWC https://paperswithcode.com/paper/ontology-for-scenarios-for-the-assessment-of
Repo https://github.com/ErwindeGelder/ScenarioDomainModel
Framework none
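
The quantitative building blocks of a scenario can be sketched as a small data model; the class and field names below are illustrative stand-ins, not the paper's exact ontology terms.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative building blocks only; the paper's ontology defines its
# own terms for actors, activities, and the static environment.
@dataclass
class Actor:
    category: str                  # e.g., "passenger car", "cyclist"
    initial_speed_mps: float

@dataclass
class Activity:
    actor: Actor
    kind: str                      # e.g., "lane change", "braking"
    start_s: float                 # start time within the scenario [s]
    end_s: float                   # end time within the scenario [s]

@dataclass
class Scenario:
    road_layout: str               # static environment description
    actors: List[Actor] = field(default_factory=list)
    activities: List[Activity] = field(default_factory=list)

ego = Actor("passenger car", 25.0)
lead = Actor("passenger car", 20.0)
cut_in = Scenario("two-lane motorway", [ego, lead],
                  [Activity(lead, "lane change", 1.0, 3.5)])
print(len(cut_in.activities))
```

Making the building blocks quantitative like this is what allows test cases to be derived from, and compared against, scenarios mined from real driving data.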

Toward Tag-free Aspect Based Sentiment Analysis: A Multiple Attention Network Approach

Title Toward Tag-free Aspect Based Sentiment Analysis: A Multiple Attention Network Approach
Authors Yao Qiang, Xin Li, Dongxiao Zhu
Abstract Existing aspect-based sentiment analysis (ABSA) approaches leverage various neural network models to extract aspect sentiments by learning aspect-specific feature representations. However, these approaches heavily rely on manual tagging of user reviews according to predefined aspects as input, a laborious and time-consuming process. Moreover, the underlying methods do not explain how and why opposing aspect-level polarities in a user review lead to the overall polarity. In this paper, we tackle these two problems by designing and implementing a new Multiple-Attention Network (MAN) approach for more powerful ABSA without the need for aspect tags, using two new tag-free datasets crawled directly from TripAdvisor (https://www.tripadvisor.com). With its Self- and Position-Aware attention mechanism, MAN is capable of extracting both aspect-level and overall sentiments from text reviews using the aspect-level and overall customer ratings, and it can also detect the vital aspect(s) leading to the overall sentiment polarity among different aspects via a new aspect ranking scheme. We carry out extensive experiments to demonstrate the strong performance of MAN compared to other state-of-the-art ABSA approaches and the explainability of our approach by visualizing and interpreting attention weights in case studies.
Tasks Aspect-Based Sentiment Analysis, Sentiment Analysis
Published 2020-03-22
URL https://arxiv.org/abs/2003.09986v1
PDF https://arxiv.org/pdf/2003.09986v1.pdf
PWC https://paperswithcode.com/paper/toward-tag-free-aspect-based-sentiment
Repo https://github.com/qiangyao1988/Toward-tag-free-ABSA
Framework none
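
The tag-free setup supervises aspect-level attention with the aspect ratings that reviews already carry, instead of token-level aspect tags. A toy sketch of per-aspect attention pooling; the dimensions and single-head form are assumptions, not MAN's exact architecture:

```python
import numpy as np

def aspect_attention_pool(token_vecs, aspect_queries):
    """token_vecs: (T, D) encoded review tokens; aspect_queries: (A, D),
    one learned query per rated aspect. Returns (A, D) aspect summaries
    whose predicted polarities can be supervised by the aspect-level
    ratings, so no token-level aspect tags are needed."""
    scores = aspect_queries @ token_vecs.T / np.sqrt(token_vecs.shape[1])
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ token_vecs

rng = np.random.default_rng(0)
tokens = rng.normal(size=(40, 32))     # a 40-token review
queries = rng.normal(size=(5, 32))     # e.g., food/service/price/...
print(aspect_attention_pool(tokens, queries).shape)   # (5, 32)
```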

Novel Entity Discovery from Web Tables

Title Novel Entity Discovery from Web Tables
Authors Shuo Zhang, Edgar Meij, Krisztian Balog, Ridho Reinanda
Abstract When working with any sort of knowledge base (KB), one has to make sure it is as complete and as up-to-date as possible. Both tasks are non-trivial, as they require recall-oriented efforts to determine which entities and relationships are missing from the KB; as such, they require a significant amount of labor. Tables on the Web, on the other hand, are abundant and have the distinct potential to assist with these tasks. In particular, we can leverage the content of such tables to discover new entities, properties, and relationships. Because web tables typically only contain raw textual content, we first need to determine which cells refer to which known entities—a task we dub table-to-KB matching. This first task aims to infer table semantics by linking table cells and heading columns to elements of a KB. The second task builds upon these linked entities and properties to identify novel ones in the same table and to bootstrap their types and additional relationships. We refer to this process as novel entity discovery and, to the best of our knowledge, it is the first endeavor to mine the unlinked cells in web tables. Our method identifies not only out-of-KB ("novel") information but also novel aliases for in-KB ("known") entities. When evaluated using three purpose-built test collections, we find that our proposed approaches obtain a marked improvement in precision over our baselines whilst keeping recall stable.
Tasks
Published 2020-02-01
URL https://arxiv.org/abs/2002.00206v1
PDF https://arxiv.org/pdf/2002.00206v1.pdf
PWC https://paperswithcode.com/paper/novel-entity-discovery-from-web-tables
Repo https://github.com/wikipedia2vec/wikipedia2vec
Framework none
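
Table-to-KB matching starts by linking cells whose text matches a known entity name or alias; cells left unlinked become the candidates for novel entity discovery. A toy first pass (real systems add disambiguation and column-level signals; the tiny KB here is made up):

```python
# Toy first pass of table-to-KB matching: link cells by exact name or
# alias lookup; unlinked cells become candidate novel entities.
kb = {
    "Q1": {"name": "Douglas Adams", "aliases": {"D. Adams"}},
    "Q2": {"name": "London", "aliases": {"Greater London"}},
}
lookup = {}
for eid, e in kb.items():
    lookup[e["name"].lower()] = eid
    lookup.update({a.lower(): eid for a in e["aliases"]})

def link_cells(column):
    linked, novel = {}, []
    for cell in column:
        eid = lookup.get(cell.strip().lower())
        if eid:
            linked[cell] = eid           # in-KB ("known") entity
        else:
            novel.append(cell)           # candidate novel entity
    return linked, novel

linked, novel = link_cells(["Douglas Adams", "Greater London", "Neil Gaiman"])
print(linked)   # two in-KB matches, one via an alias
print(novel)    # ['Neil Gaiman'] -> candidate for novel entity discovery
```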