April 2, 2020


Paper Group ANR 365

Determination of Latent Dimensionality in International Trade Flow. Jelly Bean World: A Testbed for Never-Ending Learning. Bringing freedom in variable choice when searching counter-examples in floating point programs. Using AI for Mitigating the Impact of Network Delay in Cloud-based Intelligent Traffic Signal Control. Deceiving Image-to-Image Tra …

Determination of Latent Dimensionality in International Trade Flow


Title	Determination of Latent Dimensionality in International Trade Flow
Authors	Duc P. Truong, Erik Skau, Vladimir I. Valtchinov, Boian S. Alexandrov
Abstract	Currently, high-dimensional data is ubiquitous in data science, which necessitates the development of techniques to decompose and interpret such multidimensional (aka tensor) datasets. Finding a low dimensional representation of the data, that is, its inherent structure, is one of the approaches that can serve to understand the dynamics of low dimensional latent features hidden in the data. Nonnegative RESCAL is one such technique, particularly well suited to analyze self-relational data, such as dynamic networks found in international trade flows. Nonnegative RESCAL computes a low dimensional tensor representation by finding the latent space containing multiple modalities. Estimating the dimensionality of this latent space is crucial for extracting meaningful latent features. Here, to determine the dimensionality of the latent space with nonnegative RESCAL, we propose a latent dimension determination method based on clustering the solutions of multiple realizations of nonnegative RESCAL decompositions. We demonstrate the performance of our model selection method on synthetic data, and then apply it to decompose a network of international trade flow data from the International Monetary Fund and validate the resulting features against empirical facts from the economic literature.
Tasks	Model Selection
Published	2020-02-29
URL	https://arxiv.org/abs/2003.00129v1
PDF	https://arxiv.org/pdf/2003.00129v1.pdf
PWC	https://paperswithcode.com/paper/determination-of-latent-dimensionality-in
Repo
Framework

Jelly Bean World: A Testbed for Never-Ending Learning


Title	Jelly Bean World: A Testbed for Never-Ending Learning
Authors	Emmanouil Antonios Platanios, Abulhair Saparov, Tom Mitchell
Abstract	Machine learning has shown growing success in recent years. However, current machine learning systems are highly specialized, trained for particular problems or domains, and typically on a single narrow dataset. Human learning, on the other hand, is highly general and adaptable. Never-ending learning is a machine learning paradigm that aims to bridge this gap, with the goal of encouraging researchers to design machine learning systems that can learn to perform a wider variety of inter-related tasks in more complex environments. To date, there is no environment or testbed to facilitate the development and evaluation of never-ending learning systems. To this end, we propose the Jelly Bean World testbed. The Jelly Bean World allows experimentation over two-dimensional grid worlds which are filled with items and in which agents can navigate. This testbed provides environments that are sufficiently complex and where more generally intelligent algorithms ought to perform better than current state-of-the-art reinforcement learning approaches. It does so by producing non-stationary environments and facilitating experimentation with multi-task, multi-agent, multi-modal, and curriculum learning settings. We hope that this new freely-available software will prompt new research and interest in the development and evaluation of never-ending learning systems and more broadly, general intelligence systems.
Tasks
Published	2020-02-15
URL	https://arxiv.org/abs/2002.06306v1
PDF	https://arxiv.org/pdf/2002.06306v1.pdf
PWC	https://paperswithcode.com/paper/jelly-bean-world-a-testbed-for-never-ending-1
Repo
Framework

Bringing freedom in variable choice when searching counter-examples in floating point programs


Title	Bringing freedom in variable choice when searching counter-examples in floating point programs
Authors	Heytem Zitoun, Claude Michel, Laurent Michel, Michel Rueher
Abstract	Program verification techniques typically focus on finding counter-examples that violate properties of a program. Constraint programming offers a convenient way to verify programs by modeling their state transformations and specifying searches that seek counter-examples. Floating-point computations present additional challenges for verification given the semantic subtleties of floating-point arithmetic. This paper focuses on search strategies for CSPs over floating-point constraint systems dedicated to program verification. It introduces a new search heuristic based on the global number of occurrences that outperforms state-of-the-art strategies. More importantly, it demonstrates that a new technique that branches only on input variables of the verified program improves performance. This technique composes with a diversification technique that prevents the selection of the same variable within a fixed horizon, further improving performance and reducing disparities between variable choice heuristics. The result is a robust methodology that can tailor the search strategy according to the sought properties of the counter-example.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12447v1
PDF	https://arxiv.org/pdf/2002.12447v1.pdf
PWC	https://paperswithcode.com/paper/bringing-freedom-in-variable-choice-when
Repo
Framework

Using AI for Mitigating the Impact of Network Delay in Cloud-based Intelligent Traffic Signal Control


Title	Using AI for Mitigating the Impact of Network Delay in Cloud-based Intelligent Traffic Signal Control
Authors	Rusheng Zhang, Xinze Zhou, Ozan K. Tonguz
Abstract	The recent advancements in cloud services, Internet of Things (IoT) and Cellular networks have made cloud computing an attractive option for intelligent traffic signal control (ITSC). Such a method significantly reduces the cost of cables, installation, number of devices used, and maintenance. ITSC systems based on cloud computing lower the cost of the ITSC systems and make it possible to scale the system by utilizing the existing powerful cloud platforms. While such systems have significant potential, one of the critical problems that should be addressed is the network delay. It is well known that network delay in message propagation is hard to prevent, which could potentially degrade the performance of the system or even create safety issues for vehicles at intersections. In this paper, we introduce a new traffic signal control algorithm based on reinforcement learning, which performs well even under severe network delay. The framework introduced in this paper can be helpful for all agent-based systems using remote computing resources where network delay could be a critical concern. Extensive simulation results obtained for different scenarios show the viability of the designed algorithm to cope with network delay.
Tasks
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08303v2
PDF	https://arxiv.org/pdf/2002.08303v2.pdf
PWC	https://paperswithcode.com/paper/using-ai-for-mitigating-the-impact-of-network
Repo
Framework

Deceiving Image-to-Image Translation Networks for Autonomous Driving with Adversarial Perturbations


Title	Deceiving Image-to-Image Translation Networks for Autonomous Driving with Adversarial Perturbations
Authors	Lin Wang, Wonjune Cho, Kuk-Jin Yoon
Abstract	Deep neural networks (DNNs) have achieved impressive performance on computer vision problems; however, DNNs have been found to be vulnerable to adversarial examples. For this reason, adversarial perturbations have recently been studied in several respects. However, most previous works have focused on image classification tasks, and adversarial perturbations have never been studied for image-to-image (Im2Im) translation tasks, which have shown great success in handling paired and/or unpaired mapping problems in the field of autonomous driving and robotics. This paper examines different types of adversarial perturbations that can fool Im2Im frameworks for autonomous driving purposes. We propose both quasi-physical and digital adversarial perturbations that can make Im2Im models yield unexpected results. We then empirically analyze these perturbations and show that they generalize well under both paired settings for image synthesis and unpaired settings for style transfer. We also validate that there exist perturbation thresholds beyond which the Im2Im mapping is disrupted or impossible. The existence of these perturbations reveals crucial weaknesses in Im2Im models. Lastly, we show how perturbations affect the quality of outputs, paving the way for improving the robustness of current SOTA networks for autonomous driving.
Tasks	Autonomous Driving, Image Classification, Image Generation, Image-to-Image Translation, Style Transfer
Published	2020-01-06
URL	https://arxiv.org/abs/2001.01506v1
PDF	https://arxiv.org/pdf/2001.01506v1.pdf
PWC	https://paperswithcode.com/paper/deceiving-image-to-image-translation-networks
Repo
Framework

RGB-D Odometry and SLAM


Title	RGB-D Odometry and SLAM
Authors	Javier Civera, Seong Hun Lee
Abstract	The emergence of modern RGB-D sensors has had a significant impact on many application fields, including robotics, augmented reality (AR) and 3D scanning. They are low-cost, low-power and low-size alternatives to traditional range sensors such as LiDAR. Moreover, unlike RGB cameras, RGB-D sensors provide additional depth information that removes the need for frame-by-frame triangulation for 3D scene reconstruction. These merits have made them very popular in mobile robotics and AR, where it is of great interest to estimate ego-motion and 3D scene structure. Such spatial understanding can enable robots to navigate autonomously without collisions and allow users to insert virtual entities consistent with the image stream. In this chapter, we review common formulations of odometry and Simultaneous Localization and Mapping (known by its acronym SLAM) using RGB-D stream input. The two topics are closely related, as the former aims to track the incremental camera motion with respect to a local map of the scene, and the latter to jointly estimate the camera trajectory and the global map with consistency. In both cases, the standard approaches minimize a cost function using nonlinear optimization techniques. This chapter consists of three main parts: In the first part, we introduce the basic concept of odometry and SLAM and motivate the use of RGB-D sensors. We also give mathematical preliminaries relevant to most odometry and SLAM algorithms. In the second part, we detail the three main components of SLAM systems: camera pose tracking, scene mapping and loop closing. For each component, we describe different approaches proposed in the literature. In the final part, we provide a brief discussion of advanced research topics with references to the state of the art.
Tasks	3D Scene Reconstruction, Pose Tracking, Simultaneous Localization and Mapping
Published	2020-01-19
URL	https://arxiv.org/abs/2001.06875v1
PDF	https://arxiv.org/pdf/2001.06875v1.pdf
PWC	https://paperswithcode.com/paper/rgb-d-odometry-and-slam
Repo
Framework

Marine life through You Only Look Once’s perspective


Title	Marine life through You Only Look Once’s perspective
Authors	Herman Stavelin, Adil Rasheed, Omer San, Arne Johan Hestnes
Abstract	With the growing focus on man-made changes to our planet and the wildlife therein, more and more emphasis is put on sustainable and responsible gathering of resources. In an effort to preserve maritime wildlife, the Norwegian government has decided that it is necessary to create an overview of the presence and abundance of various species of wildlife in the Norwegian fjords and oceans. In this paper we apply and analyze an object detection scheme that detects fish in camera images. The data is sampled from a submerged data station at Fulehuk in Norway. We implement You Only Look Once (YOLO) version 3 and create a dataset consisting of 99,961 images, achieving a mAP of $\sim 0.88$. We also investigate intermediate results within YOLO, gaining insight into how it performs object detection.
Tasks	Object Detection
Published	2020-02-11
URL	https://arxiv.org/abs/2003.00836v1
PDF	https://arxiv.org/pdf/2003.00836v1.pdf
PWC	https://paperswithcode.com/paper/marine-life-through-you-only-look-onces
Repo
Framework

Improving Place Recognition Using Dynamic Object Detection


Title	Improving Place Recognition Using Dynamic Object Detection
Authors	Juan Pablo Munoz, Scott Dexter
Abstract	Traditional appearance-based place recognition algorithms based on handcrafted features have proven inadequate in environments with a significant presence of dynamic objects – objects that may or may not be present in an agent’s subsequent visits. Place representations from features extracted using Deep Learning approaches have gained popularity for their robustness and because the algorithms that use them yield better accuracy. Nevertheless, handcrafted features are still popular in devices that have limited resources. This article presents a novel approach that improves place recognition in environments populated by dynamic objects by incorporating the very knowledge of these objects to improve the overall quality of the representations of places used for matching. The proposed approach fuses object detection and place description, Deep Learning and handcrafted features, with the benefit of reducing memory and storage requirements. The proposed approach was evaluated using both synthetic and real-world datasets and yields improved place recognition accuracy. The adoption of the proposed approach will significantly improve place recognition results in environments populated by dynamic objects and explored by devices with limited resources, with particular utility in both indoor and outdoor environments.
Tasks	Object Detection
Published	2020-02-11
URL	https://arxiv.org/abs/2002.04698v1
PDF	https://arxiv.org/pdf/2002.04698v1.pdf
PWC	https://paperswithcode.com/paper/improving-place-recognition-using-dynamic
Repo
Framework

Category-wise Attack: Transferable Adversarial Examples for Anchor Free Object Detection


Title	Category-wise Attack: Transferable Adversarial Examples for Anchor Free Object Detection
Authors	Quanyu Liao, Xin Wang, Bin Kong, Siwei Lyu, Youbing Yin, Qi Song, Xi Wu
Abstract	Deep neural networks have been demonstrated to be vulnerable to adversarial attacks: subtle perturbations can completely change the classification results. Their vulnerability has led to a surge of research in this direction. However, most works are dedicated to attacking anchor-based object detection models. In this work, we aim to present an effective and efficient algorithm to generate adversarial examples to attack anchor-free object detection models based on two approaches. First, we conduct category-wise instead of instance-wise attacks on the object detectors. Second, we leverage the high-level semantic information to generate the adversarial examples. Surprisingly, the generated adversarial examples are not only able to effectively attack the targeted anchor-free object detector but can also be transferred to attack other object detectors, even anchor-based detectors such as Faster R-CNN.
Tasks	Object Detection
Published	2020-02-10
URL	https://arxiv.org/abs/2003.04367v1
PDF	https://arxiv.org/pdf/2003.04367v1.pdf
PWC	https://paperswithcode.com/paper/category-wise-attack-transferable-adversarial
Repo
Framework

Towards detection and classification of microscopic foraminifera using transfer learning


Title	Towards detection and classification of microscopic foraminifera using transfer learning
Authors	Thomas Haugland Johansen, Steffen Aagaard Sørensen
Abstract	Foraminifera are single-celled marine organisms, which may have a planktic or benthic lifestyle. During their life cycle they construct shells consisting of one or more chambers, and these shells remain as fossils in marine sediments. Classifying and counting these fossils has become an important tool in e.g. oceanography and climatology. Currently the process of identifying and counting microfossils is performed manually using a microscope and is very time consuming. Developing methods to automate this process is therefore considered important across a range of research fields. The first steps towards developing a deep learning model that can detect and classify microscopic foraminifera are proposed. The proposed model is based on a VGG16 model that has been pretrained on the ImageNet dataset, and adapted to the foraminifera task using transfer learning. Additionally, a novel image dataset consisting of microscopic foraminifera and sediments from the Barents Sea region is introduced.
Tasks	Transfer Learning
Published	2020-01-14
URL	https://arxiv.org/abs/2001.04782v1
PDF	https://arxiv.org/pdf/2001.04782v1.pdf
PWC	https://paperswithcode.com/paper/towards-detection-and-classification-of
Repo
Framework

Service Selection using Predictive Models and Monte-Carlo Tree Search


Title	Service Selection using Predictive Models and Monte-Carlo Tree Search
Authors	Cliff Laschet, Jorn op den Buijs, Mark H. M. Winands, Steffen Pauws
Abstract	This article proposes a method for automated service selection to improve treatment efficacy and reduce re-hospitalization costs. A predictive model is developed using the National Home and Hospice Care Survey (NHHCS) dataset to quantify the effect of care services on the risk of re-hospitalization. By taking the patient’s characteristics and other selected services into account, the model is able to indicate the overall effectiveness of a combination of services for a specific NHHCS patient. The developed model is incorporated in Monte-Carlo Tree Search (MCTS) to determine optimal combinations of services that minimize the risk of emergency re-hospitalization. MCTS serves as a risk minimization algorithm in this case, using the predictive model for guidance during the search. Using this method on the NHHCS dataset, a significant reduction in risk of re-hospitalization is observed compared to the original selections made by clinicians. An 11.89 percentage points risk reduction is achieved on average. Higher reductions of roughly 40 percentage points on average are observed for NHHCS patients in the highest risk categories. These results seem to indicate that there is enormous potential for improving service selection in the near future.
Tasks
Published	2020-02-12
URL	https://arxiv.org/abs/2002.04852v1
PDF	https://arxiv.org/pdf/2002.04852v1.pdf
PWC	https://paperswithcode.com/paper/service-selection-using-predictive-models-and
Repo
Framework

Further Boosting BERT-based Models by Duplicating Existing Layers: Some Intriguing Phenomena inside BERT


Title	Further Boosting BERT-based Models by Duplicating Existing Layers: Some Intriguing Phenomena inside BERT
Authors	Wei-Tsung Kao, Tsung-Han Wu, Po-Han Chi, Chun-Cheng Hsieh, Hung-Yi Lee
Abstract	Although Bidirectional Encoder Representations from Transformers (BERT) has achieved tremendous success in many natural language processing (NLP) tasks, it remains a black box, so much previous work has tried to lift the veil of BERT and understand the functionality of each layer. In this paper, we find that removing or duplicating most layers in BERT would not change the outputs. This fact remains true across a wide variety of BERT-based models. Based on this observation, we propose a quite simple method to boost the performance of BERT. By duplicating some layers in BERT-based models to make them deeper (no extra training is required in this step), the models obtain better performance on downstream tasks after fine-tuning.
Tasks
Published	2020-01-25
URL	https://arxiv.org/abs/2001.09309v1
PDF	https://arxiv.org/pdf/2001.09309v1.pdf
PWC	https://paperswithcode.com/paper/further-boosting-bert-based-models-by
Repo
Framework
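
The layer-duplication step described in the abstract above can be sketched without any deep learning library. This is only an illustration under stated assumptions: the abstract does not say which layers are duplicated or whether copies share weights, so the helper `duplicate_layers`, its `indices` argument, and the choice of independent deep copies are all hypothetical.

```python
import copy


def duplicate_layers(layers, indices):
    """Return a deeper stack in which each selected layer appears twice in a row.

    `layers` is any ordered list of layer objects (hypothetical stand-ins for
    Transformer encoder layers). `indices` holds the positions to duplicate.
    Assumption: each copy is an independent deep copy, so the original and the
    duplicate can diverge during later fine-tuning.
    """
    chosen = set(indices)
    deeper = []
    for position, layer in enumerate(layers):
        deeper.append(layer)
        if position in chosen:
            deeper.append(copy.deepcopy(layer))
    return deeper


# Toy stand-ins for encoder layers.
stack = [{"name": "layer0"}, {"name": "layer1"}, {"name": "layer2"}]
deeper_stack = duplicate_layers(stack, indices=[1])
print([layer["name"] for layer in deeper_stack])
```

No training happens in this step, which matches the abstract; any gain would come from fine-tuning the deeper stack afterwards.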

Algorithm-hardware Co-design for Deformable Convolution


Title	Algorithm-hardware Co-design for Deformable Convolution
Authors	Qijing Huang, Dequan Wang, Yizhao Gao, Yaohui Cai, Zhen Dong, Bichen Wu, Kurt Keutzer, John Wawrzynek
Abstract	FPGAs provide a flexible and efficient platform to accelerate rapidly-changing algorithms for computer vision. The majority of existing work focuses on accelerating image classification, while other fundamental vision problems, including object detection and instance segmentation, have not been adequately addressed. Compared with image classification, detection problems are more sensitive to the spatial variance of objects, and therefore, require specialized convolutions to aggregate spatial information. To address this, recent work proposes dynamic deformable convolution to augment regular convolutions. Regular convolutions process a fixed grid of pixels across all the spatial locations in an image, while dynamic deformable convolutions may access arbitrary pixels in the image and the access pattern is input-dependent and varies per spatial location. These properties lead to inefficient memory accesses of inputs with existing hardware. In this work, we first investigate the overhead of the deformable convolution on embedded FPGA SoCs, and then show the accuracy-latency tradeoffs for a set of algorithm modifications including full versus depthwise, fixed-shape, and limited-range. These modifications benefit the energy efficiency for embedded devices in general as they reduce the compute complexity. We then build an efficient object detection network with modified deformable convolutions and quantize the network using state-of-the-art quantization methods. We implement a unified hardware engine on FPGA to support all the operations in the network. Preliminary experiments show that little accuracy is compromised and speedup can be achieved with our co-design optimization for the deformable convolution.
Tasks	Image Classification, Instance Segmentation, Object Detection, Quantization, Semantic Segmentation
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08357v1
PDF	https://arxiv.org/pdf/2002.08357v1.pdf
PWC	https://paperswithcode.com/paper/algorithm-hardware-co-design-for-deformable
Repo
Framework
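
To make the memory-access problem in the abstract above concrete, the following sketch computes one output value of a 3x3 deformable convolution in plain Python. It is not the paper's hardware engine or any of its modified variants; the function names, the zero padding outside the image, and the list-of-lists image layout are assumptions made for illustration.

```python
import math


def bilinear_sample(image, y, x):
    """Read a 2D list-of-lists image at a fractional (y, x) location.

    Assumption: locations outside the image read as 0.0 (zero padding).
    """
    height, width = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    wy, wx = y - y0, x - x0

    def pixel(row, col):
        inside = 0 <= row < height and 0 <= col < width
        return image[row][col] if inside else 0.0

    return ((1 - wy) * (1 - wx) * pixel(y0, x0)
            + (1 - wy) * wx * pixel(y0, x0 + 1)
            + wy * (1 - wx) * pixel(y0 + 1, x0)
            + wy * wx * pixel(y0 + 1, x0 + 1))


def deformable_conv_at(image, weights, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    `weights` is a 3x3 list of floats; `offsets` is a 3x3 list of (dy, dx)
    pairs that shift each sampling location away from the regular grid.
    """
    total = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            total += weights[i][j] * bilinear_sample(
                image, cy + (i - 1) + dy, cx + (j - 1) + dx)
    return total


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
ones = [[1.0] * 3 for _ in range(3)]
zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
# With all offsets at zero this reduces to a regular 3x3 convolution.
print(deformable_conv_at(image, ones, zero_offsets, 1, 1))
```

Because the offsets are input-dependent, the pixels touched by `bilinear_sample` cannot be known ahead of time, which is the irregular access pattern the abstract identifies as inefficient on existing hardware.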

FourierNet: Compact mask representation for instance segmentation using differentiable shape decoders


Title	FourierNet: Compact mask representation for instance segmentation using differentiable shape decoders
Authors	Nuri Benbarka, Hamd ul Moqeet Riaz, Andreas Zell
Abstract	We present FourierNet, a single-shot, anchor-free, fully convolutional instance segmentation method, which predicts a shape vector that is converted into contour points using a numerical transformation. Compared to previous methods, we introduce a new training technique, where we utilize a differentiable shape decoder, which achieves automatic weight balancing of the shape vector’s coefficients. A Fourier series is used as the shape encoder because of its coefficient interpretability and fast implementation. By using its lower frequencies we were able to retrieve smooth and compact masks. FourierNet shows promising results compared to polygon representation methods, achieving 30.6 mAP on the MS COCO 2017 benchmark. At lower image resolutions, it runs at 26.6 FPS with 24.3 mAP. It achieves 23.3 mAP using just 8 parameters to represent the mask, which is double the number of parameters needed to predict a bounding box. Code will be available at: github.com/cogsys-tuebingen/FourierNet.
Tasks	Instance Segmentation, Semantic Segmentation
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02709v1
PDF	https://arxiv.org/pdf/2002.02709v1.pdf
PWC	https://paperswithcode.com/paper/fouriernet-compact-mask-representation-for
Repo
Framework
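
The idea of turning a short shape vector into contour points, as described in the abstract above, can be sketched with a generic complex Fourier descriptor. This is not necessarily the parameterization FourierNet uses; the function `decode_contour` and the dictionary layout of the coefficients are assumptions for illustration only.

```python
import cmath
import math


def decode_contour(coeffs, num_points=8):
    """Evaluate z(t) = sum_k c_k * exp(i * k * t) at evenly spaced t in [0, 2*pi).

    `coeffs` maps an integer frequency k to a complex coefficient c_k
    (a hypothetical layout). Returns a list of (x, y) contour points.
    """
    points = []
    for n in range(num_points):
        t = 2.0 * math.pi * n / num_points
        z = sum(c * cmath.exp(1j * k * t) for k, c in coeffs.items())
        points.append((z.real, z.imag))
    return points


# Frequency 0 sets the centre; frequency 1 alone traces a circle of radius |c_1|.
circle = decode_contour({0: 1 + 2j, 1: 3 + 0j}, num_points=4)
print(circle)
```

Keeping only low frequencies yields smooth closed curves, consistent with the abstract's remark that lower frequencies retrieve smooth and compact masks. Every step is a sum of products, so the decoder is differentiable with respect to the coefficients.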

Pose-Aware Instance Segmentation Framework from Cone Beam CT Images for Tooth Segmentation


Title	Pose-Aware Instance Segmentation Framework from Cone Beam CT Images for Tooth Segmentation
Authors	Minyoung Chung, Minkyung Lee, Jioh Hong, Sanguk Park, Jusang Lee, Jingyu Lee, Jeongjin Lee, Yeong-Gil Shin
Abstract	Individual tooth segmentation from cone beam computed tomography (CBCT) images is an essential prerequisite for an anatomical understanding of orthodontic structures in several applications, such as tooth reformation planning and implant guide simulations. However, the presence of severe metal artifacts in CBCT images hinders the accurate segmentation of each individual tooth. In this study, we propose a neural network for pixel-wise labeling to exploit an instance segmentation framework that is robust to metal artifacts. Our method comprises three steps: 1) image cropping and realignment by pose regressions, 2) metal-robust individual tooth detection, and 3) segmentation. We first extract the alignment information of the patient by pose regression neural networks to attain a volume-of-interest (VOI) region and realign the input image, which reduces the inter-overlapping area between tooth bounding boxes. Then, individual tooth regions are localized within a VOI realigned image using a convolutional detector. We improved the accuracy of the detector by employing non-maximum suppression and multiclass classification metrics in the region proposal network. Finally, we apply a convolutional neural network (CNN) to perform individual tooth segmentation by converting the pixel-wise labeling task to a distance regression task. Metal-intensive image augmentation is also employed for a robust segmentation of metal artifacts. The results show that our proposed method outperforms other state-of-the-art methods, especially for teeth with metal artifacts. The primary significance of the proposed method is two-fold: 1) the introduction of pose-aware VOI realignment followed by robust tooth detection and 2) a metal-robust CNN framework for accurate tooth segmentation.
Tasks	Image Augmentation, Image Cropping, Instance Segmentation, Semantic Segmentation
Published	2020-02-06
URL	https://arxiv.org/abs/2002.02143v1
PDF	https://arxiv.org/pdf/2002.02143v1.pdf
PWC	https://paperswithcode.com/paper/pose-aware-instance-segmentation-framework
Repo
Framework
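
The abstract above mentions non-maximum suppression in the detection stage. A minimal greedy version is sketched below for 2D boxes; the paper works on CBCT volumes and combines suppression with classification metrics, so the 2D box format `(x1, y1, x2, y2)`, the function names, and the 0.5 threshold are simplifying assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it.

    Returns the indices of the kept boxes, ordered by descending score.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(non_maximum_suppression(boxes, scores))
```

Adjacent teeth produce heavily overlapping candidate boxes, which may be why the realignment step that reduces overlap between tooth bounding boxes matters before suppression is applied.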