April 2, 2020

2969 words 14 mins read

Paper Group ANR 127



Proceedings of the AAAI-20 Workshop on Intelligent Process Automation (IPA-20)

Title Proceedings of the AAAI-20 Workshop on Intelligent Process Automation (IPA-20)
Authors Dell Zhang, Andre Freitas, Dacheng Tao, Dawn Song
Abstract This is the Proceedings of the AAAI-20 Workshop on Intelligent Process Automation (IPA-20) which took place in New York, NY, USA on February 7th 2020.
Published 2020-01-15
URL https://arxiv.org/abs/2001.05214v3
PDF https://arxiv.org/pdf/2001.05214v3.pdf
PWC https://paperswithcode.com/paper/proceedings-of-the-aaai-20-workshop-on

Uncertainty based Class Activation Maps for Visual Question Answering

Title Uncertainty based Class Activation Maps for Visual Question Answering
Authors Badri N. Patro, Mayank Lunayach, Vinay P. Namboodiri
Abstract Understanding and explaining deep learning models is an imperative task. Towards this, we propose a method that obtains gradient-based certainty estimates that also provide visual attention maps. In particular, we address the visual question answering task. We incorporate modern probabilistic deep learning methods, which we further improve by using gradients for these estimates. This has two-fold benefits: a) improved certainty estimates that correlate better with misclassified samples, and b) improved attention maps that provide state-of-the-art results in terms of correlation with human attention regions. The improved attention maps yield consistent improvements across various methods for visual question answering. The proposed technique can therefore be thought of as a recipe for obtaining improved certainty estimates and explanations for deep learning models. We provide a detailed empirical analysis of the visual question answering task on all standard benchmarks and a comparison with state-of-the-art methods.
Tasks Question Answering, Visual Question Answering
Published 2020-01-23
URL https://arxiv.org/abs/2002.10309v1
PDF https://arxiv.org/pdf/2002.10309v1.pdf
PWC https://paperswithcode.com/paper/uncertainty-based-class-activation-maps-for
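As a rough illustration of the kind of certainty estimate the paper builds on, the predictive entropy of a set of stochastic forward passes (e.g. MC-dropout) flags uncertain answers. The arrays and the entropy measure below are illustrative assumptions, not the authors' gradient-based method:

```python
import numpy as np

def predictive_entropy(prob_samples):
    """Entropy of the mean predictive distribution over stochastic
    forward passes; higher values flag less certain answers."""
    mean_probs = prob_samples.mean(axis=0)
    return float(-np.sum(mean_probs * np.log(mean_probs + 1e-12)))

# Ten passes that agree on one answer vs. passes that disagree.
confident = np.array([[0.90, 0.05, 0.05]] * 10)
uncertain = np.array([[0.90, 0.05, 0.05],
                      [0.05, 0.90, 0.05],
                      [0.05, 0.05, 0.90]] * 4)
```

A well-calibrated estimate of this kind should assign higher entropy to the disagreeing passes, which is exactly the behaviour the paper's estimates are meant to strengthen.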

Force-Ultrasound Fusion: Bringing Spine Robotic-US to the Next “Level”

Title Force-Ultrasound Fusion: Bringing Spine Robotic-US to the Next “Level”
Authors Maria Tirindelli, Maria Victorova, Javier Esteban, Seong Tae Kim, David Navarro-Alarcon, Yong Ping Zheng, Nassir Navab
Abstract Spine injections are commonly performed in several clinical procedures. The localization of the target vertebral level (i.e. the position of a vertebra in a spine) is typically done by back palpation or under X-ray guidance, yielding either higher chances of procedure failure or exposure to ionizing radiation. Preliminary studies in the literature suggest that ultrasound imaging may be a precise and safe alternative to X-ray for spine level detection. However, ultrasound data are noisy and complicated to interpret. In this study, a robotic-ultrasound approach for automatic vertebral level detection is introduced. The method relies on the fusion of ultrasound and force data, thus providing both “tactile” and visual feedback during the procedure, which results in higher performance in the presence of data corruption. A robotic arm automatically scans the volunteer’s back along the spine, using force-ultrasound data to locate vertebral levels. The occurrences of vertebral levels are visible on the force trace as peaks, which are enhanced by properly controlling the force applied by the robot on the patient’s back. Ultrasound data are processed with a Deep Learning method to extract a 1D signal modelling the probability of a vertebra at each location along the spine. Processed force and ultrasound data are fused using a 1D Convolutional Network to compute the location of the vertebral levels. The method is compared to pure image- and pure force-based methods for vertebral level counting, showing improved performance. In particular, the fusion method correctly classifies 100% of the vertebral levels in the test set, while the pure image- and pure force-based methods could only classify 80% and 90% of the vertebrae, respectively. The potential of the proposed method is evaluated in an exemplary simulated clinical application.
Published 2020-02-26
URL https://arxiv.org/abs/2002.11404v1
PDF https://arxiv.org/pdf/2002.11404v1.pdf
PWC https://paperswithcode.com/paper/force-ultrasound-fusion-bringing-spine
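The fusion idea can be caricatured in a few lines: vertebral levels appear as peaks in the force trace, and the ultrasound-derived probability signal confirms or rejects them. The naive peak detector and thresholds below are illustrative stand-ins for the paper's learned 1D ConvNet:

```python
import numpy as np

def find_peaks_1d(signal, min_height):
    """Indices of local maxima above min_height (naive sketch)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]
            and signal[i] >= min_height]

def fuse_levels(force, us_prob, min_height=0.5, min_prob=0.5):
    """Keep force peaks only where the ultrasound vertebra
    probability agrees: a toy stand-in for the learned fusion."""
    return [i for i in find_peaks_1d(force, min_height)
            if us_prob[i] >= min_prob]

force = np.array([0.0, 0.2, 0.9, 0.1, 0.8, 0.1, 0.0])
us    = np.array([0.1, 0.1, 0.9, 0.2, 0.2, 0.1, 0.1])
```

Here the force trace alone has two plausible peaks, but only the one the ultrasound signal agrees with survives, which is the robustness-to-corruption argument made in the abstract.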

GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet

Title GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet
Authors Shan You, Tao Huang, Mingmin Yang, Fei Wang, Chen Qian, Changshui Zhang
Abstract Training a supernet matters for one-shot neural architecture search (NAS) methods since it serves as a basic performance estimator for different architectures (paths). Current methods mainly hold the assumption that a supernet should give a reasonable ranking over all paths. They thus treat all paths equally and spend considerable effort training them. However, it is difficult for a single supernet to evaluate accurately over such a huge search space (e.g., $7^{21}$ paths). In this paper, instead of covering all paths, we ease the burden on the supernet by encouraging it to focus on evaluating the potentially-good ones, which are identified using a surrogate portion of validation data. Concretely, during training we propose a multi-path sampling strategy with rejection, and greedily filter out the weak paths. Training efficiency is thus boosted, since the training space is greedily shrunk from all paths to the potentially-good ones. Moreover, we adopt an exploration-and-exploitation policy by introducing an empirical candidate path pool. Our proposed method, GreedyNAS, is easy to follow, and experimental results on the ImageNet dataset indicate that it achieves better Top-1 accuracy under the same search space and FLOPs or latency level, but with only $\sim$60% of the supernet training cost. By searching a larger space, GreedyNAS can also obtain new state-of-the-art architectures.
Tasks Image Classification, Neural Architecture Search
Published 2020-03-25
URL https://arxiv.org/abs/2003.11236v1
PDF https://arxiv.org/pdf/2003.11236v1.pdf
PWC https://paperswithcode.com/paper/greedynas-towards-fast-one-shot-nas-with
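A minimal sketch of the multi-path sampling-with-rejection idea, assuming a toy search space and proxy metric (the sampling and scoring functions below are placeholders, not the paper's supernet or validation proxy):

```python
import random

def greedy_filter(sample_path, proxy_score, n_candidates=10, keep=0.5):
    """Multi-path sampling with rejection (sketch): sample candidate
    paths, rank them on a small validation proxy, and greedily keep
    only the best fraction for further supernet training."""
    paths = [sample_path() for _ in range(n_candidates)]
    ranked = sorted(paths, key=proxy_score, reverse=True)
    return ranked[:max(1, int(n_candidates * keep))]

random.seed(0)
ops = [0, 1, 2]                                  # choices per layer
sample_path = lambda: tuple(random.choice(ops) for _ in range(4))
proxy_score = lambda p: -sum(p)                  # toy proxy metric
survivors = greedy_filter(sample_path, proxy_score)
```

The training space shrinks to the surviving paths, which is where the claimed efficiency gain comes from; the paper additionally recycles good survivors through a candidate path pool.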

Phylogenetic signal in phonotactics

Title Phylogenetic signal in phonotactics
Authors Jayden L. Macklin-Cordes, Claire Bowern, Erich R. Round
Abstract Phylogenetic methods have broad potential in linguistics beyond tree inference. Here, we show how a phylogenetic approach opens the possibility of gaining historical insights from entirely new kinds of linguistic data, in this instance statistical phonotactics. We extract phonotactic data from 128 Pama-Nyungan vocabularies and apply tests for phylogenetic signal, quantifying the degree to which the data reflect phylogenetic history. We test three datasets: (1) binary variables recording the presence or absence of biphones (two-segment sequences) in a lexicon, (2) frequencies of transitions between segments, and (3) frequencies of transitions between natural sound classes. Australian languages have been characterised as having a high degree of phonotactic homogeneity. Nevertheless, we detect phylogenetic signal in all datasets. Phylogenetic signal is higher in finer-grained frequency data than in binary data, and highest in natural-class-based data. These results demonstrate the viability of employing a new source of readily extractable data in historical and comparative linguistics.
Published 2020-02-03
URL https://arxiv.org/abs/2002.00527v1
PDF https://arxiv.org/pdf/2002.00527v1.pdf
PWC https://paperswithcode.com/paper/phylogenetic-signal-in-phonotactics
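The first two kinds of phonotactic data are easy to derive from a word list. A minimal sketch, treating characters as segments for simplicity (the paper works with proper phonological segments):

```python
from collections import Counter
from itertools import chain

def biphones(word):
    """Two-segment sequences in a word (segments = characters here)."""
    return [word[i:i + 2] for i in range(len(word) - 1)]

def biphone_presence(lexicon):
    """Binary presence/absence variables, as in dataset (1)."""
    return set(chain.from_iterable(biphones(w) for w in lexicon))

def transition_freqs(lexicon):
    """Relative frequencies of segment transitions, as in dataset (2)."""
    counts = Counter(chain.from_iterable(biphones(w) for w in lexicon))
    total = sum(counts.values())
    return {bp: c / total for bp, c in counts.items()}

lex = ["kata", "taka"]
```

Dataset (3) would additionally map each segment to its natural sound class before counting transitions.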

Revisiting Convolutional Neural Networks for Urban Flow Analytics

Title Revisiting Convolutional Neural Networks for Urban Flow Analytics
Authors Yuxuan Liang, Kun Ouyang, Junbo Zhang, Yu Zheng, David S. Rosenblum
Abstract Convolutional Neural Networks (CNNs) have been widely adopted in raster-based urban flow analytics by virtue of their capability in capturing nearby spatial context. By revisiting CNN-based methods for different analytics tasks, we expose two common critical drawbacks in the existing uses: 1) inefficiency in learning global context, and 2) overlooking latent region functions. To tackle these challenges, in this paper we present a novel framework entitled DeepLGR that can be easily generalized to address various urban flow analytics problems. This framework consists of three major parts: 1) a local context module to learn local representations of each region; 2) a global context module to extract global contextual priors and upsample them to generate the global features; and 3) a region-specific predictor based on tensor decomposition to provide customized predictions for each region, which is very parameter-efficient compared to previous methods. Extensive experiments on two typical urban analytics tasks demonstrate the effectiveness, stability, and generality of our framework.
Published 2020-02-28
URL https://arxiv.org/abs/2003.00895v1
PDF https://arxiv.org/pdf/2003.00895v1.pdf
PWC https://paperswithcode.com/paper/revisiting-convolutional-neural-networks-for

Compiling Neural Networks for a Computational Memory Accelerator

Title Compiling Neural Networks for a Computational Memory Accelerator
Authors Kornilios Kourtis, Martino Dazzi, Nikolas Ioannou, Tobias Grosser, Abu Sebastian, Evangelos Eleftheriou
Abstract Computational memory (CM) is a promising approach for accelerating inference on neural networks (NN) by using enhanced memories that, in addition to storing data, allow computations on them. One of the main challenges of this approach is defining a hardware/software interface that allows a compiler to map NN models for efficient execution on the underlying CM accelerator. This is a non-trivial task because efficiency dictates that the CM accelerator is explicitly programmed as a dataflow engine where the execution of the different NN layers forms a pipeline. In this paper, we present our work towards a software stack for executing ML models on such a multi-core CM accelerator. We describe an architecture for the hardware and software, and focus on the problem of implementing the appropriate control logic so that data dependencies are respected. We propose a solution to the latter that is based on polyhedral compilation.
Published 2020-03-05
URL https://arxiv.org/abs/2003.04293v1
PDF https://arxiv.org/pdf/2003.04293v1.pdf
PWC https://paperswithcode.com/paper/compiling-neural-networks-for-a-computational

DFKI Cabin Simulator: A Test Platform for Visual In-Cabin Monitoring Functions

Title DFKI Cabin Simulator: A Test Platform for Visual In-Cabin Monitoring Functions
Authors Hartmut Feld, Bruno Mirbach, Jigyasa Katrolia, Mohamed Selim, Oliver Wasenmüller, Didier Stricker
Abstract We present a test platform for visual in-cabin scene analysis and occupant monitoring functions. The test platform is based on a driving simulator developed at the DFKI, consisting of a realistic in-cabin mock-up and a wide-angle projection system for a realistic driving experience. The platform has been equipped with a wide-angle 2D/3D camera system monitoring the entire interior of the vehicle mock-up of the simulator. It is also supplemented with a ground truth reference sensor system that allows tracking and recording the occupant’s body movements synchronously with the 2D and 3D video streams of the camera. Thus, the resulting test platform will serve as a basis to validate numerous in-cabin monitoring functions, which are important for the realization of novel human-vehicle interfaces, advanced driver assistance systems, and automated driving. Among the considered functions are occupant presence detection, size and 3D pose estimation, and driver intention recognition. In addition, our platform will be the basis for the creation of large-scale in-cabin benchmark datasets.
Tasks 3D Pose Estimation, Intent Detection, Pose Estimation
Published 2020-01-28
URL https://arxiv.org/abs/2002.03749v2
PDF https://arxiv.org/pdf/2002.03749v2.pdf
PWC https://paperswithcode.com/paper/dfki-cabin-simulator-a-test-platform-for

Towards Unconstrained Palmprint Recognition on Consumer Devices: a Literature Review

Title Towards Unconstrained Palmprint Recognition on Consumer Devices: a Literature Review
Authors Adrian-S. Ungureanu, Saqib Salahuddin, Peter Corcoran
Abstract As a biometric, palmprints have been largely under-utilized, but they offer some advantages over fingerprints and facial biometrics. Recent improvements in the imaging capabilities of handheld and wearable consumer devices have re-awakened interest in the use of palmprints. The aim of this paper is to provide a comprehensive review of state-of-the-art methods for palmprint recognition, including region-of-interest extraction methods, feature extraction approaches, and matching algorithms, along with an overview of available palmprint datasets, in order to understand the latest trends and research dynamics in the palmprint recognition field.
Published 2020-03-02
URL https://arxiv.org/abs/2003.00737v1
PDF https://arxiv.org/pdf/2003.00737v1.pdf
PWC https://paperswithcode.com/paper/towards-unconstrained-palmprint-recognition

Detection and Description of Change in Visual Streams

Title Detection and Description of Change in Visual Streams
Authors Davis Gilton, Ruotian Luo, Rebecca Willett, Greg Shakhnarovich
Abstract This paper presents a framework for the analysis of changes in visual streams: ordered sequences of images, possibly separated by significant time gaps. We propose a new approach to incorporating unlabeled data into training to generate natural language descriptions of change. We also develop a framework for estimating the time of change in a visual stream. We use learned representations for change evidence and consistency of perceived change, and combine these in a regularized graph-cut-based change detector. Experimental evaluation on visual stream datasets, which we release as part of our contribution, shows that representation learning driven by natural language descriptions significantly improves change detection accuracy, compared to methods that do not rely on language.
Tasks Representation Learning
Published 2020-03-27
URL https://arxiv.org/abs/2003.12633v1
PDF https://arxiv.org/pdf/2003.12633v1.pdf
PWC https://paperswithcode.com/paper/detection-and-description-of-change-in-visual

Detecting Fake News with Capsule Neural Networks

Title Detecting Fake News with Capsule Neural Networks
Authors Mohammad Hadi Goldani, Saeedeh Momtazi, Reza Safabakhsh
Abstract Fake news has increased dramatically on social media in recent years. This has prompted the need for effective fake news detection algorithms. Capsule neural networks have been successful in computer vision and are receiving attention for use in Natural Language Processing (NLP). This paper applies capsule neural networks to the fake news detection task. We use different embedding models for news items of different lengths: static word embeddings for short news items, and non-static word embeddings, which allow incremental up-training and updating during the training phase, for medium-length or long news statements. Moreover, we apply different levels of n-grams for feature extraction. Our proposed architectures are evaluated on two recent well-known datasets in the field, namely ISOT and LIAR. The results show encouraging performance, outperforming the state-of-the-art methods by 7.8% on ISOT, and by 3.1% on the validation set and 1% on the test set of the LIAR dataset.
Tasks Fake News Detection, Word Embeddings
Published 2020-02-03
URL https://arxiv.org/abs/2002.01030v1
PDF https://arxiv.org/pdf/2002.01030v1.pdf
PWC https://paperswithcode.com/paper/detecting-fake-news-with-capsule-neural

Scalable and Customizable Benchmark Problems for Many-Objective Optimization

Title Scalable and Customizable Benchmark Problems for Many-Objective Optimization
Authors Ivan Reinaldo Meneghini, Marcos Antonio Alves, António Gaspar-Cunha, Frederico Gadelha Guimarães
Abstract Solving many-objective problems (MaOPs) is still a significant challenge in the multi-objective optimization (MOO) field. One way to measure algorithm performance is through the use of benchmark functions (also called test functions or test suites), which are artificial problems with a well-defined mathematical formulation, known solutions and a variety of features and difficulties. In this paper we propose a parameterized generator of scalable and customizable benchmark problems for MaOPs. It is able to generate problems that reproduce features present in other benchmarks and also problems with some new features. We propose here the concept of generative benchmarking, in which one can generate an infinite number of MOO problems, by varying parameters that control specific features that the problem should have: scalability in the number of variables and objectives, bias, deceptiveness, multimodality, robust and non-robust solutions, shape of the Pareto front, and constraints. The proposed Generalized Position-Distance (GPD) tunable benchmark generator uses the position-distance paradigm, a basic approach to building test functions, used in other benchmarks such as Deb, Thiele, Laumanns and Zitzler (DTLZ), Walking Fish Group (WFG) and others. It includes scalable problems in any number of variables and objectives and it presents Pareto fronts with different characteristics. The resulting functions are easy to understand and visualize, easy to implement, fast to compute and their Pareto optimal solutions are known.
Published 2020-01-26
URL https://arxiv.org/abs/2001.11591v2
PDF https://arxiv.org/pdf/2001.11591v2.pdf
PWC https://paperswithcode.com/paper/scalable-and-customizable-benchmark-problems
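The position-distance paradigm that the GPD generator builds on is easy to illustrate with the classic DTLZ2 function: the first $n_{obj}-1$ variables set the position on the Pareto front and the remaining ones set the distance to it. This is a sketch of the paradigm, not of the GPD generator itself:

```python
import math

def dtlz2(x, n_obj=3):
    """DTLZ2-style position-distance test function: position
    variables x[0..n_obj-2] place the point on a unit sphere,
    the distance part g pushes it away from the Pareto front."""
    g = sum((xi - 0.5) ** 2 for xi in x[n_obj - 1:])  # distance part
    f = []
    for m in range(n_obj):
        val = 1.0 + g
        for xi in x[:n_obj - 1 - m]:                  # position part
            val *= math.cos(xi * math.pi / 2)
        if m > 0:
            val *= math.sin(x[n_obj - 1 - m] * math.pi / 2)
        f.append(val)
    return f
```

At a Pareto-optimal point (all distance variables equal to 0.5, so $g=0$) the objectives lie exactly on the unit sphere, which makes solution quality trivially checkable, one of the properties the abstract highlights for such benchmarks.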

Machine Learning-aided Design of Thinned Antenna Arrays for Optimized Network Level Performance

Title Machine Learning-aided Design of Thinned Antenna Arrays for Optimized Network Level Performance
Authors Mattia Lecci, Paolo Testolina, Mattia Rebato, Alberto Testolin, Michele Zorzi
Abstract With the advent of millimeter wave (mmWave) communications, the combination of a detailed 5G network simulator with an accurate antenna radiation model is required to analyze the realistic performance of complex cellular scenarios. However, due to the complexity of both electromagnetic and network models, the design and optimization of antenna arrays is generally infeasible due to the required computational resources and simulation time. In this paper, we propose a Machine Learning framework that enables a simulation-based optimization of the antenna design. We show how learning methods are able to emulate a complex simulator with a modest dataset obtained from it, enabling a global numerical optimization over a vast multi-dimensional parameter space in a reasonable amount of time. Overall, our results show that the proposed methodology can be successfully applied to the optimization of thinned antenna arrays.
Published 2020-01-25
URL https://arxiv.org/abs/2001.09335v1
PDF https://arxiv.org/pdf/2001.09335v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-aided-design-of-thinned
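The core idea, fitting a cheap learned surrogate to a handful of expensive simulator runs and then optimizing over the surrogate, can be sketched in a few lines. The quadratic toy objective and polynomial fit below are illustrative assumptions, not the paper's simulator or learning model:

```python
import numpy as np

def expensive_simulation(x):
    """Stand-in for the costly 5G network simulator (assumed)."""
    return -(x - 0.3) ** 2

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, 15)          # a modest set of simulator runs
y_train = expensive_simulation(x_train)

# Fit a cheap surrogate to the few expensive evaluations...
surrogate = np.poly1d(np.polyfit(x_train, y_train, deg=2))

# ...then optimize over a dense grid at negligible cost.
grid = np.linspace(0, 1, 1001)
best_x = grid[np.argmax(surrogate(grid))]
```

In the paper the surrogate emulates a full antenna-plus-network simulator over a multi-dimensional design space, but the workflow (sample, fit, optimize the emulator) is the same.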

An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions

Title An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions
Authors Samuele Tosatto, Riad Akrour, Jan Peters
Abstract The Nadaraya-Watson kernel estimator is among the most popular nonparametric regression techniques thanks to its simplicity. Its asymptotic bias was studied by Rosenblatt in 1969 and has been reported in a number of related works. However, Rosenblatt’s analysis is only valid for infinitesimal bandwidth. In contrast, we propose in this paper an upper bound on the bias which holds for finite bandwidths. Moreover, contrary to the classic analysis, we allow for a discontinuous first-order derivative of the regression function, we extend our bounds to multidimensional domains, and we incorporate knowledge of a bound on the regression function, when it exists and is known, to obtain a tighter bound. We believe that this work has potential applications in fields where hard guarantees on the error are needed.
Published 2020-01-29
URL https://arxiv.org/abs/2001.10972v2
PDF https://arxiv.org/pdf/2001.10972v2.pdf
PWC https://paperswithcode.com/paper/an-upper-bound-of-the-bias-of-nadaraya-watson
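For reference, the estimator itself is just a kernel-weighted average of the training targets; the Gaussian kernel and toy data below are illustrative. Note how, at a finite bandwidth, smoothing a convex function biases the estimate upward, exactly the kind of finite-bandwidth bias the paper bounds:

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel:
    a kernel-weighted average of the training targets."""
    w = np.exp(-0.5 * ((x_query - x_train) / bandwidth) ** 2)
    return float(np.sum(w * y_train) / np.sum(w))

x = np.linspace(0, 1, 11)
y = x ** 2                                   # convex regression function
est = nadaraya_watson(x, y, 0.5, bandwidth=0.1)
# True value is 0.25; the estimate is slightly above it.
```

Shrinking the bandwidth shrinks this bias toward zero, which is the infinitesimal-bandwidth regime of Rosenblatt's classic analysis.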

Butterfly detection and classification based on integrated YOLO algorithm

Title Butterfly detection and classification based on integrated YOLO algorithm
Authors Bohan Liang, Shangxi Wu, Kaiyuan Xu, Jingyu Hao
Abstract Insects are among the most abundant species on Earth, and their identification and classification is a complex and arduous task. How to apply artificial intelligence and digital image processing to the automatic identification of insect species is a hot issue in current research. In this paper, we study the automatic detection and classification of butterfly photographs, and propose an annotation method suitable for butterfly classification. Building on the YOLO algorithm, we propose an automatic butterfly detection and classification algorithm that synthesizes the results of YOLO models trained with different mechanisms. This greatly improves the generalization ability of the YOLO algorithm and gives it a better ability to handle small-sample problems. The experimental results show that the proposed annotation method and integrated YOLO algorithm achieve high accuracy and recognition rates in automatic butterfly detection and recognition.
Published 2020-01-02
URL https://arxiv.org/abs/2001.00361v1
PDF https://arxiv.org/pdf/2001.00361v1.pdf
PWC https://paperswithcode.com/paper/butterfly-detection-and-classification-based
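A toy sketch of one way to synthesize detections from several models: keep a box only when enough models agree on it. This cross-model voting is a stand-in for, not a reconstruction of, the paper's integration scheme:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def vote_merge(model_outputs, iou_thr=0.5, min_votes=2):
    """Keep a detection only when at least min_votes models produce
    an overlapping box, de-duplicating boxes that agree."""
    kept = []
    for boxes in model_outputs:
        for b in boxes:
            votes = sum(any(iou(b, o) >= iou_thr for o in other)
                        for other in model_outputs)
            duplicate = any(iou(b, k) >= iou_thr for k in kept)
            if votes >= min_votes and not duplicate:
                kept.append(b)
    return kept

m1 = [(0, 0, 10, 10), (50, 50, 60, 60)]      # detections from model 1
m2 = [(1, 1, 11, 11)]                        # detections from model 2
```

Requiring agreement across differently trained models suppresses spurious single-model detections, which is one plausible source of the generalization gain the abstract reports.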