April 2, 2020

3276 words 16 mins read

Paper Group ANR 386

End-to-end Learning of Object Motion Estimation from Retinal Events for Event-based Object Tracking

Title End-to-end Learning of Object Motion Estimation from Retinal Events for Event-based Object Tracking
Authors Haosheng Chen, David Suter, Qiangqiang Wu, Hanzi Wang
Abstract Event cameras, which are asynchronous bio-inspired vision sensors, have shown great potential in computer vision and artificial intelligence. However, the application of event cameras to object-level motion estimation or tracking is still in its infancy. The main idea behind this work is to propose a novel deep neural network to learn and regress a parametric object-level motion/transform model for event-based object tracking. To achieve this goal, we propose a synchronous Time-Surface with Linear Time Decay (TSLTD) representation, which effectively encodes the spatio-temporal information of asynchronous retinal events into TSLTD frames with clear motion patterns. We feed the sequence of TSLTD frames to a novel Retinal Motion Regression Network (RMRNet) to perform end-to-end 5-DoF object motion regression. Our method is compared with state-of-the-art object tracking methods based on conventional cameras or event cameras. The experimental results show the superiority of our method in handling various challenging environments such as fast motion and low-illumination conditions.
Tasks Motion Estimation, Object Tracking
Published 2020-02-14
URL https://arxiv.org/abs/2002.05911v1
PDF https://arxiv.org/pdf/2002.05911v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-of-object-motion
Repo
Framework
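
The TSLTD idea above lends itself to a compact sketch: each retinal event stamps its pixel with a value that grows linearly with recency inside the accumulation window, so newer motion dominates the frame. The following is a minimal Python illustration under assumed conventions (two polarity channels, fixed-length windows, and the function name are my choices, not the paper's exact specification).

```python
import numpy as np

def tsltd_frame(events, height, width, t_start, t_end):
    """Build one Time-Surface with Linear Time Decay (TSLTD) frame.

    events: iterable of (x, y, t, polarity) tuples with t_start <= t <= t_end
            and polarity in {-1, +1}.
    Returns a (2, height, width) array, one channel per polarity. Pixel values
    rise linearly from 0 (event at t_start) to 1 (event at t_end), i.e. a
    linear rather than exponential time decay.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    span = max(t_end - t_start, 1e-9)
    for x, y, t, p in events:
        channel = 0 if p > 0 else 1
        value = (t - t_start) / span            # more recent events score higher
        frame[channel, y, x] = max(frame[channel, y, x], value)
    return frame

# Usage: slice the event stream into fixed-length windows, build one frame per
# window, and feed the stacked sequence to the motion regression network.
```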

Detecting Pancreatic Adenocarcinoma in Multi-phase CT Scans via Alignment Ensemble

Title Detecting Pancreatic Adenocarcinoma in Multi-phase CT Scans via Alignment Ensemble
Authors Yingda Xia, Qihang Yu, Wei Shen, Yuyin Zhou, Alan L. Yuille
Abstract Pancreatic ductal adenocarcinoma (PDAC) is one of the most lethal cancers in the population. Screening for PDACs in dynamic contrast-enhanced CT is beneficial for early diagnosis. In this paper, we investigate the problem of automatically detecting PDACs in multi-phase (arterial and venous) CT scans. Multiple phases provide more information than a single phase, but they are unaligned and inhomogeneous in texture, making it difficult to combine cross-phase information seamlessly. We study multiple phase alignment strategies, i.e., early alignment (image registration), late alignment (high-level feature registration) and slow alignment (multi-level feature registration), and suggest an ensemble of all these alignments as a promising way to boost the performance of PDAC detection. We provide an extensive empirical evaluation on two PDAC datasets and show that the proposed alignment ensemble significantly outperforms previous state-of-the-art approaches, illustrating strong potential for clinical use.
Tasks Image Registration
Published 2020-03-18
URL https://arxiv.org/abs/2003.08441v1
PDF https://arxiv.org/pdf/2003.08441v1.pdf
PWC https://paperswithcode.com/paper/detecting-pancreatic-adenocarcinoma-in-multi
Repo
Framework
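
As a small illustration of the ensembling step described above, the sketch below simply averages voxel-wise probability maps produced by the early-, late-, and slow-alignment branches; the branch models themselves and the fusion rule (plain averaging) are assumptions on my part, not the paper's exact procedure.

```python
import numpy as np

def ensemble_predictions(prob_maps):
    """Average voxel-wise PDAC probabilities from several alignment branches.

    prob_maps: list of arrays with identical shape (D, H, W), e.g. the outputs
    of early-, late-, and slow-alignment models resampled to the same grid.
    """
    return np.stack(prob_maps, axis=0).mean(axis=0)

# e.g. tumor_mask = ensemble_predictions([p_early, p_late, p_slow]) > 0.5
```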

Landmark2Vec: An Unsupervised Neural Network-Based Landmark Positioning Method

Title Landmark2Vec: An Unsupervised Neural Network-Based Landmark Positioning Method
Authors Alireza Razavi
Abstract A neural network-based method for unsupervised landmark map estimation from measurements taken from landmarks is introduced. The measurements needed for training the network are the signals observed/received from landmarks by an agent. The definitions of landmarks, agent, and the measurements taken by the agent from landmarks are rather broad here: landmarks can be visual objects, e.g., poles along a road, with measurements being the size of a landmark in a visual sensor mounted on a vehicle (agent), or they can be radio transmitters, e.g., WiFi access points inside a building, with measurements being the Received Signal Strength (RSS) heard from them by a mobile device carried by a person (agent). The goal of the map estimation is then to find the positions of landmarks up to a scale, rotation, and shift (i.e., the topological map of the landmarks). Assuming that there are $L$ landmarks, the measurements will be $L \times 1$ vectors collected over the area. A shallow network is then trained to learn the map without any ground truth information.
Tasks
Published 2020-01-28
URL https://arxiv.org/abs/2001.10568v1
PDF https://arxiv.org/pdf/2001.10568v1.pdf
PWC https://paperswithcode.com/paper/landmark2vec-an-unsupervised-neural-network
Repo
Framework
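
One plausible way to realize the idea above is to let a shallow encoder map each $L \times 1$ measurement vector to a 2-D agent position while the landmark coordinates are free parameters, training both so that a simple distance-based signal model reproduces the measurements. The PyTorch sketch below is such a hypothetical instantiation, not the author's architecture; the class, layer sizes, and signal model are assumptions.

```python
import torch
import torch.nn as nn

class LandmarkMap(nn.Module):
    """Shallow encoder + learnable landmark coordinates (hypothetical sketch)."""

    def __init__(self, num_landmarks):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(num_landmarks, 32), nn.ReLU(), nn.Linear(32, 2))
        self.landmarks = nn.Parameter(torch.randn(num_landmarks, 2))

    def forward(self, measurements):                   # (B, L)
        agent_xy = self.encoder(measurements)          # inferred agent positions
        dists = torch.cdist(agent_xy, self.landmarks)  # (B, L)
        return -dists                                  # crude proxy for signal strength

# Training idea: minimise the MSE between forward(measurements) and the
# (normalised) measurements; the learned self.landmarks is then a topological
# map, recoverable only up to scale, rotation, and shift.
```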

Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized Neural Networks

Title Towards Explainable Bit Error Tolerance of Resistive RAM-Based Binarized Neural Networks
Authors Sebastian Buschjäger, Jian-Jia Chen, Kuan-Hsun Chen, Mario Günzel, Christian Hakert, Katharina Morik, Rodion Novkin, Lukas Pfahler, Mikail Yayla
Abstract Non-volatile memory, such as resistive RAM (RRAM), is an emerging energy-efficient storage technology, especially for low-power machine learning models on the edge. It is reported, however, that the bit error rate of RRAMs can be up to 3.3% in the ultra-low-power setting, which might be crucial for many use cases. Binary neural networks (BNNs), a resource-efficient variant of neural networks (NNs), can tolerate a certain percentage of errors without a loss in accuracy and demand lower resources in computation and storage. The bit error tolerance (BET) in BNNs can be achieved by flipping the weight signs during training, as proposed by Hirtzlin et al., but their method has a significant drawback, especially for fully connected neural networks (FCNNs): the FCNNs overfit to the error rate used in training, which leads to low accuracy under lower error rates. In addition, the underlying principles of BET are not investigated. In this work, we improve the training for BET of BNNs and aim to explain this property. We propose a straight-through gradient approximation to improve the weight-sign-flip training, by which BNNs adapt less to the bit error rates. To explain the achieved robustness, we define a metric that aims to measure BET without fault injection. We evaluate the metric and find that it correlates with accuracy over error rate for all FCNNs tested. Finally, we explore the influence of a novel regularizer that optimizes with respect to this metric, with the aim of providing a configurable trade-off in accuracy and BET.
Tasks
Published 2020-02-03
URL https://arxiv.org/abs/2002.00909v1
PDF https://arxiv.org/pdf/2002.00909v1.pdf
PWC https://paperswithcode.com/paper/towards-explainable-bit-error-tolerance-of
Repo
Framework
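
To make the training ingredients concrete, the sketch below combines the standard straight-through estimator for weight binarisation with random weight-sign flips that emulate RRAM bit errors. It is a generic illustration of these two components, not the authors' improved training procedure or their BET metric; the 3.3% flip rate simply echoes the error rate quoted above.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarisation with a straight-through gradient (identity inside [-1, 1])."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # straight-through estimator

def flip_bits(binary_w, error_rate):
    """Emulate RRAM bit errors by flipping each binary weight with prob. error_rate."""
    flips = torch.rand_like(binary_w) < error_rate
    return torch.where(flips, -binary_w, binary_w)

# During training (sketch):
#   w_bin = BinarizeSTE.apply(w_real)
#   w_faulty = flip_bits(w_bin, error_rate=0.033)
#   ... the forward pass uses w_faulty; gradients flow back to w_real.
```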

Leveraging Vision and Kinematics Data to Improve Realism of Biomechanic Soft-tissue Simulation for Robotic Surgery

Title Leveraging Vision and Kinematics Data to Improve Realism of Biomechanic Soft-tissue Simulation for Robotic Surgery
Authors Jie Ying Wu, Peter Kazanzides, Mathias Unberath
Abstract Purpose: Surgical simulations play an increasingly important role in surgeon education and in developing algorithms that enable robots to perform surgical subtasks. To model anatomy, Finite Element Method (FEM) simulations have been held as the gold standard for calculating accurate soft-tissue deformation. Unfortunately, their accuracy is highly dependent on the simulation parameters, which can be difficult to obtain. Methods: In this work, we investigate how live data acquired during any robotic endoscopic surgical procedure may be used to correct for inaccurate FEM simulation results. Since FEMs are calculated from initial parameters and cannot directly incorporate observations, we propose to add a correction factor that accounts for the discrepancy between simulation and observations. We train a network to predict this correction factor. Results: To evaluate our method, we use an open-source da Vinci Surgical System to probe a soft-tissue phantom and replay the interaction in simulation. We train the network to correct for the difference between the predicted mesh position and the measured point cloud. This results in a 15-30% improvement in the mean distance, demonstrating the effectiveness of our approach across a large range of simulation parameters. Conclusion: We show a first step towards a framework that synergistically combines the benefits of model-based simulation and real-time observations. It corrects discrepancies between simulation and the scene that result from inaccurate modeling parameters. This can provide a more accurate simulation environment for surgeons and better data with which to train algorithms.
Tasks
Published 2020-03-14
URL https://arxiv.org/abs/2003.06518v1
PDF https://arxiv.org/pdf/2003.06518v1.pdf
PWC https://paperswithcode.com/paper/leveraging-vision-and-kinematics-data-to
Repo
Framework
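
The correction-factor idea can be sketched as a small network that predicts a per-vertex offset added to the FEM-predicted mesh, trained to reduce the distance to the observed point cloud. The code below is a rough, assumed instantiation (network shape, inputs, and the nearest-point distance are my choices), not the paper's model, which also has access to robot kinematics.

```python
import torch
import torch.nn as nn

class CorrectionNet(nn.Module):
    """Predict a per-vertex offset that nudges the FEM mesh toward observations."""

    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3))

    def forward(self, fem_vertices):                 # (N, 3) simulated positions
        return fem_vertices + self.mlp(fem_vertices)

def mean_distance(corrected, observed):
    """Mean distance from each corrected vertex to its nearest observed point --
    the kind of metric the 15-30% improvement above refers to."""
    return torch.cdist(corrected, observed).min(dim=1).values.mean()
```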

Rembrandts and Robots: Using Neural Networks to Explore Authorship in Painting

Title Rembrandts and Robots: Using Neural Networks to Explore Authorship in Painting
Authors Steven J. Frank, Andrea M. Frank
Abstract We use convolutional neural networks to analyze authorship questions surrounding works of representational art. Trained on the works of an artist under study and visually comparable works of other artists, our system can identify forgeries and provide attributions. Our system can also assign classification probabilities within a painting, revealing mixed authorship and identifying regions painted by different hands.
Tasks
Published 2020-02-12
URL https://arxiv.org/abs/2002.05107v1
PDF https://arxiv.org/pdf/2002.05107v1.pdf
PWC https://paperswithcode.com/paper/rembrandts-and-robots-using-neural-networks
Repo
Framework
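
The patch-level attribution described above can be pictured as a sliding-window heatmap: a trained classifier scores each crop of the painting, and spatial variation in those scores hints at mixed authorship. The helper below is only an illustrative scaffold; the patch size, stride, and classifier interface are assumptions, not the authors' setup.

```python
import numpy as np

def authorship_heatmap(image, classifier, patch=256, stride=128):
    """Slide a window over a painting and record, per patch, the probability
    that it was painted by the artist under study.

    `classifier` is any callable mapping a (patch, patch, 3) crop to a float
    probability in [0, 1].
    """
    h, w = image.shape[:2]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    heat = np.zeros((rows, cols), dtype=np.float32)
    for i in range(rows):
        for j in range(cols):
            crop = image[i * stride:i * stride + patch,
                         j * stride:j * stride + patch]
            heat[i, j] = classifier(crop)
    return heat   # regions with divergent scores suggest different hands
```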

Artificial Intelligence for Social Good: A Survey

Title Artificial Intelligence for Social Good: A Survey
Authors Zheyuan Ryan Shi, Claire Wang, Fei Fang
Abstract Artificial intelligence for social good (AI4SG) is a research theme that aims to use and advance artificial intelligence to address societal issues and improve the well-being of the world. AI4SG has received considerable attention from the research community over the past decade, with several successful applications. Building on the most comprehensive collection of the AI4SG literature to date, with over 1000 contributed papers, we provide a detailed account and analysis of the work under this theme in the following ways. (1) We quantitatively analyze the distribution and trend of the AI4SG literature in terms of application domains and AI techniques used. (2) We propose three conceptual methods to systematically group the existing literature and analyze the eight AI4SG application domains in a unified framework. (3) We distill five research topics that represent the common challenges in AI4SG across various application domains. (4) We discuss five issues that, we hope, can shed light on the future development of AI4SG research.
Tasks
Published 2020-01-07
URL https://arxiv.org/abs/2001.01818v1
PDF https://arxiv.org/pdf/2001.01818v1.pdf
PWC https://paperswithcode.com/paper/artificial-intelligence-for-social-good-a
Repo
Framework

PyCARL: A PyNN Interface for Hardware-Software Co-Simulation of Spiking Neural Network

Title PyCARL: A PyNN Interface for Hardware-Software Co-Simulation of Spiking Neural Network
Authors Adarsha Balaji, Prathyusha Adiraju, Hirak J. Kashyap, Anup Das, Jeffrey L. Krichmar, Nikil D. Dutt, Francky Catthoor
Abstract We present PyCARL, a PyNN-based common Python programming interface for hardware-software co-simulation of spiking neural networks (SNNs). Through PyCARL, we make the following two key contributions. First, we provide an interface from PyNN to CARLsim, a computationally efficient, GPU-accelerated and biophysically detailed SNN simulator. PyCARL facilitates joint development of machine learning models and code sharing between CARLsim and PyNN users, promoting an integrated and larger neuromorphic community. Second, we integrate cycle-accurate models of state-of-the-art neuromorphic hardware such as TrueNorth, Loihi, and DynapSE in PyCARL, to accurately model hardware latencies that delay spikes between communicating neurons and degrade performance. PyCARL allows users to analyze and optimize the performance difference between software-only simulation and hardware-software co-simulation of their machine learning models. We show that system designers can also use PyCARL to perform design-space exploration early in the product development stage, facilitating faster time-to-deployment of neuromorphic products. We evaluate the memory usage and simulation time of PyCARL using functionality tests, synthetic SNNs, and realistic applications. Our results demonstrate that for large SNNs, PyCARL does not lead to any significant overhead compared to CARLsim. We also use PyCARL to analyze these SNNs on state-of-the-art neuromorphic hardware and demonstrate a significant performance deviation from software-only simulations. PyCARL allows users to evaluate and minimize such differences early during model development.
Tasks
Published 2020-03-21
URL https://arxiv.org/abs/2003.09696v1
PDF https://arxiv.org/pdf/2003.09696v1.pdf
PWC https://paperswithcode.com/paper/pycarl-a-pynn-interface-for-hardware-software
Repo
Framework
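
Because PyCARL exposes CARLsim through the standard PyNN API, a user script should look like an ordinary PyNN program. The sketch below uses only standard PyNN 0.9 calls; the backend module name in the import is an assumption (check the PyCARL documentation for the actual name), and the network itself is a toy example.

```python
import pyNN.carlsim as sim   # backend module name assumed; see PyCARL docs

sim.setup(timestep=0.1)                                   # ms
stimulus = sim.Population(100, sim.SpikeSourcePoisson(rate=20.0))
neurons = sim.Population(50, sim.IF_curr_exp())
sim.Projection(stimulus, neurons, sim.AllToAllConnector(),
               synapse_type=sim.StaticSynapse(weight=0.05, delay=1.0))
neurons.record('spikes')

sim.run(1000.0)                                           # simulate 1 s
spikes = neurons.get_data().segments[0].spiketrains       # Neo data structures
sim.end()
```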

How well do U-Net-based segmentation trained on adult cardiac magnetic resonance imaging data generalise to rare congenital heart diseases for surgical planning?

Title How well do U-Net-based segmentation trained on adult cardiac magnetic resonance imaging data generalise to rare congenital heart diseases for surgical planning?
Authors Sven Koehler, Animesh Tandon, Tarique Hussain, Heiner Latus, Thomas Pickardt, Samir Sarikouch, Philipp Beerbaum, Gerald Greil, Sandy Engelhardt, Ivo Wolf
Abstract Planning the optimal time of intervention for pulmonary valve replacement surgery in patients with the congenital heart disease Tetralogy of Fallot (TOF) is mainly based on ventricular volume and function according to current guidelines. Both of these biomarkers are most reliably assessed by segmentation of 3D cardiac magnetic resonance (CMR) images. In several grand challenges in recent years, U-Net architectures have shown impressive results on the provided data. However, in clinical practice, data sets are more diverse considering individual pathologies and image properties derived from different scanner properties. Additionally, specific training data for complex rare diseases like TOF is scarce. For this work, we assessed 1) the accuracy gap when using a publicly available labelled data set (the Automatic Cardiac Diagnosis Challenge (ACDC) data set) for training and subsequently applying the model to CMR data of TOF patients, and vice versa, and 2) whether we can achieve similar results when applying the model to a more heterogeneous database. Multiple deep learning models were trained with four-fold cross validation. Afterwards, they were evaluated on the respective unseen CMR images from the other collection. Our results confirm that current deep learning models can achieve excellent results (left ventricle dice of $0.951\pm{0.003}$/$0.941\pm{0.007}$ train/validation) within a single data collection. But once they are applied to other pathologies, it becomes apparent how much they overfit to the training pathologies (the dice score drops by $0.072\pm{0.001}$ for the left and $0.165\pm{0.001}$ for the right ventricle).
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.04392v1
PDF https://arxiv.org/pdf/2002.04392v1.pdf
PWC https://paperswithcode.com/paper/how-well-do-u-net-based-segmentation-trained
Repo
Framework
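
For reference, the scores reported above are Dice coefficients between predicted and ground-truth masks; a minimal implementation of that metric is sketched below (binary masks assumed).

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks, e.g. left-ventricle
    segmentations -- the metric behind the 0.951 train score and the ~0.07
    drop reported above when transferring across pathologies."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```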

Bayesian Quantile and Expectile Optimisation

Title Bayesian Quantile and Expectile Optimisation
Authors Léonard Torossian, Victor Picheny, Nicolas Durrande
Abstract Bayesian optimisation is widely used to optimise stochastic black box functions. While most strategies are focused on optimising conditional expectations, a large variety of applications require risk-averse decisions and alternative criteria accounting for the distribution tails need to be considered. In this paper, we propose new variational models for Bayesian quantile and expectile regression that are well-suited for heteroscedastic settings. Our models consist of two latent Gaussian processes accounting respectively for the conditional quantile (or expectile) and variance that are chained through asymmetric likelihood functions. Furthermore, we propose two Bayesian optimisation strategies, either derived from a GP-UCB or Thompson sampling, that are tailored to such models and that can accommodate large batches of points. As illustrated in the experimental section, the proposed approach clearly outperforms the state of the art.
Tasks Bayesian Optimisation, Gaussian Processes
Published 2020-01-12
URL https://arxiv.org/abs/2001.04833v1
PDF https://arxiv.org/pdf/2001.04833v1.pdf
PWC https://paperswithcode.com/paper/bayesian-quantile-and-expectile-optimisation
Repo
Framework
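
The asymmetric likelihoods chained in the models above correspond, at the loss level, to the classical pinball (quantile) and asymmetric least-squares (expectile) criteria sketched below; the variational GP models and the UCB/Thompson acquisition strategies in the paper go well beyond this, so treat these as background formulas only.

```python
import numpy as np

def pinball_loss(y, f, tau):
    """Asymmetric absolute loss whose minimiser is the tau-quantile of y given x."""
    r = y - f
    return np.mean(np.where(r >= 0, tau * r, (tau - 1.0) * r))

def expectile_loss(y, f, tau):
    """Asymmetric squared loss whose minimiser is the tau-expectile of y given x."""
    r = y - f
    return np.mean(np.where(r >= 0, tau, 1.0 - tau) * r ** 2)
```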

Fast Monte Carlo Dropout and Error Correction for Radio Transmitter Classification

Title Fast Monte Carlo Dropout and Error Correction for Radio Transmitter Classification
Authors Liangping Ma, John Kaewell
Abstract Monte Carlo dropout may effectively capture model uncertainty in deep learning, where a measure of uncertainty is obtained by using multiple instances of dropout at test time. However, Monte Carlo dropout is applied across the whole network and thus significantly increases the computational complexity, proportional to the number of instances. To reduce the computational complexity, at test time we enable dropout layers only near the output of the neural network and reuse the computation from prior layers while keeping other dropout layers, if any, disabled. Additionally, we leverage side information about the ideal distributions for various input samples to perform 'error correction' on the predictions. We apply these techniques to the radio frequency (RF) transmitter classification problem and show that the proposed algorithm is able to provide better prediction uncertainty than the simple ensemble average algorithm and can be used to effectively identify transmitters that are not in the training data set while correctly classifying transmitters it has been trained on.
Tasks
Published 2020-01-31
URL https://arxiv.org/abs/2001.11963v1
PDF https://arxiv.org/pdf/2001.11963v1.pdf
PWC https://paperswithcode.com/paper/fast-monte-carlo-dropout-and-error-correction
Repo
Framework
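
The core computational trick above -- run the dropout-free trunk once and resample only the dropout layers near the output -- can be sketched in a few lines of PyTorch. Layer sizes, the dropout rate, and the sample count below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

trunk = nn.Sequential(nn.Linear(1024, 256), nn.ReLU())          # dropout-free layers
head = nn.Sequential(nn.Dropout(p=0.5), nn.Linear(256, 10))     # dropout near output

def fast_mc_dropout(x, n_samples=30):
    """Reuse the trunk activations and resample only the dropout head, so the
    cost of Monte Carlo dropout scales with the head rather than the whole net."""
    trunk.eval()
    head.train()                        # keep dropout active at test time
    with torch.no_grad():
        features = trunk(x)             # computed once
        preds = torch.stack([head(features).softmax(dim=-1)
                             for _ in range(n_samples)])
    return preds.mean(0), preds.std(0)  # predictive mean and uncertainty
```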

Overview of chemical ontologies

Title Overview of chemical ontologies
Authors Christian Pachl, Nils Frank, Jan Breitbart, Stefan Bräse
Abstract Ontologies order and interconnect knowledge of a certain field in a formal and semantic way so that they are machine-parsable. They try to establish universally acceptable definitions of concepts and objects, classify them, provide properties, and interconnect them with relations (e.g., “A is a special case of B”). More precisely, Tom Gruber defines ontologies as a “specification of a conceptualization; […] a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents.” [1] An ontology is made of Individuals, which are organized into Classes. Both can have Attributes and Relations among themselves. Some complex ontologies define Restrictions, Rules and Events which change attributes or relations. To be computer-accessible, they are written in certain ontology languages, like the OBO language or the more widely used Common Algebraic Specification Language. With the rise of a digitalized, interconnected and globalized world, where common standards have to be found, ontologies are of great interest. [2] Yet the development of chemical ontologies is still at an early stage. Some interesting basic approaches towards chemical ontologies can be found, but they suffer from two main flaws. Firstly, we found that they are mostly only fragmentarily complete or still in an architectural state. Secondly, no chemical ontology appears to be widely accepted. Therefore, we herein describe the major ontology developments in the chemistry-related fields of chemical analytical methods, name reactions, and scientific units.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.03842v1
PDF https://arxiv.org/pdf/2002.03842v1.pdf
PWC https://paperswithcode.com/paper/overview-of-chemical-ontologies
Repo
Framework

Deep Network Approximation for Smooth Functions

Title Deep Network Approximation for Smooth Functions
Authors Jianfeng Lu, Zuowei Shen, Haizhao Yang, Shijun Zhang
Abstract This paper establishes optimal approximation error characterization of deep ReLU networks for smooth functions in terms of both width and depth simultaneously. To that end, we first prove that multivariate polynomials can be approximated by deep ReLU networks of width $\mathcal{O}(N)$ and depth $\mathcal{O}(L)$ with an approximation error $\mathcal{O}(N^{-L})$. Through local Taylor expansions and their deep ReLU network approximations, we show that deep ReLU networks of width $\mathcal{O}(N\ln N)$ and depth $\mathcal{O}(L\ln L)$ can approximate $f\in C^s([0,1]^d)$ with a nearly optimal approximation rate $\mathcal{O}(\|f\|_{C^s([0,1]^d)}N^{-2s/d}L^{-2s/d})$. Our estimate is non-asymptotic in the sense that it is valid for arbitrary width and depth specified by $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, respectively.
Tasks
Published 2020-01-09
URL https://arxiv.org/abs/2001.03040v1
PDF https://arxiv.org/pdf/2001.03040v1.pdf
PWC https://paperswithcode.com/paper/deep-network-approximation-for-smooth
Repo
Framework

NODIS: Neural Ordinary Differential Scene Understanding

Title NODIS: Neural Ordinary Differential Scene Understanding
Authors Cong Yuren, Hanno Ackermann, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn
Abstract Semantic image understanding is a challenging topic in computer vision. It requires not only detecting all objects in an image but also identifying all the relations between them. Detected objects, their labels and the discovered relations can be used to construct a scene graph which provides an abstract semantic interpretation of an image. In previous works, relations were identified by solving an assignment problem formulated as a Mixed-Integer Linear Program. In this work, we interpret that formulation as an Ordinary Differential Equation (ODE). The proposed architecture performs scene graph inference by solving a neural variant of an ODE by end-to-end learning. It achieves state-of-the-art results on all three benchmark tasks: scene graph generation (SGGen), classification (SGCls) and visual relationship detection (PredCls) on the Visual Genome benchmark.
Tasks Graph Generation, Scene Graph Generation, Scene Understanding
Published 2020-01-14
URL https://arxiv.org/abs/2001.04735v1
PDF https://arxiv.org/pdf/2001.04735v1.pdf
PWC https://paperswithcode.com/paper/nodis-neural-ordinary-differential-scene
Repo
Framework
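
To make the "neural variant of an ODE" concrete, the sketch below shows a generic neural-ODE block built with the third-party torchdiffeq package: relation features are integrated from t=0 to t=1 and the final state is classified into predicates. This is a generic illustration of the mechanism, not the NODIS architecture; the feature dimension, predicate count, and MLP are assumptions.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint   # pip install torchdiffeq

class ODEFunc(nn.Module):
    """Parameterises dy/dt over relation features with a small MLP."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))

    def forward(self, t, y):
        return self.net(y)

dim, num_predicates = 256, 50
func = ODEFunc(dim)
classifier = nn.Linear(dim, num_predicates)

features = torch.randn(8, dim)            # e.g. pooled object-pair features
t = torch.tensor([0.0, 1.0])
evolved = odeint(func, features, t)[-1]   # state of the ODE at t = 1
predicate_logits = classifier(evolved)    # relation scores per object pair
```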

BARNet: Bilinear Attention Network with Adaptive Receptive Field for Surgical Instrument Segmentation

Title BARNet: Bilinear Attention Network with Adaptive Receptive Field for Surgical Instrument Segmentation
Authors Zhen-Liang Ni, Gui-Bin Bian, Guan-An Wang, Xiao-Hu Zhou, Zeng-Guang Hou, Xiao-Liang Xie, Zhen Li, Yu-Han Wang
Abstract Surgical instrument segmentation is extremely important for computer-assisted surgery. Compared with common object segmentation, it is more challenging due to the large illumination and scale variation caused by the special surgical scenes. In this paper, we propose a novel bilinear attention network with an adaptive receptive field to address these two challenges. For the illumination variation, the bilinear attention module can capture second-order statistics to encode global contexts and semantic dependencies between local pixels. With them, semantic features in challenging areas can be inferred from their neighbors, and the distinction between various semantics can be boosted. For the scale variation, our adaptive receptive field module aggregates multi-scale features and automatically fuses them with different weights. Specifically, it encodes the semantic relationship between channels to emphasize feature maps with appropriate scales, changing the receptive field of subsequent convolutions. The proposed network achieves the best performance, 97.47% mean IOU, on Cata7 and takes first place on EndoVis 2017, surpassing the second-ranking method by 10.10% IOU.
Tasks Semantic Segmentation
Published 2020-01-20
URL https://arxiv.org/abs/2001.07093v1
PDF https://arxiv.org/pdf/2001.07093v1.pdf
PWC https://paperswithcode.com/paper/barnet-bilinear-attention-network-with
Repo
Framework
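
The adaptive-receptive-field idea above -- extract features at several scales and fuse them with channel-derived weights -- can be sketched as follows. The dilated branches and the squeeze-and-excitation-style gate are assumptions for illustration, not the exact BARNet modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFusion(nn.Module):
    """Fuse multi-scale features with weights predicted from global channel statistics."""

    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations)
        self.gate = nn.Sequential(
            nn.Linear(channels, len(dilations)), nn.Softmax(dim=-1))

    def forward(self, x):                              # (B, C, H, W)
        feats = [branch(x) for branch in self.branches]
        weights = self.gate(x.mean(dim=(2, 3)))        # (B, num_scales)
        fused = sum(w.view(-1, 1, 1, 1) * f
                    for w, f in zip(weights.unbind(dim=1), feats))
        return F.relu(fused)
```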