April 3, 2020

3244 words 16 mins read

Paper Group ANR 18



FedNER: Privacy-preserving Medical Named Entity Recognition with Federated Learning

Title FedNER: Privacy-preserving Medical Named Entity Recognition with Federated Learning
Authors Suyu Ge, Fangzhao Wu, Chuhan Wu, Tao Qi, Yongfeng Huang, Xing Xie
Abstract Medical named entity recognition (NER) has wide applications in intelligent healthcare. Sufficient labeled data is critical for training an accurate medical NER model. However, the labeled data in a single medical platform is usually limited. Although labeled datasets may exist in many different medical platforms, they cannot be directly shared since medical data is highly privacy-sensitive. In this paper, we propose a privacy-preserving medical NER method based on federated learning, which can leverage the labeled data in different platforms to boost the training of the medical NER model and remove the need to exchange raw data among different platforms. Since the labeled data in different platforms usually has some differences in entity type and annotation criteria, instead of constraining different platforms to share the same model, we decompose the medical NER model in each platform into a shared module and a private module. The private module is used to capture the characteristics of the local data in each platform, and is updated using local labeled data. The shared module is learned across different medical platforms to capture the shared NER knowledge. Its local gradients from different platforms are aggregated to update the global shared module, which is further delivered to each platform to update their local shared modules. Experiments on three publicly available datasets validate the effectiveness of our method.
Tasks Medical Named Entity Recognition, Named Entity Recognition
Published 2020-03-20
URL https://arxiv.org/abs/2003.09288v2
PDF https://arxiv.org/pdf/2003.09288v2.pdf
PWC https://paperswithcode.com/paper/fedner-medical-named-entity-recognition-with
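The federated update described in the abstract (private modules stay local; only shared-module gradients are aggregated) can be sketched as follows. This is a minimal toy illustration with a linear layer standing in for the NER model, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: each platform holds a global "shared" weight vector plus its own
# private weights; only the gradient w.r.t. the shared weights leaves a platform.
n_platforms, dim = 3, 4
shared = rng.normal(size=dim)                        # global shared module
privates = [rng.normal(size=dim) for _ in range(n_platforms)]

def local_shared_gradient(shared, private, x, y):
    # Gradient of a squared loss w.r.t. the shared weights only.
    pred = x @ (shared + private)
    return 2 * (pred - y) * x

# One federated round: each platform computes a gradient on its own labeled
# data; the server averages the gradients and updates the global shared module,
# which is then redistributed. Raw data never leaves a platform.
lr = 0.1
grads = []
for private in privates:
    x, y = rng.normal(size=dim), 1.0    # stand-in for local labeled data
    grads.append(local_shared_gradient(shared, private, x, y))
shared -= lr * np.mean(grads, axis=0)
```

Each platform would interleave such rounds with purely local updates of its private module, matching the shared/private decomposition the paper motivates.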

Parallel sequence tagging for concept recognition

Title Parallel sequence tagging for concept recognition
Authors Lenz Furrer, Joseph Cornelius, Fabio Rinaldi
Abstract Motivation: Named Entity Recognition (NER) and Normalisation (NEN) are core components of any text-mining system for biomedical texts. In a traditional concept-recognition pipeline, these tasks are combined in a serial way, which is inherently prone to error propagation from NER to NEN. We propose a parallel architecture, where both NER and NEN are modeled as a sequence-labeling task, operating directly on the source text. We examine different harmonisation strategies for merging the predictions of the two classifiers into a single output sequence. Results: We test our approach on the recent Version 4 of the CRAFT corpus. In all 20 annotation sets of the concept-annotation task, our system outperforms the pipeline system reported as a baseline in the CRAFT shared task 2019. Our analysis shows that the strengths of the two classifiers can be combined in a fruitful way. However, prediction harmonisation requires individual calibration on a development set for each annotation set. This allows achieving a good trade-off between established knowledge (training set) and novel information (unseen concepts). Availability and Implementation: Source code freely available for download at https://github.com/OntoGene/craft-st. Supplementary data are available at arXiv online.
Tasks Calibration, Named Entity Recognition
Published 2020-03-16
URL https://arxiv.org/abs/2003.07424v1
PDF https://arxiv.org/pdf/2003.07424v1.pdf
PWC https://paperswithcode.com/paper/parallel-sequence-tagging-for-concept
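The harmonisation step, where the NER and NEN taggers' parallel outputs are merged into one sequence, can be sketched with a simple token-level rule. The rule below is illustrative only; the paper evaluates several calibrated strategies:

```python
# Token-level harmonisation of two parallel taggers (hypothetical rule, not
# the paper's exact strategies): trust the NEN tagger's concept ID when it
# fires, otherwise keep the NER span label.
def harmonise(ner_tags, nen_tags, null_id="NIL"):
    merged = []
    for ner, nen in zip(ner_tags, nen_tags):
        if nen != null_id:           # NEN predicted a concept: keep both
            merged.append((ner if ner != "O" else "B", nen))
        elif ner != "O":             # span detected without a concept ID
            merged.append((ner, null_id))
        else:
            merged.append(("O", null_id))
    return merged

tokens = ["the", "BRCA1", "gene", "mutation"]
ner = ["O", "B", "I", "O"]
nen = ["NIL", "GO:0001", "GO:0001", "NIL"]
print(harmonise(ner, nen))
```

As the abstract notes, the threshold for trusting each classifier would be calibrated per annotation set on a development split rather than fixed as here.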

MagicEyes: A Large Scale Eye Gaze Estimation Dataset for Mixed Reality

Title MagicEyes: A Large Scale Eye Gaze Estimation Dataset for Mixed Reality
Authors Zhengyang Wu, Srivignesh Rajendran, Tarrence van As, Joelle Zimmermann, Vijay Badrinarayanan, Andrew Rabinovich
Abstract With the emergence of Virtual and Mixed Reality (XR) devices, eye tracking has received significant attention in the computer vision community. Eye gaze estimation is a crucial component in XR – enabling energy efficient rendering, multi-focal displays, and effective interaction with content. In head-mounted XR devices, the eyes are imaged off-axis to avoid blocking the field of view. This leads to increased challenges in inferring eye related quantities and simultaneously provides an opportunity to develop accurate and robust learning based approaches. To this end, we present MagicEyes, the first large scale eye dataset collected using real MR devices with comprehensive ground truth labeling. MagicEyes includes $587$ subjects with $80,000$ images of human-labeled ground truth and over $800,000$ images with gaze target labels. We evaluate several state-of-the-art methods on MagicEyes and also propose a new multi-task EyeNet model designed for detecting the cornea, glints and pupil along with eye segmentation in a single forward pass.
Tasks Eye Tracking, Gaze Estimation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08806v1
PDF https://arxiv.org/pdf/2003.08806v1.pdf
PWC https://paperswithcode.com/paper/magiceyes-a-large-scale-eye-gaze-estimation

The empirical structure of word frequency distributions

Title The empirical structure of word frequency distributions
Authors Michael Ramscar
Abstract The frequencies at which individual words occur across languages follow power law distributions, a pattern of findings known as Zipf’s law. A vast literature argues over whether this serves to optimize the efficiency of human communication, however this claim is necessarily post hoc, and it has been suggested that Zipf’s law may in fact describe mixtures of other distributions. From this perspective, recent findings that Sinosphere first (family) names are geometrically distributed are notable, because this is actually consistent with information theoretic predictions regarding optimal coding. First names form natural communicative distributions in most languages, and I show that when analyzed in relation to the communities in which they are used, first name distributions across a diverse set of languages are both geometric and, historically, remarkably similar, with power law distributions only emerging when empirical distributions are aggregated. I then show this pattern of findings replicates in communicative distributions of English nouns and verbs. These results indicate that if lexical distributions support efficient communication, they do so because their functional structures directly satisfy the constraints described by information theory, and not because of Zipf’s law. Understanding the function of these information structures is likely to be key to explaining humankind’s remarkable communicative capacities.
Published 2020-01-09
URL https://arxiv.org/abs/2001.05292v1
PDF https://arxiv.org/pdf/2001.05292v1.pdf
PWC https://paperswithcode.com/paper/the-empirical-structure-of-word-frequency
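The paper's central aggregation point, that pooling several geometric distributions yields a heavier, more power-law-like tail than any single geometric, can be checked numerically. The mixture components below are arbitrary illustrative choices:

```python
# A geometric rank distribution assigns P(r) = p * (1-p)**(r-1). Mixing
# several geometrics (as happens when community-level name distributions are
# aggregated) produces a heavier tail than any single component.
def geom(p, r):
    return p * (1 - p) ** (r - 1)

ps = [0.5, 0.2, 0.05]
mixture = lambda r: sum(geom(p, r) for p in ps) / len(ps)

# Tail decay ratio P(2r)/P(r): for a pure geometric it shrinks exponentially
# in r, while the mixture's ratio decays far more slowly because the
# smallest-p component comes to dominate the tail.
for r in (5, 20, 80):
    pure = geom(0.2, 2 * r) / geom(0.2, r)
    mixed = mixture(2 * r) / mixture(r)
    print(r, pure, mixed)
```

This is the qualitative signature the paper describes: power-law-like behaviour emerging only once empirical geometric distributions are aggregated.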

Deep Attention Spatio-Temporal Point Processes

Title Deep Attention Spatio-Temporal Point Processes
Authors Shixiang Zhu, Minghe Zhang, Ruyi Ding, Yao Xie
Abstract We present a novel attention-based sequential model for mutually dependent spatio-temporal discrete event data, which is a versatile framework for capturing the non-homogeneous influence of events. We go beyond the assumption that the influence of the historical event (causing an upward or downward jump in the intensity function) will fade monotonically over time, which is a key assumption made by many widely-used point process models, including those based on Recurrent Neural Networks (RNNs). We borrow the idea from the attention model based on a probabilistic score function, which leads to a flexible representation of the intensity function and is highly interpretable. We demonstrate the superior performance of our approach compared to the state-of-the-art for both synthetic and real data.
Tasks Deep Attention, Point Processes
Published 2020-02-17
URL https://arxiv.org/abs/2002.07281v2
PDF https://arxiv.org/pdf/2002.07281v2.pdf
PWC https://paperswithcode.com/paper/deep-attention-spatio-temporal-point
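The abstract's key departure, an event influence that need not fade monotonically, can be sketched with an attention-weighted intensity. The scoring kernel below is a hypothetical stand-in for the paper's learned probabilistic score function:

```python
import math

# Minimal sketch: the conditional intensity is a base rate plus an
# attention-weighted sum of past-event influences, where the score kernel
# peaks some time AFTER the event (i.e., non-monotone influence in time).
def intensity(t, history, mu=0.1):
    def score(t, t_i):
        # Hypothetical kernel: influence peaks 1.0 time units after t_i.
        return math.exp(-(((t - t_i) - 1.0) ** 2))

    weights = [score(t, ti) for ti in history if ti < t]
    if not weights:
        return mu
    z = sum(math.exp(w) for w in weights)
    attn = [math.exp(w) / z for w in weights]      # softmax attention
    return mu + sum(a * w for a, w in zip(attn, weights))

events = [0.0, 0.5, 2.0]
print(intensity(2.5, events))
```

In the paper this kernel is parameterized by a neural network and made spatio-temporal; the point of the sketch is only that attention scores free the intensity from monotone decay.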

DeepCap: Monocular Human Performance Capture Using Weak Supervision

Title DeepCap: Monocular Human Performance Capture Using Weak Supervision
Authors Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt
Abstract Human performance capture is a highly important computer vision problem with many applications in movie production and virtual/augmented reality. Many previous performance capture approaches either required expensive multi-view setups or did not recover dense space-time coherent geometry with frame-to-frame correspondences. We propose a novel deep learning approach for monocular dense human performance capture. Our method is trained in a weakly supervised manner based on multi-view supervision completely removing the need for training data with 3D ground truth annotations. The network architecture is based on two separate networks that disentangle the task into a pose estimation and a non-rigid surface deformation step. Extensive qualitative and quantitative evaluations show that our approach outperforms the state of the art in terms of quality and robustness.
Tasks Pose Estimation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08325v1
PDF https://arxiv.org/pdf/2003.08325v1.pdf
PWC https://paperswithcode.com/paper/deepcap-monocular-human-performance-capture

Robust quantum minimum finding with an application to hypothesis selection

Title Robust quantum minimum finding with an application to hypothesis selection
Authors Yihui Quek, Clement Canonne, Patrick Rebentrost
Abstract We consider the problem of finding the minimum element in a list of length $N$ using a noisy comparator. The noise is modelled as follows: given two elements to compare, if the values of the elements differ by at least $\alpha$ by some metric defined on the elements, then the comparison will be made correctly; if the values of the elements are closer than $\alpha$, the outcome of the comparison is not subject to any guarantees. We demonstrate a quantum algorithm for noisy quantum minimum-finding that preserves the quadratic speedup of the noiseless case: our algorithm runs in time $\tilde O(\sqrt{N (1+\Delta)})$, where $\Delta$ is an upper bound on the number of elements within the interval $\alpha$, and outputs a good approximation of the true minimum with high probability. Our noisy comparator model is motivated by the problem of hypothesis selection, where given a set of $N$ known candidate probability distributions and samples from an unknown target distribution, one seeks to output some candidate distribution $O(\varepsilon)$-close to the unknown target. Much work on the classical front has been devoted to speeding up the run time of classical hypothesis selection from $O(N^2)$ to $O(N)$, in part by using statistical primitives such as the Scheffé test. Assuming a quantum oracle generalization of the classical data access and applying our noisy quantum minimum-finding algorithm, we take this run time into the sublinear regime. The final expected run time is $\tilde O( \sqrt{N(1+\Delta)})$, with the same $O(\log N)$ sample complexity from the unknown distribution as the classical algorithm. We expect robust quantum minimum-finding to be a useful building block for algorithms in situations where the comparator (which may be another quantum or classical algorithm) is resolution-limited or subject to some uncertainty.
Published 2020-03-26
URL https://arxiv.org/abs/2003.11777v1
PDF https://arxiv.org/pdf/2003.11777v1.pdf
PWC https://paperswithcode.com/paper/robust-quantum-minimum-finding-with-an
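The noisy-comparator model itself is easy to simulate classically, which makes the problem setup concrete. This sketch shows only the comparator semantics and a plain sequential scan, not the quantum algorithm:

```python
import random

random.seed(1)

# The paper's noise model: comparisons between elements at least alpha apart
# are always correct; comparisons inside the alpha "blind spot" carry no
# guarantee (modelled here as a coin flip).
def noisy_less(a, b, alpha):
    if abs(a - b) >= alpha:
        return a < b
    return random.random() < 0.5

def noisy_min(xs, alpha):
    # Classical sequential scan under the noisy comparator.
    best = xs[0]
    for x in xs[1:]:
        if noisy_less(x, best, alpha):
            best = x
    return best

xs = [random.uniform(0, 10) for _ in range(200)]
approx = noisy_min(xs, alpha=0.5)
```

The quantum contribution is achieving a $\tilde O(\sqrt{N(1+\Delta)})$ analogue of this scan while still returning a good approximate minimum with high probability.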

Data Uncertainty Learning in Face Recognition

Title Data Uncertainty Learning in Face Recognition
Authors Jie Chang, Zhonghao Lan, Changmao Cheng, Yichen Wei
Abstract Modeling data uncertainty is important for noisy images, but seldom explored for face recognition. The pioneer work, PFE, considers uncertainty by modeling each face image embedding as a Gaussian distribution. It is quite effective. However, it uses fixed feature (mean of the Gaussian) from an existing model. It only estimates the variance and relies on an ad-hoc and costly metric. Thus, it is not easy to use. It is unclear how uncertainty affects feature learning. This work applies data uncertainty learning to face recognition, such that the feature (mean) and uncertainty (variance) are learnt simultaneously, for the first time. Two learning methods are proposed. They are easy to use and outperform existing deterministic methods as well as PFE on challenging unconstrained scenarios. We also provide insightful analysis on how incorporating uncertainty estimation helps reduce the adverse effects of noisy samples and affects the feature learning.
Tasks Face Recognition
Published 2020-03-25
URL https://arxiv.org/abs/2003.11339v1
PDF https://arxiv.org/pdf/2003.11339v1.pdf
PWC https://paperswithcode.com/paper/data-uncertainty-learning-in-face-recognition
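The core mechanism, learning the embedding mean and variance jointly rather than fixing the mean as PFE does, can be sketched with a Gaussian embedding head and the reparameterization trick. The linear maps below are hypothetical stand-ins for the paper's network branches:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Gaussian embedding head: one branch produces the identity-bearing mean,
# another the per-dimension log-variance (data uncertainty). Both are
# trainable, unlike PFE where the mean comes from a frozen model.
in_dim, emb_dim = 16, 8
W_mu = rng.normal(size=(in_dim, emb_dim))
W_logvar = rng.normal(size=(in_dim, emb_dim)) * 0.01

def uncertain_embedding(x):
    mu = x @ W_mu                # feature (mean of the Gaussian)
    logvar = x @ W_logvar        # uncertainty (log-variance)
    eps = rng.normal(size=mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps   # reparameterized sample for training
    return z, mu, logvar

x = rng.normal(size=in_dim)
z, mu, logvar = uncertain_embedding(x)
```

During training the sampled `z` feeds the recognition loss, so gradients flow through both the mean and the variance; noisy images can then be absorbed by larger learned variances instead of distorting the mean feature.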

TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics

Title TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics
Authors Alexander Tong, Jessie Huang, Guy Wolf, David van Dijk, Smita Krishnaswamy
Abstract It is increasingly common to encounter data from dynamic processes captured by static cross-sectional measurements over time, particularly in biomedical settings. Recent attempts to model individual trajectories from this data use optimal transport to create pairwise matchings between time points. However, these methods cannot model continuous dynamics and non-linear paths that entities can take in these systems. To address this issue, we establish a link between continuous normalizing flows and dynamic optimal transport, which allows us to model the expected paths of points over time. Continuous normalizing flows are generally under-constrained, as they are allowed to take an arbitrary path from the source to the target distribution. We present TrajectoryNet, which controls the continuous paths taken between distributions. We show how this is particularly applicable for studying cellular dynamics in data from single-cell RNA sequencing (scRNA-seq) technologies, and that TrajectoryNet improves upon recently proposed static optimal transport-based models that can be used for interpolating cellular distributions.
Published 2020-02-09
URL https://arxiv.org/abs/2002.04461v1
PDF https://arxiv.org/pdf/2002.04461v1.pdf
PWC https://paperswithcode.com/paper/trajectorynet-a-dynamic-optimal-transport

Neural networks approach for mammography diagnosis using wavelets features

Title Neural networks approach for mammography diagnosis using wavelets features
Authors Essam A. Rashed, Mohamed G. Awad
Abstract A supervised diagnosis system for digital mammogram is developed. The diagnosis processes are done by transforming the data of the images into a feature vector using wavelets multilevel decomposition. This vector is used as the feature tailored toward separating different mammogram classes. The suggested model consists of artificial neural networks designed for classifying mammograms according to tumor type and risk level. Results are enhanced from our previous study by extracting feature vectors using multilevel decompositions instead of one level of decomposition. Radiologist-labeled images were used to evaluate the diagnosis system. Results are very promising and show a possible guide for future work.
Published 2020-03-06
URL https://arxiv.org/abs/2003.03000v1
PDF https://arxiv.org/pdf/2003.03000v1.pdf
PWC https://paperswithcode.com/paper/neural-networks-approach-for-mammography
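The feature-extraction step, stacking coefficients from a multilevel wavelet decomposition into one vector, can be sketched in 1D with the Haar wavelet. The paper works on 2D mammograms; this is only an illustration of how multiple decomposition levels contribute to the feature vector:

```python
import numpy as np

# One Haar decomposition step: pairwise averages (approximation) and
# pairwise differences (detail), both scaled to preserve energy.
def haar_step(signal):
    s = signal.reshape(-1, 2)
    approx = (s[:, 0] + s[:, 1]) / np.sqrt(2)
    detail = (s[:, 0] - s[:, 1]) / np.sqrt(2)
    return approx, detail

def multilevel_features(signal, levels):
    # Recurse on the approximation band, keeping every detail band; the
    # concatenation of all bands is the classifier's feature vector.
    coeffs, approx = [], np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        coeffs.append(detail)
    coeffs.append(approx)
    return np.concatenate(coeffs)

feats = multilevel_features(np.arange(16.0), levels=3)
print(feats.shape)
```

Because each step is orthonormal, the feature vector preserves the signal's energy while separating it into scale-specific bands, which is what lets multilevel features outperform a single-level decomposition.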

Towards Intelligent Pick and Place Assembly of Individualized Products Using Reinforcement Learning

Title Towards Intelligent Pick and Place Assembly of Individualized Products Using Reinforcement Learning
Authors Caterina Neef, Dario Luipers, Jan Bollenbacher, Christian Gebel, Anja Richert
Abstract Individualized manufacturing is becoming an important approach as a means to fulfill increasingly diverse and specific consumer requirements and expectations. While there are various solutions to the implementation of the manufacturing process, such as additive manufacturing, the subsequent automated assembly remains a challenging task. As an approach to this problem, we aim to teach a collaborative robot to successfully perform pick and place tasks by implementing reinforcement learning. For the assembly of an individualized product in a constantly changing manufacturing environment, the simulated geometric and dynamic parameters will be varied. Using reinforcement learning algorithms capable of meta-learning, the tasks will first be trained in simulation. They will then be performed in a real-world environment where new factors are introduced that were not simulated in training to confirm the robustness of the algorithms. The robot will gain its input data from tactile sensors, area scan cameras, and 3D cameras used to generate heightmaps of the environment and the objects. The selection of machine learning algorithms and hardware components as well as further research questions to realize the outlined production scenario are the results of the presented work.
Tasks Meta-Learning
Published 2020-02-11
URL https://arxiv.org/abs/2002.08333v1
PDF https://arxiv.org/pdf/2002.08333v1.pdf
PWC https://paperswithcode.com/paper/towards-intelligent-pick-and-place-assembly

Generalizable semi-supervised learning method to estimate mass from sparsely annotated images

Title Generalizable semi-supervised learning method to estimate mass from sparsely annotated images
Authors Muhammad K. A. Hamdan, Diane T. Rover, Matthew J. Darr, John Just
Abstract Mass flow estimation is of great importance to several industries, and it can be quite challenging to obtain accurate estimates due to limitation in expense or general infeasibility. In the context of agricultural applications, yield monitoring is a key component to precision agriculture and mass flow is the critical factor to measure. Measuring mass flow allows for field productivity analysis, cost minimization, and adjustments to machine efficiency. Methods such as volume or force-impact have been used to measure mass flow; however, these methods are limited in application and accuracy. In this work, we use deep learning to develop and test a vision system that can accurately estimate the mass of sugarcane while running in real-time on a sugarcane harvester during operation. The deep learning algorithm that is used to estimate mass flow is trained using very sparsely annotated images (semi-supervised) using only final load weights (aggregated weights over a certain period of time). The deep neural network (DNN) succeeds in capturing the mass of sugarcane accurately and surpasses older volumetric-based methods, despite highly varying lighting and material colors in the images. The deep neural network is initially trained to predict mass on laboratory data (bamboo) and then transfer learning is utilized to apply the same methods to estimate mass of sugarcane. Using a vision system with a relatively lightweight deep neural network we are able to estimate mass of bamboo with an average error of 4.5% and 5.9% for a select season of sugarcane.
Tasks Transfer Learning
Published 2020-03-05
URL https://arxiv.org/abs/2003.03192v1
PDF https://arxiv.org/pdf/2003.03192v1.pdf
PWC https://paperswithcode.com/paper/generalizable-semi-supervised-learning-method
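The weak-supervision signal described in the abstract, fitting summed per-image predictions to a single aggregated load weight, can be sketched with a linear model in place of the paper's DNN. All quantities below are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic load window: 50 images, each with a small feature vector. Only
# the TOTAL weight of the window is observed, not per-image masses.
n_images, dim = 50, 6
X = rng.uniform(size=(n_images, dim))
true_w = rng.uniform(size=dim)
load_weight = X.sum(axis=0) @ true_w

# Train so that the SUM of per-image predictions matches the load weight;
# this is the aggregated (sparsely annotated) supervision signal.
w = np.zeros(dim)
for _ in range(500):
    pred_total = (X @ w).sum()
    grad = 2 * (pred_total - load_weight) * X.sum(axis=0)
    w -= 5e-5 * grad

print(abs((X @ w).sum() - load_weight))
```

The aggregate constraint alone does not pin down per-image masses, which is why the paper pairs it with pretraining on laboratory data (bamboo) and transfer learning to sugarcane.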

ML-misfit: Learning a robust misfit function for full-waveform inversion using machine learning

Title ML-misfit: Learning a robust misfit function for full-waveform inversion using machine learning
Authors Bingbing Sun, Tariq Alkhalifah
Abstract Most of the available advanced misfit functions for full waveform inversion (FWI) are hand-crafted, and the performance of those misfit functions is data-dependent. Thus, we propose to learn a misfit function for FWI, entitled ML-misfit, based on machine learning. Inspired by the optimal transport of the matching filter misfit, we design a neural network (NN) architecture for the misfit function in a form similar to comparing the mean and variance for two distributions. To guarantee the resulting learned misfit is a metric, we accommodate the symmetry of the misfit with respect to its input and a Hinge loss regularization term in a meta-loss function to satisfy the “triangle inequality” rule. In the framework of meta-learning, we train the network by running FWI to invert for randomly generated velocity models and update the parameters of the NN by minimizing the meta-loss, which is defined as accumulated difference between the true and inverted models. We first illustrate the basic principle of the ML-misfit for learning a convex misfit function for travel-time shifted signals. Further, we train the NN on 2D horizontally layered models, and we demonstrate the effectiveness and robustness of the learned ML-misfit by applying it to the well-known Marmousi model.
Tasks Meta-Learning
Published 2020-02-08
URL https://arxiv.org/abs/2002.03163v2
PDF https://arxiv.org/pdf/2002.03163v2.pdf
PWC https://paperswithcode.com/paper/ml-misfit-learning-a-robust-misfit-function
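The two metric-enforcing ingredients from the abstract, built-in symmetry and a hinge penalty on triangle-inequality violations, can be sketched with a toy feature map standing in for the paper's NN:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the learned misfit network: a random feature map
# followed by a squared norm.
W = rng.normal(size=(4, 8))

def raw(d, m):
    h = np.tanh(W @ np.concatenate([d, m]))
    return float(np.sum(h ** 2))

def misfit(d, m):
    # Symmetry w.r.t. the two inputs is enforced by construction.
    return 0.5 * (raw(d, m) + raw(m, d))

def hinge_triangle(a, b, c):
    # Hinge regularizer: positive only when the triangle inequality
    # misfit(a, c) <= misfit(a, b) + misfit(b, c) is violated.
    return max(0.0, misfit(a, c) - misfit(a, b) - misfit(b, c))

a, b, c = (rng.normal(size=4) for _ in range(3))
print(misfit(a, b), hinge_triangle(a, b, c))
```

In the paper this hinge term enters the meta-loss over sampled triplets, so minimizing the meta-loss pushes the learned misfit toward being a proper metric.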

Modular network for high accuracy object detection

Title Modular network for high accuracy object detection
Authors Erez Yahalomi
Abstract We present a novel modular object detection convolutional neural network that significantly improves the accuracy of object detection. The network consists of two stages in a hierarchical structure. The first stage is a network that detects general classes. The second stage consists of separate networks to refine the classification and localization of each of the general classes' objects. Compared to state-of-the-art object detection networks, the classification error in the modular network is improved by approximately 3-5 times, from 12% to 2.5%-4.5%. This network is easy to implement and has a 0.94 mAP. The network architecture can be a platform to improve the accuracy of widespread state-of-the-art object detection networks and other kinds of deep learning networks. We show that a deep learning network initialized by transfer learning becomes more accurate as the number of classes it is later trained to detect becomes smaller.
Tasks Object Detection, Transfer Learning
Published 2020-01-24
URL https://arxiv.org/abs/2001.09203v2
PDF https://arxiv.org/pdf/2001.09203v2.pdf
PWC https://paperswithcode.com/paper/modular-network-for-high-accuracy-object
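The two-stage routing structure, a general-class detector that hands each detection to a class-specific refiner, can be sketched with toy rule-based classifiers standing in for the paper's CNNs. All class names and features here are hypothetical:

```python
# Toy sketch of the modular hierarchy: stage one predicts a general class,
# stage two dispatches to a specialist for that class. In the paper each
# stage is a separately trained detection network.
GENERAL_TO_FINE = {"vehicle": ["car", "truck"], "animal": ["cat", "dog"]}

def general_stage(feature):
    return "vehicle" if feature["wheels"] > 0 else "animal"

def refine_stage(general_class, feature):
    # Each general class has its own refinement network in the paper;
    # a simple rule stands in for each specialist here.
    if general_class == "vehicle":
        return "truck" if feature["size"] > 5 else "car"
    return "dog" if feature["size"] > 0.5 else "cat"

def detect(feature):
    g = general_stage(feature)
    return g, refine_stage(g, feature)

print(detect({"wheels": 4, "size": 2}))
```

Because each specialist only discriminates among a few fine classes, it matches the paper's observation that transfer-learned networks get more accurate as the number of classes they must separate shrinks.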

LayoutMP3D: Layout Annotation of Matterport3D

Title LayoutMP3D: Layout Annotation of Matterport3D
Authors Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, Yi-Hsuan Tsai
Abstract Inferring the information of 3D layout from a single equirectangular panorama is crucial for numerous applications of virtual reality or robotics (e.g., scene understanding and navigation). To achieve this, several datasets are collected for the task of 360 layout estimation. To facilitate the learning algorithms for autonomous systems in indoor scenarios, we consider the Matterport3D dataset with their originally provided depth map ground truths and further release our annotations for layout ground truths from a subset of Matterport3D. As Matterport3D contains accurate depth ground truths from time-of-flight (ToF) sensors, our dataset provides both the layout and depth information, which enables the opportunity to explore the environment by integrating both cues.
Tasks Scene Understanding
Published 2020-03-30
URL https://arxiv.org/abs/2003.13516v1
PDF https://arxiv.org/pdf/2003.13516v1.pdf
PWC https://paperswithcode.com/paper/layoutmp3d-layout-annotation-of-matterport3d