Paper Group ANR 413
Towards Generalizable Surgical Activity Recognition Using Spatial Temporal Graph Convolutional Networks. Attributed Multi-Relational Attention Network for Fact-checking URL Recommendation. Towards Ground Truth Evaluation of Visual Explanations. Coupled Tensor Completion via Low-rank Tensor Ring. Adversarial Deep Network Embedding for Cross-network …
Towards Generalizable Surgical Activity Recognition Using Spatial Temporal Graph Convolutional Networks
Title | Towards Generalizable Surgical Activity Recognition Using Spatial Temporal Graph Convolutional Networks |
Authors | Duygu Sarikaya, Pierre Jannin |
Abstract | Purpose Modeling and recognition of surgical activities poses an interesting research problem. Although a number of recent works studied automatic recognition of surgical activities, generalizability of these works across different tasks and different datasets remains a challenge. We introduce a modality that is robust to scene variation, based on spatial temporal graph representations of surgical tools in videos for surgical activity recognition. Methods To show its effectiveness, we model and recognize surgical gestures with the proposed modality. We construct spatial graphs connecting the joint pose estimations of surgical tools. Then, we connect each joint to the corresponding joint in the consecutive frames, forming inter-frame edges that represent the trajectory of the joint over time. We then learn hierarchical spatial temporal graph representations using Spatial Temporal Graph Convolutional Networks (ST-GCN). Results Our experimental results show that learned spatial temporal graph representations of surgical videos perform well in surgical gesture recognition even when used individually. We experiment with the Suturing task of the JIGSAWS dataset, where the chance baseline for gesture recognition is 10%. Our results demonstrate 68% average accuracy, which suggests a significant improvement. Conclusions Our experimental results show that our model learns meaningful representations. These learned representations can be used either individually, in cascades, or as a complementary modality in surgical activity recognition, and therefore provide a benchmark. To our knowledge, our paper is the first to use spatial temporal graph representations based on pose estimations of surgical tools in surgical activity recognition. |
Tasks | Activity Recognition, Gesture Recognition, Surgical Gesture Recognition |
Published | 2020-01-11 |
URL | https://arxiv.org/abs/2001.03728v2 |
https://arxiv.org/pdf/2001.03728v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-generalizable-surgical-activity |
Repo | |
Framework | |
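The modality described above maps naturally onto the standard ST-GCN input layout. Below is a minimal, hedged sketch of one such block in PyTorch; the joint count, edge list, and channel sizes are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of one ST-GCN-style block over tool-joint pose sequences.
# The joint count, edge list, and channel sizes are illustrative assumptions.
import torch
import torch.nn as nn

V = 6                                                 # assumed number of tool joints per frame
edges = [(0, 1), (1, 2), (2, 3), (1, 4), (4, 5)]      # assumed kinematic links between joints

# Spatial adjacency with self-loops, symmetrically normalized.
A = torch.eye(V)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
deg = A.sum(dim=1)
A_norm = A / torch.sqrt(deg.unsqueeze(0) * deg.unsqueeze(1))

class STGCNBlock(nn.Module):
    """Spatial graph convolution followed by a temporal convolution over frames."""
    def __init__(self, in_ch, out_ch, adj):
        super().__init__()
        self.register_buffer("adj", adj)
        self.spatial = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.temporal = nn.Conv2d(out_ch, out_ch, kernel_size=(9, 1), padding=(4, 0))
        self.relu = nn.ReLU()

    def forward(self, x):                                   # x: (N, C, T, V)
        x = self.spatial(x)                                 # mix channels per joint
        x = torch.einsum("nctv,vw->nctw", x, self.adj)      # propagate along spatial edges
        return self.relu(self.temporal(x))                  # convolve along inter-frame edges

clips = torch.randn(2, 2, 100, V)    # 2 clips, (x, y) joint coordinates, 100 frames, V joints
print(STGCNBlock(2, 64, A_norm)(clips).shape)   # torch.Size([2, 64, 100, 6])
```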
Attributed Multi-Relational Attention Network for Fact-checking URL Recommendation
Title | Attributed Multi-Relational Attention Network for Fact-checking URL Recommendation |
Authors | Di You, Nguyen Vo, Kyumin Lee, Qiang Liu |
Abstract | To combat fake news, researchers have mostly focused on detecting fake news, while journalists have built and maintained fact-checking sites (e.g., Snopes.com and Politifact.com). However, fake news dissemination has been greatly promoted via social media sites, and these fact-checking sites have not been fully utilized. To overcome these problems and complement existing methods against fake news, in this paper we propose a deep-learning based fact-checking URL recommender system to mitigate the impact of fake news in social media sites such as Twitter and Facebook. In particular, our proposed framework consists of a multi-relational attentive module and a heterogeneous graph attention network to learn complex/semantic relationships among user-URL, user-user, and URL-URL pairs. Extensive experiments on a real-world dataset show that our proposed framework outperforms eight state-of-the-art recommendation models, achieving at least a 3-5.3% improvement. |
Tasks | Recommendation Systems |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.02214v1 |
https://arxiv.org/pdf/2001.02214v1.pdf | |
PWC | https://paperswithcode.com/paper/attributed-multi-relational-attention-network |
Repo | |
Framework | |
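As a rough illustration of the attention-over-relations idea (not the paper's exact AMRAN architecture), the sketch below attends over per-relation neighbor summaries when building a user representation; all shapes and names are assumptions.

```python
# Illustrative only (not the paper's exact AMRAN architecture): scaled dot-product
# attention over relation-specific neighbor summaries for a user.
import torch
import torch.nn as nn

class RelationAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)   # query from the user embedding
        self.key = nn.Linear(dim, dim)     # keys from per-relation neighbor summaries

    def forward(self, user_emb, relation_embs):
        # user_emb: (B, D); relation_embs: (B, R, D), one summary per relation type
        q = self.query(user_emb).unsqueeze(1)                 # (B, 1, D)
        k = self.key(relation_embs)                           # (B, R, D)
        scores = (q * k).sum(-1) / k.size(-1) ** 0.5          # (B, R)
        alpha = torch.softmax(scores, dim=-1)                 # weight each relation
        return (alpha.unsqueeze(-1) * relation_embs).sum(1)   # (B, D) fused user view

user = torch.randn(4, 32)
rels = torch.randn(4, 3, 32)   # e.g. user-URL, user-user, and URL-URL relation summaries
print(RelationAttention(32)(user, rels).shape)   # torch.Size([4, 32])
```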
Towards Ground Truth Evaluation of Visual Explanations
Title | Towards Ground Truth Evaluation of Visual Explanations |
Authors | Ahmed Osman, Leila Arras, Wojciech Samek |
Abstract | Several methods have been proposed to explain the decisions of neural networks in the visual domain via saliency heatmaps (aka relevances/feature importance scores). Thus far, these methods were mainly validated on real-world images, using either pixel perturbation experiments or bounding box localization accuracies. In the present work, we propose instead to evaluate explanations in a restricted and controlled setup using a synthetic dataset of rendered 3D shapes. To this end, we generate a CLEVR-like visual question answering benchmark with around 40,000 questions, where the ground truth pixel coordinates of relevant objects are known, which allows us to validate explanations in a fair and transparent way. We further introduce two straightforward metrics to evaluate explanations in this setup, and compare their outcomes to standard pixel perturbation using a Relation Network model and three decomposition-based explanation methods: Gradient x Input, Integrated Gradients and Layer-wise Relevance Propagation. Among the tested methods, Layer-wise Relevance Propagation was shown to perform best, followed by Integrated Gradients. More generally, we expect the release of our dataset and code to support the development and comparison of methods on a well-defined common ground. |
Tasks | Feature Importance, Question Answering, Visual Question Answering |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07258v1 |
https://arxiv.org/pdf/2003.07258v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-ground-truth-evaluation-of-visual |
Repo | |
Framework | |
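With ground-truth pixel coordinates available, a straightforward metric of the kind the abstract mentions can be written in a few lines. The sketch below (names and the exact definition are illustrative, not necessarily the paper's metrics) measures the fraction of positive relevance that falls inside the ground-truth object mask.

```python
# Minimal sketch of one metric of this kind (names and definition are illustrative):
# the fraction of positive relevance falling inside the ground-truth object mask.
import numpy as np

def relevance_mass_inside(heatmap: np.ndarray, gt_mask: np.ndarray) -> float:
    """heatmap: (H, W) relevance scores; gt_mask: (H, W) boolean ground-truth region."""
    pos = np.clip(heatmap, 0, None)          # keep positive relevance only
    total = pos.sum()
    return float(pos[gt_mask].sum() / total) if total > 0 else 0.0

# Toy check: relevance concentrated entirely inside the mask gives a perfect score.
h = np.zeros((4, 4)); h[1, 1] = 2.0; h[2, 2] = 1.0
m = np.zeros((4, 4), dtype=bool); m[1:3, 1:3] = True
print(relevance_mass_inside(h, m))   # 1.0
```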
Coupled Tensor Completion via Low-rank Tensor Ring
Title | Coupled Tensor Completion via Low-rank Tensor Ring |
Authors | Huyan Huang, Yipeng Liu, Ce Zhu |
Abstract | The coupled tensor decomposition aims to reveal the latent data structure which may share common factors. As a quantum inspired representation for tensors, the recently proposed tensor ring decomposition shows powerful representational ability. Using this decomposition, a novel non-convex model based on the coupled tensor ring Frobenius norm is proposed in this paper. We also provide an excess risk bound for this model, which shows improvement compared with the recent coupled nuclear norm method. The model is solved by block coordinate descent, which only involves solving a series of quadratic forms constructed from the sampling pattern, thus leading to efficient optimization. The proposed algorithm is validated on synthetic data and real-world data, which demonstrates its superiority over existing coupled completion methods. |
Tasks | |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.02810v2 |
https://arxiv.org/pdf/2001.02810v2.pdf | |
PWC | https://paperswithcode.com/paper/coupled-tensor-completion-via-low-rank-tensor |
Repo | |
Framework | |
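For readers unfamiliar with the tensor ring format, the sketch below reconstructs a full tensor from TR cores; the ranks and shapes are illustrative assumptions, and the paper's coupled completion model and TR Frobenius-norm objective are not reproduced here.

```python
# Hedged sketch: reconstructing a full tensor from tensor-ring (TR) cores. Ranks and
# shapes are illustrative; the paper's coupled model and its TR Frobenius-norm
# objective are not reproduced here.
import numpy as np

def tr_to_full(cores):
    """cores[k] has shape (r_k, n_k, r_{k+1}), with the last rank wrapping back to r_0."""
    full = cores[0]                                     # (r_0, n_0, r_1)
    for core in cores[1:]:
        # contract the trailing rank index with the next core's leading rank index
        full = np.tensordot(full, core, axes=([-1], [0]))
    # shape is now (r_0, n_0, ..., n_{N-1}, r_0); close the ring with a trace
    return np.trace(full, axis1=0, axis2=-1)

rng = np.random.default_rng(0)
rank, shape = 3, (4, 5, 6)
cores = [rng.standard_normal((rank, n, rank)) for n in shape]
print(tr_to_full(cores).shape)   # (4, 5, 6)
```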
Adversarial Deep Network Embedding for Cross-network Node Classification
Title | Adversarial Deep Network Embedding for Cross-network Node Classification |
Authors | Xiao Shen, Quanyu Dai, Fu-lai Chung, Wei Lu, Kup-Sze Choi |
Abstract | In this paper, the task of cross-network node classification, which leverages the abundant labeled nodes from a source network to help classify unlabeled nodes in a target network, is studied. The existing domain adaptation algorithms generally fail to model the network structural information, and the current network embedding models mainly focus on single-network applications. Thus, both of them cannot be directly applied to solve the cross-network node classification problem. This motivates us to propose an adversarial cross-network deep network embedding (ACDNE) model to integrate adversarial domain adaptation with deep network embedding so as to learn network-invariant node representations that can also well preserve the network structural information. In ACDNE, the deep network embedding module utilizes two feature extractors to jointly preserve attributed affinity and topological proximities between nodes. In addition, a node classifier is incorporated to make node representations label-discriminative. Moreover, an adversarial domain adaptation technique is employed to make node representations network-invariant. Extensive experimental results demonstrate that the proposed ACDNE model achieves the state-of-the-art performance in cross-network node classification. |
Tasks | Domain Adaptation, Network Embedding, Node Classification |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07366v1 |
https://arxiv.org/pdf/2002.07366v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-deep-network-embedding-for-cross |
Repo | |
Framework | |
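The adversarial ingredient can be illustrated with a gradient reversal layer feeding a domain classifier. The sketch below covers that piece only; ACDNE's two feature extractors and node classifier are omitted, and all layer sizes are assumptions.

```python
# Generic sketch of the adversarial ingredient only (gradient reversal + domain
# classifier); ACDNE's two feature extractors and node classifier are omitted, and
# all layer sizes are assumptions.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # flip the gradient flowing to the encoder

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# The domain classifier tries to tell source-network from target-network embeddings;
# the reversed gradient pushes the encoder toward network-invariant representations.
domain_clf = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
emb = torch.randn(32, 128, requires_grad=True)   # node embeddings from the encoder
loss = nn.CrossEntropyLoss()(domain_clf(grad_reverse(emb)), torch.randint(0, 2, (32,)))
loss.backward()
print(emb.grad.shape)   # torch.Size([32, 128])
```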
A Matlab Toolbox for Feature Importance Ranking
Title | A Matlab Toolbox for Feature Importance Ranking |
Authors | Shaode Yu, Zhicheng Zhang, Xiaokun Liang, Junjie Wu, Erlei Zhang, Wenjian Qin, Yaoqin Xie |
Abstract | More attention is being paid to feature importance ranking (FIR), in particular when thousands of features can be extracted for intelligent diagnosis and personalized medicine. A large number of FIR approaches have been proposed, while few are integrated for comparison and real-life applications. In this study, a Matlab toolbox is presented and a total of 30 algorithms are collected. Moreover, the toolbox is evaluated on a database of 163 ultrasound images. For each breast mass lesion, 15 features are extracted. To figure out the optimal subset of features for classification, all combinations of features are tested and a linear support vector machine is used for the malignancy prediction of lesions annotated in ultrasound images. Finally, the effectiveness of FIR is analyzed through performance comparison. The toolbox is online (https://github.com/NicoYuCN/matFIR). In our future work, more FIR methods, feature selection methods and machine learning classifiers will be integrated. |
Tasks | Feature Importance, Feature Selection |
Published | 2020-03-10 |
URL | https://arxiv.org/abs/2003.08737v1 |
https://arxiv.org/pdf/2003.08737v1.pdf | |
PWC | https://paperswithcode.com/paper/a-matlab-toolbox-for-feature-importance |
Repo | |
Framework | |
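The toolbox itself is Matlab; as a hedged Python analogue, the sketch below shows one simple FIR strategy of the kind such a toolbox might collect: ranking features by the magnitude of linear-SVM weights on a public breast-cancer dataset (not the ultrasound data used in the paper).

```python
# The toolbox itself is Matlab; this Python sketch only mimics one simple FIR strategy
# (ranking by absolute linear-SVM weights) on a public dataset, not the paper's
# ultrasound data or its 30 collected algorithms.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

svm = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
importance = np.abs(svm.coef_).ravel()     # one weight magnitude per feature
ranking = np.argsort(importance)[::-1]     # indices of features, most important first
print(ranking[:5])
```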
Toward Adaptive Guidance: Modeling the Variety of User Behaviors in Continuous-Skill-Improving Experiences of Machine Operation Tasks
Title | Toward Adaptive Guidance: Modeling the Variety of User Behaviors in Continuous-Skill-Improving Experiences of Machine Operation Tasks |
Authors | Long-fei Chen, Yuichi Nakamura, Kazuaki Kondo |
Abstract | An adaptive guidance system that supports equipment operators requires a comprehensive model of task and user behavior that considers different skill and knowledge levels as well as diverse situations. In this study, we investigated the relationships between user behaviors and skill levels under operational conditions. We captured sixty samples of two sewing tasks performed by five operators using a head-mounted RGB-D camera and a static gaze tracker. We examined the operators’ gaze and head movements, and hand interactions with essential regions (hotspots on the machine surface), to determine behavioral differences across continuous skill-improving experiences. We integrated the variety of user behaviors into an extensive task model with a two-step automatic approach: baseline model selection and experience integration. The experimental results indicate that some features, such as task execution time and user head movements, are good indicators of skill level and provide valuable information that can be applied to obtain an effective task model. Operators with varying knowledge and operating habits demonstrate different operational features, which can contribute to the design of user-specific guidance. |
Tasks | Model Selection |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03025v1 |
https://arxiv.org/pdf/2003.03025v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-adaptive-guidance-modeling-the-variety |
Repo | |
Framework | |
Scalable Distributed Approximation of Internal Measures for Clustering Evaluation
Title | Scalable Distributed Approximation of Internal Measures for Clustering Evaluation |
Authors | Federico Altieri, Andrea Pietracaprina, Geppino Pucci, Fabio Vandin |
Abstract | The most widely used internal measure for clustering evaluation is the silhouette coefficient, whose naive computation requires a quadratic number of distance calculations, which is clearly unfeasible for massive datasets. Surprisingly, there are no known general methods to efficiently approximate the silhouette coefficient of a clustering with rigorously provable high accuracy. In this paper, we present the first scalable algorithm to compute such a rigorous approximation for the evaluation of clusterings based on any metric distances. Our algorithm hinges on a Probability Proportional to Size (PPS) sampling scheme, and, for any fixed $\varepsilon, \delta \in (0,1)$, it approximates the silhouette coefficient within a mere additive error $O(\varepsilon)$ with probability $1-\delta$, using a very small number of distance calculations. We also prove that the algorithm can be adapted to obtain rigorous approximations of other internal measures of clustering quality, such as cohesion and separation. Importantly, we provide a distributed implementation of the algorithm using the MapReduce model, which runs in constant rounds and requires only sublinear local space at each worker, which makes our estimation approach applicable to big data scenarios. We perform an extensive experimental evaluation of our silhouette approximation algorithm, comparing its performance to a number of baseline heuristics on real and synthetic datasets. The experiments provide evidence that, unlike other heuristics, our estimation strategy not only provides tight theoretical guarantees but is also able to return highly accurate estimations while running in a fraction of the time required by the exact computation, and that its distributed implementation is highly scalable, thus enabling the computation of internal measures for very large datasets for which the exact computation is prohibitive. |
Tasks | |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01430v1 |
https://arxiv.org/pdf/2003.01430v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-distributed-approximation-of |
Repo | |
Framework | |
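The paper's estimator relies on PPS sampling with provable additive-error guarantees; the sketch below only illustrates the general idea of approximating the silhouette on a subsample, using scikit-learn's built-in uniform subsampling rather than the paper's scheme.

```python
# Simplified illustration only: scikit-learn's silhouette_score supports uniform
# subsampling via sample_size. The paper's estimator instead uses PPS sampling with
# provable additive-error guarantees, which this sketch does not reproduce.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=5000, centers=5, random_state=0)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

exact = silhouette_score(X, labels)                                      # quadratic in n
sampled = silhouette_score(X, labels, sample_size=500, random_state=0)   # subsampled estimate
print(f"exact={exact:.3f}  sampled={sampled:.3f}")
```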
Reinforcement Learning via Fenchel-Rockafellar Duality
Title | Reinforcement Learning via Fenchel-Rockafellar Duality |
Authors | Ofir Nachum, Bo Dai |
Abstract | We review basic concepts of convex duality, focusing on the very general and supremely useful Fenchel-Rockafellar duality. We summarize how this duality may be applied to a variety of reinforcement learning (RL) settings, including policy evaluation or optimization, online or offline learning, and discounted or undiscounted rewards. The derivations yield a number of intriguing results, including the ability to perform policy evaluation and on-policy policy gradient with behavior-agnostic offline data and methods to learn a policy via max-likelihood optimization. Although many of these results have appeared previously in various forms, we provide a unified treatment and perspective on these results, which we hope will enable researchers to better use and apply the tools of convex duality to make further progress in RL. |
Tasks | |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.01866v2 |
https://arxiv.org/pdf/2001.01866v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-via-fenchel |
Repo | |
Framework | |
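For reference, the central identity the review builds on can be stated as follows (one common sign convention; the paper's exact notation may differ):

```latex
% Fenchel-Rockafellar duality in one common sign convention: f, g convex, A a linear
% operator, f^* and g^* the convex conjugates. Under a suitable constraint
% qualification the primal and dual optimal values coincide.
\[
\min_{x}\; f(x) + g(Ax)
\;=\;
\max_{y}\; -f^{*}\!\left(-A^{\top}y\right) - g^{*}(y).
\]
```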
On the Value of Target Data in Transfer Learning
Title | On the Value of Target Data in Transfer Learning |
Authors | Steve Hanneke, Samory Kpotufe |
Abstract | We aim to understand the value of additional labeled or unlabeled target data in transfer learning, for any given amount of source data; this is motivated by practical questions around minimizing sampling costs, whereby target data is usually harder or costlier to acquire than source data but can yield better accuracy. To this aim, we establish the first minimax-rates in terms of both source and target sample sizes, and show that performance limits are captured by new notions of discrepancy between source and target, which we refer to as transfer exponents. |
Tasks | Transfer Learning |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.04747v1 |
https://arxiv.org/pdf/2002.04747v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-value-of-target-data-in-transfer-1 |
Repo | |
Framework | |
Understanding the QuickXPlain Algorithm: Simple Explanation and Formal Proof
Title | Understanding the QuickXPlain Algorithm: Simple Explanation and Formal Proof |
Authors | Patrick Rodler |
Abstract | In his seminal paper of 2004, Ulrich Junker proposed the QuickXPlain algorithm, which provides a divide-and-conquer computation strategy to find within a given set an irreducible subset with a particular (monotone) property. Beside its original application in the domain of constraint satisfaction problems, the algorithm has since then found widespread adoption in areas as different as model-based diagnosis, recommender systems, verification, or the Semantic Web. This popularity is due to the frequent occurrence of the problem of finding irreducible subsets on the one hand, and to QuickXPlain’s general applicability and favorable computational complexity on the other hand. However, although (we regularly experience) people are having a hard time understanding QuickXPlain and seeing why it works correctly, a proof of correctness of the algorithm has never been published. This is what we account for in this work, by explaining QuickXPlain in a novel tried and tested way and by presenting an intelligible formal proof of it. Apart from showing the correctness of the algorithm and excluding the later detection of errors (proof and trust effect), the added value of the availability of a formal proof is, e.g., (i) that the workings of the algorithm often become completely clear only after studying, verifying and comprehending the proof (didactic effect), (ii) the shown proof methodology can be used as a guidance for proving other recursive algorithms (transfer effect), and (iii) the possibility of providing “gapless” correctness proofs of systems that rely on (results computed by) QuickXPlain, such as numerous model-based debuggers (completeness effect). |
Tasks | Recommendation Systems |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.01835v2 |
https://arxiv.org/pdf/2001.01835v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-the-quickxplain-algorithm |
Repo | |
Framework | |
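For concreteness, here is a compact Python rendering of QuickXPlain's divide-and-conquer recursion (following Junker's 2004 formulation), with the consistency check abstracted into a caller-supplied monotone predicate; the toy predicate at the end is purely illustrative.

```python
# Compact Python rendering of QuickXPlain's divide-and-conquer recursion (following
# Junker's 2004 formulation), with the consistency check abstracted into a
# caller-supplied monotone predicate `conflicts`.
def quickxplain(background, constraints, conflicts):
    """Return a minimal subset of `constraints` that, together with `background`,
    satisfies the monotone property `conflicts`; return None if none exists."""
    if not conflicts(background + constraints):
        return None
    if conflicts(background):
        return []

    def qx(B, delta, C):
        if delta and conflicts(B):
            return []
        if len(C) == 1:
            return list(C)
        k = len(C) // 2
        C1, C2 = C[:k], C[k:]
        d2 = qx(B + C1, C1, C2)
        d1 = qx(B + d2, d2, C1)
        return d1 + d2

    return qx(background, [], constraints)

# Toy monotone property: a set "conflicts" if it contains both 'a' and 'b', or contains 'c'.
conflicts = lambda s: {'a', 'b'} <= set(s) or 'c' in s
print(quickxplain([], ['x', 'a', 'y', 'b', 'c'], conflicts))   # ['a', 'b'] ({'c'} is another minimal conflict)
```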
QED: using Quality-Environment-Diversity to evolve resilient robot swarms
Title | QED: using Quality-Environment-Diversity to evolve resilient robot swarms |
Authors | David M. Bossens, Danesh Tarapore |
Abstract | In swarm robotics, any of the robots in a swarm may be affected by different faults, resulting in significant performance declines. To allow fault recovery from randomly injected faults to different robots in a swarm, a model-free approach may be preferable due to the accumulation of faults in models and the difficulty to predict the behaviour of neighbouring robots. One model-free approach to fault recovery involves two phases: during simulation, a quality-diversity algorithm evolves a behaviourally diverse archive of controllers; during the target application, a search for the best controller is initiated after fault injection. In quality-diversity algorithms, the choice of the behavioural descriptor is a key design choice that determines the quality of the evolved archives, and therefore the fault recovery performance. Although the environment is an important determinant of behaviour, the impact of environmental diversity is often ignored in the choice of a suitable behavioural descriptor. This study compares different behavioural descriptors, including two generic descriptors that work on a wide range of tasks, one hand-coded descriptor which fits the domain of interest, and one novel type of descriptor based on environmental diversity, which we call Quality-Environment-Diversity (QED). Results demonstrate that the above-mentioned model-free approach to fault recovery is feasible in the context of swarm robotics, reducing the fault impact by a factor of 2-3. Further, the environmental diversity obtained with QED yields a unique behavioural diversity profile that allows it to recover from high-impact faults. |
Tasks | |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02341v1 |
https://arxiv.org/pdf/2003.02341v1.pdf | |
PWC | https://paperswithcode.com/paper/qed-using-quality-environment-diversity-to |
Repo | |
Framework | |
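The quality-diversity archive at the heart of this approach can be illustrated with a minimal MAP-Elites-style loop; the toy fitness, descriptor, and mutation below are stand-ins, not the swarm tasks or the QED environment-based descriptor.

```python
# Minimal MAP-Elites-style sketch of the quality-diversity loop this line of work
# builds on; the toy fitness, descriptor, and mutation are stand-ins, not the swarm
# tasks or the QED environment-based descriptor.
import numpy as np

rng = np.random.default_rng(0)
fitness = lambda g: -float(np.sum(g ** 2))                                  # toy quality measure
descriptor = lambda g: tuple(np.clip(((g[:2] + 1) * 5).astype(int), 0, 9))  # 10x10 grid cell

archive = {}                                                     # cell -> (fitness, genome)
for _ in range(5000):
    if archive and rng.random() < 0.9:
        parent = archive[list(archive)[rng.integers(len(archive))]][1]
        child = np.clip(parent + 0.1 * rng.standard_normal(5), -1, 1)   # mutate a stored elite
    else:
        child = rng.uniform(-1, 1, 5)                            # random bootstrap genome
    cell, f = descriptor(child), fitness(child)
    if cell not in archive or f > archive[cell][0]:
        archive[cell] = (f, child)                               # keep the best genome per cell

print(f"{len(archive)} cells filled, best fitness {max(v[0] for v in archive.values()):.3f}")
```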
Universal Differentiable Renderer for Implicit Neural Representations
Title | Universal Differentiable Renderer for Implicit Neural Representations |
Authors | Lior Yariv, Matan Atzmon, Yaron Lipman |
Abstract | The goal of this work is to learn implicit 3D shape representation with 2D supervision (i.e., a collection of images). To that end, we introduce the Universal Differentiable Renderer (UDR), a neural network architecture that can provably approximate reflected light from an implicit neural representation of a 3D surface, under a wide set of reflectance properties and lighting conditions. Experimenting with the task of multiview 3D reconstruction, we find our model to improve upon the baselines in the accuracy of the reconstructed 3D geometry and rendering from unseen viewing directions. |
Tasks | 3D Reconstruction, 3D Shape Representation |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.09852v1 |
https://arxiv.org/pdf/2003.09852v1.pdf | |
PWC | https://paperswithcode.com/paper/universal-differentiable-renderer-for |
Repo | |
Framework | |
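The paper's renderer is learned and differentiable; as background only, the sketch below shows classic sphere tracing of an analytic signed distance function, i.e., how camera rays locate an implicitly defined surface. It is not the UDR architecture.

```python
# Background sketch only, not the paper's learned UDR: classic sphere tracing of an
# analytic signed distance function (a sphere stands in for a neural SDF), showing how
# camera rays locate an implicitly defined surface.
import numpy as np

def sdf_sphere(p, center=np.array([0.0, 0.0, 3.0]), radius=1.0):
    return np.linalg.norm(p - center, axis=-1) - radius

def sphere_trace(origins, dirs, sdf, max_steps=64, eps=1e-4):
    t = np.zeros(origins.shape[0])
    hit = np.zeros(origins.shape[0], dtype=bool)
    for _ in range(max_steps):
        d = sdf(origins + t[:, None] * dirs)
        hit |= d < eps                       # rays that reached the surface
        t = np.where(hit, t, t + d)          # march remaining rays by the safe SDF distance
    return t, hit

# A tiny 8x8 image: rays from the origin through a z = 1 image plane.
u, v = np.meshgrid(np.linspace(-0.5, 0.5, 8), np.linspace(-0.5, 0.5, 8))
dirs = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
t, hit = sphere_trace(np.zeros_like(dirs), dirs, sdf_sphere)
print(hit.reshape(8, 8).astype(int))         # silhouette of the implicit sphere
```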
Reconstructing Sinus Anatomy from Endoscopic Video – Towards a Radiation-free Approach for Quantitative Longitudinal Assessment
Title | Reconstructing Sinus Anatomy from Endoscopic Video – Towards a Radiation-free Approach for Quantitative Longitudinal Assessment |
Authors | Xingtong Liu, Maia Stiber, Jindan Huang, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath |
Abstract | Reconstructing accurate 3D surface models of sinus anatomy directly from an endoscopic video is a promising avenue for cross-sectional and longitudinal analysis to better understand the relationship between sinus anatomy and surgical outcomes. We present a patient-specific, learning-based method for 3D reconstruction of sinus surface anatomy directly and only from endoscopic videos. We demonstrate the effectiveness and accuracy of our method on in and ex vivo data where we compare to sparse reconstructions from Structure from Motion, dense reconstruction from COLMAP, and ground truth anatomy from CT. Our textured reconstructions are watertight and enable measurement of clinically relevant parameters in good agreement with CT. The source code will be made publicly available upon publication. |
Tasks | 3D Reconstruction |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08502v1 |
https://arxiv.org/pdf/2003.08502v1.pdf | |
PWC | https://paperswithcode.com/paper/reconstructing-sinus-anatomy-from-endoscopic |
Repo | |
Framework | |
Fully Convolutional Networks for Automatically Generating Image Masks to Train Mask R-CNN
Title | Fully Convolutional Networks for Automatically Generating Image Masks to Train Mask R-CNN |
Authors | Hao Wu, Jan Paul Siebert |
Abstract | This paper proposes a novel method for automatically generating image masks for the state-of-the-art Mask R-CNN deep learning method. Mask R-CNN achieves the best results in object detection to date; however, obtaining object masks for training is very time-consuming and laborious. The proposed method uses a two-stage design to automatically generate image masks: the first stage implements a fully convolutional network (FCN) based segmentation network; the second stage, a Mask R-CNN based object detection network, is trained on the object image masks output by the FCN, the original input image, and additional label information. Through experimentation, our proposed method can obtain image masks automatically to train Mask R-CNN, and it achieves very high classification accuracy with over 90% mean average precision (mAP) for segmentation. |
Tasks | Object Detection |
Published | 2020-03-03 |
URL | https://arxiv.org/abs/2003.01383v1 |
https://arxiv.org/pdf/2003.01383v1.pdf | |
PWC | https://paperswithcode.com/paper/fully-convolutional-networks-for |
Repo | |
Framework | |
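A rough sketch of the two-stage idea using torchvision (API as of version 0.13 or later is assumed) is given below: an FCN proposes a foreground mask, which is then packaged as a Mask R-CNN training target. The class count, label handling, and training setup are illustrative, not the paper's.

```python
# Rough sketch of the two-stage idea with torchvision (>= 0.13 API assumed); the
# paper's classes, label handling, and training setup are not reproduced.
import torch
from torchvision.models.segmentation import fcn_resnet50
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.ops import masks_to_boxes

# Stage 1: an FCN proposes a foreground mask for the training image.
fcn = fcn_resnet50(weights=None, weights_backbone=None, num_classes=2).eval()
image = torch.rand(3, 256, 256)
with torch.no_grad():
    seg = fcn(image.unsqueeze(0))["out"].argmax(1)[0]            # (H, W) predicted class ids
masks = (seg == 1).unsqueeze(0).to(torch.uint8)                  # (1, H, W) foreground mask

# Stage 2: package the FCN mask as a Mask R-CNN training target.
if masks.any():
    boxes = masks_to_boxes(masks)                                # (1, 4) box enclosing the mask
    if bool(((boxes[:, 2] > boxes[:, 0]) & (boxes[:, 3] > boxes[:, 1])).all()):
        target = {"boxes": boxes,
                  "labels": torch.tensor([1], dtype=torch.int64),
                  "masks": masks}
        maskrcnn = maskrcnn_resnet50_fpn(weights=None, weights_backbone=None, num_classes=2).train()
        losses = maskrcnn([image], [target])                     # dict of training losses
        print({k: round(float(v), 3) for k, v in losses.items()})
```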