Paper Group ANR 230
i3PosNet: Instrument Pose Estimation from X-Ray in temporal bone surgery
Title | i3PosNet: Instrument Pose Estimation from X-Ray in temporal bone surgery |
Authors | David Kügler, Jannik Sehring, Andrei Stefanov, Igor Stenin, Julia Kristin, Thomas Klenzner, Jörg Schipper, Anirban Mukhopadhyay |
Abstract | Purpose: Accurate estimation of the position and orientation (pose) of surgical instruments is crucial for delicate minimally invasive temporal bone surgery. Current techniques either lack accuracy and/or suffer from line-of-sight constraints (conventional tracking systems) or expose the patient to prohibitive ionizing radiation (intra-operative CT). A possible solution is to capture the instrument with a C-arm at irregular intervals and recover the pose from the image. Methods: i3PosNet infers the position and orientation of instruments from images using a pose estimation network. The framework considers localized patches and outputs pseudo-landmarks. The pose is reconstructed from the pseudo-landmarks by geometric considerations. Results: We show i3PosNet reaches errors less than 0.05 mm. It outperforms conventional image registration-based approaches, reducing average and maximum errors by at least two thirds. i3PosNet trained on synthetic images generalizes to real x-rays without any further adaptation. Conclusion: The translation of deep-learning-based methods to surgical applications is difficult, because large representative datasets for training and testing are not available. This work empirically shows sub-millimeter pose estimation trained solely on synthetic training data. |
Tasks | Image Registration, Pose Estimation |
Published | 2018-02-26 |
URL | https://arxiv.org/abs/1802.09575v2 |
https://arxiv.org/pdf/1802.09575v2.pdf | |
PWC | https://paperswithcode.com/paper/i3posnet-instrument-pose-estimation-from-x |
Repo | |
Framework | |
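The abstract above says only that the pose is reconstructed from pseudo-landmarks "by geometric considerations". A minimal sketch of one such reconstruction follows, assuming (this is not stated in the abstract) that the landmarks are 2D image points lying roughly along the instrument axis and listed tip first. The function name `pose_from_landmarks` is hypothetical and is not the authors' implementation.

```python
import math

def pose_from_landmarks(landmarks):
    """Estimate a 2D pose from points along an instrument axis.

    Assumption: `landmarks` is a list of (x, y) points, tip first,
    lying approximately on one straight line.
    Returns ((tip_x, tip_y), angle_in_degrees).
    """
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    # Second-order moments of the point scatter around the centroid.
    sxx = sum((x - cx) ** 2 for x, _ in landmarks)
    syy = sum((y - cy) ** 2 for _, y in landmarks)
    sxy = sum((x - cx) * (y - cy) for x, y in landmarks)
    # Angle of the principal axis (total least-squares line fit).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    dx, dy = math.cos(theta), math.sin(theta)
    # Orient the axis so it points from the last landmark toward the tip.
    tx, ty = landmarks[0]
    lx, ly = landmarks[-1]
    if (tx - lx) * dx + (ty - ly) * dy < 0:
        dx, dy = -dx, -dy
    # Project the tip landmark onto the fitted line to denoise its position.
    t = (tx - cx) * dx + (ty - cy) * dy
    position = (cx + t * dx, cy + t * dy)
    return position, math.degrees(math.atan2(dy, dx))

position, angle = pose_from_landmarks([(3, 3), (2, 2), (1, 1), (0, 0)])
```

Fitting a line through all landmarks, rather than using two points, averages out per-landmark prediction noise; whether i3PosNet does this is not confirmed by the abstract.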
LoopSmart: Smart Visual SLAM Through Surface Loop Closure
Title | LoopSmart: Smart Visual SLAM Through Surface Loop Closure |
Authors | Guoxiang Zhang, YangQuan Chen |
Abstract | We present a visual simultaneous localization and mapping (SLAM) framework for closing surface loops. It combines sparse feature matching and dense surface alignment. Sparse feature matching is used for visual odometry and for global camera pose fine-tuning when dense loops are detected, while dense surface alignment is used to close large loops and solve the surface mismatching problem. To achieve smart dense surface loop closure, a highly efficient CUDA-based global point cloud registration method and a map-content-dependent loop verification method are proposed. We run extensive experiments on different datasets; our method outperforms state-of-the-art ones in terms of both camera trajectory and surface reconstruction accuracy. |
Tasks | Point Cloud Registration, Simultaneous Localization and Mapping, Visual Odometry |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01572v1 |
http://arxiv.org/pdf/1801.01572v1.pdf | |
PWC | https://paperswithcode.com/paper/loopsmart-smart-visual-slam-through-surface |
Repo | |
Framework | |
Model Interpretation: A Unified Derivative-based Framework for Nonparametric Regression and Supervised Machine Learning
Title | Model Interpretation: A Unified Derivative-based Framework for Nonparametric Regression and Supervised Machine Learning |
Authors | Xiaoyu Liu, Jie Chen, Joel Vaughan, Vijayan Nair, Agus Sudjianto |
Abstract | Interpreting a nonparametric regression model with many predictors is known to be a challenging problem. There has been renewed interest in this topic due to the extensive use of machine learning algorithms and the difficulty in understanding and explaining their input-output relationships. This paper develops a unified framework using a derivative-based approach for existing tools in the literature, including the partial-dependence plots, marginal plots and accumulated effects plots. It proposes a new interpretation technique called the accumulated total derivative effects plot and demonstrates how its components can be used to develop extensive insights in complex regression models with correlated predictors. The techniques are illustrated through simulation results. |
Tasks | |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07216v2 |
http://arxiv.org/pdf/1808.07216v2.pdf | |
PWC | https://paperswithcode.com/paper/model-interpretation-a-unified-derivative |
Repo | |
Framework | |
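Of the existing tools the abstract above names, the partial-dependence plot is the simplest to state: fix one predictor at a grid value for every observation and average the model's predictions. The sketch below shows that standard definition only; it is not the paper's accumulated total derivative effects plot, and the names `partial_dependence` and `model` are illustrative assumptions.

```python
def partial_dependence(model, X, feature, grid):
    """Standard partial-dependence curve for one predictor.

    model   : callable mapping a list of predictor values to a number
    X       : list of observations (each a list of predictor values)
    feature : index of the predictor of interest
    grid    : values at which to evaluate the curve
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)       # copy so X is left untouched
            modified[feature] = value  # force the predictor to the grid value
            total += model(modified)
        curve.append(total / len(X))   # average over the observed rows
    return curve

# Toy model and data, chosen only to make the arithmetic easy to follow.
toy_model = lambda r: 2 * r[0] + r[1]
toy_X = [[0, 1], [0, 3]]
pd_curve = partial_dependence(toy_model, toy_X, feature=0, grid=[0, 1, 2])
```

Because every row keeps its own values for the other predictors, this averaging can evaluate the model at unrealistic combinations when predictors are correlated, which is the setting the abstract highlights.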
A Stacked Autoencoder Neural Network based Automated Feature Extraction Method for Anomaly detection in On-line Condition Monitoring
Title | A Stacked Autoencoder Neural Network based Automated Feature Extraction Method for Anomaly detection in On-line Condition Monitoring |
Authors | Mohendra Roy, Sumon Kumar Bose, Bapi Kar, Pradeep Kumar Gopalakrishnan, Arindam Basu |
Abstract | Condition monitoring is one of the routine tasks in all major process industries. Mechanical parts such as motors, gears, and bearings are the major components of a process industry, and any fault in them may cause a total shutdown of the whole process, which may result in serious losses. Therefore, it is crucial to predict any approaching defects before they occur. Several methods exist for this purpose, and much research is being carried out on better and more efficient models. However, most of them are based on the processing of raw sensor signals, which is tedious and expensive. Recently, there has been an increase in feature-based condition monitoring, where only the useful features are extracted from the raw signals and interpreted for the prediction of the fault. Most of these are handcrafted features, obtained manually based on the nature of the raw data. This requires prior knowledge of the nature of the data and related processes, which limits the feature extraction process. Recent developments in autoencoder-based feature extraction provide an alternative to the traditional handcrafted approaches; however, they have mostly been confined to the area of image and audio processing. In this work, we have developed an automated feature extraction method for on-line condition monitoring based on a stack of the traditional autoencoder and an on-line sequential extreme learning machine (OSELM) network. The performance of this method is comparable to that of the traditional feature extraction approaches. The method can achieve 100% detection accuracy for determining the bearing health states of the NASA bearing dataset. The simple design of this method is promising for the easy hardware implementation of Internet of Things (IoT) based prognostics solutions. |
Tasks | Anomaly Detection |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08609v1 |
http://arxiv.org/pdf/1810.08609v1.pdf | |
PWC | https://paperswithcode.com/paper/a-stacked-autoencoder-neural-network-based |
Repo | |
Framework | |
Concept Mask: Large-Scale Segmentation from Semantic Concepts
Title | Concept Mask: Large-Scale Segmentation from Semantic Concepts |
Authors | Yufei Wang, Zhe Lin, Xiaohui Shen, Jianming Zhang, Scott Cohen |
Abstract | Existing works on semantic segmentation typically consider a small number of labels, ranging from tens to a few hundred. With a large number of labels, training and evaluation of such a task become extremely challenging due to correlation between labels and the lack of datasets with complete annotations. We formulate semantic segmentation as a problem of image segmentation given a semantic concept, and propose a novel system which can potentially handle an unlimited number of concepts, including objects, parts, stuff, and attributes. We achieve this using a weakly and semi-supervised framework leveraging multiple datasets with different levels of supervision. We first train a deep neural network on a 6M stock image dataset with only image-level labels to learn a visual-semantic embedding on 18K concepts. Then, we refine and extend the embedding network to predict an attention map, using a curated dataset with bounding box annotations on 750 concepts. Finally, we train an attention-driven class-agnostic segmentation network using an 80-category fully annotated dataset. We perform extensive experiments to validate that the proposed system performs competitively with the state of the art on fully supervised concepts, and is capable of producing accurate segmentations for weakly learned and unseen concepts. |
Tasks | Semantic Segmentation |
Published | 2018-08-18 |
URL | http://arxiv.org/abs/1808.06032v1 |
http://arxiv.org/pdf/1808.06032v1.pdf | |
PWC | https://paperswithcode.com/paper/concept-mask-large-scale-segmentation-from |
Repo | |
Framework | |
Online Robust Policy Learning in the Presence of Unknown Adversaries
Title | Online Robust Policy Learning in the Presence of Unknown Adversaries |
Authors | Aaron J. Havens, Zhanhong Jiang, Soumik Sarkar |
Abstract | The growing prospect of deep reinforcement learning (DRL) being used in cyber-physical systems has raised concerns around the safety and robustness of autonomous agents. Recent work on generating adversarial attacks has shown that it is computationally feasible for a bad actor to fool a DRL policy into behaving suboptimally. Although certain adversarial attacks with specific attack models have been addressed, most studies are only interested in off-line optimization in the data space (e.g., example fitting, distillation). This paper introduces a Meta-Learned Advantage Hierarchy (MLAH) framework that is attack model-agnostic and more suited to reinforcement learning, via handling the attacks in the decision space (as opposed to data space) and directly mitigating learned bias introduced by the adversary. In MLAH, we learn separate sub-policies (nominal and adversarial) in an online manner, as guided by a supervisory master agent that detects the presence of the adversary by leveraging the advantage function for the sub-policies. We demonstrate that the proposed algorithm enables policy learning with significantly lower bias as compared to the state-of-the-art policy learning approaches even in the presence of heavy state information attacks. We present algorithm analysis and simulation results using popular OpenAI Gym environments. |
Tasks | |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.06064v1 |
http://arxiv.org/pdf/1807.06064v1.pdf | |
PWC | https://paperswithcode.com/paper/online-robust-policy-learning-in-the-presence |
Repo | |
Framework | |
Sequence to Logic with Copy and Cache
Title | Sequence to Logic with Copy and Cache |
Authors | Javid Dadashkarimi, Sekhar Tatikonda |
Abstract | Generating logical form equivalents of human language is a fresh way to employ neural architectures where long short-term memory effectively captures dependencies in both encoder and decoder units. The logical form of the sequence usually preserves information from the natural language side in the form of similar tokens, and recently a copying mechanism has been proposed which increases the probability of outputting tokens from the source input through decoding. In this paper we propose a caching mechanism as a more general form of the copying mechanism which also weighs all the words from the source vocabulary according to their relation to the current decoding context. Our results confirm that the proposed method achieves improvements in sequence/token-level accuracy on sequence to logical form tasks. Further experiments on cross-domain adversarial attacks show substantial improvements when using the most influential examples of other domains for training. |
Tasks | |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07333v1 |
http://arxiv.org/pdf/1807.07333v1.pdf | |
PWC | https://paperswithcode.com/paper/sequence-to-logic-with-copy-and-cache |
Repo | |
Framework | |
GeniePath: Graph Neural Networks with Adaptive Receptive Paths
Title | GeniePath: Graph Neural Networks with Adaptive Receptive Paths |
Authors | Ziqi Liu, Chaochao Chen, Longfei Li, Jun Zhou, Xiaolong Li, Le Song, Yuan Qi |
Abstract | We present GeniePath, a scalable approach for learning adaptive receptive fields of neural networks defined on permutation-invariant graph data. In GeniePath, we propose an adaptive path layer consisting of two complementary functions designed for breadth and depth exploration respectively, where the former learns the importance of different-sized neighborhoods, while the latter extracts and filters signals aggregated from neighbors at different hop distances. Our method works in both transductive and inductive settings, and extensive experiments compared with competitive methods show that our approaches yield state-of-the-art results on large graphs. |
Tasks | |
Published | 2018-02-03 |
URL | http://arxiv.org/abs/1802.00910v3 |
http://arxiv.org/pdf/1802.00910v3.pdf | |
PWC | https://paperswithcode.com/paper/geniepath-graph-neural-networks-with-adaptive |
Repo | |
Framework | |
Latent Dirichlet Allocation in Generative Adversarial Networks
Title | Latent Dirichlet Allocation in Generative Adversarial Networks |
Authors | Lili Pan, Shen Cheng, Jian Liu, Yazhou Ren, Zenglin Xu |
Abstract | We study the problem of multimodal generative modelling of images based on generative adversarial networks (GANs). Despite the success of existing methods, they often ignore the underlying structure of vision data or its multimodal generation characteristics. To address this problem, we introduce the Dirichlet prior for multimodal image generation, which leads to a new Latent Dirichlet Allocation based GAN (LDAGAN). In detail, for the generative process modelling, LDAGAN defines a generative mode for each sample, determining which generative sub-process it belongs to. For the adversarial training, LDAGAN derives a variational expectation-maximization (VEM) algorithm to estimate model parameters. Experimental results on real-world datasets have demonstrated the outstanding performance of LDAGAN over other existing GANs. |
Tasks | Image Generation, Stochastic Optimization |
Published | 2018-12-17 |
URL | https://arxiv.org/abs/1812.06571v5 |
https://arxiv.org/pdf/1812.06571v5.pdf | |
PWC | https://paperswithcode.com/paper/latent-dirichlet-allocation-in-generative |
Repo | |
Framework | |
Gated Transfer Network for Transfer Learning
Title | Gated Transfer Network for Transfer Learning |
Authors | Yi Zhu, Jia Xue, Shawn Newsam |
Abstract | Deep neural networks have led to a series of breakthroughs in computer vision given sufficient annotated training datasets. For novel tasks with limited labeled data, the prevalent approach is to transfer the knowledge learned in the pre-trained models to the new tasks by fine-tuning. Classic model fine-tuning utilizes the fact that well trained neural networks appear to learn cross domain features. These features are treated equally during transfer learning. In this paper, we explore the impact of feature selection in model fine-tuning by introducing a transfer module, which assigns weights to features extracted from pre-trained models. The proposed transfer module proves the importance of feature selection for transferring models from source to target domains. It is shown to significantly improve upon fine-tuning results with only marginal extra computational cost. We also incorporate an auxiliary classifier as an extra regularizer to avoid over-fitting. Finally, we build a Gated Transfer Network (GTN) based on our transfer module and achieve state-of-the-art results on six different tasks. |
Tasks | Feature Selection, Transfer Learning |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12521v1 |
http://arxiv.org/pdf/1810.12521v1.pdf | |
PWC | https://paperswithcode.com/paper/gated-transfer-network-for-transfer-learning |
Repo | |
Framework | |
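The abstract above describes a transfer module that assigns weights to features extracted from a pre-trained model, instead of treating them equally. A minimal sketch of that idea follows, assuming a sigmoid gate computed by one linear layer; the exact form of the paper's module is not given in the abstract, and the names `gate_features`, `weights`, and `bias` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate_features(features, weights, bias):
    """Re-weight a feature vector with learned gates in (0, 1).

    features : list of floats from a (frozen or fine-tuned) backbone
    weights  : square matrix as a list of rows, one row per gate (assumed learned)
    bias     : list of floats, one per gate (assumed learned)
    """
    gates = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(weights, bias)
    ]
    # Element-wise product: each feature is scaled by its own gate.
    return [g * f for g, f in zip(gates, features)]

# With all-zero parameters every gate is sigmoid(0) = 0.5.
gated = gate_features([2.0, 4.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

In training, the gate parameters would be learned jointly with the target-task classifier, so that features unhelpful for the target domain receive gates near zero.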
Domain Adaptation for Real-Time Student Performance Prediction
Title | Domain Adaptation for Real-Time Student Performance Prediction |
Authors | Byung-Hak Kim, Ethan Vizitei, Varun Ganapathi |
Abstract | The increasingly fast development and update cycles of online course content, and the diverse demographics of students in each online classroom, make student performance prediction in real time (before the course finishes) and/or on curricula without specific historical performance data interesting topics for both industrial research and practical needs. In this research, we tackle the problem of real-time student performance prediction for on-going courses in a domain adaptation framework, that is, a system trained on students’ labeled outcomes from one set of previous coursework but meant to be deployed on another. In particular, we first introduce the recently developed GritNet architecture, which is the current state of the art for the student performance prediction problem, and develop a new unsupervised domain adaptation method to transfer a GritNet trained on a past course to a new course without any (students’ outcome) labels. Our results for real Udacity students’ graduation predictions show that the GritNet not only generalizes well from one course to another across different Nanodegree programs, but also enhances real-time predictions, especially in the first few weeks when accurate predictions are most challenging. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.06686v3 |
http://arxiv.org/pdf/1809.06686v3.pdf | |
PWC | https://paperswithcode.com/paper/gritnet-2-real-time-student-performance |
Repo | |
Framework | |
Non-rigid Reconstruction with a Single Moving RGB-D Camera
Title | Non-rigid Reconstruction with a Single Moving RGB-D Camera |
Authors | Shafeeq Elanattil, Peyman Moghadam, Sridha Sridharan, Clinton Fookes, Mark Cox |
Abstract | We present a novel non-rigid reconstruction method using a moving RGB-D camera. Current approaches use only the non-rigid part of the scene and completely ignore the rigid background. Non-rigid parts often lack sufficient geometric and photometric information for tracking large frame-to-frame motion. Our approach uses the camera pose estimated from the rigid background for foreground tracking. This enables robust foreground tracking in situations where large frame-to-frame motion occurs. Moreover, we propose a multi-scale deformation graph which improves non-rigid tracking without compromising the quality of the reconstruction. We also contribute a synthetic dataset, which is made publicly available for evaluating non-rigid reconstruction methods. The dataset provides frame-by-frame ground truth geometry of the scene, the camera trajectory, and masks for background and foreground. Experimental results show that our approach is more robust in handling larger frame-to-frame motions and provides better reconstruction compared to state-of-the-art approaches. |
Tasks | |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11219v2 |
http://arxiv.org/pdf/1805.11219v2.pdf | |
PWC | https://paperswithcode.com/paper/non-rigid-reconstruction-with-a-single-moving |
Repo | |
Framework | |
Cross Pixel Optical Flow Similarity for Self-Supervised Learning
Title | Cross Pixel Optical Flow Similarity for Self-Supervised Learning |
Authors | Aravindh Mahendran, James Thewlis, Andrea Vedaldi |
Abstract | We propose a novel method for learning convolutional neural image representations without manual supervision. We use motion cues in the form of optical flow, to supervise representations of static images. The obvious approach of training a network to predict flow from a single image can be needlessly difficult due to intrinsic ambiguities in this prediction task. We instead propose a much simpler learning goal: embed pixels such that the similarity between their embeddings matches that between their optical flow vectors. At test time, the learned deep network can be used without access to video or flow information and transferred to tasks such as image classification, detection, and segmentation. Our method, which significantly simplifies previous attempts at using motion for self-supervision, achieves state-of-the-art results in self-supervision using motion cues, competitive results for self-supervision in general, and is overall state of the art in self-supervised pretraining for semantic image segmentation, as demonstrated on standard benchmarks. |
Tasks | Image Classification, Optical Flow Estimation, Semantic Segmentation |
Published | 2018-07-15 |
URL | http://arxiv.org/abs/1807.05636v1 |
http://arxiv.org/pdf/1807.05636v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-pixel-optical-flow-similarity-for-self |
Repo | |
Framework | |
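The learning goal stated in the abstract above is to embed pixels so that the similarity between two embeddings matches the similarity between the corresponding optical flow vectors. A minimal sketch of such an objective follows, assuming cosine similarity and a squared-difference penalty over all pixel pairs; both choices are illustrative assumptions, and the paper's actual similarity kernels and loss may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def similarity_matching_loss(embeddings, flows):
    """Mean squared gap between embedding and flow similarities.

    embeddings : per-pixel embedding vectors (lists of floats)
    flows      : per-pixel optical flow vectors, same order as embeddings
    """
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            gap = cosine(embeddings[i], embeddings[j]) - cosine(flows[i], flows[j])
            total += gap * gap
            pairs += 1
    return total / pairs if pairs else 0.0

# Two pixels moving identically but embedded orthogonally: maximal mismatch.
loss = similarity_matching_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]])
```

Flow is needed only to compute this training signal, which is consistent with the abstract's point that the learned network runs on static images at test time.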
Leveraging Multi-grained Sentiment Lexicon Information for Neural Sequence Models
Title | Leveraging Multi-grained Sentiment Lexicon Information for Neural Sequence Models |
Authors | Yan Zeng, Yangyang Lan, Yazhou Hao, Chen Li, Qinhua Zheng |
Abstract | Neural sequence models have achieved great success in sentence-level sentiment classification. However, some models are exceptionally complex or based on expensive features. Other models recognize the value of existing linguistic resources but utilize them insufficiently. This paper proposes a novel and general method to incorporate lexicon information, including sentiment lexicons (+/-), negation words and intensifiers. Words are annotated with fine-grained and coarse-grained labels. The proposed method first encodes the fine-grained labels into a sentiment embedding and concatenates it with the word embedding. Second, the coarse-grained labels are utilized to enhance the attention mechanism so that it gives larger weight to sentiment-related words. Experimental results show that our method can increase classification accuracy for neural sequence models on both the SST-5 and MR datasets. Specifically, the enhanced Bi-LSTM model is even comparable with a Tree-LSTM, which uses expensive phrase-level annotations. Further analysis shows that in most cases the lexicon resource can offer the right annotations. Besides, the proposed method is capable of overcoming the effect of inevitably wrong annotations. |
Tasks | Sentiment Analysis |
Published | 2018-12-04 |
URL | https://arxiv.org/abs/1812.01527v2 |
https://arxiv.org/pdf/1812.01527v2.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-multi-grained-sentiment-lexicon |
Repo | |
Framework | |
Deep learning in business analytics and operations research: Models, applications and managerial implications
Title | Deep learning in business analytics and operations research: Models, applications and managerial implications |
Authors | Mathias Kraus, Stefan Feuerriegel, Asil Oztekin |
Abstract | Business analytics refers to methods and practices that create value through data for individuals, firms, and organizations. This field is currently experiencing a radical shift due to the advent of deep learning: deep neural networks promise improvements in prediction performance as compared to models from traditional machine learning. However, our research into the existing body of literature reveals a scarcity of research works utilizing deep learning in our discipline. Accordingly, the objectives of this overview article are as follows: (1) we review research on deep learning for business analytics from an operational point of view. (2) We motivate why researchers and practitioners from business analytics should utilize deep neural networks and review potential use cases, necessary requirements, and benefits. (3) We investigate the added value to operations research in different case studies with real data from entrepreneurial undertakings. All such cases demonstrate improvements in operational performance over traditional machine learning and thus direct value gains. (4) We provide guidelines and implications for researchers, managers and practitioners in operations research who want to advance their capabilities for business analytics with regard to deep learning. (5) Our computational experiments find that default, out-of-the-box architectures are often suboptimal and thus highlight the value of customized architectures by proposing a novel deep-embedded network. |
Tasks | |
Published | 2018-06-28 |
URL | https://arxiv.org/abs/1806.10897v3 |
https://arxiv.org/pdf/1806.10897v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-in-business-analytics-and |
Repo | |
Framework | |