Paper Group ANR 906
Two Techniques That Enhance the Performance of Multi-robot Prioritized Path Planning
Title | Two Techniques That Enhance the Performance of Multi-robot Prioritized Path Planning |
Authors | Anton Andreychuk, Konstantin Yakovlev |
Abstract | We introduce and empirically evaluate two techniques aimed at enhancing the performance of multi-robot prioritized path planning. The first technique is a deterministic procedure for re-scheduling (as opposed to the well-known approach based on random restarts); the second is a heuristic procedure that modifies the search space of the individual planner involved in the prioritized path finding. |
Tasks | |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01270v1 |
PDF | http://arxiv.org/pdf/1805.01270v1.pdf |
PWC | https://paperswithcode.com/paper/two-techniques-that-enhance-the-performance |
Repo | |
Framework | |
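A minimal sketch (in Python, not from the paper) of a prioritized-planning loop with a deterministic re-scheduling rule: robots are planned one by one in priority order on a time-expanded grid, and when some robot cannot find a conflict-free path it is promoted to the top of the ordering and planning restarts, instead of drawing a random restart ordering. The grid, the space-time BFS, and the promotion rule are illustrative assumptions, not the authors' exact procedures.

```python
from collections import deque

def spacetime_bfs(grid, start, goal, reserved, horizon=64):
    """BFS over (cell, time) states; `reserved` holds (cell, time) pairs taken by higher-priority robots."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0, (start,))])
    visited = {(start, 0)}
    while queue:
        cell, t, path = queue.popleft()
        if cell == goal:
            return list(path)
        if t == horizon:
            continue
        r, c = cell
        for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):  # wait in place or move
            nxt = (r + dr, c + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols) or grid[nxt[0]][nxt[1]]:
                continue
            if (nxt, t + 1) in reserved or (nxt, t + 1) in visited:
                continue
            visited.add((nxt, t + 1))
            queue.append((nxt, t + 1, path + (nxt,)))
    return None  # no conflict-free path under this ordering

def prioritized_plan(grid, tasks, max_restarts=None):
    """Plan robots in priority order; on failure, deterministically promote the failing robot."""
    order = list(range(len(tasks)))
    max_restarts = max_restarts or len(tasks)
    for _ in range(max_restarts):
        reserved, paths = set(), {}
        for robot in order:
            start, goal = tasks[robot]
            path = spacetime_bfs(grid, start, goal, reserved)
            if path is None:
                # Deterministic re-scheduling: move the failing robot to the front and retry.
                order.remove(robot)
                order.insert(0, robot)
                break
            paths[robot] = path
            reserved |= {(cell, t) for t, cell in enumerate(path)}
            reserved |= {(path[-1], t) for t in range(len(path), 64)}  # keep the goal cell occupied
        else:
            return paths
    return None

grid = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]      # 0 = free cell, 1 = obstacle
tasks = [((0, 0), (2, 2)), ((2, 2), (0, 0))]  # (start, goal) per robot, in priority order
print(prioritized_plan(grid, tasks))
```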
Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees
Title | Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees |
Authors | Guiliang Liu, Oliver Schulte, Wang Zhu, Qingcan Li |
Abstract | Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. A key component of many DRL models is a neural network representing a Q function, to estimate the expected cumulative reward following a state-action pair. The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted. To our knowledge, this work develops the first mimic learning framework for Q functions in DRL. We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent tree structure of an LMUT facilitates understanding the network’s learned knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs. |
Tasks | |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05887v1 |
PDF | http://arxiv.org/pdf/1807.05887v1.pdf |
PWC | https://paperswithcode.com/paper/toward-interpretable-deep-reinforcement |
Repo | |
Framework | |
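A rough sketch of the mimic-learning idea using an offline batch approximation: collect (state, Q-value) pairs from a teacher Q-function, partition the state space with a regression tree, and fit a linear model inside each leaf so that leaves carry linear value estimates rather than constants. The synthetic teacher, the sklearn tree, and the per-leaf least-squares fits are illustrative stand-ins for the paper's on-line LMUT algorithm.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

# Stand-in teacher: in practice q_values would come from the trained Q-network.
rng = np.random.default_rng(0)
states = rng.uniform(-1, 1, size=(5000, 4))
q_values = np.sin(states[:, 0]) + 0.5 * states[:, 1] * states[:, 2]

# 1. Partition the state space with a shallow regression tree (the U-tree structure).
tree = DecisionTreeRegressor(max_leaf_nodes=16, min_samples_leaf=50).fit(states, q_values)
leaf_ids = tree.apply(states)

# 2. Fit a linear model per leaf, giving leaves linear Q-estimates instead of constants.
leaf_models = {}
for leaf in np.unique(leaf_ids):
    mask = leaf_ids == leaf
    leaf_models[leaf] = LinearRegression().fit(states[mask], q_values[mask])

def mimic_q(batch):
    """Predict Q by routing through the tree and applying each leaf's linear model."""
    leaves = tree.apply(batch)
    preds = np.empty(len(batch))
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        preds[mask] = leaf_models[leaf].predict(batch[mask])
    return preds

test = rng.uniform(-1, 1, size=(1000, 4))
true_q = np.sin(test[:, 0]) + 0.5 * test[:, 1] * test[:, 2]
print("mimic MSE:", float(np.mean((mimic_q(test) - true_q) ** 2)))
```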
Stanza: Layer Separation for Distributed Training in Deep Learning
Title | Stanza: Layer Separation for Distributed Training in Deep Learning |
Authors | Xiaorui Wu, Hong Xu, Bo Li, Yongqiang Xiong |
Abstract | The parameter server architecture is prevalently used for distributed deep learning. Each worker machine in a parameter server system trains the complete model, which leads to a hefty amount of network data transfer between workers and servers. We empirically observe that the data transfer has a non-negligible impact on training time. To tackle the problem, we design a new distributed training system called Stanza. Stanza exploits the fact that in many models such as convolutional neural networks, most data exchange is attributed to the fully connected layers, while most computation is carried out in convolutional layers. Thus, we propose layer separation in distributed training: the majority of the nodes just train the convolutional layers, and the rest train the fully connected layers only. Gradients and parameters of the fully connected layers no longer need to be exchanged across the cluster, thereby substantially reducing the data transfer volume. We implement Stanza on PyTorch and evaluate its performance on Azure and EC2. Results show that Stanza accelerates training significantly over current parameter server systems: on EC2 instances with Tesla V100 GPUs and 10 Gbps bandwidth, for example, Stanza is 1.34x–13.9x faster for common deep learning models. |
Tasks | |
Published | 2018-12-27 |
URL | http://arxiv.org/abs/1812.10624v2 |
PDF | http://arxiv.org/pdf/1812.10624v2.pdf |
PWC | https://paperswithcode.com/paper/stanza-layer-separation-for-distributed |
Repo | |
Framework | |
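A minimal sketch of the layer-separation idea: split a CNN's parameters by layer type, so that convolutional parameters are the only ones synchronized across the (many) conv workers while the large fully connected matrices stay on the dedicated FC nodes. The toy model and grouping-by-shape logic below are illustrative; the actual Stanza system also handles activation exchange between the two groups of nodes.

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(),
    nn.Linear(64 * 16, 512), nn.ReLU(),
    nn.Linear(512, 10),
)

# Partition parameters by layer type: conv workers only ever exchange conv-side gradients,
# while the (much larger) fully connected matrices stay local to the FC nodes.
conv_names = [n for n, p in model.named_parameters() if p.dim() == 4]  # conv kernels
fc_names = [n for n, p in model.named_parameters() if p.dim() == 2]    # FC weight matrices

def megabytes(names):
    params = dict(model.named_parameters())
    return sum(params[n].numel() * 4 for n in names) / 1e6

print(f"synchronized across conv workers: {conv_names} ({megabytes(conv_names):.2f} MB)")
print(f"kept local to FC nodes:           {fc_names} ({megabytes(fc_names):.2f} MB)")
```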
A High-Performance HOG Extractor on FPGA
Title | A High-Performance HOG Extractor on FPGA |
Authors | Vinh Ngo, Arnau Casadevall, Marc Codina, David Castells-Rufas, Jordi Carrabina |
Abstract | Pedestrian detection is one of the key problems in the emerging self-driving car industry, and the HOG algorithm has proven to provide good accuracy for it. A great deal of work has gone into accelerating the HOG algorithm on FPGAs because of their low-power and high-throughput characteristics. In this paper, we present a high-performance HOG architecture for pedestrian detection on a low-cost FPGA platform. It achieves a maximum throughput of 526 FPS with 640x480 input images, which is 3.25 times faster than the state-of-the-art design. The accelerator is integrated with SVM-based prediction to realize a pedestrian detection system, and the power consumption of the whole system is comparable with the best existing implementations. |
Tasks | Pedestrian Detection |
Published | 2018-01-12 |
URL | http://arxiv.org/abs/1802.02187v1 |
PDF | http://arxiv.org/pdf/1802.02187v1.pdf |
PWC | https://paperswithcode.com/paper/a-high-performance-hog-extractor-on-fpga |
Repo | |
Framework | |
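A software reference for what the accelerator computes, under the usual HOG settings (9 orientation bins, 8x8-pixel cells, 2x2-cell blocks): extract the HOG descriptor of a detection window and score it with a linear SVM. The scikit-image/scikit-learn calls, the 128x64 window size, and the random training data are illustrative; the paper's contribution is the FPGA pipeline, not this host-side code.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_descriptor(window):
    """Standard HOG: 9 bins, 8x8 cells, 2x2 blocks, L2-Hys block normalization."""
    return hog(window, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Toy stand-in data: real systems train on pedestrian / non-pedestrian crops (e.g., 128x64 windows).
rng = np.random.default_rng(0)
windows = rng.random((40, 128, 64))
labels = rng.integers(0, 2, size=40)

features = np.stack([hog_descriptor(w) for w in windows])
svm = LinearSVC().fit(features, labels)

test_window = rng.random((128, 64))
score = svm.decision_function([hog_descriptor(test_window)])[0]
print("descriptor length:", features.shape[1], "SVM score:", round(float(score), 3))
```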
Lifted Wasserstein Matcher for Fast and Robust Topology Tracking
Title | Lifted Wasserstein Matcher for Fast and Robust Topology Tracking |
Authors | Maxime Soler, Mélanie Plainchault, Bruno Conche, Julien Tierny |
Abstract | This paper presents a robust and efficient method for tracking topological features in time-varying scalar data. Structures are tracked based on the optimal matching between persistence diagrams with respect to the Wasserstein metric. This fundamentally relies on solving the assignment problem, a special case of optimal transport, for all consecutive timesteps. Our approach relies on two main contributions. First, we revisit the seminal assignment algorithm by Kuhn and Munkres which we specifically adapt to the problem of matching persistence diagrams in an efficient way. Second, we propose an extension of the Wasserstein metric that significantly improves the geometrical stability of the matching of domain-embedded persistence pairs. We show that this geometrical lifting has the additional positive side-effect of improving the assignment matrix sparsity and therefore computing time. The global framework implements a coarse-grained parallelism by computing persistence diagrams and finding optimal matchings in parallel for every couple of consecutive timesteps. Critical trajectories are constructed by associating successively matched persistence pairs over time. Merging and splitting events are detected with a geometrical threshold in a post-processing stage. Extensive experiments on real-life datasets show that our matching approach is an order of magnitude faster than the seminal Munkres algorithm. Moreover, compared to a modern approximation method, our method provides competitive runtimes while yielding exact results. We demonstrate the utility of our global framework by extracting critical point trajectories from various simulated time-varying datasets and compare it to the existing methods based on associated overlaps of volumes. Robustness to noise and temporal resolution downsampling is empirically demonstrated. |
Tasks | |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.05870v4 |
PDF | http://arxiv.org/pdf/1808.05870v4.pdf |
PWC | https://paperswithcode.com/paper/lifted-wasserstein-matcher-for-fast-and |
Repo | |
Framework | |
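A small sketch of the matching step on made-up persistence pairs: build a cost matrix between the pairs of two consecutive diagrams that blends the usual birth/death distance with a geometrical "lifting" term penalizing distance between the embedded critical points, then solve the assignment with SciPy. The weight `lam` is an assumption, and diagonal (unmatched) handling as well as the authors' adapted Kuhn-Munkres solver are omitted.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Each pair: (birth, death, x, y, z) -- persistence values plus embedded critical-point location.
diagram_t0 = np.array([[0.1, 0.9, 0.2, 0.3, 0.0],
                       [0.2, 0.6, 0.8, 0.7, 0.1],
                       [0.0, 0.4, 0.5, 0.5, 0.9]])
diagram_t1 = np.array([[0.1, 0.8, 0.25, 0.3, 0.05],
                       [0.3, 0.7, 0.75, 0.7, 0.1],
                       [0.0, 0.5, 0.1, 0.9, 0.8]])

def lifted_cost(d0, d1, lam=0.5):
    """Squared birth/death distance, blended with squared geometric distance (the 'lifting')."""
    bd = ((d0[:, None, :2] - d1[None, :, :2]) ** 2).sum(-1)
    geo = ((d0[:, None, 2:] - d1[None, :, 2:]) ** 2).sum(-1)
    return (1 - lam) * bd + lam * geo

cost = lifted_cost(diagram_t0, diagram_t1)
rows, cols = linear_sum_assignment(cost)   # optimal matching w.r.t. the lifted metric
for i, j in zip(rows, cols):
    print(f"pair {i} at t0 -> pair {j} at t1 (cost {cost[i, j]:.3f})")
print("total transport cost:", float(cost[rows, cols].sum()))
```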
Sequence Learning with RNNs for Medical Concept Normalization in User-Generated Texts
Title | Sequence Learning with RNNs for Medical Concept Normalization in User-Generated Texts |
Authors | Elena Tutubalina, Zulfat Miftahutdinov, Sergey Nikolenko, Valentin Malykh |
Abstract | In this work, we consider the medical concept normalization problem, i.e., the problem of mapping a disease mention in free-form text to a concept in a controlled vocabulary, usually to the standard thesaurus in the Unified Medical Language System (UMLS). This task is challenging since medical terminology differs greatly between health care professionals and the general public writing in social media. We approach it as a sequence learning problem, with recurrent neural networks trained to obtain semantic representations of one- and multi-word expressions. We develop end-to-end neural architectures tailored specifically to medical concept normalization, including bidirectional LSTM and GRU with an attention mechanism and additional semantic similarity features based on UMLS. Our evaluation over a standard benchmark shows that our model improves over a state-of-the-art baseline for classification based on CNNs. |
Tasks | Medical Concept Normalization, Semantic Similarity, Semantic Textual Similarity |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11523v2 |
PDF | http://arxiv.org/pdf/1811.11523v2.pdf |
PWC | https://paperswithcode.com/paper/sequence-learning-with-rnns-for-medical |
Repo | |
Framework | |
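A bare-bones PyTorch sketch of the kind of architecture described: embed the tokens of a mention, run a bidirectional GRU, pool the hidden states with a learned attention vector, and classify into the concept vocabulary. Vocabulary size, concept count, and hyper-parameters are placeholders, and the paper's UMLS-based similarity features are not included.

```python
import torch
import torch.nn as nn

class ConceptNormalizer(nn.Module):
    def __init__(self, vocab_size=10000, num_concepts=500, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)           # scalar attention score per token
        self.out = nn.Linear(2 * hidden, num_concepts)

    def forward(self, token_ids):
        h, _ = self.gru(self.embed(token_ids))         # (batch, seq, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # attention over tokens
        mention_vec = (weights * h).sum(dim=1)         # weighted sum -> mention representation
        return self.out(mention_vec)                   # logits over concepts

model = ConceptNormalizer()
batch = torch.randint(1, 10000, (4, 12))               # 4 mentions, 12 tokens each
print(model(batch).shape)                              # torch.Size([4, 500])
```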
Noisy Blackbox Optimization with Multi-Fidelity Queries: A Tree Search Approach
Title | Noisy Blackbox Optimization with Multi-Fidelity Queries: A Tree Search Approach |
Authors | Rajat Sen, Kirthevasan Kandasamy, Sanjay Shakkottai |
Abstract | We study the problem of black-box optimization of a noisy function in the presence of low-cost approximations or fidelities, which is motivated by problems like hyper-parameter tuning. In hyper-parameter tuning, evaluating the black-box function at a point involves training a learning algorithm on a large dataset at a particular hyper-parameter setting and evaluating the validation error; even a single such evaluation can be prohibitively expensive. Therefore, it is beneficial to use low-cost approximations, like training the learning algorithm on a sub-sampled version of the whole dataset. These low-cost approximations/fidelities can, however, provide a biased and noisy estimate of the function value. In this work, we incorporate the multi-fidelity setup into the powerful framework of noisy black-box optimization through tree-like hierarchical partitions. We propose a multi-fidelity bandit-based tree-search algorithm for the problem and provide simple regret bounds for our algorithm. Finally, we validate the performance of our algorithm on real and synthetic datasets, where it outperforms several benchmarks. |
Tasks | |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10482v1 |
PDF | http://arxiv.org/pdf/1810.10482v1.pdf |
PWC | https://paperswithcode.com/paper/noisy-blackbox-optimization-with-multi |
Repo | |
Framework | |
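A toy illustration, not the authors' algorithm or its regret analysis, of the general recipe: hierarchically partition the domain, score cells optimistically from noisy evaluations plus a bonus for cell size and for having only been probed at a cheap fidelity, and split promising cells while escalating their fidelity. The objective, costs, and bonus constants are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_f(x, fidelity):
    """Toy objective: higher fidelity costs more but is less biased and less noisy."""
    value = -(x - 0.3) ** 2 + 0.1 * (1.0 - fidelity) + rng.normal(0, 0.05 * (2.0 - fidelity))
    cost = 0.1 + 0.9 * fidelity
    return value, cost

def multifidelity_tree_search(budget=30.0, size_bonus=0.5, fid_bonus=0.1):
    cells = [{"lo": 0.0, "hi": 1.0, "vals": [], "fid": 0.25}]
    spent = 0.0
    while spent < budget:
        def ucb(c):  # optimistic score: mean + cell-size bonus + low-fidelity allowance
            mean = np.mean(c["vals"]) if c["vals"] else np.inf
            return mean + size_bonus * (c["hi"] - c["lo"]) + fid_bonus * (1.0 - c["fid"])
        best = max(cells, key=ucb)
        x = 0.5 * (best["lo"] + best["hi"])
        value, cost = noisy_f(x, best["fid"])
        best["vals"].append(value)
        spent += cost
        # Split promising cells and probe the children at a higher fidelity.
        if len(best["vals"]) >= 3 and best["hi"] - best["lo"] > 1e-3:
            cells.remove(best)
            for lo, hi in ((best["lo"], x), (x, best["hi"])):
                cells.append({"lo": lo, "hi": hi, "vals": [], "fid": min(1.0, best["fid"] + 0.25)})
    leaf = max((c for c in cells if c["vals"]), key=lambda c: np.mean(c["vals"]))
    return 0.5 * (leaf["lo"] + leaf["hi"])

print("estimated maximizer:", round(multifidelity_tree_search(), 3))
```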
Semantic Segmentation via Highly Fused Convolutional Network with Multiple Soft Cost Functions
Title | Semantic Segmentation via Highly Fused Convolutional Network with Multiple Soft Cost Functions |
Authors | Tao Yang, Yan Wu, Junqiao Zhao, Linting Guan |
Abstract | Semantic image segmentation is one of the most challenging tasks in computer vision. In this paper, we propose a highly fused convolutional network, which consists of three parts: feature downsampling, combined feature upsampling, and multiple predictions. We adopt a strategy of multiple upsampling steps and combine feature maps from pooling layers with their corresponding unpooling layers. We then produce multiple pre-outputs, each generated from an unpooling layer by one-step upsampling, and concatenate these pre-outputs to obtain the final output. As a result, our proposed network makes extensive use of feature information by fusing and reusing feature maps. In addition, when training our model, we add multiple soft cost functions on the pre-outputs and final outputs, which mitigates the attenuation of the loss signal as it is back-propagated. We evaluate our model on three major segmentation datasets: CamVid, PASCAL VOC and ADE20K. We achieve state-of-the-art performance on the CamVid dataset, as well as considerable improvements on the PASCAL VOC and ADE20K datasets. |
Tasks | Semantic Segmentation |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01317v1 |
PDF | http://arxiv.org/pdf/1801.01317v1.pdf |
PWC | https://paperswithcode.com/paper/semantic-segmentation-via-highly-fused |
Repo | |
Framework | |
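A short PyTorch sketch of the multi-loss training idea, assuming a toy encoder-decoder: each intermediate "pre-output" gets its own softly weighted cross-entropy term, and the total loss is the weighted sum over pre-outputs and the fused final output. The architecture and loss weights are placeholders, not the paper's network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFusedSegNet(nn.Module):
    """Toy 2-stage decoder that emits one pre-output per upsampling step plus a fused final output."""
    def __init__(self, num_classes=5):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                                 nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.up2 = nn.ConvTranspose2d(16, 16, 2, stride=2)
        self.pre1 = nn.Conv2d(16, num_classes, 1)
        self.pre2 = nn.Conv2d(16, num_classes, 1)
        self.final = nn.Conv2d(2 * num_classes, num_classes, 1)

    def forward(self, x):
        h = self.enc(x)
        u1 = F.relu(self.up1(h))
        u2 = F.relu(self.up2(u1))
        p1 = F.interpolate(self.pre1(u1), scale_factor=2)  # pre-output from the first upsampling step
        p2 = self.pre2(u2)                                  # pre-output from the second step
        final = self.final(torch.cat([p1, p2], dim=1))      # concatenate pre-outputs -> final prediction
        return [p1, p2, final]

model = TinyFusedSegNet()
images = torch.randn(2, 3, 64, 64)
targets = torch.randint(0, 5, (2, 64, 64))

outputs = model(images)
weights = [0.3, 0.3, 1.0]                                   # soft cost weights per output
loss = sum(w * F.cross_entropy(o, targets) for w, o in zip(weights, outputs))
loss.backward()
print("total multi-output loss:", float(loss))
```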
GPU Accelerated Cascade Hashing Image Matching for Large Scale 3D Reconstruction
Title | GPU Accelerated Cascade Hashing Image Matching for Large Scale 3D Reconstruction |
Authors | Tao Xu, Kun Sun, Wenbing Tao |
Abstract | Image feature point matching is a key step in Structure from Motion (SfM). However, it is becoming more and more time-consuming as the number of images grows larger and larger. In this paper, we propose a GPU-accelerated image matching method with improved Cascade Hashing. First, we propose a Disk-Memory-GPU data exchange strategy and optimize the load order of data, so that the proposed method can deal with big data. Next, we parallelize the Cascade Hashing method on the GPU, proposing an improved parallel reduction and an improved parallel hashing ranking to fulfill this task. Finally, extensive experiments show that our image matching is about 20 times faster than SiftGPU on the same graphics card, nearly 100 times faster than the CPU CasHash method, and hundreds of times faster than CPU Kd-Tree based matching. Furthermore, we introduce the epipolar constraint to the proposed method, and use the epipolar geometry to guide the feature matching procedure, which further reduces the matching cost. |
Tasks | 3D Reconstruction |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08995v1 |
PDF | http://arxiv.org/pdf/1805.08995v1.pdf |
PWC | https://paperswithcode.com/paper/gpu-accelerated-cascade-hashing-image |
Repo | |
Framework | |
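A CPU-side sketch of the cascade-hashing idea (illustrative, not the paper's GPU kernels): descriptors are projected onto random hyperplanes to get binary codes, a short code buckets candidates cheaply, a longer code ranks them by Hamming distance, and only the top-ranked candidates are verified with exact L2 distance. Code lengths and data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
desc_a = rng.standard_normal((2000, 128)).astype(np.float32)  # e.g. SIFT descriptors of image A
desc_b = rng.standard_normal((2000, 128)).astype(np.float32)  # descriptors of image B

def binarize(desc, planes):
    return desc @ planes > 0                                   # sign of random projections

coarse_planes = rng.standard_normal((128, 8))                  # 8-bit bucket code
fine_planes = rng.standard_normal((128, 64))                   # 64-bit ranking code

coarse_a, coarse_b = binarize(desc_a, coarse_planes), binarize(desc_b, coarse_planes)
fine_a, fine_b = binarize(desc_a, fine_planes), binarize(desc_b, fine_planes)

# Bucket image-B descriptors by their coarse code.
buckets = {}
for j, code in enumerate(map(tuple, coarse_b)):
    buckets.setdefault(code, []).append(j)

matches = []
for i, code in enumerate(map(tuple, coarse_a)):
    cand = buckets.get(code, [])
    if not cand:
        continue
    # Rank candidates by Hamming distance on the fine code, verify the best few with exact L2.
    hamming = (fine_a[i] != fine_b[cand]).sum(axis=1)
    top = [cand[k] for k in np.argsort(hamming)[:4]]
    d2 = ((desc_a[i] - desc_b[top]) ** 2).sum(axis=1)
    matches.append((i, top[int(np.argmin(d2))]))

print("matched", len(matches), "of", len(desc_a), "features")
```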
DASN: Data-Aware Skilled Network for Accurate MR Brain Tissue Segmentation
Title | DASN: Data-Aware Skilled Network for Accurate MR Brain Tissue Segmentation |
Authors | Yang Deng, Yao Sun, Yongpei Zhu, Shuo Zhang, Mingwang Zhu, Kehong Yuan |
Abstract | Accurate segmentation of MR brain tissue is a crucial step for diagnosis, surgical planning, and treatment of brain abnormalities, and automatic, reliable segmentation methods are required to assist doctors. Over the last few years, deep learning, especially deep convolutional neural networks (CNNs), has emerged as one of the most prominent approaches for image recognition problems in various domains. But improving deep networks always needs inspiration, which is hard to come by. There are by now a number of reasonable MR brain tissue segmentation methods, all of which achieve promising performance, but each has its own characteristics and is distinctive with respect to data sets; in other words, different models' performance varies widely on the same data set, and each model has data it is skilled at. On this basis, we propose a judgement to distinguish the data that different models are good at. With our method, segmentation accuracy can be improved easily on top of existing models, without increasing the training data or improving the network. We validate our method on the widely used IBSR 18 dataset and obtain an average Dice ratio of 88.06%, compared to 85.82% and 86.92% when using each single model separately. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08473v2 |
PDF | http://arxiv.org/pdf/1807.08473v2.pdf |
PWC | https://paperswithcode.com/paper/dasndata-aware-skilled-network-for-accurate |
Repo | |
Framework | |
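The abstract only sketches the "judgement" at a high level, so the code below is a generic stand-in: on held-out cases, learn which of two pretrained segmentation models scores the higher Dice ratio, and at test time route each scan to the model the classifier predicts will win. The per-scan features, the classifier choice, and the random stand-in masks are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def case_features(volume):
    """Cheap per-scan features used to decide where a case should be routed (illustrative)."""
    return np.array([volume.mean(), volume.std(), np.percentile(volume, 90)])

def dice(pred, truth):
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + 1e-8)

# Pretend we already have segmentations from two different pretrained models on validation scans.
rng = np.random.default_rng(0)
volumes = [rng.random((16, 16, 16)) for _ in range(30)]
truths = [rng.random((16, 16, 16)) > 0.5 for _ in range(30)]
preds_model_a = [rng.random((16, 16, 16)) > 0.5 for _ in volumes]  # stand-in for model A's masks
preds_model_b = [rng.random((16, 16, 16)) > 0.5 for _ in volumes]  # stand-in for model B's masks

X = np.stack([case_features(v) for v in volumes])
y = np.array([int(dice(a, t) >= dice(b, t))                        # 1 -> model A wins on this case
              for a, b, t in zip(preds_model_a, preds_model_b, truths)])

router = LogisticRegression().fit(X, y)

new_scan = rng.random((16, 16, 16))
use_model_a = bool(router.predict([case_features(new_scan)])[0])
print("route new scan to:", "model A" if use_model_a else "model B")
```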
Can Euroscepticism Contribute to a European Public Sphere? The Europeanization of Media Discourses about Euroscepticism across Six Countries
Title | Can Euroscepticism Contribute to a European Public Sphere? The Europeanization of Media Discourses about Euroscepticism across Six Countries |
Authors | Anamaria Dutceac Segesten, Michael Bossetta |
Abstract | This study compares the media discourses about Euroscepticism in 2014 across six countries (United Kingdom, Ireland, France, Spain, Sweden, and Denmark). We assess the extent to which the mass media’s reporting of Euroscepticism indicates the Europeanization of public spheres. Using a mixed-methods approach combining LDA topic modeling and qualitative coding, we find that approximately 70 per cent of print articles mentioning “Euroscepticism” or “Eurosceptic” are framed in a non-domestic (i.e. European) context. In five of the six cases studied, articles exhibiting a European context are strikingly similar in content, with the British case as the exception. However, coverage of British Euroscepticism drives Europeanization in other Member States. Bivariate logistic regressions further reveal three macro-level structural variables that significantly correlate with a Europeanized media discourse: newspaper type (tabloid or broadsheet), presence of a strong Eurosceptic party, and relationship to the EU budget (net contributor or receiver of EU funds). |
Tasks | |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06745v1 |
PDF | http://arxiv.org/pdf/1810.06745v1.pdf |
PWC | https://paperswithcode.com/paper/can-euroscepticism-contribute-to-a-european |
Repo | |
Framework | |
Training Multi-organ Segmentation Networks with Sample Selection by Relaxed Upper Confident Bound
Title | Training Multi-organ Segmentation Networks with Sample Selection by Relaxed Upper Confident Bound |
Authors | Yan Wang, Yuyin Zhou, Peng Tang, Wei Shen, Elliot K. Fishman, Alan L. Yuille |
Abstract | Deep convolutional neural networks (CNNs), especially fully convolutional networks, have been widely applied to automatic medical image segmentation problems, e.g., multi-organ segmentation. Existing CNN-based segmentation methods mainly focus on looking for increasingly powerful network architectures, but pay less attention to data sampling strategies for training networks more effectively. In this paper, we present a simple but effective sample selection method for training multi-organ segmentation networks. Sample selection exhibits an exploitation-exploration strategy, i.e., exploiting hard samples and exploring less frequently visited samples. Based on the fact that very hard samples might have annotation errors, we propose a new sample selection policy, named Relaxed Upper Confident Bound (RUCB). Compared with other sample selection policies, e.g., Upper Confident Bound (UCB), it exploits a range of hard samples rather than being stuck with a small set of very hard ones, which mitigates the influence of annotation errors during training. We apply this new sample selection policy to training a multi-organ segmentation network on a dataset containing 120 abdominal CT scans and show that it boosts segmentation performance significantly. |
Tasks | Medical Image Segmentation, Semantic Segmentation |
Published | 2018-04-07 |
URL | http://arxiv.org/abs/1804.02595v1 |
PDF | http://arxiv.org/pdf/1804.02595v1.pdf |
PWC | https://paperswithcode.com/paper/training-multi-organ-segmentation-networks |
Repo | |
Framework | |
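A compact sketch of the sample-selection policy as described in the abstract: keep running loss statistics per training sample, compute a UCB-style score (mean loss plus an exploration bonus for rarely visited samples), and instead of always taking the single hardest samples, draw the next mini-batch uniformly from all samples whose score falls within a relaxation margin of the maximum. The margin schedule and constants are assumptions, not the paper's exact RUCB formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, batch_size = 1000, 16

mean_loss = np.zeros(num_samples)   # running mean loss per training sample
visits = np.zeros(num_samples)      # how often each sample has been selected
total_steps = 0

def select_batch(relax=0.2, c=1.0):
    """Relaxed-UCB selection: sample uniformly among near-maximal UCB scores."""
    global total_steps
    total_steps += 1
    bonus = c * np.sqrt(np.log(total_steps + 1.0) / (visits + 1.0))
    ucb = mean_loss + bonus
    threshold = ucb.max() - relax * (ucb.max() - ucb.min() + 1e-8)
    candidates = np.flatnonzero(ucb >= threshold)            # the relaxed "hard" pool
    return rng.choice(candidates, size=min(batch_size, len(candidates)), replace=False)

def update(batch, losses):
    visits[batch] += 1
    mean_loss[batch] += (losses - mean_loss[batch]) / visits[batch]

# Toy training loop: pretend per-sample losses come from the segmentation network.
true_difficulty = rng.random(num_samples)
for step in range(200):
    batch = select_batch()
    losses = true_difficulty[batch] + rng.normal(0, 0.05, size=len(batch))
    update(batch, losses)

print("difficulty of the five most-selected samples:",
      np.round(true_difficulty[np.argsort(visits)[-5:]], 2))
```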
Artificial neural networks condensation: A strategy to facilitate adaption of machine learning in medical settings by reducing computational burden
Title | Artificial neural networks condensation: A strategy to facilitate adaption of machine learning in medical settings by reducing computational burden |
Authors | Dianbo Liu, Nestor Sepulveda, Ming Zheng |
Abstract | Machine Learning (ML) applications in healthcare can have a great impact on people’s lives, helping deliver better and more timely treatment to those in need. At the same time, medical data is usually big and sparse, requiring substantial computational resources. Although this might not be a problem for wide adoption of ML tools in developed nations, the availability of computational resources can very well be limited in third-world nations. This can prevent less favored people from benefiting from the advancement of ML applications for healthcare. In this project we explored methods to increase the computational efficiency of ML algorithms, in particular Artificial Neural Nets (NNs), while not compromising the accuracy of the predicted results. We used in-hospital mortality prediction as our case analysis, based on the publicly available MIMIC III dataset. We explored three methods on two different NN architectures. We reduced the size of a recurrent neural net (RNN) and a dense neural net (DNN) by pruning “unused” neurons. Additionally, we modified the RNN structure by adding a hidden layer to the LSTM cell, allowing us to use fewer recurrent layers in the model. Finally, we implemented quantization on the DNN, forcing the weights to be 8-bit instead of 32-bit. We found that all our methods increased computational efficiency without compromising accuracy, and some of them even achieved higher accuracy than the pre-condensed baseline models. |
Tasks | Mortality Prediction, Quantization |
Published | 2018-12-23 |
URL | http://arxiv.org/abs/1812.09659v1 |
PDF | http://arxiv.org/pdf/1812.09659v1.pdf |
PWC | https://paperswithcode.com/paper/artificial-neural-networks-condensation-a |
Repo | |
Framework | |
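A small PyTorch sketch of two of the condensation steps mentioned: magnitude-based pruning of "unused" weights in a dense net, and post-training quantization of its weights to 8-bit integers with a per-tensor scale. The tiny MLP and thresholds are placeholders; the paper's LSTM modification is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(48, 64), nn.ReLU(), nn.Linear(64, 2))  # toy mortality classifier

# 1. Pruning: zero out the 50% smallest-magnitude weights in each Linear layer.
for module in model:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")                 # make the pruning permanent

sparsity = float((model[0].weight == 0).float().mean())
print(f"layer-0 sparsity after pruning: {sparsity:.2f}")

# 2. Quantization: store weights as int8 with a per-tensor scale, dequantize on the fly.
def quantize_int8(w):
    scale = w.abs().max() / 127.0
    return torch.clamp((w / scale).round(), -127, 127).to(torch.int8), scale

w_int8, scale = quantize_int8(model[0].weight.data)
w_dequant = w_int8.float() * scale

x = torch.randn(4, 48)
out_fp32 = x @ model[0].weight.data.t()
out_int8 = x @ w_dequant.t()
print("max abs error from 8-bit weights:", float((out_fp32 - out_int8).abs().max()))
```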
Data-parallel distributed training of very large models beyond GPU capacity
Title | Data-parallel distributed training of very large models beyond GPU capacity |
Authors | Samuel Matzek, Max Grossman, Minsik Cho, Anar Yusifov, Bryant Nelson, Amit Juneja |
Abstract | GPUs have limited memory and it is difficult to train wide and/or deep models that cause the training process to go out of memory. It is shown in this paper how an open source tool called Large Model Support (LMS) can utilize a high bandwidth NVLink connection between CPUs and GPUs to accomplish training of deep convolutional networks. LMS performs tensor swapping between CPU memory and GPU memory such that only a minimal number of tensors required in a training step are kept in the GPU memory. It is also shown how LMS can be combined with an MPI based distributed deep learning module to train models in a data-parallel fashion across multiple GPUs, such that each GPU is utilizing the CPU memory for tensor swapping. The hardware architecture that enables the high bandwidth GPU link with the CPU is discussed as well as the associated set of software tools that are available as the PowerAI package. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12174v1 |
PDF | http://arxiv.org/pdf/1811.12174v1.pdf |
PWC | https://paperswithcode.com/paper/data-parallel-distributed-training-of-very |
Repo | |
Framework | |
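The LMS tool itself ships with PowerAI, but stock PyTorch exposes a similar swapping mechanism: `torch.autograd.graph.save_on_cpu` offloads activation tensors saved for backward to (pinned) CPU memory and pages them back during the backward pass. The snippet below is a minimal illustration of that idea, not of LMS or its distributed integration.

```python
import torch
import torch.nn as nn

model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(8)])
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
x = torch.randn(64, 1024, device=device)

# Activations saved for backward are offloaded to CPU memory and copied back when needed,
# trading extra host<->device transfers for a smaller peak GPU memory footprint.
with torch.autograd.graph.save_on_cpu(pin_memory=torch.cuda.is_available()):
    loss = model(x).pow(2).mean()
    loss.backward()

print("backward finished; gradients live on", model[0][0].weight.grad.device)
```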
An Overview of Robust Subspace Recovery
Title | An Overview of Robust Subspace Recovery |
Authors | Gilad Lerman, Tyler Maunu |
Abstract | This paper will serve as an introduction to the body of work on robust subspace recovery. Robust subspace recovery involves finding an underlying low-dimensional subspace in a dataset that is possibly corrupted with outliers. While this problem is easy to state, it has been difficult to develop optimal algorithms due to its underlying nonconvexity. This work emphasizes advantages and disadvantages of proposed approaches and unsolved problems in the area. |
Tasks | |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.01013v2 |
PDF | http://arxiv.org/pdf/1803.01013v2.pdf |
PWC | https://paperswithcode.com/paper/an-overview-of-robust-subspace-recovery |
Repo | |
Framework | |
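A tiny numerical illustration of the problem setup, using an IRLS-style heuristic rather than any specific algorithm from the survey: points near a 2-D subspace are mixed with outliers, and the subspace estimate is refined by re-fitting a weighted PCA where weights shrink with each point's distance to the current subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, n_in, n_out = 10, 2, 200, 80

# Inliers near a random 2-D subspace of R^10, plus uniform outliers.
basis_true, _ = np.linalg.qr(rng.standard_normal((D, d)))
inliers = rng.standard_normal((n_in, d)) @ basis_true.T + 0.01 * rng.standard_normal((n_in, D))
outliers = rng.uniform(-3, 3, size=(n_out, D))
X = np.vstack([inliers, outliers])

def fit_subspace(X, weights):
    """Top-d right singular vectors of the weighted data matrix."""
    _, _, vt = np.linalg.svd(X * weights[:, None], full_matrices=False)
    return vt[:d].T                                   # D x d orthonormal basis

weights = np.ones(len(X))
for _ in range(20):                                   # IRLS: reweight by inverse residual distance
    basis = fit_subspace(X, weights)
    resid = np.linalg.norm(X - X @ basis @ basis.T, axis=1)
    weights = 1.0 / np.maximum(resid, 1e-3)

# Largest principal angle between the recovered and the true subspace (0 = perfect recovery).
overlap = np.linalg.svd(basis_true.T @ basis, compute_uv=False)
angle = np.degrees(np.arccos(np.clip(overlap.min(), -1.0, 1.0)))
print("largest principal angle (degrees):", round(float(angle), 2))
```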