July 29, 2019

Paper Group AWR 188

Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks

Title Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks
Authors Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez, Luc Van Gool
Abstract We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for multi-scale contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state-of-the-art, and it generalizes very well to unseen categories and datasets. Particularly, we show that learning to estimate not only contour strength but also orientation provides more accurate results. We perform extensive experiments for low-level applications on BSDS, PASCAL Context, PASCAL Segmentation, and NYUD to evaluate boundary detection performance, showing that COB provides state-of-the-art contours and region hierarchies in all datasets. We also evaluate COB on high-level tasks when coupled with multiple pipelines for object proposals, semantic contours, semantic segmentation, and object detection on MS-COCO, SBD, and PASCAL; showing that COB also improves the results for all tasks.
Tasks Boundary Detection, Contour Detection, Image Classification, Object Detection, Semantic Segmentation
Published 2017-01-17
URL http://arxiv.org/abs/1701.04658v2
PDF http://arxiv.org/pdf/1701.04658v2.pdf
PWC https://paperswithcode.com/paper/convolutional-oriented-boundaries-from-image
Repo https://github.com/kmaninis/COB
Framework none

A Unified Approach to Interpreting Model Predictions

Title A Unified Approach to Interpreting Model Predictions
Authors Scott Lundberg, Su-In Lee
Abstract Understanding why a model makes a certain prediction can be as crucial as the prediction’s accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
Tasks Feature Importance, Interpretable Machine Learning
Published 2017-05-22
URL http://arxiv.org/abs/1705.07874v2
PDF http://arxiv.org/pdf/1705.07874v2.pdf
PWC https://paperswithcode.com/paper/a-unified-approach-to-interpreting-model
Repo https://github.com/GISH123/Cathay-Holdings-CIP-Projects-for-Interpretable-Machine-Learning
Framework tf
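
The abstract above describes SHAP as assigning each feature an importance value for one particular prediction. As a rough illustration only, the sketch below computes exact Shapley values by enumerating every feature coalition, which is feasible just for a handful of features. The names `shapley_values` and `toy_model`, and the choice to "remove" a feature by substituting a baseline value, are assumptions of this sketch and not the paper's approximation methods or the API of any SHAP library.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for model(x) relative to model(baseline).

    Assumption of this sketch: a feature that is absent from a coalition
    is replaced by its baseline value.
    """
    n = len(x)

    def value(coalition):
        # Coalition features come from x, all others from the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight shared by all coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model with an interaction between features 0 and 1.
    return 2.0 * z[0] + 1.0 * z[1] + 3.0 * z[0] * z[1] - 0.5 * z[2]


x = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

In this toy case the attributions add up to the difference between the prediction at `x` and the prediction at `baseline`, and the interaction term is split evenly between the two features involved. The cost grows exponentially with the number of features, which is why practical methods rely on approximations.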

KATE: K-Competitive Autoencoder for Text

Title KATE: K-Competitive Autoencoder for Text
Authors Yu Chen, Mohammed J. Zaki
Abstract Autoencoders have been successful in learning meaningful representations from image datasets. However, their performance on text datasets has not been widely studied. Traditional autoencoders tend to learn possibly trivial representations of text documents due to their confounding properties such as high-dimensionality, sparsity and power-law word distributions. In this paper, we propose a novel k-competitive autoencoder, called KATE, for text documents. Due to the competition between the neurons in the hidden layer, each neuron becomes specialized in recognizing specific data patterns, and overall the model can learn meaningful representations of textual data. A comprehensive set of experiments show that KATE can learn better representations than traditional autoencoders including denoising, contractive, variational, and k-sparse autoencoders. Our model also outperforms deep generative models, probabilistic topic models, and even word representation models (e.g., Word2Vec) in terms of several downstream tasks such as document classification, regression, and retrieval.
Tasks Document Classification, Topic Models
Published 2017-05-04
URL http://arxiv.org/abs/1705.02033v2
PDF http://arxiv.org/pdf/1705.02033v2.pdf
PWC https://paperswithcode.com/paper/kate-k-competitive-autoencoder-for-text
Repo https://github.com/hugochan/KATE
Framework tf
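
The competition between hidden neurons mentioned in the abstract above can be pictured with a small sketch. This is a simplified illustration, not the authors' layer: the split of `k` into positive and negative winners, the `alpha` amplification, and the function name `k_competitive` are assumptions made here for the example.

```python
import math


def k_competitive(activations, k, alpha=1.0):
    """Keep the strongest positive and negative activations; zero the rest.

    The summed activation ("energy") of the losing neurons is scaled by
    alpha and added to every winner of the same sign. Assumption: this
    reallocation rule is a simplification chosen for illustration.
    """
    # Indices of positive neurons, strongest first.
    pos = sorted((i for i, a in enumerate(activations) if a > 0),
                 key=lambda i: activations[i], reverse=True)
    # Indices of negative neurons, most negative first.
    neg = sorted((i for i, a in enumerate(activations) if a < 0),
                 key=lambda i: activations[i])
    k_pos = math.ceil(k / 2)
    k_neg = k // 2
    out = [0.0] * len(activations)
    for winners, losers in ((pos[:k_pos], pos[k_pos:]),
                            (neg[:k_neg], neg[k_neg:])):
        energy = sum(activations[i] for i in losers)
        for i in winners:
            out[i] = activations[i] + alpha * energy
    return out


hidden = [0.9, 0.1, -0.8, 0.3, -0.2, 0.05]
sparse = k_competitive(hidden, k=2, alpha=1.0)
```

With `k=2` only one positive and one negative neuron stay active, so each surviving neuron carries the signal of the neurons it beat. In a trained autoencoder a step like this would sit between the encoder activation and the decoder.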

Backtracking Regression Forests for Accurate Camera Relocalization

Title Backtracking Regression Forests for Accurate Camera Relocalization
Authors Lili Meng, Jianhui Chen, Frederick Tung, James J. Little, Julien Valentin, Clarence W. de Silva
Abstract Camera relocalization plays a vital role in many robotics and computer vision tasks, such as global localization, recovery from tracking failure, and loop closure detection. Recent random forests based methods directly predict 3D world locations for 2D image locations to guide the camera pose optimization. During training, each tree greedily splits the samples to minimize the spatial variance. However, these greedy splits often produce uneven sub-trees in training or incorrect 2D-3D correspondences in testing. To address these problems, we propose a sample-balanced objective to encourage equal numbers of samples in the left and right sub-trees, and a novel backtracking scheme to remedy the incorrect 2D-3D correspondence predictions. Furthermore, we extend the regression forests based methods to use local features in both training and testing stages for outdoor RGB-only applications. Experimental results on publicly available indoor and outdoor datasets demonstrate the efficacy of our approach, which shows superior or on-par accuracy with several state-of-the-art methods.
Tasks Camera Relocalization, Loop Closure Detection, Simultaneous Localization and Mapping
Published 2017-10-22
URL http://arxiv.org/abs/1710.07965v1
PDF http://arxiv.org/pdf/1710.07965v1.pdf
PWC https://paperswithcode.com/paper/backtracking-regression-forests-for-accurate
Repo https://github.com/LiliMeng/btrf
Framework none

ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases

Title ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases
Authors Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald M. Summers
Abstract The chest X-ray is one of the most commonly accessible radiological examinations for screening and diagnosis of many lung diseases. A tremendous number of X-ray imaging studies accompanied by radiological reports are accumulated and stored in many modern hospitals’ Picture Archiving and Communication Systems (PACS). On the other side, it is still an open question how this type of hospital-size knowledge database containing invaluable imaging informatics (i.e., loosely labeled) can be used to facilitate the data-hungry deep learning paradigms in building truly large-scale high precision computer-aided diagnosis (CAD) systems. In this paper, we present a new chest X-ray database, namely “ChestX-ray8”, which comprises 108,948 frontal-view X-ray images of 32,717 unique patients with the text-mined eight disease image labels (where each image can have multi-labels), from the associated radiological reports using natural language processing. Importantly, we demonstrate that these commonly occurring thoracic diseases can be detected and even spatially-located via a unified weakly-supervised multi-label image classification and disease localization framework, which is validated using our proposed dataset. Although the initial quantitative results are promising as reported, deep convolutional neural network based “reading chest X-rays” (i.e., recognizing and locating the common disease patterns trained with only image-level labels) remains a strenuous task for fully-automated high precision CAD systems. Data download link: https://nihcc.app.box.com/v/ChestXray-NIHCC
Tasks Image Classification, Lung Disease Classification
Published 2017-05-05
URL http://arxiv.org/abs/1705.02315v5
PDF http://arxiv.org/pdf/1705.02315v5.pdf
PWC https://paperswithcode.com/paper/chestx-ray8-hospital-scale-chest-x-ray
Repo https://github.com/Azure/AzureChestXRay
Framework pytorch

Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier

Title Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier
Authors Joseph Futoma, Sanjay Hariharan, Katherine Heller
Abstract We present a scalable end-to-end classifier that uses streaming physiological and medication data to accurately predict the onset of sepsis, a life-threatening complication from infections that has high mortality and morbidity. Our proposed framework models the multivariate trajectories of continuous-valued physiological time series using multitask Gaussian processes, seamlessly accounting for the high uncertainty, frequent missingness, and irregular sampling rates typically associated with real clinical data. The Gaussian process is directly connected to a black-box classifier that predicts whether a patient will become septic, chosen in our case to be a recurrent neural network to account for the extreme variability in the length of patient encounters. We show how to scale the computations associated with the Gaussian process in a manner so that the entire system can be discriminatively trained end-to-end using backpropagation. In a large cohort of heterogeneous inpatient encounters at our university health system we find that it outperforms several baselines at predicting sepsis, and yields 19.4% and 55.5% improved areas under the Receiver Operating Characteristic and Precision Recall curves as compared to the NEWS score currently used by our hospital.
Tasks Gaussian Processes, Time Series
Published 2017-06-13
URL http://arxiv.org/abs/1706.04152v1
PDF http://arxiv.org/pdf/1706.04152v1.pdf
PWC https://paperswithcode.com/paper/learning-to-detect-sepsis-with-a-multitask
Repo https://github.com/BorgwardtLab/mgp-tcn
Framework tf

Real-Time Panoramic Tracking for Event Cameras

Title Real-Time Panoramic Tracking for Event Cameras
Authors Christian Reinbacher, Gottfried Munda, Thomas Pock
Abstract Event cameras are a paradigm shift in camera technology. Instead of full frames, the sensor captures a sparse set of events caused by intensity changes. Since only the changes are transferred, those cameras are able to capture quick movements of objects in the scene or of the camera itself. In this work we propose a novel method to perform camera tracking of event cameras in a panoramic setting with three degrees of freedom. We propose a direct camera tracking formulation, similar to state-of-the-art in visual odometry. We show that the minimal information needed for simultaneous tracking and mapping is the spatial position of events, without using the appearance of the imaged scene point. We verify the robustness to fast camera movements and dynamic objects in the scene on a recently proposed dataset and self-recorded sequences.
Tasks Visual Odometry
Published 2017-03-15
URL http://arxiv.org/abs/1703.05161v2
PDF http://arxiv.org/pdf/1703.05161v2.pdf
PWC https://paperswithcode.com/paper/real-time-panoramic-tracking-for-event
Repo https://github.com/VLOGroup/dvs-panotracking
Framework none

Single-Shot Refinement Neural Network for Object Detection

Title Single-Shot Refinement Neural Network for Object Detection
Authors Shifeng Zhang, Longyin Wen, Xiao Bian, Zhen Lei, Stan Z. Li
Abstract For object detection, the two-stage approach (e.g., Faster R-CNN) has been achieving the highest accuracy, whereas the one-stage approach (e.g., SSD) has the advantage of high efficiency. To inherit the merits of both while overcoming their disadvantages, in this paper, we propose a novel single-shot based detector, called RefineDet, that achieves better accuracy than two-stage methods and maintains comparable efficiency of one-stage methods. RefineDet consists of two inter-connected modules, namely, the anchor refinement module and the object detection module. Specifically, the former aims to (1) filter out negative anchors to reduce search space for the classifier, and (2) coarsely adjust the locations and sizes of anchors to provide better initialization for the subsequent regressor. The latter module takes the refined anchors as the input from the former to further improve the regression and predict multi-class label. Meanwhile, we design a transfer connection block to transfer the features in the anchor refinement module to predict locations, sizes and class labels of objects in the object detection module. The multi-task loss function enables us to train the whole network in an end-to-end way. Extensive experiments on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO demonstrate that RefineDet achieves state-of-the-art detection accuracy with high efficiency. Code is available at https://github.com/sfzhang15/RefineDet
Tasks Object Detection
Published 2017-11-18
URL http://arxiv.org/abs/1711.06897v3
PDF http://arxiv.org/pdf/1711.06897v3.pdf
PWC https://paperswithcode.com/paper/single-shot-refinement-neural-network-for
Repo https://github.com/laycoding/FaceDetection
Framework none

Decision support from financial disclosures with deep neural networks and transfer learning

Title Decision support from financial disclosures with deep neural networks and transfer learning
Authors Mathias Kraus, Stefan Feuerriegel
Abstract Company disclosures greatly aid in the process of financial decision-making; therefore, they are consulted by financial investors and automated traders before exercising ownership in stocks. While humans are usually able to correctly interpret the content, the same is rarely true of computerized decision support systems, which struggle with the complexity and ambiguity of natural language. A possible remedy is represented by deep learning, which overcomes several shortcomings of traditional methods of text mining. For instance, recurrent neural networks, such as long short-term memories, employ hierarchical structures, together with a large number of hidden layers, to automatically extract features from ordered sequences of words and capture highly non-linear relationships such as context-dependent meanings. However, deep learning has only recently started to receive traction, possibly because its performance is largely untested. Hence, this paper studies the use of deep neural networks for financial decision support. We additionally experiment with transfer learning, in which we pre-train the network on a different corpus with a length of 139.1 million words. Our results reveal a higher directional accuracy as compared to traditional machine learning when predicting stock price movements in response to financial disclosures. Our work thereby helps to highlight the business value of deep learning and provides recommendations to practitioners and executives.
Tasks Decision Making, Transfer Learning
Published 2017-10-11
URL http://arxiv.org/abs/1710.03954v1
PDF http://arxiv.org/pdf/1710.03954v1.pdf
PWC https://paperswithcode.com/paper/decision-support-from-financial-disclosures
Repo https://github.com/MathiasKraus/FinancialDeepLearning
Framework none

Attention Is All You Need

Title Attention Is All You Need
Authors Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
Abstract The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
Tasks Constituency Parsing, Machine Translation
Published 2017-06-12
URL http://arxiv.org/abs/1706.03762v5
PDF http://arxiv.org/pdf/1706.03762v5.pdf
PWC https://paperswithcode.com/paper/attention-is-all-you-need
Repo https://github.com/kolloldas/torchnlp
Framework pytorch
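
The attention mechanism at the core of the Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. The sketch below is a minimal single-head version in plain Python, with no masking, no learned projections, and no batching; the function names and the toy matrices are illustrative assumptions, not code from the linked repository.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for softmax(Q K^T / sqrt(d_k)) V.

    Q, K and V are lists of rows; K and V must have the same number of rows.
    """
    d_k = len(K[0])
    weights = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights.append(softmax(scores))
    # Each output row is a weighted average of the value rows.
    output = [
        [sum(w * v[j] for w, v in zip(row, V)) for j in range(len(V[0]))]
        for row in weights
    ]
    return output, weights


# Toy inputs: two queries attending over three key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
output, weights = scaled_dot_product_attention(Q, K, V)
```

Every row of `weights` sums to 1, so each output row is a convex combination of the value rows. Because each query row is processed independently of the others, the whole computation can be expressed as matrix products, which is what makes the architecture easy to parallelize.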

Multivariate Time Series Classification with WEASEL+MUSE

Title Multivariate Time Series Classification with WEASEL+MUSE
Authors Patrick Schäfer, Ulf Leser
Abstract Multivariate time series (MTS) arise when multiple interconnected sensors record data over time. Dealing with this high-dimensional data is challenging for every classifier for at least two aspects: First, an MTS is not only characterized by individual feature values, but also by the interplay of features in different dimensions. Second, this typically adds large amounts of irrelevant data and noise. We present our novel MTS classifier WEASEL+MUSE which addresses both challenges. WEASEL+MUSE builds a multivariate feature vector, first using a sliding-window approach applied to each dimension of the MTS, then extracts discrete features per window and dimension. The feature vector is subsequently fed through feature selection, removing non-discriminative features, and analysed by a machine learning classifier. The novelty of WEASEL+MUSE lies in its specific way of extracting and filtering multivariate features from MTS by encoding context information into each feature. Still the resulting feature set is small, yet very discriminative and useful for MTS classification. Based on a popular benchmark of 20 MTS datasets, we found that WEASEL+MUSE is among the most accurate classifiers, when compared to the state of the art. The outstanding robustness of WEASEL+MUSE is further confirmed based on motion gesture recognition data, where it out-of-the-box achieved similar accuracies as domain-specific methods.
Tasks Feature Selection, Gesture Recognition, Time Series, Time Series Classification
Published 2017-11-30
URL http://arxiv.org/abs/1711.11343v4
PDF http://arxiv.org/pdf/1711.11343v4.pdf
PWC https://paperswithcode.com/paper/multivariate-time-series-classification-with
Repo https://github.com/patrickzib/SFA
Framework tf

Weightless: Lossy Weight Encoding For Deep Neural Network Compression

Title Weightless: Lossy Weight Encoding For Deep Neural Network Compression
Authors Brandon Reagen, Udit Gupta, Robert Adolf, Michael M. Mitzenmacher, Alexander M. Rush, Gu-Yeon Wei, David Brooks
Abstract The large memory requirements of deep neural networks limit their deployment and adoption on many devices. Model compression methods effectively reduce the memory requirements of these models, usually through applying transformations such as weight pruning or quantization. In this paper, we present a novel scheme for lossy weight encoding which complements conventional compression techniques. The encoding is based on the Bloomier filter, a probabilistic data structure that can save space at the cost of introducing random errors. Leveraging the ability of neural networks to tolerate these imperfections and by re-training around the errors, the proposed technique, Weightless, can compress DNN weights by up to 496x with the same model accuracy. This results in up to a 1.51x improvement over the state-of-the-art.
Tasks Model Compression, Neural Network Compression, Quantization
Published 2017-11-13
URL http://arxiv.org/abs/1711.04686v1
PDF http://arxiv.org/pdf/1711.04686v1.pdf
PWC https://paperswithcode.com/paper/weightless-lossy-weight-encoding-for-deep
Repo https://github.com/cambridge-mlg/miracle
Framework tf

Fast and Accurate Time Series Classification with WEASEL

Title Fast and Accurate Time Series Classification with WEASEL
Authors Patrick Schäfer, Ulf Leser
Abstract Time series (TS) occur in many scientific and commercial applications, ranging from earth surveillance to industry automation to the smart grids. An important type of TS analysis is classification, which can, for instance, improve energy load forecasting in smart grids by detecting the types of electronic devices based on their energy consumption profiles recorded by automatic sensors. Such sensor-driven applications are very often characterized by (a) very long TS and (b) very large TS datasets needing classification. However, current methods to time series classification (TSC) cannot cope with such data volumes at acceptable accuracy; they are either scalable but offer only inferior classification quality, or they achieve state-of-the-art classification quality but cannot scale to large data volumes. In this paper, we present WEASEL (Word ExtrAction for time SEries cLassification), a novel TSC method which is both scalable and accurate. Like other state-of-the-art TSC methods, WEASEL transforms time series into feature vectors, using a sliding-window approach, which are then analyzed through a machine learning classifier. The novelty of WEASEL lies in its specific method for deriving features, resulting in a much smaller yet much more discriminative feature set. On the popular UCR benchmark of 85 TS datasets, WEASEL is more accurate than the best current non-ensemble algorithms at orders-of-magnitude lower classification and training times, and it is almost as accurate as ensemble classifiers, whose computational complexity makes them inapplicable even for mid-size datasets. The outstanding robustness of WEASEL is also confirmed by experiments on two real smart grid datasets, where it out-of-the-box achieves almost the same accuracy as highly tuned, domain-specific methods.
Tasks Load Forecasting, Time Series, Time Series Classification
Published 2017-01-26
URL http://arxiv.org/abs/1701.07681v1
PDF http://arxiv.org/pdf/1701.07681v1.pdf
PWC https://paperswithcode.com/paper/fast-and-accurate-time-series-classification
Repo https://github.com/patrickzib/SFA
Framework tf
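
The sliding-window step mentioned in the abstract above can be sketched as a bag-of-patterns transform: every window becomes a short word, and the word counts form the feature vector handed to a classifier. The discretization used here (comparing segment means to the window mean) is a deliberately trivial stand-in and an assumption of this sketch; the paper's actual feature derivation and feature selection are considerably more involved, and the function names are hypothetical.

```python
from collections import Counter


def window_word(window, segments=2):
    """Map a window to a word: 'b' where a segment mean exceeds the window mean, else 'a'."""
    mean = sum(window) / len(window)
    seg_len = len(window) // segments
    letters = []
    for s in range(segments):
        seg = window[s * seg_len:(s + 1) * seg_len]
        letters.append("b" if sum(seg) / len(seg) > mean else "a")
    return "".join(letters)


def bag_of_patterns(series, window_size):
    """Count the words produced by every sliding window of the given size."""
    words = (window_word(series[i:i + window_size])
             for i in range(len(series) - window_size + 1))
    return Counter(words)


rising = [0, 1, 2, 3, 4, 5]
falling = [5, 4, 3, 2, 1, 0]
rising_bag = bag_of_patterns(rising, 4)
falling_bag = bag_of_patterns(falling, 4)
```

The rising series yields only the word `ab` and the falling series only `ba`, so even this crude encoding separates the two shapes. A sketch like this assumes `window_size` is at least `segments` and ideally a multiple of it.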

Character-level Deep Conflation for Business Data Analytics

Title Character-level Deep Conflation for Business Data Analytics
Authors Zhe Gan, P. D. Singh, Ameet Joshi, Xiaodong He, Jianshu Chen, Jianfeng Gao, Li Deng
Abstract Connecting different text attributes associated with the same entity (conflation) is important in business data analytics since it could help merge two different tables in a database to provide a more comprehensive profile of an entity. However, the conflation task is challenging because two text strings that describe the same entity could be quite different from each other for reasons such as misspelling. It is therefore critical to develop a conflation model that is able to truly understand the semantic meaning of the strings and match them at the semantic level. To this end, we develop a character-level deep conflation model that encodes the input text strings from character level into finite dimension feature vectors, which are then used to compute the cosine similarity between the text strings. The model is trained in an end-to-end manner using back propagation and stochastic gradient descent to maximize the likelihood of the correct association. Specifically, we propose two variants of the deep conflation model, based on long-short-term memory (LSTM) recurrent neural network (RNN) and convolutional neural network (CNN), respectively. Both models perform well on a real-world business analytics dataset and significantly outperform the baseline bag-of-character (BoC) model.
Tasks
Published 2017-02-08
URL http://arxiv.org/abs/1702.02640v1
PDF http://arxiv.org/pdf/1702.02640v1.pdf
PWC https://paperswithcode.com/paper/character-level-deep-conflation-for-business
Repo https://github.com/zhegan27/Deep_Conflation_Model
Framework none
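
The abstract above compares strings by the cosine similarity of their feature vectors and names a bag-of-character (BoC) model as the baseline. The sketch below illustrates that baseline idea only: the featurization (lowercased character counts, whitespace ignored) and the function names are assumptions made here, and the paper's learned LSTM or CNN encoders would take the place of `bag_of_characters`.

```python
import math
from collections import Counter


def bag_of_characters(text):
    """Character-count vector: lowercased, whitespace ignored (an assumption of this sketch)."""
    return Counter(ch for ch in text.lower() if not ch.isspace())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[ch] * b[ch] for ch in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def boc_similarity(s, t):
    """Similarity of two strings under the bag-of-character representation."""
    return cosine_similarity(bag_of_characters(s), bag_of_characters(t))


same_entity = boc_similarity("Acme Corp", "acme corp")
transposed = boc_similarity("Microsoft", "Micorsoft")
unrelated = boc_similarity("abc", "xyz")
```

A transposition such as "Micorsoft" scores as a perfect match because character counts ignore order. That makes the representation tolerant of some misspellings, but it also means any two anagrams look identical, which is one reason an order-aware encoder can do better.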

DeepTFP: Mobile Time Series Data Analytics based Traffic Flow Prediction

Title DeepTFP: Mobile Time Series Data Analytics based Traffic Flow Prediction
Authors Yuanfang Chen, Falin Chen, Yizhi Ren, Ting Wu, Ye Yao
Abstract Traffic flow prediction is an important research issue to avoid traffic congestion in transportation systems. Traffic congestion avoiding can be achieved by knowing traffic flow and then conducting transportation planning. Achieving traffic flow prediction is challenging as the prediction is affected by many complex factors such as inter-region traffic, vehicles’ relations, and sudden events. However, as the mobile data of vehicles has been widely collected by sensor-embedded devices in transportation systems, it is possible to predict the traffic flow by analysing mobile data. This study proposes a deep learning based prediction algorithm, DeepTFP, to collectively predict the traffic flow on each and every traffic road of a city. This algorithm uses three deep residual neural networks to model temporal closeness, period, and trend properties of traffic flow. Each residual neural network consists of a branch of residual convolutional units. DeepTFP aggregates the outputs of the three residual neural networks to optimize the parameters of a time series prediction model. Contrast experiments on mobile time series data from the transportation system of England demonstrate that the proposed DeepTFP outperforms the Long Short-Term Memory (LSTM) architecture based method in prediction accuracy.
Tasks Time Series, Time Series Prediction
Published 2017-10-01
URL http://arxiv.org/abs/1710.01695v1
PDF http://arxiv.org/pdf/1710.01695v1.pdf
PWC https://paperswithcode.com/paper/deeptfp-mobile-time-series-data-analytics
Repo https://github.com/tbinetruy/CIL4SYS
Framework none