July 27, 2019

3109 words 15 mins read

Paper Group ANR 494


3D Trajectory Reconstruction of Dynamic Objects Using Planarity Constraints. ByRDiE: Byzantine-resilient distributed coordinate descent for decentralized learning. Hyperprior on symmetric Dirichlet distribution. FPGA based Parallelized Architecture of Efficient Graph based Image Segmentation Algorithm. Debiased distributed learning for sparse parti …

3D Trajectory Reconstruction of Dynamic Objects Using Planarity Constraints

Title 3D Trajectory Reconstruction of Dynamic Objects Using Planarity Constraints
Authors Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
Abstract We present a method to reconstruct the three-dimensional trajectory of a moving instance of a known object category in monocular video data. We track the two-dimensional shape of objects at the pixel level, exploiting instance-aware semantic segmentation techniques and optical flow cues. We apply Structure from Motion techniques to object and background images to determine, for each frame, camera poses relative to object instances and background structures. By combining object and background camera pose information, we restrict the object trajectory to a one-parameter family of possible solutions. We compute a ground representation by fusing background structures and corresponding semantic segmentations. This allows us to determine an object trajectory consistent with the image observations and the reconstructed environment model. Our method is robust to occlusion and handles temporarily stationary objects. We show qualitative results using drone imagery. Due to the lack of suitable benchmark datasets, we present a new dataset to evaluate the quality of reconstructed three-dimensional object trajectories. The video sequences contain vehicles in urban areas and are rendered using the path-tracing render engine Cycles to achieve realistic results. We perform a quantitative evaluation of the presented approach using this dataset. Our algorithm achieves an average reconstruction-to-ground-truth distance of 0.31 meters.
Tasks Optical Flow Estimation, Semantic Segmentation
Published 2017-11-16
URL http://arxiv.org/abs/1711.06136v1
PDF http://arxiv.org/pdf/1711.06136v1.pdf
PWC https://paperswithcode.com/paper/3d-trajectory-reconstruction-of-dynamic
Repo
Framework
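
A hedged sketch of the one-parameter family the abstract refers to: if frame i has a background (world-to-camera) pose (R_bg, t_bg) and an object-to-camera pose (R_obj, t_obj) from a separate object SfM reconstruction, the object origin in world coordinates is x_i(s) = -R_bg^T t_bg + s * R_bg^T t_obj, with s the unknown relative scale. The paper resolves s with a reconstructed ground model; here s is chosen by a least-squares fit to a flat ground height, which is a simplification for illustration only.

```python
import numpy as np

def trajectory_family(poses_bg, poses_obj, s):
    """Object-origin trajectory for scale s; poses are (R, t) with x_cam = R @ x + t."""
    traj = []
    for (R_bg, t_bg), (R_obj, t_obj) in zip(poses_bg, poses_obj):
        cam_center_world = -R_bg.T @ t_bg
        traj.append(cam_center_world + s * (R_bg.T @ t_obj))
    return np.array(traj)

def scale_from_ground(poses_bg, poses_obj, ground_z=0.0):
    """Least-squares choice of s that keeps the trajectory height at the ground level."""
    base = trajectory_family(poses_bg, poses_obj, 0.0)[:, 2]
    step = trajectory_family(poses_bg, poses_obj, 1.0)[:, 2] - base
    return float(step @ (ground_z - base) / (step @ step))

# toy usage with two identity-rotation frames (purely illustrative numbers)
I = np.eye(3)
poses_bg = [(I, np.array([0.0, 0.0, -5.0])), (I, np.array([-1.0, 0.0, -5.0]))]
poses_obj = [(I, np.array([2.0, 0.0, -4.0])), (I, np.array([1.0, 0.0, -4.0]))]
s = scale_from_ground(poses_bg, poses_obj, ground_z=1.0)
print(s, trajectory_family(poses_bg, poses_obj, s))
```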

ByRDiE: Byzantine-resilient distributed coordinate descent for decentralized learning

Title ByRDiE: Byzantine-resilient distributed coordinate descent for decentralized learning
Authors Zhixiong Yang, Waheed U. Bajwa
Abstract Distributed machine learning algorithms enable learning of models from datasets that are distributed over a network without gathering the data at a centralized location. While efficient distributed algorithms have been developed under the assumption of faultless networks, failures that can render these algorithms nonfunctional occur frequently in the real world. This paper focuses on the problem of Byzantine failures, which are the hardest to safeguard against in distributed algorithms. While Byzantine fault tolerance has a rich history, existing work does not translate into efficient and practical algorithms for high-dimensional learning in fully distributed (also known as decentralized) settings. In this paper, an algorithm termed Byzantine-resilient distributed coordinate descent (ByRDiE) is developed and analyzed that enables distributed learning in the presence of Byzantine failures. Theoretical analysis (convex settings) and numerical experiments (convex and nonconvex settings) highlight its usefulness for high-dimensional distributed learning in the presence of Byzantine failures.
Tasks
Published 2017-08-28
URL https://arxiv.org/abs/1708.08155v4
PDF https://arxiv.org/pdf/1708.08155v4.pdf
PWC https://paperswithcode.com/paper/byrdie-byzantine-resilient-distributed
Repo
Framework
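
A minimal sketch in the spirit of ByRDiE: for each coordinate, a node screens the values received from its neighbours with a trimmed mean before taking a local gradient step on that coordinate. The loss, screening parameter b, and network setup below are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def trimmed_mean(values, b):
    """Drop the b smallest and b largest scalars, average the rest."""
    v = np.sort(np.asarray(values))
    return v[b:len(v) - b].mean()

def byzantine_resilient_coordinate_step(w_local, k, neighbor_ws, grad_fn, b, lr=0.1):
    """Update coordinate k of the local model using screened neighbour values."""
    screened = trimmed_mean([w[k] for w in neighbor_ws] + [w_local[k]], b)
    w_new = w_local.copy()
    w_new[k] = screened - lr * grad_fn(w_local)[k]   # coordinate-wise gradient step
    return w_new

# toy usage: least-squares gradient, 5 neighbours, tolerate up to b=1 Byzantine value
rng = np.random.default_rng(0)
A, y = rng.normal(size=(20, 3)), rng.normal(size=20)
grad = lambda w: A.T @ (A @ w - y) / len(y)
w = np.zeros(3)
neighbors = [w + rng.normal(scale=0.01, size=3) for _ in range(5)]
for k in range(3):
    w = byzantine_resilient_coordinate_step(w, k, neighbors, grad, b=1)
print(w)
```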

Hyperprior on symmetric Dirichlet distribution

Title Hyperprior on symmetric Dirichlet distribution
Authors Jun Lu
Abstract In this article we show how to place a vague hyperprior on the symmetric Dirichlet distribution and how to update its parameter by adaptive rejection sampling (ARS). Finally, we analyze this hyperprior in an over-fitted mixture model through synthetic experiments.
Tasks
Published 2017-08-28
URL http://arxiv.org/abs/1708.08177v1
PDF http://arxiv.org/pdf/1708.08177v1.pdf
PWC https://paperswithcode.com/paper/hyperprior-on-symmetric-dirichlet
Repo
Framework
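
A sketch of the setup the abstract describes: a Gamma(a, b) hyperprior on the concentration alpha of a symmetric Dirichlet over K mixture weights. The paper samples alpha with adaptive rejection sampling; for brevity this sketch evaluates the same log-posterior but draws from a crude grid approximation instead of ARS. The values of a, b and the grid are illustrative assumptions.

```python
import numpy as np
from scipy.special import gammaln

def log_post_alpha(alpha, log_pi, a=1.0, b=1.0):
    """log p(alpha | pi) up to a constant: Gamma(a, b) prior times Dirichlet(alpha * 1_K) likelihood."""
    K = len(log_pi)
    return ((a - 1) * np.log(alpha) - b * alpha
            + gammaln(K * alpha) - K * gammaln(alpha)
            + (alpha - 1) * np.sum(log_pi))

def sample_alpha_grid(log_pi, grid=np.linspace(1e-3, 20, 2000), rng=None):
    """Stand-in for ARS: normalise the posterior on a grid and sample from it."""
    rng = rng or np.random.default_rng()
    logp = np.array([log_post_alpha(a, log_pi) for a in grid])
    p = np.exp(logp - logp.max())
    return rng.choice(grid, p=p / p.sum())

pi = np.array([0.6, 0.25, 0.1, 0.05])          # current mixture weights
print(sample_alpha_grid(np.log(pi)))
```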

FPGA based Parallelized Architecture of Efficient Graph based Image Segmentation Algorithm

Title FPGA based Parallelized Architecture of Efficient Graph based Image Segmentation Algorithm
Authors Roopal Nahar, Akanksha Baranwal, K. Madhava Krishna
Abstract Efficient and real-time segmentation of color images is important in many fields of computer vision, such as image compression, medical imaging, mapping and autonomous navigation. Being one of the most computationally expensive operations, it is usually done through software implementation on high-performance processors. In robotic systems, however, with constrained platform dimensions and the simultaneous need for portability, low power consumption and real-time image segmentation, we envision hardware parallelism as the way forward to achieve higher acceleration. Field-programmable gate arrays (FPGAs) are among the best suited for this task, as they provide high computing power in a small physical area. They exceed the computing speed of software-based implementations by breaking the paradigm of sequential execution and accomplishing more operations per clock cycle through hardware-level parallelization at an architectural level. In this paper, we propose three novel architectures for the well-known Efficient Graph-Based Image Segmentation algorithm. The proposed implementations optimize time and power consumption when compared to software implementations. The proposed hybrid design delivers at least a 2X speed gain over the other implementations, allowing real-time image segmentation that can be deployed on mobile robotic systems.
Tasks Autonomous Navigation, Image Compression, Semantic Segmentation
Published 2017-10-06
URL http://arxiv.org/abs/1710.02260v1
PDF http://arxiv.org/pdf/1710.02260v1.pdf
PWC https://paperswithcode.com/paper/fpga-based-parallelized-architecture-of
Repo
Framework
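
For reference, a compact software version of the Efficient Graph-Based Image Segmentation algorithm (Felzenszwalb and Huttenlocher) that the paper maps onto FPGA hardware; this is the sequential baseline that the proposed architectures parallelize, not the FPGA design itself, and the single-channel 4-neighbourhood setup and parameter k are simplifications.

```python
import numpy as np

def segment(img, k=300.0):
    """Greedy merging of a 4-connected grid graph with threshold tau(C) = k / |C|."""
    h, w = img.shape
    parent = np.arange(h * w)
    size = np.ones(h * w)
    thresh = np.full(h * w, k)          # Int(C) + tau(C), with Int({v}) = 0

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []                          # weight = absolute intensity difference
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w: edges.append((abs(img[y, x] - img[y, x + 1]), i, i + 1))
            if y + 1 < h: edges.append((abs(img[y, x] - img[y + 1, x]), i, i + w))

    for wgt, a, b in sorted(edges):     # merge in order of increasing edge weight
        ra, rb = find(a), find(b)
        if ra != rb and wgt <= min(thresh[ra], thresh[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            thresh[ra] = wgt + k / size[ra]
    return np.array([find(i) for i in range(h * w)]).reshape(h, w)

labels = segment(np.random.default_rng(0).integers(0, 255, (32, 32)).astype(float))
print(len(np.unique(labels)), "components")
```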

Debiased distributed learning for sparse partial linear models in high dimensions

Title Debiased distributed learning for sparse partial linear models in high dimensions
Authors Shaogao Lv, Heng Lian
Abstract Although various distributed machine learning schemes have been proposed recently for pure linear models and fully nonparametric models, little attention has been paid to distributed optimization for semi-parametric models with multiple-level structures (e.g. sparsity, linearity and nonlinearity). To address these issues, the current paper proposes a new communication-efficient distributed learning algorithm for partially sparse linear models with an increasing number of features. The proposed method is based on the classical divide-and-conquer strategy for handling big data, and each sub-method, defined on a subsample, consists of a debiased estimation of the double-regularized least squares approach. With the proposed method, we theoretically prove that our global parametric estimator can achieve the optimal parametric rate in our semi-parametric model given an appropriate partition of the total data. Specifically, the choice of data partition relies on the underlying smoothness of the nonparametric component, but it is adaptive to the sparsity parameter. Even in the non-distributed setting, we develop a new and easy-to-follow proof for optimal estimation of the parametric error in the high-dimensional partial linear model. Finally, several simulated experiments are implemented to demonstrate the comparable empirical performance of our debiased technique in the distributed setting.
Tasks Distributed Optimization
Published 2017-08-18
URL https://arxiv.org/abs/1708.05487v2
PDF https://arxiv.org/pdf/1708.05487v2.pdf
PWC https://paperswithcode.com/paper/a-debiased-distributed-estimation-for-sparse
Repo
Framework
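
A minimal sketch of the divide-and-conquer idea in the abstract: split the sample across machines, fit a regularized estimator on each subsample, apply a one-step debiasing correction, and average the debiased local estimates. The debiasing matrix here is a crude ridge-regularized inverse of the local Gram matrix, a simplification of the nodewise-lasso construction typically used for debiased lasso estimators; the purely linear model and all constants are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def debiased_local_fit(X, y, lam=0.1, ridge=1e-2):
    """Lasso on a subsample plus a one-step debiasing correction."""
    n, p = X.shape
    beta = Lasso(alpha=lam).fit(X, y).coef_
    theta = np.linalg.inv(X.T @ X / n + ridge * np.eye(p))    # stand-in for nodewise lasso
    return beta + theta @ X.T @ (y - X @ beta) / n            # debiased local estimate

rng = np.random.default_rng(0)
beta_true = np.r_[np.ones(3), np.zeros(17)]
X = rng.normal(size=(2000, 20))
y = X @ beta_true + 0.5 * rng.normal(size=2000)
parts = np.array_split(np.arange(2000), 4)                    # 4 "machines"
beta_hat = np.mean([debiased_local_fit(X[idx], y[idx]) for idx in parts], axis=0)
print(np.round(beta_hat[:5], 2))
```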

Web-STAR: Towards a Visual Web-Based IDE for a Story Comprehension System

Title Web-STAR: Towards a Visual Web-Based IDE for a Story Comprehension System
Authors Christos Rodosthenous, Loizos Michael
Abstract In this work, we present Web-STAR, an online platform for story understanding built on top of the STAR (STory comprehension through ARgumentation) reasoning engine. This platform includes a web-based IDE, integration with the STAR system and a web service infrastructure to support integration with other systems that rely on story understanding functionality to complete their tasks. The platform also delivers a number of “social” features, like public story sharing with a built-in commenting system, a public repository for sharing stories with the community and collaboration tools that can be used both by project team members for development and by educators for teaching. Moreover, we discuss the ongoing work on adding new features and functionality to this platform.
Tasks
Published 2017-06-20
URL http://arxiv.org/abs/1706.06954v1
PDF http://arxiv.org/pdf/1706.06954v1.pdf
PWC https://paperswithcode.com/paper/web-star-towards-a-visual-web-based-ide-for-a
Repo
Framework

English Conversational Telephone Speech Recognition by Humans and Machines

Title English Conversational Telephone Speech Recognition by Humans and Machines
Authors George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall
Abstract One of the most difficult speech recognition tasks is accurate recognition of human to human communication. Advances in deep learning over the last few years have produced major speech recognition improvements on the representative Switchboard conversational corpus. Word error rates that just a few years ago were 14% have dropped to 8.0%, then 6.6% and most recently 5.8%, and are now believed to be within striking range of human performance. This then raises two issues - what IS human performance, and how far down can we still drive speech recognition error rates? A recent paper by Microsoft suggests that we have already achieved human performance. In trying to verify this statement, we performed an independent set of human performance measurements on two conversational tasks and found that human performance may be considerably better than what was earlier reported, giving the community a significantly harder goal to achieve. We also report on our own efforts in this area, presenting a set of acoustic and language modeling techniques that lowered the word error rate of our own English conversational telephone LVCSR system to the level of 5.5%/10.3% on the Switchboard/CallHome subsets of the Hub5 2000 evaluation, which - at least at the writing of this paper - is a new performance milestone (albeit not at what we measure to be human performance!). On the acoustic side, we use a score fusion of three models: one LSTM with multiple feature inputs, a second LSTM trained with speaker-adversarial multi-task learning and a third residual net (ResNet) with 25 convolutional layers and time-dilated convolutions. On the language modeling side, we use word and character LSTMs and convolutional WaveNet-style language models.
Tasks Language Modelling, Large Vocabulary Continuous Speech Recognition, Multi-Task Learning, Speech Recognition
Published 2017-03-06
URL http://arxiv.org/abs/1703.02136v1
PDF http://arxiv.org/pdf/1703.02136v1.pdf
PWC https://paperswithcode.com/paper/english-conversational-telephone-speech
Repo
Framework
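
The abstract mentions score fusion of three acoustic models and a combination of several language models. This tiny sketch shows the generic recipe only: combine per-frame log-probabilities with tuned weights. The weights and model outputs are placeholders, not the paper's models or values.

```python
import numpy as np

def fuse_log_scores(score_list, weights):
    """Weighted log-linear fusion of model scores (all arrays share one shape)."""
    return sum(w * s for w, s in zip(weights, score_list))

rng = np.random.default_rng(0)
# placeholder per-frame log-posteriors over 40 classes from three hypothetical models
lstm1, lstm2, resnet = (np.log(rng.dirichlet(np.ones(40), size=100)) for _ in range(3))
fused = fuse_log_scores([lstm1, lstm2, resnet], weights=[0.4, 0.3, 0.3])
print(fused.shape, fused[0].argmax())
```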

Denoising Autoencoders for Overgeneralization in Neural Networks

Title Denoising Autoencoders for Overgeneralization in Neural Networks
Authors Giacomo Spigler
Abstract Despite the recent developments that allowed neural networks to achieve impressive performance on a variety of applications, these models are intrinsically affected by the problem of overgeneralization, due to their partitioning of the full input space into the fixed set of target classes used during training. Thus it is possible for novel inputs belonging to categories unknown during training, or even completely unrecognizable to humans, to fool the system into classifying them as one of the known classes, even with a high degree of confidence. Solving this problem may help improve the security of such systems in critical applications, and may further lead to applications in the context of open set recognition and 1-class recognition. This paper presents a novel way to compute a confidence score using denoising autoencoders and shows that such a confidence score can correctly identify the regions of the input space close to the training distribution by approximately identifying its local maxima.
Tasks Denoising, Open Set Learning
Published 2017-09-14
URL http://arxiv.org/abs/1709.04762v3
PDF http://arxiv.org/pdf/1709.04762v3.pdf
PWC https://paperswithcode.com/paper/denoising-autoencoders-for-overgeneralization
Repo
Framework
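
A hedged sketch of the idea in the abstract: use a denoising autoencoder's reconstruction to score how close an input lies to the training distribution, and treat that score as a confidence for rejecting out-of-distribution inputs. The paper derives its score differently; mapping reconstruction error through exp(-err/scale) is an assumption made for this illustration, as is using sklearn's MLPRegressor as a stand-in autoencoder.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=0.5, size=(2000, 10))           # "known class" training data
dae = MLPRegressor(hidden_layer_sizes=(4,), max_iter=2000, random_state=0)
dae.fit(X + 0.3 * rng.normal(size=X.shape), X)                # denoising objective: noisy -> clean

def confidence(x, scale=1.0):
    err = np.mean((dae.predict(x) - x) ** 2, axis=1)          # reconstruction error
    return np.exp(-err / scale)                               # high near the training manifold

in_dist = rng.normal(loc=2.0, scale=0.5, size=(5, 10))
far_away = rng.normal(loc=-4.0, scale=0.5, size=(5, 10))
print(confidence(in_dist).round(2), confidence(far_away).round(2))
```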

Deep metric learning for multi-labelled radiographs

Title Deep metric learning for multi-labelled radiographs
Authors Mauro Annarumma, Giovanni Montana
Abstract Many radiological studies can reveal the presence of several co-existing abnormalities, each one represented by a distinct visual pattern. In this article we address the problem of learning a distance metric for plain radiographs that captures a notion of “radiological similarity”: two chest radiographs are considered to be similar if they share similar abnormalities. Deep convolutional neural networks (DCNs) are used to learn a low-dimensional embedding for the radiographs that is equipped with the desired metric. Two loss functions are proposed to deal with multi-labelled images and potentially noisy labels. We report on a large-scale study involving over 745,000 chest radiographs whose labels were automatically extracted from free-text radiological reports through a natural language processing system. Using 4,500 validated exams, we demonstrate that the methodology performs satisfactorily on clustering and image retrieval tasks. Remarkably, the learned metric separates normal exams from those having radiological abnormalities.
Tasks Image Retrieval, Metric Learning
Published 2017-12-11
URL http://arxiv.org/abs/1712.07682v1
PDF http://arxiv.org/pdf/1712.07682v1.pdf
PWC https://paperswithcode.com/paper/deep-metric-learning-for-multi-labelled
Repo
Framework
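
The abstract proposes loss functions that make embedding distance track the "radiological similarity" of multi-labelled images, but does not spell them out. The sketch below therefore uses one plausible variant as an assumption: penalize the squared gap between pairwise embedding distance and one minus the Jaccard overlap of the label sets.

```python
import torch

def multilabel_metric_loss(emb, labels):
    """emb: (N, d) embeddings; labels: (N, L) binary multi-hot label matrix."""
    d = torch.cdist(emb, emb)                                  # pairwise embedding distances
    inter = labels @ labels.T
    union = labels.sum(1, keepdim=True) + labels.sum(1) - inter
    target = 1.0 - inter / union.clamp(min=1)                  # 1 - Jaccard label overlap
    mask = ~torch.eye(len(emb), dtype=torch.bool)              # ignore self-pairs
    return ((d - target) ** 2)[mask].mean()

emb = torch.randn(8, 32, requires_grad=True)
labels = (torch.rand(8, 5) > 0.7).float()
loss = multilabel_metric_loss(emb, labels)
loss.backward()
print(float(loss))
```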

Simultaneous Multiple Surface Segmentation Using Deep Learning

Title Simultaneous Multiple Surface Segmentation Using Deep Learning
Authors Abhay Shah, Michael Abramoff, Xiaodong Wu
Abstract The task of automatically segmenting 3-D surfaces representing boundaries of objects is important for quantitative analysis of volumetric images, and plays a vital role in biomedical image analysis. Recently, graph-based methods with a global optimization property have been developed and optimized for various medical imaging applications. Despite their widespread use, these methods require human experts to design transformations, image features and surface smoothness priors, and to re-design them for a different tissue, organ or imaging modality. Here, we propose a Deep Learning based approach for segmentation of the surfaces in volumetric medical images, by learning the essential features and transformations from training data, without any human expert intervention. We employ a regional approach to learn the local surface profiles. The proposed approach was evaluated on simultaneous intraretinal layer segmentation of optical coherence tomography (OCT) images of normal retinas and retinas affected by age-related macular degeneration (AMD). The proposed approach was validated on 40 retina OCT volumes including 20 normal and 20 AMD subjects. The experiments showed statistically significant improvement in accuracy for our approach compared to state-of-the-art graph-based optimal surface segmentation with convex priors (G-OSC). A single Convolutional Neural Network (CNN) was used to learn the surfaces for both normal and diseased images. The mean unsigned surface positioning error obtained by the G-OSC method, 2.31 voxels (95% CI 2.02-2.60 voxels), was improved to 1.27 voxels (95% CI 1.14-1.40 voxels) using our new approach. On average, our approach takes 94.34 s and requires 95.35 MB of memory, which is much faster than the 2837.46 s and 6.87 GB of memory required by the G-OSC method on the same computer system.
Tasks
Published 2017-05-19
URL http://arxiv.org/abs/1705.07142v1
PDF http://arxiv.org/pdf/1705.07142v1.pdf
PWC https://paperswithcode.com/paper/simultaneous-multiple-surface-segmentation
Repo
Framework
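
A hedged sketch of the kind of regional regression the abstract describes: a small CNN takes an OCT image patch and regresses, for every image column, the row position of each surface to be segmented. The layer sizes, patch geometry and number of surfaces are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class SurfaceRegressor(nn.Module):
    def __init__(self, n_surfaces=3, patch_h=64, patch_w=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        flat = 32 * (patch_h // 4) * (patch_w // 4)
        self.head = nn.Linear(flat, n_surfaces * patch_w)      # one depth per surface per column
        self.n_surfaces, self.patch_w = n_surfaces, patch_w

    def forward(self, x):                                      # x: (B, 1, H, W) intensity patch
        z = self.features(x).flatten(1)
        return self.head(z).view(-1, self.n_surfaces, self.patch_w)

model = SurfaceRegressor()
print(model(torch.randn(2, 1, 64, 32)).shape)                  # -> (2, 3, 32) surface positions
```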

Automated Audio Captioning with Recurrent Neural Networks

Title Automated Audio Captioning with Recurrent Neural Networks
Authors Konstantinos Drossos, Sharath Adavanne, Tuomas Virtanen
Abstract We present the first approach to automated audio captioning. We employ an encoder-decoder scheme with an alignment model in between. The input to the encoder is a sequence of log mel-band energies calculated from an audio file, while the output is a sequence of words, i.e. a caption. The encoder is a multi-layered, bi-directional gated recurrent unit (GRU) and the decoder a multi-layered GRU with a classification layer connected to the last GRU of the decoder. The classification layer and the alignment model are fully connected layers with shared weights between timesteps. The proposed method is evaluated using data drawn from a commercial sound effects library, ProSound Effects. The resulting captions were rated through metrics utilized in machine translation and image captioning fields. Results from metrics show that the proposed method can predict words appearing in the original caption, but not always correctly ordered.
Tasks Image Captioning, Machine Translation
Published 2017-06-30
URL http://arxiv.org/abs/1706.10006v2
PDF http://arxiv.org/pdf/1706.10006v2.pdf
PWC https://paperswithcode.com/paper/automated-audio-captioning-with-recurrent
Repo
Framework
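
A minimal sketch of the architecture described in the abstract: a bi-directional GRU encoder over log mel-band energies, an additive alignment (attention) model, and a GRU decoder with a word classification layer shared across timesteps. Dimensions and the single-layer depth are simplifications of the multi-layered networks the paper uses.

```python
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    def __init__(self, n_mels=64, hid=128, vocab=1000):
        super().__init__()
        self.encoder = nn.GRU(n_mels, hid, batch_first=True, bidirectional=True)
        self.align = nn.Linear(2 * hid + hid, 1)               # additive alignment model
        self.decoder = nn.GRUCell(2 * hid, hid)
        self.classify = nn.Linear(hid, vocab)                  # shared across timesteps

    def forward(self, mel, max_words=12):
        enc, _ = self.encoder(mel)                             # (B, T, 2*hid)
        B, T, _ = enc.shape
        h = enc.new_zeros(B, self.decoder.hidden_size)
        words = []
        for _ in range(max_words):
            scores = self.align(torch.cat([enc, h.unsqueeze(1).expand(-1, T, -1)], dim=-1))
            context = (torch.softmax(scores, dim=1) * enc).sum(1)   # attention-weighted summary
            h = self.decoder(context, h)
            words.append(self.classify(h))
        return torch.stack(words, dim=1)                       # (B, max_words, vocab) logits

model = CaptionModel()
print(model(torch.randn(2, 100, 64)).shape)                    # 2 clips, 100 frames -> (2, 12, 1000)
```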

Development and evaluation of a deep learning model for protein-ligand binding affinity prediction

Title Development and evaluation of a deep learning model for protein-ligand binding affinity prediction
Authors Marta M. Stepniewska-Dziubinska, Piotr Zielenkiewicz, Pawel Siedlecki
Abstract Structure based ligand discovery is one of the most successful approaches for augmenting the drug discovery process. Currently, there is a notable shift towards machine learning (ML) methodologies to aid such procedures. Deep learning has recently gained considerable attention as it allows the model to “learn” to extract features that are relevant for the task at hand. We have developed a novel deep neural network estimating the binding affinity of ligand-receptor complexes. The complex is represented with a 3D grid, and the model utilizes a 3D convolution to produce a feature map of this representation, treating the atoms of both proteins and ligands in the same manner. Our network was tested on the CASF “scoring power” benchmark and Astex Diverse Set and outperformed classical scoring functions. The model, together with usage instructions and examples, is available as a git repository at http://gitlab.com/cheminfIBB/pafnucy
Tasks Drug Discovery
Published 2017-12-19
URL http://arxiv.org/abs/1712.07042v2
PDF http://arxiv.org/pdf/1712.07042v2.pdf
PWC https://paperswithcode.com/paper/development-and-evaluation-of-a-deep-learning
Repo
Framework
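
A hedged sketch of the model family the abstract describes: the protein-ligand complex is voxelized onto a 3D grid of atomic feature channels and a 3D convolutional network regresses a single binding-affinity value. Grid size, channel count and layer widths here are illustrative; the released code linked in the abstract is the authoritative version.

```python
import torch
import torch.nn as nn

class AffinityNet(nn.Module):
    def __init__(self, in_channels=19, grid=21):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(in_channels, 32, 5, padding=2), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(32, 64, 5, padding=2), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * (grid // 4) ** 3, 128), nn.ReLU(),
            nn.Linear(128, 1),                                   # predicted binding affinity
        )

    def forward(self, voxels):                                   # (B, C, D, H, W) atom-feature grid
        return self.head(self.conv(voxels))

net = AffinityNet()
print(net(torch.randn(2, 19, 21, 21, 21)).shape)                 # -> (2, 1)
```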

Temporal Context Network for Activity Localization in Videos

Title Temporal Context Network for Activity Localization in Videos
Authors Xiyang Dai, Bharat Singh, Guyue Zhang, Larry S. Davis, Yan Qiu Chen
Abstract We present a Temporal Context Network (TCN) for precise temporal localization of human activities. Similar to the Faster-RCNN architecture, proposals are placed at equal intervals in a video which span multiple temporal scales. We propose a novel representation for ranking these proposals. Since pooling features only inside a segment is not sufficient to predict activity boundaries, we construct a representation which explicitly captures context around a proposal for ranking it. For each temporal segment inside a proposal, features are uniformly sampled at a pair of scales and are input to a temporal convolutional neural network for classification. After ranking proposals, non-maximum suppression is applied and classification is performed to obtain final detections. TCN outperforms state-of-the-art methods on the ActivityNet dataset and the THUMOS14 dataset.
Tasks Temporal Localization
Published 2017-08-08
URL http://arxiv.org/abs/1708.02349v1
PDF http://arxiv.org/pdf/1708.02349v1.pdf
PWC https://paperswithcode.com/paper/temporal-context-network-for-activity
Repo
Framework
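
A sketch of two mechanics from the abstract: placing multi-scale temporal proposals at equal intervals over a video, and temporal non-maximum suppression over ranked proposals. The context-based ranking network itself is omitted; the scores below are placeholders, and the stride and scales are illustrative.

```python
import numpy as np

def make_proposals(num_frames, stride=8, scales=(16, 32, 64, 128)):
    """Centered intervals at every `stride` frames, one per temporal scale."""
    props = []
    for c in range(0, num_frames, stride):
        for s in scales:
            props.append((max(0, c - s // 2), min(num_frames, c + s // 2)))
    return np.array(props, dtype=float)

def temporal_nms(props, scores, iou_thr=0.6):
    """Keep the highest-scoring proposals, dropping overlapping ones."""
    order, keep = np.argsort(-scores), []
    while len(order):
        i = order[0]
        keep.append(i)
        inter = (np.minimum(props[order, 1], props[i, 1])
                 - np.maximum(props[order, 0], props[i, 0])).clip(min=0)
        union = (props[order, 1] - props[order, 0]) + (props[i, 1] - props[i, 0]) - inter
        order = order[inter / union <= iou_thr]
    return keep

props = make_proposals(300)
scores = np.random.default_rng(0).random(len(props))
print(len(props), "proposals ->", len(temporal_nms(props, scores)), "after NMS")
```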

Optimised Maintenance of Datalog Materialisations

Title Optimised Maintenance of Datalog Materialisations
Authors Pan Hu, Boris Motik, Ian Horrocks
Abstract To efficiently answer queries, datalog systems often materialise all consequences of a datalog program, so the materialisation must be updated whenever the input facts change. Several solutions to the materialisation update problem have been proposed. The Delete/Rederive (DRed) and the Backward/Forward (B/F) algorithms solve this problem for general datalog, but both contain steps that evaluate rules ‘backwards’ by matching their heads to a fact and evaluating the partially instantiated rule bodies as queries. We show that this can be a considerable source of overhead even on very small updates. In contrast, the Counting algorithm does not evaluate the rules ‘backwards’, but it can handle only nonrecursive rules. We present two hybrid approaches that combine DRed and B/F with Counting so as to reduce or even eliminate ‘backward’ rule evaluation while still handling arbitrary datalog programs. We show empirically that our hybrid algorithms are usually significantly faster than existing approaches, sometimes by orders of magnitude.
Tasks
Published 2017-11-10
URL http://arxiv.org/abs/1711.03987v2
PDF http://arxiv.org/pdf/1711.03987v2.pdf
PWC https://paperswithcode.com/paper/optimised-maintenance-of-datalog
Repo
Framework
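
A toy illustration of the Counting idea the abstract contrasts with DRed and B/F: keep, for each derived fact, the number of derivations that support it; deleting an input fact only decrements counts, and a derived fact disappears when its count reaches zero. This handles the single nonrecursive rule path(x, z) :- edge(x, y), edge(y, z) and is not the paper's hybrid algorithm.

```python
from collections import Counter

def derive_paths(edges):
    """Materialise path/2 with one count per derivation."""
    counts = Counter()
    for (a, b) in edges:
        for (c, d) in edges:
            if b == c:
                counts[(a, d)] += 1
    return counts

def delete_edge(edges, counts, e):
    """Counting maintenance: decrement every derivation that used the removed edge."""
    edges.discard(e)
    for f in edges:
        if e[1] == f[0]: counts[(e[0], f[1])] -= 1      # e as the first atom
        if f[1] == e[0]: counts[(f[0], e[1])] -= 1      # e as the second atom
    if e[1] == e[0]: counts[(e[0], e[1])] -= 1          # the self-join derivation (e, e)
    return +counts                                       # drop facts whose count reached zero

edges = {("a", "b"), ("b", "c"), ("b", "d"), ("c", "d")}
counts = derive_paths(edges)
counts = delete_edge(edges, counts, ("b", "c"))
print(dict(counts))    # path(a, c) and path(b, d) gone; path(a, d) remains
```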

An IoT Real-Time Biometric Authentication System Based on ECG Fiducial Extracted Features Using Discrete Cosine Transform

Title An IoT Real-Time Biometric Authentication System Based on ECG Fiducial Extracted Features Using Discrete Cosine Transform
Authors Ahmed F. Hussein, Abbas K. AlZubaidi, Ali Al-Bayaty, Qais A. Habash
Abstract Conventional authentication technologies, like RFID tags and authentication cards/badges, suffer from various weaknesses, so a biometric method of authentication should be used instead. Biometrics, such as fingerprints, voices, and ECG signals, are unique human characteristics that can be used for authentication. In this work, we present an IoT real-time authentication system that uses extracted ECG features to identify unknown persons. The Discrete Cosine Transform (DCT) is used for ECG feature extraction, as it has favorable characteristics for real-time system implementations. A substantial number of studies report high authentication accuracy, but most of them ignore the real-time capability of authenticating individuals. With an accuracy rate of 97.78% at around 1.21 seconds of processing time, the proposed system is well suited for applications that require fast and reliable authentication.
Tasks
Published 2017-08-28
URL http://arxiv.org/abs/1708.08189v1
PDF http://arxiv.org/pdf/1708.08189v1.pdf
PWC https://paperswithcode.com/paper/an-iot-real-time-biometric-authentication
Repo
Framework
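
A sketch of the feature pipeline the abstract outlines: take a fiducial ECG segment (for example, a window around the R peak), keep the first few DCT coefficients as a compact feature vector, and authenticate by comparing against an enrolled template. The number of coefficients, the toy waveform and the distance threshold are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dct

def dct_features(segment, n_coeffs=20):
    """Amplitude-normalise a beat segment and keep its low-order DCT coefficients."""
    segment = (segment - segment.mean()) / (segment.std() + 1e-8)
    return dct(segment, norm="ortho")[:n_coeffs]

def authenticate(probe_segment, template, threshold=2.0):
    """Accept if the probe's DCT features are close to the enrolled template."""
    return np.linalg.norm(dct_features(probe_segment) - template) < threshold

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 200)
enrolled_beat = np.exp(-((t - 0.5) ** 2) / 0.002)          # toy R-peak-like beat
template = dct_features(enrolled_beat)
probe = enrolled_beat + 0.05 * rng.normal(size=t.size)     # same person, noisy reading
print(authenticate(probe, template), authenticate(rng.normal(size=t.size), template))
```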