October 17, 2019


Paper Group ANR 854



High-dimensional ABC

Title High-dimensional ABC
Authors D. J. Nott, V. M. -H. Ong, Y. Fan, S. A. Sisson
Abstract This Chapter, “High-dimensional ABC”, is to appear in the forthcoming Handbook of Approximate Bayesian Computation (2018). It details the main ideas and concepts behind extending ABC methods to higher dimensions, with supporting examples and illustrations.
Tasks
Published 2018-02-27
URL http://arxiv.org/abs/1802.09725v1
PDF http://arxiv.org/pdf/1802.09725v1.pdf
PWC https://paperswithcode.com/paper/high-dimensional-abc
Repo
Framework
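
The chapter concerns scaling ABC to high dimensions, but the base algorithm it extends is plain rejection ABC, which fits in a few lines. Below is a minimal sketch on a toy one-dimensional model (all names, the flat prior, and the tolerance are illustrative, not from the chapter): draw a parameter from the prior, simulate data, and keep the draw when a summary statistic of the simulation lands close to the observed summary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: data are 50 draws from N(theta, 1); summary = sample mean.
observed = rng.normal(2.0, 1.0, size=50)
s_obs = observed.mean()

def simulate(theta, n=50):
    return rng.normal(theta, 1.0, size=n)

# Rejection ABC: sample theta from the prior, keep it when the
# simulated summary falls within eps of the observed summary.
def rejection_abc(s_obs, n_draws=20000, eps=0.1):
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-10, 10)        # flat prior, illustrative
        s_sim = simulate(theta).mean()
        if abs(s_sim - s_obs) < eps:
            accepted.append(theta)
    return np.array(accepted)

post = rejection_abc(s_obs)
# The accepted draws approximate the posterior; their mean should
# land near the true theta = 2.
```

The curse of dimensionality the chapter addresses is visible here: the acceptance region shrinks rapidly as the summary statistic grows in dimension, which is what motivates the regression-adjustment and marginal-adjustment strategies surveyed in the chapter.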

Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters

Title Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters
Authors Ugur Kart, Alan Lukezic, Matej Kristan, Joni-Kristian Kamarainen, Jiri Matas
Abstract Standard RGB-D trackers treat the target as an inherently 2D structure, which makes modelling appearance changes related even to simple out-of-plane rotation highly challenging. We address this limitation by proposing a novel long-term RGB-D tracker - Object Tracking by Reconstruction (OTR). The tracker performs online 3D target reconstruction to facilitate robust learning of a set of view-specific discriminative correlation filters (DCFs). The 3D reconstruction supports two performance-enhancing features: (i) generation of accurate spatial support for constrained DCF learning from its 2D projection and (ii) point cloud based estimation of 3D pose change for selection and storage of view-specific DCFs which are used to robustly localize the target after out-of-view rotation or heavy occlusion. Extensive evaluation of OTR on the challenging Princeton RGB-D tracking and STC Benchmarks shows it outperforms the state-of-the-art by a large margin.
Tasks 3D Reconstruction, Object Tracking
Published 2018-11-27
URL http://arxiv.org/abs/1811.10863v1
PDF http://arxiv.org/pdf/1811.10863v1.pdf
PWC https://paperswithcode.com/paper/object-tracking-by-reconstruction-with-view
Repo
Framework
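
The view-specific filters at the heart of OTR are discriminative correlation filters. As a minimal single-channel sketch (MOSSE-style closed form, not the constrained DCF formulation of the paper; all sizes are illustrative), a filter is learned in the Fourier domain so that correlating it with the target template produces a Gaussian peak at the target centre:

```python
import numpy as np

rng = np.random.default_rng(1)

# Single-channel discriminative correlation filter: learn H so that
# correlating it with the template x yields a Gaussian peak y
# centred on the target. Closed form via ridge regression in the
# Fourier domain.
def train_dcf(x, y, lam=1e-2):
    X, Y = np.fft.fft2(x), np.fft.fft2(y)
    return np.conj(X) * Y / (np.conj(X) * X + lam)

def respond(H, z):
    # Correlation response over a search patch z.
    return np.real(np.fft.ifft2(H * np.fft.fft2(z)))

size = 32
yy, xx = np.mgrid[:size, :size]
y = np.exp(-((yy - size // 2) ** 2 + (xx - size // 2) ** 2) / (2 * 2.0 ** 2))
x = rng.normal(size=(size, size))   # stand-in for target features

H = train_dcf(x, y)
resp = respond(H, x)
peak = np.unravel_index(resp.argmax(), resp.shape)  # near the centre
```

OTR maintains a set of such filters, one per stored 3D view, and uses the point-cloud pose estimate to pick which one to apply after an out-of-plane rotation or occlusion.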

Non-Intrusive Load Monitoring with Fully Convolutional Networks

Title Non-Intrusive Load Monitoring with Fully Convolutional Networks
Authors Cillian Brewitt, Nigel Goddard
Abstract Non-intrusive load monitoring or energy disaggregation involves estimating the power consumption of individual appliances from measurements of the total power consumption of a home. Deep neural networks have been shown to be effective for energy disaggregation. In this work, we present a deep neural network architecture which achieves state of the art disaggregation performance with substantially improved computational efficiency, reducing model training time by a factor of 32 and prediction time by a factor of 43. This improvement in efficiency could be especially useful for applications where disaggregation must be performed in home on lower power devices, or for research experiments which involve training a large number of models.
Tasks Non-Intrusive Load Monitoring
Published 2018-12-10
URL http://arxiv.org/abs/1812.03915v1
PDF http://arxiv.org/pdf/1812.03915v1.pdf
PWC https://paperswithcode.com/paper/non-intrusive-load-monitoring-with-fully
Repo
Framework
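
The efficiency argument rests on the model being fully convolutional: every layer is a convolution, so the network maps an aggregate power sequence to an equally long appliance sequence in one pass, with no per-window recomputation. A toy single-channel sketch of that shape (layer count and kernel sizes are made up, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(2)

def conv1d(x, w, b):
    """'Same'-padded 1-D convolution: x has shape (T,), w shape (k,)."""
    k = len(w)
    xp = np.pad(x, (k // 2, k - 1 - k // 2))
    return np.array([xp[t:t + k] @ w for t in range(len(x))]) + b

def relu(x):
    return np.maximum(x, 0.0)

# Toy fully convolutional seq2seq disaggregator: the aggregate
# signal passes through stacked convolutions and comes out at the
# same length, one estimate per time step for the target appliance.
def fcn_disaggregate(aggregate, weights):
    h = aggregate
    for w, b in weights[:-1]:
        h = relu(conv1d(h, w, b))
    w, b = weights[-1]
    return conv1d(h, w, b)          # linear output layer

T = 128
aggregate = rng.normal(size=T)      # stand-in for mains readings
weights = [(rng.normal(scale=0.1, size=9), 0.0) for _ in range(3)]
appliance_est = fcn_disaggregate(aggregate, weights)
# Fully convolutional: output length equals input length.
```

Weights here are random, so the output is meaningless until trained; the point is the shape-preserving, window-free structure that makes inference cheap on low-power in-home devices.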

Background Subtraction with Real-time Semantic Segmentation

Title Background Subtraction with Real-time Semantic Segmentation
Authors Dongdong Zeng, Xiang Chen, Ming Zhu, Michael Goesele, Arjan Kuijper
Abstract Accurate and fast foreground object extraction is very important for object tracking and recognition in video surveillance. Although many background subtraction (BGS) methods have been proposed in the recent past, it is still regarded as a tough problem due to the variety of challenging situations that occur in real-world scenarios. In this paper, we explore this problem from a new perspective and propose a novel background subtraction framework with real-time semantic segmentation (RTSS). Our proposed framework consists of two components, a traditional BGS segmenter $\mathcal{B}$ and a real-time semantic segmenter $\mathcal{S}$. The BGS segmenter $\mathcal{B}$ aims to construct background models and segment foreground objects. The real-time semantic segmenter $\mathcal{S}$ is used to refine the foreground segmentation outputs as feedback for improving the model-updating accuracy. $\mathcal{B}$ and $\mathcal{S}$ work in parallel on two threads. For each input frame $I_t$, the BGS segmenter $\mathcal{B}$ computes a preliminary foreground/background (FG/BG) mask $B_t$. At the same time, the real-time semantic segmenter $\mathcal{S}$ extracts the object-level semantics ${S}_t$. Then, some specific rules are applied on ${B}_t$ and ${S}_t$ to generate the final detection ${D}_t$. Finally, the refined FG/BG mask ${D}_t$ is fed back to update the background model. Comprehensive experiments evaluated on the CDnet 2014 dataset demonstrate that our proposed method achieves state-of-the-art performance among all unsupervised background subtraction methods while operating in real time, and even performs better than some deep-learning-based supervised algorithms. In addition, our proposed framework is very flexible and has the potential for generalization.
Tasks Object Tracking, Real-Time Semantic Segmentation, Semantic Segmentation
Published 2018-11-25
URL http://arxiv.org/abs/1811.10020v2
PDF http://arxiv.org/pdf/1811.10020v2.pdf
PWC https://paperswithcode.com/paper/background-subtraction-with-real-time
Repo
Framework
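
The "specific rules" that combine $B_t$ and $S_t$ into $D_t$ can be sketched with a simple confidence-gated override (the thresholds and the rule itself are illustrative stand-ins, not the paper's actual rules): where the semantic segmenter is confident, it overrides the BGS mask; elsewhere the BGS decision stands.

```python
import numpy as np

# Illustrative fusion of a BGS mask B_t with semantic foreground
# probabilities S_t. Confident semantics override the BGS output;
# otherwise the BGS decision stands.
def fuse(bgs_mask, sem_prob, lo=0.1, hi=0.9):
    out = bgs_mask.copy()
    out[sem_prob >= hi] = 1   # semantics sure it is foreground
    out[sem_prob <= lo] = 0   # semantics sure it is background
    return out

B = np.array([[1, 0],
              [1, 1]])
S = np.array([[0.05, 0.95],
              [0.50, 0.50]])
D = fuse(B, S)
# D = [[0, 1], [1, 1]]: the two confident semantic cells flipped,
# the two uncertain ones kept the BGS decision.
```

Feeding $D_t$ rather than $B_t$ back into the background model update is what lets the semantics correct model contamination from, e.g., stopped objects or ghosting.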

Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM

Title Twitter Sentiment Analysis via Bi-sense Emoji Embedding and Attention-based LSTM
Authors Yuxiao Chen, Jianbo Yuan, Quanzeng You, Jiebo Luo
Abstract Sentiment analysis on large-scale social media data is important to bridge the gaps between social media contents and real-world activities including political election prediction, individual and public emotional status monitoring and analysis, and so on. Although textual sentiment analysis has been well studied on platforms such as Twitter and Instagram, analysis of the role of extensive emoji use in sentiment analysis remains limited. In this paper, we propose a novel scheme for Twitter sentiment analysis with extra attention on emojis. We first learn bi-sense emoji embeddings from positive and negative sentiment tweets separately, and then train a sentiment classifier by attending on these bi-sense emoji embeddings with an attention-based long short-term memory network (LSTM). Our experiments show that the bi-sense embedding is effective for extracting sentiment-aware embeddings of emojis and outperforms the state-of-the-art models. We also visualize the attention weights to show that the bi-sense emoji embedding provides better guidance on the attention mechanism to obtain a more robust understanding of the semantics and sentiments.
Tasks Sentiment Analysis, Twitter Sentiment Analysis
Published 2018-07-20
URL http://arxiv.org/abs/1807.07961v2
PDF http://arxiv.org/pdf/1807.07961v2.pdf
PWC https://paperswithcode.com/paper/twitter-sentiment-analysis-via-bi-sense-emoji
Repo
Framework
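
The bi-sense idea can be shown in miniature: every emoji keeps two embeddings, one learned from positive tweets and one from negative tweets, and an attention score over the text context decides how to mix them. The sketch below uses random vectors and a plain dot-product score as stand-ins for the learned embeddings and the LSTM-based attention of the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

d = 8
emoji_pos = rng.normal(size=d)   # positive-sense embedding (stand-in)
emoji_neg = rng.normal(size=d)   # negative-sense embedding (stand-in)
context = rng.normal(size=d)     # e.g. an LSTM state over the words

# Attend over the two senses: the context picks how much of each
# sense the final emoji representation carries.
weights = softmax(np.array([emoji_pos @ context, emoji_neg @ context]))
emoji_vec = weights[0] * emoji_pos + weights[1] * emoji_neg
# emoji_vec feeds into the sentiment classifier alongside the words.
```

The payoff is that the same emoji can contribute differently to an ironic tweet than to a sincere one, which a single sense-agnostic embedding cannot express.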

Classification of Dermoscopy Images using Deep Learning

Title Classification of Dermoscopy Images using Deep Learning
Authors Nithin D Reddy
Abstract Skin cancer is one of the most common forms of cancer and its incidence is projected to rise over the next decade. Artificial intelligence is a viable solution to the issue of providing quality care to patients in areas lacking access to trained dermatologists. Considerable progress has been made in the use of automated applications for accurate classification of skin lesions from digital images. In this manuscript, we discuss the design and implementation of a deep learning algorithm for classification of dermoscopy images from the HAM10000 Dataset. We trained a convolutional neural network based on the ResNet50 architecture to accurately classify dermoscopy images of skin lesions into one of seven disease categories. Using our custom model, we obtained a balanced accuracy of 91% on the validation dataset.
Tasks
Published 2018-08-05
URL http://arxiv.org/abs/1808.01607v1
PDF http://arxiv.org/pdf/1808.01607v1.pdf
PWC https://paperswithcode.com/paper/classification-of-dermoscopy-images-using
Repo
Framework
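
The reported metric, balanced accuracy, is worth pinning down since HAM10000 is heavily class-imbalanced: it is the unweighted mean of per-class recall, so rare diagnoses count as much as common ones. A minimal sketch with made-up three-class labels:

```python
import numpy as np

# Balanced accuracy = mean of per-class recalls, so each class
# contributes equally regardless of its frequency.
def balanced_accuracy(y_true, y_pred):
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))

y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 0, 1, 0, 2, 1])
# Per-class recalls: class 0 -> 4/4, class 1 -> 1/2, class 2 -> 1/2.
score = balanced_accuracy(y_true, y_pred)   # 2/3
```

Plain accuracy on the same predictions would be 6/8 = 0.75, higher than the balanced score, which is exactly the gap the metric is designed to expose on skewed medical datasets.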

Using a Game Engine to Simulate Critical Incidents and Data Collection by Autonomous Drones

Title Using a Game Engine to Simulate Critical Incidents and Data Collection by Autonomous Drones
Authors David L. Smyth, Frank G. Glavin, Michael G. Madden
Abstract Using a game engine, we have developed a virtual environment which models important aspects of critical incident scenarios. We focused on modelling phenomena relating to the identification and gathering of key forensic evidence, in order to develop and test a system which can handle chemical, biological, radiological/nuclear or explosive (CBRNe) events autonomously. This allows us to build and validate AI-based technologies, which can be trained and tested in our custom virtual environment before being deployed in real-world scenarios. We have used our virtual scenario to rapidly prototype a system which can use simulated Remote Aerial Vehicles (RAVs) to gather images from the environment for the purpose of mapping. Our environment provides us with an effective medium through which we can develop and test various AI methodologies for critical incident scene assessment, in a safe and controlled manner.
Tasks
Published 2018-08-31
URL http://arxiv.org/abs/1808.10784v2
PDF http://arxiv.org/pdf/1808.10784v2.pdf
PWC https://paperswithcode.com/paper/using-a-game-engine-to-simulate-critical
Repo
Framework

Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training

Title Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training
Authors Liang Luo, Jacob Nelson, Luis Ceze, Amar Phanishayee, Arvind Krishnamurthy
Abstract Distributed deep neural network (DDNN) training constitutes an increasingly important workload that frequently runs in the cloud. Larger DNN models and faster compute engines are shifting DDNN training bottlenecks from computation to communication. This paper characterizes DDNN training to precisely pinpoint these bottlenecks. We found that timely training requires high performance parameter servers (PSs) with optimized network stacks and gradient processing pipelines, as well as server and network hardware with balanced computation and communication resources. We therefore propose PHub, a high performance multi-tenant, rack-scale PS design. PHub co-designs the PS software and hardware to accelerate rack-level and hierarchical cross-rack parameter exchange, with an API compatible with many DDNN training frameworks. PHub provides a performance improvement of up to 2.7x compared to state-of-the-art distributed training techniques for cloud-based ImageNet workloads, with 25% better throughput per dollar.
Tasks
Published 2018-05-21
URL https://arxiv.org/abs/1805.07891v2
PDF https://arxiv.org/pdf/1805.07891v2.pdf
PWC https://paperswithcode.com/paper/parameter-hub-a-rack-scale-parameter-server
Repo
Framework
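
PHub's contribution is making the parameter-server exchange fast at rack scale; the protocol it accelerates is itself simple. A miniature single-process sketch of the push/aggregate/pull cycle (class and method names are illustrative, not PHub's API):

```python
import numpy as np

# Minimal parameter-server exchange: workers push gradients for a
# key, the server aggregates them and takes an SGD step, workers
# pull the updated value. PHub optimizes this pipeline (gradient
# aggregation, network stack, rack topology), not the protocol.
class ParameterServer:
    def __init__(self, params, lr=0.1):
        self.params = {k: v.astype(float) for k, v in params.items()}
        self.lr = lr

    def push(self, key, grads):
        g = np.mean(grads, axis=0)          # aggregate across workers
        self.params[key] -= self.lr * g     # apply the update

    def pull(self, key):
        return self.params[key]

ps = ParameterServer({"w": np.zeros(3)})
worker_grads = [np.array([1.0, 2.0, 3.0]),
                np.array([3.0, 2.0, 1.0])]
ps.push("w", worker_grads)
updated = ps.pull("w")   # [-0.2, -0.2, -0.2]
```

The paper's characterization shows that for large models this push/pull traffic, not the compute, is the bottleneck, which is why co-designing the PS software with rack-level hardware pays off.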

FlashRL: A Reinforcement Learning Platform for Flash Games

Title FlashRL: A Reinforcement Learning Platform for Flash Games
Authors Per-Arne Andersen, Morten Goodwin, Ole-Christoffer Granmo
Abstract Reinforcement Learning (RL) is a research area that has blossomed tremendously in recent years and has shown remarkable potential in, among other things, successfully playing computer games. However, only a few game platforms exist that provide the diversity in tasks and state spaces needed to advance RL algorithms. The existing platforms offer RL access to Atari and a few web-based games, but no platform fully exposes access to Flash games. This is unfortunate because applying RL to Flash games has the potential to push the research of RL algorithms. This paper introduces the Flash Reinforcement Learning platform (FlashRL), which attempts to fill this gap by providing an environment for thousands of Flash games on a novel platform for Flash automation. It opens up easy experimentation with RL algorithms for Flash games, which has previously been challenging. The platform shows excellent performance with as little as 5% CPU utilization on consumer hardware, and it shows promising results for novel reinforcement learning algorithms.
Tasks
Published 2018-01-26
URL http://arxiv.org/abs/1801.08841v1
PDF http://arxiv.org/pdf/1801.08841v1.pdf
PWC https://paperswithcode.com/paper/flashrl-a-reinforcement-learning-platform-for
Repo
Framework

iLCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data

Title iLCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Authors Andreas Niekler, Arnim Bleier, Christian Kahmann, Lisa Posch, Gregor Wiedemann, Kenan Erdogan, Gerhard Heyer, Markus Strohmaier
Abstract The iLCM project pursues the development of an integrated research environment for the analysis of structured and unstructured data in a “Software as a Service” (SaaS) architecture. The research environment addresses requirements for the quantitative evaluation of large amounts of qualitative data with text mining methods as well as requirements for the reproducibility of data-driven research designs in the social sciences. To this end, the iLCM research environment comprises two central components. First, the Leipzig Corpus Miner (LCM), a decentralized SaaS application for the analysis of large amounts of news texts developed in a previous Digital Humanities project. Second, the text mining tools implemented in the LCM are extended by an “Open Research Computing” (ORC) environment for executable script documents, so-called “notebooks”. This novel integration makes it possible to combine generic, high-performance methods for processing large amounts of unstructured text data with individual program scripts that address specific research requirements in computational social science and digital humanities.
Tasks
Published 2018-05-11
URL http://arxiv.org/abs/1805.11404v1
PDF http://arxiv.org/pdf/1805.11404v1.pdf
PWC https://paperswithcode.com/paper/ilcm-a-virtual-research-infrastructure-for
Repo
Framework

Egocentric Basketball Motion Planning from a Single First-Person Image

Title Egocentric Basketball Motion Planning from a Single First-Person Image
Authors Gedas Bertasius, Aaron Chan, Jianbo Shi
Abstract We present a model that uses a single first-person image to generate an egocentric basketball motion sequence in the form of a 12D camera configuration trajectory, which encodes a player’s 3D location and 3D head orientation throughout the sequence. To do this, we first introduce a future convolutional neural network (CNN) that predicts an initial sequence of 12D camera configurations, aiming to capture how real players move during a one-on-one basketball game. We also introduce a goal verifier network, which is trained to verify that a given camera configuration is consistent with the final goals of real one-on-one basketball players. Next, we propose an inverse synthesis procedure to synthesize a refined sequence of 12D camera configurations that (1) sufficiently matches the initial configurations predicted by the future CNN, while (2) maximizing the output of the goal verifier network. Finally, by following the trajectory resulting from the refined camera configuration sequence, we obtain the complete 12D motion sequence. Our model generates realistic basketball motion sequences that capture the goals of real players, outperforming standard deep learning approaches such as recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and generative adversarial networks (GANs).
Tasks Motion Planning
Published 2018-03-04
URL http://arxiv.org/abs/1803.01413v1
PDF http://arxiv.org/pdf/1803.01413v1.pdf
PWC https://paperswithcode.com/paper/egocentric-basketball-motion-planning-from-a
Repo
Framework

Left Ventricle Segmentation and Quantification from Cardiac Cine MR Images via Multi-task Learning

Title Left Ventricle Segmentation and Quantification from Cardiac Cine MR Images via Multi-task Learning
Authors Shusil Dangi, Ziv Yaniv, Cristian A. Linte
Abstract Segmentation of the left ventricle and quantification of various cardiac contractile functions is crucial for the timely diagnosis and treatment of cardiovascular diseases. Traditionally, the two tasks have been tackled independently. Here we propose a convolutional neural network based multi-task learning approach to perform both tasks simultaneously, such that the network learns a better representation of the data with improved generalization performance. Probabilistic formulation of the problem enables learning the task uncertainties during training, which are used to automatically compute the weights for the tasks. We performed a five-fold cross-validation of the myocardium segmentation obtained from the proposed multi-task network on 97 patient 4-dimensional cardiac cine-MRI datasets available through the STACOM LV segmentation challenge against the provided gold-standard myocardium segmentation, obtaining a Dice overlap of $0.849 \pm 0.036$ and mean surface distance of $0.274 \pm 0.083$ mm, while simultaneously estimating the myocardial area with mean absolute difference error of $205\pm198$ mm$^2$.
Tasks Multi-Task Learning
Published 2018-09-26
URL http://arxiv.org/abs/1809.10221v1
PDF http://arxiv.org/pdf/1809.10221v1.pdf
PWC https://paperswithcode.com/paper/left-ventricle-segmentation-and
Repo
Framework
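
The uncertainty-based task weighting can be sketched in a few lines (this follows the common Kendall-style formulation that the abstract's description matches; the loss values below are made up): each task carries a learned log-variance $s_i$, and the combined loss is $\sum_i e^{-s_i} L_i + s_i$, so a task the network finds noisy is automatically down-weighted while the $+s_i$ term stops all weights collapsing to zero.

```python
import numpy as np

# Uncertainty-weighted multi-task loss: exp(-s_i) scales task i's
# loss down as its learned log-variance s_i grows; the +s_i term
# regularizes so the network cannot ignore every task.
def multitask_loss(task_losses, log_vars):
    task_losses = np.asarray(task_losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    return float(np.sum(np.exp(-log_vars) * task_losses + log_vars))

seg_loss, area_loss = 0.8, 2.0        # illustrative values
equal = multitask_loss([seg_loss, area_loss], [0.0, 0.0])   # 2.8
# Raising the second task's log-variance shrinks its influence:
down = multitask_loss([seg_loss, area_loss], [0.0, 1.0])
```

In training, the log-variances are optimized jointly with the network weights, which is what lets the segmentation and area-quantification tasks find their own balance without hand-tuned loss weights.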

AWE: Asymmetric Word Embedding for Textual Entailment

Title AWE: Asymmetric Word Embedding for Textual Entailment
Authors Tengfei Ma, Chiamin Wu, Cao Xiao, Jimeng Sun
Abstract Textual entailment is a fundamental task in natural language processing. It refers to the directional relation between text fragments such that the “premise” can be used to infer the “hypothesis”. In recent years, deep learning methods have achieved great success in this task. Many of them have considered the inter-sentence word-word interactions between the premise-hypothesis pairs; however, few of them considered the “asymmetry” of these interactions. Different from paraphrase identification or sentence similarity evaluation, textual entailment is essentially determining a directional (asymmetric) relation between the premise and the hypothesis. In this paper, we propose a simple but effective way to enhance existing textual entailment algorithms by using asymmetric word embeddings. Experimental results on the SciTail and SNLI datasets show that the learned asymmetric word embeddings can significantly improve word-word interaction based textual entailment models. It is noteworthy that the proposed AWE-DeIsTe model achieves a 2.1% accuracy improvement over the prior state-of-the-art on SciTail.
Tasks Natural Language Inference, Paraphrase Identification, Word Embeddings
Published 2018-09-11
URL http://arxiv.org/abs/1809.04047v2
PDF http://arxiv.org/pdf/1809.04047v2.pdf
PWC https://paperswithcode.com/paper/awe-asymmetric-word-embedding-for-textual
Repo
Framework
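
The asymmetry idea can be shown in miniature: give each word two embeddings, one used when it appears in the premise and one when it appears in the hypothesis, so word-word interaction scores become direction-dependent. The matrices below are random stand-ins for learned embeddings, and the interaction is a plain dot product rather than the paper's full model.

```python
import numpy as np

rng = np.random.default_rng(4)

vocab, d = 10, 6
E_premise = rng.normal(size=(vocab, d))     # premise-side table
E_hypothesis = rng.normal(size=(vocab, d))  # hypothesis-side table

# Word-word interaction matrix between a premise and a hypothesis,
# using the side-specific embedding for each word.
def interaction(premise_ids, hypothesis_ids):
    P = E_premise[premise_ids]          # (m, d)
    H = E_hypothesis[hypothesis_ids]    # (n, d)
    return P @ H.T                      # (m, n) scores

fwd = interaction([1, 2, 3], [4, 5])
bwd = interaction([4, 5], [1, 2, 3])
# With a single shared table these two would be transposes of each
# other; with two tables they are not, which encodes direction.
```

That broken symmetry is precisely what lets the model score "premise entails hypothesis" differently from the reverse, which a symmetric similarity measure cannot.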

FPGA-Based CNN Inference Accelerator Synthesized from Multi-Threaded C Software

Title FPGA-Based CNN Inference Accelerator Synthesized from Multi-Threaded C Software
Authors Jin Hee Kim, Brett Grady, Ruolong Lian, John Brothers, Jason H. Anderson
Abstract A deep-learning inference accelerator is synthesized from a C-language software program parallelized with Pthreads. The software implementation uses the well-known producer/consumer model with parallel threads interconnected by FIFO queues. The LegUp high-level synthesis (HLS) tool synthesizes threads into parallel FPGA hardware, translating software parallelism into spatial parallelism. A complete system is generated where convolution, pooling and padding are realized in the synthesized accelerator, with remaining tasks executing on an embedded ARM processor. The accelerator incorporates reduced precision, and a novel approach for zero-weight-skipping in convolution. On a mid-sized Intel Arria 10 SoC FPGA, peak performance on VGG-16 is 138 effective GOPS.
Tasks
Published 2018-07-27
URL http://arxiv.org/abs/1807.10695v1
PDF http://arxiv.org/pdf/1807.10695v1.pdf
PWC https://paperswithcode.com/paper/fpga-based-cnn-inference-accelerator
Repo
Framework
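
The zero-weight-skipping idea has a simple software analogue (the sketch below illustrates the principle only; the accelerator realizes it in hardware within the convolution datapath): with a pruned kernel, precompute the positions of the non-zero weights once, then perform only those multiply-accumulates for every input.

```python
# Zero-weight skipping: find the non-zero weight indices once
# (offline, since the trained kernel is fixed), then do only those
# multiply-accumulates per input window.
def dot_skip_zeros(weights, activations):
    nz = [i for i, w in enumerate(weights) if w != 0.0]  # offline
    return sum(weights[i] * activations[i] for i in nz)  # per input

w = [0.0, 2.0, 0.0, 0.0, -1.0, 0.0]   # pruned kernel, 4 of 6 zero
x = [5.0, 3.0, 7.0, 1.0, 4.0, 9.0]
result = dot_skip_zeros(w, x)          # 2*3 + (-1)*4 = 2.0
```

For this kernel, two multiply-accumulates replace six; with heavily pruned CNN weights the same ratio translates directly into fewer cycles per convolution on the FPGA.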

Boosting Image Forgery Detection using Resampling Features and Copy-move analysis

Title Boosting Image Forgery Detection using Resampling Features and Copy-move analysis
Authors Tajuddin Manhar Mohammed, Jason Bunk, Lakshmanan Nataraj, Jawadul H. Bappy, Arjuna Flenner, B. S. Manjunath, Shivkumar Chandrasekaran, Amit K. Roy-Chowdhury, Lawrence Peterson
Abstract Realistic image forgeries involve a combination of splicing, resampling, cloning, region removal and other methods. While resampling detection algorithms are effective in detecting splicing and resampling, copy-move detection algorithms excel in detecting cloning and region removal. In this paper, we combine these complementary approaches in a way that boosts the overall accuracy of image manipulation detection. We use the copy-move detection method as a pre-filtering step and pass those images that are classified as untampered to a deep learning based resampling detection framework. Experimental results on various datasets, including the 2017 NIST Nimble Challenge Evaluation dataset comprising nearly 10,000 pristine and tampered images, show a consistent increase of 8-10% in detection rates when the copy-move algorithm is combined with different resampling detection algorithms.
Tasks Image Manipulation Detection
Published 2018-02-09
URL http://arxiv.org/abs/1802.03154v2
PDF http://arxiv.org/pdf/1802.03154v2.pdf
PWC https://paperswithcode.com/paper/boosting-image-forgery-detection-using
Repo
Framework
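
The pre-filtering structure described above is a two-stage cascade and can be sketched directly (the detector functions here are hypothetical stand-ins, not the paper's actual detectors): run the copy-move detector first, and only images it classifies as untampered continue to the resampling detector.

```python
# Two-stage cascade: copy-move detection pre-filters, and only
# images it passes as untampered reach the resampling detector.
def cascade(image, copy_move_detector, resampling_detector):
    if copy_move_detector(image):
        return "tampered (copy-move)"
    if resampling_detector(image):
        return "tampered (resampling)"
    return "untampered"

# Hypothetical stand-in detectors, keyed on fake image attributes.
cm = lambda img: img.get("cloned", False)
rs = lambda img: img.get("resampled", False)

r1 = cascade({"cloned": True}, cm, rs)      # "tampered (copy-move)"
r2 = cascade({"resampled": True}, cm, rs)   # "tampered (resampling)"
r3 = cascade({}, cm, rs)                    # "untampered"
```

Because the two detectors catch largely disjoint manipulation types, chaining them this way raises overall detection without retraining either component.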