Paper Group ANR 793
Deployment of Customized Deep Learning based Video Analytics On Surveillance Cameras
Title | Deployment of Customized Deep Learning based Video Analytics On Surveillance Cameras |
Authors | Pratik Dubal, Rohan Mahadev, Suraj Kothawade, Kunal Dargan, Rishabh Iyer |
Abstract | This paper demonstrates the effectiveness of our customized deep learning based video analytics system in various applications focused on security, safety, customer analytics and process compliance. We describe our video analytics system, comprising Search, Summarize, Statistics and real-time alerting, and outline its building blocks. These building blocks include object detection, tracking, face detection and recognition, and human and face sub-attribute analytics. In each case, we demonstrate how custom models trained using data from the deployment scenarios provide considerably higher accuracy than off-the-shelf models. Towards this end, we describe our data processing and model training pipeline, which can train and fine-tune models from videos with a quick turnaround time. Finally, since most of these models are deployed on-site, it is important to have resource-constrained models which do not require GPUs. We demonstrate how we custom train resource-constrained models and deploy them on embedded devices without significant loss in accuracy. To our knowledge, this is the first work which provides a comprehensive evaluation of different deep learning models on various real-world customer deployment scenarios of surveillance video analytics. By sharing our implementation details and the lessons learned from deploying customized deep learning models for various customers, we hope that customized deep learning based video analytics will be widely incorporated into commercial products around the world. |
Tasks | Face Detection, Object Detection |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10604v2 |
http://arxiv.org/pdf/1805.10604v2.pdf | |
PWC | https://paperswithcode.com/paper/deployment-of-customized-deep-learning-based |
Repo | |
Framework | |
Explaining Away Syntactic Structure in Semantic Document Representations
Title | Explaining Away Syntactic Structure in Semantic Document Representations |
Authors | Erik Holmer, Andreas Marfurt |
Abstract | Most generative document models act on bag-of-words input in an attempt to focus on the semantic content, and thereby partially forego syntactic information. We argue that it is preferable to keep the original word order intact and explicitly account for the syntactic structure instead. We propose an extension to the Neural Variational Document Model (Miao et al., 2016) that does exactly that in order to separate local (syntactic) context from the global (semantic) representation of the document. Our model builds on the variational autoencoder framework to define a generative document model based on next-word prediction. We name our approach the Sequence-Aware Variational Autoencoder since, in contrast to its predecessor, it operates on the true input sequence. In a series of experiments we observe stronger topicality of the learned representations as well as increased robustness to syntactic noise in our training data. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01620v1 |
http://arxiv.org/pdf/1806.01620v1.pdf | |
PWC | https://paperswithcode.com/paper/explaining-away-syntactic-structure-in |
Repo | |
Framework | |
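The core generative step described in the abstract — predicting the next word from a local recurrent state combined with a global document latent — can be sketched in pure Python. This is an illustrative toy, not the paper's architecture: the recurrence, the dimensions, and the concatenation of local state with the latent are assumptions, and a real model would be trained end-to-end in the VAE framework.

```python
import math
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "on", "mat"]
V, D = len(VOCAB), 4  # vocab size, hidden/latent dimension

def randmat(r, c):
    return [[random.uniform(-0.5, 0.5) for _ in range(c)] for _ in range(r)]

W_h = randmat(D, D)      # recurrence on the local (syntactic) state
W_x = randmat(D, V)      # one-hot input word -> hidden
W_o = randmat(V, 2 * D)  # [local state; global latent] -> vocab logits

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(logits):
    mx = max(logits)
    exps = [math.exp(l - mx) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def next_word_probs(z, h, word_idx):
    """One generative step: update the local state from the previous
    word, then predict the next word from [h; z], where z is the
    global (semantic) document representation."""
    x = [0.0] * V
    x[word_idx] = 1.0
    h = [math.tanh(a + b) for a, b in zip(matvec(W_h, h), matvec(W_x, x))]
    probs = softmax(matvec(W_o, h + z))  # concatenate local and global
    return h, probs

z = [random.gauss(0.0, 1.0) for _ in range(D)]  # sampled document latent
h = [0.0] * D
h, probs = next_word_probs(z, h, VOCAB.index("the"))
```

The key design point the abstract argues for is visible even in this toy: syntactic bookkeeping lives in the recurrent state `h`, leaving the latent `z` free to encode document-level semantics.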
Recent Developments from Attribute Profiles for Remote Sensing Image Classification
Title | Recent Developments from Attribute Profiles for Remote Sensing Image Classification |
Authors | Minh-Tan Pham, Sébastien Lefèvre, Erchan Aptoula, Lorenzo Bruzzone |
Abstract | Morphological attribute profiles (APs) are among the most effective methods to model the spatial and contextual information for the analysis of remote sensing images, especially for classification tasks. Since their first introduction to this field in the early 2010s, many studies have contributed not only to exploiting and adapting their use for different applications, but also to extending and improving their performance to better deal with more complex data. In this paper, we revisit and discuss the different developments and extensions of APs which have drawn significant attention from researchers in the past few years. These studies are analyzed and grouped based on the concept of multi-stage AP construction. In our experiments, a comparative study of classification results on two remote sensing datasets is provided in order to show their significant improvement over the originally proposed APs. |
Tasks | Image Classification, Remote Sensing Image Classification |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10036v1 |
http://arxiv.org/pdf/1803.10036v1.pdf | |
PWC | https://paperswithcode.com/paper/recent-developments-from-attribute-profiles |
Repo | |
Framework | |
A Regularized Attention Mechanism for Graph Attention Networks
Title | A Regularized Attention Mechanism for Graph Attention Networks |
Authors | Uday Shankar Shanthamallu, Jayaraman J. Thiagarajan, Andreas Spanias |
Abstract | Machine learning models that can exploit the inherent structure in data have gained prominence. In particular, there is a surge in deep learning solutions for graph-structured data, due to their widespread applicability in several fields. Graph attention networks (GAT), a recent addition to the broad class of feature learning models for graphs, utilize the attention mechanism to efficiently learn continuous vector representations for semi-supervised learning problems. In this paper, we perform a detailed analysis of GAT models and present interesting insights into their behavior. In particular, we show that the models are vulnerable to heterogeneous rogue nodes, and hence propose novel regularization strategies to improve the robustness of GAT models. Using benchmark datasets, we demonstrate performance improvements on semi-supervised learning using the proposed robust variant of GAT. |
Tasks | |
Published | 2018-11-01 |
URL | https://arxiv.org/abs/1811.00181v2 |
https://arxiv.org/pdf/1811.00181v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-robustness-of-attention-models-on |
Repo | |
Framework | |
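As background for the abstract above, a single graph-attention layer in the style of the original GAT can be sketched in pure Python: each node's new representation is an attention-weighted sum of its transformed neighborhood features, with weights computed as a softmax over LeakyReLU-activated scores. This sketches the baseline mechanism being regularized, not the paper's proposed regularizers; the toy graph and weights are made up for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gat_layer(features, adj, W, a):
    """One graph-attention layer: for each node i, attend over its
    neighborhood (including i itself) and aggregate transformed
    features with the resulting attention coefficients."""
    Wh = [matvec(W, h) for h in features]
    out = []
    for i, neighbors in enumerate(adj):
        nbrs = [i] + neighbors  # add self-loop, as in the original GAT
        # unnormalized score: LeakyReLU(a^T [Wh_i || Wh_j])
        e = [leaky_relu(sum(ai * x for ai, x in zip(a, Wh[i] + Wh[j])))
             for j in nbrs]
        mx = max(e)
        exps = [math.exp(v - mx) for v in e]
        s = sum(exps)
        alpha = [v / s for v in exps]  # softmax over the neighborhood
        out.append([sum(al * Wh[j][k] for al, j in zip(alpha, nbrs))
                    for k in range(len(Wh[i]))])
    return out

# 3-node toy graph with edges 0-1 and 1-2
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[1], [0, 2], [1]]
W = [[0.5, -0.3], [0.2, 0.8]]
a = [0.1, -0.2, 0.4, 0.3]
h_out = gat_layer(features, adj, W, a)
```

The vulnerability the paper studies arises exactly here: a rogue neighbor `j` with adversarial features can attract a large `alpha`, so a regularizer constraining the attention distribution is a natural defense.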
IR2VI: Enhanced Night Environmental Perception by Unsupervised Thermal Image Translation
Title | IR2VI: Enhanced Night Environmental Perception by Unsupervised Thermal Image Translation |
Authors | Shuo Liu, Vijay John, Erik Blasch, Zheng Liu, Ying Huang |
Abstract | Context enhancement is critical for night vision (NV) applications, especially for dark nights without any artificial light. In this paper, we present the infrared-to-visual (IR2VI) algorithm, a novel unsupervised thermal-to-visible image translation framework based on generative adversarial networks (GANs). IR2VI is able to learn the intrinsic characteristics of visible (VI) images and integrate them into infrared (IR) images. Since the existing unsupervised GAN-based image translation approaches face several challenges, such as incorrect mapping and lack of fine details, we propose a structure connection module and a region-of-interest (ROI) focal loss method to address the current limitations. Experimental results show the superiority of the IR2VI algorithm over baseline methods. |
Tasks | |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09565v1 |
http://arxiv.org/pdf/1806.09565v1.pdf | |
PWC | https://paperswithcode.com/paper/ir2vi-enhanced-night-environmental-perception |
Repo | |
Framework | |
Classification of simulated radio signals using Wide Residual Networks for use in the search for extra-terrestrial intelligence
Title | Classification of simulated radio signals using Wide Residual Networks for use in the search for extra-terrestrial intelligence |
Authors | G. A. Cox, S. Egly, G. R. Harp, J. Richards, S. Vinodababu, J. Voien |
Abstract | We describe a new approach and algorithm for the detection of artificial signals and their classification in the search for extraterrestrial intelligence (SETI). The characteristics of radio signals observed during SETI research are often most apparent when those signals are represented as spectrograms. Additionally, many observed signals tend to share the same characteristics, allowing for sorting of the signals into different classes. For this work, complex-valued time-series data were simulated to produce a corpus of 140,000 signals from seven different signal classes. A wide residual neural network was then trained to classify these signal types using the gray-scale 2D spectrogram representation of those signals. An average $F_1$ score of 95.11% was attained when tested on previously unobserved simulated signals. We also report on the performance of the model across a range of signal amplitudes. |
Tasks | Time Series |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08624v1 |
http://arxiv.org/pdf/1803.08624v1.pdf | |
PWC | https://paperswithcode.com/paper/classification-of-simulated-radio-signals |
Repo | |
Framework | |
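The preprocessing step named in the abstract — turning a complex-valued time series into a gray-scale 2D spectrogram for the classifier — can be sketched in pure Python with a windowed DFT. The window length, hop size, and toy signal below are illustrative assumptions; the wide residual network that consumes the image is omitted.

```python
import cmath
import math
import random

def spectrogram(signal, win=32, hop=16):
    """Magnitude spectrogram of a complex-valued time series: slide a
    window over the signal and take the magnitude of its DFT. Each row
    is one time frame; columns are frequency bins."""
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        seg = signal[start:start + win]
        row = []
        for k in range(win):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win)
                      for n in range(win))
            row.append(abs(acc))
        frames.append(row)
    return frames

def to_grayscale(spec):
    """Rescale magnitudes to [0, 255] to form an image-like 2D input."""
    flat = [v for row in spec for v in row]
    lo, hi = min(flat), max(flat)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in row] for row in spec]

random.seed(1)
# toy narrow-band tone in complex Gaussian noise, loosely analogous
# to one simulated SETI signal class
sig = [cmath.exp(2j * math.pi * 0.25 * n) +
       complex(random.gauss(0, 0.3), random.gauss(0, 0.3))
       for n in range(128)]
img = to_grayscale(spectrogram(sig))
```

In the resulting image the tone shows up as a bright column at the bin nearest normalized frequency 0.25, which is the kind of visual structure the abstract says makes spectrograms a natural representation for these signals.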
Visual-Quality-Driven Learning for Underwater Vision Enhancement
Title | Visual-Quality-Driven Learning for Underwater Vision Enhancement |
Authors | Walysson Vital Barbosa, Henrique Grandinetti Barbosa Amaral, Thiago Lages Rocha, Erickson Rangel Nascimento |
Abstract | The image processing community has witnessed remarkable advances in enhancing and restoring images. Nevertheless, restoring the visual quality of underwater images remains a great challenge. End-to-end frameworks might fail to enhance the visual quality of underwater images since in several scenarios it is not feasible to provide the ground truth of the scene radiance. In this work, we propose a CNN-based approach that does not require ground-truth data, since it uses a set of image quality metrics to guide the restoration learning process. The experiments showed that our method improved the visual quality of underwater images, preserving their edges, and also performed well on the UCIQE metric. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04624v1 |
http://arxiv.org/pdf/1809.04624v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-quality-driven-learning-for-underwater |
Repo | |
Framework | |
Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation
Title | Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation |
Authors | Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Stephan Vogel |
Abstract | We address the problem of simultaneous translation by modifying the Neural MT decoder to operate with a dynamically built encoder and attention. We propose a tunable agent which decides the best segmentation strategy for a user-defined BLEU loss and Average Proportion (AP) constraint. Our agent outperforms the previously proposed Wait-if-diff and Wait-if-worse agents (Cho and Esipova, 2016) on BLEU with lower latency. Secondly, we propose data-driven changes to Neural MT training to better match the incremental decoding framework. |
Tasks | Machine Translation |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03661v1 |
http://arxiv.org/pdf/1806.03661v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-decoding-and-training-methods-for |
Repo | |
Framework | |
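The READ/WRITE loop that underlies the agents mentioned in the abstract can be sketched in pure Python. The confidence-threshold policy and the stub "copy" decoder below are illustrative assumptions standing in for the NMT model; they are not the paper's tunable agent nor the exact Wait-if-diff/Wait-if-worse rules.

```python
def simulate(source, best_next, threshold=0.6):
    """Toy simultaneous-decoding loop: READ one more source token while
    the decoder's confidence in its next target word is below a
    threshold, otherwise WRITE (commit) the word. `best_next` returns
    (word, probability) given the source read so far and the output."""
    read, output, actions = 0, [], []
    while True:
        word, prob = best_next(source[:read], output)
        if word is None:  # decoder signals end of sentence
            break
        if prob >= threshold or read == len(source):
            output.append(word)
            actions.append("WRITE")
        else:
            read += 1
            actions.append("READ")
    return output, actions

def copy_model(src_prefix, out):
    """Stub decoder that copies the source: it is confident about word
    i only once at least i+2 source tokens have been read."""
    i = len(out)
    if i >= 3:  # target length fixed at 3 for the demo
        return None, 1.0
    if i < len(src_prefix):
        conf = 0.9 if len(src_prefix) > i + 1 else 0.4
        return src_prefix[i], conf
    return "?", 0.0

out, acts = simulate(["a", "b", "c"], copy_model)
# out == ["a", "b", "c"]; the agent interleaves READs and WRITEs
```

Latency metrics such as Average Proportion are computed from exactly this action sequence: the more READs issued before each WRITE, the higher the latency the agent incurs.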
Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge
Title | Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge |
Authors | Guangli Li, Lei Liu, Xueying Wang, Xiao Dong, Peng Zhao, Xiaobing Feng |
Abstract | Recently, deep neural networks (DNNs) have been widely applied in mobile intelligent applications. Inference for DNNs is usually performed in the cloud; however, this leads to a large overhead of transmitting data via wireless networks. In this paper, we demonstrate the advantages of cloud-edge collaborative inference with quantization. By analyzing the characteristics of the layers in DNNs, an auto-tuning neural network quantization framework for collaborative inference is proposed. We study the effectiveness of mixed-precision collaborative inference of state-of-the-art DNNs using the ImageNet dataset. The experimental results show that our framework can generate reasonable network partitions and reduce the storage on mobile devices with trivial loss of accuracy. |
Tasks | Quantization |
Published | 2018-12-16 |
URL | http://arxiv.org/abs/1812.06426v1 |
http://arxiv.org/pdf/1812.06426v1.pdf | |
PWC | https://paperswithcode.com/paper/auto-tuning-neural-network-quantization |
Repo | |
Framework | |
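The basic operation behind such a framework — quantizing an intermediate activation tensor before sending it from the edge to the cloud — can be sketched with standard uniform affine quantization. This illustrates the general technique only; the paper's auto-tuning of per-layer precision and partition points is not reproduced here.

```python
def quantize(values, bits):
    """Uniform affine quantization of a float activation map to `bits`
    bits: map the observed [min, max] range onto integer levels."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float values on the receiving side."""
    return [lo + qi * scale for qi in q]

# toy activation map at the chosen partition layer
acts = [0.12, -1.5, 3.3, 0.0, 2.7, -0.4]
q, scale, lo = quantize(acts, 8)
rec = dequantize(q, scale, lo)
# an 8-bit transfer is 4x smaller than float32, and the rounding
# error is bounded by half a quantization step
max_err = max(abs(a - r) for a, r in zip(acts, rec))
```

The mixed-precision question the abstract raises is then which `bits` each layer can tolerate: early layers with wide dynamic range may need more levels, while later layers often survive aggressive quantization.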
Detecting egregious responses in neural sequence-to-sequence models
Title | Detecting egregious responses in neural sequence-to-sequence models |
Authors | Tianxing He, James Glass |
Abstract | In this work, we attempt to answer a critical question: whether there exists some input sequence that will cause a well-trained discrete-space neural network sequence-to-sequence (seq2seq) model to generate egregious outputs (aggressive, malicious, attacking, etc.), and, if such inputs exist, how to find them efficiently. We adopt an empirical methodology, in which we first create lists of egregious output sequences, and then design a discrete optimization algorithm to find input sequences that will cause the model to generate them. Moreover, the optimization algorithm is enhanced for large-vocabulary search and constrained to search for input sequences that are likely to be entered by real-world users. In our experiments, we apply this approach to dialogue response generation models trained on three real-world dialogue datasets: Ubuntu, Switchboard and OpenSubtitles, testing whether the model can generate malicious responses. We demonstrate that given the trigger inputs our algorithm finds, a significant number of malicious sentences are assigned large probability by the model, which reveals an undesirable consequence of standard seq2seq training. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.04113v2 |
http://arxiv.org/pdf/1809.04113v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-egregious-responses-in-neural |
Repo | |
Framework | |
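The general shape of such a search — discrete optimization over input tokens to maximize the model's probability of a target output — can be sketched as simple hill climbing. The scoring function below is a stub standing in for a real seq2seq model's target-sequence probability; the paper's algorithm is a more sophisticated optimization with vocabulary and plausibility constraints, which this toy does not reproduce.

```python
import random

random.seed(0)

def greedy_trigger_search(score, vocab, length=4, iters=50):
    """Toy discrete hill climbing in input space: repeatedly propose a
    one-token substitution and keep it if it raises `score(input)`,
    i.e. the (stubbed) probability of the target egregious output."""
    seq = [random.choice(vocab) for _ in range(length)]
    best = score(seq)
    for _ in range(iters):
        pos = random.randrange(length)
        cand = seq[:]
        cand[pos] = random.choice(vocab)
        s = score(cand)
        if s > best:
            seq, best = cand, s
    return seq, best

# stub scorer: a hypothetical model whose target-output probability
# grows with the number of "trigger" tokens in the input
VOCAB = ["hi", "ok", "trigger", "why", "no"]
score = lambda seq: seq.count("trigger") / len(seq)
seq, best = greedy_trigger_search(score, VOCAB)
```

With a real model, `score` would be the log-probability the decoder assigns to a listed egregious response, which is exactly the quantity the abstract reports as alarmingly large for the discovered triggers.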
Salient Object Detection via High-to-Low Hierarchical Context Aggregation
Title | Salient Object Detection via High-to-Low Hierarchical Context Aggregation |
Authors | Yun Liu, Yu Qiu, Le Zhang, JiaWang Bian, Guang-Yu Nie, Ming-Ming Cheng |
Abstract | Recent progress on salient object detection mainly aims at exploiting how to effectively integrate convolutional side-output features in convolutional neural networks (CNNs). Based on this, most existing state-of-the-art saliency detectors design complex network structures to fuse the side-output features of the backbone feature extraction networks. However, should the fusion strategies become ever more complex for accurate salient object detection? In this paper, we observe that the contexts of a natural image can be well expressed by a high-to-low self-learning of side-output convolutional features. The contexts of an image usually refer to its global structures, and the top layers of a CNN usually learn to convey global information, while it is difficult for the intermediate side-output features to express contextual information. Hence, we design an hourglass network with intermediate supervision to learn contextual features in a high-to-low manner. The learned hierarchical contexts are aggregated to generate a hybrid contextual expression for an input image. Finally, the hybrid contextual features can be used for accurate saliency estimation. We extensively evaluate our method on six challenging saliency datasets, and our simple method achieves state-of-the-art performance under various evaluation metrics. Code will be released upon paper acceptance. |
Tasks | Object Detection, Saliency Prediction, Salient Object Detection |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.10956v2 |
http://arxiv.org/pdf/1812.10956v2.pdf | |
PWC | https://paperswithcode.com/paper/salient-object-detection-via-high-to-low |
Repo | |
Framework | |
Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection
Title | Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection |
Authors | Amir Soleimani, Nasser M. Nasrabadi |
Abstract | The low resolution of objects of interest in aerial images makes pedestrian detection and action detection extremely challenging tasks. Furthermore, using deep convolutional neural networks to process large images can be demanding in terms of computational requirements. In order to alleviate these challenges, we propose a two-step, yes-and-no question-answering framework to find specific individuals doing one or multiple specific actions in aerial images. First, a deep object detector, the Single Shot Multibox Detector (SSD), is used to generate object proposals from small aerial images. Second, another deep network is used to learn a latent common sub-space which associates the high-resolution aerial imagery with the pedestrian action labels that are provided by human-based sources. |
Tasks | Action Detection, Pedestrian Detection, Question Answering |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05983v1 |
http://arxiv.org/pdf/1807.05983v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-for-aerial-1 |
Repo | |
Framework | |
How to improve the interpretability of kernel learning
Title | How to improve the interpretability of kernel learning |
Authors | Jinwei Zhao, Qizhou Wang, Yufei Wang, Yu Liu, Zhenghao Shi, Xinhong Hei |
Abstract | In recent years, machine learning researchers have focused on methods to construct flexible and interpretable prediction models. However, how to evaluate interpretability, how it relates to generalization performance, and how to improve it all still have to be addressed. In this paper, a quantitative index of interpretability is proposed and its rationality is proved, and the equilibrium problem between interpretability and generalization performance is analyzed. A probabilistic upper bound on the sum of the two performances is also analyzed. For the traditional supervised kernel machine learning problem, a universal learning framework is put forward to solve the equilibrium problem between the two performances, and the condition for a globally optimal solution based on the framework is deduced. The learning framework is applied to the least-squares support vector machine and is evaluated in several experiments. |
Tasks | |
Published | 2018-11-21 |
URL | https://arxiv.org/abs/1811.10469v2 |
https://arxiv.org/pdf/1811.10469v2.pdf | |
PWC | https://paperswithcode.com/paper/how-to-improve-the-interpretability-of-kernel |
Repo | |
Framework | |
DCDistance: A Supervised Text Document Feature extraction based on class labels
Title | DCDistance: A Supervised Text Document Feature extraction based on class labels |
Authors | Charles Henrique Porto Ferreira, Debora Maria Rossi de Medeiros, Fabricio Olivetti de França |
Abstract | Text mining is a field that aims at extracting information from textual data. One of the challenges of this field comes from the pre-processing stage, in which a vector (and structured) representation must be extracted from unstructured data. The common extraction creates large and sparse vectors representing the importance of each term to a document, which usually leads to the curse of dimensionality that plagues most machine learning algorithms. To cope with this issue, in this paper we propose a new supervised feature extraction and reduction algorithm, named DCDistance, that creates features based on the distance between a document and a representative of each class label. As such, the proposed technique can reduce the feature set by more than 99% of its original size. Additionally, this algorithm was also capable of improving classification accuracy over a set of benchmark datasets when compared to traditional and state-of-the-art feature selection algorithms. |
Tasks | |
Published | 2018-01-14 |
URL | http://arxiv.org/abs/1801.04554v1 |
http://arxiv.org/pdf/1801.04554v1.pdf | |
PWC | https://paperswithcode.com/paper/dcdistance-a-supervised-text-document-feature |
Repo | |
Framework | |
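The idea in the abstract — replacing a high-dimensional document vector with its distances to one representative per class — is compact enough to sketch directly. The choice of centroid as the class representative and of Euclidean distance is an assumption for illustration; the paper may define both differently.

```python
def class_centroids(X, y):
    """One representative (mean vector) per class label."""
    sums, counts = {}, {}
    for vec, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def dcdistance_features(X, centroids):
    """Map each document vector to its Euclidean distance from each
    class representative: the feature count drops from the vocabulary
    size to the number of classes."""
    labels = sorted(centroids)
    return [[sum((a - b) ** 2 for a, b in zip(vec, centroids[c])) ** 0.5
             for c in labels] for vec in X]

# toy bag-of-words vectors (3 terms) for two classes
X = [[3.0, 0.0, 1.0], [2.0, 1.0, 0.0], [0.0, 4.0, 2.0], [1.0, 3.0, 3.0]]
y = ["sports", "sports", "politics", "politics"]
cent = class_centroids(X, y)
feats = dcdistance_features(X, cent)
```

With a realistic vocabulary of tens of thousands of terms and a handful of classes, this mapping yields the >99% reduction the abstract reports, and any standard classifier can then be trained on the distance features.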
An Intelligent Safety System for Human-Centered Semi-Autonomous Vehicles
Title | An Intelligent Safety System for Human-Centered Semi-Autonomous Vehicles |
Authors | Hadi Abdi Khojasteh, Alireza Abbas Alipour, Ebrahim Ansari, Parvin Razzaghi |
Abstract | Nowadays, automobile manufacturers strive to make cars fully safe. Monitoring a driver's actions with computer vision techniques to detect driving mistakes in real time, and then planning autonomous maneuvers to avoid vehicle collisions, is one of the most important issues investigated in machine vision and Intelligent Transportation Systems (ITS). The main goal of this study is to prevent accidents caused by fatigue, drowsiness, and driver distraction. To avoid these incidents, this paper proposes an integrated safety system that continuously monitors the driver's attention and the vehicle's surroundings, and finally decides whether the actual steering control status is safe or not. For this purpose, we equipped an ordinary car called FARAZ with a vision system consisting of four mounted cameras, along with a universal car tool for communicating with the surrounding factory-installed sensors and other car systems and for sending commands to actuators. The proposed system leverages a scene understanding pipeline using deep convolutional encoder-decoder networks and a driver state detection pipeline. We have identified and assessed domestic capabilities for developing these technologies for ordinary vehicles, with the aim of manufacturing smart cars and providing an intelligent system that increases safety and assists the driver in various conditions and situations. |
Tasks | Autonomous Driving, Autonomous Vehicles, Scene Understanding, Steering Control |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03953v2 |
http://arxiv.org/pdf/1812.03953v2.pdf | |
PWC | https://paperswithcode.com/paper/an-intelligent-safety-system-for-human |
Repo | |
Framework | |