Paper Group ANR 793
Deployment of Customized Deep Learning based Video Analytics On Surveillance Cameras
Title | Deployment of Customized Deep Learning based Video Analytics On Surveillance Cameras |
Authors | Pratik Dubal, Rohan Mahadev, Suraj Kothawade, Kunal Dargan, Rishabh Iyer |
Abstract | This paper demonstrates the effectiveness of our customized deep learning based video analytics system in various applications focused on security, safety, customer analytics and process compliance. We describe our video analytics system, comprising Search, Summarize, Statistics and real-time alerting, and outline its building blocks. These building blocks include object detection, tracking, face detection and recognition, and human and face sub-attribute analytics. In each case, we demonstrate how custom models trained using data from the deployment scenarios provide considerably higher accuracy than off-the-shelf models. Towards this end, we describe our data processing and model training pipeline, which can train and fine-tune models from videos with a quick turnaround time. Finally, since most of these models are deployed on-site, it is important to have resource-constrained models which do not require GPUs. We demonstrate how we custom train resource-constrained models and deploy them on embedded devices without significant loss in accuracy. To our knowledge, this is the first work which provides a comprehensive evaluation of different deep learning models on various real-world customer deployment scenarios of surveillance video analytics. By sharing our implementation details and the lessons learned from deploying customized deep learning models for various customers, we hope that customized deep learning based video analytics will be widely incorporated into commercial products around the world. |
Tasks | Face Detection, Object Detection |
Published | 2018-05-27 |
URL | http://arxiv.org/abs/1805.10604v2 |
http://arxiv.org/pdf/1805.10604v2.pdf | |
PWC | https://paperswithcode.com/paper/deployment-of-customized-deep-learning-based |
Repo | |
Framework | |
Explaining Away Syntactic Structure in Semantic Document Representations
Title | Explaining Away Syntactic Structure in Semantic Document Representations |
Authors | Erik Holmer, Andreas Marfurt |
Abstract | Most generative document models act on bag-of-words input in an attempt to focus on the semantic content, and thereby partially forego syntactic information. We argue that it is preferable to keep the original word order intact and explicitly account for the syntactic structure instead. We propose an extension to the Neural Variational Document Model (Miao et al., 2016) that does exactly that in order to separate local (syntactic) context from the global (semantic) representation of the document. Our model builds on the variational autoencoder framework to define a generative document model based on next-word prediction. We name our approach the Sequence-Aware Variational Autoencoder since, in contrast to its predecessor, it operates on the true input sequence. In a series of experiments we observe stronger topicality of the learned representations as well as increased robustness to syntactic noise in our training data. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01620v1 |
http://arxiv.org/pdf/1806.01620v1.pdf | |
PWC | https://paperswithcode.com/paper/explaining-away-syntactic-structure-in |
Repo | |
Framework | |
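The core generative step described in the abstract — predicting the next word from a local recurrent state combined with a global document latent — can be sketched in pure Python. This is an illustrative toy, not the paper's architecture: the recurrence, the dimensions, and the concatenation of local state with the latent are assumptions, and a real model would be trained end-to-end in the VAE framework.

```python
import math
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "on", "mat"]
V, D = len(VOCAB), 4  # vocab size, hidden/latent dimension

def randmat(r, c):
    return [[random.uniform(-0.5, 0.5) for _ in range(c)] for _ in range(r)]

W_h = randmat(D, D)      # recurrence on the local (syntactic) state
W_x = randmat(D, V)      # one-hot input word -> hidden
W_o = randmat(V, 2 * D)  # [local state; global latent] -> vocab logits

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(logits):
    mx = max(logits)
    exps = [math.exp(l - mx) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def next_word_probs(z, h, word_idx):
    """One generative step: update the local state from the previous
    word, then predict the next word from [h; z], where z is the
    global (semantic) document representation."""
    x = [0.0] * V
    x[word_idx] = 1.0
    h = [math.tanh(a + b) for a, b in zip(matvec(W_h, h), matvec(W_x, x))]
    probs = softmax(matvec(W_o, h + z))  # concatenate local and global
    return h, probs

z = [random.gauss(0.0, 1.0) for _ in range(D)]  # sampled document latent
h = [0.0] * D
h, probs = next_word_probs(z, h, VOCAB.index("the"))
```

The key design point the abstract argues for is visible even in this toy: syntactic bookkeeping lives in the recurrent state `h`, leaving the latent `z` free to encode document-level semantics.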
Recent Developments from Attribute Profiles for Remote Sensing Image Classification
Title | Recent Developments from Attribute Profiles for Remote Sensing Image Classification |
Authors | Minh-Tan Pham, Sébastien Lefèvre, Erchan Aptoula, Lorenzo Bruzzone |
Abstract | Morphological attribute profiles (APs) are among the most effective methods to model the spatial and contextual information for the analysis of remote sensing images, especially for classification tasks. Since their first introduction to this field in the early 2010s, many studies have contributed not only to exploiting and adapting their use for different applications, but also to extending and improving their performance to better deal with more complex data. In this paper, we revisit and discuss the different developments and extensions of APs which have drawn significant attention from researchers in the past few years. These studies are analyzed and grouped based on the concept of multi-stage AP construction. In our experiments, a comparative study of classification results on two remote sensing datasets is provided in order to show their significant improvement over the originally proposed APs. |
Tasks | Image Classification, Remote Sensing Image Classification |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10036v1 |
http://arxiv.org/pdf/1803.10036v1.pdf | |
PWC | https://paperswithcode.com/paper/recent-developments-from-attribute-profiles |
Repo | |
Framework | |
A Regularized Attention Mechanism for Graph Attention Networks
Title | A Regularized Attention Mechanism for Graph Attention Networks |
Authors | Uday Shankar Shanthamallu, Jayaraman J. Thiagarajan, Andreas Spanias |
Abstract | Machine learning models that can exploit the inherent structure in data have gained prominence. In particular, there is a surge in deep learning solutions for graph-structured data, due to their widespread applicability in several fields. Graph attention networks (GAT), a recent addition to the broad class of feature learning models for graphs, utilize the attention mechanism to efficiently learn continuous vector representations for semi-supervised learning problems. In this paper, we perform a detailed analysis of GAT models and present interesting insights into their behavior. In particular, we show that the models are vulnerable to heterogeneous rogue nodes, and hence propose novel regularization strategies to improve the robustness of GAT models. Using benchmark datasets, we demonstrate performance improvements on semi-supervised learning using the proposed robust variant of GAT. |
Tasks | |
Published | 2018-11-01 |
URL | https://arxiv.org/abs/1811.00181v2 |
https://arxiv.org/pdf/1811.00181v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-robustness-of-attention-models-on |
Repo | |
Framework | |
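As background for the abstract above, a single graph-attention layer in the style of the original GAT can be sketched in pure Python: each node's new representation is an attention-weighted sum of its transformed neighborhood features, with weights computed as a softmax over LeakyReLU-activated scores. This sketches the baseline mechanism being regularized, not the paper's proposed regularizers; the toy graph and weights are made up for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gat_layer(features, adj, W, a):
    """One graph-attention layer: for each node i, attend over its
    neighborhood (including i itself) and aggregate transformed
    features with the resulting attention coefficients."""
    Wh = [matvec(W, h) for h in features]
    out = []
    for i, neighbors in enumerate(adj):
        nbrs = [i] + neighbors  # add self-loop, as in the original GAT
        # unnormalized score: LeakyReLU(a^T [Wh_i || Wh_j])
        e = [leaky_relu(sum(ai * x for ai, x in zip(a, Wh[i] + Wh[j])))
             for j in nbrs]
        mx = max(e)
        exps = [math.exp(v - mx) for v in e]
        s = sum(exps)
        alpha = [v / s for v in exps]  # softmax over the neighborhood
        out.append([sum(al * Wh[j][k] for al, j in zip(alpha, nbrs))
                    for k in range(len(Wh[i]))])
    return out

# 3-node toy graph with edges 0-1 and 1-2
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[1], [0, 2], [1]]
W = [[0.5, -0.3], [0.2, 0.8]]
a = [0.1, -0.2, 0.4, 0.3]
h_out = gat_layer(features, adj, W, a)
```

The vulnerability the paper studies arises exactly here: a rogue neighbor `j` with adversarial features can attract a large `alpha`, so a regularizer constraining the attention distribution is a natural defense.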
IR2VI: Enhanced Night Environmental Perception by Unsupervised Thermal Image Translation
Title | IR2VI: Enhanced Night Environmental Perception by Unsupervised Thermal Image Translation |
Authors | Shuo Liu, Vijay John, Erik Blasch, Zheng Liu, Ying Huang |
Abstract | Context enhancement is critical for night vision (NV) applications, especially for dark nights without any artificial light. In this paper, we present the infrared-to-visual (IR2VI) algorithm, a novel unsupervised thermal-to-visible image translation framework based on generative adversarial networks (GANs). IR2VI is able to learn the intrinsic characteristics of visible (VI) images and integrate them into infrared (IR) images. Since the existing unsupervised GAN-based image translation approaches face several challenges, such as incorrect mapping and lack of fine details, we propose a structure connection module and a region-of-interest (ROI) focal loss method to address the current limitations. Experimental results show the superiority of the IR2VI algorithm over baseline methods. |
Tasks | |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09565v1 |
http://arxiv.org/pdf/1806.09565v1.pdf | |
PWC | https://paperswithcode.com/paper/ir2vi-enhanced-night-environmental-perception |
Repo | |
Framework | |
Classification of simulated radio signals using Wide Residual Networks for use in the search for extra-terrestrial intelligence
Title | Classification of simulated radio signals using Wide Residual Networks for use in the search for extra-terrestrial intelligence |
Authors | G. A. Cox, S. Egly, G. R. Harp, J. Richards, S. Vinodababu, J. Voien |
Abstract | We describe a new approach and algorithm for the detection of artificial signals and their classification in the search for extraterrestrial intelligence (SETI). The characteristics of radio signals observed during SETI research are often most apparent when those signals are represented as spectrograms. Additionally, many observed signals tend to share the same characteristics, allowing for sorting of the signals into different classes. For this work, complex-valued time-series data were simulated to produce a corpus of 140,000 signals from seven different signal classes. A wide residual neural network was then trained to classify these signal types using the gray-scale 2D spectrogram representation of those signals. An average $F_1$ score of 95.11% was attained when tested on previously unobserved simulated signals. We also report on the performance of the model across a range of signal amplitudes. |
Tasks | Time Series |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08624v1 |
http://arxiv.org/pdf/1803.08624v1.pdf | |
PWC | https://paperswithcode.com/paper/classification-of-simulated-radio-signals |
Repo | |
Framework | |
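The preprocessing step named in the abstract — turning a complex-valued time series into a gray-scale 2D spectrogram for the classifier — can be sketched in pure Python with a windowed DFT. The window length, hop size, and toy signal below are illustrative assumptions; the wide residual network that consumes the image is omitted.

```python
import cmath
import math
import random

def spectrogram(signal, win=32, hop=16):
    """Magnitude spectrogram of a complex-valued time series: slide a
    window over the signal and take the magnitude of its DFT. Each row
    is one time frame; columns are frequency bins."""
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        seg = signal[start:start + win]
        row = []
        for k in range(win):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win)
                      for n in range(win))
            row.append(abs(acc))
        frames.append(row)
    return frames

def to_grayscale(spec):
    """Rescale magnitudes to [0, 255] to form an image-like 2D input."""
    flat = [v for row in spec for v in row]
    lo, hi = min(flat), max(flat)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[int(round((v - lo) * scale)) for v in row] for row in spec]

random.seed(1)
# toy narrow-band tone in complex Gaussian noise, loosely analogous
# to one simulated SETI signal class
sig = [cmath.exp(2j * math.pi * 0.25 * n) +
       complex(random.gauss(0, 0.3), random.gauss(0, 0.3))
       for n in range(128)]
img = to_grayscale(spectrogram(sig))
```

In the resulting image the tone shows up as a bright column at the bin nearest normalized frequency 0.25, which is the kind of visual structure the abstract says makes spectrograms a natural representation for these signals.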
Visual-Quality-Driven Learning for Underwater Vision Enhancement
Title | Visual-Quality-Driven Learning for Underwater Vision Enhancement |
Authors | Walysson Vital Barbosa, Henrique Grandinetti Barbosa Amaral, Thiago Lages Rocha, Erickson Rangel Nascimento |
Abstract | The image processing community has witnessed remarkable advances in enhancing and restoring images. Nevertheless, restoring the visual quality of underwater images remains a great challenge. End-to-end frameworks might fail to enhance the visual quality of underwater images since in several scenarios it is not feasible to provide the ground truth of the scene radiance. In this work, we propose a CNN-based approach that does not require ground-truth data, since it uses a set of image quality metrics to guide the restoration learning process. The experiments showed that our method improved the visual quality of underwater images, preserving their edges, and also performed well on the UCIQE metric. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04624v1 |
http://arxiv.org/pdf/1809.04624v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-quality-driven-learning-for-underwater |
Repo | |
Framework | |
Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation
Title | Incremental Decoding and Training Methods for Simultaneous Translation in Neural Machine Translation |
Authors | Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Stephan Vogel |
Abstract | We address the problem of simultaneous translation by modifying the Neural MT decoder to operate with a dynamically built encoder and attention. We propose a tunable agent which decides the best segmentation strategy for a user-defined BLEU loss and Average Proportion (AP) constraint. Our agent outperforms the previously proposed Wait-if-diff and Wait-if-worse agents (Cho and Esipova, 2016) on BLEU with lower latency. Secondly, we propose data-driven changes to Neural MT training to better match the incremental decoding framework. |
Tasks | Machine Translation |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03661v1 |
http://arxiv.org/pdf/1806.03661v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-decoding-and-training-methods-for |
Repo | |
Framework | |
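The READ/WRITE loop that underlies the agents mentioned in the abstract can be sketched in pure Python. The confidence-threshold policy and the stub "copy" decoder below are illustrative assumptions standing in for the NMT model; they are not the paper's tunable agent nor the exact Wait-if-diff/Wait-if-worse rules.

```python
def simulate(source, best_next, threshold=0.6):
    """Toy simultaneous-decoding loop: READ one more source token while
    the decoder's confidence in its next target word is below a
    threshold, otherwise WRITE (commit) the word. `best_next` returns
    (word, probability) given the source read so far and the output."""
    read, output, actions = 0, [], []
    while True:
        word, prob = best_next(source[:read], output)
        if word is None:  # decoder signals end of sentence
            break
        if prob >= threshold or read == len(source):
            output.append(word)
            actions.append("WRITE")
        else:
            read += 1
            actions.append("READ")
    return output, actions

def copy_model(src_prefix, out):
    """Stub decoder that copies the source: it is confident about word
    i only once at least i+2 source tokens have been read."""
    i = len(out)
    if i >= 3:  # target length fixed at 3 for the demo
        return None, 1.0
    if i < len(src_prefix):
        conf = 0.9 if len(src_prefix) > i + 1 else 0.4
        return src_prefix[i], conf
    return "?", 0.0

out, acts = simulate(["a", "b", "c"], copy_model)
# out == ["a", "b", "c"]; the agent interleaves READs and WRITEs
```

Latency metrics such as Average Proportion are computed from exactly this action sequence: the more READs issued before each WRITE, the higher the latency the agent incurs.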
Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge
Title | Auto-tuning Neural Network Quantization Framework for Collaborative Inference Between the Cloud and Edge |
Authors | Guangli Li, Lei Liu, Xueying Wang, Xiao Dong, Peng Zhao, Xiaobing Feng |
Abstract | Recently, deep neural networks (DNNs) have been widely applied in mobile intelligent applications. Inference for DNNs is usually performed in the cloud; however, this leads to a large overhead of transmitting data via wireless networks. In this paper, we demonstrate the advantages of cloud-edge collaborative inference with quantization. By analyzing the characteristics of the layers in DNNs, an auto-tuning neural network quantization framework for collaborative inference is proposed. We study the effectiveness of mixed-precision collaborative inference of state-of-the-art DNNs using the ImageNet dataset. The experimental results show that our framework can generate reasonable network partitions and reduce the storage on mobile devices with trivial loss of accuracy. |
Tasks | Quantization |
Published | 2018-12-16 |
URL | http://arxiv.org/abs/1812.06426v1 |
http://arxiv.org/pdf/1812.06426v1.pdf | |
PWC | https://paperswithcode.com/paper/auto-tuning-neural-network-quantization |
Repo | |
Framework | |
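The basic operation behind such a framework — quantizing an intermediate activation tensor before sending it from the edge to the cloud — can be sketched with standard uniform affine quantization. This illustrates the general technique only; the paper's auto-tuning of per-layer precision and partition points is not reproduced here.

```python
def quantize(values, bits):
    """Uniform affine quantization of a float activation map to `bits`
    bits: map the observed [min, max] range onto integer levels."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((v - lo) / scale) for v in values]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float values on the receiving side."""
    return [lo + qi * scale for qi in q]

# toy activation map at the chosen partition layer
acts = [0.12, -1.5, 3.3, 0.0, 2.7, -0.4]
q, scale, lo = quantize(acts, 8)
rec = dequantize(q, scale, lo)
# an 8-bit transfer is 4x smaller than float32, and the rounding
# error is bounded by half a quantization step
max_err = max(abs(a - r) for a, r in zip(acts, rec))
```

The mixed-precision question the abstract raises is then which `bits` each layer can tolerate: early layers with wide dynamic range may need more levels, while later layers often survive aggressive quantization.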
Detecting egregious responses in neural sequence-to-sequence models
Title | Detecting egregious responses in neural sequence-to-sequence models |
Authors | Tianxing He, James Glass |
Abstract | In this work, we attempt to answer a critical question: whether there exists some input sequence that will cause a well-trained discrete-space neural network sequence-to-sequence (seq2seq) model to generate egregious outputs (aggressive, malicious, attacking, etc.), and, if such inputs exist, how to find them efficiently. We adopt an empirical methodology, in which we first create lists of egregious output sequences, and then design a discrete optimization algorithm to find input sequences that will cause the model to generate them. Moreover, the optimization algorithm is enhanced for large-vocabulary search and constrained to search for input sequences that are likely to be entered by real-world users. In our experiments, we apply this approach to dialogue response generation models trained on three real-world dialogue datasets: Ubuntu, Switchboard and OpenSubtitles, testing whether the model can generate malicious responses. We demonstrate that given the trigger inputs our algorithm finds, a significant number of malicious sentences are assigned large probability by the model, which reveals an undesirable consequence of standard seq2seq training. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.04113v2 |
http://arxiv.org/pdf/1809.04113v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-egregious-responses-in-neural |
Repo | |
Framework | |
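The general shape of such a search — discrete optimization over input tokens to maximize the model's probability of a target output — can be sketched as simple hill climbing. The scoring function below is a stub standing in for a real seq2seq model's target-sequence probability; the paper's algorithm is a more sophisticated optimization with vocabulary and plausibility constraints, which this toy does not reproduce.

```python
import random

random.seed(0)

def greedy_trigger_search(score, vocab, length=4, iters=50):
    """Toy discrete hill climbing in input space: repeatedly propose a
    one-token substitution and keep it if it raises `score(input)`,
    i.e. the (stubbed) probability of the target egregious output."""
    seq = [random.choice(vocab) for _ in range(length)]
    best = score(seq)
    for _ in range(iters):
        pos = random.randrange(length)
        cand = seq[:]
        cand[pos] = random.choice(vocab)
        s = score(cand)
        if s > best:
            seq, best = cand, s
    return seq, best

# stub scorer: a hypothetical model whose target-output probability
# grows with the number of "trigger" tokens in the input
VOCAB = ["hi", "ok", "trigger", "why", "no"]
score = lambda seq: seq.count("trigger") / len(seq)
seq, best = greedy_trigger_search(score, VOCAB)
```

With a real model, `score` would be the log-probability the decoder assigns to a listed egregious response, which is exactly the quantity the abstract reports as alarmingly large for the discovered triggers.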
Salient Object Detection via High-to-Low Hierarchical Context Aggregation
Title | Salient Object Detection via High-to-Low Hierarchical Context Aggregation |
Authors | Yun Liu, Yu Qiu, Le Zhang, JiaWang Bian, Guang-Yu Nie, Ming-Ming Cheng |
Abstract | Recent progress on salient object detection mainly aims at exploiting how to effectively integrate convolutional side-output features in convolutional neural networks (CNNs). Based on this, most existing state-of-the-art saliency detectors design complex network structures to fuse the side-output features of the backbone feature extraction networks. However, should the fusion strategies become ever more complex for accurate salient object detection? In this paper, we observe that the contexts of a natural image can be well expressed by a high-to-low self-learning of side-output convolutional features. The contexts of an image usually refer to its global structures, and the top layers of a CNN usually learn to convey global information, while it is difficult for the intermediate side-output features to express contextual information. Hence, we design an hourglass network with intermediate supervision to learn contextual features in a high-to-low manner. The learned hierarchical contexts are aggregated to generate a hybrid contextual expression for an input image. Finally, the hybrid contextual features can be used for accurate saliency estimation. We extensively evaluate our method on six challenging saliency datasets, and our simple method achieves state-of-the-art performance under various evaluation metrics. Code will be released upon paper acceptance. |
Tasks | Object Detection, Saliency Prediction, Salient Object Detection |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.10956v2 |
http://arxiv.org/pdf/1812.10956v2.pdf | |
PWC | https://paperswithcode.com/paper/salient-object-detection-via-high-to-low |
Repo | |
Framework | |
Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection
Title | Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection |
Authors | Amir Soleimani, Nasser M. Nasrabadi |
Abstract | The low resolution of objects of interest in aerial images makes pedestrian detection and action detection extremely challenging tasks. Furthermore, using deep convolutional neural networks to process large images can be demanding in terms of computational requirements. In order to alleviate these challenges, we propose a two-step, yes-and-no question-answering framework to find specific individuals doing one or multiple specific actions in aerial images. First, a deep object detector, the Single Shot Multibox Detector (SSD), is used to generate object proposals from small aerial images. Second, another deep network is used to learn a latent common sub-space which associates the high-resolution aerial imagery with the pedestrian action labels that are provided by human-based sources. |
Tasks | Action Detection, Pedestrian Detection, Question Answering |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05983v1 |
http://arxiv.org/pdf/1807.05983v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-for-aerial-1 |
Repo | |
Framework | |
How to improve the interpretability of kernel learning
Title | How to improve the interpretability of kernel learning |
Authors | Jinwei Zhao, Qizhou Wang, Yufei Wang, Yu Liu, Zhenghao Shi, Xinhong Hei |
Abstract | In recent years, machine learning researchers have focused on methods to construct flexible and interpretable prediction models. However, how to evaluate interpretability, how it relates to generalization performance, and how to improve it all still have to be addressed. In this paper, a quantitative index of interpretability is proposed and its rationality is proved, and the equilibrium problem between interpretability and generalization performance is analyzed. A probabilistic upper bound on the sum of the two performances is also analyzed. For the traditional supervised kernel machine learning problem, a universal learning framework is put forward to solve the equilibrium problem between the two performances, and the condition for a globally optimal solution based on the framework is deduced. The learning framework is applied to the least-squares support vector machine and is evaluated in several experiments. |
Tasks | |
Published | 2018-11-21 |
URL | https://arxiv.org/abs/1811.10469v2 |
https://arxiv.org/pdf/1811.10469v2.pdf | |
PWC | https://paperswithcode.com/paper/how-to-improve-the-interpretability-of-kernel |
Repo | |
Framework | |
DCDistance: A Supervised Text Document Feature extraction based on class labels
Title | DCDistance: A Supervised Text Document Feature extraction based on class labels |
Authors | Charles Henrique Porto Ferreira, Debora Maria Rossi de Medeiros, Fabricio Olivetti de França |
Abstract | Text mining is a field that aims at extracting information from textual data. One of the challenges of this field comes from the pre-processing stage, in which a vector (and structured) representation must be extracted from unstructured data. The common extraction creates large and sparse vectors representing the importance of each term to a document, which usually leads to the curse of dimensionality that plagues most machine learning algorithms. To cope with this issue, in this paper we propose a new supervised feature extraction and reduction algorithm, named DCDistance, that creates features based on the distance between a document and a representative of each class label. As such, the proposed technique can reduce the feature set by more than 99% of its original size. Additionally, this algorithm was also capable of improving classification accuracy over a set of benchmark datasets when compared to traditional and state-of-the-art feature selection algorithms. |
Tasks | |
Published | 2018-01-14 |
URL | http://arxiv.org/abs/1801.04554v1 |
http://arxiv.org/pdf/1801.04554v1.pdf | |
PWC | https://paperswithcode.com/paper/dcdistance-a-supervised-text-document-feature |
Repo | |
Framework | |
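The idea in the abstract — replacing a high-dimensional document vector with its distances to one representative per class — is compact enough to sketch directly. The choice of centroid as the class representative and of Euclidean distance is an assumption for illustration; the paper may define both differently.

```python
def class_centroids(X, y):
    """One representative (mean vector) per class label."""
    sums, counts = {}, {}
    for vec, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def dcdistance_features(X, centroids):
    """Map each document vector to its Euclidean distance from each
    class representative: the feature count drops from the vocabulary
    size to the number of classes."""
    labels = sorted(centroids)
    return [[sum((a - b) ** 2 for a, b in zip(vec, centroids[c])) ** 0.5
             for c in labels] for vec in X]

# toy bag-of-words vectors (3 terms) for two classes
X = [[3.0, 0.0, 1.0], [2.0, 1.0, 0.0], [0.0, 4.0, 2.0], [1.0, 3.0, 3.0]]
y = ["sports", "sports", "politics", "politics"]
cent = class_centroids(X, y)
feats = dcdistance_features(X, cent)
```

With a realistic vocabulary of tens of thousands of terms and a handful of classes, this mapping yields the >99% reduction the abstract reports, and any standard classifier can then be trained on the distance features.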
An Intelligent Safety System for Human-Centered Semi-Autonomous Vehicles
Title | An Intelligent Safety System for Human-Centered Semi-Autonomous Vehicles |
Authors | Hadi Abdi Khojasteh, Alireza Abbas Alipour, Ebrahim Ansari, Parvin Razzaghi |
Abstract | Nowadays, automobile manufacturers strive to make cars fully safe. Monitoring a driver's actions with computer vision techniques to detect driving mistakes in real time, and then planning autonomous maneuvers to avoid vehicle collisions, is one of the most important issues investigated in machine vision and Intelligent Transportation Systems (ITS). The main goal of this study is to prevent accidents caused by fatigue, drowsiness, and driver distraction. To avoid these incidents, this paper proposes an integrated safety system that continuously monitors the driver's attention and the vehicle's surroundings, and finally decides whether the actual steering control status is safe or not. For this purpose, we equipped an ordinary car called FARAZ with a vision system consisting of four mounted cameras, along with a universal car tool for communicating with the surrounding factory-installed sensors and other car systems and for sending commands to actuators. The proposed system leverages a scene understanding pipeline using deep convolutional encoder-decoder networks and a driver state detection pipeline. We have identified and assessed domestic capabilities for developing these technologies for ordinary vehicles, with the aim of manufacturing smart cars and providing an intelligent system that increases safety and assists the driver in various conditions and situations. |
Tasks | Autonomous Driving, Autonomous Vehicles, Scene Understanding, Steering Control |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03953v2 |
http://arxiv.org/pdf/1812.03953v2.pdf | |
PWC | https://paperswithcode.com/paper/an-intelligent-safety-system-for-human |
Repo | |
Framework | |