Paper Group ANR 662
SSCNets: A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks
Title | SSCNets: A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks |
Authors | Hammad Tariq, Hassan Ali, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman, Rehan Ahmed, Muhammad Shafique |
Abstract | Recent studies have shown that slight perturbations in the input data can significantly affect the robustness of Deep Neural Networks (DNNs), leading to misclassification and confidence reduction. In this paper, we introduce a novel technique based on the Selective Sobel Convolution (SSC) operation in the training loop that increases the robustness of a given DNN by allowing it to learn important edges in the input in a controlled fashion. This is achieved by introducing a trainable parameter, which acts as a threshold for eliminating the weaker edges. We validate our technique on convolutional DNNs using adversarial attacks from the Cleverhans library. Our experimental results on the MNIST and CIFAR10 datasets illustrate that this controlled learning considerably increases the accuracy of the DNNs by 1.53% even when subjected to adversarial attacks. |
Tasks | |
Published | 2018-11-04 |
URL | http://arxiv.org/abs/1811.01443v1 |
http://arxiv.org/pdf/1811.01443v1.pdf | |
PWC | https://paperswithcode.com/paper/sscnets-a-selective-sobel-convolution-based |
Repo | |
Framework | |
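The core SSC idea, extracting Sobel edges and then suppressing the weaker ones below a threshold, can be sketched in NumPy. This is illustrative only: the paper makes the threshold a trainable parameter inside the training loop, whereas here it is a fixed constant, and the magnitude normalization is an assumption.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2d(img, kernel):
    """Valid 2-D cross-correlation with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def selective_sobel(img, threshold=0.5):
    """Sobel gradient magnitude with weak edges suppressed."""
    gx = conv2d(img, SOBEL_X)
    gy = conv2d(img, SOBEL_Y)
    mag = np.hypot(gx, gy)
    mag = mag / (mag.max() + 1e-8)               # normalize to [0, 1]
    return np.where(mag >= threshold, mag, 0.0)  # drop edges below threshold
```

On a vertical step-edge image, only the columns at the intensity boundary survive the threshold.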
Comparative study of motion detection methods for video surveillance systems
Title | Comparative study of motion detection methods for video surveillance systems |
Authors | Kamal Sehairi, Chouireb Fatima, Jean Meunier |
Abstract | The objective of this study is to compare several change detection methods for a single static camera and identify the best method for different complex environments and backgrounds in indoor and outdoor scenes. To this end, we used the CDnet video dataset as a benchmark; it consists of many challenging problems, ranging from basic simple scenes to complex scenes affected by bad weather and dynamic backgrounds. Twelve change detection methods, ranging from simple temporal differencing to more sophisticated methods, were tested, and several performance metrics were used to precisely evaluate the results. Because most of the considered methods have not previously been evaluated on this recent large-scale dataset, this work compares these methods to fill a gap in the literature, and the evaluation thus complements previous comparative evaluations. Our experimental results show that there is no perfect method for all challenging cases: each method performs well in certain cases and fails in others. However, this study enables the user to identify the most suitable method for his or her needs. |
Tasks | Motion Detection |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05459v1 |
http://arxiv.org/pdf/1804.05459v1.pdf | |
PWC | https://paperswithcode.com/paper/comparative-study-of-motion-detection-methods |
Repo | |
Framework | |
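The simplest of the compared families, temporal differencing, can be sketched as follows. This is a minimal baseline, not any specific method from the study, and the threshold value is an assumption:

```python
import numpy as np

def temporal_difference_mask(prev_frame, curr_frame, threshold=25):
    """Basic change detection: a pixel is flagged as motion when its
    absolute intensity change between consecutive frames exceeds a
    threshold (more sophisticated methods model the background)."""
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    return (diff > threshold).astype(np.uint8)
```

The cast to `int` avoids unsigned-integer wraparound when subtracting 8-bit frames.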
Software Engineering Challenges of Deep Learning
Title | Software Engineering Challenges of Deep Learning |
Authors | Anders Arpteg, Björn Brinne, Luka Crnkovic-Friis, Jan Bosch |
Abstract | Surprisingly promising results have been achieved by deep learning (DL) systems in recent years. Many of these achievements have been reached in academic settings, or by large technology companies with highly skilled research groups and advanced supporting infrastructure. For companies without large research groups or advanced infrastructure, building high-quality production-ready systems with DL components has proven challenging. There is a clear lack of well-functioning tools and best practices for building DL systems. It is the goal of this research to identify what the main challenges are, by applying an interpretive research approach in close collaboration with companies of varying size and type. A set of seven projects have been selected to describe the potential with this new technology and to identify associated main challenges. A set of 12 main challenges has been identified and categorized into the three areas of development, production, and organizational challenges. Furthermore, a mapping between the challenges and the projects is defined, together with selected motivating descriptions of how and why the challenges apply to specific projects. Compared to other areas such as software engineering or database technologies, it is clear that DL is still rather immature and in need of further work to facilitate development of high-quality systems. The challenges identified in this paper can be used to guide future research by the software engineering and DL communities. Together, we could enable a large number of companies to start taking advantage of the high potential of the DL technology. |
Tasks | |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12034v1 |
http://arxiv.org/pdf/1810.12034v1.pdf | |
PWC | https://paperswithcode.com/paper/software-engineering-challenges-of-deep |
Repo | |
Framework | |
Diagnosing Error in Temporal Action Detectors
Title | Diagnosing Error in Temporal Action Detectors |
Authors | Humam Alwassel, Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem |
Abstract | Despite the recent progress in video understanding and the continuous rate of improvement in temporal action localization throughout the years, it is still unclear how far (or close?) we are to solving the problem. To this end, we introduce a new diagnostic tool to analyze the performance of temporal action detectors in videos and compare different methods beyond a single scalar metric. We exemplify the use of our tool by analyzing the performance of the top rewarded entries in the latest ActivityNet action localization challenge. Our analysis shows that the most impactful areas to work on are: strategies to better handle temporal context around the instances, improving the robustness w.r.t. the instance absolute and relative size, and strategies to reduce the localization errors. Moreover, our experimental analysis finds that the lack of agreement among annotators is not a major roadblock to attaining progress in the field. Our diagnostic tool is publicly available to keep fueling the minds of other researchers with additional insights about their algorithms. |
Tasks | Action Localization, Temporal Action Localization, Video Understanding |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10706v1 |
http://arxiv.org/pdf/1807.10706v1.pdf | |
PWC | https://paperswithcode.com/paper/diagnosing-error-in-temporal-action-detectors |
Repo | |
Framework | |
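Localization error in this setting is conventionally measured with temporal intersection-over-union between predicted and ground-truth segments. A minimal sketch of that standard metric (not code from the paper's diagnostic tool):

```python
def temporal_iou(seg_a, seg_b):
    """Temporal intersection-over-union between two [start, end]
    segments: the standard matching criterion when scoring
    temporal action localization results."""
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0
```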
Bayesian shape modelling of cross-sectional geological data
Title | Bayesian shape modelling of cross-sectional geological data |
Authors | Thomai Tsiftsi, Ian H. Jermyn, Jochen Einbeck |
Abstract | Shape information is of great importance in many applications. For example, the oil-bearing capacity of sand bodies, the subterranean remnants of ancient rivers, is related to their cross-sectional shapes. The analysis of these shapes is therefore of some interest, but current classifications are simplistic and ad hoc. In this paper, we describe the first steps towards a coherent statistical analysis of these shapes by deriving the integrated likelihood for data shapes given class parameters. The result is of interest beyond this particular application. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09631v1 |
http://arxiv.org/pdf/1802.09631v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-shape-modelling-of-cross-sectional |
Repo | |
Framework | |
FusionNet and AugmentedFlowNet: Selective Proxy Ground Truth for Training on Unlabeled Images
Title | FusionNet and AugmentedFlowNet: Selective Proxy Ground Truth for Training on Unlabeled Images |
Authors | Osama Makansi, Eddy Ilg, Thomas Brox |
Abstract | Recent work has shown that convolutional neural networks (CNNs) can be used to estimate optical flow with high quality and fast runtime. This makes them preferable for real-world applications. However, such networks require very large training datasets. Engineering the training data is difficult and/or laborious. This paper shows how to augment a network trained on an existing synthetic dataset with large amounts of additional unlabelled data. In particular, we introduce a selection mechanism to assemble from multiple estimates a joint optical flow field, which outperforms that of all input methods. The latter can be used as proxy-ground-truth to train a network on real-world data and to adapt it to specific domains of interest. Our experimental results show that the performance of networks improves considerably, both in cross-domain and in domain-specific scenarios. As a consequence, we obtain state-of-the-art results on the KITTI benchmarks. |
Tasks | Optical Flow Estimation |
Published | 2018-08-20 |
URL | http://arxiv.org/abs/1808.06389v1 |
http://arxiv.org/pdf/1808.06389v1.pdf | |
PWC | https://paperswithcode.com/paper/fusionnet-and-augmentedflownet-selective |
Repo | |
Framework | |
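The selection mechanism can be illustrated as per-pixel fusion: given several candidate flow fields and an error map for each (e.g. a photometric warping error), keep the candidate with the lowest error at every pixel. A hedged NumPy sketch; the array shapes and the error criterion are assumptions, not the paper's exact procedure:

```python
import numpy as np

def select_best_flow(flow_candidates, error_maps):
    """Per-pixel fusion: at each pixel keep the candidate flow vector
    whose error estimate is smallest.
    flow_candidates: (K, H, W, 2); error_maps: (K, H, W)."""
    best = np.argmin(error_maps, axis=0)                  # (H, W) winner index
    _, h, w, _ = flow_candidates.shape
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return flow_candidates[best, ii, jj]                  # (H, W, 2) fused flow
```

The fused field can then serve as proxy ground truth for training on unlabeled frames.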
Keyphrase Generation with Correlation Constraints
Title | Keyphrase Generation with Correlation Constraints |
Authors | Jun Chen, Xiaoming Zhang, Yu Wu, Zhao Yan, Zhoujun Li |
Abstract | In this paper, we study automatic keyphrase generation. Although conventional approaches to this task show promising results, they neglect correlation among keyphrases, resulting in duplication and coverage issues. To solve these problems, we propose a new sequence-to-sequence architecture for keyphrase generation named CorrRNN, which captures correlation among multiple keyphrases in two ways. First, we employ a coverage vector to indicate whether each word in the source document has been summarized by previous phrases, improving the coverage of keyphrases. Second, preceding phrases are taken into account to eliminate duplicate phrases and improve result coherence. Experiment results show that our model significantly outperforms the state-of-the-art method on benchmark datasets in terms of both accuracy and diversity. |
Tasks | |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07185v1 |
http://arxiv.org/pdf/1808.07185v1.pdf | |
PWC | https://paperswithcode.com/paper/keyphrase-generation-with-correlation |
Repo | |
Framework | |
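The coverage idea can be sketched with a simple additive coverage vector and the usual coverage penalty from coverage-based seq2seq models. This illustrates the mechanism, not CorrRNN's exact formulation:

```python
import numpy as np

def update_coverage(coverage, attention):
    """Accumulate attention over decoding steps: high coverage marks
    source words already summarized by previously generated phrases."""
    return coverage + attention

def coverage_penalty(coverage, attention):
    """Penalize re-attending to already-covered words: sum of
    per-position minima, the usual coverage-loss form."""
    return float(np.sum(np.minimum(coverage, attention)))
```

At the first step the penalty is zero; repeating the same attention pattern afterwards is penalized.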
Generative Steganography by Sampling
Title | Generative Steganography by Sampling |
Authors | Jia Liu, Yu Lei, Yan Ke, Jun Li, Minqing Zhang, Xiaoyuan Yan |
Abstract | In this paper, a new data-driven information hiding scheme called generative steganography by sampling (GSS) is proposed. The stego is directly sampled by a powerful generator without an explicit cover. A secret key shared by both parties is used for message embedding and extraction. The Jensen-Shannon divergence is introduced as a new criterion for evaluating the security of generative steganography. Based on these principles, a simple practical generative steganography method is proposed using semantic image inpainting. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated stego images. |
Tasks | Image Inpainting |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.10531v1 |
http://arxiv.org/pdf/1804.10531v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-steganography-by-sampling |
Repo | |
Framework | |
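The security criterion, the Jensen-Shannon divergence between the cover and stego distributions, can be computed as follows (standard definition with base-2 logarithms, so the value is bounded in [0, 1]):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence in bits; eps guards against log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log2(p / q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, 0 for identical
    distributions, 1 (in bits) for disjoint ones."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = (p + q) / 2
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```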
Evaluating the Impact of Intensity Normalization on MR Image Synthesis
Title | Evaluating the Impact of Intensity Normalization on MR Image Synthesis |
Authors | Jacob C. Reinhold, Blake E. Dewey, Aaron Carass, Jerry L. Prince |
Abstract | Image synthesis learns a transformation from the intensity features of an input image to yield a different tissue contrast of the output image. This process has been shown to have application in many medical image analysis tasks including imputation, registration, and segmentation. To carry out synthesis, the intensities of the input images are typically scaled (i.e., normalized) both in training to learn the transformation and in testing when applying the transformation, but it is not presently known what type of input scaling is optimal. In this paper, we consider seven different intensity normalization algorithms and three different synthesis methods to evaluate the impact of normalization. Our experiments demonstrate that intensity normalization as a preprocessing step improves the synthesis results across all investigated synthesis algorithms. Furthermore, we show evidence that suggests intensity normalization is vital for successful deep learning-based MR image synthesis. |
Tasks | Image Generation, Imputation |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04652v1 |
http://arxiv.org/pdf/1812.04652v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-the-impact-of-intensity |
Repo | |
Framework | |
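One of the simplest normalization algorithms in such comparisons is z-score normalization, optionally restricted to a foreground mask. A minimal sketch; the mask-based variant is an assumption about the setup, not the paper's exact pipeline:

```python
import numpy as np

def zscore_normalize(volume, mask=None):
    """Z-score intensity normalization: subtract the mean and divide
    by the standard deviation, computed over a foreground mask when
    one is given (otherwise over the whole volume)."""
    volume = volume.astype(float)
    vals = volume[mask] if mask is not None else volume.ravel()
    return (volume - vals.mean()) / (vals.std() + 1e-8)
```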
A Purely End-to-end System for Multi-speaker Speech Recognition
Title | A Purely End-to-end System for Multi-speaker Speech Recognition |
Authors | Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey |
Abstract | Recently, there has been growing interest in multi-speaker speech recognition, where the utterances of multiple speakers are recognized from their mixture. Promising techniques have been proposed for this task, but earlier works have required additional training data such as isolated source signals or senone alignments for effective learning. In this paper, we propose a new sequence-to-sequence framework to directly decode multiple label sequences from a single speech sequence by unifying source separation and speech recognition functions in an end-to-end manner. We further propose a new objective function to improve the contrast between the hidden vectors to avoid generating similar hypotheses. Experimental results show that the model is directly able to learn a mapping from a speech mixture to multiple label sequences, achieving 83.1% relative improvement compared to a model trained without the proposed objective. Interestingly, the results are comparable to those produced by previous end-to-end works featuring explicit separation and recognition modules. |
Tasks | Speech Recognition |
Published | 2018-05-15 |
URL | http://arxiv.org/abs/1805.05826v1 |
http://arxiv.org/pdf/1805.05826v1.pdf | |
PWC | https://paperswithcode.com/paper/a-purely-end-to-end-system-for-multi-speaker |
Repo | |
Framework | |
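A contrast objective of the kind described, one that grows when the hidden representations of the two hypotheses are similar, can be sketched as a squared cosine similarity penalty. This is an illustrative stand-in; the paper's exact objective may differ:

```python
import numpy as np

def contrast_penalty(h1, h2, eps=1e-8):
    """Squared cosine similarity between two hidden vectors: near 1
    when the representations (and hence the decoded hypotheses) are
    similar, near 0 when they are well separated."""
    cos = np.dot(h1, h2) / (np.linalg.norm(h1) * np.linalg.norm(h2) + eps)
    return float(cos ** 2)
```

Minimizing this penalty alongside the recognition loss pushes the two speakers' hidden vectors apart.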
Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware
Title | Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware |
Authors | Natan Liss, Chaim Baskin, Avi Mendelson, Alex M. Bronstein, Raja Giryes |
Abstract | Convolutional Neural Networks (CNNs) have become a popular choice for various tasks such as computer vision, speech recognition, and natural language processing. Thanks to their large computational capability and throughput, GPUs, which are not power efficient and therefore do not suit low-power systems such as mobile devices, are the most common platform for both training and inference tasks. Recent studies have shown that FPGAs can provide a good alternative to GPUs as CNN accelerators, due to their reconfigurable nature, low power consumption, and small latency. In order for FPGA-based accelerators to outperform GPUs in inference tasks, both the parameters of the network and the activations must be quantized. While most works use uniform quantizers for both parameters and activations, a uniform quantizer is not always optimal, and a non-uniform quantizer needs to be considered. In this work we introduce a custom hardware-friendly approach to implement non-uniform quantizers. In addition, we use a single-scale integer representation of both parameters and activations, for both training and inference. The combined method yields a hardware-efficient non-uniform quantizer fit for real-time applications. We have tested our method on the CIFAR-10 and CIFAR-100 image classification datasets with ResNet-18 and VGG-like architectures, and saw little degradation in accuracy. |
Tasks | Image Classification, Speech Recognition |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10869v1 |
http://arxiv.org/pdf/1811.10869v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-non-uniform-quantizer-for-quantized |
Repo | |
Framework | |
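Non-uniform quantization itself reduces to nearest-neighbour assignment against a (possibly learned, unevenly spaced) codebook. A minimal sketch, where the codebook values are placeholders rather than the paper's learned levels:

```python
import numpy as np

def nonuniform_quantize(x, levels):
    """Quantize values to the nearest entry of a (possibly unevenly
    spaced) codebook of quantization levels."""
    x = np.asarray(x, dtype=float)
    levels = np.sort(np.asarray(levels, dtype=float))
    idx = np.argmin(np.abs(x[..., None] - levels), axis=-1)
    return levels[idx]
```

A hardware implementation would replace the distance search with comparisons against the sorted break-points between levels.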
Learning to Prune Filters in Convolutional Neural Networks
Title | Learning to Prune Filters in Convolutional Neural Networks |
Authors | Qiangui Huang, Kevin Zhou, Suya You, Ulrich Neumann |
Abstract | Many state-of-the-art computer vision algorithms use large-scale convolutional neural networks (CNNs) as basic building blocks. These CNNs are known for their huge number of parameters, high redundancy in weights, and tremendous computing resource consumption. This paper presents a learning algorithm to simplify and speed up these CNNs. Specifically, we introduce a “try-and-learn” algorithm to train pruning agents that remove unnecessary CNN filters in a data-driven way. With the help of a novel reward function, our agents remove a significant number of filters in CNNs while maintaining performance at a desired level. Moreover, this method provides easy control of the tradeoff between network performance and its scale. Performance of our algorithm is validated with comprehensive pruning experiments on several popular CNNs for visual recognition and semantic segmentation tasks. |
Tasks | Semantic Segmentation |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07365v1 |
http://arxiv.org/pdf/1801.07365v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-prune-filters-in-convolutional |
Repo | |
Framework | |
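For contrast with the paper's learned "try-and-learn" pruning agents, the common data-free baseline ranks filters by L1 norm and keeps the strongest. A sketch of that baseline, explicitly not the paper's method:

```python
import numpy as np

def prune_filters_by_l1(conv_weights, keep_ratio=0.5):
    """Rank convolutional filters by L1 norm and return the (sorted)
    indices of the strongest ones. conv_weights has shape
    (out_channels, in_channels, kH, kW)."""
    norms = np.abs(conv_weights).sum(axis=(1, 2, 3))  # one score per filter
    n_keep = max(1, int(round(keep_ratio * len(norms))))
    keep = np.argsort(norms)[::-1][:n_keep]           # strongest first
    return np.sort(keep)
```

The paper replaces this fixed heuristic with an agent trained against a reward that balances accuracy and compression.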
SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions
Title | SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions |
Authors | Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Zhiyong Feng, Wanli Xue, Rong Chen |
Abstract | Although many reinforcement learning methods have been proposed for learning the optimal solutions in single-agent continuous-action domains, multiagent coordination domains with continuous actions have received relatively little investigation. In this paper, we propose an independent learner hierarchical method, named Sample Continuous Coordination with recursive Frequency Maximum Q-Value (SCC-rFMQ), which divides the cooperative problem with continuous actions into two layers. The first layer samples a finite set of actions from the continuous action spaces by a re-sampling mechanism with variable exploratory rates, and the second layer evaluates the actions in the sampled action set and updates the policy using a reinforcement learning cooperative method. By constructing cooperative mechanisms at both levels, SCC-rFMQ can handle cooperative problems in continuous action cooperative Markov games effectively. The effectiveness of SCC-rFMQ is experimentally demonstrated on two well-designed games, i.e., a continuous version of the climbing game and a cooperative version of the boat problem. Experimental results show that SCC-rFMQ outperforms other reinforcement learning algorithms. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06625v1 |
http://arxiv.org/pdf/1809.06625v1.pdf | |
PWC | https://paperswithcode.com/paper/scc-rfmq-learning-in-cooperative-markov-games |
Repo | |
Framework | |
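The first layer's re-sampling step can be illustrated in one dimension: draw a fresh candidate action set around the currently best-valued action, with an exploration rate controlling the spread. This is a simplified sketch; the paper's recursive variable-rate scheme is more involved, and the [0, 1] action range is an assumption:

```python
import numpy as np

def resample_actions(actions, q_values, n_samples, sigma, rng):
    """Build a fresh finite candidate set for a continuous action
    space by sampling around the action with the highest Q-value;
    sigma plays the role of the exploration rate."""
    best = actions[int(np.argmax(q_values))]
    candidates = best + rng.normal(0.0, sigma, size=n_samples)
    return np.clip(candidates, 0.0, 1.0)  # keep actions in the valid range
```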
A Deep Learning Approach with an Attention Mechanism for Automatic Sleep Stage Classification
Title | A Deep Learning Approach with an Attention Mechanism for Automatic Sleep Stage Classification |
Authors | Martin Längkvist, Amy Loutfi |
Abstract | Automatic sleep staging is a challenging problem and state-of-the-art algorithms have not yet reached satisfactory performance to be used instead of manual scoring by a sleep technician. Much research has been done to find good feature representations that extract the useful information to correctly classify each epoch into the correct sleep stage. While many useful features have been discovered, the number of features has grown to the extent that a feature reduction step is necessary in order to avoid the curse of dimensionality. One reason for the need of such a large feature set is that many features are good for discriminating only one of the sleep stages and are less informative during other stages. This paper explores how a second feature representation over a large set of pre-defined features can be learned using an auto-encoder with a selective attention for the current sleep stage in the training batch. This selective attention allows the model to learn feature representations that focus on the more relevant inputs without having to perform any dimensionality reduction of the input data. The performance of the proposed algorithm is evaluated on a large data set of polysomnography (PSG) night recordings of patients with sleep-disordered breathing. The performance of the auto-encoder with selective attention is compared with a regular auto-encoder and previous works using a deep belief network (DBN). |
Tasks | Automatic Sleep Stage Classification, Dimensionality Reduction |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05036v1 |
http://arxiv.org/pdf/1805.05036v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-with-an-attention |
Repo | |
Framework | |
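The selective-attention idea can be sketched as a reconstruction loss in which each input feature is weighted by an attention value, so reconstruction effort concentrates on stage-relevant features. This illustrates the mechanism, not the paper's exact loss:

```python
import numpy as np

def attended_reconstruction_loss(x, x_hat, attention):
    """Squared-error reconstruction loss where each input feature is
    weighted by a (normalized) attention value, so errors on the
    attended features dominate the objective."""
    weights = attention / (attention.sum() + 1e-8)
    return float(np.sum(weights * (x - x_hat) ** 2))
```

Features with zero attention contribute nothing, mimicking a soft feature selection without an explicit dimensionality-reduction step.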
Supervised learning with quantum enhanced feature spaces
Title | Supervised learning with quantum enhanced feature spaces |
Authors | Vojtech Havlicek, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, Jay M. Gambetta |
Abstract | Machine learning and quantum computing are two technologies each with the potential for altering how computation is performed to address previously untenable problems. Kernel methods for machine learning are ubiquitous for pattern recognition, with support vector machines (SVMs) being the most well-known method for classification problems. However, there are limitations to the successful solution of such problems when the feature space becomes large and the kernel functions become computationally expensive to estimate. A core element of the computational speed-ups afforded by quantum algorithms is the exploitation of an exponentially large quantum state space through controllable entanglement and interference. Here, we propose and experimentally implement two novel methods on a superconducting processor. Both methods represent the feature space of a classification problem by a quantum state, taking advantage of the large dimensionality of quantum Hilbert space to obtain an enhanced solution. One method, the quantum variational classifier, builds on [1,2] and operates by using a variational quantum circuit to classify a training set in direct analogy to conventional SVMs. In the second, a quantum kernel estimator, we estimate the kernel function and optimize the classifier directly. The two methods present a new class of tools for exploring the applications of noisy intermediate-scale quantum computers [3] to machine learning. |
Tasks | |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11326v2 |
http://arxiv.org/pdf/1804.11326v2.pdf | |
PWC | https://paperswithcode.com/paper/supervised-learning-with-quantum-enhanced |
Repo | |
Framework | |
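The quantum kernel estimator's central object is a kernel defined as the squared overlap between mapped states, k(x, y) = |⟨Φ(x)|Φ(y)⟩|². A classically simulable toy sketch with a hypothetical two-amplitude feature map (the paper instead prepares states with a hard-to-simulate entangling circuit):

```python
import numpy as np

def feature_map(x):
    """Hypothetical two-amplitude 'state' encoding of a 2-D point;
    purely illustrative, not the paper's circuit feature map."""
    state = np.array([np.cos(x[0]), np.sin(x[0]) * np.exp(1j * x[1])])
    return state / np.linalg.norm(state)

def state_kernel(x, y):
    """Kernel entry as the squared overlap (fidelity) of mapped states."""
    return float(np.abs(np.vdot(feature_map(x), feature_map(y))) ** 2)
```

The resulting kernel matrix is symmetric with unit diagonal and could be fed to any classical kernel classifier.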