Paper Group ANR 662
SSCNets: A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks
Title | SSCNets: A Selective Sobel Convolution-based Technique to Enhance the Robustness of Deep Neural Networks against Security Attacks |
Authors | Hammad Tariq, Hassan Ali, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman, Rehan Ahmed, Muhammad Shafique |
Abstract | Recent studies have shown that slight perturbations in the input data can significantly affect the robustness of Deep Neural Networks (DNNs), leading to misclassification and confidence reduction. In this paper, we introduce a novel technique based on the Selective Sobel Convolution (SSC) operation in the training loop that increases the robustness of a given DNN by allowing it to learn important edges in the input in a controlled fashion. This is achieved by introducing a trainable parameter, which acts as a threshold for eliminating the weaker edges. We validate our technique on convolutional DNNs using adversarial attacks from the Cleverhans library. Our experimental results on the MNIST and CIFAR10 datasets illustrate that this controlled learning considerably increases the accuracy of the DNNs by 1.53% even when subjected to adversarial attacks. |
Tasks | |
Published | 2018-11-04 |
URL | http://arxiv.org/abs/1811.01443v1 |
http://arxiv.org/pdf/1811.01443v1.pdf | |
PWC | https://paperswithcode.com/paper/sscnets-a-selective-sobel-convolution-based |
Repo | |
Framework | |
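The core SSC idea, extracting Sobel edges and then suppressing the weaker ones below a threshold, can be sketched in NumPy. This is illustrative only: the paper makes the threshold a trainable parameter inside the training loop, whereas here it is a fixed constant, and the magnitude normalization is an assumption.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2d(img, kernel):
    """Valid 2-D cross-correlation with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def selective_sobel(img, threshold=0.5):
    """Sobel gradient magnitude with weak edges suppressed."""
    gx = conv2d(img, SOBEL_X)
    gy = conv2d(img, SOBEL_Y)
    mag = np.hypot(gx, gy)
    mag = mag / (mag.max() + 1e-8)               # normalize to [0, 1]
    return np.where(mag >= threshold, mag, 0.0)  # drop edges below threshold
```

On a vertical step-edge image, only the columns at the intensity boundary survive the threshold.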
Comparative study of motion detection methods for video surveillance systems
Title | Comparative study of motion detection methods for video surveillance systems |
Authors | Kamal Sehairi, Chouireb Fatima, Jean Meunier |
Abstract | The objective of this study is to compare several change detection methods for a single static camera and identify the best method for different complex environments and backgrounds in indoor and outdoor scenes. To this end, we used the CDnet video dataset as a benchmark; it consists of many challenging problems, ranging from basic simple scenes to complex scenes affected by bad weather and dynamic backgrounds. Twelve change detection methods, ranging from simple temporal differencing to more sophisticated methods, were tested, and several performance metrics were used to precisely evaluate the results. Because most of the considered methods have not previously been evaluated on this recent large-scale dataset, this work compares these methods to fill a gap in the literature, and the evaluation thus complements previous comparative evaluations. Our experimental results show that there is no perfect method for all challenging cases: each method performs well in certain cases and fails in others. However, this study enables the user to identify the most suitable method for his or her needs. |
Tasks | Motion Detection |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05459v1 |
http://arxiv.org/pdf/1804.05459v1.pdf | |
PWC | https://paperswithcode.com/paper/comparative-study-of-motion-detection-methods |
Repo | |
Framework | |
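The simplest of the compared families, temporal differencing, can be sketched as follows. This is a minimal baseline, not any specific method from the study, and the threshold value is an assumption:

```python
import numpy as np

def temporal_difference_mask(prev_frame, curr_frame, threshold=25):
    """Basic change detection: a pixel is flagged as motion when its
    absolute intensity change between consecutive frames exceeds a
    threshold (more sophisticated methods model the background)."""
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    return (diff > threshold).astype(np.uint8)
```

The cast to `int` avoids unsigned-integer wraparound when subtracting 8-bit frames.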
Software Engineering Challenges of Deep Learning
Title | Software Engineering Challenges of Deep Learning |
Authors | Anders Arpteg, Björn Brinne, Luka Crnkovic-Friis, Jan Bosch |
Abstract | Surprisingly promising results have been achieved by deep learning (DL) systems in recent years. Many of these achievements have been reached in academic settings, or by large technology companies with highly skilled research groups and advanced supporting infrastructure. For companies without large research groups or advanced infrastructure, building high-quality production-ready systems with DL components has proven challenging. There is a clear lack of well-functioning tools and best practices for building DL systems. It is the goal of this research to identify what the main challenges are, by applying an interpretive research approach in close collaboration with companies of varying size and type. A set of seven projects have been selected to describe the potential with this new technology and to identify associated main challenges. A set of 12 main challenges has been identified and categorized into the three areas of development, production, and organizational challenges. Furthermore, a mapping between the challenges and the projects is defined, together with selected motivating descriptions of how and why the challenges apply to specific projects. Compared to other areas such as software engineering or database technologies, it is clear that DL is still rather immature and in need of further work to facilitate development of high-quality systems. The challenges identified in this paper can be used to guide future research by the software engineering and DL communities. Together, we could enable a large number of companies to start taking advantage of the high potential of the DL technology. |
Tasks | |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12034v1 |
http://arxiv.org/pdf/1810.12034v1.pdf | |
PWC | https://paperswithcode.com/paper/software-engineering-challenges-of-deep |
Repo | |
Framework | |
Diagnosing Error in Temporal Action Detectors
Title | Diagnosing Error in Temporal Action Detectors |
Authors | Humam Alwassel, Fabian Caba Heilbron, Victor Escorcia, Bernard Ghanem |
Abstract | Despite the recent progress in video understanding and the continuous rate of improvement in temporal action localization throughout the years, it is still unclear how far (or close?) we are to solving the problem. To this end, we introduce a new diagnostic tool to analyze the performance of temporal action detectors in videos and compare different methods beyond a single scalar metric. We exemplify the use of our tool by analyzing the performance of the top rewarded entries in the latest ActivityNet action localization challenge. Our analysis shows that the most impactful areas to work on are: strategies to better handle temporal context around the instances, improving the robustness w.r.t. the instance absolute and relative size, and strategies to reduce the localization errors. Moreover, our experimental analysis finds that the lack of agreement among annotators is not a major roadblock to attaining progress in the field. Our diagnostic tool is publicly available to keep fueling the minds of other researchers with additional insights about their algorithms. |
Tasks | Action Localization, Temporal Action Localization, Video Understanding |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10706v1 |
http://arxiv.org/pdf/1807.10706v1.pdf | |
PWC | https://paperswithcode.com/paper/diagnosing-error-in-temporal-action-detectors |
Repo | |
Framework | |
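Localization error in this setting is conventionally measured with temporal intersection-over-union between predicted and ground-truth segments. A minimal sketch of that standard metric (not code from the paper's diagnostic tool):

```python
def temporal_iou(seg_a, seg_b):
    """Temporal intersection-over-union between two [start, end]
    segments: the standard matching criterion when scoring
    temporal action localization results."""
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0
```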
Bayesian shape modelling of cross-sectional geological data
Title | Bayesian shape modelling of cross-sectional geological data |
Authors | Thomai Tsiftsi, Ian H. Jermyn, Jochen Einbeck |
Abstract | Shape information is of great importance in many applications. For example, the oil-bearing capacity of sand bodies, the subterranean remnants of ancient rivers, is related to their cross-sectional shapes. The analysis of these shapes is therefore of some interest, but current classifications are simplistic and ad hoc. In this paper, we describe the first steps towards a coherent statistical analysis of these shapes by deriving the integrated likelihood for data shapes given class parameters. The result is of interest beyond this particular application. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09631v1 |
http://arxiv.org/pdf/1802.09631v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-shape-modelling-of-cross-sectional |
Repo | |
Framework | |
FusionNet and AugmentedFlowNet: Selective Proxy Ground Truth for Training on Unlabeled Images
Title | FusionNet and AugmentedFlowNet: Selective Proxy Ground Truth for Training on Unlabeled Images |
Authors | Osama Makansi, Eddy Ilg, Thomas Brox |
Abstract | Recent work has shown that convolutional neural networks (CNNs) can be used to estimate optical flow with high quality and fast runtime. This makes them preferable for real-world applications. However, such networks require very large training datasets. Engineering the training data is difficult and/or laborious. This paper shows how to augment a network trained on an existing synthetic dataset with large amounts of additional unlabelled data. In particular, we introduce a selection mechanism to assemble from multiple estimates a joint optical flow field, which outperforms that of all input methods. The latter can be used as proxy-ground-truth to train a network on real-world data and to adapt it to specific domains of interest. Our experimental results show that the performance of networks improves considerably, both in cross-domain and in domain-specific scenarios. As a consequence, we obtain state-of-the-art results on the KITTI benchmarks. |
Tasks | Optical Flow Estimation |
Published | 2018-08-20 |
URL | http://arxiv.org/abs/1808.06389v1 |
http://arxiv.org/pdf/1808.06389v1.pdf | |
PWC | https://paperswithcode.com/paper/fusionnet-and-augmentedflownet-selective |
Repo | |
Framework | |
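The selection mechanism can be illustrated as per-pixel fusion: given several candidate flow fields and an error map for each (e.g. a photometric warping error), keep the candidate with the lowest error at every pixel. A hedged NumPy sketch; the array shapes and the error criterion are assumptions, not the paper's exact procedure:

```python
import numpy as np

def select_best_flow(flow_candidates, error_maps):
    """Per-pixel fusion: at each pixel keep the candidate flow vector
    whose error estimate is smallest.
    flow_candidates: (K, H, W, 2); error_maps: (K, H, W)."""
    best = np.argmin(error_maps, axis=0)                  # (H, W) winner index
    _, h, w, _ = flow_candidates.shape
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return flow_candidates[best, ii, jj]                  # (H, W, 2) fused flow
```

The fused field can then serve as proxy ground truth for training on unlabeled frames.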
Keyphrase Generation with Correlation Constraints
Title | Keyphrase Generation with Correlation Constraints |
Authors | Jun Chen, Xiaoming Zhang, Yu Wu, Zhao Yan, Zhoujun Li |
Abstract | In this paper, we study automatic keyphrase generation. Although conventional approaches to this task show promising results, they neglect correlation among keyphrases, resulting in duplication and coverage issues. To solve these problems, we propose a new sequence-to-sequence architecture for keyphrase generation named CorrRNN, which captures correlation among multiple keyphrases in two ways. First, we employ a coverage vector to indicate whether each word in the source document has been summarized by previous phrases, improving the coverage of keyphrases. Second, preceding phrases are taken into account to eliminate duplicate phrases and improve result coherence. Experiment results show that our model significantly outperforms the state-of-the-art method on benchmark datasets in terms of both accuracy and diversity. |
Tasks | |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07185v1 |
http://arxiv.org/pdf/1808.07185v1.pdf | |
PWC | https://paperswithcode.com/paper/keyphrase-generation-with-correlation |
Repo | |
Framework | |
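The coverage idea can be sketched with a simple additive coverage vector and the usual coverage penalty from coverage-based seq2seq models. This illustrates the mechanism, not CorrRNN's exact formulation:

```python
import numpy as np

def update_coverage(coverage, attention):
    """Accumulate attention over decoding steps: high coverage marks
    source words already summarized by previously generated phrases."""
    return coverage + attention

def coverage_penalty(coverage, attention):
    """Penalize re-attending to already-covered words: sum of
    per-position minima, the usual coverage-loss form."""
    return float(np.sum(np.minimum(coverage, attention)))
```

At the first step the penalty is zero; repeating the same attention pattern afterwards is penalized.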
Generative Steganography by Sampling
Title | Generative Steganography by Sampling |
Authors | Jia Liu, Yu Lei, Yan Ke, Jun Li, Minqing Zhang, Xiaoyuan Yan |
Abstract | In this paper, a new data-driven information hiding scheme called generative steganography by sampling (GSS) is proposed. The stego is directly sampled by a powerful generator without an explicit cover. A secret key shared by both parties is used for message embedding and extraction. The Jensen-Shannon divergence is introduced as a new criterion for evaluating the security of generative steganography. Based on these principles, a simple practical generative steganography method is proposed using semantic image inpainting. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated stego images. |
Tasks | Image Inpainting |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.10531v1 |
http://arxiv.org/pdf/1804.10531v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-steganography-by-sampling |
Repo | |
Framework | |
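The security criterion, the Jensen-Shannon divergence between the cover and stego distributions, can be computed as follows (standard definition with base-2 logarithms, so the value is bounded in [0, 1]):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence in bits; eps guards against log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log2(p / q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, 0 for identical
    distributions, 1 (in bits) for disjoint ones."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = (p + q) / 2
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```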
Evaluating the Impact of Intensity Normalization on MR Image Synthesis
Title | Evaluating the Impact of Intensity Normalization on MR Image Synthesis |
Authors | Jacob C. Reinhold, Blake E. Dewey, Aaron Carass, Jerry L. Prince |
Abstract | Image synthesis learns a transformation from the intensity features of an input image to yield a different tissue contrast of the output image. This process has been shown to have application in many medical image analysis tasks including imputation, registration, and segmentation. To carry out synthesis, the intensities of the input images are typically scaled (i.e., normalized) both in training to learn the transformation and in testing when applying the transformation, but it is not presently known what type of input scaling is optimal. In this paper, we consider seven different intensity normalization algorithms and three different synthesis methods to evaluate the impact of normalization. Our experiments demonstrate that intensity normalization as a preprocessing step improves the synthesis results across all investigated synthesis algorithms. Furthermore, we show evidence that suggests intensity normalization is vital for successful deep learning-based MR image synthesis. |
Tasks | Image Generation, Imputation |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04652v1 |
http://arxiv.org/pdf/1812.04652v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-the-impact-of-intensity |
Repo | |
Framework | |
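One of the simplest normalization algorithms in such comparisons is z-score normalization, optionally restricted to a foreground mask. A minimal sketch; the mask-based variant is an assumption about the setup, not the paper's exact pipeline:

```python
import numpy as np

def zscore_normalize(volume, mask=None):
    """Z-score intensity normalization: subtract the mean and divide
    by the standard deviation, computed over a foreground mask when
    one is given (otherwise over the whole volume)."""
    volume = volume.astype(float)
    vals = volume[mask] if mask is not None else volume.ravel()
    return (volume - vals.mean()) / (vals.std() + 1e-8)
```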
A Purely End-to-end System for Multi-speaker Speech Recognition
Title | A Purely End-to-end System for Multi-speaker Speech Recognition |
Authors | Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey |
Abstract | Recently, there has been growing interest in multi-speaker speech recognition, where the utterances of multiple speakers are recognized from their mixture. Promising techniques have been proposed for this task, but earlier works have required additional training data such as isolated source signals or senone alignments for effective learning. In this paper, we propose a new sequence-to-sequence framework to directly decode multiple label sequences from a single speech sequence by unifying source separation and speech recognition functions in an end-to-end manner. We further propose a new objective function to improve the contrast between the hidden vectors to avoid generating similar hypotheses. Experimental results show that the model is directly able to learn a mapping from a speech mixture to multiple label sequences, achieving 83.1% relative improvement compared to a model trained without the proposed objective. Interestingly, the results are comparable to those produced by previous end-to-end works featuring explicit separation and recognition modules. |
Tasks | Speech Recognition |
Published | 2018-05-15 |
URL | http://arxiv.org/abs/1805.05826v1 |
http://arxiv.org/pdf/1805.05826v1.pdf | |
PWC | https://paperswithcode.com/paper/a-purely-end-to-end-system-for-multi-speaker |
Repo | |
Framework | |
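A contrast objective of the kind described, one that grows when the hidden representations of the two hypotheses are similar, can be sketched as a squared cosine similarity penalty. This is an illustrative stand-in; the paper's exact objective may differ:

```python
import numpy as np

def contrast_penalty(h1, h2, eps=1e-8):
    """Squared cosine similarity between two hidden vectors: near 1
    when the representations (and hence the decoded hypotheses) are
    similar, near 0 when they are well separated."""
    cos = np.dot(h1, h2) / (np.linalg.norm(h1) * np.linalg.norm(h2) + eps)
    return float(cos ** 2)
```

Minimizing this penalty alongside the recognition loss pushes the two speakers' hidden vectors apart.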
Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware
Title | Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware |
Authors | Natan Liss, Chaim Baskin, Avi Mendelson, Alex M. Bronstein, Raja Giryes |
Abstract | Convolutional Neural Networks (CNNs) have become a popular choice for various tasks such as computer vision, speech recognition, and natural language processing. Thanks to their large computational capability and throughput, GPUs, which are not power efficient and therefore do not suit low-power systems such as mobile devices, are the most common platform for both training and inference tasks. Recent studies have shown that FPGAs can provide a good alternative to GPUs as CNN accelerators, due to their reconfigurable nature, low power consumption, and small latency. In order for FPGA-based accelerators to outperform GPUs in inference tasks, both the parameters of the network and the activations must be quantized. While most works use uniform quantizers for both parameters and activations, a uniform quantizer is not always optimal, and a non-uniform quantizer needs to be considered. In this work we introduce a custom hardware-friendly approach to implement non-uniform quantizers. In addition, we use a single-scale integer representation of both parameters and activations, for both training and inference. The combined method yields a hardware-efficient non-uniform quantizer fit for real-time applications. We have tested our method on the CIFAR-10 and CIFAR-100 image classification datasets with ResNet-18 and VGG-like architectures, and saw little degradation in accuracy. |
Tasks | Image Classification, Speech Recognition |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10869v1 |
http://arxiv.org/pdf/1811.10869v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-non-uniform-quantizer-for-quantized |
Repo | |
Framework | |
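Non-uniform quantization itself reduces to nearest-neighbour assignment against a (possibly learned, unevenly spaced) codebook. A minimal sketch, where the codebook values are placeholders rather than the paper's learned levels:

```python
import numpy as np

def nonuniform_quantize(x, levels):
    """Quantize values to the nearest entry of a (possibly unevenly
    spaced) codebook of quantization levels."""
    x = np.asarray(x, dtype=float)
    levels = np.sort(np.asarray(levels, dtype=float))
    idx = np.argmin(np.abs(x[..., None] - levels), axis=-1)
    return levels[idx]
```

A hardware implementation would replace the distance search with comparisons against the sorted break-points between levels.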
Learning to Prune Filters in Convolutional Neural Networks
Title | Learning to Prune Filters in Convolutional Neural Networks |
Authors | Qiangui Huang, Kevin Zhou, Suya You, Ulrich Neumann |
Abstract | Many state-of-the-art computer vision algorithms use large-scale convolutional neural networks (CNNs) as basic building blocks. These CNNs are known for their huge number of parameters, high redundancy in weights, and tremendous computing resource consumption. This paper presents a learning algorithm to simplify and speed up these CNNs. Specifically, we introduce a “try-and-learn” algorithm to train pruning agents that remove unnecessary CNN filters in a data-driven way. With the help of a novel reward function, our agents remove a significant number of filters in CNNs while maintaining performance at a desired level. Moreover, this method provides easy control of the tradeoff between network performance and its scale. Performance of our algorithm is validated with comprehensive pruning experiments on several popular CNNs for visual recognition and semantic segmentation tasks. |
Tasks | Semantic Segmentation |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07365v1 |
http://arxiv.org/pdf/1801.07365v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-prune-filters-in-convolutional |
Repo | |
Framework | |
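For contrast with the paper's learned "try-and-learn" pruning agents, the common data-free baseline ranks filters by L1 norm and keeps the strongest. A sketch of that baseline, explicitly not the paper's method:

```python
import numpy as np

def prune_filters_by_l1(conv_weights, keep_ratio=0.5):
    """Rank convolutional filters by L1 norm and return the (sorted)
    indices of the strongest ones. conv_weights has shape
    (out_channels, in_channels, kH, kW)."""
    norms = np.abs(conv_weights).sum(axis=(1, 2, 3))  # one score per filter
    n_keep = max(1, int(round(keep_ratio * len(norms))))
    keep = np.argsort(norms)[::-1][:n_keep]           # strongest first
    return np.sort(keep)
```

The paper replaces this fixed heuristic with an agent trained against a reward that balances accuracy and compression.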
SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions
Title | SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions |
Authors | Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Zhiyong Feng, Wanli Xue, Rong Chen |
Abstract | Although many reinforcement learning methods have been proposed for learning the optimal solutions in single-agent continuous-action domains, multiagent coordination domains with continuous actions have received relatively little investigation. In this paper, we propose an independent learner hierarchical method, named Sample Continuous Coordination with recursive Frequency Maximum Q-Value (SCC-rFMQ), which divides the cooperative problem with continuous actions into two layers. The first layer samples a finite set of actions from the continuous action spaces by a re-sampling mechanism with variable exploratory rates, and the second layer evaluates the actions in the sampled action set and updates the policy using a reinforcement learning cooperative method. By constructing cooperative mechanisms at both levels, SCC-rFMQ can handle cooperative problems in continuous action cooperative Markov games effectively. The effectiveness of SCC-rFMQ is experimentally demonstrated on two well-designed games, i.e., a continuous version of the climbing game and a cooperative version of the boat problem. Experimental results show that SCC-rFMQ outperforms other reinforcement learning algorithms. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06625v1 |
http://arxiv.org/pdf/1809.06625v1.pdf | |
PWC | https://paperswithcode.com/paper/scc-rfmq-learning-in-cooperative-markov-games |
Repo | |
Framework | |
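The first layer's re-sampling step can be illustrated in one dimension: draw a fresh candidate action set around the currently best-valued action, with an exploration rate controlling the spread. This is a simplified sketch; the paper's recursive variable-rate scheme is more involved, and the [0, 1] action range is an assumption:

```python
import numpy as np

def resample_actions(actions, q_values, n_samples, sigma, rng):
    """Build a fresh finite candidate set for a continuous action
    space by sampling around the action with the highest Q-value;
    sigma plays the role of the exploration rate."""
    best = actions[int(np.argmax(q_values))]
    candidates = best + rng.normal(0.0, sigma, size=n_samples)
    return np.clip(candidates, 0.0, 1.0)  # keep actions in the valid range
```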
A Deep Learning Approach with an Attention Mechanism for Automatic Sleep Stage Classification
Title | A Deep Learning Approach with an Attention Mechanism for Automatic Sleep Stage Classification |
Authors | Martin Längkvist, Amy Loutfi |
Abstract | Automatic sleep staging is a challenging problem and state-of-the-art algorithms have not yet reached satisfactory performance to be used instead of manual scoring by a sleep technician. Much research has been done to find good feature representations that extract the useful information to correctly classify each epoch into the correct sleep stage. While many useful features have been discovered, the number of features has grown to the extent that a feature reduction step is necessary in order to avoid the curse of dimensionality. One reason for the need of such a large feature set is that many features are good for discriminating only one of the sleep stages and are less informative during other stages. This paper explores how a second feature representation over a large set of pre-defined features can be learned using an auto-encoder with a selective attention for the current sleep stage in the training batch. This selective attention allows the model to learn feature representations that focus on the more relevant inputs without having to perform any dimensionality reduction of the input data. The performance of the proposed algorithm is evaluated on a large data set of polysomnography (PSG) night recordings of patients with sleep-disordered breathing. The performance of the auto-encoder with selective attention is compared with a regular auto-encoder and previous works using a deep belief network (DBN). |
Tasks | Automatic Sleep Stage Classification, Dimensionality Reduction |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05036v1 |
http://arxiv.org/pdf/1805.05036v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-with-an-attention |
Repo | |
Framework | |
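The selective-attention idea can be sketched as a reconstruction loss in which each input feature is weighted by an attention value, so reconstruction effort concentrates on stage-relevant features. This illustrates the mechanism, not the paper's exact loss:

```python
import numpy as np

def attended_reconstruction_loss(x, x_hat, attention):
    """Squared-error reconstruction loss where each input feature is
    weighted by a (normalized) attention value, so errors on the
    attended features dominate the objective."""
    weights = attention / (attention.sum() + 1e-8)
    return float(np.sum(weights * (x - x_hat) ** 2))
```

Features with zero attention contribute nothing, mimicking a soft feature selection without an explicit dimensionality-reduction step.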
Supervised learning with quantum enhanced feature spaces
Title | Supervised learning with quantum enhanced feature spaces |
Authors | Vojtech Havlicek, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, Jay M. Gambetta |
Abstract | Machine learning and quantum computing are two technologies each with the potential for altering how computation is performed to address previously untenable problems. Kernel methods for machine learning are ubiquitous for pattern recognition, with support vector machines (SVMs) being the most well-known method for classification problems. However, there are limitations to the successful solution of such problems when the feature space becomes large and the kernel functions become computationally expensive to estimate. A core element of the computational speed-ups afforded by quantum algorithms is the exploitation of an exponentially large quantum state space through controllable entanglement and interference. Here, we propose and experimentally implement two novel methods on a superconducting processor. Both methods represent the feature space of a classification problem by a quantum state, taking advantage of the large dimensionality of quantum Hilbert space to obtain an enhanced solution. One method, the quantum variational classifier, builds on [1,2] and operates by using a variational quantum circuit to classify a training set in direct analogy to conventional SVMs. In the second, a quantum kernel estimator, we estimate the kernel function and optimize the classifier directly. The two methods present a new class of tools for exploring the applications of noisy intermediate-scale quantum computers [3] to machine learning. |
Tasks | |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11326v2 |
http://arxiv.org/pdf/1804.11326v2.pdf | |
PWC | https://paperswithcode.com/paper/supervised-learning-with-quantum-enhanced |
Repo | |
Framework | |
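The quantum kernel estimator's central object is a kernel defined as the squared overlap between mapped states, k(x, y) = |⟨Φ(x)|Φ(y)⟩|². A classically simulable toy sketch with a hypothetical two-amplitude feature map (the paper instead prepares states with a hard-to-simulate entangling circuit):

```python
import numpy as np

def feature_map(x):
    """Hypothetical two-amplitude 'state' encoding of a 2-D point;
    purely illustrative, not the paper's circuit feature map."""
    state = np.array([np.cos(x[0]), np.sin(x[0]) * np.exp(1j * x[1])])
    return state / np.linalg.norm(state)

def state_kernel(x, y):
    """Kernel entry as the squared overlap (fidelity) of mapped states."""
    return float(np.abs(np.vdot(feature_map(x), feature_map(y))) ** 2)
```

The resulting kernel matrix is symmetric with unit diagonal and could be fed to any classical kernel classifier.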