October 21, 2019

Paper Group AWR 5

Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video. Detecting Visual Relationships Using Box Attention. Copula Variational Bayes inference via information geometry. Network Traffic Anomaly Detection Using Recurrent Neural Networks. Learning Discriminative 3D Shape Representations by View Discerning Networks. Cont …

Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video

Title Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video
Authors Radu Tudor Ionescu, Fahad Shahbaz Khan, Mariana-Iuliana Georgescu, Ling Shao
Abstract Abnormal event detection in video is a challenging vision problem. Most existing approaches formulate abnormal event detection as an outlier detection task, due to the scarcity of anomalous data during training. Because of the lack of prior information regarding abnormal events, these methods are not fully-equipped to differentiate between normal and abnormal events. In this work, we formalize abnormal event detection as a one-versus-rest binary classification problem. Our contribution is two-fold. First, we introduce an unsupervised feature learning framework based on object-centric convolutional auto-encoders to encode both motion and appearance information. Second, we propose a supervised classification approach based on clustering the training samples into normality clusters. A one-versus-rest abnormal event classifier is then employed to separate each normality cluster from the rest. For the purpose of training the classifier, the other clusters act as dummy anomalies. During inference, an object is labeled as abnormal if the highest classification score assigned by the one-versus-rest classifiers is negative. Comprehensive experiments are performed on four benchmarks: Avenue, ShanghaiTech, UCSD and UMN. Our approach provides superior results on all four data sets. On the large-scale ShanghaiTech data set, our method provides an absolute gain of 8.4% in terms of frame-level AUC compared to the state-of-the-art method [Sultani et al., CVPR 2018].
Tasks Abnormal Event Detection In Video, Outlier Detection
Published 2018-12-11
URL http://arxiv.org/abs/1812.04960v2
PDF http://arxiv.org/pdf/1812.04960v2.pdf
PWC https://paperswithcode.com/paper/object-centric-auto-encoders-and-dummy
Repo https://github.com/fjchange/object_centric_VAD
Framework tf
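
The inference rule stated in the abstract above (an object is abnormal when the highest one-versus-rest score is negative) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the linear scoring function, the names `ovr_scores` and `is_abnormal`, and the toy weights are all assumptions made for the example.

```python
from typing import List, Sequence


def ovr_scores(x: Sequence[float],
               weights: Sequence[Sequence[float]],
               biases: Sequence[float]) -> List[float]:
    """Score one feature vector with each one-versus-rest linear classifier.

    Classifier k separates normality cluster k from the other clusters,
    which act as dummy anomalies during training (training is not shown).
    """
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def is_abnormal(scores: Sequence[float]) -> bool:
    """Abnormal if no classifier claims the sample (highest score is negative)."""
    return max(scores) < 0.0


# Hypothetical classifiers for two normality clusters in a 2-D feature space.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [-0.5, -0.5]

near_cluster_0 = ovr_scores([1.0, 0.0], W, B)   # claimed by classifier 0
far_from_all = ovr_scores([0.0, 0.0], W, B)     # rejected by every classifier
print(is_abnormal(near_cluster_0), is_abnormal(far_from_all))
```

In practice the features would come from the object-centric auto-encoders and the classifiers would be trained on the clustered samples; only the final decision rule is taken from the abstract.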

Detecting Visual Relationships Using Box Attention

Title Detecting Visual Relationships Using Box Attention
Authors Alexander Kolesnikov, Alina Kuznetsova, Christoph H. Lampert, Vittorio Ferrari
Abstract We propose a new model for detecting visual relationships, such as “person riding motorcycle” or “bottle on table”. This task is an important step towards comprehensive structured image understanding, going beyond detecting individual objects. Our main novelty is a Box Attention mechanism that allows to model pairwise interactions between objects using standard object detection pipelines. The resulting model is conceptually clean, expressive and relies on well-justified training and prediction procedures. Moreover, unlike previously proposed approaches, our model does not introduce any additional complex components or hyperparameters on top of those already required by the underlying detection model. We conduct an experimental evaluation on three challenging datasets, V-COCO, Visual Relationships and Open Images, demonstrating strong quantitative and qualitative results.
Tasks Object Detection
Published 2018-07-05
URL https://arxiv.org/abs/1807.02136v2
PDF https://arxiv.org/pdf/1807.02136v2.pdf
PWC https://paperswithcode.com/paper/detecting-visual-relationships-using-box
Repo https://github.com/darien-schettler/bar-cnn
Framework tf

Copula Variational Bayes inference via information geometry

Title Copula Variational Bayes inference via information geometry
Authors Viet Hung Tran
Abstract Variational Bayes (VB), also known as independent mean-field approximation, has become a popular method for Bayesian network inference in recent years. Its applications are vast, e.g. neural networks, compressed sensing and clustering, to name just a few. In this paper, the independence constraint in VB will be relaxed to a conditional constraint class, called copula in statistics. Since a joint probability distribution always belongs to a copula class, the novel copula VB (CVB) approximation is a generalized form of VB. Via information geometry, we will see that the CVB algorithm iteratively projects the original joint distribution to a copula constraint space until it reaches a local minimum Kullback-Leibler (KL) divergence. In this way, all mean-field approximations, e.g. iterative VB, Expectation-Maximization (EM), Iterated Conditional Mode (ICM) and k-means algorithms, are special cases of CVB approximation. For a generic Bayesian network, an augmented hierarchy form of CVB will also be designed. While mean-field algorithms can only return a locally optimal approximation for a correlated network, the augmented CVB network, which is an optimally weighted average of a mixture of simpler network structures, can potentially achieve the globally optimal approximation for the first time. Via simulations of Gaussian mixture clustering, the classification accuracy of CVB will be shown to be far superior to that of state-of-the-art VB, EM and k-means algorithms.
Tasks
Published 2018-03-29
URL http://arxiv.org/abs/1803.10998v1
PDF http://arxiv.org/pdf/1803.10998v1.pdf
PWC https://paperswithcode.com/paper/copula-variational-bayes-inference-via
Repo https://github.com/VietTran86/Copula-Variational-Bayes
Framework none

Network Traffic Anomaly Detection Using Recurrent Neural Networks

Title Network Traffic Anomaly Detection Using Recurrent Neural Networks
Authors Benjamin J. Radford, Leonardo M. Apolonio, Antonio J. Trias, Jim A. Simpson
Abstract We show that a recurrent neural network is able to learn a model to represent sequences of communications between computers on a network and can be used to identify outlier network traffic. Defending computer networks is a challenging problem and is typically addressed by manually identifying known malicious actor behavior and then specifying rules to recognize such behavior in network communications. However, these rule-based approaches often generalize poorly and identify only those patterns that are already known to researchers. An alternative approach that does not rely on known malicious behavior patterns can potentially also detect previously unseen patterns. We tokenize and compress netflow into sequences of “words” that form “sentences” representative of a conversation between computers. These sentences are then used to generate a model that learns the semantic and syntactic grammar of the newly generated language. We use Long-Short-Term Memory (LSTM) cell Recurrent Neural Networks (RNN) to capture the complex relationships and nuances of this language. The language model is then used to predict the communications between two IPs and the prediction error is used as a measurement of how typical or atypical the observed communications are. By learning a model that is specific to each network, yet generalized to typical computer-to-computer traffic within and outside the network, a language model is able to identify sequences of network activity that are outliers with respect to the model. We demonstrate positive unsupervised attack identification performance (AUC 0.84) on the ISCX IDS dataset which contains seven days of network activity with normal traffic and four distinct attack patterns.
Tasks Anomaly Detection, Language Modelling
Published 2018-03-28
URL http://arxiv.org/abs/1803.10769v1
PDF http://arxiv.org/pdf/1803.10769v1.pdf
PWC https://paperswithcode.com/paper/network-traffic-anomaly-detection-using
Repo https://github.com/benradford/replication_arxiv_1803_10769
Framework tf
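
The idea of using a language model's prediction error as an anomaly score can be sketched with a much simpler model than the one in the paper. The example below swaps the LSTM for an add-one-smoothed bigram model so that it stays self-contained. The token strings, the `<s>` start marker, and the function names are assumptions made for illustration and do not come from the paper or its repository.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-sentence marker


def train_bigram(sentences):
    """Count token-to-token transitions over tokenized traffic 'sentences'."""
    counts = defaultdict(Counter)
    vocab = {START}
    for sentence in sentences:
        tokens = [START] + list(sentence)
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts, vocab


def anomaly_score(sentence, counts, vocab):
    """Mean negative log-probability per token; higher means less typical."""
    tokens = [START] + list(sentence)
    if len(tokens) < 2:
        raise ValueError("sentence must contain at least one token")
    v = len(vocab) + 1  # add-one smoothing; the +1 reserves mass for unseen tokens
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        row = counts.get(prev, Counter())
        total -= math.log((row[cur] + 1) / (sum(row.values()) + v))
    return total / (len(tokens) - 1)


# Hypothetical tokenized flows: each token stands for one compressed netflow record.
typical = [["tcp:443", "tcp:443", "udp:53"], ["udp:53", "tcp:443", "tcp:443"]]
counts, vocab = train_bigram(typical)
print(anomaly_score(["tcp:443", "tcp:443"], counts, vocab))
print(anomaly_score(["tcp:31337", "tcp:31337"], counts, vocab))
```

Sequences built from transitions seen in training receive a lower score than sequences of unseen tokens, which mirrors how the paper ranks conversations by how poorly the model predicts them.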

Learning Discriminative 3D Shape Representations by View Discerning Networks

Title Learning Discriminative 3D Shape Representations by View Discerning Networks
Authors Biao Leng, Cheng Zhang, Xiaocheng Zhou, Cheng Xu, Kai Xu
Abstract In view-based 3D shape recognition, extracting discriminative visual representation of 3D shapes from projected images is considered the core problem. Projections with low discriminative ability can adversely influence the final 3D shape representation. The adverse effect is even more severe in real situations with background clutter and object occlusion. To resolve this problem, we propose a novel deep neural network, View Discerning Network, which learns to judge the quality of views and adjust their contributions to the representation of shapes. In this network, a Score Generation Unit is devised to evaluate the quality of each projected image with score vectors. These score vectors are used to weight the image features and the weighted features perform much better than original features in the 3D shape recognition task. In particular, we introduce two structures of Score Generation Unit, Channel-wise Score Unit and Part-wise Score Unit, to assess the quality of feature maps from different perspectives. Our network aggregates features and scores in an end-to-end framework, so that final shape descriptors are directly obtained from its output. Our experiments on ModelNet and ShapeNet Core55 show that View Discerning Network outperforms the state of the art on the retrieval task, with excellent robustness against background clutter and object occlusion.
Tasks 3D Shape Recognition, 3D Shape Representation
Published 2018-08-11
URL http://arxiv.org/abs/1808.03823v2
PDF http://arxiv.org/pdf/1808.03823v2.pdf
PWC https://paperswithcode.com/paper/learning-discriminative-3d-shape
Repo https://github.com/chengz3906/View-Discerning-Network
Framework none
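
The weighting of per-view features by score vectors can be illustrated with a small sketch. The abstract does not specify how weighted features are combined, so the normalized channel-wise weighted average below is an assumption, as are the function name `aggregate_views` and the toy numbers. In the paper the scores are produced by a learned Score Generation Unit, which is not modeled here.

```python
from typing import List, Sequence


def aggregate_views(features: Sequence[Sequence[float]],
                    scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-view feature vectors into one shape descriptor.

    features[v][c] is channel c of view v; scores[v][c] is the non-negative
    quality score given to that channel. Low-scoring views contribute less.
    """
    n_channels = len(features[0])
    descriptor = []
    for c in range(n_channels):
        total_weight = sum(s[c] for s in scores)
        weighted_sum = sum(f[c] * s[c] for f, s in zip(features, scores))
        # Fall back to 0.0 when every view was scored zero for this channel.
        descriptor.append(weighted_sum / total_weight if total_weight > 0 else 0.0)
    return descriptor


# Two hypothetical views with two channels; view 2 is judged useless on channel 1.
features = [[1.0, 4.0], [3.0, 0.0]]
scores = [[1.0, 1.0], [1.0, 0.0]]
print(aggregate_views(features, scores))
```

Channel 0 is averaged over both views, while channel 1 is taken from the first view only because the second view's score for it is zero.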

Content preserving text generation with attribute controls

Title Content preserving text generation with attribute controls
Authors Lajanugen Logeswaran, Honglak Lee, Samy Bengio
Abstract In this work, we address the problem of modifying textual attributes of sentences. Given an input sentence and a set of attribute labels, we attempt to generate sentences that are compatible with the conditioning information. To ensure that the model generates content compatible sentences, we introduce a reconstruction loss which interpolates between auto-encoding and back-translation loss components. We propose an adversarial loss to enforce generated samples to be attribute compatible and realistic. Through quantitative, qualitative and human evaluations we demonstrate that our model is capable of generating fluent sentences that better reflect the conditioning information compared to prior methods. We further demonstrate that the model is capable of simultaneously controlling multiple attributes.
Tasks Text Generation
Published 2018-11-03
URL http://arxiv.org/abs/1811.01135v1
PDF http://arxiv.org/pdf/1811.01135v1.pdf
PWC https://paperswithcode.com/paper/content-preserving-text-generation-with
Repo https://github.com/seanie12/CPTG
Framework pytorch

Analyzing Solar Irradiance Variation From GPS and Cameras

Title Analyzing Solar Irradiance Variation From GPS and Cameras
Authors Shilpa Manandhar, Soumyabrata Dev, Yee Hui Lee, Yu Song Meng
Abstract The total amount of solar irradiance falling on the earth’s surface is an important area of study among photovoltaic (PV) engineers and remote sensing analysts. The received solar irradiance impacts the total amount of generated solar energy. However, this generation is often hindered by the high degree of solar irradiance variability. In this paper, we study the main factors behind such variability with the assistance of Global Positioning System (GPS) and ground-based, high-resolution sky cameras. This analysis will also be helpful for understanding cloud phenomena and other events in the earth’s atmosphere.
Tasks
Published 2018-04-19
URL http://arxiv.org/abs/1804.07629v1
PDF http://arxiv.org/pdf/1804.07629v1.pdf
PWC https://paperswithcode.com/paper/analyzing-solar-irradiance-variation-from-gps
Repo https://github.com/Soumyabrata/irradiance-variation
Framework none

Speaker Recognition from Raw Waveform with SincNet

Title Speaker Recognition from Raw Waveform with SincNet
Authors Mirco Ravanelli, Yoshua Bengio
Abstract Deep learning is progressively gaining popularity as a viable alternative to i-vectors for speaker recognition. Promising results have been recently obtained with Convolutional Neural Networks (CNNs) when fed by raw speech samples directly. Rather than employing standard hand-crafted features, the latter CNNs learn low-level speech representations from waveforms, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants. Proper design of the neural network is crucial to achieve this goal. This paper proposes a novel CNN architecture, called SincNet, that encourages the first convolutional layer to discover more meaningful filters. SincNet is based on parametrized sinc functions, which implement band-pass filters. In contrast to standard CNNs, that learn all elements of each filter, only low and high cutoff frequencies are directly learned from data with the proposed method. This offers a very compact and efficient way to derive a customized filter bank specifically tuned for the desired application. Our experiments, conducted on both speaker identification and speaker verification tasks, show that the proposed architecture converges faster and performs better than a standard CNN on raw waveforms.
Tasks Speaker Identification, Speaker Recognition, Speaker Verification
Published 2018-07-29
URL https://arxiv.org/abs/1808.00158v3
PDF https://arxiv.org/pdf/1808.00158v3.pdf
PWC https://paperswithcode.com/paper/speaker-recognition-from-raw-waveform-with
Repo https://github.com/pnalaba/sincnet
Framework pytorch
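
A band-pass filter defined only by its low and high cutoff frequencies can be built as the difference of two sinc low-pass filters, which is a standard signal-processing construction. The sketch below computes such a filter in plain Python. The Hamming window, the odd filter length, the normalized frequency units (cycles per sample), and the function names are assumptions of this example; the learning of the two cutoffs by gradient descent is not shown.

```python
import math
from typing import List


def sinc(x: float) -> float:
    """Unnormalized sinc: sin(x) / x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def sinc_bandpass(f_low: float, f_high: float, length: int) -> List[float]:
    """Windowed band-pass impulse response with cutoffs in cycles per sample.

    Requires 0 <= f_low < f_high <= 0.5 and an odd length of at least 3.
    Only f_low and f_high determine the filter shape.
    """
    if length < 3 or length % 2 == 0:
        raise ValueError("length must be odd and at least 3")
    if not 0.0 <= f_low < f_high <= 0.5:
        raise ValueError("need 0 <= f_low < f_high <= 0.5")
    half = (length - 1) // 2
    taps = []
    for i in range(length):
        n = i - half  # centered time index
        # Difference of two ideal low-pass filters gives an ideal band-pass.
        ideal = (2.0 * f_high * sinc(2.0 * math.pi * f_high * n)
                 - 2.0 * f_low * sinc(2.0 * math.pi * f_low * n))
        # Hamming window smooths the truncation of the ideal response.
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
        taps.append(ideal * window)
    return taps


taps = sinc_bandpass(0.1, 0.2, 11)
print(len(taps), taps[5])
```

A conventional convolutional filter of the same length would have 11 free parameters, whereas this one has two, which is the compactness the abstract refers to.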

Adaptive Network Sparsification with Dependent Variational Beta-Bernoulli Dropout

Title Adaptive Network Sparsification with Dependent Variational Beta-Bernoulli Dropout
Authors Juho Lee, Saehoon Kim, Jaehong Yoon, Hae Beom Lee, Eunho Yang, Sung Ju Hwang
Abstract While variational dropout approaches have been shown to be effective for network sparsification, they are still suboptimal in the sense that they set the dropout rate for each neuron without consideration of the input data. With such input-independent dropout, each neuron is evolved to be generic across inputs, which makes it difficult to sparsify networks without accuracy loss. To overcome this limitation, we propose adaptive variational dropout whose probabilities are drawn from sparsity-inducing beta Bernoulli prior. It allows each neuron to be evolved either to be generic or specific for certain inputs, or dropped altogether. Such input-adaptive sparsity-inducing dropout allows the resulting network to tolerate larger degree of sparsity without losing its expressive power by removing redundancies among features. We validate our dependent variational beta-Bernoulli dropout on multiple public datasets, on which it obtains significantly more compact networks than baseline methods, with consistent accuracy improvements over the base networks.
Tasks
Published 2018-05-28
URL http://arxiv.org/abs/1805.10896v3
PDF http://arxiv.org/pdf/1805.10896v3.pdf
PWC https://paperswithcode.com/paper/adaptive-network-sparsification-with
Repo https://github.com/OpenXAIProject/Variational_Dropouts
Framework tf

Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration

Title Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration
Authors Lu Sheng, Ziyi Lin, Jing Shao, Xiaogang Wang
Abstract Zero-shot artistic style transfer is an important image synthesis problem aiming at transferring arbitrary style into content images. However, the trade-off between the generalization and efficiency in existing methods impedes a high quality zero-shot style transfer in real-time. In this paper, we resolve this dilemma and propose an efficient yet effective Avatar-Net that enables visually plausible multi-scale transfer for arbitrary style. The key ingredient of our method is a style decorator that makes up the content features by semantically aligned style features from an arbitrary style image, which not only holistically matches their feature distributions but also preserves detailed style patterns in the decorated features. By embedding this module into an image reconstruction network that fuses multi-scale style abstractions, the Avatar-Net renders multi-scale stylization for any style image in one feed-forward pass. We demonstrate the state-of-the-art effectiveness and efficiency of the proposed method in generating high-quality stylized images, with a series of applications including multiple style integration and video stylization, among others.
Tasks Image Generation, Image Reconstruction, Style Transfer
Published 2018-05-10
URL http://arxiv.org/abs/1805.03857v2
PDF http://arxiv.org/pdf/1805.03857v2.pdf
PWC https://paperswithcode.com/paper/avatar-net-multi-scale-zero-shot-style
Repo https://github.com/JianqiangRen/AAMS
Framework tf

Non-Adversarial Unsupervised Word Translation

Title Non-Adversarial Unsupervised Word Translation
Authors Yedid Hoshen, Lior Wolf
Abstract Unsupervised word translation from non-parallel inter-lingual corpora has attracted much research interest. Very recently, neural network methods trained with adversarial loss functions achieved high accuracy on this task. Despite the impressive success of the recent techniques, they suffer from the typical drawbacks of generative adversarial models: sensitivity to hyper-parameters, long training time and lack of interpretability. In this paper, we make the observation that two sufficiently similar distributions can be aligned correctly with iterative matching methods. We present a novel method that first aligns the second moment of the word distributions of the two languages and then iteratively refines the alignment. Extensive experiments on word translation of European and Non-European languages show that our method achieves better performance than recent state-of-the-art deep adversarial approaches and is competitive with the supervised baseline. It is also efficient, easy to parallelize on CPU and interpretable.
Tasks
Published 2018-01-18
URL http://arxiv.org/abs/1801.06126v3
PDF http://arxiv.org/pdf/1801.06126v3.pdf
PWC https://paperswithcode.com/paper/non-adversarial-unsupervised-word-translation
Repo https://github.com/facebookresearch/MUSE
Framework pytorch

OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas

Title OmniDepth: Dense Depth Estimation for Indoors Spherical Panoramas
Authors Nikolaos Zioulis, Antonis Karakottas, Dimitrios Zarpalas, Petros Daras
Abstract Recent work on depth estimation up to now has only focused on projective images ignoring 360 content which is now increasingly and more easily produced. We show that monocular depth estimation models trained on traditional images produce sub-optimal results on omnidirectional images, showcasing the need for training directly on 360 datasets, which however, are hard to acquire. In this work, we circumvent the challenges associated with acquiring high quality 360 datasets with ground truth depth annotations, by re-using recently released large scale 3D datasets and re-purposing them to 360 via rendering. This dataset, which is considerably larger than similar projective datasets, is publicly offered to the community to enable future research in this direction. We use this dataset to learn in an end-to-end fashion the task of depth estimation from 360 images. We show promising results in our synthesized data as well as in unseen realistic images.
Tasks Depth Estimation, Monocular Depth Estimation
Published 2018-07-25
URL http://arxiv.org/abs/1807.09620v1
PDF http://arxiv.org/pdf/1807.09620v1.pdf
PWC https://paperswithcode.com/paper/omnidepth-dense-depth-estimation-for-indoors
Repo https://github.com/VCL3D/SphericalViewSynthesis
Framework pytorch

Adversarial Distillation of Bayesian Neural Network Posteriors

Title Adversarial Distillation of Bayesian Neural Network Posteriors
Authors Kuan-Chieh Wang, Paul Vicol, James Lucas, Li Gu, Roger Grosse, Richard Zemel
Abstract Bayesian neural networks (BNNs) allow us to reason about uncertainty in a principled way. Stochastic Gradient Langevin Dynamics (SGLD) enables efficient BNN learning by drawing samples from the BNN posterior using mini-batches. However, SGLD and its extensions require storage of many copies of the model parameters, a potentially prohibitive cost, especially for large neural networks. We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using a Generative Adversarial Network (GAN). At test-time, samples are generated by the GAN. We show that this distillation framework incurs no loss in performance on recent BNN applications including anomaly detection, active learning, and defense against adversarial attacks. By construction, our framework not only distills the Bayesian predictive distribution, but the posterior itself. This allows one to compute quantities such as the approximate model variance, which is useful in downstream tasks. To our knowledge, these are the first results applying MCMC-based BNNs to the aforementioned downstream applications.
Tasks Active Learning, Anomaly Detection
Published 2018-06-27
URL http://arxiv.org/abs/1806.10317v1
PDF http://arxiv.org/pdf/1806.10317v1.pdf
PWC https://paperswithcode.com/paper/adversarial-distillation-of-bayesian-neural
Repo https://github.com/wangkua1/apd_public
Framework pytorch

A Practical Incremental Learning Framework For Sparse Entity Extraction

Title A Practical Incremental Learning Framework For Sparse Entity Extraction
Authors Hussein S. Al-Olimat, Steven Gustafson, Jason Mackay, Krishnaprasad Thirunarayan, Amit Sheth
Abstract This work addresses challenges arising from extracting entities from textual data, including the high cost of data annotation, model accuracy, selecting appropriate evaluation criteria, and the overall quality of annotation. We present a framework that integrates Entity Set Expansion (ESE) and Active Learning (AL) to reduce the annotation cost of sparse data and provide an online evaluation method as feedback. This incremental and interactive learning framework allows for rapid annotation and subsequent extraction of sparse data while maintaining high accuracy. We evaluate our framework on three publicly available datasets and show that it drastically reduces the cost of sparse entity annotation by an average of 85% and 45% to reach 0.9 and 1.0 F-Scores respectively. Moreover, the method exhibited robust performance across all datasets.
Tasks Active Learning, Entity Extraction
Published 2018-06-26
URL http://arxiv.org/abs/1806.09751v1
PDF http://arxiv.org/pdf/1806.09751v1.pdf
PWC https://paperswithcode.com/paper/a-practical-incremental-learning-framework
Repo https://github.com/halolimat/SpExtor
Framework none

Unsupervised Evaluation and Weighted Aggregation of Ranked Predictions

Title Unsupervised Evaluation and Weighted Aggregation of Ranked Predictions
Authors Mehmet Eren Ahsen, Robert Vogel, Gustavo Stolovitzky
Abstract Learning algorithms that aggregate predictions from an ensemble of diverse base classifiers consistently outperform individual methods. Many of these strategies have been developed in a supervised setting, where the accuracy of each base classifier can be empirically measured and this information is incorporated in the training process. However, the reliance on labeled data precludes the application of ensemble methods to many real world problems where labeled data has not been curated. To this end we developed a new theoretical framework for binary classification, the Strategy for Unsupervised Multiple Method Aggregation (SUMMA), to estimate the performances of base classifiers and an optimal strategy for ensemble learning from unlabeled data.
Tasks
Published 2018-02-13
URL http://arxiv.org/abs/1802.04684v1
PDF http://arxiv.org/pdf/1802.04684v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-evaluation-and-weighted
Repo https://github.com/learn-ensemble/PY-SUMMA
Framework none