October 21, 2019

3083 words 15 mins read

Paper Group AWR 158

MDLatLRR: A novel decomposition method for infrared and visible image fusion

Title MDLatLRR: A novel decomposition method for infrared and visible image fusion
Authors Hui Li, Xiao-Jun Wu, Josef Kittler
Abstract Image decomposition is crucial for many image processing tasks, as it allows the extraction of salient features from source images. A good image decomposition method can lead to better performance, especially in image fusion tasks. We propose a multi-level image decomposition method based on latent low-rank representation (LatLRR), which is called MDLatLRR. This decomposition method is applicable to many image processing fields. In this paper, we focus on the image fusion task. We develop a novel image fusion framework based on MDLatLRR, which is used to decompose source images into detail parts (salient features) and base parts. A nuclear-norm based fusion strategy is used to fuse the detail parts, and the base parts are fused by an averaging strategy. Compared with other state-of-the-art fusion methods, the proposed algorithm exhibits better fusion performance in both subjective and objective evaluation.
Tasks Infrared And Visible Image Fusion
Published 2018-11-06
URL https://arxiv.org/abs/1811.02291v5
PDF https://arxiv.org/pdf/1811.02291v5.pdf
PWC https://paperswithcode.com/paper/infrared-and-visible-image-fusion-using-a-1
Repo https://github.com/exceptionLi/imagefusion_deepdecomposition
Framework none
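The fusion rules in the abstract are simple enough to sketch. Below is a minimal NumPy illustration, not the authors' released code: it assumes the multi-level LatLRR decomposition has already produced base and detail parts, and the patch size and exact weighting scheme are illustrative assumptions.

```python
import numpy as np

def nuclear_norm(patch):
    # Sum of singular values, used as an activity measure for a detail patch.
    return np.linalg.svd(patch, compute_uv=False).sum()

def fuse(base_a, base_b, detail_a, detail_b, patch=16):
    """Fuse pre-decomposed base/detail parts of two source images:
    averaging for the base parts, nuclear-norm weighting for detail patches.
    Edge remainders are ignored for brevity."""
    fused_base = 0.5 * (base_a + base_b)          # averaging strategy
    fused_detail = np.zeros_like(detail_a)
    h, w = detail_a.shape
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            pa = detail_a[i:i+patch, j:j+patch]
            pb = detail_b[i:i+patch, j:j+patch]
            na, nb = nuclear_norm(pa), nuclear_norm(pb)
            wa = na / (na + nb + 1e-12)           # nuclear-norm based weight
            fused_detail[i:i+patch, j:j+patch] = wa * pa + (1 - wa) * pb
    return fused_base + fused_detail
```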

Machine Learning DDoS Detection for Consumer Internet of Things Devices

Title Machine Learning DDoS Detection for Consumer Internet of Things Devices
Authors Rohan Doshi, Noah Apthorpe, Nick Feamster
Abstract An increasing number of Internet of Things (IoT) devices are connecting to the Internet, yet many of these devices are fundamentally insecure, exposing the Internet to a variety of attacks. Botnets such as Mirai have used insecure consumer IoT devices to conduct distributed denial of service (DDoS) attacks on critical Internet infrastructure. This motivates the development of new techniques to automatically detect consumer IoT attack traffic. In this paper, we demonstrate that using IoT-specific network behaviors (e.g. limited number of endpoints and regular time intervals between packets) to inform feature selection can result in high accuracy DDoS detection in IoT network traffic with a variety of machine learning algorithms, including neural networks. These results indicate that home gateway routers or other network middleboxes could automatically detect local IoT device sources of DDoS attacks using low-cost machine learning algorithms and traffic data that is flow-based and protocol-agnostic.
Tasks Feature Selection
Published 2018-04-11
URL http://arxiv.org/abs/1804.04159v1
PDF http://arxiv.org/pdf/1804.04159v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-ddos-detection-for-consumer
Repo https://github.com/ruchikagargdiwakar/ml_cyber_security_usecases
Framework none
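The pipeline the paper evaluates is standard supervised classification over flow features. Here is a hedged sketch with scikit-learn using a random forest, one of several classifiers the paper tests; the feature files are hypothetical placeholders standing in for the stateless (packet size, inter-arrival time, protocol) and stateful (bandwidth, distinct endpoint counts) features the abstract describes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder files: per-packet feature matrix and binary labels
# (1 = DDoS attack traffic, 0 = benign IoT traffic).
X = np.load("iot_flow_features.npy")   # shape: (n_packets, n_features)
y = np.load("iot_labels.npy")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```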

Unveiling the invisible - mathematical methods for restoring and interpreting illuminated manuscripts

Title Unveiling the invisible - mathematical methods for restoring and interpreting illuminated manuscripts
Authors Luca Calatroni, Marie d’Autume, Rob Hocking, Stella Panayotova, Simone Parisotto, Paola Ricciardi, Carola-Bibiane Schönlieb
Abstract The last fifty years have seen an impressive development of mathematical methods for the analysis and processing of digital images, mostly in the context of photography, biomedical imaging and various forms of engineering. The arts have been mostly overlooked in this process, apart from a few exceptional works in the last ten years. With the rapid emergence of digitisation in the arts, however, the arts domain is becoming increasingly receptive to digital image processing methods, and paying attention to this development is therefore increasingly important. In this paper we discuss a range of mathematical methods for digital image restoration and digital visualisation for illuminated manuscripts. The latter provide an interesting opportunity for digital manipulation because they traditionally remain physically untouched. At the same time they also serve as an example of the possibilities mathematics and digital restoration offer as a generic and objective toolkit for the arts.
Tasks Image Restoration
Published 2018-03-19
URL http://arxiv.org/abs/1803.07187v1
PDF http://arxiv.org/pdf/1803.07187v1.pdf
PWC https://paperswithcode.com/paper/unveiling-the-invisible-mathematical-methods
Repo https://github.com/simoneparisotto/Manuscripts-restoration
Framework none
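To make the restoration idea concrete, here is a toy harmonic inpainting iteration in NumPy. It is a deliberately simple stand-in, not the paper's method (which surveys more sophisticated models such as total-variation and transport-based inpainting): missing pixels relax to the average of their neighbours.

```python
import numpy as np

def harmonic_inpaint(img, mask, iters=2000):
    """Fill damaged pixels (mask == True) by iterating the discrete Laplace
    equation: each missing pixel converges to the mean of its 4-neighbours.
    np.roll wraps at the borders, which is acceptable for this toy sketch."""
    out = img.astype(float).copy()
    out[mask] = out[~mask].mean()                 # crude initialisation
    for _ in range(iters):
        avg = 0.25 * (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
                      np.roll(out, 1, 1) + np.roll(out, -1, 1))
        out[mask] = avg[mask]                     # only update missing pixels
    return out
```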

Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks

Title Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks
Authors Xiaoyuan Liang, Xunsheng Du, Guiling Wang, Zhu Han
Abstract Existing inefficient traffic light control causes numerous problems, such as long delays and wasted energy. To improve efficiency, taking real-time traffic information as an input and dynamically adjusting the traffic light duration accordingly is a must. In terms of how to dynamically adjust traffic signals’ durations, existing works either split the traffic signal cycle into phases of equal duration or extract limited traffic information from the real data. In this paper, we study how to decide the traffic signals’ duration based on the collected data from different sensors and vehicular networks. We propose a deep reinforcement learning model to control the traffic light. In the model, we quantify the complex traffic scenario as states by collecting data and dividing the whole intersection into small grids. The timing changes of a traffic light are the actions, which are modeled as a high-dimensional Markov decision process. The reward is the cumulative waiting time difference between two cycles. To solve the model, a convolutional neural network is employed to map the states to rewards. The proposed model is composed of several components to improve the performance, such as dueling network, target network, double Q-learning network, and prioritized experience replay. We evaluate our model via simulation in the Simulation of Urban MObility (SUMO) in a vehicular network, and the simulation results show the efficiency of our model in controlling traffic lights.
Tasks Q-Learning
Published 2018-03-29
URL http://arxiv.org/abs/1803.11115v1
PDF http://arxiv.org/pdf/1803.11115v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-for-traffic-light
Repo https://github.com/kathyrnrouse/RL_CUIP
Framework none
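The dueling architecture mentioned in the abstract is easy to sketch in PyTorch. This is an illustrative assumption of the network shape, not the authors' code: grid size, channel counts (position/speed grids) and the number of timing actions are all placeholders.

```python
import torch
import torch.nn as nn

class DuelingTrafficQNet(nn.Module):
    """A CNN reads the grid-discretised intersection state, then separate
    value and advantage streams are recombined into Q-values over
    light-timing actions (the dueling trick)."""
    def __init__(self, n_actions=9):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(2, 32, 4, stride=2), nn.ReLU(),  # 2 channels: position, speed
            nn.Conv2d(32, 64, 2, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat = self.conv(torch.zeros(1, 2, 60, 60)).shape[1]  # infer flat size
        self.value = nn.Sequential(nn.Linear(feat, 128), nn.ReLU(), nn.Linear(128, 1))
        self.adv = nn.Sequential(nn.Linear(feat, 128), nn.ReLU(), nn.Linear(128, n_actions))

    def forward(self, x):
        h = self.conv(x)
        v, a = self.value(h), self.adv(h)
        return v + a - a.mean(dim=1, keepdim=True)  # dueling recombination
```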

Unsupervised Meta-learning of Figure-Ground Segmentation via Imitating Visual Effects

Title Unsupervised Meta-learning of Figure-Ground Segmentation via Imitating Visual Effects
Authors Ding-Jie Chen, Jui-Ting Chien, Hwann-Tzong Chen, Tyng-Luh Liu
Abstract This paper presents a “learning to learn” approach to figure-ground image segmentation. By exploring webly-abundant images of specific visual effects, our method can effectively learn the visual-effect internal representations in an unsupervised manner and use this knowledge to differentiate the figure from the ground in an image. Specifically, we formulate the meta-learning process as a compositional image editing task that learns to imitate a certain visual effect and derive the corresponding internal representation. Such a generative process can help instantiate the underlying figure-ground notion and enables the system to accomplish the intended image segmentation. Whereas existing generative methods are mostly tailored to image synthesis or style transfer, our approach offers a flexible learning mechanism to model a general concept of figure-ground segmentation from unorganized images that have no explicit pixel-level annotations. We validate our approach via extensive experiments on six datasets to demonstrate that the proposed model can be end-to-end trained without ground-truth pixel labeling yet outperforms the existing methods of unsupervised segmentation tasks.
Tasks Image Generation, Meta-Learning, Semantic Segmentation, Style Transfer
Published 2018-12-20
URL http://arxiv.org/abs/1812.08442v1
PDF http://arxiv.org/pdf/1812.08442v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-meta-learning-of-figure-ground
Repo https://github.com/timy90022/VEGAN
Framework pytorch
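The compositional editing step is the core mechanism, so here is a minimal sketch of what "imitating a visual effect via a mask" can look like. The effect name is hypothetical and the real method learns the mask generator adversarially from web images; this only illustrates the compositing.

```python
import torch

def apply_visual_effect(image, mask, effect="bw_background"):
    """Blend a visual effect into the ground region using a predicted soft
    mask: figure pixels (mask ~ 1) keep the original image, ground pixels
    receive the effect (here, a hypothetical background-desaturation effect).
    image: (B, 3, H, W), mask: (B, 1, H, W) in [0, 1]."""
    if effect == "bw_background":
        gray = image.mean(dim=1, keepdim=True).expand_as(image)
        return mask * image + (1 - mask) * gray
    raise ValueError(f"unknown effect: {effect}")
```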

Implementing Adaptive Separable Convolution for Video Frame Interpolation

Title Implementing Adaptive Separable Convolution for Video Frame Interpolation
Authors Mart Kartašev, Carlo Rapisarda, Dominik Fay
Abstract As Deep Neural Networks are becoming more popular, much of the attention is being devoted to Computer Vision problems that used to be solved with more traditional approaches. Video frame interpolation is one such challenge that has seen new research involving various techniques in deep learning. In this paper, we replicate the work of Niklaus et al. on Adaptive Separable Convolution, which claims high quality results on the video frame interpolation task. We apply the same network structure trained on a smaller dataset and experiment with various loss functions, in order to determine the optimal approach in data-scarce scenarios. The best resulting model is still able to provide visually pleasing videos, although it achieves lower evaluation scores.
Tasks Video Frame Interpolation
Published 2018-09-20
URL http://arxiv.org/abs/1809.07759v1
PDF http://arxiv.org/pdf/1809.07759v1.pdf
PWC https://paperswithcode.com/paper/implementing-adaptive-separable-convolution
Repo https://github.com/carlo-/sepconv-ios
Framework pytorch
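The separable-convolution idea itself fits in a few lines. A hedged NumPy sketch of the per-pixel resampling step, following Niklaus et al. as summarised in the abstract (kernel size here is illustrative; the original work uses 51):

```python
import numpy as np

def interp_pixel(patch1, patch2, kv1, kh1, kv2, kh2):
    """Adaptive separable convolution for one output pixel: for each input
    frame the network predicts a vertical and a horizontal 1D kernel whose
    outer product forms the 2D resampling kernel; the interpolated pixel is
    the sum of both filtered patches."""
    k1 = np.outer(kv1, kh1)          # (n, n) kernel for frame 1
    k2 = np.outer(kv2, kh2)          # (n, n) kernel for frame 2
    return (k1 * patch1).sum() + (k2 * patch2).sum()
```

The separable parameterisation is the design point: two 1D kernels of length n cost 2n parameters per pixel instead of n² for a full 2D kernel, which is what makes per-pixel kernels tractable.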

Convex Formulations for Fair Principal Component Analysis

Title Convex Formulations for Fair Principal Component Analysis
Authors Matt Olfat, Anil Aswani
Abstract Though there is a growing body of literature on fairness for supervised learning, the problem of incorporating fairness into unsupervised learning has been less well-studied. This paper studies fairness in the context of principal component analysis (PCA). We first present a definition of fairness for dimensionality reduction, and our definition can be interpreted as saying that a reduction is fair if information about a protected class (e.g., race or gender) cannot be inferred from the dimensionality-reduced data points. Next, we develop convex optimization formulations that can improve the fairness (with respect to our definition) of PCA and kernel PCA. These formulations are semidefinite programs (SDPs), and we demonstrate the effectiveness of our formulations using several datasets. We conclude by showing how our approach can be used to perform a fair (with respect to age) clustering of health data that may be used to set health insurance rates.
Tasks Dimensionality Reduction
Published 2018-02-11
URL http://arxiv.org/abs/1802.03765v3
PDF http://arxiv.org/pdf/1802.03765v3.pdf
PWC https://paperswithcode.com/paper/convex-formulations-for-fair-principal
Repo https://github.com/molfat66/FairML
Framework none
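To make the SDP idea tangible, here is a simplified CVXPY sketch. It is not the paper's exact program: it maximises projected variance over a relaxed projection matrix while constraining only the projected group-mean difference (the paper's formulation also handles higher moments and kernel PCA).

```python
import cvxpy as cp
import numpy as np

def fair_pca(X, z, k, tol=1e-3):
    """Simplified fair-PCA SDP: maximise trace(Sigma P) over 0 <= P <= I with
    trace(P) = k, subject to a mean-based fairness constraint for the binary
    protected attribute z in {0, 1}."""
    X = X - X.mean(axis=0)
    sigma = X.T @ X / len(X)
    delta = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
    d = X.shape[1]
    P = cp.Variable((d, d), PSD=True)
    constraints = [P << np.eye(d), cp.trace(P) == k,
                   cp.quad_form(delta, P) <= tol]   # limit projected mean gap
    cp.Problem(cp.Maximize(cp.trace(sigma @ P)), constraints).solve()
    return P.value   # relaxed projection; top-k eigenvectors give the map
```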

Janossy Pooling: Learning Deep Permutation-Invariant Functions for Variable-Size Inputs

Title Janossy Pooling: Learning Deep Permutation-Invariant Functions for Variable-Size Inputs
Authors Ryan L. Murphy, Balasubramaniam Srinivasan, Vinayak Rao, Bruno Ribeiro
Abstract We consider a simple and overarching representation for permutation-invariant functions of sequences (or multiset functions). Our approach, which we call Janossy pooling, expresses a permutation-invariant function as the average of a permutation-sensitive function applied to all reorderings of the input sequence. This allows us to leverage the rich and mature literature on permutation-sensitive functions to construct novel and flexible permutation-invariant functions. If carried out naively, Janossy pooling can be computationally prohibitive. To allow computational tractability, we consider three kinds of approximations: canonical orderings of sequences, functions with $k$-order interactions, and stochastic optimization algorithms with random permutations. Our framework unifies a variety of existing work in the literature, and suggests possible modeling and algorithmic extensions. We explore a few in our experiments, which demonstrate improved performance over current state-of-the-art methods.
Tasks Stochastic Optimization
Published 2018-11-05
URL http://arxiv.org/abs/1811.01900v3
PDF http://arxiv.org/pdf/1811.01900v3.pdf
PWC https://paperswithcode.com/paper/janossy-pooling-learning-deep-permutation
Repo https://github.com/balasrini33/JanossyPooling
Framework pytorch
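The definition and the stochastic approximation are both short. A minimal NumPy sketch, assuming a permutation-sensitive function f supplied by the caller; the example f is illustrative:

```python
import itertools
import numpy as np

def janossy_pool(x, f, k_sample=None, rng=None):
    """Janossy pooling of a sequence x: exactly average a permutation-
    sensitive f over all |x|! reorderings, or, when k_sample is given, use
    the stochastic approximation of averaging over a few random permutations
    (training typically samples a single permutation per step)."""
    x = np.asarray(x)
    if k_sample is None:
        perms = itertools.permutations(range(len(x)))
        return np.mean([f(x[list(p)]) for p in perms], axis=0)
    rng = np.random.default_rng() if rng is None else rng
    return np.mean([f(rng.permutation(x)) for _ in range(k_sample)], axis=0)

# Example with a permutation-sensitive f (a position-weighted sum); the exact
# average over all orderings is permutation-invariant by construction.
f = lambda seq: np.dot(seq, np.arange(1, len(seq) + 1))
print(janossy_pool([3.0, 1.0, 2.0], f))
```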

Open Source Automatic Speech Recognition for German

Title Open Source Automatic Speech Recognition for German
Authors Benjamin Milde, Arne Köhn
Abstract High quality Automatic Speech Recognition (ASR) is a prerequisite for speech-based applications and research. While state-of-the-art ASR software is freely available, the language dependent acoustic models are lacking for languages other than English, due to the limited amount of freely available training data. We train acoustic models for German with Kaldi on two datasets, which are both distributed under a Creative Commons license. The resulting model is freely redistributable, lowering the cost of entry for German ASR. The models are trained on a total of 412 hours of German read speech data and we achieve a relative word error reduction of 26% by adding data from the Spoken Wikipedia Corpus to the previously best freely available German acoustic model recipe and dataset. Our best model achieves a word error rate of 14.38 on the Tuda-De test set. Due to the large amount of speakers and the diversity of topics included in the training data, our model is robust against speaker variation and topic shift.
Tasks Speech Recognition
Published 2018-07-26
URL http://arxiv.org/abs/1807.10311v1
PDF http://arxiv.org/pdf/1807.10311v1.pdf
PWC https://paperswithcode.com/paper/open-source-automatic-speech-recognition-for
Repo https://github.com/tudarmstadt-lt/kaldi-tuda-de
Framework none

AffinityNet: semi-supervised few-shot learning for disease type prediction

Title AffinityNet: semi-supervised few-shot learning for disease type prediction
Authors Tianle Ma, Aidong Zhang
Abstract While deep learning has achieved great success in computer vision and many other fields, currently it does not work very well on patient genomic data with the “big p, small N” problem (i.e., a relatively small number of samples with high-dimensional features). In order to make deep learning work with a small amount of training data, we have to design new models that facilitate few-shot learning. Here we present the Affinity Network Model (AffinityNet), a data efficient deep learning model that can learn from a limited number of training examples and generalize well. The backbone of the AffinityNet model consists of stacked k-Nearest-Neighbor (kNN) attention pooling layers. The kNN attention pooling layer is a generalization of the Graph Attention Model (GAM), and can be applied not only to graphs but also to any set of objects, regardless of whether a graph is given or not. As a new deep learning module, kNN attention pooling layers can be plugged into any neural network model just like convolutional layers. As a simple special case of the kNN attention pooling layer, the feature attention layer can directly select important features that are useful for classification tasks. Experiments on both synthetic data and cancer genomic data from TCGA projects show that our AffinityNet model has better generalization power than conventional neural network models with little training data. The code is freely available at https://github.com/BeautyOfWeb/AffinityNet .
Tasks Few-Shot Learning
Published 2018-05-22
URL http://arxiv.org/abs/1805.08905v2
PDF http://arxiv.org/pdf/1805.08905v2.pdf
PWC https://paperswithcode.com/paper/affinitynet-semi-supervised-few-shot-learning
Repo https://github.com/BeautyOfWeb/AffinityNet
Framework pytorch
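A hedged PyTorch sketch of the kNN attention pooling idea, not the released implementation: each object attends over its k nearest neighbours in feature space and is replaced by the attention-weighted average. The similarity choice (negative squared distance) is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def knn_attention_pool(h, k=5):
    """kNN attention pooling over a set of object features h of shape (n, d).
    Works whether or not an explicit graph is given, since neighbours are
    found in feature space (self is included as a zero-distance neighbour)."""
    dist = torch.cdist(h, h)                        # (n, n) pairwise distances
    knn = dist.topk(k, largest=False).indices       # (n, k) nearest neighbours
    neigh = h[knn]                                  # (n, k, d) neighbour features
    sim = -((h.unsqueeze(1) - neigh) ** 2).sum(-1)  # attention logits
    attn = F.softmax(sim, dim=1)                    # (n, k) attention weights
    return (attn.unsqueeze(-1) * neigh).sum(1)      # pooled feature per object
```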

Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning

Title Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning
Authors Pengda Qin, Weiran Xu, William Yang Wang
Abstract Distant supervision has become the standard method for relation extraction. However, even though it is an efficient method, it does not come without cost: the resulting distantly-supervised training samples are often very noisy. To combat the noise, most of the recent state-of-the-art approaches focus on selecting a one-best sentence or calculating soft attention weights over the set of sentences for a specific entity pair. However, these methods are suboptimal, and the false positive problem remains a key bottleneck for performance. We argue that incorrectly-labeled candidate sentences must be treated with a hard decision, rather than with soft attention weights. To do this, we describe a radical solution: we explore a deep reinforcement learning strategy to generate a false-positive indicator, automatically recognizing false positives for each relation type without any supervised information. Unlike the removal operation in previous studies, we redistribute them into the negative examples. The experimental results show that the proposed strategy significantly improves the performance of distant supervision compared to state-of-the-art systems.
Tasks Relation Extraction
Published 2018-05-24
URL http://arxiv.org/abs/1805.09927v1
PDF http://arxiv.org/pdf/1805.09927v1.pdf
PWC https://paperswithcode.com/paper/robust-distant-supervision-relation
Repo https://github.com/Panda0406/Adversarial-Learning-Distant-Supervision-RE
Framework pytorch
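The key operation, redistribution rather than removal, is simple to sketch. The `policy` callable below is an assumption standing in for the learned agent, which the paper trains with reinforcement learning rewarded by the relation classifier's performance change.

```python
def redistribute(positives, negatives, policy, threshold=0.5):
    """Move sentences the policy flags as likely false positives out of the
    distantly-supervised positive set and into the negative set, instead of
    discarding them. `policy` maps a sentence to a false-positive probability."""
    kept, flagged = [], []
    for sent in positives:
        (flagged if policy(sent) > threshold else kept).append(sent)
    return kept, negatives + flagged   # flagged sentences become negatives
```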

Conditional Random Fields as Recurrent Neural Networks for 3D Medical Imaging Segmentation

Title Conditional Random Fields as Recurrent Neural Networks for 3D Medical Imaging Segmentation
Authors Miguel Monteiro, Mário A. T. Figueiredo, Arlindo L. Oliveira
Abstract The Conditional Random Field as a Recurrent Neural Network layer is a recently proposed algorithm meant to be placed on top of an existing Fully-Convolutional Neural Network to improve the quality of semantic segmentation. In this paper, we test whether this algorithm, which was shown to improve semantic segmentation for 2D RGB images, is able to improve segmentation quality for 3D multi-modal medical images. We developed an implementation of the algorithm which works for any number of spatial dimensions, input/output image channels, and reference image channels. As far as we know this is the first publicly available implementation of this sort. We tested the algorithm on two distinct 3D medical imaging datasets and concluded that the performance differences observed were not statistically significant. Finally, in the discussion section of the paper, we examine why this technique transfers poorly from natural images to medical images.
Tasks 3D Medical Imaging Segmentation, Medical Image Segmentation, Semantic Segmentation, Volumetric Medical Image Segmentation
Published 2018-07-19
URL http://arxiv.org/abs/1807.07464v1
PDF http://arxiv.org/pdf/1807.07464v1.pdf
PWC https://paperswithcode.com/paper/conditional-random-fields-as-recurrent-neural
Repo https://github.com/MiguelMonteiro/CRFasRNNLayer
Framework tf
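The layer implements mean-field inference in a dense CRF. Here is a dimension-agnostic toy sketch of one mean-field step in NumPy/SciPy; it uses only a Gaussian smoothness kernel, whereas the full model also has appearance-based kernels and a learned label-compatibility transform.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def meanfield_step(q, unary, sigma=2.0, w_pair=1.0):
    """One mean-field update of a dense CRF. q and unary have shape
    (n_classes, *spatial), so the same code serves 2D and 3D images; the
    pairwise message is a Gaussian blur of each class's marginals."""
    msg = np.stack([gaussian_filter(q[c], sigma) for c in range(q.shape[0])])
    logits = -unary + w_pair * msg                 # combine potentials
    logits -= logits.max(axis=0, keepdims=True)    # stabilise the softmax
    p = np.exp(logits)
    return p / p.sum(axis=0, keepdims=True)        # renormalised marginals
```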

Synthesizing Tabular Data using Generative Adversarial Networks

Title Synthesizing Tabular Data using Generative Adversarial Networks
Authors Lei Xu, Kalyan Veeramachaneni
Abstract Generative adversarial networks (GANs) implicitly learn the probability distribution of a dataset and can draw samples from the distribution. This paper presents Tabular GAN (TGAN), a generative adversarial network that can generate tabular data such as medical or educational records. Using the power of deep neural networks, TGAN generates high-quality and fully synthetic tables while simultaneously generating discrete and continuous variables. When we evaluate our model on three datasets, we find that TGAN outperforms conventional statistical generative models in both capturing the correlation between columns and scaling up for large datasets.
Tasks
Published 2018-11-27
URL http://arxiv.org/abs/1811.11264v1
PDF http://arxiv.org/pdf/1811.11264v1.pdf
PWC https://paperswithcode.com/paper/synthesizing-tabular-data-using-generative
Repo https://github.com/DAI-Lab/TGAN
Framework tf
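A distinctive ingredient is how continuous columns are prepared so the generator can emit multi-modal values. A simplified sketch of TGAN-style mode-specific normalization with scikit-learn (details simplified from the paper; discrete columns are one-hot encoded instead):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def mode_specific_normalize(col, n_modes=5):
    """Fit a Gaussian mixture to a continuous column, then represent each
    value by (a) its mode-membership probabilities and (b) a scalar
    normalised within the fitted modes and clipped to [-1, 1]."""
    col = col.reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_modes).fit(col)
    probs = gmm.predict_proba(col)                      # (n, n_modes) weights
    means = gmm.means_.ravel()
    stds = np.sqrt(gmm.covariances_).ravel()
    mode = probs.argmax(axis=1)
    v = (col.ravel() - means[mode]) / (4 * stds[mode])  # within-mode scaling
    return np.clip(v, -1, 1), probs
```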

Evolving Mario Levels in the Latent Space of a Deep Convolutional Generative Adversarial Network

Title Evolving Mario Levels in the Latent Space of a Deep Convolutional Generative Adversarial Network
Authors Vanessa Volz, Jacob Schrum, Jialin Liu, Simon M. Lucas, Adam Smith, Sebastian Risi
Abstract Generative Adversarial Networks (GANs) are a machine learning approach capable of generating novel example outputs across a space of provided training examples. Procedural Content Generation (PCG) of levels for video games could benefit from such models, especially for games where there is a pre-existing corpus of levels to emulate. This paper trains a GAN to generate levels for Super Mario Bros using a level from the Video Game Level Corpus. The approach successfully generates a variety of levels similar to one in the original corpus, but is further improved by application of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). Specifically, various fitness functions are used to discover levels within the latent space of the GAN that maximize desired properties. Simple static properties are optimized, such as a given distribution of tile types. Additionally, the champion A* agent from the 2009 Mario AI competition is used to assess whether a level is playable, and how many jumping actions are required to beat it. These fitness functions allow for the discovery of levels that exist within the space of examples designed by experts, and also guide the search towards levels that fulfill one or more specified objectives.
Tasks SNES Games
Published 2018-05-02
URL http://arxiv.org/abs/1805.00728v1
PDF http://arxiv.org/pdf/1805.00728v1.pdf
PWC https://paperswithcode.com/paper/evolving-mario-levels-in-the-latent-space-of
Repo https://github.com/TheHedgeify/DagstuhlGAN
Framework pytorch
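The latent-space search is the part that generalises beyond Mario, and the `cma` package makes it short. A hedged sketch: the generator, tile ids and target fraction below are hypothetical placeholders for the paper's trained DCGAN and static-property fitness.

```python
import cma                      # pip install cma
import numpy as np

# Placeholder decoder standing in for the trained GAN generator: maps a
# 32-dim latent vector to a 14x28 grid of tile ids. Purely illustrative.
def generator(z):
    rng = np.random.default_rng(abs(hash(z.tobytes())) % 2**32)
    return rng.integers(0, 10, size=(14, 28))

GROUND_TILE, TARGET_GROUND = 0, 0.3   # hypothetical tile id / target fraction

def fitness(z):
    """Static-property fitness from the abstract: penalise deviation of the
    decoded level's ground-tile fraction from a target (CMA-ES minimises)."""
    level = generator(np.asarray(z))
    return abs((level == GROUND_TILE).mean() - TARGET_GROUND)

es = cma.CMAEvolutionStrategy(32 * [0.0], 0.5)   # search the 32-dim latent space
while not es.stop():
    solutions = es.ask()
    es.tell(solutions, [fitness(z) for z in solutions])
print(es.result.xbest)                           # best latent vector found
```

Swapping in the A*-agent playability test from the paper is just a different `fitness`, which is what makes the latent-variable-evolution framing attractive.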

HCU400: An Annotated Dataset for Exploring Aural Phenomenology Through Causal Uncertainty

Title HCU400: An Annotated Dataset for Exploring Aural Phenomenology Through Causal Uncertainty
Authors Ishwarya Ananthabhotla, David B. Ramsay, Joseph A. Paradiso
Abstract The way we perceive a sound depends on many aspects: its ecological frequency, acoustic features, typicality, and most notably, its identified source. In this paper, we present the HCU400: a dataset of 402 sounds ranging from easily identifiable everyday sounds to intentionally obscured artificial ones. It aims to lower the barrier for the study of aural phenomenology as the largest available audio dataset to include an analysis of causal attribution. Each sample has been annotated with crowd-sourced descriptions, as well as familiarity, imageability, arousal, and valence ratings. We extend existing calculations of causal uncertainty, automating and generalizing them with word embeddings. Upon analysis we find that individuals will provide less polarized emotion ratings as a sound’s source becomes increasingly ambiguous; individual ratings of familiarity and imageability, on the other hand, diverge as uncertainty increases despite a clear negative trend on average.
Tasks Word Embeddings
Published 2018-11-15
URL https://arxiv.org/abs/1811.06439v2
PDF https://arxiv.org/pdf/1811.06439v2.pdf
PWC https://paperswithcode.com/paper/hcu400-an-annotated-dataset-for-exploring
Repo https://github.com/mitmedialab/HCU400
Framework none
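One plausible reading of "automating causal uncertainty with word embeddings" is an entropy over embedding clusters of the crowd-sourced source descriptions. The sketch below is an assumption along those lines, not the authors' exact metric; the clustering method and entropy base are illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans

def causal_uncertainty(label_vectors, n_clusters=8):
    """Given word-embedding vectors of the source descriptions collected for
    one sound, cluster them and take the entropy of the cluster distribution:
    low entropy means annotators agree on a cause, high entropy means the
    sound's source is ambiguous."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(label_vectors)
    counts = np.bincount(labels, minlength=n_clusters).astype(float)
    p = counts / counts.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()   # Shannon entropy in bits
```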