Paper Group ANR 131
Squeezed Convolutional Variational AutoEncoder for Unsupervised Anomaly Detection in Edge Device Industrial Internet of Things
Title | Squeezed Convolutional Variational AutoEncoder for Unsupervised Anomaly Detection in Edge Device Industrial Internet of Things |
Authors | Dohyung Kim, Hyochang Yang, Minki Chung, Sungzoon Cho |
Abstract | In this paper, we propose Squeezed Convolutional Variational AutoEncoder (SCVAE) for anomaly detection in time series data for Edge Computing in Industrial Internet of Things (IIoT). The proposed model is applied to labeled time series data from UCI datasets for exact performance evaluation, and applied to real-world data for indirect model performance comparison. In addition, by comparing the models before and after applying Fire Modules from SqueezeNet, we show that model size and inference times are reduced while similar levels of performance are maintained. |
Tasks | Anomaly Detection, Time Series, Unsupervised Anomaly Detection |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06343v1 |
http://arxiv.org/pdf/1712.06343v1.pdf | |
PWC | https://paperswithcode.com/paper/squeezed-convolutional-variational |
Repo | |
Framework | |
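The size reduction reported above comes from replacing standard convolutions with SqueezeNet Fire modules. The parameter arithmetic behind that claim can be checked directly; the channel counts below are illustrative, not taken from the paper:

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def fire_params(c_in, s1, e1, e3):
    """Fire module: a 1x1 squeeze layer down to s1 channels,
    followed by parallel 1x1 and 3x3 expand layers with e1 and e3
    filters respectively."""
    squeeze = conv_params(c_in, s1, 1)
    expand = conv_params(s1, e1, 1) + conv_params(s1, e3, 3)
    return squeeze + expand

# Hypothetical example: replace a plain 3x3 conv (64 -> 128 channels)
# with a Fire module that also emits 128 output channels.
plain = conv_params(64, 128, 3)     # 73728 weights
fire = fire_params(64, 16, 64, 64)  # 11264 weights
print(plain, fire, round(plain / fire, 1))
```

The squeeze layer bottlenecks the channel count before the expensive 3x3 expand filters, which is where most of the saving comes from.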
InScript: Narrative texts annotated with script information
Title | InScript: Narrative texts annotated with script information |
Authors | Ashutosh Modi, Tatjana Anikina, Simon Ostermann, Manfred Pinkal |
Abstract | This paper presents the InScript corpus (Narrative Texts Instantiating Script structure). InScript is a corpus of 1,000 stories centered around 10 different scenarios. Verbs and noun phrases are annotated with event and participant types, respectively. Additionally, the text is annotated with coreference information. The corpus shows rich lexical variation and will serve as a unique resource for the study of the role of script knowledge in natural language processing. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05260v1 |
http://arxiv.org/pdf/1703.05260v1.pdf | |
PWC | https://paperswithcode.com/paper/inscript-narrative-texts-annotated-with |
Repo | |
Framework | |
Multiple Adaptive Bayesian Linear Regression for Scalable Bayesian Optimization with Warm Start
Title | Multiple Adaptive Bayesian Linear Regression for Scalable Bayesian Optimization with Warm Start |
Authors | Valerio Perrone, Rodolphe Jenatton, Matthias Seeger, Cedric Archambeau |
Abstract | Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization. Typically, BO is powered by a Gaussian process (GP), whose algorithmic complexity is cubic in the number of evaluations. Hence, GP-based BO cannot leverage large amounts of past or related function evaluations, for example, to warm start the BO procedure. We develop a multiple adaptive Bayesian linear regression model as a scalable alternative whose complexity is linear in the number of observations. The multiple Bayesian linear regression models are coupled through a shared feedforward neural network, which learns a joint representation and transfers knowledge across machine learning problems. |
Tasks | |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.02902v1 |
http://arxiv.org/pdf/1712.02902v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-adaptive-bayesian-linear-regression |
Repo | |
Framework | |
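The scalability argument in the abstract rests on Bayesian linear regression having a closed-form posterior computable in one linear pass over the data, versus the cubic cost of GP inference. A minimal one-dimensional sketch of those standard textbook formulas (the paper's shared feedforward network, which supplies the features, and the multi-task coupling are omitted here):

```python
def bayes_linreg_1d(xs, ys, alpha=1.0, beta=25.0):
    """Posterior over a single weight w for y = w*x + noise, with
    prior w ~ N(0, 1/alpha) and noise precision beta. One pass over
    the data, so cost is linear in the number of observations."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    precision = alpha + beta * sxx   # posterior precision
    mean = beta * sxy / precision    # posterior mean of w
    return mean, 1.0 / precision

# Toy data generated from y = 2x with small noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.9, 4.2, 5.9]
m, v = bayes_linreg_1d(xs, ys)
print(round(m, 2))  # close to the true slope 2
```

The same sufficient statistics (sums of products) generalize to the multi-feature case, which keeps the per-observation cost constant.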
PaMM: Pose-aware Multi-shot Matching for Improving Person Re-identification
Title | PaMM: Pose-aware Multi-shot Matching for Improving Person Re-identification |
Authors | Yeong-Jun Cho, Kuk-Jin Yoon |
Abstract | Person re-identification is the problem of recognizing people across different images or videos with non-overlapping views. Although there has been much progress in person re-identification over the last decade, it remains a challenging task because appearances of people can seem extremely different across diverse camera viewpoints and person poses. In this paper, we propose a novel framework for person re-identification by analyzing camera viewpoints and person poses in a so-called Pose-aware Multi-shot Matching (PaMM), which robustly estimates people’s poses and efficiently conducts multi-shot matching based on pose information. Experimental results using public person re-identification datasets show that the proposed methods outperform state-of-the-art methods and are promising for person re-identification from diverse viewpoints and pose variances. |
Tasks | Person Re-Identification |
Published | 2017-05-17 |
URL | http://arxiv.org/abs/1705.06011v1 |
http://arxiv.org/pdf/1705.06011v1.pdf | |
PWC | https://paperswithcode.com/paper/pamm-pose-aware-multi-shot-matching-for |
Repo | |
Framework | |
Adaptive and Resilient Soft Tensegrity Robots
Title | Adaptive and Resilient Soft Tensegrity Robots |
Authors | John Rieffel, Jean-Baptiste Mouret |
Abstract | Living organisms intertwine soft (e.g., muscle) and hard (e.g., bones) materials, giving them an intrinsic flexibility and resiliency often lacking in conventional rigid robots. The emerging field of soft robotics seeks to harness these same properties in order to create resilient machines. The nature of soft materials, however, presents considerable challenges to aspects of design, construction, and control – and up until now, the vast majority of gaits for soft robots have been hand-designed through empirical trial-and-error. This manuscript describes an easy-to-assemble tensegrity-based soft robot capable of highly dynamic locomotive gaits and demonstrating structural and behavioral resilience in the face of physical damage. Enabling this is the use of a machine learning algorithm able to discover effective gaits with a minimal number of physical trials. These results lend further credence to soft-robotic approaches that seek to harness the interaction of complex material dynamics in order to generate a wealth of dynamical behaviors. |
Tasks | |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03258v2 |
http://arxiv.org/pdf/1702.03258v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-and-resilient-soft-tensegrity-robots |
Repo | |
Framework | |
Argument Labeling of Explicit Discourse Relations using LSTM Neural Networks
Title | Argument Labeling of Explicit Discourse Relations using LSTM Neural Networks |
Authors | Sohail Hooda, Leila Kosseim |
Abstract | Argument labeling of explicit discourse relations is a challenging task. State-of-the-art systems achieve slightly above 55% F-measure but require hand-crafted features. In this paper, we propose a Long Short-Term Memory (LSTM) based model for argument labeling. We experimented with multiple configurations of our model. Using the PDTB dataset, our best model achieved an F1 measure of 23.05% without any feature engineering. This is significantly higher than the 20.52% achieved by the state-of-the-art RNN approach, but significantly lower than the feature-based state-of-the-art systems. On the other hand, because our approach learns only from the raw dataset, it is more widely applicable to multiple textual genres and languages. |
Tasks | Feature Engineering |
Published | 2017-08-11 |
URL | http://arxiv.org/abs/1708.03425v2 |
http://arxiv.org/pdf/1708.03425v2.pdf | |
PWC | https://paperswithcode.com/paper/argument-labeling-of-explicit-discourse |
Repo | |
Framework | |
Concise Radiometric Calibration Using The Power of Ranking
Title | Concise Radiometric Calibration Using The Power of Ranking |
Authors | Han Gong, Graham D. Finlayson, Maryam M. Darrodi |
Abstract | Compared with raw images, the more common JPEG images are less useful for machine vision algorithms and professional photographers because JPEG-sRGB does not preserve a linear relation between pixel values and the light measured from the scene. A camera is said to be radiometrically calibrated if there is a computational model which can predict how the raw linear sensor image is mapped to the corresponding rendered image (e.g. JPEGs) and vice versa. This paper begins with the observation that the rank order of pixel values is mostly preserved post colour correction. We show that this observation is the key to solving for the whole camera pipeline (colour correction, tone and gamut mapping). Our rank-based calibration method is simpler than the prior art and so is parametrised by fewer variables which, concomitantly, can be solved for using less calibration data. Another advantage is that we can derive the camera pipeline from a single pair of raw-JPEG images. Experiments demonstrate that our method delivers state-of-the-art results (especially for the most interesting case of JPEG to raw). |
Tasks | Calibration |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.08943v3 |
http://arxiv.org/pdf/1707.08943v3.pdf | |
PWC | https://paperswithcode.com/paper/concise-radiometric-calibration-using-the |
Repo | |
Framework | |
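The central observation above is that the rank order of pixel values survives the rendering pipeline, so a monotone tone curve can be recovered simply by sorting corresponding raw/JPEG samples. A stdlib sketch of that rank idea alone, under a toy gamma-like curve (the paper's full pipeline also solves colour correction and gamut mapping):

```python
from bisect import bisect_left

def fit_tone_lut(raw, jpeg):
    """Recover a monotone raw -> JPEG lookup from paired samples by
    sorting each side independently; this is valid precisely when
    the rendering preserves rank order."""
    return sorted(raw), sorted(jpeg)

def apply_lut(lut, x):
    raw_sorted, jpeg_sorted = lut
    i = min(bisect_left(raw_sorted, x), len(raw_sorted) - 1)
    return jpeg_sorted[i]

# Toy tone curve: jpeg = raw ** 0.5 on [0, 1].
raw = [0.04, 0.25, 0.49, 0.81, 1.00]
jpeg = [v ** 0.5 for v in raw]
lut = fit_tone_lut(raw, jpeg)
print(apply_lut(lut, 0.49))  # ~0.7
```

With dense samples the sorted pairing approximates the tone curve everywhere, which is why a single raw-JPEG pair can suffice.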
Perceptual audio loss function for deep learning
Title | Perceptual audio loss function for deep learning |
Authors | Dan Elbaz, Michael Zibulevsky |
Abstract | PESQ and POLQA are standards for automated assessment of the voice quality of speech as experienced by human beings. The predictions of these objective measures should come as close as possible to the subjective quality scores obtained in subjective listening tests. Wavenet is a deep neural network originally developed as a deep generative model of raw audio waveforms. The Wavenet architecture is based on dilated causal convolutions, which exhibit very large receptive fields. In this short paper we suggest using the Wavenet architecture, in particular its large receptive field, in order to learn the PESQ algorithm. By doing so we can use it as a differentiable loss function for speech enhancement. |
Tasks | Speech Enhancement |
Published | 2017-08-20 |
URL | http://arxiv.org/abs/1708.05987v1 |
http://arxiv.org/pdf/1708.05987v1.pdf | |
PWC | https://paperswithcode.com/paper/perceptual-audio-loss-function-for-deep |
Repo | |
Framework | |
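The argument above relies on the large receptive field produced by Wavenet's dilated causal convolutions. The receptive-field arithmetic is standard and easy to verify; the layer configuration below is illustrative rather than the paper's:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated causal convolutions:
    each layer adds (kernel_size - 1) * dilation input samples."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# One Wavenet-style stack: kernel 2, dilations doubling 1..512.
dilations = [2 ** i for i in range(10)]
print(receptive_field(2, dilations))  # 1024 samples
```

Doubling the dilation per layer makes the receptive field grow exponentially in depth while the parameter count grows only linearly.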
Critical Learning Periods in Deep Neural Networks
Title | Critical Learning Periods in Deep Neural Networks |
Authors | Alessandro Achille, Matteo Rovere, Stefano Soatto |
Abstract | Similar to humans and animals, deep artificial neural networks exhibit critical periods during which a temporary stimulus deficit can impair the development of a skill. The extent of the impairment depends on the onset and length of the deficit window, as in animal models, and on the size of the neural network. Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training. To better understand this phenomenon, we use the Fisher Information of the weights to measure the effective connectivity between layers of a network during training. Counterintuitively, information rises rapidly in the early phases of training, and then decreases, preventing redistribution of information resources in a phenomenon we refer to as a loss of “Information Plasticity”. Our analysis suggests that the first few epochs are critical for the creation of strong connections that are optimal relative to the input data distribution. Once such strong connections are created, they do not appear to change during additional training. These findings suggest that the initial learning transient, under-scrutinized compared to asymptotic behavior, plays a key role in determining the outcome of the training process. Our findings, combined with recent theoretical results in the literature, also suggest that forgetting (decrease of information in the weights) is critical to achieving invariance and disentanglement in representation learning. Finally, critical periods are not restricted to biological systems, but can emerge naturally in learning systems, whether biological or artificial, due to fundamental constrains arising from learning dynamics and information processing. |
Tasks | Representation Learning |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.08856v3 |
http://arxiv.org/pdf/1711.08856v3.pdf | |
PWC | https://paperswithcode.com/paper/critical-learning-periods-in-deep-neural |
Repo | |
Framework | |
Hidden-Markov-Model Based Speech Enhancement
Title | Hidden-Markov-Model Based Speech Enhancement |
Authors | Daniel Dzibela, Armin Sehr |
Abstract | The goal of this contribution is to use a parametric speech synthesis system for reducing background noise and other interferences from recorded speech signals. In a first step, Hidden Markov Models of the synthesis system are trained. Two adequate training corpora consisting of text and corresponding speech files have been set up and cleared of various faults, including inaudible utterances or incorrect assignments between audio and text data. The corpora are tested and compared against each other regarding, e.g., flaws in the synthesized speech, its naturalness and intelligibility. Thus different voices have been synthesized, whose quality depends less on the number of training samples used than on their cleanliness and signal-to-noise ratio. Generalized voice models have been used for synthesis and the results greatly differ between the two speech corpora. Tests regarding the adaptation to different speakers show that a resemblance to the original speaker is audible throughout all recordings, yet the synthesized voices sound robotic and unnatural in smaller parts. The spoken text, however, is usually intelligible, which shows that the models are working well. In a novel approach, speech is synthesized using side information of the original audio signal, particularly the pitch frequency. Results show an increase of speech quality and intelligibility in comparison to speech synthesized solely from text, up to the point of being nearly indistinguishable from the original. |
Tasks | Speech Enhancement, Speech Synthesis |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.01090v1 |
http://arxiv.org/pdf/1707.01090v1.pdf | |
PWC | https://paperswithcode.com/paper/hidden-markov-model-based-speech-enhancement |
Repo | |
Framework | |
A Clustering-based Consistency Adaptation Strategy for Distributed SDN Controllers
Title | A Clustering-based Consistency Adaptation Strategy for Distributed SDN Controllers |
Authors | Mohamed Aslan, Ashraf Matrawy |
Abstract | Distributed controllers are oftentimes used in large-scale SDN deployments where they run a myriad of network applications simultaneously. Such applications could have different consistency and availability preferences. These controllers need to communicate via east/west interfaces in order to synchronize their state information. The consistency and the availability of the distributed state information are governed by an underlying consistency model. Earlier, we suggested the use of adaptively-consistent controllers that can autonomously tune their consistency parameters in order to meet the performance requirements of a certain application. In this paper, we examine the feasibility of employing adaptive controllers that are built on top of tunable consistency models similar to that of Apache Cassandra. We present an adaptation strategy that uses clustering techniques (sequential k-means and incremental k-means) in order to map a given application performance indicator into a feasible consistency level that can be used with the underlying tunable consistency model. In the cases that we modeled and tested, our results show that for sequential k-means, with a reasonable number of clusters (>= 50), a plausible mapping (low RMSE) could be estimated between the application performance indicators and the consistency level indicator. For incremental k-means, the results also showed that a plausible mapping (low RMSE) could be estimated using a similar number of clusters (>= 50) and a small threshold (~0.01). |
Tasks | |
Published | 2017-05-25 |
URL | http://arxiv.org/abs/1705.09050v1 |
http://arxiv.org/pdf/1705.09050v1.pdf | |
PWC | https://paperswithcode.com/paper/a-clustering-based-consistency-adaptation |
Repo | |
Framework | |
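Sequential k-means, one of the two clustering techniques used above for the indicator-to-consistency-level mapping, has a simple online update rule. A generic 1-D sketch (not the authors' implementation; the data stream here is synthetic):

```python
import random

def sequential_kmeans(stream, k, seed=0):
    """Online k-means: for each new point, move the nearest
    centroid toward it with a per-centroid step of 1/count,
    so each centroid tracks the running mean of its cluster."""
    rng = random.Random(seed)
    centroids = [rng.random() for _ in range(k)]
    counts = [1] * k
    for x in stream:
        i = min(range(k), key=lambda j: abs(x - centroids[j]))
        counts[i] += 1
        centroids[i] += (x - centroids[i]) / counts[i]
    return sorted(centroids)

# Two well-separated synthetic 1-D clusters around 0.1 and 0.9.
data = [0.1 + 0.01 * (i % 5) for i in range(50)] + \
       [0.9 - 0.01 * (i % 5) for i in range(50)]
print(sequential_kmeans(data, 2))
```

Because each point is processed once and discarded, the strategy fits the controller's streaming setting, where performance indicators arrive continuously.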
Face Recognition with Machine Learning in OpenCV: Fusion of the results with the Localization Data of an Acoustic Camera for Speaker Identification
Title | Face Recognition with Machine Learning in OpenCV: Fusion of the results with the Localization Data of an Acoustic Camera for Speaker Identification |
Authors | Johannes Reschke, Armin Sehr |
Abstract | This contribution gives an overview of face recognition algorithms, their implementation and practical uses. First, a training set of different persons’ faces has to be collected and used to train a face recognizer. The resulting face model can be utilized to classify people as specific individuals or unknowns. After tracking the recognized face and estimating the acoustic sound source’s position, both can be combined to give detailed information about possible speakers and whether they are talking or not. This leads to a precise real-time description of the situation, which can be used for further applications, e.g. for multi-channel speech enhancement by adaptive beamformers. |
Tasks | Face Recognition, Speaker Identification, Speech Enhancement |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.00835v1 |
http://arxiv.org/pdf/1707.00835v1.pdf | |
PWC | https://paperswithcode.com/paper/face-recognition-with-machine-learning-in |
Repo | |
Framework | |
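The fusion step described above reduces to matching a recognized face's direction against the acoustic camera's source localization. A toy sketch of that matching rule, with hypothetical names and a made-up angular tolerance (the paper's actual fusion logic is not specified in the abstract):

```python
def active_speaker(face_tracks, source_azimuth, tol_deg=10.0):
    """Label the face whose azimuth (degrees) best matches the
    localized sound source, within tol_deg, as the active speaker.
    Returns None when no tracked face is close enough."""
    best, best_diff = None, tol_deg
    for name, azimuth in face_tracks.items():
        diff = abs(azimuth - source_azimuth)
        if diff <= best_diff:
            best, best_diff = name, diff
    return best

# Hypothetical tracked faces and a localized source at 22 degrees.
tracks = {"alice": -30.0, "bob": 25.0}
print(active_speaker(tracks, 22.0))  # bob
```

A real system would additionally smooth both estimates over time before matching, since single-frame localization is noisy.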
Deep learning for source camera identification on mobile devices
Title | Deep learning for source camera identification on mobile devices |
Authors | David Freire-Obregón, Fabio Narducci, Silvio Barra, Modesto Castrillón-Santana |
Abstract | In the present paper, we propose a source camera identification method for mobile devices based on deep learning. Recently, convolutional neural networks (CNNs) have shown a remarkable performance on several tasks such as image recognition, video analysis or natural language processing. A CNN consists of a set of layers, each composed of a set of high-pass filters which are applied all over the input image. This convolution process provides the unique ability to extract features automatically from data and to learn from those features. Our proposal describes a CNN architecture which is able to infer the noise pattern of mobile camera sensors (also known as the camera fingerprint) with the aim of detecting and identifying not only the mobile device used to capture an image (with 98% accuracy), but also the embedded camera with which the image was captured. More specifically, we provide an extensive analysis of the proposed architecture considering different configurations. The experiment has been carried out using images captured by the cameras of different mobile devices (the MICHE-I Dataset was used) and the obtained results have proved the robustness of the proposed method. |
Tasks | |
Published | 2017-09-30 |
URL | http://arxiv.org/abs/1710.01257v2 |
http://arxiv.org/pdf/1710.01257v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-source-camera |
Repo | |
Framework | |
Query-limited Black-box Attacks to Classifiers
Title | Query-limited Black-box Attacks to Classifiers |
Authors | Fnu Suya, Yuan Tian, David Evans, Paolo Papotti |
Abstract | We study black-box attacks on machine learning classifiers where each query to the model incurs some cost or risk of detection to the adversary. We focus explicitly on minimizing the number of queries as a major objective. Specifically, we consider the problem of attacking machine learning classifiers subject to a budget of feature modification cost while minimizing the number of queries, where each query returns only a class and confidence score. We describe an approach that uses Bayesian optimization to minimize the number of queries, and find that the number of queries can be reduced to approximately one tenth of the number needed through a random strategy for scenarios where the feature modification cost budget is low. |
Tasks | |
Published | 2017-12-23 |
URL | http://arxiv.org/abs/1712.08713v1 |
http://arxiv.org/pdf/1712.08713v1.pdf | |
PWC | https://paperswithcode.com/paper/query-limited-black-box-attacks-to |
Repo | |
Framework | |
Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders
Title | Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders |
Authors | Minje Kim |
Abstract | We show that a Modular Neural Network (MNN) can combine various speech enhancement modules, each of which is a Deep Neural Network (DNN) specialized on a particular enhancement job. Differently from an ordinary ensemble technique that averages variations in models, the proposed MNN selects the best module for the unseen test signal to produce a greedy ensemble. We see this as Collaborative Deep Learning (CDL), because it can reuse various already-trained DNN models without any further refining. In the proposed MNN, selecting the best module at run time is challenging. To this end, we employ a speech AutoEncoder (AE) as an arbitrator, whose input and output are trained to be as similar as possible if its input is clean speech. Therefore, the AE can gauge the quality of the module-specific denoised result by seeing its AE reconstruction error, e.g. low error means that the module output is similar to clean speech. We propose an MNN structure with various modules that are specialized on dealing with a specific noise type, gender, and input Signal-to-Noise Ratio (SNR) value, and empirically prove that it almost always works better than an arbitrarily chosen DNN module and sometimes as good as an oracle result. |
Tasks | Model Selection, Speech Enhancement |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10385v1 |
http://arxiv.org/pdf/1705.10385v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-deep-learning-for-speech |
Repo | |
Framework | |
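The selection rule of the proposed MNN is: run every specialized module and keep the output that the speech autoencoder reconstructs with the lowest error. A toy sketch with stand-in modules and a distance-to-clean-template proxy in place of the trained AE (all names are hypothetical; the real system uses trained DNNs on speech features):

```python
def select_module(noisy, modules, ae_error):
    """Collaborative selection: apply each enhancement module to
    the noisy signal and keep the output with the lowest
    reconstruction error, used as a proxy for closeness to clean
    speech."""
    outputs = {name: fn(noisy) for name, fn in modules.items()}
    best = min(outputs, key=lambda name: ae_error(outputs[name]))
    return best, outputs[best]

# Stand-in 'AE error': squared distance from a known clean template.
clean = [0.0, 1.0, 0.0, -1.0]
ae_error = lambda x: sum((a - b) ** 2 for a, b in zip(x, clean))

# Hypothetical specialized modules (mild vs. aggressive suppression).
modules = {
    "babble_snr0": lambda x: [0.9 * v for v in x],
    "white_snr10": lambda x: [0.2 * v for v in x],
}
noisy = [0.1, 1.1, -0.1, -0.9]
best, _ = select_module(noisy, modules, ae_error)
print(best)  # babble_snr0
```

Because the arbitrator only scores finished outputs, any number of pre-trained modules can be dropped in without retraining, which is the "collaborative" reuse the abstract emphasizes.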