Paper Group ANR 131
Squeezed Convolutional Variational AutoEncoder for Unsupervised Anomaly Detection in Edge Device Industrial Internet of Things
Title | Squeezed Convolutional Variational AutoEncoder for Unsupervised Anomaly Detection in Edge Device Industrial Internet of Things |
Authors | Dohyung Kim, Hyochang Yang, Minki Chung, Sungzoon Cho |
Abstract | In this paper, we propose Squeezed Convolutional Variational AutoEncoder (SCVAE) for anomaly detection in time series data for Edge Computing in Industrial Internet of Things (IIoT). The proposed model is applied to labeled time series data from UCI datasets for exact performance evaluation, and applied to real-world data for indirect model performance comparison. In addition, by comparing the models before and after applying Fire Modules from SqueezeNet, we show that model size and inference times are reduced while similar levels of performance are maintained. |
Tasks | Anomaly Detection, Time Series, Unsupervised Anomaly Detection |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06343v1 |
http://arxiv.org/pdf/1712.06343v1.pdf | |
PWC | https://paperswithcode.com/paper/squeezed-convolutional-variational |
Repo | |
Framework | |
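The size reduction reported above comes from replacing standard convolutions with SqueezeNet Fire modules. The parameter arithmetic behind that claim can be checked directly; the channel counts below are illustrative, not taken from the paper:

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def fire_params(c_in, s1, e1, e3):
    """Fire module: a 1x1 squeeze layer down to s1 channels,
    followed by parallel 1x1 and 3x3 expand layers with e1 and e3
    filters respectively."""
    squeeze = conv_params(c_in, s1, 1)
    expand = conv_params(s1, e1, 1) + conv_params(s1, e3, 3)
    return squeeze + expand

# Hypothetical example: replace a plain 3x3 conv (64 -> 128 channels)
# with a Fire module that also emits 128 output channels.
plain = conv_params(64, 128, 3)     # 73728 weights
fire = fire_params(64, 16, 64, 64)  # 11264 weights
print(plain, fire, round(plain / fire, 1))
```

The squeeze layer bottlenecks the channel count before the expensive 3x3 expand filters, which is where most of the saving comes from.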
InScript: Narrative texts annotated with script information
Title | InScript: Narrative texts annotated with script information |
Authors | Ashutosh Modi, Tatjana Anikina, Simon Ostermann, Manfred Pinkal |
Abstract | This paper presents the InScript corpus (Narrative Texts Instantiating Script structure). InScript is a corpus of 1,000 stories centered around 10 different scenarios. Verbs and noun phrases are annotated with event and participant types, respectively. Additionally, the text is annotated with coreference information. The corpus shows rich lexical variation and will serve as a unique resource for the study of the role of script knowledge in natural language processing. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05260v1 |
http://arxiv.org/pdf/1703.05260v1.pdf | |
PWC | https://paperswithcode.com/paper/inscript-narrative-texts-annotated-with |
Repo | |
Framework | |
Multiple Adaptive Bayesian Linear Regression for Scalable Bayesian Optimization with Warm Start
Title | Multiple Adaptive Bayesian Linear Regression for Scalable Bayesian Optimization with Warm Start |
Authors | Valerio Perrone, Rodolphe Jenatton, Matthias Seeger, Cedric Archambeau |
Abstract | Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization. Typically, BO is powered by a Gaussian process (GP), whose algorithmic complexity is cubic in the number of evaluations. Hence, GP-based BO cannot leverage large amounts of past or related function evaluations, for example, to warm start the BO procedure. We develop a multiple adaptive Bayesian linear regression model as a scalable alternative whose complexity is linear in the number of observations. The multiple Bayesian linear regression models are coupled through a shared feedforward neural network, which learns a joint representation and transfers knowledge across machine learning problems. |
Tasks | |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.02902v1 |
http://arxiv.org/pdf/1712.02902v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-adaptive-bayesian-linear-regression |
Repo | |
Framework | |
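The scalability argument in the abstract rests on Bayesian linear regression having a closed-form posterior computable in one linear pass over the data, versus the cubic cost of GP inference. A minimal one-dimensional sketch of those standard textbook formulas (the paper's shared feedforward network, which supplies the features, and the multi-task coupling are omitted here):

```python
def bayes_linreg_1d(xs, ys, alpha=1.0, beta=25.0):
    """Posterior over a single weight w for y = w*x + noise, with
    prior w ~ N(0, 1/alpha) and noise precision beta. One pass over
    the data, so cost is linear in the number of observations."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    precision = alpha + beta * sxx   # posterior precision
    mean = beta * sxy / precision    # posterior mean of w
    return mean, 1.0 / precision

# Toy data generated from y = 2x with small noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.9, 4.2, 5.9]
m, v = bayes_linreg_1d(xs, ys)
print(round(m, 2))  # close to the true slope 2
```

The same sufficient statistics (sums of products) generalize to the multi-feature case, which keeps the per-observation cost constant.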
PaMM: Pose-aware Multi-shot Matching for Improving Person Re-identification
Title | PaMM: Pose-aware Multi-shot Matching for Improving Person Re-identification |
Authors | Yeong-Jun Cho, Kuk-Jin Yoon |
Abstract | Person re-identification is the problem of recognizing people across different images or videos with non-overlapping views. Although there has been much progress in person re-identification over the last decade, it remains a challenging task because appearances of people can seem extremely different across diverse camera viewpoints and person poses. In this paper, we propose a novel framework for person re-identification by analyzing camera viewpoints and person poses in a so-called Pose-aware Multi-shot Matching (PaMM), which robustly estimates people’s poses and efficiently conducts multi-shot matching based on pose information. Experimental results using public person re-identification datasets show that the proposed methods outperform state-of-the-art methods and are promising for person re-identification from diverse viewpoints and pose variances. |
Tasks | Person Re-Identification |
Published | 2017-05-17 |
URL | http://arxiv.org/abs/1705.06011v1 |
http://arxiv.org/pdf/1705.06011v1.pdf | |
PWC | https://paperswithcode.com/paper/pamm-pose-aware-multi-shot-matching-for |
Repo | |
Framework | |
Adaptive and Resilient Soft Tensegrity Robots
Title | Adaptive and Resilient Soft Tensegrity Robots |
Authors | John Rieffel, Jean-Baptiste Mouret |
Abstract | Living organisms intertwine soft (e.g., muscle) and hard (e.g., bones) materials, giving them an intrinsic flexibility and resiliency often lacking in conventional rigid robots. The emerging field of soft robotics seeks to harness these same properties in order to create resilient machines. The nature of soft materials, however, presents considerable challenges to aspects of design, construction, and control – and up until now, the vast majority of gaits for soft robots have been hand-designed through empirical trial-and-error. This manuscript describes an easy-to-assemble tensegrity-based soft robot capable of highly dynamic locomotive gaits and demonstrating structural and behavioral resilience in the face of physical damage. Enabling this is the use of a machine learning algorithm able to discover effective gaits with a minimal number of physical trials. These results lend further credence to soft-robotic approaches that seek to harness the interaction of complex material dynamics in order to generate a wealth of dynamical behaviors. |
Tasks | |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03258v2 |
http://arxiv.org/pdf/1702.03258v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-and-resilient-soft-tensegrity-robots |
Repo | |
Framework | |
Argument Labeling of Explicit Discourse Relations using LSTM Neural Networks
Title | Argument Labeling of Explicit Discourse Relations using LSTM Neural Networks |
Authors | Sohail Hooda, Leila Kosseim |
Abstract | Argument labeling of explicit discourse relations is a challenging task. State-of-the-art systems achieve slightly above 55% F-measure but require hand-crafted features. In this paper, we propose a Long Short-Term Memory (LSTM) based model for argument labeling. We experimented with multiple configurations of our model. Using the PDTB dataset, our best model achieved an F1 measure of 23.05% without any feature engineering. This is significantly higher than the 20.52% achieved by the state-of-the-art RNN approach, but significantly lower than the feature-based state-of-the-art systems. On the other hand, because our approach learns only from the raw dataset, it is more widely applicable to multiple textual genres and languages. |
Tasks | Feature Engineering |
Published | 2017-08-11 |
URL | http://arxiv.org/abs/1708.03425v2 |
http://arxiv.org/pdf/1708.03425v2.pdf | |
PWC | https://paperswithcode.com/paper/argument-labeling-of-explicit-discourse |
Repo | |
Framework | |
Concise Radiometric Calibration Using The Power of Ranking
Title | Concise Radiometric Calibration Using The Power of Ranking |
Authors | Han Gong, Graham D. Finlayson, Maryam M. Darrodi |
Abstract | Compared with raw images, the more common JPEG images are less useful for machine vision algorithms and professional photographers because JPEG-sRGB does not preserve a linear relation between pixel values and the light measured from the scene. A camera is said to be radiometrically calibrated if there is a computational model which can predict how the raw linear sensor image is mapped to the corresponding rendered image (e.g. JPEGs) and vice versa. This paper begins with the observation that the rank order of pixel values is mostly preserved post colour correction. We show that this observation is the key to solving for the whole camera pipeline (colour correction, tone and gamut mapping). Our rank-based calibration method is simpler than the prior art and so is parametrised by fewer variables which, concomitantly, can be solved for using less calibration data. Another advantage is that we can derive the camera pipeline from a single pair of raw-JPEG images. Experiments demonstrate that our method delivers state-of-the-art results (especially for the most interesting case of JPEG to raw). |
Tasks | Calibration |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.08943v3 |
http://arxiv.org/pdf/1707.08943v3.pdf | |
PWC | https://paperswithcode.com/paper/concise-radiometric-calibration-using-the |
Repo | |
Framework | |
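The central observation above is that the rank order of pixel values survives the rendering pipeline, so a monotone tone curve can be recovered simply by sorting corresponding raw/JPEG samples. A stdlib sketch of that rank idea alone, under a toy gamma-like curve (the paper's full pipeline also solves colour correction and gamut mapping):

```python
from bisect import bisect_left

def fit_tone_lut(raw, jpeg):
    """Recover a monotone raw -> JPEG lookup from paired samples by
    sorting each side independently; this is valid precisely when
    the rendering preserves rank order."""
    return sorted(raw), sorted(jpeg)

def apply_lut(lut, x):
    raw_sorted, jpeg_sorted = lut
    i = min(bisect_left(raw_sorted, x), len(raw_sorted) - 1)
    return jpeg_sorted[i]

# Toy tone curve: jpeg = raw ** 0.5 on [0, 1].
raw = [0.04, 0.25, 0.49, 0.81, 1.00]
jpeg = [v ** 0.5 for v in raw]
lut = fit_tone_lut(raw, jpeg)
print(apply_lut(lut, 0.49))  # ~0.7
```

With dense samples the sorted pairing approximates the tone curve everywhere, which is why a single raw-JPEG pair can suffice.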
Perceptual audio loss function for deep learning
Title | Perceptual audio loss function for deep learning |
Authors | Dan Elbaz, Michael Zibulevsky |
Abstract | PESQ and POLQA are standards for automated assessment of the voice quality of speech as experienced by human beings. The predictions of these objective measures should come as close as possible to the subjective quality scores obtained in subjective listening tests. Wavenet is a deep neural network originally developed as a deep generative model of raw audio waveforms. The Wavenet architecture is based on dilated causal convolutions, which exhibit very large receptive fields. In this short paper we suggest using the Wavenet architecture, in particular its large receptive field, in order to learn the PESQ algorithm. By doing so we can use it as a differentiable loss function for speech enhancement. |
Tasks | Speech Enhancement |
Published | 2017-08-20 |
URL | http://arxiv.org/abs/1708.05987v1 |
http://arxiv.org/pdf/1708.05987v1.pdf | |
PWC | https://paperswithcode.com/paper/perceptual-audio-loss-function-for-deep |
Repo | |
Framework | |
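The argument above relies on the large receptive field produced by Wavenet's dilated causal convolutions. The receptive-field arithmetic is standard and easy to verify; the layer configuration below is illustrative rather than the paper's:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated causal convolutions:
    each layer adds (kernel_size - 1) * dilation input samples."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# One Wavenet-style stack: kernel 2, dilations doubling 1..512.
dilations = [2 ** i for i in range(10)]
print(receptive_field(2, dilations))  # 1024 samples
```

Doubling the dilation per layer makes the receptive field grow exponentially in depth while the parameter count grows only linearly.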
Critical Learning Periods in Deep Neural Networks
Title | Critical Learning Periods in Deep Neural Networks |
Authors | Alessandro Achille, Matteo Rovere, Stefano Soatto |
Abstract | Similar to humans and animals, deep artificial neural networks exhibit critical periods during which a temporary stimulus deficit can impair the development of a skill. The extent of the impairment depends on the onset and length of the deficit window, as in animal models, and on the size of the neural network. Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training. To better understand this phenomenon, we use the Fisher Information of the weights to measure the effective connectivity between layers of a network during training. Counterintuitively, information rises rapidly in the early phases of training, and then decreases, preventing redistribution of information resources in a phenomenon we refer to as a loss of “Information Plasticity”. Our analysis suggests that the first few epochs are critical for the creation of strong connections that are optimal relative to the input data distribution. Once such strong connections are created, they do not appear to change during additional training. These findings suggest that the initial learning transient, under-scrutinized compared to asymptotic behavior, plays a key role in determining the outcome of the training process. Our findings, combined with recent theoretical results in the literature, also suggest that forgetting (decrease of information in the weights) is critical to achieving invariance and disentanglement in representation learning. Finally, critical periods are not restricted to biological systems, but can emerge naturally in learning systems, whether biological or artificial, due to fundamental constrains arising from learning dynamics and information processing. |
Tasks | Representation Learning |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.08856v3 |
http://arxiv.org/pdf/1711.08856v3.pdf | |
PWC | https://paperswithcode.com/paper/critical-learning-periods-in-deep-neural |
Repo | |
Framework | |
Hidden-Markov-Model Based Speech Enhancement
Title | Hidden-Markov-Model Based Speech Enhancement |
Authors | Daniel Dzibela, Armin Sehr |
Abstract | The goal of this contribution is to use a parametric speech synthesis system for reducing background noise and other interferences from recorded speech signals. In a first step, Hidden Markov Models of the synthesis system are trained. Two adequate training corpora consisting of text and corresponding speech files have been set up and cleared of various faults, including inaudible utterances or incorrect assignments between audio and text data. The corpora are tested and compared against each other regarding, e.g., flaws in the synthesized speech, its naturalness and intelligibility. Thus different voices have been synthesized, whose quality depends less on the number of training samples used than on their cleanliness and signal-to-noise ratio. Generalized voice models have been used for synthesis and the results greatly differ between the two speech corpora. Tests regarding the adaptation to different speakers show that a resemblance to the original speaker is audible throughout all recordings, yet the synthesized voices sound robotic and unnatural in smaller parts. The spoken text, however, is usually intelligible, which shows that the models are working well. In a novel approach, speech is synthesized using side information of the original audio signal, particularly the pitch frequency. Results show an increase of speech quality and intelligibility in comparison to speech synthesized solely from text, up to the point of being nearly indistinguishable from the original. |
Tasks | Speech Enhancement, Speech Synthesis |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.01090v1 |
http://arxiv.org/pdf/1707.01090v1.pdf | |
PWC | https://paperswithcode.com/paper/hidden-markov-model-based-speech-enhancement |
Repo | |
Framework | |
A Clustering-based Consistency Adaptation Strategy for Distributed SDN Controllers
Title | A Clustering-based Consistency Adaptation Strategy for Distributed SDN Controllers |
Authors | Mohamed Aslan, Ashraf Matrawy |
Abstract | Distributed controllers are oftentimes used in large-scale SDN deployments where they run a myriad of network applications simultaneously. Such applications could have different consistency and availability preferences. These controllers need to communicate via east/west interfaces in order to synchronize their state information. The consistency and the availability of the distributed state information are governed by an underlying consistency model. Earlier, we suggested the use of adaptively-consistent controllers that can autonomously tune their consistency parameters in order to meet the performance requirements of a certain application. In this paper, we examine the feasibility of employing adaptive controllers that are built on top of tunable consistency models similar to that of Apache Cassandra. We present an adaptation strategy that uses clustering techniques (sequential k-means and incremental k-means) in order to map a given application performance indicator into a feasible consistency level that can be used with the underlying tunable consistency model. In the cases that we modeled and tested, our results show that for sequential k-means, with a reasonable number of clusters (>= 50), a plausible mapping (low RMSE) could be estimated between the application performance indicators and the consistency level indicator. For incremental k-means, the results also showed that a plausible mapping (low RMSE) could be estimated using a similar number of clusters (>= 50) and a small threshold (~0.01). |
Tasks | |
Published | 2017-05-25 |
URL | http://arxiv.org/abs/1705.09050v1 |
http://arxiv.org/pdf/1705.09050v1.pdf | |
PWC | https://paperswithcode.com/paper/a-clustering-based-consistency-adaptation |
Repo | |
Framework | |
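Sequential k-means, one of the two clustering techniques used above for the indicator-to-consistency-level mapping, has a simple online update rule. A generic 1-D sketch (not the authors' implementation; the data stream here is synthetic):

```python
import random

def sequential_kmeans(stream, k, seed=0):
    """Online k-means: for each new point, move the nearest
    centroid toward it with a per-centroid step of 1/count,
    so each centroid tracks the running mean of its cluster."""
    rng = random.Random(seed)
    centroids = [rng.random() for _ in range(k)]
    counts = [1] * k
    for x in stream:
        i = min(range(k), key=lambda j: abs(x - centroids[j]))
        counts[i] += 1
        centroids[i] += (x - centroids[i]) / counts[i]
    return sorted(centroids)

# Two well-separated synthetic 1-D clusters around 0.1 and 0.9.
data = [0.1 + 0.01 * (i % 5) for i in range(50)] + \
       [0.9 - 0.01 * (i % 5) for i in range(50)]
print(sequential_kmeans(data, 2))
```

Because each point is processed once and discarded, the strategy fits the controller's streaming setting, where performance indicators arrive continuously.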
Face Recognition with Machine Learning in OpenCV: Fusion of the results with the Localization Data of an Acoustic Camera for Speaker Identification
Title | Face Recognition with Machine Learning in OpenCV: Fusion of the results with the Localization Data of an Acoustic Camera for Speaker Identification |
Authors | Johannes Reschke, Armin Sehr |
Abstract | This contribution gives an overview of face recognition algorithms, their implementation and practical uses. First, a training set of different persons’ faces has to be collected and used to train a face recognizer. The resulting face model can be utilized to classify people as specific individuals or unknowns. After tracking the recognized face and estimating the acoustic sound source’s position, both can be combined to give detailed information about possible speakers and whether they are talking or not. This leads to a precise real-time description of the situation, which can be used for further applications, e.g. for multi-channel speech enhancement by adaptive beamformers. |
Tasks | Face Recognition, Speaker Identification, Speech Enhancement |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.00835v1 |
http://arxiv.org/pdf/1707.00835v1.pdf | |
PWC | https://paperswithcode.com/paper/face-recognition-with-machine-learning-in |
Repo | |
Framework | |
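The fusion step described above reduces to matching a recognized face's direction against the acoustic camera's source localization. A toy sketch of that matching rule, with hypothetical names and a made-up angular tolerance (the paper's actual fusion logic is not specified in the abstract):

```python
def active_speaker(face_tracks, source_azimuth, tol_deg=10.0):
    """Label the face whose azimuth (degrees) best matches the
    localized sound source, within tol_deg, as the active speaker.
    Returns None when no tracked face is close enough."""
    best, best_diff = None, tol_deg
    for name, azimuth in face_tracks.items():
        diff = abs(azimuth - source_azimuth)
        if diff <= best_diff:
            best, best_diff = name, diff
    return best

# Hypothetical tracked faces and a localized source at 22 degrees.
tracks = {"alice": -30.0, "bob": 25.0}
print(active_speaker(tracks, 22.0))  # bob
```

A real system would additionally smooth both estimates over time before matching, since single-frame localization is noisy.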
Deep learning for source camera identification on mobile devices
Title | Deep learning for source camera identification on mobile devices |
Authors | David Freire-Obregón, Fabio Narducci, Silvio Barra, Modesto Castrillón-Santana |
Abstract | In the present paper, we propose a source camera identification method for mobile devices based on deep learning. Recently, convolutional neural networks (CNNs) have shown a remarkable performance on several tasks such as image recognition, video analysis or natural language processing. A CNN consists of a set of layers, each composed of a set of high-pass filters which are applied all over the input image. This convolution process provides the unique ability to extract features automatically from data and to learn from those features. Our proposal describes a CNN architecture which is able to infer the noise pattern of mobile camera sensors (also known as the camera fingerprint) with the aim of detecting and identifying not only the mobile device used to capture an image (with 98% accuracy), but also the embedded camera with which the image was captured. More specifically, we provide an extensive analysis of the proposed architecture considering different configurations. The experiment has been carried out using images captured by the cameras of different mobile devices (the MICHE-I Dataset was used) and the obtained results have proved the robustness of the proposed method. |
Tasks | |
Published | 2017-09-30 |
URL | http://arxiv.org/abs/1710.01257v2 |
http://arxiv.org/pdf/1710.01257v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-source-camera |
Repo | |
Framework | |
Query-limited Black-box Attacks to Classifiers
Title | Query-limited Black-box Attacks to Classifiers |
Authors | Fnu Suya, Yuan Tian, David Evans, Paolo Papotti |
Abstract | We study black-box attacks on machine learning classifiers where each query to the model incurs some cost or risk of detection to the adversary. We focus explicitly on minimizing the number of queries as a major objective. Specifically, we consider the problem of attacking machine learning classifiers subject to a budget of feature modification cost while minimizing the number of queries, where each query returns only a class and confidence score. We describe an approach that uses Bayesian optimization to minimize the number of queries, and find that the number of queries can be reduced to approximately one tenth of the number needed through a random strategy for scenarios where the feature modification cost budget is low. |
Tasks | |
Published | 2017-12-23 |
URL | http://arxiv.org/abs/1712.08713v1 |
http://arxiv.org/pdf/1712.08713v1.pdf | |
PWC | https://paperswithcode.com/paper/query-limited-black-box-attacks-to |
Repo | |
Framework | |
Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders
Title | Collaborative Deep Learning for Speech Enhancement: A Run-Time Model Selection Method Using Autoencoders |
Authors | Minje Kim |
Abstract | We show that a Modular Neural Network (MNN) can combine various speech enhancement modules, each of which is a Deep Neural Network (DNN) specialized on a particular enhancement job. Differently from an ordinary ensemble technique that averages variations in models, the proposed MNN selects the best module for the unseen test signal to produce a greedy ensemble. We see this as Collaborative Deep Learning (CDL), because it can reuse various already-trained DNN models without any further refining. In the proposed MNN, selecting the best module at run time is challenging. To this end, we employ a speech AutoEncoder (AE) as an arbitrator, whose input and output are trained to be as similar as possible if its input is clean speech. Therefore, the AE can gauge the quality of the module-specific denoised result by seeing its AE reconstruction error, e.g. low error means that the module output is similar to clean speech. We propose an MNN structure with various modules that are specialized on dealing with a specific noise type, gender, and input Signal-to-Noise Ratio (SNR) value, and empirically prove that it almost always works better than an arbitrarily chosen DNN module and sometimes as good as an oracle result. |
Tasks | Model Selection, Speech Enhancement |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10385v1 |
http://arxiv.org/pdf/1705.10385v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-deep-learning-for-speech |
Repo | |
Framework | |
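The selection rule of the proposed MNN is: run every specialized module and keep the output that the speech autoencoder reconstructs with the lowest error. A toy sketch with stand-in modules and a distance-to-clean-template proxy in place of the trained AE (all names are hypothetical; the real system uses trained DNNs on speech features):

```python
def select_module(noisy, modules, ae_error):
    """Collaborative selection: apply each enhancement module to
    the noisy signal and keep the output with the lowest
    reconstruction error, used as a proxy for closeness to clean
    speech."""
    outputs = {name: fn(noisy) for name, fn in modules.items()}
    best = min(outputs, key=lambda name: ae_error(outputs[name]))
    return best, outputs[best]

# Stand-in 'AE error': squared distance from a known clean template.
clean = [0.0, 1.0, 0.0, -1.0]
ae_error = lambda x: sum((a - b) ** 2 for a, b in zip(x, clean))

# Hypothetical specialized modules (mild vs. aggressive suppression).
modules = {
    "babble_snr0": lambda x: [0.9 * v for v in x],
    "white_snr10": lambda x: [0.2 * v for v in x],
}
noisy = [0.1, 1.1, -0.1, -0.9]
best, _ = select_module(noisy, modules, ae_error)
print(best)  # babble_snr0
```

Because the arbitrator only scores finished outputs, any number of pre-trained modules can be dropped in without retraining, which is the "collaborative" reuse the abstract emphasizes.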