January 30, 2020

Paper Group ANR 433

Deep Sparse Coding for Non-Intrusive Load Monitoring

Title Deep Sparse Coding for Non-Intrusive Load Monitoring
Authors Shikha Singh, Angshul Majumdar
Abstract Energy disaggregation is the task of segregating the aggregate energy of the entire building (as logged by the smart-meter) into the energy consumed by individual appliances. This is a single channel (the only channel being the smart-meter) blind source (different electrical appliances) separation problem. The traditional way to address this is via stochastic finite state machines (e.g. Factorial Hidden Markov Model). In recent times dictionary learning based approaches have shown promise in addressing the disaggregation problem. The usual technique is to learn a dictionary for every device and use the learnt dictionaries as a basis for blind source separation during disaggregation. Prior studies in this area are shallow learning techniques, i.e. they learn a single layer of dictionary for every device. In this work, we propose a deep learning approach: instead of learning one level of dictionary, we learn multiple layers of dictionaries for each device. These multi-level dictionaries are used as a basis for source separation during disaggregation. Results on two benchmark datasets and one actual implementation show that our method outperforms state-of-the-art techniques.
Tasks Dictionary Learning, Non-Intrusive Load Monitoring
Published 2019-12-11
URL https://arxiv.org/abs/1912.12128v1
PDF https://arxiv.org/pdf/1912.12128v1.pdf
PWC https://paperswithcode.com/paper/deep-sparse-coding-for-non-intrusive-load
Repo
Framework
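
To make the "dictionary per device, then separate" idea concrete, here is a minimal sketch of the shallow (single-layer) baseline the abstract describes, not the authors' multi-layer method. The function name `disaggregate`, the toy dictionaries, the non-negativity constraint, and the projected-gradient solver are all assumptions chosen for illustration; the abstract does not specify how the codes are computed.

```python
def disaggregate(aggregate, dictionaries, steps=5000, lr=0.01):
    """Split an aggregate signal into per-device signals.

    dictionaries maps a device name to a list of atoms (each atom is a list
    with the same length as the aggregate). The codes are found by projected
    gradient descent on ||aggregate - sum_k D_k z_k||^2 subject to z >= 0.
    The step size lr is hand-picked for the toy atoms below (an assumption).
    """
    atoms = [(name, atom) for name, device_atoms in dictionaries.items()
             for atom in device_atoms]
    codes = [0.0] * len(atoms)
    for _step in range(steps):
        # Current reconstruction of the aggregate from all atoms.
        recon = [sum(c * atom[t] for c, (_name, atom) in zip(codes, atoms))
                 for t in range(len(aggregate))]
        resid = [r - a for r, a in zip(recon, aggregate)]
        for i, (_name, atom) in enumerate(atoms):
            grad = 2.0 * sum(r * x for r, x in zip(resid, atom))
            codes[i] = max(0.0, codes[i] - lr * grad)  # project onto z >= 0
    # Each device's estimate is the part of the reconstruction built
    # from that device's own atoms.
    per_device = {name: [0.0] * len(aggregate) for name in dictionaries}
    for c, (name, atom) in zip(codes, atoms):
        for t, x in enumerate(atom):
            per_device[name][t] += c * x
    return per_device


# Toy usage: a constant "fridge" load plus a short "kettle" burst.
toy_dictionaries = {
    "fridge": [[1.0, 1.0, 1.0, 1.0]],
    "kettle": [[0.0, 3.0, 3.0, 0.0]],
}
estimate = disaggregate([2.0, 5.0, 5.0, 2.0], toy_dictionaries)
```

In the deep variant proposed in the paper, each device would have several stacked dictionary layers rather than the single list of atoms used here.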

Clustering Uncertain Data via Representative Possible Worlds with Consistency Learning

Title Clustering Uncertain Data via Representative Possible Worlds with Consistency Learning
Authors Han Liu, Xianchao Zhang, Xiaotong Zhang, Qimai Li, Xiao-Ming Wu
Abstract Clustering uncertain data is an essential task in data mining for the internet of things. Possible world based algorithms seem promising for clustering uncertain data. However, there are two issues in existing possible world based algorithms: (1) They rely on all the possible worlds and treat them equally, but some marginal possible worlds may cause negative effects. (2) They do not make good use of the consistency among possible worlds, since they conduct clustering or construct the affinity matrix on each possible world independently. In this paper, we propose a representative possible world based consistent clustering (RPC) algorithm for uncertain data. First, by introducing representative loss and using Jensen-Shannon divergence as the distribution measure, we design a heuristic strategy for the selection of representative possible worlds, thus avoiding the negative effects caused by marginal possible worlds. Second, we integrate a consistency learning procedure into spectral clustering to deal with the representative possible worlds synergistically, thus utilizing the consistency to achieve better performance. Experimental results show that our proposed algorithm performs better than the state-of-the-art algorithms.
Tasks
Published 2019-09-27
URL https://arxiv.org/abs/1909.12514v1
PDF https://arxiv.org/pdf/1909.12514v1.pdf
PWC https://paperswithcode.com/paper/clustering-uncertain-data-via-representative
Repo
Framework
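
The abstract names Jensen-Shannon divergence as the distribution measure. As a reminder of what that measure computes, here is a small sketch for discrete distributions; the function names are illustrative, and how the paper turns this into its representative loss is not stated in the abstract and is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q to their mixture."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
```

Unlike plain KL divergence, this quantity is symmetric and bounded (by log 2 in nats), which makes it convenient for comparing one possible world's distribution against another.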

Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber)

Title Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber)
Authors Fabrizio Borgia, Claudia S. Bianchini, Patrice Dalle, Maria de Marsico
Abstract The SignWriting improved fast transcriber (SWift), presented in this paper, is an advanced editor for computer-aided writing and transcribing of any Sign Language (SL) using SignWriting (SW). The application is an editor which allows composing and saving desired signs using the SW elementary components, called “glyphs”. These make up a sort of alphabet, which does not depend on the national Sign Language and which codes the basic components of any sign. The user is guided through a fully-automated procedure, making the composition process fast and intuitive. SWift pursues the goal of helping to break down the “electronic barriers” that keep deaf people away from the web, and at the same time of supporting linguistic research on Sign Language features. For this reason it has been designed with special attention to the needs of deaf users and to general usability issues. The editor has been developed in a modular way, so it can be integrated wherever the use of SW as an alternative to written “verbal” language may be advisable.
Tasks
Published 2019-11-22
URL https://arxiv.org/abs/1911.09919v1
PDF https://arxiv.org/pdf/1911.09919v1.pdf
PWC https://paperswithcode.com/paper/resource-production-of-written-forms-of-sign-1
Repo
Framework

Intelligent Systems Design for Malware Classification Under Adversarial Conditions

Title Intelligent Systems Design for Malware Classification Under Adversarial Conditions
Authors Sean M. Devine, Nathaniel D. Bastian
Abstract The use of machine learning and intelligent systems has become an established practice in the realm of malware detection and cyber threat prevention. In an environment characterized by widespread accessibility and big data, the feasibility of malware classification without the use of artificial intelligence-based techniques has been diminished exponentially. Also characteristic of the contemporary realm of automated, intelligent malware detection is the threat of adversarial machine learning. Adversaries are looking to target the underlying data and/or algorithm responsible for the functionality of malware classification to map its behavior or corrupt its functionality. The aims of such adversaries are to bypass cyber security measures and to increase malware effectiveness. The focus of this research is the design of an intelligent systems approach using machine learning that can accurately and robustly classify malware under adversarial conditions. Such an outcome ultimately relies on increased flexibility and adaptability to build a model robust enough to identify attacks on the underlying algorithm.
Tasks Malware Classification, Malware Detection
Published 2019-07-06
URL https://arxiv.org/abs/1907.03149v1
PDF https://arxiv.org/pdf/1907.03149v1.pdf
PWC https://paperswithcode.com/paper/intelligent-systems-design-for-malware
Repo
Framework

Ising-Dropout: A Regularization Method for Training and Compression of Deep Neural Networks

Title Ising-Dropout: A Regularization Method for Training and Compression of Deep Neural Networks
Authors Hojjat Salehinejad, Shahrokh Valaee
Abstract Overfitting is a major problem in training machine learning models, specifically deep neural networks. This problem may be caused by imbalanced datasets and initialization of the model parameters, which conforms the model too closely to the training data and negatively affects the generalization performance of the model for unseen data. The original dropout is a regularization technique to drop hidden units randomly during training. In this paper, we propose an adaptive technique to wisely drop the visible and hidden units in a deep neural network using Ising energy of the network. The preliminary results show that the proposed approach can keep the classification performance competitive to the original network while eliminating optimization of unnecessary network parameters in each training cycle. The dropout state of units can also be applied to the trained (inference) model. This technique could compress the network in terms of number of parameters up to 41.18% and 55.86% for the classification task on the MNIST and Fashion-MNIST datasets, respectively.
Tasks
Published 2019-02-07
URL http://arxiv.org/abs/1902.08673v1
PDF http://arxiv.org/pdf/1902.08673v1.pdf
PWC https://paperswithcode.com/paper/ising-dropout-a-regularization-method-for
Repo
Framework

Correlated bandits or: How to minimize mean-squared error online

Title Correlated bandits or: How to minimize mean-squared error online
Authors Vinay Praneeth Boda, Prashanth L. A
Abstract While the objective in traditional multi-armed bandit problems is to find the arm with the highest mean, in many settings, finding an arm that best captures information about other arms is of interest. This objective, however, requires learning the underlying correlation structure and not just the means of the arms. Sensor placement for industrial surveillance and cellular network monitoring are a few applications where the underlying correlation structure plays an important role. Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms. To this end, we first derive an MSE estimator, based on sample variances and covariances, and show that our estimator exponentially concentrates around the true MSE. Under a best-arm identification framework, we propose a successive rejects type algorithm and provide bounds on the probability of error in identifying the best arm. Using minimax theory, we also derive fundamental performance limits for the correlated bandit problem.
Tasks
Published 2019-02-08
URL https://arxiv.org/abs/1902.02953v2
PDF https://arxiv.org/pdf/1902.02953v2.pdf
PWC https://paperswithcode.com/paper/correlated-bandits-or-how-to-minimize-mean
Repo
Framework
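
To illustrate what an MSE estimate built from sample variances and covariances can look like, the sketch below plugs sample statistics into the textbook error of the best linear predictor of arm j from arm i, namely var_j - cov_ij^2 / var_i, summed over the other arms. This particular formula and the function names are assumptions made for illustration; the paper's exact estimator may differ.

```python
def sample_cov(xs, ys):
    """Unbiased sample covariance of two equally long sample lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def estimated_mse(samples, i):
    """Plug-in error of linearly predicting every other arm from arm i.

    samples[j] holds the observations of arm j, drawn jointly across arms.
    """
    var_i = sample_cov(samples[i], samples[i])
    total = 0.0
    for j, arm in enumerate(samples):
        if j == i:
            continue
        cov_ij = sample_cov(samples[i], arm)
        total += sample_cov(arm, arm) - cov_ij ** 2 / var_i
    return total


def best_arm(samples):
    """Index of the arm with the lowest estimated MSE."""
    return min(range(len(samples)), key=lambda i: estimated_mse(samples, i))
```

An arm that is strongly correlated with many others scores a low value here even if its own mean is unremarkable, which is the shift in objective the abstract describes.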

List-Decodable Linear Regression

Title List-Decodable Linear Regression
Authors Sushrut Karmalkar, Adam R. Klivans, Pravesh K. Kothari
Abstract We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples. For any $\alpha < 1$, our algorithm takes as input a sample $\{(x_i,y_i)\}_{i \leq n}$ of $n$ linear equations where $\alpha n$ of the equations satisfy $y_i = \langle x_i,\ell^*\rangle +\zeta$ for some small noise $\zeta$ and $(1-\alpha)n$ of the equations are arbitrarily chosen. It outputs a list $L$ of size $O(1/\alpha)$ (a fixed constant) that contains an $\ell$ that is close to $\ell^*$. Our algorithm succeeds whenever the inliers are chosen from a certifiably anti-concentrated distribution $D$. In particular, this gives a $(d/\alpha)^{O(1/\alpha^8)}$ time algorithm to find an $O(1/\alpha)$ size list when the inlier distribution is standard Gaussian. For discrete product distributions that are anti-concentrated only in regular directions, we give an algorithm that achieves a similar guarantee under the promise that $\ell^*$ has all coordinates of the same magnitude. To complement our result, we prove that the anti-concentration assumption on the inliers is information-theoretically necessary. Our algorithm is based on a new framework for list-decodable learning that strengthens the “identifiability to algorithms” paradigm based on the sum-of-squares method. In an independent and concurrent work, Raghavendra and Yau also used the sum-of-squares method to give a similar result for list-decodable regression.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.05679v3
PDF https://arxiv.org/pdf/1905.05679v3.pdf
PWC https://paperswithcode.com/paper/list-decodable-linear-regression
Repo
Framework

Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings

Title Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings
Authors Myunghun Jung, Hyungjun Lim, Jahyun Goo, Youngmoon Jung, Hoirin Kim
Abstract Acoustic word embeddings (fixed-dimensional vector representations of arbitrary-length words) have attracted increasing interest in query-by-example spoken term detection. Recently, based on the fact that the orthography of text labels partly reflects the phonetic similarity between the words’ pronunciation, a multi-view approach has been introduced that jointly learns acoustic and text embeddings. It showed that it is possible to learn discriminative embeddings by designing an objective which takes text labels as well as word segments. In this paper, we propose a network architecture that expands the multi-view approach by combining the Siamese multi-view encoders with a shared decoder network to maximize the effect of the relationship between acoustic and text embeddings in embedding space. Discriminatively trained with multi-view triplet loss and decoding loss, our proposed approach achieves better performance on the acoustic word discrimination task with the WSJ dataset, resulting in an 11.1% relative improvement in average precision. We also present experimental results on cross-view word discrimination and word level speech recognition tasks.
Tasks Speech Recognition, Word Embeddings
Published 2019-10-01
URL https://arxiv.org/abs/1910.00341v1
PDF https://arxiv.org/pdf/1910.00341v1.pdf
PWC https://paperswithcode.com/paper/additional-shared-decoder-on-siamese-multi
Repo
Framework
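
The multi-view triplet loss mentioned in the abstract can be pictured with a generic margin-based triplet loss in which the anchor is an acoustic embedding and the positive and negative are text embeddings. The sketch below assumes cosine distance and a margin of 0.4; both are illustrative choices, and the decoding loss and the encoder networks are not shown.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.4):
    """Hinge loss that is zero once the positive is closer than the negative by the margin."""
    return max(0.0, margin
               + cosine_distance(anchor, positive)
               - cosine_distance(anchor, negative))


# Hypothetical embeddings: an acoustic segment, the text of the same word,
# and the text of a different word.
acoustic = [0.9, 0.1]
same_word_text = [1.0, 0.0]
other_word_text = [0.0, 1.0]
loss = triplet_loss(acoustic, same_word_text, other_word_text)
```

Swapping the roles (text anchor, acoustic positive and negative) gives the other view of the same objective.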

Counterfactual Explanation Algorithms for Behavioral and Textual Data

Title Counterfactual Explanation Algorithms for Behavioral and Textual Data
Authors Yanou Ramon, David Martens, Foster Provost, Theodoros Evgeniou
Abstract We study the interpretability of predictive systems that use high-dimensional behavioral and textual data. Examples include predicting product interest based on online browsing data and detecting spam emails or objectionable web content. Recently, counterfactual explanations have been proposed for generating insight into model predictions, which focus on what is relevant to a particular instance. Conducting a complete search to compute counterfactuals is very time-consuming because of the huge dimensionality. To our knowledge, for behavioral and text data, only one model-agnostic heuristic algorithm (SEDC) for finding counterfactual explanations has been proposed in the literature. However, there may be better algorithms for finding counterfactuals quickly. This study aligns the recently proposed Linear Interpretable Model-agnostic Explainer (LIME) and Shapley Additive Explanations (SHAP) with the notion of counterfactual explanations, and empirically benchmarks their effectiveness and efficiency against SEDC using a collection of 13 data sets. Results show that LIME-Counterfactual (LIME-C) and SHAP-Counterfactual (SHAP-C) have low and stable computation times, but mostly, they are less efficient than SEDC. However, for certain instances on certain data sets, SEDC’s run time is comparably large. With regard to effectiveness, LIME-C and SHAP-C find reasonable, if not always optimal, counterfactual explanations. SHAP-C, however, seems to have difficulties with highly unbalanced data. Because of its good overall performance, LIME-C seems to be a favorable alternative to SEDC, which failed for some nonlinear models to find counterfactuals because of the particular heuristic search algorithm it uses. A main upshot of this paper is that there is a good deal of room for further research. For example, we propose algorithmic adjustments that are direct upshots of the paper’s findings.
Tasks
Published 2019-12-04
URL https://arxiv.org/abs/1912.01819v1
PDF https://arxiv.org/pdf/1912.01819v1.pdf
PWC https://paperswithcode.com/paper/counterfactual-explanation-algorithms-for
Repo
Framework

End-to-End Multi-Channel Speech Separation

Title End-to-End Multi-Channel Speech Separation
Authors Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu
Abstract The end-to-end approach for single-channel speech separation has been studied recently and has shown promising results. This paper extends the previous approach and proposes a new end-to-end model for multi-channel speech separation. The primary contributions of this work are: 1) an integrated waveform-in waveform-out separation system in a single neural network architecture; 2) a reformulation of the traditional short time Fourier transform (STFT) and inter-channel phase difference (IPD) as a function of time-domain convolution with a special kernel; 3) a relaxation of those fixed kernels to be learnable, so that the entire architecture becomes purely data-driven and can be trained end-to-end. We demonstrate on the WSJ0 far-field speech separation task that, with the benefit of learnable spatial features, our proposed end-to-end multi-channel model significantly improves on the performance of the previous end-to-end single-channel method and traditional multi-channel methods.
Tasks Speech Separation
Published 2019-05-15
URL https://arxiv.org/abs/1905.06286v2
PDF https://arxiv.org/pdf/1905.06286v2.pdf
PWC https://paperswithcode.com/paper/end-to-end-multi-channel-speech-separation
Repo
Framework
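
The claim that an STFT is a time-domain convolution with a special kernel can be checked numerically: sliding windowed cosine and negative-sine kernels over the waveform with a stride equal to the hop size yields the real and imaginary parts of each frame's DFT. The sketch below assumes a Hann window and hypothetical function names, and covers only the STFT part, not the IPD features or the learnable relaxation.

```python
import cmath
import math


def stft_kernels(frame_len):
    """Window plus fixed real/imaginary kernels, one pair per frequency bin."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    bins = range(frame_len // 2 + 1)
    real = [[window[n] * math.cos(2 * math.pi * k * n / frame_len)
             for n in range(frame_len)] for k in bins]
    imag = [[-window[n] * math.sin(2 * math.pi * k * n / frame_len)
             for n in range(frame_len)] for k in bins]
    return window, real, imag


def conv1d(signal, kernel, stride):
    """Strided 1-D cross-correlation, which is what neural 'convolution' layers compute."""
    size = len(kernel)
    return [sum(signal[t + n] * kernel[n] for n in range(size))
            for t in range(0, len(signal) - size + 1, stride)]


def stft_via_conv(signal, frame_len, hop):
    """Return, per frequency bin, the (real, imaginary) outputs over frames."""
    _window, real, imag = stft_kernels(frame_len)
    return [(conv1d(signal, real[k], hop), conv1d(signal, imag[k], hop))
            for k in range(len(real))]


def stft_direct(signal, frame_len, hop):
    """Reference STFT: window each frame, then take its DFT bin by bin."""
    window, _real, _imag = stft_kernels(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                           for n in range(frame_len))
                       for k in range(frame_len // 2 + 1)])
    return frames
```

Once the transform is written this way, replacing the fixed kernels with trainable weights initialized to the same values is a small step, which is the relaxation the abstract refers to.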

Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning

Title Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning
Authors Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Roberto Calandra, Sergey Levine, Kristofer S. J. Pister
Abstract Designing effective low-level robot controllers often entails platform-specific implementations that require manual heuristic parameter tuning, significant system knowledge, or long design times. With the rising number of robotic and mechatronic systems deployed across areas ranging from industrial automation to intelligent toys, the need for a general approach to generating low-level controllers is increasing. To address the challenge of rapidly generating low-level controllers, we argue for using model-based reinforcement learning (MBRL) trained on relatively small amounts of automatically generated (i.e., without system simulation) data. In this paper, we explore the capabilities of MBRL on a Crazyflie centimeter-scale quadrotor with rapid dynamics to predict and control at <50Hz. To our knowledge, this is the first use of MBRL for controlled hover of a quadrotor using only on-board sensors, direct motor input signals, and no initial dynamics knowledge. Our controller leverages rapid simulation of a neural network forward dynamics model on a GPU-enabled base station, which then transmits the best current action to the quadrotor firmware via radio. In our experiments, the quadrotor achieved hovering capability of up to 6 seconds with 3 minutes of experimental training data.
Tasks
Published 2019-01-11
URL https://arxiv.org/abs/1901.03737v2
PDF https://arxiv.org/pdf/1901.03737v2.pdf
PWC https://paperswithcode.com/paper/low-level-control-of-a-quadrotor-with-deep
Repo
Framework
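
The control loop described above (simulate a forward dynamics model for many candidate action sequences, then send the best first action) resembles random-shooting model predictive control. The sketch below assumes that scheme; `dynamics_model` is a hand-written toy stand-in for the learned neural network, and the state, cost, horizon, and action range are all hypothetical.

```python
import random


def dynamics_model(state, action, dt=0.02):
    """Toy stand-in for a learned forward model: a 1-D tilt angle and its rate."""
    angle, rate = state
    rate = rate + dt * action
    angle = angle + dt * rate
    return (angle, rate)


def rollout_cost(state, actions):
    """Accumulated penalty for deviating from level hover along a simulated rollout."""
    cost = 0.0
    for action in actions:
        state = dynamics_model(state, action)
        cost += state[0] ** 2 + 0.1 * state[1] ** 2
    return cost


def best_first_action(state, horizon=10, n_candidates=200, rng=None):
    """Sample random action sequences and return the first action of the cheapest one."""
    rng = rng or random.Random(0)
    best_cost, best_action = float("inf"), 0.0
    for _ in range(n_candidates):
        actions = [rng.uniform(-5.0, 5.0) for _ in range(horizon)]
        cost = rollout_cost(state, actions)
        if cost < best_cost:
            best_cost, best_action = cost, actions[0]
    return best_action
```

Only the first action of the winning sequence is executed; the search is repeated from the newly observed state at the next control step.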

Unsupervised Image Translation using Adversarial Networks for Improved Plant Disease Recognition

Title Unsupervised Image Translation using Adversarial Networks for Improved Plant Disease Recognition
Authors Haseeb Nazki, Sook Yoon, Alvaro Fuentes, Dong Sun Park
Abstract Acquisition of data in task-specific applications of machine learning like plant disease recognition is a costly endeavor owing to the requirements of professional human diligence and time constraints. In this paper, we present a simple pipeline that uses GANs in an unsupervised image translation environment to improve learning with respect to the data distribution in a plant disease dataset, reducing the partiality introduced by acute class imbalance and hence shifting the classification decision boundary towards better performance. The empirical analysis of our method is demonstrated on a limited dataset of 2789 tomato plant disease images, highly corrupted with an imbalance in the 9 disease categories. First, we extend the state of the art for the GAN-based image-to-image translation method by enhancing the perceptual quality of the generated images and preserving the semantics. We introduce AR-GAN, where in addition to the adversarial loss, our synthetic image generator optimizes an Activation Reconstruction Loss (ARL) function that matches feature activations against those of the natural image. We present visually more compelling synthetic images in comparison to most prominent existing models and evaluate the performance of our GAN framework in terms of various datasets and metrics. Second, we evaluate the performance of a baseline convolutional neural network classifier for improved recognition using the resulting synthetic samples to augment our training set and compare it with the classical data augmentation scheme. We observe a significant improvement in classification accuracy (+5.2%) using generated synthetic samples, compared to a +0.8% increase using classic augmentation in an equal class distribution environment.
Tasks Data Augmentation, Image-to-Image Translation
Published 2019-09-26
URL https://arxiv.org/abs/1909.11915v1
PDF https://arxiv.org/pdf/1909.11915v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-image-translation-using
Repo
Framework

Enhancing Traffic Scene Predictions with Generative Adversarial Networks

Title Enhancing Traffic Scene Predictions with Generative Adversarial Networks
Authors Peter König, Sandra Aigner, Marco Körner
Abstract We present a new two-stage pipeline for predicting frames of traffic scenes where relevant objects can still reliably be detected. Using a recent video prediction network, we first generate a sequence of future frames based on past frames. A second network then enhances these frames in order to make them appear more realistic. This ensures that the quality of the predicted frames is sufficient to enable accurate detection of objects, which is especially important for autonomously driving cars. To verify this two-stage approach, we conducted experiments on the Cityscapes dataset. For enhancing, we trained two image-to-image translation methods based on generative adversarial networks, one for blind motion deblurring and one for image super-resolution. All resulting predictions were quantitatively evaluated using both traditional metrics and a state-of-the-art object detection network, showing that the enhanced frames appear qualitatively improved. While the traditional image comparison metrics, i.e., MSE, PSNR, and SSIM, failed to confirm this visual impression, the object detection evaluation reflects it well. The best performing prediction-enhancement pipeline is able to increase the average precision values for detecting cars by about 9% for each prediction step, compared to the non-enhanced predictions.
Tasks Deblurring, Image Super-Resolution, Image-to-Image Translation, Object Detection, Super-Resolution, Video Prediction
Published 2019-09-24
URL https://arxiv.org/abs/1909.10833v1
PDF https://arxiv.org/pdf/1909.10833v1.pdf
PWC https://paperswithcode.com/paper/enhancing-traffic-scene-predictions-with
Repo
Framework

Robust GPU-based Virtual Reality Simulation of Radio Frequency Ablations for Various Needle Geometries and Locations

Title Robust GPU-based Virtual Reality Simulation of Radio Frequency Ablations for Various Needle Geometries and Locations
Authors Niclas Kath, Heinz Handels, Andre Mastmeyer
Abstract Purpose: Radio-frequency ablations play an important role in the therapy of malignant liver lesions. The navigation of a needle to the lesion poses a challenge for both the trainees and intervening physicians. Methods: This publication presents a new GPU-based, accurate method for the simulation of radio-frequency ablations for lesions at the needle tip in general and for an existing visuo-haptic 4D VR simulator. The method is implemented with Nvidia CUDA and is real-time capable. Results: It performs better than a literature method concerning the theoretical characteristic of monotonic convergence of the bioheat PDE and an in vitro gold standard, with significant improvements (p < 0.05) in terms of Pearson correlations. It shows no failure modes or theoretically inconsistent individual simulation results after the initial phase of 10 seconds. On the Nvidia 1080 Ti GPU it achieves a very high frame rendering performance of >480 Hz. Conclusion: Our method provides a more robust and safer real-time ablation planning and intraoperative guidance technique, especially avoiding the over-estimation of the ablated tissue death zone, which is risky for the patient in terms of tumor recurrence. Future in vitro measurements and optimization shall further improve the conservative estimate.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05709v1
PDF https://arxiv.org/pdf/1907.05709v1.pdf
PWC https://paperswithcode.com/paper/robust-gpu-based-virtual-reality-simulation
Repo
Framework

Multi Scale Supervised 3D U-Net for Kidney and Tumor Segmentation

Title Multi Scale Supervised 3D U-Net for Kidney and Tumor Segmentation
Authors Wenshuai Zhao, Zengfeng Zeng
Abstract U-Net has achieved huge success in various medical image segmentation challenges. Various new architectures with bells and whistles may succeed on certain datasets when used with optimal hyper-parameters, but their generalization cannot always be guaranteed. Here, we focus on the basic U-Net architecture and propose a multi scale supervised 3D U-Net for the segmentation task in the KiTS19 challenge. Our contributions to enhancing performance are threefold: first, we use multi scale supervision in the decoder pathway, which encourages the network to predict correct results from the deep layers; second, to alleviate the adverse effect of the sample imbalance between kidney and tumor, we adopt an exponential logarithmic loss; third, we design a connected-component based post-processing method to remove obviously wrong voxels. From the published KiTS19 training dataset (210 patients in total), we held out 42 patients as a test set and obtained Dice scores of 0.969 and 0.805 for the kidney and tumor respectively. In the challenge, we achieved 7th place among 106 teams with a Composite Dice of 0.8961, namely 0.9741 for kidney and 0.8181 for tumor.
Tasks Medical Image Segmentation, Semantic Segmentation
Published 2019-08-09
URL https://arxiv.org/abs/1908.03204v2
PDF https://arxiv.org/pdf/1908.03204v2.pdf
PWC https://paperswithcode.com/paper/multi-scale-supervised-3d-u-net-for-kidney
Repo
Framework
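
Two pieces of this abstract are easy to make concrete: the Dice score and a connected-component clean-up. In the sketch below, masks are sets of (z, y, x) voxel coordinates; the rule of dropping components smaller than `min_size` and the use of 6-connectivity are assumptions, since the abstract does not say which components the authors remove.

```python
def dice(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def remove_small_components(voxels, min_size):
    """Drop 6-connected components with fewer than min_size voxels."""
    remaining = set(voxels)
    kept = set()
    while remaining:
        seed = remaining.pop()
        component, stack = {seed}, [seed]
        while stack:
            z, y, x = stack.pop()
            for neighbor in ((z + 1, y, x), (z - 1, y, x), (z, y + 1, x),
                             (z, y - 1, x), (z, y, x + 1), (z, y, x - 1)):
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    stack.append(neighbor)
        if len(component) >= min_size:
            kept |= component
    return kept


# Toy usage: one stray voxel far from the organ lowers Dice until it is removed.
truth = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
prediction = truth | {(5, 5, 5)}
cleaned = remove_small_components(prediction, min_size=2)
```

The reported Composite Dice is consistent with the plain average of the two per-class scores: (0.9741 + 0.8181) / 2 = 0.8961.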