January 30, 2020


Paper Group ANR 433


Deep Sparse Coding for Non-Intrusive Load Monitoring


Title	Deep Sparse Coding for Non-Intrusive Load Monitoring
Authors	Shikha Singh, Angshul Majumdar
Abstract	Energy disaggregation is the task of segregating the aggregate energy of the entire building (as logged by the smart-meter) into the energy consumed by individual appliances. This is a single channel (the only channel being the smart-meter) blind source (different electrical appliances) separation problem. The traditional way to address this is via stochastic finite state machines (e.g. Factorial Hidden Markov Model). In recent times dictionary learning based approaches have shown promise in addressing the disaggregation problem. The usual technique is to learn a dictionary for every device and use the learnt dictionaries as basis for blind source separation during disaggregation. Prior studies in this area are shallow learning techniques, i.e. they learn a single layer of dictionary for every device. In this work, we propose a deep learning approach: instead of learning one level of dictionary, we learn multiple layers of dictionaries for each device. These multi-level dictionaries are used as a basis for source separation during disaggregation. Results on two benchmark datasets and one actual implementation show that our method outperforms state-of-the-art techniques.
Tasks	Dictionary Learning, Non-Intrusive Load Monitoring
Published	2019-12-11
URL	https://arxiv.org/abs/1912.12128v1
PDF	https://arxiv.org/pdf/1912.12128v1.pdf
PWC	https://paperswithcode.com/paper/deep-sparse-coding-for-non-intrusive-load
Repo
Framework
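
The dictionary-based disaggregation described in the abstract can be illustrated with a minimal sketch. It shows only the shallow (single-layer) baseline, not the multi-layer method the paper proposes, and it assumes the per-appliance dictionaries are already learnt. The solver (plain ISTA for an L1-penalised least-squares fit) and all names (`ista`, `disaggregate`, `lam`, `step`) are illustrative assumptions, not taken from the paper.

```python
def soft_threshold(values, threshold):
    """Shrink each entry towards zero by `threshold` (proximal step for the L1 penalty)."""
    return [max(abs(v) - threshold, 0.0) * (1.0 if v > 0 else -1.0) for v in values]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def ista(dictionary, signal, lam, step, iters=200):
    """Minimise 0.5 * ||signal - D z||^2 + lam * ||z||_1 by iterative soft thresholding.

    `step` should not exceed 1 / (largest eigenvalue of D^T D) for convergence.
    """
    dictionary_t = transpose(dictionary)
    codes = [0.0] * len(dictionary_t)
    for _ in range(iters):
        residual = [s - r for s, r in zip(signal, matvec(dictionary, codes))]
        gradient_step = [c + step * g for c, g in zip(codes, matvec(dictionary_t, residual))]
        codes = soft_threshold(gradient_step, step * lam)
    return codes


def disaggregate(dictionaries, aggregate, lam, step):
    """Split `aggregate` into one estimated signal per appliance dictionary."""
    # Concatenate the per-appliance dictionaries column-wise into one joint dictionary.
    joint = [sum((d[row] for d in dictionaries), []) for row in range(len(aggregate))]
    codes = ista(joint, aggregate, lam, step)
    estimates, start = [], 0
    for d in dictionaries:
        width = len(d[0])
        # Each appliance is reconstructed from its own block of coefficients.
        estimates.append(matvec(d, codes[start:start + width]))
        start += width
    return estimates
```

With two hypothetical one-atom dictionaries, `disaggregate([[[1.0], [0.0], [0.0]], [[0.0], [1.0], [0.0]]], [2.0, 3.0, 0.0], lam=0.1, step=1.0)` attributes the first sample to the first appliance and the second sample to the second, each shrunk slightly by the sparsity penalty.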

Clustering Uncertain Data via Representative Possible Worlds with Consistency Learning


Title	Clustering Uncertain Data via Representative Possible Worlds with Consistency Learning
Authors	Han Liu, Xianchao Zhang, Xiaotong Zhang, Qimai Li, Xiao-Ming Wu
Abstract	Clustering uncertain data is an essential task in data mining for the internet of things. Possible world based algorithms seem promising for clustering uncertain data. However, there are two issues in existing possible world based algorithms: (1) They rely on all the possible worlds and treat them equally, but some marginal possible worlds may cause negative effects. (2) They do not well utilize the consistency among possible worlds, since they conduct clustering or construct the affinity matrix on each possible world independently. In this paper, we propose a representative possible world based consistent clustering (RPC) algorithm for uncertain data. First, by introducing representative loss and using Jensen-Shannon divergence as the distribution measure, we design a heuristic strategy for the selection of representative possible worlds, thus avoiding the negative effects caused by marginal possible worlds. Second, we integrate a consistency learning procedure into spectral clustering to deal with the representative possible worlds synergistically, thus utilizing the consistency to achieve better performance. Experimental results show that our proposed algorithm performs better than the state-of-the-art algorithms.
Tasks
Published	2019-09-27
URL	https://arxiv.org/abs/1909.12514v1
PDF	https://arxiv.org/pdf/1909.12514v1.pdf
PWC	https://paperswithcode.com/paper/clustering-uncertain-data-via-representative
Repo
Framework
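
The abstract names Jensen-Shannon divergence as the distribution measure. As a reminder of what that quantity is, here is a minimal sketch for discrete distributions given as lists of probabilities; how the paper applies it when selecting representative possible worlds is not specified in the abstract, so this shows only the standard definition. The function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute zero by convention."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the mean KL divergence of p and q to their mixture."""
    mixture = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    # The mixture is positive wherever p or q is, so both KL terms are finite.
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
```

Unlike KL divergence, this measure is symmetric and bounded above by log 2 (in nats), which it reaches for distributions with disjoint support.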

Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber)


Title	Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber)
Authors	Fabrizio Borgia, Claudia S. Bianchini, Patrice Dalle, Maria de Marsico
Abstract	The SignWriting improved fast transcriber (SWift), presented in this paper, is an advanced editor for computer-aided writing and transcribing of any Sign Language (SL) using SignWriting (SW). The application is an editor which allows composing and saving desired signs using the SW elementary components, called “glyphs”. These make up a sort of alphabet, which does not depend on the national Sign Language and which codes the basic components of any sign. The user is guided through a fully-automated procedure, making the composition process fast and intuitive. SWift pursues the goal of helping to break down the “electronic barriers” that keep deaf people away from the web, and at the same time to support linguistic research about Sign Languages features. For this reason it has been designed with a special attention to deaf user needs, and to general usability issues. The editor has been developed in a modular way, so it can be integrated everywhere the use of SW as an alternative to written “verbal” language may be advisable.
Tasks
Published	2019-11-22
URL	https://arxiv.org/abs/1911.09919v1
PDF	https://arxiv.org/pdf/1911.09919v1.pdf
PWC	https://paperswithcode.com/paper/resource-production-of-written-forms-of-sign-1
Repo
Framework

Intelligent Systems Design for Malware Classification Under Adversarial Conditions


Title	Intelligent Systems Design for Malware Classification Under Adversarial Conditions
Authors	Sean M. Devine, Nathaniel D. Bastian
Abstract	The use of machine learning and intelligent systems has become an established practice in the realm of malware detection and cyber threat prevention. In an environment characterized by widespread accessibility and big data, the feasibility of malware classification without the use of artificial intelligence-based techniques has been diminished exponentially. Also characteristic of the contemporary realm of automated, intelligent malware detection is the threat of adversarial machine learning. Adversaries are looking to target the underlying data and/or algorithm responsible for the functionality of malware classification to map its behavior or corrupt its functionality. The ends of such adversaries are bypassing the cyber security measures and increasing malware effectiveness. The focus of this research is the design of an intelligent systems approach using machine learning that can accurately and robustly classify malware under adversarial conditions. Such an outcome ultimately relies on increased flexibility and adaptability to build a model robust enough to identify attacks on the underlying algorithm.
Tasks	Malware Classification, Malware Detection
Published	2019-07-06
URL	https://arxiv.org/abs/1907.03149v1
PDF	https://arxiv.org/pdf/1907.03149v1.pdf
PWC	https://paperswithcode.com/paper/intelligent-systems-design-for-malware
Repo
Framework

Ising-Dropout: A Regularization Method for Training and Compression of Deep Neural Networks


Title	Ising-Dropout: A Regularization Method for Training and Compression of Deep Neural Networks
Authors	Hojjat Salehinejad, Shahrokh Valaee
Abstract	Overfitting is a major problem in training machine learning models, specifically deep neural networks. This problem may be caused by imbalanced datasets and initialization of the model parameters, which conforms the model too closely to the training data and negatively affects the generalization performance of the model for unseen data. The original dropout is a regularization technique to drop hidden units randomly during training. In this paper, we propose an adaptive technique to wisely drop the visible and hidden units in a deep neural network using Ising energy of the network. The preliminary results show that the proposed approach can keep the classification performance competitive to the original network while eliminating optimization of unnecessary network parameters in each training cycle. The dropout state of units can also be applied to the trained (inference) model. This technique could compress the network in terms of number of parameters up to 41.18% and 55.86% for the classification task on the MNIST and Fashion-MNIST datasets, respectively.
Tasks
Published	2019-02-07
URL	http://arxiv.org/abs/1902.08673v1
PDF	http://arxiv.org/pdf/1902.08673v1.pdf
PWC	https://paperswithcode.com/paper/ising-dropout-a-regularization-method-for
Repo
Framework

Correlated bandits or: How to minimize mean-squared error online


Title	Correlated bandits or: How to minimize mean-squared error online
Authors	Vinay Praneeth Boda, Prashanth L. A
Abstract	While the objective in traditional multi-armed bandit problems is to find the arm with the highest mean, in many settings, finding an arm that best captures information about other arms is of interest. This objective, however, requires learning the underlying correlation structure and not just the means of the arms. Sensors placement for industrial surveillance and cellular network monitoring are a few applications, where the underlying correlation structure plays an important role. Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms. To this end, we derive first an MSE estimator, based on sample variances and covariances, and show that our estimator exponentially concentrates around the true MSE. Under a best-arm identification framework, we propose a successive rejects type algorithm and provide bounds on the probability of error in identifying the best arm. Using minmax theory, we also derive fundamental performance limits for the correlated bandit problem.
Tasks
Published	2019-02-08
URL	https://arxiv.org/abs/1902.02953v2
PDF	https://arxiv.org/pdf/1902.02953v2.pdf
PWC	https://paperswithcode.com/paper/correlated-bandits-or-how-to-minimize-mean
Repo
Framework
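
To make the objective concrete, the sketch below scores each arm by how well it linearly predicts all the other arms, using only sample variances and covariances. It assumes the standard linear minimum-MSE formula, where the error of estimating arm j from arm i is var_j - cov_ij^2 / var_i; the paper's exact estimator and its successive-rejects procedure may differ, and the function names are hypothetical.

```python
def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equally long observation lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def arm_mse_estimates(samples):
    """Estimate, for each arm i, the total MSE of predicting every other arm from arm i.

    `samples[i]` holds the observations of arm i, aligned across arms by round.
    Assumes every arm has non-zero sample variance.
    """
    k = len(samples)
    cov = [[sample_covariance(samples[i], samples[j]) for j in range(k)] for i in range(k)]
    return [
        sum(cov[j][j] - cov[i][j] ** 2 / cov[i][i] for j in range(k) if j != i)
        for i in range(k)
    ]


def best_arm(samples):
    """Index of the arm with the lowest estimated MSE (first index wins ties)."""
    estimates = arm_mse_estimates(samples)
    return min(range(len(estimates)), key=estimates.__getitem__)
```

An arm that is strongly correlated with many others receives a low score even if its own mean is unremarkable, which is the contrast with the highest-mean objective drawn in the abstract.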

List-Decodable Linear Regression


Title	List-Decodable Linear Regression
Authors	Sushrut Karmalkar, Adam R. Klivans, Pravesh K. Kothari
Abstract	We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples. For any $\alpha < 1$, our algorithm takes as input a sample $\{(x_i,y_i)\}_{i \leq n}$ of $n$ linear equations where $\alpha n$ of the equations satisfy $y_i = \langle x_i,\ell^*\rangle +\zeta$ for some small noise $\zeta$ and $(1-\alpha)n$ of the equations are \emph{arbitrarily} chosen. It outputs a list $L$ of size $O(1/\alpha)$ - a fixed constant - that contains an $\ell$ that is close to $\ell^*$. Our algorithm succeeds whenever the inliers are chosen from a \emph{certifiably} anti-concentrated distribution $D$. In particular, this gives a $(d/\alpha)^{O(1/\alpha^8)}$ time algorithm to find a $O(1/\alpha)$ size list when the inlier distribution is standard Gaussian. For discrete product distributions that are anti-concentrated only in \emph{regular} directions, we give an algorithm that achieves similar guarantee under the promise that $\ell^*$ has all coordinates of the same magnitude. To complement our result, we prove that the anti-concentration assumption on the inliers is information-theoretically necessary. Our algorithm is based on a new framework for list-decodable learning that strengthens the ‘identifiability to algorithms’ paradigm based on the sum-of-squares method. In an independent and concurrent work, Raghavendra and Yau also used the Sum-of-Squares method to give a similar result for list-decodable regression.
Tasks
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05679v3
PDF	https://arxiv.org/pdf/1905.05679v3.pdf
PWC	https://paperswithcode.com/paper/list-decodable-linear-regression
Repo
Framework

Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings


Title	Additional Shared Decoder on Siamese Multi-view Encoders for Learning Acoustic Word Embeddings
Authors	Myunghun Jung, Hyungjun Lim, Jahyun Goo, Youngmoon Jung, Hoirin Kim
Abstract	Acoustic word embeddings — fixed-dimensional vector representations of arbitrary-length words — have attracted increasing interest in query-by-example spoken term detection. Recently, on the fact that the orthography of text labels partly reflects the phonetic similarity between the words’ pronunciation, a multi-view approach has been introduced that jointly learns acoustic and text embeddings. It showed that it is possible to learn discriminative embeddings by designing the objective which takes text labels as well as word segments. In this paper, we propose a network architecture that expands the multi-view approach by combining the Siamese multi-view encoders with a shared decoder network to maximize the effect of the relationship between acoustic and text embeddings in embedding space. Discriminatively trained with multi-view triplet loss and decoding loss, our proposed approach achieves better performance on acoustic word discrimination task with the WSJ dataset, resulting in 11.1% relative improvement in average precision. We also present experimental results on cross-view word discrimination and word level speech recognition tasks.
Tasks	Speech Recognition, Word Embeddings
Published	2019-10-01
URL	https://arxiv.org/abs/1910.00341v1
PDF	https://arxiv.org/pdf/1910.00341v1.pdf
PWC	https://paperswithcode.com/paper/additional-shared-decoder-on-siamese-multi
Repo
Framework
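
The multi-view triplet loss mentioned in the abstract can be sketched as a hinge on distances between an acoustic embedding and two text embeddings. The version below assumes cosine distance and a margin of 0.5, both of which are illustrative choices rather than values from the paper, and it leaves out the decoding loss entirely.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss that is zero once the positive is closer than the negative by `margin`.

    In a multi-view setting the anchor could be an acoustic embedding, the positive
    the text embedding of the same word, and the negative that of a different word.
    """
    return max(0.0, margin + cosine_distance(anchor, positive) - cosine_distance(anchor, negative))
```

The loss only penalises triplets that violate the margin, so well-separated word pairs stop contributing gradient during training.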

Counterfactual Explanation Algorithms for Behavioral and Textual Data


Title	Counterfactual Explanation Algorithms for Behavioral and Textual Data
Authors	Yanou Ramon, David Martens, Foster Provost, Theodoros Evgeniou
Abstract	We study the interpretability of predictive systems that use high-dimensional behavioral and textual data. Examples include predicting product interest based on online browsing data and detecting spam emails or objectionable web content. Recently, counterfactual explanations have been proposed for generating insight into model predictions, which focus on what is relevant to a particular instance. Conducting a complete search to compute counterfactuals is very time-consuming because of the huge dimensionality. To our knowledge, for behavioral and text data, only one model-agnostic heuristic algorithm (SEDC) for finding counterfactual explanations has been proposed in the literature. However, there may be better algorithms for finding counterfactuals quickly. This study aligns the recently proposed Linear Interpretable Model-agnostic Explainer (LIME) and Shapley Additive Explanations (SHAP) with the notion of counterfactual explanations, and empirically benchmarks their effectiveness and efficiency against SEDC using a collection of 13 data sets. Results show that LIME-Counterfactual (LIME-C) and SHAP-Counterfactual (SHAP-C) have low and stable computation times, but mostly, they are less efficient than SEDC. However, for certain instances on certain data sets, SEDC’s run time is comparably large. With regard to effectiveness, LIME-C and SHAP-C find reasonable, if not always optimal, counterfactual explanations. SHAP-C, however, seems to have difficulties with highly unbalanced data. Because of its good overall performance, LIME-C seems to be a favorable alternative to SEDC, which failed for some nonlinear models to find counterfactuals because of the particular heuristic search algorithm it uses. A main upshot of this paper is that there is a good deal of room for further research. For example, we propose algorithmic adjustments that are direct upshots of the paper’s findings.
Tasks
Published	2019-12-04
URL	https://arxiv.org/abs/1912.01819v1
PDF	https://arxiv.org/pdf/1912.01819v1.pdf
PWC	https://paperswithcode.com/paper/counterfactual-explanation-algorithms-for
Repo
Framework

End-to-End Multi-Channel Speech Separation


Title	End-to-End Multi-Channel Speech Separation
Authors	Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu
Abstract	The end-to-end approach for single-channel speech separation has been studied recently and has shown promising results. This paper extends the previous approach and proposes a new end-to-end model for multi-channel speech separation. The primary contributions of this work are: 1) we build an integrated waveform-in waveform-out separation system in a single neural network architecture; 2) we reformulate the traditional short time Fourier transform (STFT) and inter-channel phase difference (IPD) as a function of time-domain convolution with a special kernel; 3) we further relax those fixed kernels to be learnable, so that the entire architecture becomes purely data-driven and can be trained end-to-end. We demonstrate on the WSJ0 far-field speech separation task that, with the benefit of learnable spatial features, our proposed end-to-end multi-channel model significantly improves on the performance of the previous end-to-end single-channel method and traditional multi-channel methods.
Tasks	Speech Separation
Published	2019-05-15
URL	https://arxiv.org/abs/1905.06286v2
PDF	https://arxiv.org/pdf/1905.06286v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-multi-channel-speech-separation
Repo
Framework
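
The reformulation of the STFT as a time-domain convolution can be checked numerically. The sketch below builds one cosine and one negated sine kernel per frequency bin, slides them over the waveform with a stride equal to the hop size, and compares the result with a direct per-frame DFT. It covers the fixed-kernel STFT only; the IPD reformulation and the learnable kernels from the abstract are not shown, and all names and sizes are illustrative assumptions.

```python
import cmath
import math


def stft_kernels(n_fft, window):
    """One (real, imaginary) kernel pair per frequency bin 0 .. n_fft // 2."""
    bins = n_fft // 2 + 1
    real = [[window[n] * math.cos(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
            for k in range(bins)]
    imag = [[-window[n] * math.sin(2 * math.pi * k * n / n_fft) for n in range(n_fft)]
            for k in range(bins)]
    return real, imag


def strided_conv1d(signal, kernel, hop):
    """Cross-correlate `signal` with `kernel` at stride `hop`, without padding."""
    size = len(kernel)
    return [sum(signal[start + n] * kernel[n] for n in range(size))
            for start in range(0, len(signal) - size + 1, hop)]


def stft_via_conv(signal, n_fft, hop, window):
    """STFT computed purely with strided convolutions; result[k][t] is bin k, frame t."""
    real_kernels, imag_kernels = stft_kernels(n_fft, window)
    return [
        [complex(re, im) for re, im in zip(strided_conv1d(signal, rk, hop),
                                           strided_conv1d(signal, ik, hop))]
        for rk, ik in zip(real_kernels, imag_kernels)
    ]


def stft_direct(signal, n_fft, hop, window):
    """Reference STFT: windowed DFT of each frame, computed from the definition."""
    return [
        [sum(signal[start + n] * window[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
             for n in range(n_fft))
         for start in range(0, len(signal) - n_fft + 1, hop)]
        for k in range(n_fft // 2 + 1)
    ]
```

Because the kernels are ordinary convolution weights, a network can initialise them to these Fourier values and then update them by gradient descent, which is the relaxation the abstract describes.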

Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning


Title	Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning
Authors	Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Roberto Calandra, Sergey Levine, Kristofer S. J. Pister
Abstract	Designing effective low-level robot controllers often entails platform-specific implementations that require manual heuristic parameter tuning, significant system knowledge, or long design times. With the rising number of robotic and mechatronic systems deployed across areas ranging from industrial automation to intelligent toys, the need for a general approach to generating low-level controllers is increasing. To address the challenge of rapidly generating low-level controllers, we argue for using model-based reinforcement learning (MBRL) trained on relatively small amounts of automatically generated (i.e., without system simulation) data. In this paper, we explore the capabilities of MBRL on a Crazyflie centimeter-scale quadrotor with rapid dynamics to predict and control at <50Hz. To our knowledge, this is the first use of MBRL for controlled hover of a quadrotor using only on-board sensors, direct motor input signals, and no initial dynamics knowledge. Our controller leverages rapid simulation of a neural network forward dynamics model on a GPU-enabled base station, which then transmits the best current action to the quadrotor firmware via radio. In our experiments, the quadrotor achieved hovering capability of up to 6 seconds with 3 minutes of experimental training data.
Tasks
Published	2019-01-11
URL	https://arxiv.org/abs/1901.03737v2
PDF	https://arxiv.org/pdf/1901.03737v2.pdf
PWC	https://paperswithcode.com/paper/low-level-control-of-a-quadrotor-with-deep
Repo
Framework

Unsupervised Image Translation using Adversarial Networks for Improved Plant Disease Recognition


Title	Unsupervised Image Translation using Adversarial Networks for Improved Plant Disease Recognition
Authors	Haseeb Nazki, Sook Yoon, Alvaro Fuentes, Dong Sun Park
Abstract	Acquisition of data in task-specific applications of machine learning like plant disease recognition is a costly endeavor owing to the requirements of professional human diligence and time constraints. In this paper, we present a simple pipeline that uses GANs in an unsupervised image translation environment to improve learning with respect to the data distribution in a plant disease dataset, reducing the partiality introduced by acute class imbalance and hence shifting the classification decision boundary towards better performance. The empirical analysis of our method is demonstrated on a limited dataset of 2789 tomato plant disease images, highly corrupted with an imbalance in the 9 disease categories. First, we extend the state of the art for the GAN-based image-to-image translation method by enhancing the perceptual quality of the generated images and preserving the semantics. We introduce AR-GAN, where in addition to the adversarial loss, our synthetic image generator optimizes on Activation Reconstruction loss (ARL) function that optimizes feature activations against the natural image. We present visually more compelling synthetic images in comparison to most prominent existing models and evaluate the performance of our GAN framework in terms of various datasets and metrics. Second, we evaluate the performance of a baseline convolutional neural network classifier for improved recognition using the resulting synthetic samples to augment our training set and compare it with the classical data augmentation scheme. We observe a significant improvement in classification accuracy (+5.2%) using generated synthetic samples as compared to (+0.8%) increase using classic augmentation in an equal class distribution environment.
Tasks	Data Augmentation, Image-to-Image Translation
Published	2019-09-26
URL	https://arxiv.org/abs/1909.11915v1
PDF	https://arxiv.org/pdf/1909.11915v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-image-translation-using
Repo
Framework

Enhancing Traffic Scene Predictions with Generative Adversarial Networks


Title	Enhancing Traffic Scene Predictions with Generative Adversarial Networks
Authors	Peter König, Sandra Aigner, Marco Körner
Abstract	We present a new two-stage pipeline for predicting frames of traffic scenes where relevant objects can still reliably be detected. Using a recent video prediction network, we first generate a sequence of future frames based on past frames. A second network then enhances these frames in order to make them appear more realistic. This ensures the quality of the predicted frames to be sufficient to enable accurate detection of objects, which is especially important for autonomously driving cars. To verify this two-stage approach, we conducted experiments on the Cityscapes dataset. For enhancing, we trained two image-to-image translation methods based on generative adversarial networks, one for blind motion deblurring and one for image super-resolution. All resulting predictions were quantitatively evaluated using both traditional metrics and a state-of-the-art object detection network showing that the enhanced frames appear qualitatively improved. While the traditional image comparison metrics, i.e., MSE, PSNR, and SSIM, failed to confirm this visual impression, the object detection evaluation resembles it well. The best performing prediction-enhancement pipeline is able to increase the average precision values for detecting cars by about 9% for each prediction step, compared to the non-enhanced predictions.
Tasks	Deblurring, Image Super-Resolution, Image-to-Image Translation, Object Detection, Super-Resolution, Video Prediction
Published	2019-09-24
URL	https://arxiv.org/abs/1909.10833v1
PDF	https://arxiv.org/pdf/1909.10833v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-traffic-scene-predictions-with
Repo
Framework
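
Two of the traditional image comparison metrics named in the abstract, MSE and PSNR, are simple enough to state in a few lines. The sketch below uses their standard definitions on images flattened to lists of pixel intensities and assumes 8-bit pixels (a peak value of 255); SSIM is omitted because it needs local window statistics. Function names are illustrative.

```python
import math


def mse(reference, predicted):
    """Mean squared error between two equally sized, flattened images."""
    if len(reference) != len(predicted):
        raise ValueError("images must have the same number of pixels")
    return sum((r - p) ** 2 for r, p in zip(reference, predicted)) / len(reference)


def psnr(reference, predicted, max_value=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    error = mse(reference, predicted)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)
```

Both metrics average per-pixel differences, so they can disagree with perceived quality and with downstream detection accuracy, which matches the mismatch reported in the abstract.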

Robust GPU-based Virtual Reality Simulation of Radio Frequency Ablations for Various Needle Geometries and Locations


Title	Robust GPU-based Virtual Reality Simulation of Radio Frequency Ablations for Various Needle Geometries and Locations
Authors	Niclas Kath, Heinz Handels, Andre Mastmeyer
Abstract	Purpose: Radio-frequency ablations play an important role in the therapy of malignant liver lesions. The navigation of a needle to the lesion poses a challenge for both the trainees and intervening physicians. Methods: This publication presents a new GPU-based, accurate method for the simulation of radio-frequency ablations for lesions at the needle tip in general and for an existing visuo-haptic 4D VR simulator. The method is implemented with Nvidia CUDA and is real-time capable. Results: It performs better than a literature method concerning the theoretical characteristic of monotonic convergence of the bioheat PDE and an in vitro gold standard with significant improvements (p < 0.05) in terms of Pearson correlations. It shows no failure modes or theoretically inconsistent individual simulation results after the initial phase of 10 seconds. On the Nvidia 1080 Ti GPU it achieves a very high frame rendering performance of >480 Hz. Conclusion: Our method provides a more robust and safer real-time ablation planning and intraoperative guidance technique, especially avoiding the over-estimation of the ablated tissue death zone, which is risky for the patient in terms of tumor recurrence. Future in vitro measurements and optimization shall further improve the conservative estimate.
Tasks
Published	2019-07-11
URL	https://arxiv.org/abs/1907.05709v1
PDF	https://arxiv.org/pdf/1907.05709v1.pdf
PWC	https://paperswithcode.com/paper/robust-gpu-based-virtual-reality-simulation
Repo
Framework

Multi Scale Supervised 3D U-Net for Kidney and Tumor Segmentation


Title	Multi Scale Supervised 3D U-Net for Kidney and Tumor Segmentation
Authors	Wenshuai Zhao, Zengfeng Zeng
Abstract	U-Net has achieved huge success in various medical image segmentation challenges. Various new architectures with bells and whistles might succeed on certain datasets when employed with optimal hyper-parameters, but their generalization cannot always be guaranteed. Here, we focused on the basic U-Net architecture and proposed a multi scale supervised 3D U-Net for the segmentation task in the KiTS19 challenge. Our work to enhance performance is threefold: first, we used multi scale supervision in the decoder pathway, which could encourage the network to predict correct results from the deep layers; second, to alleviate the adverse effect of the sample imbalance between kidney and tumor, we adopted exponential logarithmic loss; third, a connected-component based post processing method was designed to remove the obviously wrong voxels. In the published KiTS19 training dataset (210 patients in total), we held out 42 patients as the test dataset and finally obtained DICE scores of 0.969 and 0.805 for the kidney and tumor respectively. In the challenge, we finally achieved the 7th place among 106 teams with the Composite Dice of 0.8961, namely 0.9741 for kidney and 0.8181 for tumor.
Tasks	Medical Image Segmentation, Semantic Segmentation
Published	2019-08-09
URL	https://arxiv.org/abs/1908.03204v2
PDF	https://arxiv.org/pdf/1908.03204v2.pdf
PWC	https://paperswithcode.com/paper/multi-scale-supervised-3d-u-net-for-kidney
Repo
Framework
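
The Dice scores quoted in the abstract follow the usual overlap definition, sketched below for binary masks flattened to lists. The `composite_dice` helper assumes the composite score is the plain average of the kidney and tumor scores; that assumption is consistent with the reported numbers, since (0.9741 + 0.8181) / 2 = 0.8961, but the abstract does not state the formula. Function names are illustrative.

```python
def dice_score(predicted, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    intersection = sum(1 for p, t in zip(predicted, target) if p and t)
    total = sum(1 for p in predicted if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks are empty: treat the prediction as a perfect match.
        return 1.0
    return 2.0 * intersection / total


def composite_dice(kidney_dice, tumor_dice):
    """Average of the two per-class scores (assumed definition, see the note above)."""
    return (kidney_dice + tumor_dice) / 2.0
```

Because the denominator counts foreground voxels only, a small structure such as a tumor is penalised heavily for a few wrong voxels, which is one reason the tumor score trails the kidney score.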