January 29, 2020

Paper Group ANR 631

Survey of Dropout Methods for Deep Neural Networks


Title	Survey of Dropout Methods for Deep Neural Networks
Authors	Alex Labach, Hojjat Salehinejad, Shahrokh Valaee
Abstract	Dropout methods are a family of stochastic techniques used in neural network training or inference that have generated significant research interest and are widely used in practice. They have been successfully applied in neural network regularization, model compression, and in measuring the uncertainty of neural network outputs. While originally formulated for dense neural network layers, recent advances have made dropout methods also applicable to convolutional and recurrent neural network layers. This paper summarizes the history of dropout methods, their various applications, and current areas of research interest. Important proposed methods are described in additional detail.
Tasks	Model Compression
Published	2019-04-25
URL	https://arxiv.org/abs/1904.13310v2
PDF	https://arxiv.org/pdf/1904.13310v2.pdf
PWC	https://paperswithcode.com/paper/survey-of-dropout-methods-for-deep-neural
Repo
Framework
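
The abstract does not single out one algorithm, so the following is only a minimal sketch of the basic idea behind dropout on a dense layer's activations, in the common "inverted" form where surviving units are rescaled during training. The function name `dropout` and its parameters are illustrative assumptions, not taken from the paper.

```python
import random


def dropout(activations, drop_prob, training=True, rng=None):
    """Inverted dropout on a list of activations (illustrative sketch).

    Each unit is zeroed with probability drop_prob during training;
    survivors are scaled by 1 / (1 - drop_prob) so the expected value
    of every unit is unchanged. At inference the input is passed through.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]
```

Calling `dropout(h, 0.5)` during training and `dropout(h, 0.5, training=False)` at inference keeps the two modes consistent in expectation, which is why the rescaling is done at training time.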

Deep learning for image segmentation-a short survey


Title	Deep learning for image segmentation-a short survey
Authors	Zhenzhou Wang
Abstract	Deep learning works as a discrete non-linear mapping function and has achieved great success as a powerful classification tool. However, is deep learning omnipotent? This paper gives a short survey of the accuracy achieved by deep learning so far in image segmentation. Compared to the close to 100% classification accuracy achieved by deep learning, the image segmentation accuracy achieved by deep learning is only about 80%. We analyze the possible reasons why deep learning could not achieve acceptable accuracy in image segmentation and find that deep learning only generates a prediction map and relies on other segmentation methods to complete the segmentation task. In addition, the performance of deep learning is determined by the number of outputs. Consequently, deep learning could not achieve high segmentation accuracy unless the resolution of the image is extremely small.
Tasks	Semantic Segmentation
Published	2019-04-16
URL	https://arxiv.org/abs/1904.08483v2
PDF	https://arxiv.org/pdf/1904.08483v2.pdf
PWC	https://paperswithcode.com/paper/is-deep-learning-a-good-choice-for-image
Repo
Framework

Characterizing dynamically varying acoustic scenes from egocentric audio recordings in workplace setting


Title	Characterizing dynamically varying acoustic scenes from egocentric audio recordings in workplace setting
Authors	Arindam Jati, Amrutha Nadarajan, Karel Mundnich, Shrikanth Narayanan
Abstract	Devices capable of detecting and categorizing acoustic scenes have numerous applications such as providing context-aware user experiences. In this paper, we address the task of characterizing acoustic scenes in a workplace setting from audio recordings collected with wearable microphones. The acoustic scenes, tracked with Bluetooth transceivers, vary dynamically with time from the egocentric perspective of a mobile user. Our dataset contains experience-sampled long audio recordings collected from clinical providers in a hospital, who wore the audio badges during multiple work shifts. To handle the long egocentric recordings, we propose Time Delay Neural Network (TDNN)-based segment-level modeling. The experiments show that TDNN outperforms other models in the acoustic scene classification task. We investigate the effect of the primary speaker’s speech in determining acoustic scenes from audio badges, and provide a comparison between the performance of different models. Moreover, we explore the relationship between the sequence of acoustic scenes experienced by the users and the nature of their jobs, and find that the scene sequence predicted by our model tends to possess a similar relationship. The initial promising results reveal numerous research directions for acoustic scene classification via wearable devices as well as egocentric analysis of dynamic acoustic scenes encountered by the users.
Tasks	Acoustic Scene Classification, Scene Classification
Published	2019-11-10
URL	https://arxiv.org/abs/1911.03843v1
PDF	https://arxiv.org/pdf/1911.03843v1.pdf
PWC	https://paperswithcode.com/paper/characterizing-dynamically-varying-acoustic
Repo
Framework

Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data


Title	Optimal Combination Forecasts on Retail Multi-Dimensional Sales Data
Authors	Luis Roque, Cristina A. C. Fernandes, Tony Silva
Abstract	Time series data in the retail world are particularly rich in terms of dimensionality, and these dimensions can be aggregated in groups or hierarchies. Valuable information is nested in these complex structures, which helps to predict the aggregated time series data. From a portfolio of brands under HUUB’s monitoring, we selected two to explore their sales behaviour, leveraging the grouping properties of their product structure. Using statistical models, namely SARIMA, to forecast each level of the hierarchy, an optimal combination approach was used to generate more consistent forecasts in the higher levels. Our results show that the proposed methods can indeed capture nested information in the more granular series, helping to improve the forecast accuracy of the aggregated series. The Weighted Least Squares (WLS) method surpasses all other methods proposed in the study, including the Minimum Trace (MinT) reconciliation.
Tasks	Time Series
Published	2019-03-22
URL	http://arxiv.org/abs/1903.09478v1
PDF	http://arxiv.org/pdf/1903.09478v1.pdf
PWC	https://paperswithcode.com/paper/optimal-combination-forecasts-on-retail-multi
Repo
Framework
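
The optimal combination approach mentioned in the abstract can be illustrated with a small sketch of weighted least squares reconciliation, which computes reconciled forecasts as S (S' W^-1 S)^-1 S' W^-1 y_hat for a summing matrix S and diagonal weights W. The helper names, the toy two-series hierarchy, and the choice of weights are assumptions for illustration; the abstract does not specify the paper's exact weighting.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(a, b):
    """Solve a @ x = b for square a via Gauss-Jordan with partial pivoting."""
    n = len(a)
    aug = [list(map(float, a[i])) + list(map(float, b[i])) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def reconcile_wls(S, base_forecasts, variances):
    """Reconcile base forecasts so they add up across the hierarchy.

    S maps bottom-level series to all series; variances holds the
    (assumed) diagonal of W, one entry per series in base_forecasts.
    """
    St = transpose(S)
    # S' W^-1: divide column j of S' by variances[j].
    St_Winv = [[v / w for v, w in zip(row, variances)] for row in St]
    lhs = matmul(St_Winv, S)
    rhs = matmul(St_Winv, [[y] for y in base_forecasts])
    bottom = solve(lhs, rhs)  # reconciled bottom-level forecasts
    return [row[0] for row in matmul(S, bottom)]


# Toy hierarchy: total = A + B; rows of S are [total, A, B].
S = [[1, 1], [1, 0], [0, 1]]
reconciled = reconcile_wls(S, [100.0, 60.0, 50.0], [1.0, 1.0, 1.0])
```

With equal weights this reduces to ordinary least squares reconciliation; base forecasts that already add up are returned unchanged for any positive weights.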

N2VSCDNNR: A Local Recommender System Based on Node2vec and Rich Information Network


Title	N2VSCDNNR: A Local Recommender System Based on Node2vec and Rich Information Network
Authors	Jinyin Chen, Yangyang Wu, Lu Fan, Xiang Lin, Haibin Zheng, Shanqing Yu, Qi Xuan
Abstract	Recommender systems are becoming more and more important in our daily lives. However, traditional recommendation methods are challenged by data sparsity and efficiency, as the numbers of users, items, and interactions between the two in many real-world applications increase fast. In this work, we propose a novel clustering recommender system based on node2vec technology and rich information network, namely N2VSCDNNR, to solve these challenges. In particular, we use a bipartite network to construct the user-item network, and represent the interactions among users (or items) by the corresponding one-mode projection network. In order to alleviate the data sparsity problem, we enrich the network structure according to user and item categories, and construct the one-mode projection category network. Then, considering the data sparsity problem in the network, we employ node2vec to capture the complex latent relationships among users (or items) from the corresponding one-mode projection category network. Moreover, considering the dependency on parameter settings and information loss problem in clustering methods, we use a novel spectral clustering method, which is based on dynamic nearest-neighbors (DNN) and a novel automatically determining cluster number (ADCN) method that determines the cluster centers based on the normal distribution method, to cluster the users and items separately. After clustering, we propose the two-phase personalized recommendation to realize the personalized recommendation of items for each user. A series of experiments validate the outstanding performance of our N2VSCDNNR over several advanced embedding and side information based recommendation algorithms. Meanwhile, N2VSCDNNR seems to have lower time complexity than the baseline methods in online recommendations, indicating its potential to be widely applied in large-scale systems.
Tasks	Recommendation Systems
Published	2019-04-12
URL	http://arxiv.org/abs/1904.12605v1
PDF	http://arxiv.org/pdf/1904.12605v1.pdf
PWC	https://paperswithcode.com/paper/190412605
Repo
Framework

Ego-motion Sensor for Unmanned Aerial Vehicles Based on a Single-Board Computer


Title	Ego-motion Sensor for Unmanned Aerial Vehicles Based on a Single-Board Computer
Authors	Gaël Écorchard, Adam Heinrich, Libor Přeučil
Abstract	This paper describes the design and implementation of a ground-related odometry sensor suitable for micro aerial vehicles. The sensor is based on a ground-facing camera and a single-board Linux-based embedded computer with a multimedia System on a Chip (SoC). The SoC features a hardware video encoder which is used to estimate the optical flow online. The optical flow is then used in combination with a distance sensor to estimate the vehicle’s velocity. The proposed sensor is compared to a similar existing solution and evaluated in both indoor and outdoor environments.
Tasks	Optical Flow Estimation
Published	2019-01-22
URL	http://arxiv.org/abs/1901.07278v1
PDF	http://arxiv.org/pdf/1901.07278v1.pdf
PWC	https://paperswithcode.com/paper/ego-motion-sensor-for-unmanned-aerial
Repo
Framework
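
As a rough illustration of how optical flow and a distance measurement combine into a velocity estimate, the sketch below uses a pinhole camera model. It assumes a camera pointing straight down at flat ground with no rotation between frames; the function name, parameters, and sign convention are hypothetical and not taken from the paper.

```python
def ground_velocity(flow_px, height_m, focal_length_px, frame_dt_s):
    """Convert mean optical flow (pixels per frame) to metric velocity.

    Under a pinhole model, a ground displacement d at distance h projects
    to d * f / h pixels, so d = flow * h / f. Dividing by the frame
    interval gives metres per second. The sign of the result depends on
    the image-axis convention and is left to the caller.
    """
    if focal_length_px <= 0 or frame_dt_s <= 0:
        raise ValueError("focal length and frame interval must be positive")
    scale = height_m / (focal_length_px * frame_dt_s)
    return (flow_px[0] * scale, flow_px[1] * scale)


vx, vy = ground_velocity((10.0, -5.0), height_m=2.0,
                         focal_length_px=500.0, frame_dt_s=0.02)
```

The same pixel flow corresponds to a larger metric velocity at greater height, which is why the distance sensor is needed to fix the scale.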

GP-HD: Using Genetic Programming to Generate Dynamical Systems Models for Health Care


Title	GP-HD: Using Genetic Programming to Generate Dynamical Systems Models for Health Care
Authors	Mark Hoogendoorn, Ward van Breda, Jeroen Ruwaard
Abstract	The huge wealth of data in the health domain can be exploited to create models that predict development of health states over time. Temporal learning algorithms are well suited to learn relationships between health states and make predictions about their future developments. However, these algorithms: (1) either focus on learning one generic model for all patients, providing general insights but often with limited predictive performance, or (2) learn individualized models from which it is hard to derive generic concepts. In this paper, we present a middle ground, namely parameterized dynamical systems models that are generated from data using a Genetic Programming (GP) framework. A fitness function suitable for the health domain is exploited. An evaluation of the approach in the mental health domain shows that performance of the model generated by the GP is on par with a dynamical systems model developed based on domain knowledge, significantly outperforms a generic Long Short-Term Memory (LSTM) model and in some cases also outperforms an individualized LSTM model.
Tasks
Published	2019-04-11
URL	http://arxiv.org/abs/1904.05815v1
PDF	http://arxiv.org/pdf/1904.05815v1.pdf
PWC	https://paperswithcode.com/paper/gp-hd-using-genetic-programming-to-generate
Repo
Framework

Data-Driven Machine Learning Techniques for Self-healing in Cellular Wireless Networks: Challenges and Solutions


Title	Data-Driven Machine Learning Techniques for Self-healing in Cellular Wireless Networks: Challenges and Solutions
Authors	Tao Zhang, Kun Zhu, Ekram Hossain
Abstract	For enabling automatic deployment and management of cellular networks, the concept of self-organizing network (SON) was introduced. SON capabilities can enhance network performance, improve service quality, and reduce operational and capital expenditure (OPEX/CAPEX). As an important component in SON, self-healing is defined as a network paradigm where the faults of target networks are mitigated or recovered by automatically triggering a series of actions such as detection, diagnosis and compensation. Data-driven machine learning has been recognized as a powerful tool to bring intelligence into networks and to realize self-healing. However, there are major challenges for practical applications of machine learning techniques for self-healing. In this article, we first classify these challenges into five categories: 1) data imbalance, 2) data insufficiency, 3) cost insensitivity, 4) non-real-time response, and 5) multi-source data fusion. Then we provide potential technical solutions to address these challenges. Furthermore, a case study of cost-sensitive fault detection with imbalanced data is provided to illustrate the feasibility and effectiveness of the suggested solutions.
Tasks	Fault Detection
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06357v1
PDF	https://arxiv.org/pdf/1906.06357v1.pdf
PWC	https://paperswithcode.com/paper/data-driven-machine-learning-techniques-for
Repo
Framework

Photorealistic Material Editing Through Direct Image Manipulation


Title	Photorealistic Material Editing Through Direct Image Manipulation
Authors	Károly Zsolnai-Fehér, Peter Wonka, Michael Wimmer
Abstract	Creating photorealistic materials for light transport algorithms requires carefully fine-tuning a set of material properties to achieve a desired artistic effect. This is typically a lengthy process that involves a trained artist with specialized knowledge. In this work, we present a technique that aims to empower novice and intermediate-level users to synthesize high-quality photorealistic materials by only requiring basic image processing knowledge. In the proposed workflow, the user starts with an input image and applies a few intuitive transforms (e.g., colorization, image inpainting) within a 2D image editor of their choice, and in the next step, our technique produces a photorealistic result that approximates this target image. Our method combines the advantages of a neural network-augmented optimizer and an encoder neural network to produce high-quality output results within 30 seconds. We also demonstrate that it is resilient against poorly-edited target images and propose a simple extension to predict image sequences with a strict time budget of 1-2 seconds per image.
Tasks	Colorization, Image Inpainting
Published	2019-09-12
URL	https://arxiv.org/abs/1909.11622v1
PDF	https://arxiv.org/pdf/1909.11622v1.pdf
PWC	https://paperswithcode.com/paper/photorealistic-material-editing-through
Repo
Framework

Nonconvex Stochastic Nested Optimization via Stochastic ADMM


Title	Nonconvex Stochastic Nested Optimization via Stochastic ADMM
Authors	Zhongruo Wang
Abstract	We consider the stochastic nested composition optimization problem where the objective is a composition of two expected-value functions. We propose a stochastic ADMM to solve this complicated objective. In order to find an $\epsilon$-stationary point where the expected norm of the subgradient of the corresponding augmented Lagrangian is smaller than $\epsilon$, the total sample complexity of our method is $\mathcal{O}(\epsilon^{-3})$ for the online case and $\mathcal{O} \Bigl((2N_1 + N_2) + (2N_1 + N_2)^{1/2}\epsilon^{-2}\Bigr)$ for the finite sum case. The computational complexity is consistent with the proximal version proposed in \cite{zhang2019multi}, but our algorithm can solve more general problems when the proximal mapping of the penalty is not easy to compute.
Tasks
Published	2019-11-12
URL	https://arxiv.org/abs/1911.05167v1
PDF	https://arxiv.org/pdf/1911.05167v1.pdf
PWC	https://paperswithcode.com/paper/nonconvex-stochastic-nested-optimization-via
Repo
Framework

Perception-in-the-Loop Adversarial Examples


Title	Perception-in-the-Loop Adversarial Examples
Authors	Mahmoud Salamati, Sadegh Soudjani, Rupak Majumdar
Abstract	We present a scalable, black box, perception-in-the-loop technique to find adversarial examples for deep neural network classifiers. Black box means that our procedure only has input-output access to the classifier, and not to the internal structure, parameters, or intermediate confidence values. Perception-in-the-loop means that the notion of proximity between inputs can be directly queried from human participants rather than an arbitrarily chosen metric. Our technique is based on covariance matrix adaptation evolution strategy (CMA-ES), a black box optimization approach. CMA-ES explores the search space iteratively in a black box manner, by generating populations of candidates according to a distribution, choosing the best candidates according to a cost function, and updating the posterior distribution to favor the best candidates. We run CMA-ES using human participants to provide the fitness function, using the insight that the choice of best candidates in CMA-ES can be naturally modeled as a perception task: pick the top $k$ inputs perceptually closest to a fixed input. We empirically demonstrate that finding adversarial examples is feasible using small populations and few iterations. We compare the performance of CMA-ES on the MNIST benchmark with other black-box approaches using $L_p$ norms as a cost function, and show that it performs favorably both in terms of success in finding adversarial examples and in minimizing the distance between the original and the adversarial input. In experiments on the MNIST, CIFAR10, and GTSRB benchmarks, we demonstrate that CMA-ES can find perceptually similar adversarial inputs with a small number of iterations and small population sizes when using perception-in-the-loop. Finally, we show that networks trained specifically to be robust against $L_\infty$ norm can still be susceptible to perceptually similar adversarial examples.
Tasks
Published	2019-01-21
URL	http://arxiv.org/abs/1901.06834v1
PDF	http://arxiv.org/pdf/1901.06834v1.pdf
PWC	https://paperswithcode.com/paper/perception-in-the-loop-adversarial-examples
Repo
Framework

Recurrent Neural Network Transducer for Audio-Visual Speech Recognition


Title	Recurrent Neural Network Transducer for Audio-Visual Speech Recognition
Authors	Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan
Abstract	This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture. To support the development of such a system, we built a large audio-visual (A/V) dataset of segmented utterances extracted from YouTube public videos, leading to 31k hours of audio-visual training content. The performances of audio-only, visual-only, and audio-visual systems are compared on two large-vocabulary test sets: a set of utterance segments from public YouTube videos called YTDEV18 and the publicly available LRS3-TED set. To highlight the contribution of the visual modality, we also evaluated the performance of our system on the YTDEV18 set artificially corrupted with background noise and overlapping speech. To the best of our knowledge, our system significantly improves the state-of-the-art on the LRS3-TED set.
Tasks	Audio-Visual Speech Recognition, Speech Recognition, Visual Speech Recognition
Published	2019-11-08
URL	https://arxiv.org/abs/1911.04890v1
PDF	https://arxiv.org/pdf/1911.04890v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-neural-network-transducer-for-audio
Repo
Framework

Deep Learning for MIMO Channel Estimation: Interpretation, Performance, and Comparison


Title	Deep Learning for MIMO Channel Estimation: Interpretation, Performance, and Comparison
Authors	Hu Qiang, Gao Feifei, Zhang Hao, Jin Shi, Li Geoffrey Ye
Abstract	Deep learning (DL) has emerged as an effective tool for channel estimation in wireless communication systems, especially under some imperfect environments. However, even with such unprecedented success, DL methods still serve as black boxes and the lack of explanations on their internal mechanism severely limits further improvement and extension. In this paper, we present a preliminary theoretical analysis on DL based channel estimation for multiple-antenna systems to understand and interpret its internal mechanism. A deep neural network (DNN) with a rectified linear unit (ReLU) activation function is mathematically equivalent to a set of local linear functions corresponding to different input regions. Hence, the DL estimator built on it can achieve universal approximation to a large family of functions by making efficient use of piecewise linearity. We demonstrate that DL based channel estimation is not restricted to any specific signal model and approaches the minimum mean-squared error (MMSE) estimation in various scenarios without requiring any prior knowledge of channel statistics. Therefore, DL based channel estimation outperforms or is comparable with traditional channel estimation. Simulation results confirm the accuracy of the proposed interpretation and demonstrate the effectiveness of DL based channel estimation under both linear and nonlinear signal models.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.01918v1
PDF	https://arxiv.org/pdf/1911.01918v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-mimo-channel-estimation
Repo
Framework
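
The claim that a ReLU network is equivalent to a set of local linear functions can be checked on a toy example. The sketch below, with hypothetical weights and a single hidden layer, extracts the affine map that holds in the input region around a given point by freezing the pattern of active hidden units; it is an illustration of the piecewise-linearity argument, not the paper's estimator.

```python
def relu_net(x, W1, b1, w2, b2):
    """One-hidden-layer ReLU network with scalar output."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def local_affine_map(x, W1, b1, w2, b2):
    """Return (slope, intercept) of the affine function the network
    equals on the region sharing x's activation pattern."""
    active = [1.0 if sum(w * xi for w, xi in zip(row, x)) + b > 0 else 0.0
              for row, b in zip(W1, b1)]
    n_hidden = len(W1)
    slope = [sum(w2[j] * active[j] * W1[j][i] for j in range(n_hidden))
             for i in range(len(x))]
    intercept = sum(w2[j] * active[j] * b1[j] for j in range(n_hidden)) + b2
    return slope, intercept


# Hypothetical weights chosen only for the demonstration.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]
b1 = [0.0, -1.0, 0.5]
w2 = [1.0, -2.0, 3.0]
b2 = 0.1

x = [1.0, 2.0]
slope, intercept = local_affine_map(x, W1, b1, w2, b2)
affine_value = sum(s * xi for s, xi in zip(slope, x)) + intercept
```

Within the region where the same hidden units stay active, `affine_value` agrees with `relu_net(x, ...)`; crossing into another region changes the slope and intercept.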

Quantized Epoch-SGD for Communication-Efficient Distributed Learning


Title	Quantized Epoch-SGD for Communication-Efficient Distributed Learning
Authors	Shen-Yi Zhao, Hao Gao, Wu-Jun Li
Abstract	Due to its efficiency and ease of implementation, stochastic gradient descent (SGD) has been widely used in machine learning. In particular, SGD is one of the most popular optimization methods for distributed learning. Recently, quantized SGD (QSGD), which adopts quantization to reduce the communication cost in SGD-based distributed learning, has attracted much attention. Although several QSGD methods have been proposed, some of them are heuristic without theoretical guarantee, and others have high quantization variance, which slows convergence. In this paper, we propose a new method, called Quantized Epoch-SGD (QESGD), for communication-efficient distributed learning. QESGD compresses (quantizes) the parameter with variance reduction, so that it can get almost the same performance as that of SGD with less communication cost. QESGD is implemented on the Parameter Server framework, and empirical results on distributed deep learning show that QESGD can outperform other state-of-the-art quantization methods to achieve the best performance.
Tasks	Quantization
Published	2019-01-10
URL	http://arxiv.org/abs/1901.03040v1
PDF	http://arxiv.org/pdf/1901.03040v1.pdf
PWC	https://paperswithcode.com/paper/quantized-epoch-sgd-for-communication
Repo
Framework
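
To make the notion of quantization variance concrete, here is a minimal sketch of generic unbiased stochastic rounding onto a uniform grid. This is not the QESGD algorithm, whose details the abstract does not give; the function name and the `step` parameter are assumptions for illustration.

```python
import math
import random


def stochastic_quantize(values, step, rng=None):
    """Round each value to a multiple of step, unbiased in expectation.

    A value between two grid points is rounded up with probability equal
    to its fractional position between them, so E[Q(v)] = v. The price
    is extra variance, which grows with the grid spacing.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    rng = rng or random.Random()
    out = []
    for v in values:
        scaled = v / step
        lower = math.floor(scaled)
        round_up = rng.random() < (scaled - lower)
        out.append((lower + (1 if round_up else 0)) * step)
    return out
```

A coarser `step` means fewer bits per value to communicate but a noisier result, which is the trade-off that variance-reduction schemes for quantized SGD try to improve.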

Optimal Experimental Design for Staggered Rollouts


Title	Optimal Experimental Design for Staggered Rollouts
Authors	Ruoxuan Xiong, Susan Athey, Mohsen Bayati, Guido Imbens
Abstract	Experimentation has become an increasingly prevalent tool for guiding policy choices, firm decisions, and product innovation. A common hurdle in designing experiments is the lack of statistical power. In this paper, we study optimal multi-period experimental design under the constraint that the treatment cannot be easily removed once implemented; for example, a government or firm might implement treatment in different geographies at different times, where the treatment cannot be easily removed due to practical constraints. The design problem is to select which units to treat at which time, intending to test hypotheses about the effect of the treatment. When the potential outcome is a linear function of a unit effect, a time effect, and observed discrete covariates, we provide an analytically feasible solution to the design problem where the variance of the estimator for the treatment effect is at most 1+O(1/N^2) times the variance of the optimal design, where N is the number of units. This solution assigns units in a staggered treatment adoption pattern, where the proportion treated is a linear function of time. In the general setting where outcomes depend on latent covariates, we show that historical data can be utilized in the optimal design. We propose a data-driven local search algorithm with the minimax decision criterion to assign units to treatment times. We demonstrate that our approach improves upon benchmark experimental designs through synthetic experiments on real-world data sets from several domains, including healthcare, finance, and retail. Finally, we consider the case where the treatment effect changes with the time of treatment, showing that the optimal design treats a smaller fraction of units at the beginning and a greater share at the end.
Tasks
Published	2019-11-09
URL	https://arxiv.org/abs/1911.03764v1
PDF	https://arxiv.org/pdf/1911.03764v1.pdf
PWC	https://paperswithcode.com/paper/optimal-experimental-design-for-staggered
Repo
Framework
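
The staggered adoption pattern described in the abstract, where treatment is never removed and the treated proportion is a linear function of time, can be sketched as a binary unit-by-period matrix. The function name, the `start_frac` and `end_frac` parameters, and the rule of treating the lowest-indexed units first are hypothetical; the abstract does not give the paper's exact coefficients or assignment rule.

```python
def staggered_design(n_units, n_periods, start_frac=0.0, end_frac=1.0):
    """Build a treatment matrix design[i][t] in {0, 1}.

    The fraction of treated units rises linearly from start_frac in the
    first period to end_frac in the last. Units are treated in index
    order, so a unit stays treated once it has been treated.
    """
    if n_periods < 2:
        raise ValueError("need at least two periods")
    if not 0.0 <= start_frac <= end_frac <= 1.0:
        raise ValueError("need 0 <= start_frac <= end_frac <= 1")
    design = [[0] * n_periods for _ in range(n_units)]
    for t in range(n_periods):
        frac = start_frac + (end_frac - start_frac) * t / (n_periods - 1)
        n_treated = round(frac * n_units)
        for i in range(n_treated):
            design[i][t] = 1
    return design


design = staggered_design(4, 5)
```

In practice the order in which units enter treatment would be randomized or chosen by the design procedure; index order is used here only to keep the sketch short.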