January 27, 2020

3433 words 17 mins read

Paper Group ANR 1102

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares. Column generation for the discrete Unit Commitment problem with min-stop ramping constraints. Automated Circuit Approximation Method Driven by Data Distribution. Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture …

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

Title The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares
Authors Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli
Abstract Minimax optimal convergence rates for classes of stochastic convex optimization problems are well characterized, where the majority of results utilize iterate averaged stochastic gradient descent (SGD) with polynomially decaying step sizes. In contrast, SGD’s final iterate behavior has received much less attention despite its widespread use in practice. Motivated by this observation, this work provides a detailed study of the following question: what rate is achievable using the final iterate of SGD for the streaming least squares regression problem with and without strong convexity? First, this work shows that even if the time horizon T (i.e. the number of iterations SGD is run for) is known in advance, SGD’s final iterate behavior with any polynomially decaying learning rate scheme is highly sub-optimal compared to the minimax rate (by a condition number factor in the strongly convex case and a factor of $\sqrt{T}$ in the non-strongly convex case). In contrast, this paper shows that Step Decay schedules, which cut the learning rate by a constant factor every constant number of epochs (i.e., the learning rate decays geometrically), offer significant improvements over any polynomially decaying step sizes. In particular, the final iterate behavior with a step decay schedule is off the minimax rate by only $\log$ factors (in the condition number for the strongly convex case, and in T for the non-strongly convex case). Finally, in stark contrast to the known horizon case, this paper shows that the anytime (i.e. the limiting) behavior of SGD’s final iterate is poor (in that it queries iterates with highly sub-optimal function value infinitely often, i.e. in a limsup sense) irrespective of the step sizes employed. These results demonstrate the subtlety in establishing optimal learning rate schemes (for the final iterate) for stochastic gradient procedures in fixed time horizon settings.
Tasks Stochastic Optimization
Published 2019-04-29
URL https://arxiv.org/abs/1904.12838v2
PDF https://arxiv.org/pdf/1904.12838v2.pdf
PWC https://paperswithcode.com/paper/the-step-decay-schedule-a-near-optimal
Repo
Framework
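A minimal NumPy sketch of the step-decay idea described above: SGD on a streaming least-squares problem where the learning rate is cut by a constant factor after every fixed block of iterations. The dimensions, noise level, and decay constants below are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 10, 20000
w_true = rng.normal(size=d)

w = np.zeros(d)
eta0, decay_every, decay_factor = 0.05, T // 10, 0.5  # illustrative constants

for t in range(T):
    # streaming least-squares sample: y = <x, w_true> + noise
    x = rng.normal(size=d)
    y = x @ w_true + 0.1 * rng.normal()
    eta = eta0 * decay_factor ** (t // decay_every)  # step-decay schedule
    grad = (x @ w - y) * x                           # gradient of 0.5 * (x @ w - y)^2
    w -= eta * grad

print("final-iterate error:", np.linalg.norm(w - w_true))
```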

Column generation for the discrete Unit Commitment problem with min-stop ramping constraints

Title Column generation for the discrete Unit Commitment problem with min-stop ramping constraints
Authors Nicolas Dupin
Abstract The discrete unit commitment problem with min-stop ramping constraints optimizes the daily production of thermal power plants (coal, gas, and fuel units). For this problem, compact Integer Linear Programming (ILP) formulations have been designed to solve small instances exactly and real-size instances heuristically. This paper investigates whether a Dantzig-Wolfe reformulation can improve the previous exact method and matheuristics. The extended ILP formulation is presented together with the column generation algorithm used to solve its linear relaxation. The experimental results show that the Dantzig-Wolfe reformulation does not improve the quality of the linear relaxation of the tightest compact ILP formulations. The computational experiments also suggest a conjecture that would explain this result: the compact ILP formulation of the min-stop ramping constraints is tight. These results validate the quality of the exact methods and matheuristics based on the compact ILP formulations designed previously.
Tasks
Published 2019-12-13
URL https://arxiv.org/abs/1912.09255v1
PDF https://arxiv.org/pdf/1912.09255v1.pdf
PWC https://paperswithcode.com/paper/column-generation-for-the-discrete-unit
Repo
Framework

Automated Circuit Approximation Method Driven by Data Distribution

Title Automated Circuit Approximation Method Driven by Data Distribution
Authors Zdenek Vasicek, Vojtech Mrazek, Lukas Sekanina
Abstract We propose an application-tailored, data-driven, fully automated method for the functional approximation of combinational circuits. We demonstrate how an application-level error metric, such as the classification accuracy, can be translated to a component-level error metric needed for an efficient and fast search in the space of approximate low-level components that are used in the application. This is made possible by employing a weighted mean error distance (WMED) metric for steering the circuit approximation process, which is conducted by means of genetic programming. WMED introduces a set of weights (calculated from the data distribution measured on a selected signal in a given application) determining the importance of each input vector for the approximation process. The method is evaluated using synthetic benchmarks and application-specific approximate MAC (multiply-and-accumulate) units that are designed to provide the best trade-offs between the classification accuracy and power consumption of two image classifiers based on neural networks.
Tasks
Published 2019-03-11
URL http://arxiv.org/abs/1903.04188v1
PDF http://arxiv.org/pdf/1903.04188v1.pdf
PWC https://paperswithcode.com/paper/automated-circuit-approximation-method-driven
Repo
Framework
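The WMED metric itself is short enough to sketch: weight each input vector's error distance by how often that input occurs in the target application. The toy truncated multiplier and the random empirical distribution below are hypothetical stand-ins for an approximate component and the measured data distribution.

```python
import numpy as np

def wmed(exact, approx, weights):
    """Weighted mean error distance: weights reflect how often each input
    vector occurs in the target application (assumed to sum to 1)."""
    err = np.abs(exact.astype(np.int64) - approx.astype(np.int64))
    return float(np.sum(weights * err))

# toy example: exact vs. truncated 4x4-bit multiplier (an illustrative approximation)
a = np.arange(16).reshape(-1, 1)
b = np.arange(16).reshape(1, -1)
exact = (a * b).ravel()
approx = ((a >> 1) * (b >> 1) << 2).ravel()   # drop the LSBs of both operands

# weights would come from the data distribution measured on a signal in the
# application; here a random empirical distribution stands in for it
counts = np.random.default_rng(0).integers(0, 100, size=exact.size)
weights = counts / counts.sum()

print("WMED:", wmed(exact, approx, weights))
```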

Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture Model

Title Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture Model
Authors Yoshiaki Bando, Yoko Sasaki, Kazuyoshi Yoshii
Abstract This paper presents an unsupervised method that trains neural source separation by using only multichannel mixture signals. Conventional neural separation methods require a lot of supervised data to achieve excellent performance. Although multichannel methods based on spatial information can work without such training data, they are often sensitive to parameter initialization and degrade when the sources are located close to each other. The proposed method uses a cost function based on a spatial model called a complex Gaussian mixture model (cGMM). This model has the time-frequency (TF) masks and directions of arrival (DoAs) of sources as latent variables and is used for training separation and localization networks that respectively estimate these variables. This joint training solves the frequency permutation ambiguity of the spatial model in a unified deep Bayesian framework. In addition, the pre-trained network can be used not only for conducting monaural separation but also for efficiently initializing a multichannel separation algorithm. Experimental results with simulated speech mixtures showed that our method outperformed a conventional initialization method.
Tasks
Published 2019-08-29
URL https://arxiv.org/abs/1908.11307v1
PDF https://arxiv.org/pdf/1908.11307v1.pdf
PWC https://paperswithcode.com/paper/deep-bayesian-unsupervised-source-separation
Repo
Framework

Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising

Title Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising
Authors Haokui Zhang, Ying Li, Hao Chen, Chunhua Shen
Abstract Recently, neural architecture search (NAS) methods have attracted much attention and outperformed manually designed architectures on a few high-level vision tasks. In this paper, we propose HiNAS (Hierarchical NAS), an effort towards employing NAS to automatically design effective neural network architectures for image denoising. HiNAS adopts gradient-based search strategies and employs operations with adaptive receptive fields to build a flexible hierarchical search space. During the search stage, HiNAS shares cells across different feature levels to save memory and employs an early stopping strategy to avoid the collapse issue in NAS, which considerably accelerates the search. The proposed HiNAS is both memory and computation efficient, taking only about 4.5 hours of search on a single GPU. We evaluate the effectiveness of HiNAS on two different datasets, namely the additive white Gaussian noise dataset BSD500 and the realistic noise dataset SIM1800. Experimental results show that the architecture found by HiNAS has fewer parameters and enjoys a faster inference speed, while achieving highly competitive performance compared with state-of-the-art methods. We also present an analysis of the architectures found by NAS. HiNAS also shows good performance in image de-raining experiments.
Tasks Denoising, Image Denoising, Image Restoration, Neural Architecture Search
Published 2019-09-18
URL https://arxiv.org/abs/1909.08228v3
PDF https://arxiv.org/pdf/1909.08228v3.pdf
PWC https://paperswithcode.com/paper/ir-nas-neural-architecture-search-for-image
Repo
Framework

A Machine Learning Solution for Beam Tracking in mmWave Systems

Title A Machine Learning Solution for Beam Tracking in mmWave Systems
Authors Daoud Burghal, Naveed A. Abbasi, Andreas F. Molisch
Abstract Utilizing millimeter-wave (mmWave) frequencies for wireless communication in \emph{mobile} systems is challenging since it requires continuous tracking of the beam direction. Recently, beam tracking techniques based on channel sparsity and/or Kalman filtering were proposed, but these solutions rely on assumptions regarding the environment and device mobility that may not hold in practical scenarios. In this paper, we explore a machine learning-based approach to track the angle of arrival (AoA) for specific paths in realistic scenarios. In particular, we use a recurrent neural network (R-NN) structure with a modified cost function to track the AoA. We propose methods to train the network on sequential data, and study the performance of our proposed solution in comparison to an extended Kalman filter based solution in a realistic mmWave scenario based on a stochastic channel model from the QuaDRiGa framework. Results show that our proposed solution outperforms the extended Kalman filter-based method by reducing the AoA outage probability, and thus reducing the need for frequent beam search.
Tasks
Published 2019-12-29
URL https://arxiv.org/abs/2001.01574v1
PDF https://arxiv.org/pdf/2001.01574v1.pdf
PWC https://paperswithcode.com/paper/a-machine-learning-solution-for-beam-tracking
Repo
Framework
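As a rough illustration of the recurrent-tracking idea, the PyTorch sketch below shows a GRU regressor with a wrap-around-aware angular loss, which is one plausible reading of a "modified cost function" for AoA. The architecture, dimensions, and loss are assumptions, not the paper's R-NN or its exact cost.

```python
import torch
import torch.nn as nn

class AoATracker(nn.Module):
    """Toy recurrent tracker mapping a sequence of noisy beam measurements
    to an AoA estimate per step (hypothetical architecture)."""
    def __init__(self, in_dim=8, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                # x: (batch, time, in_dim)
        h, _ = self.rnn(x)
        return self.head(h).squeeze(-1)  # AoA in radians, per time step

def angular_loss(pred, target):
    """Modified cost: penalize the wrapped angular difference so that
    angles near +pi and -pi are treated as close."""
    diff = torch.atan2(torch.sin(pred - target), torch.cos(pred - target))
    return (diff ** 2).mean()

model = AoATracker()
x = torch.randn(4, 50, 8)      # fake measurement sequences
target = torch.randn(4, 50)    # fake AoA trajectories
loss = angular_loss(model(x), target)
loss.backward()
```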

Are Odds Really Odd? Bypassing Statistical Detection of Adversarial Examples

Title Are Odds Really Odd? Bypassing Statistical Detection of Adversarial Examples
Authors Hossein Hosseini, Sreeram Kannan, Radha Poovendran
Abstract Deep learning classifiers are known to be vulnerable to adversarial examples. A recent paper presented at ICML 2019 proposed a statistical test detection method based on the observation that the logits of noisy adversarial examples are biased toward the true class. The method is evaluated on the CIFAR-10 dataset and is shown to achieve a 99% true positive rate (TPR) at only a 1% false positive rate (FPR). In this paper, we first develop a classifier-based adaptation of the statistical test method and show that it improves the detection performance. We then propose the Logit Mimicry Attack, a method for generating adversarial examples whose logits mimic those of benign images. We show that our attack bypasses both the statistical test and classifier-based methods, reducing their TPR to less than 2.2% and 1.6%, respectively, even at 5% FPR. We finally show that a classifier-based detector that is trained with logits of mimicry adversarial examples can be evaded by an adaptive attacker that specifically targets the detector. Furthermore, even a detector that is iteratively trained to defend against the adaptive attacker cannot be made robust, indicating that the statistics of logits cannot be used to detect adversarial examples.
Tasks
Published 2019-07-28
URL https://arxiv.org/abs/1907.12138v1
PDF https://arxiv.org/pdf/1907.12138v1.pdf
PWC https://paperswithcode.com/paper/are-odds-really-odd-bypassing-statistical
Repo
Framework
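A generic sketch of a logit-mimicry-style attack: a PGD-like loop that perturbs an input within an L-infinity ball so that its logits move toward a benign reference logit vector. The step sizes, iteration count, and MSE objective are assumptions for illustration, not the authors' exact procedure.

```python
import torch
import torch.nn.functional as F

def logit_mimicry_attack(model, x, target_logits, eps=8/255, alpha=2/255, steps=40):
    """Perturb x within an L-inf ball of radius eps so that model(x_adv)
    has logits close to target_logits (e.g., logits of a benign image).
    Generic PGD-style sketch under assumed hyperparameters."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.mse_loss(model(x_adv), target_logits)   # mimic the benign logits
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()                     # step toward target logits
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)   # project to the L-inf ball
            x_adv = x_adv.clamp(0.0, 1.0)                           # keep a valid image
        x_adv = x_adv.detach()
    return x_adv
```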

Semantic Role Labeling with Iterative Structure Refinement

Title Semantic Role Labeling with Iterative Structure Refinement
Authors Chunchuan Lyu, Shay B. Cohen, Ivan Titov
Abstract Modern state-of-the-art Semantic Role Labeling (SRL) methods rely on expressive sentence encoders (e.g., multi-layer LSTMs) but tend to model only local (if any) interactions between individual argument labeling decisions. This contrasts with earlier work and also with the intuition that the labels of individual arguments are strongly interdependent. We model interactions between argument labeling decisions through {\it iterative refinement}. Starting with an output produced by a factorized model, we iteratively refine it using a refinement network. Instead of modeling arbitrary interactions among roles and words, we encode prior knowledge about the SRL problem by designing a restricted network architecture capturing non-local interactions. This modeling choice prevents overfitting and results in an effective model, outperforming strong factorized baseline models on all 7 CoNLL-2009 languages, and achieving state-of-the-art results on 5 of them, including English.
Tasks Semantic Role Labeling
Published 2019-09-07
URL https://arxiv.org/abs/1909.03285v1
PDF https://arxiv.org/pdf/1909.03285v1.pdf
PWC https://paperswithcode.com/paper/semantic-role-labeling-with-iterative
Repo
Framework
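A toy sketch of the iterative refinement loop: a factorized base scorer proposes role logits, and a small refinement network repeatedly updates them conditioned on the current soft role assignment. The layer shapes and the plain linear refiner are placeholders, not the paper's restricted architecture.

```python
import torch
import torch.nn as nn

class IterativeRefiner(nn.Module):
    """Toy iterative structure refinement for role labeling: base (factorized)
    logits are repeatedly corrected by a refinement network that sees the
    features and the current soft role assignment."""
    def __init__(self, feat_dim=64, n_roles=20, steps=3):
        super().__init__()
        self.base = nn.Linear(feat_dim, n_roles)
        self.refine = nn.Linear(feat_dim + n_roles, n_roles)
        self.steps = steps

    def forward(self, feats):                     # feats: (n_args, feat_dim)
        logits = self.base(feats)                 # initial factorized prediction
        for _ in range(self.steps):
            soft = torch.softmax(logits, dim=-1)  # current soft role assignment
            logits = logits + self.refine(torch.cat([feats, soft], dim=-1))
        return logits

logits = IterativeRefiner()(torch.randn(7, 64))   # 7 candidate arguments
```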

MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors

Title MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors
Authors Royson Lee, Stylianos I. Venieris, Łukasz Dudziak, Sourav Bhattacharya, Nicholas D. Lane
Abstract In recent years, convolutional networks have demonstrated unprecedented performance in the image restoration task of super-resolution (SR). SR entails the upscaling of a single low-resolution image in order to meet application-specific image quality demands and plays a key role in mobile devices. To comply with privacy regulations and reduce the overhead of cloud computing, executing SR models locally on-device constitutes a key alternative approach. Nevertheless, the excessive compute and memory requirements of SR workloads pose a challenge in mapping SR networks on resource-constrained mobile platforms. This work presents MobiSR, a novel framework for performing efficient super-resolution on-device. Given a target mobile platform, the proposed framework considers popular model compression techniques and traverses the design space to reach the highest performing trade-off between image quality and processing speed. At run time, a novel scheduler dispatches incoming image patches to the appropriate model-engine pair based on the patch’s estimated upscaling difficulty in order to meet the required image quality with minimum processing latency. Quantitative evaluation shows that the proposed framework yields on-device SR designs that achieve an average speedup of 2.13x over highly-optimized parallel difficulty-unaware mappings and 4.79x over highly-optimized single compute engine implementations.
Tasks Image Restoration, Model Compression, Super-Resolution
Published 2019-08-21
URL https://arxiv.org/abs/1908.07985v1
PDF https://arxiv.org/pdf/1908.07985v1.pdf
PWC https://paperswithcode.com/paper/190807985
Repo
Framework
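The run-time dispatching idea can be sketched in a few lines: estimate each patch's upscaling difficulty and route easy patches to a fast, compact engine and hard patches to an accurate one. The total-variation difficulty proxy, the threshold, and the dummy nearest-neighbour "engines" below are purely illustrative, not MobiSR's learned components.

```python
import numpy as np

def patch_difficulty(patch):
    """Proxy for upscaling difficulty: high-frequency content of the patch.
    Total variation is an illustrative stand-in for the framework's estimator."""
    dy = np.abs(np.diff(patch, axis=0)).mean()
    dx = np.abs(np.diff(patch, axis=1)).mean()
    return dx + dy

def dispatch(patches, fast_sr, accurate_sr, threshold=0.05):
    """Send easy patches to the compact/fast engine and hard patches to the
    accurate engine (hypothetical two-engine setup)."""
    return [(fast_sr if patch_difficulty(p) < threshold else accurate_sr)(p)
            for p in patches]

# usage sketch with dummy engines (nearest-neighbour 2x upscaling for illustration)
patches = [np.random.rand(32, 32) for _ in range(8)]
upscaled = dispatch(patches,
                    fast_sr=lambda p: np.kron(p, np.ones((2, 2))),
                    accurate_sr=lambda p: np.kron(p, np.ones((2, 2))))
```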

DropConnect Is Effective in Modeling Uncertainty of Bayesian Deep Networks

Title DropConnect Is Effective in Modeling Uncertainty of Bayesian Deep Networks
Authors Aryan Mobiny, Hien V. Nguyen, Supratik Moulik, Naveen Garg, Carol C. Wu
Abstract Deep neural networks (DNNs) have achieved state-of-the-art performances in many important domains, including medical diagnosis, security, and autonomous driving. In these domains where safety is highly critical, an erroneous decision can result in serious consequences. While a perfect prediction accuracy is not always achievable, recent work on Bayesian deep networks shows that it is possible to know when DNNs are more likely to make mistakes. Knowing what DNNs do not know is desirable to increase the safety of deep learning technology in sensitive applications. Bayesian neural networks attempt to address this challenge. However, traditional approaches are computationally intractable and do not scale well to large, complex neural network architectures. In this paper, we develop a theoretical framework to approximate Bayesian inference for DNNs by imposing a Bernoulli distribution on the model weights. This method, called MC-DropConnect, gives us a tool to represent the model uncertainty with little change in the overall model structure or computational cost. We extensively validate the proposed algorithm on multiple network architectures and datasets for classification and semantic segmentation tasks. We also propose new metrics to quantify the uncertainty estimates. This enables an objective comparison between MC-DropConnect and prior approaches. Our empirical results demonstrate that the proposed framework yields significant improvement in both prediction accuracy and uncertainty estimation quality compared to the state of the art.
Tasks Autonomous Driving, Bayesian Inference, Medical Diagnosis, Semantic Segmentation
Published 2019-06-07
URL https://arxiv.org/abs/1906.04569v1
PDF https://arxiv.org/pdf/1906.04569v1.pdf
PWC https://paperswithcode.com/paper/dropconnect-is-effective-in-modeling
Repo
Framework
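A compact sketch of the MC-DropConnect idea in PyTorch: keep the Bernoulli masking of the weights active at test time, run several stochastic forward passes, and use the spread of the predictions as an uncertainty estimate. The layer below is simplified (no rescaling by the keep probability), and the sizes are arbitrary.

```python
import torch
import torch.nn as nn

class DropConnectLinear(nn.Module):
    """Linear layer whose weights are masked by a Bernoulli distribution on
    every forward pass, kept active at test time for MC sampling."""
    def __init__(self, in_f, out_f, drop_prob=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.1)
        self.bias = nn.Parameter(torch.zeros(out_f))
        self.drop_prob = drop_prob

    def forward(self, x):
        mask = torch.bernoulli(torch.full_like(self.weight, 1 - self.drop_prob))
        return x @ (self.weight * mask).t() + self.bias

net = nn.Sequential(DropConnectLinear(16, 32), nn.ReLU(), DropConnectLinear(32, 10))

x = torch.randn(4, 16)
samples = torch.stack([net(x).softmax(-1) for _ in range(30)])  # 30 MC passes
pred_mean = samples.mean(0)   # predictive distribution
pred_var = samples.var(0)     # a simple per-class uncertainty estimate
```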

Boosted GAN with Semantically Interpretable Information for Image Inpainting

Title Boosted GAN with Semantically Interpretable Information for Image Inpainting
Authors Ang Li, Jianzhong Qi, Rui Zhang, Ramamohanarao Kotagiri
Abstract Image inpainting aims at restoring the missing regions of corrupted images, and has many applications such as image restoration and object removal. However, current GAN-based inpainting models fail to explicitly consider the semantic consistency between restored images and original images. For example, given a male face image with the region of one eye missing, current models may restore it with a female eye. This is due to the ambiguity of GAN-based inpainting models: these models can generate many possible restorations for a given missing region. To address this limitation, our key insight is that semantically interpretable information (such as attribute and segmentation information) of input images (with missing regions) can provide essential guidance for the inpainting process. Based on this insight, we propose a boosted GAN with semantically interpretable information for image inpainting that consists of an inpainting network and a discriminative network. The inpainting network utilizes two auxiliary pretrained networks to discover the attribute and segmentation information of input images and incorporates them into the inpainting process to provide explicit semantic-level guidance. The discriminative network adopts a multi-level design that can enforce regularizations not only on overall realness but also on attribute and segmentation consistency with the original images. Experimental results show that our proposed model preserves consistency at both the attribute and segmentation levels, and significantly outperforms the state-of-the-art models.
Tasks Image Inpainting, Image Restoration
Published 2019-08-13
URL https://arxiv.org/abs/1908.04503v1
PDF https://arxiv.org/pdf/1908.04503v1.pdf
PWC https://paperswithcode.com/paper/boosted-gan-with-semantically-interpretable
Repo
Framework

Copula Representations and Error Surface Projections for the Exclusive Or Problem

Title Copula Representations and Error Surface Projections for the Exclusive Or Problem
Authors Roy S. Freedman
Abstract The exclusive or (xor) function is one of the simplest examples that illustrate why nonlinear feedforward networks are superior to linear regression for machine learning applications. We review the xor representation and approximation problems and discuss their solutions in terms of probabilistic logic and associative copula functions. After briefly reviewing the specification of feedforward networks, we compare the dynamics of learned error surfaces with different activation functions such as RELU and tanh through a set of colorful three-dimensional charts. The copula representations extend xor from Boolean to real values, thereby providing a convenient way to demonstrate the concept of cross-validation on in-sample and out-sample data sets. Our approach is pedagogical and is meant to be a machine learning prolegomenon.
Tasks
Published 2019-07-08
URL https://arxiv.org/abs/1907.04483v1
PDF https://arxiv.org/pdf/1907.04483v1.pdf
PWC https://paperswithcode.com/paper/copula-representations-and-error-surface
Repo
Framework
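The real-valued xor extension can be written down directly with probabilistic logic: P(A xor B) = P(A) + P(B) - 2*P(A and B), where P(A and B) is given by a copula of the two marginals. The sketch below uses the product (independence) copula as the default; the paper's specific copula choices may differ.

```python
import numpy as np

def xor_prob(x, y, copula=lambda u, v: u * v):
    """Real-valued xor via probabilistic logic:
    P(A xor B) = P(A) + P(B) - 2*C(P(A), P(B)) for an associative copula C.
    The default is the product (independence) copula."""
    return x + y - 2.0 * copula(x, y)

# Boolean corners recover the xor truth table ...
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(int(a), int(b), xor_prob(a, b))

# ... and the function varies smoothly in between
print(xor_prob(0.3, 0.8))   # 0.3 + 0.8 - 2*0.24 = 0.62
```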

Manifold Modeling in Embedded Space: A Perspective for Interpreting Deep Image Prior

Title Manifold Modeling in Embedded Space: A Perspective for Interpreting Deep Image Prior
Authors Tatsuya Yokota, Hidekata Hontani, Qibin Zhao, Andrzej Cichocki
Abstract Deep image prior (DIP), which utilizes a deep convolutional network (ConvNet) structure itself as an image prior, has attracted attention in the computer vision and machine learning communities. It empirically shows the effectiveness of the ConvNet structure for various image restoration applications. However, why DIP works so well is still unknown, and why the convolution operation is useful for image reconstruction or enhancement is not very clear. In this study, we tackle these questions. The proposed approach divides the convolution into "delay-embedding" and "transformation" (i.e., encoder-decoder), and proposes a simple but essential image/tensor modeling method which is closely related to dynamical systems and self-similarity. The proposed method, named manifold modeling in embedded space (MMES), is implemented by using a novel denoising auto-encoder in combination with a multi-way delay-embedding transform. In spite of its simplicity, the image/tensor completion, super-resolution, deconvolution, and denoising results of MMES are quite similar and even competitive to those of DIP in our extensive experiments, and these results help in reinterpreting/characterizing DIP from the perspective of a "low-dimensional patch-manifold prior".
Tasks Denoising, Image Reconstruction, Image Restoration, Super-Resolution
Published 2019-08-08
URL https://arxiv.org/abs/1908.02995v2
PDF https://arxiv.org/pdf/1908.02995v2.pdf
PWC https://paperswithcode.com/paper/manifold-modeling-in-embedded-space-a
Repo
Framework
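The "delay-embedding" half of the decomposition amounts to Hankelization: collecting all overlapping patches of an image into the columns of a matrix on which an auto-encoder can then act. A simplified 2-D NumPy sketch (the patch size and image are illustrative; the paper works with a multi-way, tensor-valued version):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def delay_embed(image, patch=8):
    """Delay embedding of a 2-D image: gather all overlapping patch x patch
    windows into a Hankel-like matrix whose columns are vectorized patches.
    The 'transformation' half of MMES would be an auto-encoder acting on
    these columns."""
    windows = sliding_window_view(image, (patch, patch))   # (H-p+1, W-p+1, p, p)
    return windows.reshape(-1, patch * patch).T            # (p*p, n_patches)

img = np.random.rand(64, 64)
H = delay_embed(img, patch=8)
print(H.shape)   # (64, 3249): each column is a vectorized 8x8 patch
```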

Restoration of Non-rigidly Distorted Underwater Images using a Combination of Compressive Sensing and Local Polynomial Image Representations

Title Restoration of Non-rigidly Distorted Underwater Images using a Combination of Compressive Sensing and Local Polynomial Image Representations
Authors Jerin Geo James, Pranay Agrawal, Ajit Rajwade
Abstract Images of static scenes submerged beneath a wavy water surface exhibit severe non-rigid distortions. The physics of water flow suggests that water surfaces possess spatio-temporal smoothness and temporal periodicity. Hence they possess a sparse representation in the 3D discrete Fourier (DFT) basis. Motivated by this, we pose the task of restoration of such video sequences as a compressed sensing (CS) problem. We begin by tracking a few salient feature points across the frames of a video sequence of the submerged scene. Using these point trajectories, we show that the motion fields at all other (non-tracked) points can be effectively estimated using a typical CS solver. This by itself is a novel contribution in the field of non-rigid motion estimation. We show that this method outperforms state of the art algorithms for underwater image restoration. We further consider a simple optical flow algorithm based on local polynomial expansion of the image frames (PEOF). Surprisingly, we demonstrate that PEOF is more efficient and often outperforms all the state of the art methods in terms of numerical measures. Finally, we demonstrate that a two-stage approach consisting of the CS step followed by PEOF much more accurately preserves the image structure and improves the (visual as well as numerical) video quality as compared to just the PEOF stage.
Tasks Compressive Sensing, Image Restoration, Motion Estimation, Optical Flow Estimation
Published 2019-08-06
URL https://arxiv.org/abs/1908.01940v1
PDF https://arxiv.org/pdf/1908.01940v1.pdf
PWC https://paperswithcode.com/paper/restoration-of-non-rigidly-distorted
Repo
Framework
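To give the flavor of the compressed-sensing step, the sketch below recovers a 1-D signal that is sparse in the DFT basis from a few random samples using orthogonal matching pursuit, a generic CS solver. This is a one-dimensional illustration of the "sparse in the 3D DFT basis" idea, not the authors' 3-D pipeline or their specific solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 256, 80

# toy stand-in for a motion-field time series: a few sinusoids, hence DFT-sparse
t = np.arange(n)
x_true = np.sin(2 * np.pi * 3 * t / n) + 0.5 * np.sin(2 * np.pi * 7 * t / n)

idx = np.sort(rng.choice(n, size=m, replace=False))    # "tracked" sample locations
b = x_true[idx]

# sensing matrix: rows of the (unitary) inverse DFT at the sampled locations
F_inv = np.fft.ifft(np.eye(n), axis=0, norm="ortho")
A = F_inv[idx, :]

# orthogonal matching pursuit over the DFT dictionary
support, r = [], b.astype(complex)
for _ in range(8):                                      # conjugate pairs => few atoms needed
    support.append(int(np.argmax(np.abs(A.conj().T @ r))))
    coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
    r = b - A[:, support] @ coef
    if np.linalg.norm(r) < 1e-8:
        break

c = np.zeros(n, dtype=complex)
c[support] = coef
x_hat = np.real(F_inv @ c)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```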

Fenton-Wilkinson Order Statistics and German Tanks: A Case Study of an Orienteering Relay Race

Title Fenton-Wilkinson Order Statistics and German Tanks: A Case Study of an Orienteering Relay Race
Authors Joonas Pääkkönen
Abstract Ordinal regression falls between discrete-valued classification and continuous-valued regression. Ordinal target variables can be associated with ranked random variables. These random variables are known as order statistics and they are closely related to ordinal regression. However, the challenge of using order statistics for ordinal regression prediction is finding a suitable parent distribution. In this work, we provide a case study of a real-world orienteering relay race by viewing it as a random process. For this process, we show that accurate order statistical ordinal regression predictions of final team rankings, or places, can be obtained by assuming a lognormal distribution of individual leg times. Moreover, we apply Fenton-Wilkinson approximations to intermediate changeover times alongside an estimator for the total number of teams as in the notorious German tank problem. The purpose of this work is, in part, to spark interest in studying the applicability of order statistics in ordinal regression problems.
Tasks
Published 2019-12-10
URL https://arxiv.org/abs/1912.05034v1
PDF https://arxiv.org/pdf/1912.05034v1.pdf
PWC https://paperswithcode.com/paper/fenton-wilkinson-order-statistics-and-german
Repo
Framework
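Both ingredients named in the abstract are short enough to sketch: the Fenton-Wilkinson approximation matches the first two moments of a sum of independent lognormal leg times with a single lognormal, and the German tank estimator guesses the total number of teams from the maximum observed rank. The leg-time parameters and observed ranks below are illustrative, not data from the race.

```python
import numpy as np

def fenton_wilkinson(mus, sigmas):
    """Approximate the sum of independent lognormal leg times by one lognormal
    via moment matching (the Fenton-Wilkinson method)."""
    means = np.exp(mus + sigmas**2 / 2)
    varis = (np.exp(sigmas**2) - 1) * np.exp(2 * mus + sigmas**2)
    m, v = means.sum(), varis.sum()        # mean/variance of the sum (independence assumed)
    sigma2 = np.log(1 + v / m**2)
    return np.log(m) - sigma2 / 2, np.sqrt(sigma2)   # (mu, sigma) of the approximation

def german_tank(observed_ranks):
    """Classic German-tank estimate of the total number of teams:
    N ~ m + m/k - 1, with m the maximum observed rank and k the sample size."""
    m, k = max(observed_ranks), len(observed_ranks)
    return m + m / k - 1

# cumulative time after three relay legs, each lognormal (illustrative parameters)
mu, sigma = fenton_wilkinson(np.array([3.9, 4.0, 4.1]), np.array([0.15, 0.2, 0.18]))
print("FW lognormal params:", mu, sigma)
print("estimated number of teams:", german_tank([12, 47, 31, 88, 5]))
```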