Paper Group ANR 1102
The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares
Title | The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares |
Authors | Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli |
Abstract | Minimax optimal convergence rates for classes of stochastic convex optimization problems are well characterized, where the majority of results utilize iterate averaged stochastic gradient descent (SGD) with polynomially decaying step sizes. In contrast, SGD’s final iterate behavior has received much less attention despite its widespread use in practice. Motivated by this observation, this work provides a detailed study of the following question: what rate is achievable using the final iterate of SGD for the streaming least squares regression problem with and without strong convexity? First, this work shows that even if the time horizon T (i.e. the number of iterations SGD is run for) is known in advance, SGD’s final iterate behavior with any polynomially decaying learning rate scheme is highly sub-optimal compared to the minimax rate (by a condition number factor in the strongly convex case and a factor of $\sqrt{T}$ in the non-strongly convex case). In contrast, this paper shows that Step Decay schedules, which cut the learning rate by a constant factor every constant number of epochs (i.e., the learning rate decays geometrically), offer significant improvements over any polynomially decaying step sizes. In particular, the final iterate behavior with a step decay schedule is off the minimax rate by only $\log$ factors (in the condition number for the strongly convex case, and in T for the non-strongly convex case). Finally, in stark contrast to the known horizon case, this paper shows that the anytime (i.e. the limiting) behavior of SGD’s final iterate is poor (in that it queries iterates with highly sub-optimal function value infinitely often, i.e. in a limsup sense) irrespective of the stepsizes employed. These results demonstrate the subtlety in establishing optimal learning rate schemes (for the final iterate) for stochastic gradient procedures in fixed time horizon settings. |
Tasks | Stochastic Optimization |
Published | 2019-04-29 |
URL | https://arxiv.org/abs/1904.12838v2 |
https://arxiv.org/pdf/1904.12838v2.pdf | |
PWC | https://paperswithcode.com/paper/the-step-decay-schedule-a-near-optimal |
Repo | |
Framework | |
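To make the contrast in the abstract above concrete, here is a minimal sketch of a step decay schedule next to a polynomially decaying one. The abstract only says the rate is cut by a constant factor at regular intervals; the choice of roughly log2(T) equal phases, the halving factor, and the function names `step_decay_lr` and `poly_decay_lr` are assumptions made for illustration, not the authors' exact procedure.

```python
import math


def step_decay_lr(t, total_steps, lr0, decay_factor=2.0):
    """Learning rate at step t (0-indexed) under a step decay schedule.

    Assumption: the known horizon is split into about log2(total_steps)
    equal phases, and the rate is divided by `decay_factor` at the start
    of each new phase, so it decays geometrically across phases.
    """
    num_phases = max(1, int(math.log2(total_steps)))
    phase_len = math.ceil(total_steps / num_phases)
    phase = t // phase_len
    return lr0 / (decay_factor ** phase)


def poly_decay_lr(t, lr0, power=0.5):
    """Polynomially decaying rate lr0 / (1 + t)^power, for comparison."""
    return lr0 / ((1 + t) ** power)


if __name__ == "__main__":
    T = 1024
    for t in (0, 102, 103, 512, 1023):
        print(t, step_decay_lr(t, T, 0.1), poly_decay_lr(t, 0.1))
```

The step decay rate stays flat inside a phase and drops sharply between phases, whereas the polynomial rate shrinks a little at every step.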
Column generation for the discrete Unit Commitment problem with min-stop ramping constraints
Title | Column generation for the discrete Unit Commitment problem with min-stop ramping constraints |
Authors | Nicolas Dupin |
Abstract | The discrete unit commitment problem with min-stop ramping constraints optimizes the daily production of thermal power plants (coal, gas, fuel units). For this problem, compact Integer Linear Programming (ILP) formulations have been designed to solve small instances exactly and real-size instances heuristically. This paper investigates whether Dantzig-Wolfe reformulation makes it possible to improve the previous exact method and matheuristics. The extended ILP formulation is presented with the column generation algorithm to solve its linear relaxation. The experimental results show that the Dantzig-Wolfe reformulation does not improve the quality of the linear relaxation of the tightest compact ILP formulations. Computational experiments also suggest a conjecture that would explain this result: the compact ILP formulation of min-stop ramping constraints would be tight. Such results validate the quality of the exact methods and matheuristics based on compact ILP formulations previously designed. |
Tasks | |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.09255v1 |
https://arxiv.org/pdf/1912.09255v1.pdf | |
PWC | https://paperswithcode.com/paper/column-generation-for-the-discrete-unit |
Repo | |
Framework | |
Automated Circuit Approximation Method Driven by Data Distribution
Title | Automated Circuit Approximation Method Driven by Data Distribution |
Authors | Zdenek Vasicek, Vojtech Mrazek, Lukas Sekanina |
Abstract | We propose an application-tailored data-driven fully automated method for functional approximation of combinational circuits. We demonstrate how an application-level error metric such as the classification accuracy can be translated to a component-level error metric needed for an efficient and fast search in the space of approximate low-level components that are used in the application. This is possible by employing a weighted mean error distance (WMED) metric for steering the circuit approximation process which is conducted by means of genetic programming. WMED introduces a set of weights (calculated from the data distribution measured on a selected signal in a given application) determining the importance of each input vector for the approximation process. The method is evaluated using synthetic benchmarks and application-specific approximate MAC (multiply-and-accumulate) units that are designed to provide the best trade-offs between the classification accuracy and power consumption of two image classifiers based on neural networks. |
Tasks | |
Published | 2019-03-11 |
URL | http://arxiv.org/abs/1903.04188v1 |
http://arxiv.org/pdf/1903.04188v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-circuit-approximation-method-driven |
Repo | |
Framework | |
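The weighted mean error distance described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the normalization by the sum of the weights, the toy 2-bit multiplier, and the names `wmed`, `exact_mul` and `approx_mul` are hypothetical and are not taken from the paper.

```python
from itertools import product


def wmed(exact, approx, inputs, weights):
    """Weighted mean error distance over a list of input vectors.

    Each input vector x contributes |exact(x) - approx(x)| scaled by its
    weight; weights are normalized so they act as a distribution.
    """
    total_w = sum(weights)
    err = sum(w * abs(exact(*x) - approx(*x)) for x, w in zip(inputs, weights))
    return err / total_w


def exact_mul(a, b):
    return a * b


def approx_mul(a, b):
    # Hypothetical approximate multiplier: ignores the least significant bit of a.
    return (a & 0b10) * b


if __name__ == "__main__":
    inputs = list(product(range(4), repeat=2))
    # Assumed data distribution of operand a (e.g. measured on a signal
    # in the application); operand b is taken as uniform.
    p_a = [0.4, 0.1, 0.4, 0.1]
    data_weights = [p_a[a] * 0.25 for a, _ in inputs]
    uniform_weights = [1.0] * len(inputs)
    print(wmed(exact_mul, approx_mul, inputs, data_weights))     # data-driven
    print(wmed(exact_mul, approx_mul, inputs, uniform_weights))  # plain mean error
```

Because odd values of `a` are rare under the assumed distribution, the data-driven metric penalizes this approximation less than the uniform mean error distance does, which is the effect the weights are meant to capture.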
Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture Model
Title | Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture Model |
Authors | Yoshiaki Bando, Yoko Sasaki, Kazuyoshi Yoshii |
Abstract | This paper presents an unsupervised method that trains neural source separation by using only multichannel mixture signals. Conventional neural separation methods require a lot of supervised data to achieve excellent performance. Although multichannel methods based on spatial information can work without such training data, they are often sensitive to parameter initialization and degrade when the sources are located close to each other. The proposed method uses a cost function based on a spatial model called a complex Gaussian mixture model (cGMM). This model has the time-frequency (TF) masks and directions of arrival (DoAs) of sources as latent variables and is used for training separation and localization networks that respectively estimate these variables. This joint training solves the frequency permutation ambiguity of the spatial model in a unified deep Bayesian framework. In addition, the pre-trained network can be used not only for conducting monaural separation but also for efficiently initializing a multichannel separation algorithm. Experimental results with simulated speech mixtures showed that our method outperformed a conventional initialization method. |
Tasks | |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11307v1 |
https://arxiv.org/pdf/1908.11307v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-bayesian-unsupervised-source-separation |
Repo | |
Framework | |
Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising
Title | Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising |
Authors | Haokui Zhang, Ying Li, Hao Chen, Chunhua Shen |
Abstract | Recently, neural architecture search (NAS) methods have attracted much attention and outperformed manually designed architectures on a few high-level vision tasks. In this paper, we propose HiNAS (Hierarchical NAS), an effort towards employing NAS to automatically design effective neural network architectures for image denoising. HiNAS adopts gradient based search strategies and employs operations with adaptive receptive field to build a flexible hierarchical search space. During the search stage, HiNAS shares cells across different feature levels to save memory and employs an early stopping strategy to avoid the collapse issue in NAS, which considerably accelerates the search. The proposed HiNAS is both memory and computation efficient, taking only about 4.5 hours to search using a single GPU. We evaluate the effectiveness of our proposed HiNAS on two different datasets, namely an additive white Gaussian noise dataset BSD500, and a realistic noise dataset SIM1800. Experimental results show that the architecture found by HiNAS has fewer parameters and enjoys a faster inference speed, while achieving highly competitive performance compared with state-of-the-art methods. We also present analysis on the architectures found by NAS. HiNAS also shows good performance on experiments for image de-raining. |
Tasks | Denoising, Image Denoising, Image Restoration, Neural Architecture Search |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08228v3 |
https://arxiv.org/pdf/1909.08228v3.pdf | |
PWC | https://paperswithcode.com/paper/ir-nas-neural-architecture-search-for-image |
Repo | |
Framework | |
A Machine Learning Solution for Beam Tracking in mmWave Systems
Title | A Machine Learning Solution for Beam Tracking in mmWave Systems |
Authors | Daoud Burghal, Naveed A. Abbasi, Andreas F. Molisch |
Abstract | Utilizing millimeter-wave (mmWave) frequencies for wireless communication in mobile systems is challenging since it requires continuous tracking of the beam direction. Recently, beam tracking techniques based on channel sparsity and/or Kalman filter-based techniques were proposed where the solutions use assumptions regarding the environment and device mobility that may not hold in practical scenarios. In this paper, we explore a machine learning-based approach to track the angle of arrival (AoA) for specific paths in realistic scenarios. In particular, we use a recurrent neural network (R-NN) structure with a modified cost function to track the AoA. We propose methods to train the network on sequential data, and study the performance of our proposed solution in comparison to an extended Kalman filter based solution in a realistic mmWave scenario based on a stochastic channel model from the QuaDRiGa framework. Results show that our proposed solution outperforms an extended Kalman filter-based method by reducing the AoA outage probability, and thus reducing the need for frequent beam search. |
Tasks | |
Published | 2019-12-29 |
URL | https://arxiv.org/abs/2001.01574v1 |
https://arxiv.org/pdf/2001.01574v1.pdf | |
PWC | https://paperswithcode.com/paper/a-machine-learning-solution-for-beam-tracking |
Repo | |
Framework | |
Are Odds Really Odd? Bypassing Statistical Detection of Adversarial Examples
Title | Are Odds Really Odd? Bypassing Statistical Detection of Adversarial Examples |
Authors | Hossein Hosseini, Sreeram Kannan, Radha Poovendran |
Abstract | Deep learning classifiers are known to be vulnerable to adversarial examples. A recent paper presented at ICML 2019 proposed a statistical test detection method based on the observation that logits of noisy adversarial examples are biased toward the true class. The method is evaluated on CIFAR-10 dataset and is shown to achieve 99% true positive rate (TPR) at only 1% false positive rate (FPR). In this paper, we first develop a classifier-based adaptation of the statistical test method and show that it improves the detection performance. We then propose Logit Mimicry Attack method to generate adversarial examples such that their logits mimic those of benign images. We show that our attack bypasses both statistical test and classifier-based methods, reducing their TPR to less than 2.2% and 1.6%, respectively, even at 5% FPR. We finally show that a classifier-based detector that is trained with logits of mimicry adversarial examples can be evaded by an adaptive attacker that specifically targets the detector. Furthermore, even a detector that is iteratively trained to defend against adaptive attacker cannot be made robust, indicating that statistics of logits cannot be used to detect adversarial examples. |
Tasks | |
Published | 2019-07-28 |
URL | https://arxiv.org/abs/1907.12138v1 |
https://arxiv.org/pdf/1907.12138v1.pdf | |
PWC | https://paperswithcode.com/paper/are-odds-really-odd-bypassing-statistical |
Repo | |
Framework | |
Semantic Role Labeling with Iterative Structure Refinement
Title | Semantic Role Labeling with Iterative Structure Refinement |
Authors | Chunchuan Lyu, Shay B. Cohen, Ivan Titov |
Abstract | Modern state-of-the-art Semantic Role Labeling (SRL) methods rely on expressive sentence encoders (e.g., multi-layer LSTMs) but tend to model only local (if any) interactions between individual argument labeling decisions. This contrasts with earlier work and also with the intuition that the labels of individual arguments are strongly interdependent. We model interactions between argument labeling decisions through iterative refinement. Starting with an output produced by a factorized model, we iteratively refine it using a refinement network. Instead of modeling arbitrary interactions among roles and words, we encode prior knowledge about the SRL problem by designing a restricted network architecture capturing non-local interactions. This modeling choice prevents overfitting and results in an effective model, outperforming strong factorized baseline models on all 7 CoNLL-2009 languages, and achieving state-of-the-art results on 5 of them, including English. |
Tasks | Semantic Role Labeling |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03285v1 |
https://arxiv.org/pdf/1909.03285v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-role-labeling-with-iterative |
Repo | |
Framework | |
MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors
Title | MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors |
Authors | Royson Lee, Stylianos I. Venieris, Łukasz Dudziak, Sourav Bhattacharya, Nicholas D. Lane |
Abstract | In recent years, convolutional networks have demonstrated unprecedented performance in the image restoration task of super-resolution (SR). SR entails the upscaling of a single low-resolution image in order to meet application-specific image quality demands and plays a key role in mobile devices. To comply with privacy regulations and reduce the overhead of cloud computing, executing SR models locally on-device constitutes a key alternative approach. Nevertheless, the excessive compute and memory requirements of SR workloads pose a challenge in mapping SR networks on resource-constrained mobile platforms. This work presents MobiSR, a novel framework for performing efficient super-resolution on-device. Given a target mobile platform, the proposed framework considers popular model compression techniques and traverses the design space to reach the highest performing trade-off between image quality and processing speed. At run time, a novel scheduler dispatches incoming image patches to the appropriate model-engine pair based on the patch’s estimated upscaling difficulty in order to meet the required image quality with minimum processing latency. Quantitative evaluation shows that the proposed framework yields on-device SR designs that achieve an average speedup of 2.13x over highly-optimized parallel difficulty-unaware mappings and 4.79x over highly-optimized single compute engine implementations. |
Tasks | Image Restoration, Model Compression, Super-Resolution |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07985v1 |
https://arxiv.org/pdf/1908.07985v1.pdf | |
PWC | https://paperswithcode.com/paper/190807985 |
Repo | |
Framework | |
DropConnect Is Effective in Modeling Uncertainty of Bayesian Deep Networks
Title | DropConnect Is Effective in Modeling Uncertainty of Bayesian Deep Networks |
Authors | Aryan Mobiny, Hien V. Nguyen, Supratik Moulik, Naveen Garg, Carol C. Wu |
Abstract | Deep neural networks (DNNs) have achieved state-of-the-art performances in many important domains, including medical diagnosis, security, and autonomous driving. In these domains where safety is highly critical, an erroneous decision can result in serious consequences. While a perfect prediction accuracy is not always achievable, recent work on Bayesian deep networks shows that it is possible to know when DNNs are more likely to make mistakes. Knowing what DNNs do not know is desirable to increase the safety of deep learning technology in sensitive applications. Bayesian neural networks attempt to address this challenge. However, traditional approaches are computationally intractable and do not scale well to large, complex neural network architectures. In this paper, we develop a theoretical framework to approximate Bayesian inference for DNNs by imposing a Bernoulli distribution on the model weights. This method, called MC-DropConnect, gives us a tool to represent the model uncertainty with little change in the overall model structure or computational cost. We extensively validate the proposed algorithm on multiple network architectures and datasets for classification and semantic segmentation tasks. We also propose new metrics to quantify the uncertainty estimates. This enables an objective comparison between MC-DropConnect and prior approaches. Our empirical results demonstrate that the proposed framework yields significant improvement in both prediction accuracy and uncertainty estimation quality compared to the state of the art. |
Tasks | Autonomous Driving, Bayesian Inference, Medical Diagnosis, Semantic Segmentation |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.04569v1 |
https://arxiv.org/pdf/1906.04569v1.pdf | |
PWC | https://paperswithcode.com/paper/dropconnect-is-effective-in-modeling |
Repo | |
Framework | |
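The idea in the abstract above, placing a Bernoulli distribution on the weights and sampling at prediction time, can be illustrated with a tiny sketch. Everything here is an assumption made for illustration: a single linear unit stands in for a deep network, the inverted `1 / keep_prob` scaling is one common convention, and `mc_dropconnect_predict` is a hypothetical name rather than the authors' implementation.

```python
import random
import statistics


def mc_dropconnect_predict(weights, bias, x, keep_prob=0.5, n_samples=200, seed=0):
    """Monte Carlo prediction for one linear unit with Bernoulli-masked weights.

    Returns the mean and standard deviation of the sampled outputs; the
    spread serves as a rough proxy for model uncertainty.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        y = bias
        for w, xi in zip(weights, x):
            # The mask is drawn per weight (DropConnect), not per activation.
            if rng.random() < keep_prob:
                # Assumed inverted scaling keeps the expected output unchanged.
                y += (w / keep_prob) * xi
        outputs.append(y)
    return statistics.fmean(outputs), statistics.pstdev(outputs)


if __name__ == "__main__":
    mean, std = mc_dropconnect_predict([1.0, 2.0], 0.5, [3.0, 4.0])
    print(f"prediction {mean:.2f} +/- {std:.2f}")
```

With `keep_prob=1.0` no weight is ever dropped, so the sketch reduces to the deterministic output with zero spread.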
Boosted GAN with Semantically Interpretable Information for Image Inpainting
Title | Boosted GAN with Semantically Interpretable Information for Image Inpainting |
Authors | Ang Li, Jianzhong Qi, Rui Zhang, Ramamohanarao Kotagiri |
Abstract | Image inpainting aims at restoring missing regions of corrupted images, which has many applications such as image restoration and object removal. However, current GAN-based inpainting models fail to explicitly consider the semantic consistency between restored images and original images. For example, given a male image with image region of one eye missing, current models may restore it with a female eye. This is due to the ambiguity of GAN-based inpainting models: these models can generate many possible restorations given a missing region. To address this limitation, our key insight is that semantically interpretable information (such as attribute and segmentation information) of input images (with missing regions) can provide essential guidance for the inpainting process. Based on this insight, we propose a boosted GAN with semantically interpretable information for image inpainting that consists of an inpainting network and a discriminative network. The inpainting network utilizes two auxiliary pretrained networks to discover the attribute and segmentation information of input images and incorporates them into the inpainting process to provide explicit semantic-level guidance. The discriminative network adopts a multi-level design that can enforce regularizations not only on overall realness but also on attribute and segmentation consistency with the original images. Experimental results show that our proposed model can preserve consistency on both attribute and segmentation level, and significantly outperforms the state-of-the-art models. |
Tasks | Image Inpainting, Image Restoration |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04503v1 |
https://arxiv.org/pdf/1908.04503v1.pdf | |
PWC | https://paperswithcode.com/paper/boosted-gan-with-semantically-interpretable |
Repo | |
Framework | |
Copula Representations and Error Surface Projections for the Exclusive Or Problem
Title | Copula Representations and Error Surface Projections for the Exclusive Or Problem |
Authors | Roy S. Freedman |
Abstract | The exclusive or (xor) function is one of the simplest examples that illustrate why nonlinear feedforward networks are superior to linear regression for machine learning applications. We review the xor representation and approximation problems and discuss their solutions in terms of probabilistic logic and associative copula functions. After briefly reviewing the specification of feedforward networks, we compare the dynamics of learned error surfaces with different activation functions such as RELU and tanh through a set of colorful three-dimensional charts. The copula representations extend xor from Boolean to real values, thereby providing a convenient way to demonstrate the concept of cross-validation on in-sample and out-sample data sets. Our approach is pedagogical and is meant to be a machine learning prolegomenon. |
Tasks | |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.04483v1 |
https://arxiv.org/pdf/1907.04483v1.pdf | |
PWC | https://paperswithcode.com/paper/copula-representations-and-error-surface |
Repo | |
Framework | |
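One way to read the extension of xor from Boolean to real values mentioned in the abstract above is through the identity P(A xor B) = P(A) + P(B) - 2 P(A and B), with the joint term supplied by a copula C(x, y). The sketch below assumes that reading; the three copulas chosen (product, minimum, Lukasiewicz) are standard examples picked for illustration and are not necessarily the ones the paper analyzes.

```python
def product_copula(x, y):
    """Independence copula."""
    return x * y


def min_copula(x, y):
    """Upper Frechet bound (comonotonic) copula."""
    return min(x, y)


def lukasiewicz_copula(x, y):
    """Lower Frechet bound (countermonotonic) copula."""
    return max(x + y - 1.0, 0.0)


def copula_xor(x, y, copula=product_copula):
    """Real-valued xor on [0, 1]^2: x + y - 2 * C(x, y)."""
    return x + y - 2.0 * copula(x, y)


if __name__ == "__main__":
    for c in (product_copula, min_copula, lukasiewicz_copula):
        corners = [copula_xor(a, b, c) for a in (0.0, 1.0) for b in (0.0, 1.0)]
        print(c.__name__, corners, copula_xor(0.5, 0.5, c))
```

All three versions agree with Boolean xor on the four corners and differ only in the interior of the unit square, which is what makes them usable as targets for in-sample versus out-of-sample experiments.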
Manifold Modeling in Embedded Space: A Perspective for Interpreting Deep Image Prior
Title | Manifold Modeling in Embedded Space: A Perspective for Interpreting Deep Image Prior |
Authors | Tatsuya Yokota, Hidekata Hontani, Qibin Zhao, Andrzej Cichocki |
Abstract | Deep image prior (DIP), which utilizes a deep convolutional network (ConvNet) structure itself as an image prior, has attracted attention in the computer vision and machine learning communities. It empirically shows the effectiveness of ConvNet structure for various image restoration applications. However, why the DIP works so well is still unknown, and why the convolution operation is useful for image reconstruction or enhancement is not very clear. In this study, we tackle these questions. The proposed approach divides the convolution into "delay-embedding" and "transformation (i.e., encoder-decoder)", and proposes a simple, but essential, image/tensor modeling method which is closely related to dynamical systems and self-similarity. The proposed method, named manifold modeling in embedded space (MMES), is implemented by using a novel denoising auto-encoder in combination with a multi-way delay-embedding transform. In spite of its simplicity, the image/tensor completion, super-resolution, deconvolution, and denoising results of MMES are quite similar, even competitive, to DIP in our extensive experiments, and these results would help us reinterpret/characterize the DIP from the perspective of a "low-dimensional patch-manifold prior". |
Tasks | Denoising, Image Reconstruction, Image Restoration, Super-Resolution |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.02995v2 |
https://arxiv.org/pdf/1908.02995v2.pdf | |
PWC | https://paperswithcode.com/paper/manifold-modeling-in-embedded-space-a |
Repo | |
Framework | |
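The delay-embedding step named in the abstract above can be illustrated in one dimension: a signal is unfolded into overlapping windows (a Hankel-like patch matrix), some model acts on the patches, and the result is folded back by averaging overlapping entries. This is a minimal 1-D sketch with hypothetical function names; the paper works with a multi-way transform on images and tensors, which this does not reproduce.

```python
def delay_embed(signal, tau):
    """Unfold a 1-D signal into all overlapping windows of length tau."""
    return [signal[i:i + tau] for i in range(len(signal) - tau + 1)]


def inverse_delay_embed(patches, n):
    """Fold patches back into a length-n signal by averaging overlaps."""
    sums = [0.0] * n
    counts = [0] * n
    for i, patch in enumerate(patches):
        for j, value in enumerate(patch):
            sums[i + j] += value
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    patches = delay_embed(x, 3)
    print(patches)
    print(inverse_delay_embed(patches, len(x)))
```

In the setting the abstract describes, a denoising auto-encoder would be applied to the patches between the two steps; here the patches are passed through unchanged, so folding recovers the original signal.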
Restoration of Non-rigidly Distorted Underwater Images using a Combination of Compressive Sensing and Local Polynomial Image Representations
Title | Restoration of Non-rigidly Distorted Underwater Images using a Combination of Compressive Sensing and Local Polynomial Image Representations |
Authors | Jerin Geo James, Pranay Agrawal, Ajit Rajwade |
Abstract | Images of static scenes submerged beneath a wavy water surface exhibit severe non-rigid distortions. The physics of water flow suggests that water surfaces possess spatio-temporal smoothness and temporal periodicity. Hence they possess a sparse representation in the 3D discrete Fourier (DFT) basis. Motivated by this, we pose the task of restoration of such video sequences as a compressed sensing (CS) problem. We begin by tracking a few salient feature points across the frames of a video sequence of the submerged scene. Using these point trajectories, we show that the motion fields at all other (non-tracked) points can be effectively estimated using a typical CS solver. This by itself is a novel contribution in the field of non-rigid motion estimation. We show that this method outperforms state of the art algorithms for underwater image restoration. We further consider a simple optical flow algorithm based on local polynomial expansion of the image frames (PEOF). Surprisingly, we demonstrate that PEOF is more efficient and often outperforms all the state of the art methods in terms of numerical measures. Finally, we demonstrate that a two-stage approach consisting of the CS step followed by PEOF much more accurately preserves the image structure and improves the (visual as well as numerical) video quality as compared to just the PEOF stage. |
Tasks | Compressive Sensing, Image Restoration, Motion Estimation, Optical Flow Estimation |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.01940v1 |
https://arxiv.org/pdf/1908.01940v1.pdf | |
PWC | https://paperswithcode.com/paper/restoration-of-non-rigidly-distorted |
Repo | |
Framework | |
Fenton-Wilkinson Order Statistics and German Tanks: A Case Study of an Orienteering Relay Race
Title | Fenton-Wilkinson Order Statistics and German Tanks: A Case Study of an Orienteering Relay Race |
Authors | Joonas Pääkkönen |
Abstract | Ordinal regression falls between discrete-valued classification and continuous-valued regression. Ordinal target variables can be associated with ranked random variables. These random variables are known as order statistics and they are closely related to ordinal regression. However, the challenge of using order statistics for ordinal regression prediction is finding a suitable parent distribution. In this work, we provide a case study of a real-world orienteering relay race by viewing it as a random process. For this process, we show that accurate order statistical ordinal regression predictions of final team rankings, or places, can be obtained by assuming a lognormal distribution of individual leg times. Moreover, we apply Fenton-Wilkinson approximations to intermediate changeover times alongside an estimator for the total number of teams as in the notorious German tank problem. The purpose of this work is, in part, to spark interest in studying the applicability of order statistics in ordinal regression problems. |
Tasks | |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.05034v1 |
https://arxiv.org/pdf/1912.05034v1.pdf | |
PWC | https://paperswithcode.com/paper/fenton-wilkinson-order-statistics-and-german |
Repo | |
Framework | |
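The two named ingredients in the abstract above have compact textbook forms, sketched here. The German tank estimator shown is the standard minimum-variance unbiased estimator for the maximum of a discrete uniform population, and the Fenton-Wilkinson step matches the first two moments of a sum of independent lognormals to a single lognormal. How the paper combines them with order statistics is not reproduced; the function names and sample values are hypothetical.

```python
import math


def german_tank_estimate(observed):
    """Estimate the population maximum N from distinct observed serial numbers.

    Uses N_hat = m + m / k - 1, where m is the sample maximum and k the
    sample size.
    """
    m, k = max(observed), len(observed)
    return m + m / k - 1


def fenton_wilkinson(params):
    """Approximate a sum of independent lognormals by one lognormal.

    `params` is a list of (mu, sigma) pairs of the underlying normals.
    Returns (mu_s, sigma_s) of the moment-matched lognormal.
    """
    mean = sum(math.exp(mu + sigma ** 2 / 2) for mu, sigma in params)
    var = sum(
        (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)
        for mu, sigma in params
    )
    sigma_s_sq = math.log(1 + var / mean ** 2)
    mu_s = math.log(mean) - sigma_s_sq / 2
    return mu_s, math.sqrt(sigma_s_sq)


if __name__ == "__main__":
    # Hypothetical bib numbers seen at a changeover.
    print(german_tank_estimate([19, 40, 42, 60]))
    # Hypothetical leg-time parameters for a three-leg relay.
    print(fenton_wilkinson([(1.0, 0.5), (1.2, 0.4), (0.9, 0.6)]))
```

A quick sanity check on the approximation is that a single-term sum returns its own parameters unchanged.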