Paper Group ANR 298
Randomized block proximal damped Newton method for composite self-concordant minimization
Title | Randomized block proximal damped Newton method for composite self-concordant minimization |
Authors | Zhaosong Lu |
Abstract | In this paper we consider the composite self-concordant (CSC) minimization problem, which minimizes the sum of a self-concordant function $f$ and a (possibly nonsmooth) proper closed convex function $g$. CSC minimization is the cornerstone of path-following interior point methods for solving a broad class of convex optimization problems, and it has also found numerous applications in machine learning. Proximal damped Newton (PDN) methods, which enjoy a nice iteration complexity, have been well studied in the literature for solving this problem. Given that at each iteration these methods typically require evaluating or accessing the Hessian of $f$ and solving a proximal Newton subproblem, the cost per iteration can be prohibitively high for large-scale problems. Inspired by the recent success of block coordinate descent methods, we propose a randomized block proximal damped Newton (RBPDN) method for solving CSC minimization. Compared to the PDN methods, the computational cost per iteration of RBPDN is usually significantly lower. Computational experiments on a class of regularized logistic regression problems demonstrate that RBPDN is indeed promising for solving large-scale CSC minimization problems. The convergence of RBPDN is also analyzed in the paper. In particular, we show that RBPDN is globally convergent when $g$ is Lipschitz continuous, and that it enjoys local linear convergence. Moreover, we show that for a class of $g$, including the case where $g$ is Lipschitz differentiable, RBPDN enjoys global linear convergence. As a striking consequence, this shows that the classical damped Newton methods [22,40] and the PDN method [31] are globally linearly convergent for such $g$, which was previously unknown in the literature. This result can also be used to sharpen the existing iteration complexity of these methods. |
Tasks | |
Published | 2016-07-01 |
URL | http://arxiv.org/abs/1607.00101v1 |
http://arxiv.org/pdf/1607.00101v1.pdf | |
PWC | https://paperswithcode.com/paper/randomized-block-proximal-damped-newton |
Repo | |
Framework | |
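The damped-Newton core of the method can be sketched for the smooth case ($g = 0$): pick a random coordinate block, solve the block Newton system, and damp the step by $1/(1+\lambda)$, where $\lambda$ is the block Newton decrement. A minimal numpy sketch under these simplifying assumptions (the paper's full method also handles the proximal term for nonsmooth $g$; the quadratic demo below is only a stand-in for a self-concordant $f$):

```python
import numpy as np

def rbpdn_step(x, grad, hess, block):
    """One randomized block damped Newton step (simplified: g = 0).

    grad/hess are callables returning the full gradient/Hessian;
    block is the index set of the randomly chosen coordinate block.
    """
    gJ = grad(x)[block]
    HJJ = hess(x)[np.ix_(block, block)]
    dJ = np.linalg.solve(HJJ, gJ)            # block Newton direction
    lam = np.sqrt(gJ @ dJ)                   # block Newton decrement
    x_new = x.copy()
    x_new[block] -= dJ / (1.0 + lam)         # damped step, no line search
    return x_new

# demo on a strongly convex quadratic
rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)
grad = lambda x: A @ x + b
hess = lambda x: A
x = np.zeros(n)
for _ in range(300):
    block = rng.choice(n, size=2, replace=False)
    x = rbpdn_step(x, grad, hess, block)
g_norm = np.linalg.norm(grad(x))
```

Only a 2x2 block of the Hessian is formed and factored per step, which is the source of the per-iteration saving over full PDN.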
An Integrated Optimization + Learning Approach to Optimal Dynamic Pricing for the Retailer with Multi-type Customers in Smart Grids
Title | An Integrated Optimization + Learning Approach to Optimal Dynamic Pricing for the Retailer with Multi-type Customers in Smart Grids |
Authors | Fanlin Meng, Xiao-Jun Zeng, Yan Zhang, Chris J. Dent, Dunwei Gong |
Abstract | In this paper, we consider a realistic and meaningful scenario in the context of smart grids where an electricity retailer serves three different types of customers, i.e., customers with an optimal home energy management system embedded in their smart meters (C-HEMS), customers with only smart meters (C-SM), and customers without smart meters (C-NONE). The main objective of this paper is to support the retailer in making optimal day-ahead dynamic pricing decisions in such a mixed customer pool. To this end, we propose a two-level decision-making framework in which the retailer, acting as the upper-level agent, first announces its electricity prices for the next 24 hours, and customers, acting as lower-level agents, subsequently schedule their energy usage accordingly. For the lower-level problem, we model the price responsiveness of different customers according to their unique characteristics. For the upper-level problem, we optimize the dynamic prices for the retailer to maximize its profit subject to realistic market constraints. The above two-level model is tackled by genetic algorithm (GA)-based distributed optimization methods, and its feasibility and effectiveness are confirmed via simulation results. |
Tasks | Decision Making, Distributed Optimization |
Published | 2016-12-18 |
URL | http://arxiv.org/abs/1612.05971v3 |
http://arxiv.org/pdf/1612.05971v3.pdf | |
PWC | https://paperswithcode.com/paper/an-integrated-optimization-learning-approach |
Repo | |
Framework | |
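The two-level structure can be illustrated with a toy sketch: an assumed linear customer price response plays the lower level, and a crude elitist evolutionary search (a stand-in for the paper's GA) optimizes the 24-hour price vector subject to a stand-in market constraint that fixes the daily average price. All curves and parameters below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
H = 24
hours = np.linspace(0, 2 * np.pi, H, endpoint=False)
wholesale = 0.05 + 0.03 * np.sin(hours)        # assumed wholesale cost ($/kWh)
base_load = 1.0 + 0.5 * np.sin(hours)          # assumed baseline demand (kWh)

def cap(prices, mean_price=0.10):
    """Stand-in market constraint: fix the daily average retail price."""
    return prices * (mean_price / prices.mean())

def customer_response(prices):
    """Toy lower level: demand falls linearly as price exceeds the average."""
    return base_load * np.clip(1.0 - 0.8 * (prices - 0.10) / 0.10, 0.2, None)

def retailer_profit(prices):
    """Upper-level objective: revenue minus wholesale cost."""
    load = customer_response(prices)
    return float(np.sum((prices - wholesale) * load))

flat = np.full(H, 0.10)                        # flat-rate benchmark tariff
pop = np.array([cap(p) for p in rng.uniform(0.06, 0.20, (40, H))])
pop[0] = flat
for _ in range(200):
    fit = np.array([retailer_profit(p) for p in pop])
    parents = pop[np.argsort(fit)[-10:]]       # elitist selection keeps the best
    kids = parents[rng.integers(10, size=30)] + rng.normal(0, 0.005, (30, H))
    pop = np.vstack([parents, np.array([cap(np.clip(k, 0.02, 0.30)) for k in kids])])

best = max(pop, key=retailer_profit)
```

Because the flat tariff is in the initial population and selection is elitist, the evolved prices can only match or beat it; the paper's framework additionally distributes the lower-level computations across the heterogeneous customer types.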
Fixpoint Approximation of Strategic Abilities under Imperfect Information
Title | Fixpoint Approximation of Strategic Abilities under Imperfect Information |
Authors | Wojciech Jamroga, Michał Knapik, Damian Kurpiewski |
Abstract | Model checking of strategic ability under imperfect information is known to be hard. The complexity results range from NP-completeness to undecidability, depending on the precise setup of the problem. No less importantly, fixpoint equivalences do not generally hold for imperfect information strategies, which seriously hampers incremental synthesis of winning strategies. In this paper, we propose translations of ATL_ir formulae that provide lower and upper bounds for their truth values, and are cheaper to verify than the original specifications. That is, if the expression is verified as true then the corresponding formula of ATL_ir should also hold in the given model. We begin by showing where the straightforward approach does not work. Then, we propose how it can be modified to obtain guaranteed lower bounds. To this end, we alter the next-step operator in such a way that traversing one’s indistinguishability relation is seen as an atomic activity. Most interestingly, the lower approximation is provided by a fixpoint expression that uses a nonstandard variant of the next-step ability operator. We show the correctness of the translations, establish their computational complexity, and validate the approach by experiments with a scalable scenario of Bridge play. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02684v3 |
http://arxiv.org/pdf/1612.02684v3.pdf | |
PWC | https://paperswithcode.com/paper/fixpoint-approximation-of-strategic-abilities |
Repo | |
Framework | |
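The key idea, a modified next-step operator that quantifies over all states the agent cannot distinguish, can be sketched as a least-fixpoint reachability computation on a toy model. The states, moves, and observation classes below are invented for illustration:

```python
# Lower bound for "the agent can eventually reach goal" under imperfect
# information: an action only counts if it leads into the target set from
# EVERY state indistinguishable from the current one.
states = {0, 1, 2, 3}
goal = {3}
moves = {                      # state -> action -> successor
    0: {'a': 1, 'b': 2},
    1: {'a': 3, 'b': 0},
    2: {'a': 3, 'b': 3},
    3: {'a': 3, 'b': 3},
}
indist = {0: {0}, 1: {1, 2}, 2: {1, 2}, 3: {3}}   # observation classes

def pre(target):
    """States with an action that reaches `target` from all indistinguishable states."""
    return {s for s in states
            if any(all(moves[t][act] in target for t in indist[s])
                   for act in moves[s])}

win = set(goal)
while True:                    # least fixpoint of  Z = goal ∪ pre(Z)
    new = goal | pre(win)
    if new == win:
        break
    win = new
```

States 1 and 2 enter the winning set first (action 'a' reaches the goal from both members of their observation class), and state 0 follows in the next iteration.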
Simultaneous Estimation of Noise Variance and Number of Peaks in Bayesian Spectral Deconvolution
Title | Simultaneous Estimation of Noise Variance and Number of Peaks in Bayesian Spectral Deconvolution |
Authors | Satoru Tokuda, Kenji Nagata, Masato Okada |
Abstract | The heuristic identification of peaks from noisy complex spectra often leads to misunderstanding of the physical and chemical properties of matter. In this paper, we propose a framework based on Bayesian inference, which enables us to separate multipeak spectra into single peaks statistically and consists of two steps. The first step is estimating both the noise variance and the number of peaks as hyperparameters based on Bayes free energy, which generally is not analytically tractable. The second step is fitting the parameters of each peak function to the given spectrum by calculating the posterior density, which has a problem of local minima and saddles since multipeak models are nonlinear and hierarchical. Our framework enables the escape from local minima or saddles by using the exchange Monte Carlo method and calculates Bayes free energy via the multiple histogram method. We discuss a simulation demonstrating how efficient our framework is and show that estimating both the noise variance and the number of peaks prevents overfitting, overpenalizing, and misunderstanding the precision of parameter estimation. |
Tasks | Bayesian Inference |
Published | 2016-07-26 |
URL | http://arxiv.org/abs/1607.07590v2 |
http://arxiv.org/pdf/1607.07590v2.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-estimation-of-noise-variance-and |
Repo | |
Framework | |
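The escape mechanism the abstract describes can be illustrated with a minimal exchange (replica-exchange) Monte Carlo sampler on an invented double-well energy, a stand-in for a multipeak fitting landscape: hot replicas cross barriers easily, and exchange moves hand good states down to the cold chain.

```python
import numpy as np

def energy(x):
    """Toy double-well 'negative log posterior' with minima at x = ±1."""
    return 8.0 * (x**2 - 1.0) ** 2

rng = np.random.default_rng(1)
betas = np.array([1.0, 0.5, 0.25, 0.1])   # inverse temperatures, cold first
xs = np.full(len(betas), -1.0)            # every chain starts in the left well
visited_right = False

for _ in range(4000):
    # Metropolis update within each replica
    for i, beta in enumerate(betas):
        prop = xs[i] + rng.normal(0.0, 0.3)
        if rng.random() < np.exp(-beta * (energy(prop) - energy(xs[i]))):
            xs[i] = prop
    # exchange move between a random pair of neighbouring temperatures
    i = rng.integers(len(betas) - 1)
    log_acc = (betas[i] - betas[i + 1]) * (energy(xs[i]) - energy(xs[i + 1]))
    if rng.random() < np.exp(log_acc):
        xs[i], xs[i + 1] = xs[i + 1], xs[i]
    visited_right = visited_right or xs[0] > 0.5
```

A single chain at beta = 1 would stay trapped in the left well for a very long time; with the exchange moves the cold chain visits both modes, which is what allows the multiple histogram method to estimate the Bayes free energy across temperatures.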
Learning Representations by Stochastic Meta-Gradient Descent in Neural Networks
Title | Learning Representations by Stochastic Meta-Gradient Descent in Neural Networks |
Authors | Vivek Veeriah, Shangtong Zhang, Richard S. Sutton |
Abstract | Representations are fundamental to artificial intelligence. The performance of a learning system depends on the type of representation used for representing the data. Typically, these representations are hand-engineered using domain knowledge. More recently, the trend is to learn these representations through stochastic gradient descent in multi-layer neural networks, which is called backprop. Learning representations directly from the incoming data stream reduces the human labour involved in designing a learning system. More importantly, it allows a learning system to scale to difficult tasks. In this paper, we introduce a new incremental learning algorithm called crossprop, which learns the incoming weights of hidden units using the meta-gradient descent approach previously introduced by Sutton (1992) and Schraudolph (1999) for learning step-sizes. The final update equation introduces an additional memory parameter for each of these weights and generalizes the backprop update equation. Our experiments show that crossprop learns and reuses its feature representation while tackling new and unseen tasks, whereas backprop relearns a new feature representation. |
Tasks | |
Published | 2016-12-09 |
URL | http://arxiv.org/abs/1612.02879v2 |
http://arxiv.org/pdf/1612.02879v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-representations-by-stochastic-meta |
Repo | |
Framework | |
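Crossprop's exact update is not reproduced here, but the meta-gradient step-size idea it builds on (Sutton's IDBD, 1992) is easy to sketch for a single linear unit: each weight carries a log step-size `beta_i` and a memory trace `h_i`, both adapted by gradient descent on the prediction error. The clipping bounds and constants below are assumptions added for numerical safety:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
w_true = np.zeros(n)
w_true[:3] = [1.0, -2.0, 0.5]          # only three inputs are actually relevant

w = np.zeros(n)
beta = np.full(n, np.log(0.05))        # per-weight log step-sizes
h = np.zeros(n)                        # memory traces
theta = 0.01                           # meta step-size
errs = []
for _ in range(5000):
    x = rng.standard_normal(n)
    target = w_true @ x + rng.normal(0.0, 0.1)
    delta = target - w @ x             # prediction error
    errs.append(delta**2)
    # meta-gradient update of the log step-sizes (clipped for stability)
    beta = np.clip(beta + theta * delta * x * h, -10.0, np.log(0.1))
    alpha = np.exp(beta)
    w += alpha * delta * x             # ordinary LMS step with per-weight rates
    h = h * np.clip(1.0 - alpha * x**2, 0.0, None) + alpha * delta * x

early, late = np.mean(errs[:200]), np.mean(errs[-200:])
```

The step-sizes of the relevant inputs grow while those of the irrelevant inputs shrink, which is the effect crossprop generalizes from step-sizes to the incoming weights of hidden units.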
BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration
Title | BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration |
Authors | Angela Dai, Matthias Nießner, Michael Zollhöfer, Shahram Izadi, Christian Theobalt |
Abstract | Real-time, high-quality, 3D scanning of large-scale scenes is key to mixed reality and robotic applications. However, scalability brings challenges of drift in pose estimation, introducing significant errors in the accumulated model. Approaches often require hours of offline processing to globally correct model errors. Recent online methods demonstrate compelling results, but suffer from: (1) needing minutes to perform online correction preventing true real-time use; (2) brittle frame-to-frame (or frame-to-model) pose estimation resulting in many tracking failures; or (3) supporting only unstructured point-based representations, which limit scan quality and applicability. We systematically address these issues with a novel, real-time, end-to-end reconstruction framework. At its core is a robust pose estimation strategy, optimizing per frame for a global set of camera poses by considering the complete history of RGB-D input with an efficient hierarchical approach. We remove the heavy reliance on temporal tracking, and continually localize to the globally optimized frames instead. We contribute a parallelizable optimization framework, which employs correspondences based on sparse features and dense geometric and photometric matching. Our approach estimates globally optimized (i.e., bundle adjusted) poses in real-time, supports robust tracking with recovery from gross tracking failures (i.e., relocalization), and re-estimates the 3D model in real-time to ensure global consistency; all within a single framework. Our approach outperforms state-of-the-art online systems with quality on par to offline methods, but with unprecedented speed and scan completeness. Our framework leads to a comprehensive online scanning solution for large indoor environments, enabling ease of use and high-quality results. |
Tasks | 3D Reconstruction, Pose Estimation |
Published | 2016-04-05 |
URL | http://arxiv.org/abs/1604.01093v3 |
http://arxiv.org/pdf/1604.01093v3.pdf | |
PWC | https://paperswithcode.com/paper/bundlefusion-real-time-globally-consistent-3d |
Repo | |
Framework | |
LOMo: Latent Ordinal Model for Facial Analysis in Videos
Title | LOMo: Latent Ordinal Model for Facial Analysis in Videos |
Authors | Karan Sikka, Gaurav Sharma, Marian Bartlett |
Abstract | We study the problem of facial analysis in videos. We propose a novel weakly supervised learning method that models the video event (expression, pain, etc.) as a sequence of automatically mined, discriminative sub-events (e.g., onset and offset phases for a smile, brow lowering and cheek raising for pain). The proposed model is inspired by the recent works on Multiple Instance Learning and latent SVM/HCRF; it extends such frameworks to approximately model the ordinal or temporal aspect of the videos. We obtain consistent improvements over relevant competitive baselines on four challenging and publicly available video-based facial analysis datasets for prediction of expression, clinical pain and intent in dyadic conversations. In combination with complementary features, we report state-of-the-art results on these datasets. |
Tasks | Multiple Instance Learning |
Published | 2016-04-06 |
URL | http://arxiv.org/abs/1604.01500v1 |
http://arxiv.org/pdf/1604.01500v1.pdf | |
PWC | https://paperswithcode.com/paper/lomo-latent-ordinal-model-for-facial-analysis |
Repo | |
Framework | |
Detecting Novel Processes with CANDIES – An Holistic Novelty Detection Technique based on Probabilistic Models
Title | Detecting Novel Processes with CANDIES – An Holistic Novelty Detection Technique based on Probabilistic Models |
Authors | Christian Gruhl, Bernhard Sick |
Abstract | In this article, we propose CANDIES (Combined Approach for Novelty Detection in Intelligent Embedded Systems), a new approach to novelty detection in technical systems. We assume that several processes interact in a technical system. If we observe these processes with sensors, we are able to model the observations (samples) with a probabilistic model, where, in an ideal case, the components of the parametric mixture density model we use correspond to the processes in the real world. At some point during run-time, novel processes may emerge in the technical system, such as in the case of an unpredictable failure. As a consequence, new kinds of samples are observed that require an adaptation of the model. CANDIES relies on mixtures of Gaussians which can be used for classification purposes, too. New processes may emerge in regions of the models’ input spaces where few samples were observed before (low-density regions) or in regions where many samples were already available (high-density regions). The latter case is more difficult, but most existing solutions focus on the former. Novelty detection in low- and high-density regions requires different detection strategies. With CANDIES, we introduce a new technique to detect novel processes in high-density regions by means of a fast online goodness-of-fit test. For detection in low-density regions we combine this approach with a 2SND (Two-Stage-Novelty-Detector) which we presented in preliminary work. The properties of CANDIES are evaluated using artificial data and benchmark data from the field of intrusion detection in computer networks, where the task is to detect new kinds of attacks. |
Tasks | Intrusion Detection |
Published | 2016-05-18 |
URL | http://arxiv.org/abs/1605.05628v1 |
http://arxiv.org/pdf/1605.05628v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-novel-processes-with-candies-an |
Repo | |
Framework | |
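Only the (easier) low-density case lends itself to a short sketch: score each observation under the fitted Gaussian mixture and flag it when its log-density falls below a threshold calibrated on normal data. The paper's actual contribution, the online goodness-of-fit test for high-density regions, is more involved and not reproduced here; the mixture below is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# known processes: two isotropic unit-variance Gaussian components
# (a stand-in for a mixture fitted to sensor data)
means = np.array([[0.0, 0.0], [4.0, 4.0]])
weights = np.array([0.5, 0.5])

def log_density(x):
    """Log-density of a 2-D isotropic unit-variance Gaussian mixture."""
    d2 = ((x[None, :] - means) ** 2).sum(axis=1)
    comp = weights * np.exp(-0.5 * d2) / (2 * np.pi)
    return np.log(comp.sum())

# calibrate a low-density threshold from data generated by the known processes
normal = np.vstack([rng.normal(m, 1.0, (500, 2)) for m in means])
thr = np.quantile([log_density(x) for x in normal], 0.01)

novel_point = np.array([10.0, -10.0])      # far from both known processes
typical_point = np.array([0.1, -0.2])      # near the first process
is_novel = log_density(novel_point) < thr
is_typical = log_density(typical_point) >= thr
```

A sample emerging *inside* a high-density region would pass this density test unnoticed, which is exactly why CANDIES adds the goodness-of-fit test.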
Fast Binary Embedding via Circulant Downsampled Matrix – A Data-Independent Approach
Title | Fast Binary Embedding via Circulant Downsampled Matrix – A Data-Independent Approach |
Authors | Sung-Hsien Hsieh, Chun-Shien Lu, Soo-Chang Pei |
Abstract | Binary embedding of high-dimensional data aims to produce low-dimensional binary codes while preserving discriminative power. State-of-the-art methods often suffer from high computation and storage costs. We present a simple and fast embedding scheme that first downsamples N-dimensional data to M dimensions and then multiplies the data by an M×M circulant matrix. Our method requires O(N + M log M) computation and O(N) storage. We prove that if the data are sparse, our scheme preserves similarity well. Experiments further demonstrate that although our method is cost-effective and fast, it achieves comparable performance in image applications. |
Tasks | |
Published | 2016-01-24 |
URL | http://arxiv.org/abs/1601.06342v1 |
http://arxiv.org/pdf/1601.06342v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-binary-embedding-via-circulant |
Repo | |
Framework | |
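The pipeline, downsample then multiply by a circulant matrix, can be sketched with numpy using the FFT identity that multiplying by the circulant matrix generated by `c` equals circular convolution with `c`, which gives the O(M log M) term in the cost. The stride-based downsampling here is an illustrative stand-in for the paper's scheme:

```python
import numpy as np

def circulant_binary_embed(x, c, m):
    """Binary code: downsample x to m dims, multiply by the circulant
    matrix generated by c (via FFT, O(m log m)), then take signs."""
    n = len(x)
    xd = x[:: n // m][:m]                                   # naive downsampling
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(xd)).real    # circulant multiply
    return (y > 0).astype(np.uint8)

rng = np.random.default_rng(0)
n, m = 1024, 64
c = rng.standard_normal(m)                  # fixed random circulant generator
x = rng.standard_normal(n)
near = x + 0.05 * rng.standard_normal(n)    # slightly perturbed neighbour
far = rng.standard_normal(n)                # unrelated point

code = circulant_binary_embed(x, c, m)
h_near = int(np.sum(code != circulant_binary_embed(near, c, m)))
h_far = int(np.sum(code != circulant_binary_embed(far, c, m)))
```

Nearby inputs should land at small Hamming distance while unrelated inputs differ in roughly half of the 64 bits, which is the similarity-preserving property the paper analyzes.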
Integrating Atlas and Graph Cut Methods for LV Segmentation from Cardiac Cine MRI
Title | Integrating Atlas and Graph Cut Methods for LV Segmentation from Cardiac Cine MRI |
Authors | Shusil Dangi, Nathan Cahill, Cristian A. Linte |
Abstract | Magnetic Resonance Imaging (MRI) has evolved as a clinical standard-of-care imaging modality for cardiac morphology, function assessment, and guidance of cardiac interventions. All these applications rely on accurate extraction of the myocardial tissue and blood pool from the imaging data. Here we propose a framework for left ventricle (LV) segmentation from cardiac cine-MRI. First, we segment the LV blood pool using iterative graph cuts, and subsequently use this information to segment the myocardium. We formulate the segmentation procedure as an energy minimization problem in a graph subject to the shape prior obtained by label propagation from an average atlas using affine registration. The proposed framework has been validated on 30 patient cardiac cine-MRI datasets available through the STACOM LV segmentation challenge and yielded fast, robust, and accurate segmentation results. |
Tasks | |
Published | 2016-11-03 |
URL | http://arxiv.org/abs/1611.01195v1 |
http://arxiv.org/pdf/1611.01195v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-atlas-and-graph-cut-methods-for |
Repo | |
Framework | |
Deep Edge Guided Recurrent Residual Learning for Image Super-Resolution
Title | Deep Edge Guided Recurrent Residual Learning for Image Super-Resolution |
Authors | Wenhan Yang, Jiashi Feng, Jianchao Yang, Fang Zhao, Jiaying Liu, Zongming Guo, Shuicheng Yan |
Abstract | In this work, we consider the image super-resolution (SR) problem. The main challenge of image SR is to recover high-frequency details of a low-resolution (LR) image that are important for human perception. To address this essentially ill-posed problem, we introduce a Deep Edge Guided REcurrent rEsidual (DEGREE) network to progressively recover the high-frequency details. Different from most existing methods that aim at predicting high-resolution (HR) images directly, DEGREE investigates an alternative route to recover the difference between a pair of LR and HR images by recurrent residual learning. DEGREE further augments the SR process with edge-preserving capability, namely the LR image and its edge map can jointly infer the sharp edge details of the HR image during the recurrent recovery process. To speed up its training convergence rate, by-pass connections across multiple layers of DEGREE are constructed. In addition, we offer an understanding of DEGREE from the viewpoint of sub-band frequency decomposition of the image signal and experimentally demonstrate how DEGREE can recover different frequency bands separately. Extensive experiments on three benchmark datasets clearly demonstrate the superiority of DEGREE over well-established baselines, and DEGREE also establishes new state-of-the-art results on these datasets. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2016-04-29 |
URL | http://arxiv.org/abs/1604.08671v2 |
http://arxiv.org/pdf/1604.08671v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-edge-guided-recurrent-residual-learning |
Repo | |
Framework | |
Task scheduling system for UAV operations in indoor environment
Title | Task scheduling system for UAV operations in indoor environment |
Authors | Yohanes Khosiawan, Young Soo Park, Ilkyeong Moon, Janardhanan Mukund Nilakantan, Izabela Nielsen |
Abstract | Applications of UAVs in indoor environments are emerging due to advancements in technology. UAVs bring more spatial flexibility to occupied or hard-to-access indoor environments, e.g., the shop floor of a manufacturing plant, a greenhouse, or a nuclear power plant. UAVs help create an autonomous manufacturing system by executing tasks with less human intervention in a time-efficient manner. Consequently, a scheduler is an essential component to focus on; yet the number of reported studies on UAV scheduling has been minimal. This work proposes a methodology with a heuristic (based on the Earliest Available Time algorithm) that assigns tasks to UAVs with the objective of minimizing the makespan. In addition, a quick response to uncertain events and quick creation of a new high-quality feasible schedule are needed. Hence, the proposed heuristic is combined with a Particle Swarm Optimization (PSO) algorithm to find a near-optimal schedule quickly. The proposed methodology is implemented in a scheduler and tested on datasets of several scales generated from a real flight demonstration. The performance evaluation of the scheduler is discussed in detail, and the best solution obtained from a selected set of parameters is reported. |
Tasks | |
Published | 2016-04-21 |
URL | http://arxiv.org/abs/1604.06223v1 |
http://arxiv.org/pdf/1604.06223v1.pdf | |
PWC | https://paperswithcode.com/paper/task-scheduling-system-for-uav-operations-in |
Repo | |
Framework | |
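The base heuristic is straightforward to sketch: keep the UAVs in a priority queue ordered by the time each next becomes free, and hand each task, in the given order, to the earliest-available UAV (the paper then layers PSO on top to search over task orderings; precedence and battery constraints are omitted in this sketch).

```python
import heapq

def eat_schedule(durations, n_uavs):
    """Greedy Earliest-Available-Time assignment of tasks to UAVs.

    durations: processing time of each task, in the order they are assigned.
    Returns the (uav, start, finish) triples and the resulting makespan.
    """
    avail = [(0.0, u) for u in range(n_uavs)]   # (next free time, uav id)
    heapq.heapify(avail)
    assignment = []
    for d in durations:
        t, u = heapq.heappop(avail)             # UAV that is free the earliest
        assignment.append((u, t, t + d))
        heapq.heappush(avail, (t + d, u))
    makespan = max(f for _, _, f in assignment)
    return assignment, makespan

# toy instance: five tasks, two UAVs
assignment, makespan = eat_schedule([4, 3, 2, 2, 1], n_uavs=2)
```

For the toy instance the greedy rule balances the load perfectly (makespan 6 against a total workload of 12 on two UAVs); in general the task *order* matters, which is the degree of freedom the PSO layer optimizes.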
Learning Privately from Multiparty Data
Title | Learning Privately from Multiparty Data |
Authors | Jihun Hamm, Paul Cao, Mikhail Belkin |
Abstract | Learning a classifier from private data collected by multiple parties is an important problem that has many potential applications. How can we build an accurate and differentially private global classifier by combining locally-trained classifiers from different parties, without access to any party’s private data? We propose to transfer the ‘knowledge’ of the local classifier ensemble by first creating labeled data from auxiliary unlabeled data, and then training a global $\epsilon$-differentially private classifier. We show that majority voting is too sensitive and therefore propose a new risk weighted by class probabilities estimated from the ensemble. Relative to a non-private solution, our private solution has a generalization error bounded by $O(\epsilon^{-2}M^{-2})$ where $M$ is the number of parties. This allows strong privacy without performance loss when $M$ is large, such as in crowdsensing applications. We demonstrate the performance of our method with realistic tasks of activity recognition, network intrusion detection, and malicious URL detection. |
Tasks | Activity Recognition, Intrusion Detection, Network Intrusion Detection |
Published | 2016-02-10 |
URL | http://arxiv.org/abs/1602.03552v1 |
http://arxiv.org/pdf/1602.03552v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-privately-from-multiparty-data |
Repo | |
Framework | |
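The aggregation step can be sketched as follows: average the parties' class-probability vectors (a simplified stand-in for the paper's risk weighted by class probabilities) and add Laplace noise calibrated to the 2/M L1-sensitivity of the mean, so that releasing the resulting label is ε-differentially private with respect to any single party. The noise shrinks as M grows, mirroring the O(ε⁻²M⁻²) behaviour in the abstract.

```python
import numpy as np

def private_soft_vote(local_probs, eps, rng):
    """Privatised label from M parties' class-probability vectors.

    local_probs: (M, K) array, one probability row per party. Replacing one
    party changes the mean by at most 2/M in L1 norm, so Laplace noise with
    scale 2/(M*eps) makes the released label eps-differentially private.
    """
    m, k = local_probs.shape
    mean = local_probs.mean(axis=0)
    noisy = mean + rng.laplace(0.0, 2.0 / (m * eps), size=k)
    return int(np.argmax(noisy))

rng = np.random.default_rng(0)
m, k = 200, 3
# synthetic ensemble: the parties mostly agree that class 1 is correct
probs = np.tile([0.2, 0.7, 0.1], (m, 1)) + rng.uniform(-0.05, 0.05, (m, k))
label = private_soft_vote(probs, eps=1.0, rng=rng)
```

With 200 parties the noise scale is only 0.01, so the consensus label survives privatisation; a hard majority vote would need far more noise because a single party can flip it.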
Asynchronous Stochastic Gradient MCMC with Elastic Coupling
Title | Asynchronous Stochastic Gradient MCMC with Elastic Coupling |
Authors | Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, Frank Hutter |
Abstract | We consider parallel asynchronous Markov Chain Monte Carlo (MCMC) sampling for problems where we can leverage (stochastic) gradients to define continuous dynamics which explore the target distribution. We outline a solution strategy for this setting based on stochastic gradient Hamiltonian Monte Carlo sampling (SGHMC) which we alter to include an elastic coupling term that ties together multiple MCMC instances. The proposed strategy turns inherently sequential HMC algorithms into asynchronous parallel versions. First experiments empirically show that the resulting parallel sampler significantly speeds up exploration of the target distribution, when compared to standard SGHMC, and is less prone to the harmful effects of stale gradients than a naive parallelization approach. |
Tasks | |
Published | 2016-12-02 |
URL | http://arxiv.org/abs/1612.00767v2 |
http://arxiv.org/pdf/1612.00767v2.pdf | |
PWC | https://paperswithcode.com/paper/asynchronous-stochastic-gradient-mcmc-with |
Repo | |
Framework | |
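A minimal version of the coupled dynamics: several SGHMC-style chains share an elastic term that pulls each one toward the chains' current mean. The sketch samples a standard normal target with exact gradients instead of stochastic ones, and all constants are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_chains, eta, alpha, rho = 4, 0.05, 0.1, 0.1
x = rng.standard_normal(n_chains)     # one position per parallel chain
v = np.zeros(n_chains)                # momenta
samples = []

def grad_U(x):
    """Gradient of U(x) = x^2 / 2, i.e. a standard normal target."""
    return x

for t in range(20000):
    center = x.mean()                           # shared "elastic" centre
    force = grad_U(x) + rho * (x - center)      # coupling pulls chains together
    v = v - eta * force - alpha * v + rng.normal(0.0, np.sqrt(2 * alpha * eta), n_chains)
    x = x + v
    if t > 2000:                                # discard burn-in
        samples.append(x.copy())

pooled = np.concatenate(samples)
mean, var = pooled.mean(), pooled.var()
```

In the asynchronous setting each worker would read a possibly stale `center` instead of the exact mean; the elastic term is what keeps the instances from drifting apart under that staleness.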
Uncovering the Dynamics of Crowdlearning and the Value of Knowledge
Title | Uncovering the Dynamics of Crowdlearning and the Value of Knowledge |
Authors | Utkarsh Upadhyay, Isabel Valera, Manuel Gomez-Rodriguez |
Abstract | Learning from the crowd has become increasingly popular in the Web and social media. There is a wide variety of crowdlearning sites in which, on the one hand, users learn from the knowledge that other users contribute to the site, and, on the other hand, knowledge is reviewed and curated by the same users using assessment measures such as upvotes or likes. In this paper, we present a probabilistic modeling framework of crowdlearning, which uncovers the evolution of a user’s expertise over time by leveraging other users’ assessments of her contributions. The model allows for both off-site and on-site learning and captures forgetting of knowledge. We then develop a scalable estimation method to fit the model parameters from millions of recorded learning and contributing events. We show the effectiveness of our model by tracing the activity of ~25 thousand users in Stack Overflow over a 4.5 year period. We find that answers with high knowledge value are rare. Newbies and experts tend to acquire less knowledge than users in the middle range. Prolific learners tend to also be proficient contributors who post answers with high knowledge value. |
Tasks | |
Published | 2016-12-14 |
URL | http://arxiv.org/abs/1612.04831v1 |
http://arxiv.org/pdf/1612.04831v1.pdf | |
PWC | https://paperswithcode.com/paper/uncovering-the-dynamics-of-crowdlearning-and |
Repo | |
Framework | |