Paper Group ANR 632
Robust and Sparse Regression in GLM by Stochastic Optimization
Title | Robust and Sparse Regression in GLM by Stochastic Optimization |
Authors | Takayuki Kawashima, Hironori Fujisawa |
Abstract | The generalized linear model (GLM) plays a key role in regression analyses. For high-dimensional data, the sparse GLM has been used, but it is not robust against outliers. Recently, robust methods have been proposed for specific examples of the sparse GLM. Among them, we focus on robust and sparse linear regression based on the $\gamma$-divergence. The estimator based on the $\gamma$-divergence has strong robustness under heavy contamination. In this paper, we extend robust and sparse linear regression based on the $\gamma$-divergence to the robust and sparse GLM, using a stochastic optimization approach to obtain the estimate. We adopt randomized stochastic projected gradient descent as the stochastic optimization approach and extend the established convergence property to the classical first-order necessary condition. By virtue of the stochastic optimization approach, we can efficiently estimate parameters for very large problems. In particular, we present linear regression, logistic regression, and Poisson regression with $L_1$ regularization in detail as specific examples of the robust and sparse GLM. In numerical experiments and real data analysis, the proposed method outperformed comparative methods. |
Tasks | Stochastic Optimization |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.03127v1 |
http://arxiv.org/pdf/1802.03127v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-sparse-regression-in-glm-by |
Repo | |
Framework | |
Low-Dose CT via Deep CNN with Skip Connection and Network in Network
Title | Low-Dose CT via Deep CNN with Skip Connection and Network in Network |
Authors | Chenyu You, Linfeng Yang, Yi Zhang, Ge Wang |
Abstract | A major challenge in computed tomography (CT) is how to minimize patient radiation exposure without compromising image quality and diagnostic performance. The use of deep convolutional (Conv) neural networks for noise reduction in Low-Dose CT (LDCT) images has recently shown great potential in this important application. In this paper, we present a highly efficient and effective neural network model for LDCT image noise reduction. Specifically, to capture local anatomical features, we integrate deep convolutional neural networks (CNNs) and skip connection layers for feature extraction. Also, we introduce a parallelized $1\times 1$ CNN, called Network in Network, to lower the dimensionality of the output from the previous layer, achieving faster computation with less feature loss. To optimize the performance of the network, we adopt a Wasserstein generative adversarial network (WGAN) framework. Quantitative and qualitative comparisons demonstrate that our proposed network model can produce images with lower noise and more structural details than state-of-the-art noise-reduction methods. |
Tasks | Computed Tomography (CT) |
Published | 2018-11-26 |
URL | https://arxiv.org/abs/1811.10564v2 |
https://arxiv.org/pdf/1811.10564v2.pdf | |
PWC | https://paperswithcode.com/paper/low-dose-ct-via-deep-cnn-with-skip-connection |
Repo | |
Framework | |
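The dimensionality-reduction role of the $1\times 1$ convolution mentioned in the abstract can be illustrated with a minimal sketch. A 1×1 convolution applies the same linear map across channels at every pixel, so it can shrink the channel count while leaving the spatial size unchanged. The function name `conv1x1`, the toy shapes, and the weights below are illustrative assumptions, not the authors' network.

```python
def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution (no bias, no activation).

    feature_map: H rows, each a list of W pixels, each a list of C_in floats.
    weights: C_out rows, each a list of C_in floats.
    Returns an H x W x C_out nested list.
    """
    return [
        [
            [sum(w * x for w, x in zip(w_row, pixel)) for w_row in weights]
            for pixel in row
        ]
        for row in feature_map
    ]


# Toy input: 2 x 2 pixels with 4 channels each (hypothetical values).
fmap = [
    [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]],
    [[2.0, 2.0, 2.0, 2.0], [1.0, 0.0, 0.0, 0.0]],
]
# Hypothetical weights mapping 4 input channels to 2 output channels.
w = [
    [0.25, 0.25, 0.25, 0.25],  # average of the four channels
    [1.0, 0.0, 0.0, -1.0],     # first channel minus last channel
]
reduced = conv1x1(fmap, w)  # still 2 x 2 pixels, now 2 channels per pixel
```

The spatial grid is untouched; only the per-pixel channel vector is compressed, which is why later layers become cheaper to evaluate.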
Nonparametric Density Estimation under Adversarial Losses
Title | Nonparametric Density Estimation under Adversarial Losses |
Authors | Shashank Singh, Ananya Uppal, Boyue Li, Chun-Liang Li, Manzil Zaheer, Barnabás Póczos |
Abstract | We study minimax convergence rates of nonparametric density estimation under a large class of loss functions called “adversarial losses”, which, besides classical $\mathcal{L}^p$ losses, includes maximum mean discrepancy (MMD), Wasserstein distance, and total variation distance. These losses are closely related to the losses encoded by discriminator networks in generative adversarial networks (GANs). In a general framework, we study how the choice of loss and the assumed smoothness of the underlying density together determine the minimax rate. We also discuss implications for training GANs based on deep ReLU networks, and more general connections to learning implicit generative models in a minimax statistical sense. |
Tasks | Density Estimation |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08836v2 |
http://arxiv.org/pdf/1805.08836v2.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-density-estimation-under |
Repo | |
Framework | |
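As a concrete instance of one of the adversarial losses named in the abstract, the sketch below estimates the squared maximum mean discrepancy (MMD) between two one-dimensional samples. The Gaussian kernel, the bandwidth, the biased (V-statistic) estimator, and the sample values are assumptions chosen for illustration; the paper's analysis covers a much broader class of losses.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two 1-D samples."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


sample_a = [-1.0, -0.5, 0.0, 0.5, 1.0]   # hypothetical sample
sample_b = [v + 3.0 for v in sample_a]   # same shape, shifted by 3
same = mmd_squared(sample_a, sample_a)
shifted = mmd_squared(sample_a, sample_b)
```

Identical samples give an estimate of zero, while the shifted sample gives a strictly positive value, which is the behaviour a discriminator-style loss is meant to capture.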
contextual: Evaluating Contextual Multi-Armed Bandit Problems in R
Title | contextual: Evaluating Contextual Multi-Armed Bandit Problems in R |
Authors | Robin van Emden, Maurits Kaptein |
Abstract | Over the past decade, contextual bandit algorithms have been gaining in popularity due to their effectiveness and flexibility in solving sequential decision problems, from online advertising and finance to clinical trial design and personalized medicine. At the same time, there are, as of yet, surprisingly few options that enable researchers and practitioners to simulate and compare the wealth of new and existing bandit algorithms in a standardized way. To help close this gap between analytical research and empirical evaluation, the current paper introduces the object-oriented R package “contextual”: a user-friendly and, through its object-oriented structure, easily extensible framework that facilitates parallelized comparison of contextual and context-free bandit policies through both simulation and offline analysis. |
Tasks | |
Published | 2018-11-06 |
URL | https://arxiv.org/abs/1811.01926v4 |
https://arxiv.org/pdf/1811.01926v4.pdf | |
PWC | https://paperswithcode.com/paper/contextual-evaluating-contextual-multi-armed |
Repo | |
Framework | |
Supervising Nyström Methods via Negative Margin Support Vector Selection
Title | Supervising Nyström Methods via Negative Margin Support Vector Selection |
Authors | Mert Al, Thee Chanyaswad, Sun-Yuan Kung |
Abstract | The Nyström methods have been popular techniques for scalable kernel based learning. They approximate explicit, low-dimensional feature mappings for kernel functions from the pairwise comparisons with the training data. However, Nyström methods are generally applied without the supervision provided by the training labels in the classification/regression problems. This leads to pairwise comparisons with randomly chosen training samples in the model. Conversely, this work studies a supervised Nyström method that chooses the critical subsets of samples for the success of the Machine Learning model. Particularly, we select the Nyström support vectors via the negative margin criterion, and create explicit feature maps that are more suitable for the classification task on the data. Experimental results on six datasets show that, without increasing the complexity over unsupervised techniques, our method can significantly improve the classification performance achieved via kernel approximation methods and reduce the number of features needed to reach or exceed the performance of the full-dimensional kernel machines. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.04018v2 |
http://arxiv.org/pdf/1805.04018v2.pdf | |
PWC | https://paperswithcode.com/paper/supervising-nystrom-methods-via-negative |
Repo | |
Framework | |
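The basic (unsupervised) Nyström approximation that the abstract builds on can be sketched as follows: a kernel value k(a, b) is approximated as c_a^T W^{-1} c_b, where W holds kernel values among a few landmark points and c_a, c_b hold comparisons of a and b against those landmarks. The RBF kernel, the helper names, the toy points, and the arbitrary landmark choice are all assumptions; the paper's negative-margin selection of landmarks is not implemented here.

```python
import math


def rbf(a, b, gamma=0.5):
    """RBF kernel between two points given as tuples."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def nystrom_kernel(a, b, landmarks, gamma=0.5):
    """Approximate k(a, b) using only comparisons against the landmarks."""
    w = [[rbf(p, q, gamma) for q in landmarks] for p in landmarks]
    c_a = [rbf(a, m, gamma) for m in landmarks]
    c_b = [rbf(b, m, gamma) for m in landmarks]
    return sum(u * v for u, v in zip(c_a, solve(w, c_b)))


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
landmarks = points[:3]  # arbitrary choice, purely for illustration
approx = nystrom_kernel(points[3], points[4], landmarks)
exact = rbf(points[3], points[4])
```

When one of the two points is itself a landmark the approximation is exact up to rounding; for other pairs its quality depends on which landmarks are chosen, which is the degree of freedom a supervised selection rule targets.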
Counterfactual Critic Multi-Agent Training for Scene Graph Generation
Title | Counterfactual Critic Multi-Agent Training for Scene Graph Generation |
Authors | Long Chen, Hanwang Zhang, Jun Xiao, Xiangnan He, Shiliang Pu, Shih-Fu Chang |
Abstract | Scene graphs – objects as nodes and visual relationships as edges – describe the whereabouts and interactions of the things and stuff in an image for comprehensive scene understanding. To generate coherent scene graphs, almost all existing methods exploit the fruitful visual context by modeling message passing among objects, fitting the dynamic nature of reasoning with visual context, e.g., “person” on “bike” can help to determine the relationship “ride”, which in turn contributes to the category confidence of the two objects. However, we argue that the scene dynamics are not properly learned by using the prevailing cross-entropy based supervised learning paradigm, which is not sensitive to graph inconsistency: errors at the hub or non-hub nodes are unfortunately penalized equally. To this end, we propose a Counterfactual critic Multi-Agent Training (CMAT) approach to resolve the mismatch. CMAT is a multi-agent policy gradient method that frames objects as cooperative agents, and then directly maximizes a graph-level metric as the reward. In particular, to assign the reward properly to each agent, CMAT uses a counterfactual baseline that disentangles the agent-specific reward by fixing the dynamics of other agents. Extensive validations on the challenging Visual Genome benchmark show that CMAT achieves state-of-the-art results, with significant performance gains under various settings and metrics. |
Tasks | Graph Generation, Scene Graph Generation, Scene Understanding |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02347v3 |
https://arxiv.org/pdf/1812.02347v3.pdf | |
PWC | https://paperswithcode.com/paper/scene-dynamics-counterfactual-critic-multi |
Repo | |
Framework | |
Iterative Learning with Open-set Noisy Labels
Title | Iterative Learning with Open-set Noisy Labels |
Authors | Yisen Wang, Weiyang Liu, Xingjun Ma, James Bailey, Hongyuan Zha, Le Song, Shu-Tao Xia |
Abstract | Large-scale datasets possessing clean label annotations are crucial for training Convolutional Neural Networks (CNNs). However, labeling large-scale data can be very costly and error-prone, and even high-quality datasets are likely to contain noisy (incorrect) labels. Existing works usually employ a closed-set assumption, whereby the samples associated with noisy labels possess a true class contained within the set of known classes in the training data. However, such an assumption is too restrictive for many applications, since samples associated with noisy labels might in fact possess a true class that is not present in the training data. We refer to this more complex scenario as the open-set noisy label problem and show that making accurate predictions in this setting is nontrivial. To address this problem, we propose a novel iterative learning framework for training CNNs on datasets with open-set noisy labels. Our approach detects noisy labels and learns deep discriminative features in an iterative fashion. To benefit from the noisy label detection, we design a Siamese network to encourage clean labels and noisy labels to be dissimilar. A reweighting module is also applied to simultaneously emphasize the learning from clean labels and reduce the effect caused by noisy labels. Experiments on CIFAR-10, ImageNet and real-world noisy (web-search) datasets demonstrate that our proposed model can robustly train CNNs in the presence of a high proportion of open-set as well as closed-set noisy labels. |
Tasks | |
Published | 2018-03-31 |
URL | http://arxiv.org/abs/1804.00092v1 |
http://arxiv.org/pdf/1804.00092v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-learning-with-open-set-noisy-labels |
Repo | |
Framework | |
Scene Graph Generation via Conditional Random Fields
Title | Scene Graph Generation via Conditional Random Fields |
Authors | Weilin Cong, William Wang, Wang-Chien Lee |
Abstract | Despite the great success that object detection and segmentation models have achieved in recognizing individual objects in images, performance on cognitive tasks such as image captioning, semantic image retrieval, and visual QA is far from satisfactory. To achieve better performance on these cognitive tasks, merely recognizing individual object instances is insufficient. Instead, the interactions between object instances need to be captured in order to facilitate reasoning and understanding of the visual scenes in an image. A scene graph, a graph representation of an image that captures object instances and their relationships, offers a comprehensive understanding of an image. However, existing techniques for scene graph generation fail to distinguish subjects and objects in the visual scenes of images and thus do not perform well on real-world datasets where ambiguous object instances exist. In this work, we propose a novel scene graph generation model for predicting object instances and their corresponding relationships in an image. Our model, SG-CRF, efficiently learns the sequential order of subject and object in a relationship triplet, and the semantic compatibility of object instance nodes and relationship nodes in a scene graph. Experiments empirically show that SG-CRF outperforms the state-of-the-art methods on three different datasets, i.e., CLEVR, VRD, and Visual Genome, raising the Recall@100 from 24.99% to 49.95%, from 41.92% to 50.47%, and from 54.69% to 54.77%, respectively. |
Tasks | Graph Generation, Image Retrieval, Object Detection, Scene Graph Generation |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08075v1 |
http://arxiv.org/pdf/1811.08075v1.pdf | |
PWC | https://paperswithcode.com/paper/scene-graph-generation-via-conditional-random |
Repo | |
Framework | |
LinkNet: Relational Embedding for Scene Graph
Title | LinkNet: Relational Embedding for Scene Graph |
Authors | Sanghyun Woo, Dahun Kim, Donghyeon Cho, In So Kweon |
Abstract | Objects and their relationships are critical contents for image understanding. A scene graph provides a structured description that captures these properties of an image. However, reasoning about the relationships between objects is very challenging and only a few recent works have attempted to solve the problem of generating a scene graph from an image. In this paper, we present a method that improves scene graph generation by explicitly modeling inter-dependency among all object instances. We design a simple and effective relational embedding module that enables our model to jointly represent connections among all related objects, rather than focus on an object in isolation. Our method significantly benefits the main part of the scene graph generation task: relationship classification. Using it on top of a basic Faster R-CNN, our model achieves state-of-the-art results on the Visual Genome benchmark. We further push the performance by introducing a global context encoding module and a geometrical layout encoding module. We validate our final model, LinkNet, through extensive ablation studies, demonstrating its efficacy in scene graph generation. |
Tasks | Graph Generation, Scene Graph Generation |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06410v1 |
http://arxiv.org/pdf/1811.06410v1.pdf | |
PWC | https://paperswithcode.com/paper/linknet-relational-embedding-for-scene-graph |
Repo | |
Framework | |
Constant-Time Predictive Distributions for Gaussian Processes
Title | Constant-Time Predictive Distributions for Gaussian Processes |
Authors | Geoff Pleiss, Jacob R. Gardner, Kilian Q. Weinberger, Andrew Gordon Wilson |
Abstract | One of the most compelling features of Gaussian process (GP) regression is its ability to provide well-calibrated posterior distributions. Recent advances in inducing point methods have sped up GP marginal likelihood and posterior mean computations, leaving posterior covariance estimation and sampling as the remaining computational bottlenecks. In this paper we address these shortcomings by using the Lanczos algorithm to rapidly approximate the predictive covariance matrix. Our approach, which we refer to as LOVE (LanczOs Variance Estimates), substantially improves time and space complexity. In our experiments, LOVE computes covariances up to 2,000 times faster and draws samples 18,000 times faster than existing methods, all without sacrificing accuracy. |
Tasks | Gaussian Processes |
Published | 2018-03-16 |
URL | http://arxiv.org/abs/1803.06058v4 |
http://arxiv.org/pdf/1803.06058v4.pdf | |
PWC | https://paperswithcode.com/paper/constant-time-predictive-distributions-for |
Repo | |
Framework | |
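The Lanczos algorithm that the abstract relies on can be sketched in a few lines. Given a symmetric matrix A (accessed only through matrix-vector products), it builds an orthonormal basis Q and a small tridiagonal matrix T with Q^T A Q = T. This is plain Lanczos tridiagonalization with full reorthogonalization on a toy 3×3 matrix; how LOVE turns the decomposition into predictive covariances and samples is not shown, and the function name and test matrix are assumptions.

```python
import math


def lanczos(matvec, start, num_steps):
    """Run Lanczos iterations; return (alphas, betas, basis).

    alphas is the diagonal and betas the off-diagonal of the tridiagonal T;
    basis holds the orthonormal Lanczos vectors (the columns of Q).
    """
    norm = math.sqrt(sum(v * v for v in start))
    q = [v / norm for v in start]
    q_prev = [0.0] * len(start)
    beta = 0.0
    alphas, betas, basis = [], [], []
    for _ in range(num_steps):
        basis.append(q)
        w = matvec(q)
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        # Full reorthogonalization keeps the basis numerically orthonormal.
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        alphas.append(alpha)
        beta = math.sqrt(sum(v * v for v in w))
        if beta < 1e-12:  # Krylov space exhausted
            break
        betas.append(beta)
        q_prev, q = q, [v / beta for v in w]
    return alphas, betas[:len(alphas) - 1], basis


A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]  # toy symmetric matrix


def matvec(v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


alphas, betas, basis = lanczos(matvec, [1.0, 1.0, 1.0], 3)
```

Because T is similar to A when the number of steps equals the matrix size, the trace and Frobenius norm of A are preserved in `alphas` and `betas`. In practice far fewer steps than the matrix size are used, which is the source of the speed-up.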
Image-Level Attentional Context Modeling Using Nested-Graph Neural Networks
Title | Image-Level Attentional Context Modeling Using Nested-Graph Neural Networks |
Authors | Guillaume Jaume, Behzad Bozorgtabar, Hazim Kemal Ekenel, Jean-Philippe Thiran, Maria Gabrani |
Abstract | We introduce a new scene graph generation method called image-level attentional context modeling (ILAC). Our model includes an attentional graph network that effectively propagates contextual information across the graph using image-level features. Whereas previous works use an object-centric context, we build an image-level context agent to encode the scene properties. The proposed method comprises a single-stream network that iteratively refines the scene graph with a nested graph neural network. We demonstrate that our approach achieves competitive performance with the state-of-the-art for scene graph generation on the Visual Genome dataset, while requiring fewer parameters than other methods. We also show that ILAC can improve regular object detectors by incorporating relational image-level information. |
Tasks | Graph Generation, Scene Graph Generation |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03830v2 |
http://arxiv.org/pdf/1811.03830v2.pdf | |
PWC | https://paperswithcode.com/paper/image-level-attentional-context-modeling |
Repo | |
Framework | |
Deep learning long-range information in undirected graphs with wave networks
Title | Deep learning long-range information in undirected graphs with wave networks |
Authors | Matthew K. Matlock, Arghya Datta, Na Le Dang, Kevin Jiang, S. Joshua Swamidass |
Abstract | Graph algorithms are key tools in many fields of science and technology. Some of these algorithms depend on propagating information between distant nodes in a graph. Recently, there have been a number of deep learning architectures proposed to learn on undirected graphs. However, most of these architectures aggregate information in the local neighborhood of a node, and therefore they may not be capable of efficiently propagating long-range information. To solve this problem we examine a recently proposed architecture, wave, which propagates information back and forth across an undirected graph in waves of nonlinear computation. We compare wave to graph convolution, an architecture based on local aggregation, and find that wave learns three different graph-based tasks with greater efficiency and accuracy. These three tasks include (1) labeling a path connecting two nodes in a graph, (2) solving a maze presented as an image, and (3) computing voltages in a circuit. These tasks range from trivial to very difficult, but wave can extrapolate from small training examples to much larger testing examples. These results show that wave may be able to efficiently solve a wide range of problems that require long-range information propagation across undirected graphs. An implementation of the wave network, and example code for the maze problem are included in the tflon deep learning toolkit (https://bitbucket.org/mkmatlock/tflon). |
Tasks | |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12153v1 |
http://arxiv.org/pdf/1810.12153v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-long-range-information-in |
Repo | |
Framework | |
Fast Convergence for Object Detection by Learning how to Combine Error Functions
Title | Fast Convergence for Object Detection by Learning how to Combine Error Functions |
Authors | Benjamin Schnieders, Karl Tuyls |
Abstract | In this paper, we introduce an innovative method to improve the convergence speed and accuracy of object detection neural networks. Our approach, CONVERGE-FAST-AUXNET, is based on employing multiple, dependent loss metrics and weighting them optimally using an on-line trained auxiliary network. Experiments are performed in the well-known RoboCup@Work challenge environment. A fully convolutional segmentation network is trained on detecting objects’ pickup points. We empirically obtain an approximate measure for the rate of success of a robotic pickup operation based on the accuracy of the object detection network. Our experiments show that adding an optimally weighted Euclidean distance loss to a network trained on the commonly used Intersection over Union (IoU) metric reduces the convergence time by 42.48%. The estimated pickup rate is improved by 39.90%. Compared to state-of-the-art task weighting methods, the improvement is 24.5% in convergence, and 15.8% on the estimated pickup rate. |
Tasks | Object Detection |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04480v1 |
http://arxiv.org/pdf/1808.04480v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-convergence-for-object-detection-by |
Repo | |
Framework | |
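To make the two error terms named in the abstract concrete, the sketch below computes an IoU-based loss on binary masks and adds a weighted Euclidean distance between a predicted and a true pickup point. In the paper the weighting is produced by an on-line trained auxiliary network; here `weight` is a fixed, hypothetical constant, and the masks, points, and function names are illustrative assumptions.

```python
import math


def mask_iou(pred, target):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    pairs = [
        (p, t)
        for p_row, t_row in zip(pred, target)
        for p, t in zip(p_row, t_row)
    ]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return intersection / union if union else 1.0


def combined_loss(pred, target, pred_point, true_point, weight):
    """(1 - IoU) plus a weighted Euclidean distance between two 2-D points."""
    iou_loss = 1.0 - mask_iou(pred, target)
    distance = math.hypot(pred_point[0] - true_point[0],
                          pred_point[1] - true_point[1])
    return iou_loss + weight * distance


pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
iou = mask_iou(pred_mask, true_mask)
loss = combined_loss(pred_mask, true_mask, (0.0, 0.0), (3.0, 4.0), weight=0.1)
```

The distance term still provides a useful signal when the masks do not overlap and the IoU is zero, which is one plausible reason a combined loss can converge faster.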
Structure Learning from Time Series with False Discovery Control
Title | Structure Learning from Time Series with False Discovery Control |
Authors | Bernat Guillen Pegueroles, Bhanukiran Vinzamuri, Karthikeyan Shanmugam, Steve Hedden, Jonathan D. Moyer, Kush R. Varshney |
Abstract | We consider the Granger causal structure learning problem from time series data. Granger causal algorithms predict a ‘Granger causal effect’ between two variables by testing if prediction error of one decreases significantly in the absence of the other variable among the predictor covariates. Almost all existing Granger causal algorithms condition on a large number of variables (all but two variables) to test for effects between a pair of variables. We propose a new structure learning algorithm called MMPC-p inspired by the well known MMHC algorithm for non-time series data. We show that under some assumptions, the algorithm provides false discovery rate control. The algorithm is sound and complete when given access to perfect directed information testing oracles. We also outline a novel tester for the linear Gaussian case. We show through our extensive experiments that the MMPC-p algorithm scales to larger problems and has improved statistical power compared to existing state of the art for large sparse graphs. We also apply our algorithm on a global development dataset and validate our findings with subject matter experts. |
Tasks | Time Series |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09909v1 |
http://arxiv.org/pdf/1805.09909v1.pdf | |
PWC | https://paperswithcode.com/paper/structure-learning-from-time-series-with |
Repo | |
Framework | |
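The pairwise idea behind Granger causal testing, comparing the prediction error for one series with and without the other series among the predictors, can be sketched with lag-1 least-squares models. This is a toy illustration on synthetic data, not MMPC-p: a real test would use a significance test (for example an F-test), longer lags, and conditioning sets. The helper names, coefficients, and data are assumptions.

```python
import random


def rss_restricted(y):
    """Residual sum of squares of the model y_t ~ a * y_{t-1}."""
    past, now = y[:-1], y[1:]
    a = sum(p * n for p, n in zip(past, now)) / sum(p * p for p in past)
    return sum((n - a * p) ** 2 for p, n in zip(past, now))


def rss_full(y, x):
    """Residual sum of squares of the model y_t ~ a * y_{t-1} + b * x_{t-1}."""
    yp, xp, now = y[:-1], x[:-1], y[1:]
    syy = sum(v * v for v in yp)
    sxx = sum(v * v for v in xp)
    sxy = sum(u * v for u, v in zip(yp, xp))
    sy = sum(u * v for u, v in zip(yp, now))
    sx = sum(u * v for u, v in zip(xp, now))
    det = syy * sxx - sxy * sxy  # determinant of the 2x2 normal equations
    a = (sy * sxx - sx * sxy) / det
    b = (sx * syy - sy * sxy) / det
    return sum((n - a * u - b * v) ** 2 for u, v, n in zip(yp, xp, now))


# Synthetic data in which x drives y but y does not drive x.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0]
for t in range(1, 200):
    y.append(0.5 * y[-1] + 0.8 * x[t - 1] + 0.1 * rng.gauss(0.0, 1.0))

# Fraction of squared prediction error removed by adding the other series.
gain_x_to_y = 1.0 - rss_full(y, x) / rss_restricted(y)
gain_y_to_x = 1.0 - rss_full(x, y) / rss_restricted(x)
```

Adding `x` as a predictor removes most of the error in forecasting `y`, whereas adding `y` barely helps in forecasting `x`, matching the direction built into the synthetic data.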
Power Market Price Forecasting via Deep Learning
Title | Power Market Price Forecasting via Deep Learning |
Authors | Yongli Zhu, Songtao Lu, Renchang Dai, Guangyi Liu, Zhiwei Wang |
Abstract | A study on power market price forecasting by deep learning is presented. As one of the most successful deep learning frameworks, the LSTM (long short-term memory) neural network is utilized. Hourly price data from the New England and PJM day-ahead markets are used in this study. First, an LSTM network is formulated and trained. Then the raw input and output data are preprocessed by unit scaling, and the trained network is tested on the real price data under different input lengths, forecasting horizons and data sizes. Its performance is also compared with other existing methods. The forecasted results demonstrate that the LSTM deep neural network can outperform the others under different application settings in this problem. |
Tasks | |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.08092v2 |
http://arxiv.org/pdf/1809.08092v2.pdf | |
PWC | https://paperswithcode.com/paper/power-market-price-forecasting-via-deep |
Repo | |
Framework | |
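The preprocessing described in the abstract (scaling the prices, then forming examples with a given input length and forecasting horizon) might look like the sketch below. Interpreting "unit scaling" as min-max scaling to [0, 1] is an assumption, as are the function names and the price values; the LSTM itself is not shown.

```python
def min_max_scale(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(series, input_length, horizon):
    """Build (inputs, target) pairs for supervised forecasting.

    Each example uses `input_length` consecutive values to predict the value
    `horizon` steps after the end of the input window.
    """
    samples = []
    for start in range(len(series) - input_length - horizon + 1):
        window = series[start:start + input_length]
        target = series[start + input_length + horizon - 1]
        samples.append((window, target))
    return samples


hourly_prices = [20.0, 30.0, 25.0, 40.0, 35.0, 30.0]  # hypothetical prices
scaled = min_max_scale(hourly_prices)
samples = make_windows(scaled, input_length=3, horizon=1)
```

Varying `input_length` and `horizon` here corresponds to the different input lengths and forecasting horizons the study evaluates.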