Paper Group ANR 74
Contextual Bandits with Latent Confounders: An NMF Approach
Title | Contextual Bandits with Latent Confounders: An NMF Approach |
Authors | Rajat Sen, Karthikeyan Shanmugam, Murat Kocaoglu, Alexandros G. Dimakis, Sanjay Shakkottai |
Abstract | Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are $L$ observed contexts and $K$ arms of the bandit. The observed context influences the reward obtained through a latent confounder variable with cardinality $m$ ($m \ll L,K$). The arm choice and the latent confounder causally determine the reward while the observed context is correlated with the confounder. Under this model, the $L \times K$ mean reward matrix $\mathbf{U}$ (for each context in $[L]$ and each arm in $[K]$) factorizes into non-negative factors $\mathbf{A}$ ($L \times m$) and $\mathbf{W}$ ($m \times K$). This insight enables us to propose an $\epsilon$-greedy NMF-Bandit algorithm that designs a sequence of interventions (selecting specific arms) that achieves a balance between learning this low-dimensional structure and selecting the best arm to minimize regret. Our algorithm achieves a regret of $\mathcal{O}\left(L\mathrm{poly}(m, \log K) \log T \right)$ at time $T$, as compared to $\mathcal{O}(LK\log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context. These guarantees are obtained under mild sufficiency conditions on the factors that are weaker versions of the well-known Statistical RIP condition. We further propose a class of generative models that satisfy our sufficient conditions, and derive a lower bound of $\mathcal{O}\left(Km\log T\right)$. These are the first regret guarantees for online matrix completion with bandit feedback, when the rank is greater than one. We further compare the performance of our algorithm with the state of the art, on synthetic and real-world datasets. |
Tasks | Matrix Completion, Multi-Armed Bandits |
Published | 2016-06-01 |
URL | http://arxiv.org/abs/1606.00119v3 |
http://arxiv.org/pdf/1606.00119v3.pdf | |
PWC | https://paperswithcode.com/paper/contextual-bandits-with-latent-confounders-an |
Repo | |
Framework | |
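The factorization described in the abstract above can be illustrated with a toy example. This is a minimal sketch, not the NMF-Bandit algorithm itself: the matrices `A` and `W` hold made-up numbers, reading `A[l][z]` as the probability of latent state `z` given context `l` is an assumption, and `epsilon_greedy` is a hypothetical helper showing only the generic exploration rule.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# A[l][z]: assumed probability of latent state z given observed context l
# (illustrative numbers; each row sums to 1).
A = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
    [0.0, 1.0],
]
# W[z][k]: assumed mean reward of arm k when the latent state is z.
W = [
    [0.8, 0.1, 0.3],
    [0.2, 0.7, 0.4],
]

# U is the L x K mean reward matrix (here L = 4, K = 3, m = 2).
U = matmul(A, W)

# Index of the best arm for each observed context.
best_arm = [max(range(len(row)), key=row.__getitem__) for row in U]

def epsilon_greedy(mean_rewards, eps, rng):
    """Explore a uniformly random arm with probability eps, else exploit."""
    if rng.random() < eps:
        return rng.randrange(len(mean_rewards))
    return max(range(len(mean_rewards)), key=mean_rewards.__getitem__)
```

With these numbers, contexts 0 and 1 prefer arm 0 and contexts 2 and 3 prefer arm 1, even though only two latent reward profiles exist.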
Combining Fully Convolutional and Recurrent Neural Networks for 3D Biomedical Image Segmentation
Title | Combining Fully Convolutional and Recurrent Neural Networks for 3D Biomedical Image Segmentation |
Authors | Jianxu Chen, Lin Yang, Yizhe Zhang, Mark Alber, Danny Z. Chen |
Abstract | Segmentation of 3D images is a fundamental problem in biomedical image analysis. Deep learning (DL) approaches have achieved state-of-the-art segmentation performance. To exploit the 3D contexts using neural networks, known DL segmentation methods, including 3D convolution, 2D convolution on planes orthogonal to 2D image slices, and LSTM in multiple directions, all suffer incompatibility with the highly anisotropic dimensions in common 3D biomedical images. In this paper, we propose a new DL framework for 3D image segmentation, based on a combination of a fully convolutional network (FCN) and a recurrent neural network (RNN), which are responsible for exploiting the intra-slice and inter-slice contexts, respectively. To the best of our knowledge, this is the first DL framework for 3D image segmentation that explicitly leverages 3D image anisotropism. Evaluated on a dataset from the ISBI Neuronal Structure Segmentation Challenge and in-house image stacks for 3D fungus segmentation, our approach achieves promising results compared to the known DL-based 3D segmentation approaches. |
Tasks | 3D Medical Imaging Segmentation, Semantic Segmentation |
Published | 2016-09-05 |
URL | http://arxiv.org/abs/1609.01006v2 |
http://arxiv.org/pdf/1609.01006v2.pdf | |
PWC | https://paperswithcode.com/paper/combining-fully-convolutional-and-recurrent |
Repo | |
Framework | |
Automated Selection of Uniform Regions for CT Image Quality Detection
Title | Automated Selection of Uniform Regions for CT Image Quality Detection |
Authors | Maitham D Naeemi, Adam M Alessio, Sohini Roychowdhury |
Abstract | CT images are widely used in pathology detection and follow-up treatment procedures. Accurate identification of pathological features requires diagnostic quality CT images with minimal noise and artifact variation. In this work, a novel Fourier-transform based metric for image quality (IQ) estimation is presented that correlates to additive CT image noise. In the proposed method, two windowed CT image subset regions are analyzed together to identify the extent of variation in the corresponding Fourier-domain spectrum. The two square windows are chosen such that their center pixels coincide and one window is a subset of the other. The Fourier-domain spectral difference between these two sub-sampled windows is then used to isolate spatial regions-of-interest (ROI) with low signal variation (ROI-LV) and high signal variation (ROI-HV), respectively. Finally, the spatial variance ($var$), standard deviation ($std$), coefficient of variance ($cov$) and the fraction of abdominal ROI pixels in ROI-LV ($\nu'(q)$), are analyzed with respect to CT image noise. For the phantom CT images, $var$ and $std$ correlate to CT image noise ($r>0.76$ ($p\ll0.001$)), though not as well as $\nu'(q)$ ($r=0.96$ ($p\ll0.001$)). However, for the combined phantom and patient CT images, $var$ and $std$ do not correlate well with CT image noise ($r<0.46$ ($p\ll0.001$)) as compared to $\nu'(q)$ ($r=0.95$ ($p\ll0.001$)). Thus, the proposed method and the metric, $\nu'(q)$, can be useful to quantitatively estimate CT image noise. |
Tasks | |
Published | 2016-08-13 |
URL | http://arxiv.org/abs/1608.04381v2 |
http://arxiv.org/pdf/1608.04381v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-selection-of-uniform-regions-for-ct |
Repo | |
Framework | |
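The baseline statistics named in the abstract above ($var$, $std$, $cov$) can be computed for a pixel region as follows. This is a minimal sketch under stated assumptions: the region is a flat list of intensities, $cov$ is read as the usual coefficient of variation (standard deviation divided by mean), and `region_stats` is a hypothetical helper name. The Fourier-domain metric $\nu'(q)$ is not reproduced here.

```python
import statistics

def region_stats(pixels):
    """Return (var, std, cov) for a flat list of pixel intensities.

    Assumption: cov is the coefficient of variation, std / mean, which is
    only meaningful when the region mean is not close to zero.
    """
    mean = statistics.mean(pixels)
    var = statistics.pvariance(pixels, mean)  # population variance
    std = var ** 0.5
    cov = std / mean
    return var, std, cov
```

A perfectly uniform region gives zero for all three values, and any added noise raises them.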
Scalable Semi-supervised Learning with Graph-based Kernel Machine
Title | Scalable Semi-supervised Learning with Graph-based Kernel Machine |
Authors | Trung Le, Khanh Nguyen, Van Nguyen, Vu Nguyen, Dinh Phung |
Abstract | Acquiring labels is often costly, whereas unlabeled data are usually easy to obtain in modern machine learning applications. Semi-supervised learning provides a principled machine learning framework to address such situations, and has been applied successfully in many real-world applications and industries. Nonetheless, most existing semi-supervised learning methods encounter two serious limitations when applied to modern and large-scale datasets: computational burden and memory usage demand. To this end, we present in this paper the Graph-based semi-supervised Kernel Machine (GKM), a method that leverages the generalization ability of kernel-based methods with the geometrical and distributive information formulated through a spectral graph induced from data for semi-supervised learning purposes. Our proposed GKM can be solved directly in the primal form using the Stochastic Gradient Descent method with the ideal convergence rate $O(\frac{1}{T})$. Besides, our formulation is suitable for a wide spectrum of important loss functions in the literature of machine learning (e.g., Hinge, smooth Hinge, Logistic, L1, and $\epsilon$-insensitive) and smoothness functions (i.e., $l_p(t) = t^p$ with $p\ge1$). We further show that the well-known Laplacian Support Vector Machine is a special case of our formulation. We validate our proposed method on several benchmark datasets to demonstrate that GKM is appropriate for large-scale datasets since it is optimal in memory usage and yields superior classification accuracy whilst simultaneously achieving a significant computation speed-up in comparison with the state-of-the-art baselines. |
Tasks | |
Published | 2016-06-22 |
URL | http://arxiv.org/abs/1606.06793v3 |
http://arxiv.org/pdf/1606.06793v3.pdf | |
PWC | https://paperswithcode.com/paper/scalable-semi-supervised-learning-with-graph |
Repo | |
Framework | |
Solving Ridge Regression using Sketched Preconditioned SVRG
Title | Solving Ridge Regression using Sketched Preconditioned SVRG |
Authors | Alon Gonen, Francesco Orabona, Shai Shalev-Shwartz |
Abstract | We develop a novel preconditioning method for ridge regression, based on recent linear sketching methods. By equipping Stochastic Variance Reduced Gradient (SVRG) with this preconditioning process, we obtain a significant speed-up relative to fast stochastic methods such as SVRG, SDCA and SAG. |
Tasks | |
Published | 2016-02-07 |
URL | http://arxiv.org/abs/1602.02350v2 |
http://arxiv.org/pdf/1602.02350v2.pdf | |
PWC | https://paperswithcode.com/paper/solving-ridge-regression-using-sketched |
Repo | |
Framework | |
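For orientation, plain SVRG applied to ridge regression can be sketched as below. This is the unpreconditioned baseline only: the sketching-based preconditioner proposed in the paper is not reproduced, and the step size, epoch counts, and synthetic data are illustrative assumptions. The objective assumed here is $\frac{1}{n}\sum_i \frac{1}{2}(x_i^\top w - y_i)^2 + \frac{\lambda}{2}\|w\|^2$.

```python
import random

def ridge_grad_i(w, x, y, lam):
    """Gradient of 0.5*(x.w - y)^2 + 0.5*lam*||w||^2 for one sample."""
    r = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [r * xj + lam * wj for xj, wj in zip(x, w)]

def full_grad(w, X, Y, lam):
    """Average of the per-sample gradients over the whole data set."""
    n = len(X)
    g = [0.0] * len(w)
    for x, y in zip(X, Y):
        gi = ridge_grad_i(w, x, y, lam)
        g = [a + b / n for a, b in zip(g, gi)]
    return g

def svrg_ridge(X, Y, lam, step=0.05, epochs=100, inner=200, seed=0):
    """Plain SVRG: each epoch fixes a snapshot and its full gradient."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        snapshot = list(w)
        mu = full_grad(snapshot, X, Y, lam)
        for _ in range(inner):
            i = rng.randrange(len(X))
            g_w = ridge_grad_i(w, X[i], Y[i], lam)
            g_s = ridge_grad_i(snapshot, X[i], Y[i], lam)
            # Variance-reduced direction: g_i(w) - g_i(snapshot) + mu.
            w = [wj - step * (a - b + c)
                 for wj, a, b, c in zip(w, g_w, g_s, mu)]
    return w

def make_data(n=50, seed=1):
    """Synthetic regression data with illustrative true weights."""
    rng = random.Random(seed)
    true_w = [1.0, -2.0, 0.5]
    X = [[rng.uniform(-1, 1) for _ in true_w] for _ in range(n)]
    Y = [sum(a * b for a, b in zip(x, true_w)) + rng.gauss(0, 0.1) for x in X]
    return X, Y
```

Calling `svrg_ridge(*make_data(), lam=0.1)` returns a weight vector whose full gradient should be close to zero; checking `full_grad` at the result is a simple way to confirm convergence.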
Learning to Decode Linear Codes Using Deep Learning
Title | Learning to Decode Linear Codes Using Deep Learning |
Authors | Eliya Nachmani, Yair Beery, David Burshtein |
Abstract | A novel deep learning method for improving the belief propagation algorithm is proposed. The method generalizes the standard belief propagation algorithm by assigning weights to the edges of the Tanner graph. These edges are then trained using deep learning techniques. A well-known property of the belief propagation algorithm is the independence of the performance on the transmitted codeword. A crucial property of our new method is that our decoder preserves this property. Furthermore, this property allows us to learn only a single codeword instead of an exponential number of codewords. Improvements over the belief propagation algorithm are demonstrated for various high density parity check codes. |
Tasks | |
Published | 2016-07-16 |
URL | http://arxiv.org/abs/1607.04793v2 |
http://arxiv.org/pdf/1607.04793v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-decode-linear-codes-using-deep |
Repo | |
Framework | |
Statistical Inference for Model Parameters in Stochastic Gradient Descent
Title | Statistical Inference for Model Parameters in Stochastic Gradient Descent |
Authors | Xi Chen, Jason D. Lee, Xin T. Tong, Yichen Zhang |
Abstract | The stochastic gradient descent (SGD) algorithm has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency. While most existing works focus on the convergence of the objective function or the error of the obtained solution, we investigate the problem of statistical inference of true model parameters based on SGD when the population loss function is strongly convex and satisfies certain smoothness conditions. Our main contributions are two-fold. First, in the fixed dimension setup, we propose two consistent estimators of the asymptotic covariance of the average iterate from SGD: (1) a plug-in estimator, and (2) a batch-means estimator, which is computationally more efficient and only uses the iterates from SGD. Both proposed estimators allow us to construct asymptotically exact confidence intervals and hypothesis tests. Second, for high-dimensional linear regression, using a variant of the SGD algorithm, we construct a debiased estimator of each regression coefficient that is asymptotically normal. This gives a one-pass algorithm for computing both the sparse regression coefficients and confidence intervals, which is computationally attractive and applicable to online data. |
Tasks | |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08637v2 |
http://arxiv.org/pdf/1610.08637v2.pdf | |
PWC | https://paperswithcode.com/paper/statistical-inference-for-model-parameters-in |
Repo | |
Framework | |
Automatic View-Point Selection for Inter-Operative Endoscopic Surveillance
Title | Automatic View-Point Selection for Inter-Operative Endoscopic Surveillance |
Authors | Anant S. Vemuri, Stephane A. Nicolau, Jacques Marescaux, Luc Soler, Nicholas Ayache |
Abstract | Esophageal adenocarcinoma arises from Barrett’s esophagus, which is the most serious complication of gastroesophageal reflux disease. Strategies for screening involve periodic surveillance and tissue biopsies. A major challenge in such regular examinations is to record and track the disease evolution and re-localization of biopsied sites to provide targeted treatments. In this paper, we extend our original inter-operative relocalization framework to provide a constrained image based search for obtaining the best view-point match to the live view. Within this context we investigate the effect of: the choice of feature descriptors and color-space; filtering of uninformative frames and endoscopic modality, for view-point localization. Our experiments indicate an improvement in the best view-point retrieval rate to [92%,87%] from [73%,76%] (in our previous approach) for NBI and WL. |
Tasks | |
Published | 2016-10-13 |
URL | http://arxiv.org/abs/1610.04097v1 |
http://arxiv.org/pdf/1610.04097v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-view-point-selection-for-inter |
Repo | |
Framework | |
Maximizing Investment Value of Small-Scale PV in a Smart Grid Environment
Title | Maximizing Investment Value of Small-Scale PV in a Smart Grid Environment |
Authors | Jeremy Every, Li Li, Youguang G. Guo, David G. Dorrell |
Abstract | Determining the optimal size and orientation of small-scale residential PV arrays will become increasingly complex in the future smart grid environment with the introduction of smart meters and dynamic tariffs. However, consumers can leverage the availability of smart meter data to conduct a more detailed exploration of PV investment options for their particular circumstances. In this paper, an optimization method for PV orientation and sizing is proposed whereby maximizing the PV investment value is set as the defining objective. Solar insolation and PV array models are described to form the basis of the PV array optimization strategy. A constrained particle swarm optimization algorithm is selected due to its strong performance in non-linear applications. The optimization algorithm is applied to real-world metered data to quantify the possible investment value of a PV installation under different energy retailers and tariff structures. The arrangement with the highest value is determined to enable prospective small-scale PV investors to select the most cost-effective system. |
Tasks | |
Published | 2016-11-03 |
URL | http://arxiv.org/abs/1611.00890v1 |
http://arxiv.org/pdf/1611.00890v1.pdf | |
PWC | https://paperswithcode.com/paper/maximizing-investment-value-of-small-scale-pv |
Repo | |
Framework | |
Automatic segmentation of lizard spots using an active contour model
Title | Automatic segmentation of lizard spots using an active contour model |
Authors | Jhony Giraldo, Augusto Salazar |
Abstract | Animal biometrics is a challenging task. In the literature, many algorithms have been used, e.g. penguin chest recognition, elephant ear recognition and leopard stripe pattern recognition, but a lot of work remains before technology can be used to a large extent in this area of research. One important target in animal biometrics is to automate the segmentation process, so in this paper we propose a segmentation algorithm for extracting the spots of Diploglossus millepunctatus, an endangered lizard species. The automatic segmentation is achieved with a combination of preprocessing, active contours and morphology. The parameters of each stage of the segmentation algorithm are found using an optimization procedure, which is guided by the ground truth. The results show that automatic segmentation of spots is possible. On average, 78.37% correct segmentation is reached. |
Tasks | |
Published | 2016-03-02 |
URL | http://arxiv.org/abs/1603.00841v1 |
http://arxiv.org/pdf/1603.00841v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-segmentation-of-lizard-spots-using |
Repo | |
Framework | |
Median-Truncated Nonconvex Approach for Phase Retrieval with Outliers
Title | Median-Truncated Nonconvex Approach for Phase Retrieval with Outliers |
Authors | Huishuai Zhang, Yuejie Chi, Yingbin Liang |
Abstract | This paper investigates the phase retrieval problem, which aims to recover a signal from the magnitudes of its linear measurements. We develop statistically and computationally efficient algorithms for the situation when the measurements are corrupted by sparse outliers that can take arbitrary values. We propose a novel approach to robustify the gradient descent algorithm by using the sample median as a guide for pruning spurious samples in initialization and local search. Adopting the Poisson loss and the reshaped quadratic loss respectively, we obtain two algorithms termed median-TWF and median-RWF, both of which provably recover the signal from a near-optimal number of measurements when the measurement vectors are composed of i.i.d. Gaussian entries, up to a logarithmic factor, even when a constant fraction of the measurements are adversarially corrupted. We further show that both algorithms are stable in the presence of additional dense bounded noise. Our analysis is accomplished by developing non-trivial concentration results of median-related quantities, which may be of independent interest. We provide numerical experiments to demonstrate the effectiveness of our approach. |
Tasks | |
Published | 2016-03-11 |
URL | http://arxiv.org/abs/1603.03805v2 |
http://arxiv.org/pdf/1603.03805v2.pdf | |
PWC | https://paperswithcode.com/paper/median-truncated-nonconvex-approach-for-phase |
Repo | |
Framework | |
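The pruning idea in the abstract above, using the sample median as a guide for discarding spurious measurements, can be sketched on synthetic data. This is not median-TWF or median-RWF: only a one-shot pruning step is shown, and the threshold factor, outlier magnitude, signal, and helper names are illustrative assumptions.

```python
import random
import statistics

def simulate_measurements(n=400, outlier_frac=0.1, seed=0):
    """Phaseless measurements y_i = (a_i . x)^2 with sparse large outliers."""
    rng = random.Random(seed)
    x = [0.6, 0.8]  # illustrative unit-norm signal
    y, is_outlier = [], []
    for _ in range(n):
        a = [rng.gauss(0, 1) for _ in x]
        clean = sum(ai * xi for ai, xi in zip(a, x)) ** 2
        corrupted = rng.random() < outlier_frac
        # Assumed outlier model: a large additive corruption.
        y.append(clean + 1000.0 if corrupted else clean)
        is_outlier.append(corrupted)
    return y, is_outlier

def median_prune(y, factor=10.0):
    """Keep indices whose measurement is at most factor * median(y).

    The median barely moves when a minority of samples is corrupted,
    so it gives a robust scale for the cutoff; the mean does not.
    """
    threshold = factor * statistics.median(y)
    return [i for i, yi in enumerate(y) if yi <= threshold]
```

On this data the sample mean is dominated by the outliers while the median stays near the scale of the clean measurements, which is why the cutoff is tied to the median.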
Asynchronous Stochastic Gradient Descent with Variance Reduction for Non-Convex Optimization
Title | Asynchronous Stochastic Gradient Descent with Variance Reduction for Non-Convex Optimization |
Authors | Zhouyuan Huo, Heng Huang |
Abstract | We provide the first theoretical analysis on the convergence rate of the asynchronous stochastic variance reduced gradient (SVRG) descent algorithm on non-convex optimization. Recent studies have shown that the asynchronous stochastic gradient descent (SGD) based algorithms with variance reduction converge with a linear convergence rate on convex problems. However, no existing work analyzes asynchronous SGD with variance reduction on non-convex problems. In this paper, we study two asynchronous parallel implementations of SVRG: one is on a distributed memory system and the other is on a shared memory system. We provide the theoretical analysis that both algorithms can obtain a convergence rate of $O(1/T)$, and linear speed up is achievable if the number of workers is upper bounded. Versions v1, v2 and v3 have been withdrawn due to a reference issue; please refer to the newest version, v4. |
Tasks | |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03584v4 |
http://arxiv.org/pdf/1604.03584v4.pdf | |
PWC | https://paperswithcode.com/paper/asynchronous-stochastic-gradient-descent-with |
Repo | |
Framework | |
Counterfactual Prediction with Deep Instrumental Variables Networks
Title | Counterfactual Prediction with Deep Instrumental Variables Networks |
Authors | Jason Hartford, Greg Lewis, Kevin Leyton-Brown, Matt Taddy |
Abstract | We are in the middle of a remarkable rise in the use and capability of artificial intelligence. Much of this growth has been fueled by the success of deep learning architectures: models that map from observables to outputs via multiple layers of latent representations. These deep learning algorithms are effective tools for unstructured prediction, and they can be combined in AI systems to solve complex automated reasoning problems. This paper provides a recipe for combining ML algorithms to solve for causal effects in the presence of instrumental variables – sources of treatment randomization that are conditionally independent from the response. We show that a flexible IV specification resolves into two prediction tasks that can be solved with deep neural nets: a first-stage network for treatment prediction and a second-stage network whose loss function involves integration over the conditional treatment distribution. This Deep IV framework imposes some specific structure on the stochastic gradient descent routine used for training, but it is general enough that we can take advantage of off-the-shelf ML capabilities and avoid extensive algorithm customization. We outline how to obtain out-of-sample causal validation in order to avoid over-fit. We also introduce schemes for both Bayesian and frequentist inference: the former via a novel adaptation of dropout training, and the latter via a data splitting routine. |
Tasks | |
Published | 2016-12-30 |
URL | http://arxiv.org/abs/1612.09596v1 |
http://arxiv.org/pdf/1612.09596v1.pdf | |
PWC | https://paperswithcode.com/paper/counterfactual-prediction-with-deep |
Repo | |
Framework | |
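The two-stage structure described in the abstract above has a classical linear counterpart, two-stage least squares (2SLS), which is what the sketch below implements. It is a swapped-in linear illustration, not the Deep IV method: no neural networks are involved, and the data-generating process and coefficients are made-up assumptions. The instrument `z` affects the outcome `y` only through the treatment `t`, while an unobserved confounder `u` biases ordinary least squares.

```python
import random

def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def simulate(n=20000, effect=2.0, seed=0):
    """Synthetic data with an instrument z and a hidden confounder u."""
    rng = random.Random(seed)
    z, t, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)              # instrument
        ui = rng.gauss(0, 1)              # unobserved confounder
        ti = zi + ui + rng.gauss(0, 1)    # treatment
        yi = effect * ti - 3.0 * ui + rng.gauss(0, 1)  # outcome
        z.append(zi)
        t.append(ti)
        y.append(yi)
    return z, t, y

def ols_slope(t, y):
    """Naive regression of outcome on treatment (confounded)."""
    return covariance(t, y) / covariance(t, t)

def two_stage_least_squares(z, t, y):
    # Stage 1: predict the treatment from the instrument.
    beta1 = covariance(z, t) / covariance(z, z)
    mz, mt = sum(z) / len(z), sum(t) / len(t)
    t_hat = [mt + beta1 * (zi - mz) for zi in z]
    # Stage 2: regress the outcome on the predicted treatment.
    return covariance(t_hat, y) / covariance(t_hat, t_hat)
```

In this simulation the naive slope lands near 1 because of the confounder, while the two-stage estimate recovers a value near the assumed true effect of 2.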
Is Faster R-CNN Doing Well for Pedestrian Detection?
Title | Is Faster R-CNN Doing Well for Pedestrian Detection? |
Authors | Liliang Zhang, Liang Lin, Xiaodan Liang, Kaiming He |
Abstract | Pedestrian detection has arguably been addressed as a special topic beyond general object detection. Although recent deep learning object detectors such as Fast/Faster R-CNN [1, 2] have shown excellent performance for general object detection, they have limited success in detecting pedestrians, and previous leading pedestrian detectors were in general hybrid methods combining hand-crafted and deep convolutional features. In this paper, we investigate issues involving Faster R-CNN [2] for pedestrian detection. We discover that the Region Proposal Network (RPN) in Faster R-CNN indeed performs well as a stand-alone pedestrian detector, but surprisingly, the downstream classifier degrades the results. We argue that two reasons account for the unsatisfactory accuracy: (i) insufficient resolution of feature maps for handling small instances, and (ii) lack of any bootstrapping strategy for mining hard negative examples. Driven by these observations, we propose a very simple but effective baseline for pedestrian detection, using an RPN followed by boosted forests on shared, high-resolution convolutional feature maps. We comprehensively evaluate this method on several benchmarks (Caltech, INRIA, ETH, and KITTI), presenting competitive accuracy and good speed. Code will be made publicly available. |
Tasks | Object Detection, Pedestrian Detection |
Published | 2016-07-24 |
URL | http://arxiv.org/abs/1607.07032v2 |
http://arxiv.org/pdf/1607.07032v2.pdf | |
PWC | https://paperswithcode.com/paper/is-faster-r-cnn-doing-well-for-pedestrian |
Repo | |
Framework | |
Magnetic skyrmion-based synaptic devices
Title | Magnetic skyrmion-based synaptic devices |
Authors | Yangqi Huang, Wang Kang, Xichao Zhang, Yan Zhou, Weisheng Zhao |
Abstract | Magnetic skyrmions are promising candidates for next-generation information carriers, owing to their small size, topological stability, and ultralow depinning current density. A wide variety of skyrmionic device concepts and prototypes have been proposed, highlighting their potential applications. Here, we report on a bioinspired skyrmionic device with synaptic plasticity. The synaptic weight of the proposed device can be strengthened/weakened by positive/negative stimuli, mimicking the potentiation/depression process of a biological synapse. Both short-term plasticity (STP) and long-term potentiation (LTP) functionalities have been demonstrated for a spiking time-dependent plasticity (STDP) scheme. This proposal suggests new possibilities for synaptic devices for use in spiking neuromorphic computing applications. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.07955v1 |
http://arxiv.org/pdf/1608.07955v1.pdf | |
PWC | https://paperswithcode.com/paper/magnetic-skyrmion-based-synaptic-devices |
Repo | |
Framework | |