Paper Group ANR 353
Effective Classification of MicroRNA Precursors Using Combinatorial Feature Mining and AdaBoost Algorithms. Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes. A Semi-Lagrangian two-level preconditioned Newton-Krylov solver for constrained diffeomorphic image registration. Texture Synthesis Through Convolutional Neural Networks and Spectrum Constraints. Automated OCT Segmentation for Images with DME. Bayesian Adaptive Data Analysis Guarantees from Subgaussianity. Zero-Shot Visual Recognition via Bidirectional Latent Embedding. On Minimal Accuracy Algorithm Selection in Computer Vision and Intelligent Systems. A Survey of Inductive Biases for Factorial Representation-Learning. Visual Closed-Loop Control for Pouring Liquids. Noisy population recovery in polynomial time. Graph-Based Active Learning: A New Look at Expected Error Minimization. $\mathbf{D^3}$: Deep Dual-Domain Based Fast Restoration of JPEG-Compressed Images. Learning-Based View Synthesis for Light Field Cameras. Spatio-temporal Co-Occurrence Characterizations for Human Action Classification.
Effective Classification of MicroRNA Precursors Using Combinatorial Feature Mining and AdaBoost Algorithms
Title | Effective Classification of MicroRNA Precursors Using Combinatorial Feature Mining and AdaBoost Algorithms |
Authors | Ling Zhong, Jason T. L. Wang |
Abstract | MicroRNAs (miRNAs) are non-coding RNAs of approximately 22 nucleotides (nt) that are derived from precursor molecules. These precursor molecules, or pre-miRNAs, often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-loops (referred to as pseudo pre-miRNAs). Several computational methods have been developed to tackle this challenge. In this paper we propose a new method, called MirID, for identifying and classifying microRNA precursors. We collect 74 features from the sequences and secondary structures of pre-miRNAs; some of these features are taken from our previous studies on non-coding RNA prediction while others were suggested in the literature. We develop a combinatorial feature mining algorithm to identify suitable feature sets. These feature sets are then used to train support vector machines to obtain classification models, based on which a classifier ensemble is constructed. Finally, we use an AdaBoost algorithm to further enhance the accuracy of the classifier ensemble. Experimental results on a variety of species demonstrate the good performance of the proposed method, and its superiority over existing tools. |
Tasks | |
Published | 2016-10-06 |
URL | http://arxiv.org/abs/1610.02281v1 |
http://arxiv.org/pdf/1610.02281v1.pdf | |
PWC | https://paperswithcode.com/paper/effective-classification-of-microrna |
Repo | |
Framework | |
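The boosting step described in the abstract admits a compact sketch. Below is a minimal AdaBoost over a fixed pool of base classifiers, standing in for the paper's SVMs trained on mined feature subsets; the toy 1-D data, stump rules, and round count are illustrative assumptions, not details from the paper.

```python
import numpy as np

def adaboost(predict_fns, X, y, rounds=10):
    """AdaBoost over a fixed pool of base classifiers (labels in {-1, +1}).
    Each round picks the classifier with lowest weighted error and
    reweights the examples it gets wrong."""
    n = len(y)
    w = np.full(n, 1.0 / n)                      # example weights
    alphas, chosen = [], []
    for _ in range(rounds):
        errs = [np.sum(w * (f(X) != y)) for f in predict_fns]
        k = int(np.argmin(errs))
        err = max(errs[k], 1e-10)
        if err >= 0.5:                           # no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / err)
        pred = predict_fns[k](X)
        w *= np.exp(-alpha * np.where(pred == y, 1.0, -1.0))
        w /= w.sum()
        alphas.append(alpha)
        chosen.append(k)

    def ensemble(Xq):
        votes = sum(a * np.where(predict_fns[k](Xq) == 1, 1.0, -1.0)
                    for a, k in zip(alphas, chosen))
        return np.where(votes >= 0, 1, -1)
    return ensemble

# toy demo: two weak threshold rules on a single hypothetical feature
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([-1, -1, -1, 1, 1, 1])
stumps = [lambda X: np.where(X > 0, 1, -1),
          lambda X: np.where(X > -1.5, 1, -1)]
clf = adaboost(stumps, X, y)
accuracy = float(np.mean(clf(X) == y))
```

In MirID the pool would contain one SVM per mined feature set, and boosting weights their votes.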
Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes
Title | Information-Theoretic Methods for Planning and Learning in Partially Observable Markov Decision Processes |
Authors | Roy Fox |
Abstract | Bounded agents are limited by intrinsic constraints on their ability to process information that is available in their sensors and memory and choose actions and memory updates. In this dissertation, we model these constraints as information-rate constraints on communication channels connecting these various internal components of the agent. We make four major contributions detailed below and many smaller contributions detailed in each section. First, we formulate the problem of optimizing the agent under both extrinsic and intrinsic constraints and develop the main tools for solving it. Second, we identify another reason for the challenging convergence properties of the optimization algorithm, which is the bifurcation structure of the update operator near phase transitions. Third, we study the special case of linear-Gaussian dynamics and quadratic cost (LQG), where the optimal solution has a particularly simple and solvable form. Fourth, we explore the learning task, where the model of the world dynamics is unknown and sample-based updates are used instead. |
Tasks | |
Published | 2016-09-24 |
URL | http://arxiv.org/abs/1609.07672v2 |
http://arxiv.org/pdf/1609.07672v2.pdf | |
PWC | https://paperswithcode.com/paper/information-theoretic-methods-for-planning |
Repo | |
Framework | |
A Semi-Lagrangian two-level preconditioned Newton-Krylov solver for constrained diffeomorphic image registration
Title | A Semi-Lagrangian two-level preconditioned Newton-Krylov solver for constrained diffeomorphic image registration |
Authors | Andreas Mang, George Biros |
Abstract | We propose an efficient numerical algorithm for the solution of diffeomorphic image registration problems. We use a variational formulation constrained by a partial differential equation (PDE), where the constraint is a scalar transport equation. We use a pseudospectral discretization in space and a second-order accurate semi-Lagrangian time-stepping scheme for the transport equations. We solve for a stationary velocity field using a preconditioned, globalized, matrix-free Newton-Krylov scheme. We propose and test a two-level Hessian preconditioner. We consider two strategies for inverting the preconditioner on the coarse grid: a nested preconditioned conjugate gradient method (exact solve) and a nested Chebyshev iterative method (inexact solve) with a fixed number of iterations. We test the performance of our solver in different synthetic and real-world two-dimensional application scenarios. We study grid convergence and computational efficiency of our new scheme. We compare the performance of our solver against our initial implementation that uses the same spatial discretization but a standard, explicit, second-order Runge-Kutta scheme for the numerical time integration of the transport equations and a single-level preconditioner. Our improved scheme delivers significant speedups over our original implementation. As a highlight, we observe a 20$\times$ speedup for a two-dimensional, real-world multi-subject medical image registration problem. |
Tasks | Constrained Diffeomorphic Image Registration, Image Registration, Medical Image Registration |
Published | 2016-04-07 |
URL | http://arxiv.org/abs/1604.02153v2 |
http://arxiv.org/pdf/1604.02153v2.pdf | |
PWC | https://paperswithcode.com/paper/a-semi-lagrangian-two-level-preconditioned |
Repo | |
Framework | |
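A one-dimensional sketch illustrates the key property of semi-Lagrangian time stepping: each grid value is pulled back along its characteristic, so the scheme stays stable for large time steps. This toy uses linear interpolation on a periodic grid, which is far simpler than the paper's second-order, pseudospectral solver.

```python
import numpy as np

def semi_lagrangian_step(f, v, dt, dx):
    """One semi-Lagrangian step for the 1-D periodic transport equation
    f_t + v f_x = 0: trace each grid point back along its characteristic
    and linearly interpolate the field at the departure point."""
    n = len(f)
    x = np.arange(n) * dx
    xd = (x - v * dt) % (n * dx)          # departure points
    i0 = np.floor(xd / dx).astype(int) % n
    i1 = (i0 + 1) % n
    frac = (xd / dx) - np.floor(xd / dx)
    return (1 - frac) * f[i0] + frac * f[i1]

# advect a Gaussian bump once around the periodic domain; with CFL = 1
# the departure points land exactly on grid nodes, so the transport is exact
n, dx, v, dt = 64, 1.0, 1.0, 1.0
f0 = np.exp(-0.5 * ((np.arange(n) - 20.0) / 3.0) ** 2)
f = f0.copy()
for _ in range(n):                        # n steps of size dt move the bump n*dx
    f = semi_lagrangian_step(f, v, dt, dx)
max_error = float(np.abs(f - f0).max())
```

Unlike explicit Runge-Kutta integration (the baseline the paper compares against), the step size here is not restricted by a CFL stability condition, only by accuracy.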
Texture Synthesis Through Convolutional Neural Networks and Spectrum Constraints
Title | Texture Synthesis Through Convolutional Neural Networks and Spectrum Constraints |
Authors | Gang Liu, Yann Gousseau, Gui-Song Xia |
Abstract | This paper presents a significant improvement for the synthesis of texture images using convolutional neural networks (CNNs), making use of constraints on the Fourier spectrum of the results. More precisely, the texture synthesis is regarded as a constrained optimization problem, with constraints conditioning both the Fourier spectrum and statistical features learned by CNNs. In contrast with existing methods, the presented method inherits from previous CNN approaches the ability to depict local structures and fine scale details, and at the same time yields coherent large scale structures, even in the case of quasi-periodic images. This is done at no extra computational cost. Synthesis experiments on various images show a clear improvement compared to a recent state-of-the-art method relying on CNN constraints only. |
Tasks | Texture Synthesis |
Published | 2016-05-04 |
URL | http://arxiv.org/abs/1605.01141v3 |
http://arxiv.org/pdf/1605.01141v3.pdf | |
PWC | https://paperswithcode.com/paper/texture-synthesis-through-convolutional |
Repo | |
Framework | |
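One common way to realize a Fourier-spectrum constraint is a projection: keep the phase of the current synthesis and impose the magnitude spectrum of the reference texture. The sketch below shows that projection in isolation; the paper's exact constraint handling inside the CNN optimization may differ.

```python
import numpy as np

def project_spectrum(img, ref):
    """Project an image onto the set of images sharing the Fourier
    modulus of a reference texture: keep the current phase, impose the
    reference magnitude.  For real inputs the product stays conjugate-
    symmetric, so the inverse FFT is real up to floating-point error."""
    F = np.fft.fft2(img)
    mag_ref = np.abs(np.fft.fft2(ref))
    phase = np.exp(1j * np.angle(F))
    return np.real(np.fft.ifft2(mag_ref * phase))

rng = np.random.default_rng(0)
ref = rng.standard_normal((32, 32))       # stand-in for a reference texture
noise = rng.standard_normal((32, 32))     # stand-in for a CNN synthesis iterate
out = project_spectrum(noise, ref)
spec_gap = float(np.abs(np.abs(np.fft.fft2(out)) -
                        np.abs(np.fft.fft2(ref))).max())
```

After the projection the two magnitude spectra agree to floating-point precision, which is what lets the method reproduce the large-scale, quasi-periodic structure the CNN statistics alone miss.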
Automated OCT Segmentation for Images with DME
Title | Automated OCT Segmentation for Images with DME |
Authors | Sohini Roychowdhury, Dara D. Koozekanani, Michael Reinsbach, Keshab K. Parhi |
Abstract | This paper presents a novel automated system that segments six sub-retinal layers from optical coherence tomography (OCT) image stacks of healthy patients and patients with diabetic macular edema (DME). First, each image in the OCT stack is denoised using a Wiener deconvolution algorithm that estimates the additive speckle noise variance using a novel Fourier-domain based structural error. This denoising method enhances the image SNR by an average of 12 dB. Next, the denoised images are subjected to an iterative multi-resolution high-pass filtering algorithm that detects seven sub-retinal surfaces in six iterative steps. The thicknesses of each sub-retinal layer for all scans from a particular OCT stack are then compared to the manually marked ground truth. The proposed system uses adaptive thresholds for denoising and segmenting each image and hence it is robust to disruptions in the retinal micro-structure due to DME. The proposed denoising and segmentation system has an average error of 1.2-5.8 $\mu m$ and 3.5-26 $\mu m$ for segmenting sub-retinal surfaces in normal and abnormal images with DME, respectively. For estimating the sub-retinal layer thicknesses, the proposed system has an average error of 0.2-2.5 $\mu m$ and 1.8-18 $\mu m$ in normal and abnormal images, respectively. Additionally, the average inner sub-retinal layer thickness in abnormal images is estimated as 275 $\mu m$ ($r=0.92$) with an average error of 9.3 $\mu m$, while the average thickness of the outer layers in abnormal images is estimated as 57.4 $\mu m$ ($r=0.74$) with an average error of 3.5 $\mu m$. The proposed system can be useful for tracking the disease progression for DME over a period of time. |
Tasks | Denoising |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07560v1 |
http://arxiv.org/pdf/1610.07560v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-oct-segmentation-for-images-with |
Repo | |
Framework | |
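The denoising stage is frequency-domain Wiener filtering. The sketch below is the simplest variant (additive noise only, no deconvolution kernel) with the noise variance taken as given; the paper additionally estimates that variance per image from a Fourier-domain structural error.

```python
import numpy as np

def wiener_denoise(img, noise_var):
    """Frequency-domain Wiener-style filter for additive white noise:
    attenuate each frequency by its estimated signal-to-total power
    ratio.  A minimal sketch; the paper performs full Wiener
    deconvolution with an estimated noise variance."""
    F = np.fft.fft2(img)
    power = np.abs(F) ** 2 / img.size            # per-frequency power estimate
    gain = np.maximum(power - noise_var, 0) / np.maximum(power, 1e-12)
    return np.real(np.fft.ifft2(gain * F))

rng = np.random.default_rng(1)
x, yy = np.meshgrid(np.arange(64), np.arange(64))
clean = np.sin(2 * np.pi * x / 16.0)             # smooth stand-in for tissue layers
noisy = clean + 0.5 * rng.standard_normal(clean.shape)
denoised = wiener_denoise(noisy, noise_var=0.25)
mse_noisy = float(np.mean((noisy - clean) ** 2))
mse_denoised = float(np.mean((denoised - clean) ** 2))
```

Because the sinusoidal "tissue" energy concentrates in a few frequency bins while the noise spreads over all of them, the filter suppresses most of the noise power while passing the structure nearly untouched.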
Bayesian Adaptive Data Analysis Guarantees from Subgaussianity
Title | Bayesian Adaptive Data Analysis Guarantees from Subgaussianity |
Authors | Sam Elder |
Abstract | The new field of adaptive data analysis seeks to provide algorithms and provable guarantees for models of machine learning that allow researchers to reuse their data, which normally falls outside of the usual statistical paradigm of static data analysis. In 2014, Dwork, Feldman, Hardt, Pitassi, Reingold and Roth introduced one potential model and proposed several solutions based on differential privacy. In previous work in 2016, we described a problem with this model and instead proposed a Bayesian variant, but also found that the analogous Bayesian methods cannot achieve the same statistical guarantees as in the static case. In this paper, we prove the first positive results for the Bayesian model, showing that with a Dirichlet prior, the posterior mean algorithm indeed matches the statistical guarantees of the static case. The main ingredient is a new theorem showing that the $\mathrm{Beta}(\alpha,\beta)$ distribution is subgaussian with variance proxy $O(1/(\alpha+\beta+1))$, a concentration result also of independent interest. We provide two proofs of this result: a probabilistic proof utilizing a simple condition for the raw moments of a positive random variable and a learning-theoretic proof based on considering the beta distribution as a posterior, both of which have implications to other related problems. |
Tasks | |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1611.00065v3 |
http://arxiv.org/pdf/1611.00065v3.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-adaptive-data-analysis-guarantees |
Repo | |
Framework | |
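The concentration claim can be sanity-checked numerically: the closed-form Beta variance $\alpha\beta/((\alpha+\beta)^2(\alpha+\beta+1))$ is always at most $1/(4(\alpha+\beta+1))$ (by AM-GM, $\alpha\beta/(\alpha+\beta)^2 \le 1/4$), consistent with a variance proxy of order $1/(\alpha+\beta+1)$. This checks only the variance, not the full subgaussian tail bound the paper proves.

```python
def beta_variance(a, b):
    """Closed-form variance of the Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

# the variance never exceeds 1/(4(a+b+1)), matching the paper's
# O(1/(alpha+beta+1)) variance proxy up to the constant
checks = []
for a in [0.5, 1, 2, 5, 50]:
    for b in [0.5, 1, 3, 20]:
        checks.append(beta_variance(a, b) <= 1 / (4 * (a + b + 1)))
all_bounded = all(checks)
```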
Zero-Shot Visual Recognition via Bidirectional Latent Embedding
Title | Zero-Shot Visual Recognition via Bidirectional Latent Embedding |
Authors | Qian Wang, Ke Chen |
Abstract | Zero-shot learning for visual recognition, e.g., object and action recognition, has recently attracted a lot of attention. However, it still remains challenging in bridging the semantic gap between visual features and their underlying semantics and transferring knowledge to semantic categories unseen during learning. Unlike most of the existing zero-shot visual recognition methods, we propose a stagewise bidirectional latent embedding framework consisting of two subsequent learning stages for zero-shot visual recognition. In the bottom-up stage, a latent embedding space is first created by exploring the topological and labeling information underlying training data of known classes via a proper supervised subspace learning algorithm, and the latent embeddings of training data are used to form landmarks that guide embedding semantics underlying unseen classes into this learned latent space. In the top-down stage, semantic representations of unseen-class labels in a given label vocabulary are then embedded into the same latent space to preserve the semantic relatedness between all different classes via our proposed semi-supervised Sammon mapping with the guidance of landmarks. Thus, the resultant latent embedding space allows for predicting the label of a test instance with a simple nearest-neighbor rule. To evaluate the effectiveness of the proposed framework, we have conducted extensive experiments on four benchmark datasets in object and action recognition, i.e., AwA, CUB-200-2011, UCF101 and HMDB51. The experimental results under comparative studies demonstrate that our proposed approach yields the state-of-the-art performance under inductive and transductive settings. |
Tasks | Temporal Action Localization, Zero-Shot Learning |
Published | 2016-07-07 |
URL | http://arxiv.org/abs/1607.02104v4 |
http://arxiv.org/pdf/1607.02104v4.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-visual-recognition-via |
Repo | |
Framework | |
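The final prediction rule is deliberately simple: once every class (seen or unseen) has an embedding in the shared latent space, a test instance takes the label of its nearest class embedding. The toy classes and coordinates below are hypothetical; learning the embeddings (supervised subspace learning bottom-up, landmark-guided Sammon mapping top-down) is not reproduced here.

```python
import numpy as np

def nearest_class(latent_x, class_embeddings):
    """Label a test instance by the nearest class embedding in the
    shared latent space (the paper's nearest-neighbor prediction rule)."""
    names = list(class_embeddings)
    dists = [np.linalg.norm(latent_x - class_embeddings[n]) for n in names]
    return names[int(np.argmin(dists))]

# toy latent space: two seen classes and one unseen class whose embedding
# was placed by its semantic representation
classes = {
    "horse": np.array([1.0, 0.0]),
    "tiger": np.array([0.0, 1.0]),
    "zebra": np.array([0.8, 0.6]),   # unseen class, embedded top-down
}
label = nearest_class(np.array([0.75, 0.55]), classes)
```

Because seen and unseen classes share one space, the same rule serves both inductive and transductive evaluation.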
On Minimal Accuracy Algorithm Selection in Computer Vision and Intelligent Systems
Title | On Minimal Accuracy Algorithm Selection in Computer Vision and Intelligent Systems |
Authors | Martin Lukac, Kamila Abdiyeva, Michitaka Kameyama |
Abstract | In this paper we discuss certain theoretical properties of the algorithm selection approach to image processing and to intelligent systems in general. We analyze the theoretical limits of algorithm selection with respect to the algorithm selection accuracy. We derive a crisp bound on the algorithm selector precision that guarantees a result better than that of the best available single algorithm. |
Tasks | |
Published | 2016-08-12 |
URL | http://arxiv.org/abs/1608.03832v1 |
http://arxiv.org/pdf/1608.03832v1.pdf | |
PWC | https://paperswithcode.com/paper/on-minimal-accuracy-algorithm-selection-in |
Repo | |
Framework | |
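A back-of-the-envelope version of such a bound: assume (purely for illustration, this is not the paper's exact formula) that selecting a correct algorithm yields a correct result and selecting a wrong one yields an incorrect result. Then the selector's accuracy must exceed the ratio of the best single algorithm's accuracy to the oracle accuracy (the fraction of inputs where at least one algorithm is correct).

```python
def selector_threshold(best_alg_acc, oracle_acc):
    """Minimal selector accuracy needed to beat the best single
    algorithm, under the simplifying assumption above: the selector's
    expected accuracy is selector_acc * oracle_acc, which exceeds
    best_alg_acc exactly when selector_acc > best_alg_acc / oracle_acc."""
    return best_alg_acc / oracle_acc

# three algorithms, oracle correct on 90% of inputs, best single algorithm
# correct on 70%: the selector must be right more than 7/9 of the time
threshold = selector_threshold(0.70, 0.90)
```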
A Survey of Inductive Biases for Factorial Representation-Learning
Title | A Survey of Inductive Biases for Factorial Representation-Learning |
Authors | Karl Ridgeway |
Abstract | With the resurgence of interest in neural networks, representation learning has re-emerged as a central focus in artificial intelligence. Representation learning refers to the discovery of useful encodings of data that make domain-relevant information explicit. Factorial representations identify underlying independent causal factors of variation in data. A factorial representation is compact and faithful, makes the causal factors explicit, and facilitates human interpretation of data. Factorial representations support a variety of applications, including the generation of novel examples, indexing and search, novelty detection, and transfer learning. This article surveys various constraints that encourage a learning algorithm to discover factorial representations. I dichotomize the constraints in terms of unsupervised and supervised inductive bias. Unsupervised inductive biases exploit assumptions about the environment, such as the statistical distribution of factor coefficients, assumptions about the perturbations a factor should be invariant to (e.g. a representation of an object can be invariant to rotation, translation or scaling), and assumptions about how factors are combined to synthesize an observation. Supervised inductive biases are constraints on the representations based on additional information connected to observations. Supervisory labels come in a variety of types, which vary in how strongly they constrain the representation, how many factors are labeled, how many observations are labeled, and whether or not we know the associations between the constraints and the factors they are related to. This survey brings together a wide variety of models that all touch on the problem of learning factorial representations and lays out a framework for comparing these models based on the strengths of the underlying supervised and unsupervised inductive biases. |
Tasks | Representation Learning, Transfer Learning |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05299v1 |
http://arxiv.org/pdf/1612.05299v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-inductive-biases-for-factorial |
Repo | |
Framework | |
Visual Closed-Loop Control for Pouring Liquids
Title | Visual Closed-Loop Control for Pouring Liquids |
Authors | Connor Schenck, Dieter Fox |
Abstract | Pouring a specific amount of liquid is a challenging task. In this paper we develop methods for robots to use visual feedback to perform closed-loop control for pouring liquids. We propose both a model-based and a model-free method utilizing deep learning for estimating the volume of liquid in a container. Our results show that the model-free method is better able to estimate the volume. We combine this with a simple PID controller to pour specific amounts of liquid, and show that the robot is able to achieve an average deviation of 38 ml from the target amount. To our knowledge, this is the first use of raw visual feedback to pour liquids in robotics. |
Tasks | |
Published | 2016-10-09 |
URL | http://arxiv.org/abs/1610.02610v3 |
http://arxiv.org/pdf/1610.02610v3.pdf | |
PWC | https://paperswithcode.com/paper/visual-closed-loop-control-for-pouring |
Repo | |
Framework | |
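The control loop itself is simple once a volume estimate is available. The toy simulation below closes the loop with a proportional controller (standing in for the paper's PID) and a noiseless volume reading in place of the paper's CNN-based visual estimate; the gain, step count, and flow model are all illustrative.

```python
def simulate_pour(target_ml, kp=2.0, steps=120, dt=0.1):
    """Toy closed-loop pour: a proportional controller turns the volume
    error into a pour rate.  The paper estimates the volume from raw
    video with a CNN; here the feedback is a perfect simulated reading."""
    volume = 0.0
    for _ in range(steps):
        err = target_ml - volume          # feedback from the "volume sensor"
        rate = max(0.0, kp * err)         # ml/s; the valve cannot pour backwards
        volume += rate * dt               # simple integrator flow model
    return volume

deviation = abs(simulate_pour(250.0) - 250.0)
```

In the paper's setting the interesting difficulty is entirely in the feedback signal: a noisy, learned volume estimate makes the same loop much harder to tune, hence the reported 38 ml average deviation.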
Noisy population recovery in polynomial time
Title | Noisy population recovery in polynomial time |
Authors | Anindya De, Michael Saks, Sijian Tang |
Abstract | In the noisy population recovery problem of Dvir et al., the goal is to learn an unknown distribution $f$ on binary strings of length $n$ from noisy samples. For some parameter $\mu \in [0,1]$, a noisy sample is generated by flipping each coordinate of a sample from $f$ independently with probability $(1-\mu)/2$. We assume an upper bound $k$ on the size of the support of the distribution, and the goal is to estimate the probability of any string to within some given error $\varepsilon$. It is known that the algorithmic complexity and sample complexity of this problem are polynomially related to each other. We show that for $\mu > 0$, the sample complexity (and hence the algorithmic complexity) is bounded by a polynomial in $k$, $n$ and $1/\varepsilon$ improving upon the previous best result of $\mathsf{poly}(k^{\log\log k},n,1/\varepsilon)$ due to Lovett and Zhang. Our proof combines ideas from Lovett and Zhang with a \emph{noise attenuated} version of Möbius inversion. In turn, the latter crucially uses the construction of \emph{robust local inverse} due to Moitra and Saks. |
Tasks | |
Published | 2016-02-24 |
URL | http://arxiv.org/abs/1602.07616v1 |
http://arxiv.org/pdf/1602.07616v1.pdf | |
PWC | https://paperswithcode.com/paper/noisy-population-recovery-in-polynomial-time |
Repo | |
Framework | |
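The noise model is easy to make concrete: each coordinate of a sampled string flips independently with probability $(1-\mu)/2$, so $\mu=1$ means no noise and $\mu=0$ means the sample carries no information. A sketch of the sampling process (the strings and sample counts are illustrative):

```python
import random

def noisy_sample(s, mu, rng):
    """Draw one noisy observation of the binary string s: flip each
    coordinate independently with probability (1 - mu) / 2."""
    p_flip = (1 - mu) / 2
    return [b ^ (rng.random() < p_flip) for b in s]

rng = random.Random(0)
s = [1, 0, 1, 1, 0, 0, 1, 0]
exact = noisy_sample(s, 1.0, rng)        # mu = 1: the sample equals s
samples = [noisy_sample(s, 0.5, rng) for _ in range(2000)]
# with mu = 0.5, each bit agrees with s with probability 0.75
agree = sum(b == t for samp in samples for b, t in zip(samp, s)) / (2000 * len(s))
```

The recovery task is the inverse problem: from many such samples drawn from a mixture over at most $k$ strings, estimate the mixture weights to accuracy $\varepsilon$.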
Graph-Based Active Learning: A New Look at Expected Error Minimization
Title | Graph-Based Active Learning: A New Look at Expected Error Minimization |
Authors | Kwang-Sung Jun, Robert Nowak |
Abstract | In graph-based active learning, algorithms based on expected error minimization (EEM) have been popular and yield good empirical performance. The exact computation of EEM optimally balances exploration and exploitation. In practice, however, EEM-based algorithms employ various approximations due to the computational hardness of exact EEM. This can result in a lack of either exploration or exploitation, which can negatively impact the effectiveness of active learning. We propose a new algorithm TSA (Two-Step Approximation) that balances between exploration and exploitation efficiently while enjoying the same computational complexity as existing approximations. Finally, we empirically show the value of balancing between exploration and exploitation in both toy and real-world datasets where our method outperforms several state-of-the-art methods. |
Tasks | Active Learning |
Published | 2016-09-03 |
URL | http://arxiv.org/abs/1609.00845v1 |
http://arxiv.org/pdf/1609.00845v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-active-learning-a-new-look-at |
Repo | |
Framework | |
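Exact one-step EEM can be written down directly, which also shows why it is expensive: every candidate query requires retraining once per possible label. The sketch below uses a harmonic-function (label-propagation) model and 0/1 risk, which are common choices in graph-based active learning but are assumptions here, not necessarily the paper's exact setup; TSA's contribution is approximating this objective cheaply.

```python
import numpy as np

def harmonic(W, labeled, y):
    """Harmonic label propagation: clamp labeled nodes and solve the
    graph-Laplacian system for the unlabeled scores in [0, 1]."""
    n = W.shape[0]
    U = [i for i in range(n) if i not in labeled]
    L = sorted(labeled)
    D = np.diag(W.sum(axis=1))
    f = np.zeros(n)
    for i in L:
        f[i] = y[i]
    f[U] = np.linalg.solve((D - W)[np.ix_(U, U)], W[np.ix_(U, L)] @ f[L])
    return f

def eem_query(W, labeled, y):
    """Exact one-step expected error minimization: for each candidate
    query, average the post-query prediction risk over the model's own
    label distribution, and pick the minimizer."""
    f = harmonic(W, labeled, y)
    best, best_risk = None, float("inf")
    for q in range(W.shape[0]):
        if q in labeled:
            continue
        risk = 0.0
        for lab, p in ((1, f[q]), (0, 1.0 - f[q])):
            f2 = harmonic(W, labeled | {q}, {**y, q: lab})
            risk += p * np.minimum(f2, 1.0 - f2).sum()  # expected 0/1 error
        if risk < best_risk:
            best, best_risk = q, risk
    return best

# path graph with the two endpoints labeled 1 and 0: EEM queries the middle
n = 7
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
query = eem_query(W, {0, 6}, {0: 1, 6: 0})
```

On this path graph the exact objective picks the midpoint, which both reduces uncertainty (exploitation) and bisects the unlabeled region (exploration); approximations that drop one of the two effects can pick worse queries.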
$\mathbf{D^3}$: Deep Dual-Domain Based Fast Restoration of JPEG-Compressed Images
Title | $\mathbf{D^3}$: Deep Dual-Domain Based Fast Restoration of JPEG-Compressed Images |
Authors | Zhangyang Wang, Ding Liu, Shiyu Chang, Qing Ling, Yingzhen Yang, Thomas S. Huang |
Abstract | In this paper, we design a Deep Dual-Domain ($\mathbf{D^3}$) based fast restoration model to remove artifacts of JPEG compressed images. It leverages the large learning capacity of deep networks, as well as the problem-specific expertise that was hardly incorporated in the past design of deep architectures. For the latter, we take into consideration both the prior knowledge of the JPEG compression scheme, and the successful practice of the sparsity-based dual-domain approach. We further design the One-Step Sparse Inference (1-SI) module, as an efficient and lightweight feed-forward approximation of sparse coding. Extensive experiments verify the superiority of the proposed $D^3$ model over several state-of-the-art methods. Specifically, our best model is capable of outperforming the latest deep model by around 1 dB in PSNR, and is 30 times faster. |
Tasks | |
Published | 2016-01-16 |
URL | http://arxiv.org/abs/1601.04149v3 |
http://arxiv.org/pdf/1601.04149v3.pdf | |
PWC | https://paperswithcode.com/paper/mathbfd3-deep-dual-domain-based-fast |
Repo | |
Framework | |
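The "prior knowledge of the JPEG compression scheme" centers on the 8x8 block DCT, since JPEG quantizes images in that transform domain. The sketch below builds the orthonormal DCT-II basis and verifies the pixel-domain/DCT-domain round trip; how the paper's network consumes the two domains is not reproduced here.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (rows = frequencies), the
    transform underlying JPEG's 8x8 blocks."""
    k = np.arange(n)
    M = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0, :] = np.sqrt(1.0 / n)            # DC row gets the smaller scale
    return M

D = dct_matrix()
block = np.arange(64, dtype=float).reshape(8, 8)
coeffs = D @ block @ D.T                  # DCT-domain view of a pixel block
recon = D.T @ coeffs @ D                  # inverse transform
roundtrip_err = float(np.abs(recon - block).max())
```

JPEG artifacts are structured in `coeffs` (quantization snaps each coefficient to a grid), which is exactly the structure a dual-domain model can exploit that a purely pixel-domain network cannot see directly.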
Learning-Based View Synthesis for Light Field Cameras
Title | Learning-Based View Synthesis for Light Field Cameras |
Authors | Nima Khademi Kalantari, Ting-Chun Wang, Ravi Ramamoorthi |
Abstract | With the introduction of consumer light field cameras, light field imaging has recently become widespread. However, there is an inherent trade-off between the angular and spatial resolution, and thus, these cameras often sparsely sample in either spatial or angular domain. In this paper, we use machine learning to mitigate this trade-off. Specifically, we propose a novel learning-based approach to synthesize new views from a sparse set of input views. We build upon existing view synthesis techniques and break down the process into disparity and color estimation components. We use two sequential convolutional neural networks to model these two components and train both networks simultaneously by minimizing the error between the synthesized and ground truth images. We show the performance of our approach using only four corner sub-aperture views from the light fields captured by the Lytro Illum camera. Experimental results show that our approach synthesizes high-quality images that are superior to the state-of-the-art techniques on a variety of challenging real-world scenes. We believe our method could potentially decrease the required angular resolution of consumer light field cameras, which allows their spatial resolution to increase. |
Tasks | |
Published | 2016-09-09 |
URL | http://arxiv.org/abs/1609.02974v1 |
http://arxiv.org/pdf/1609.02974v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-based-view-synthesis-for-light-field |
Repo | |
Framework | |
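The geometric half of the disparity-plus-color breakdown is view warping: given a per-pixel disparity map, a sub-aperture view can be re-projected to a novel angular position. The nearest-neighbor warp below is a sketch of that step only; in the paper both the disparity and the final colors come from CNNs.

```python
import numpy as np

def warp_view(view, disparity, du, dv):
    """Warp a sub-aperture view by (du, dv) angular steps using a
    per-pixel disparity map: each output pixel samples the source at a
    disparity-scaled offset (nearest-neighbor, clamped at the border)."""
    h, w = view.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + du * disparity).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + dv * disparity).astype(int), 0, h - 1)
    return view[src_y, src_x]

# toy view whose pixel value equals its column index; a constant disparity
# of 2 px shifts the sampling by 2 columns per angular step
view = np.tile(np.arange(8.0), (8, 1))
novel = warp_view(view, np.full((8, 8), 2.0), du=1, dv=0)
```

Warping the four corner views toward the target position and letting a second network resolve the color disagreements is, at a high level, the pipeline the abstract describes.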
Spatio-temporal Co-Occurrence Characterizations for Human Action Classification
Title | Spatio-temporal Co-Occurrence Characterizations for Human Action Classification |
Authors | Aznul Qalid Md Sabri, Jacques Boonaert, Erma Rahayu Mohd Faizal Abdullah, Ali Mohammed Mansoor |
Abstract | The human action classification task is a widely researched topic and is still an open problem. Many state-of-the-art approaches involve the usage of bag-of-video-words with spatio-temporal local features to construct characterizations for human actions. In order to improve beyond this standard approach, we investigate the usage of co-occurrences between local features. We propose the usage of co-occurrence information to characterize human actions. A trade-off factor is used to define an optimal trade-off between vocabulary size and classification rate. Next, a spatio-temporal co-occurrence technique is applied to extract co-occurrence information between labeled local features. Novel characterizations for human actions are then constructed. These include a vector quantized correlogram-elements vector, a highly discriminative PCA (Principal Components Analysis) co-occurrence vector and a Haralick texture vector. Multi-channel kernel SVM (support vector machine) is utilized for classification. For evaluation, the well known KTH as well as the challenging UCF-Sports action datasets are used. We obtained state-of-the-art classification performance. We also demonstrated that we are able to fully utilize co-occurrence information, and improve the standard bag-of-video-words approach. |
Tasks | Action Classification |
Published | 2016-08-02 |
URL | http://arxiv.org/abs/1610.05174v1 |
http://arxiv.org/pdf/1610.05174v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-co-occurrence |
Repo | |
Framework | |
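The core data structure behind all three characterizations is a co-occurrence matrix over quantized local features. The sketch below counts label co-occurrences for feature pairs within a spatio-temporal radius; the interest points, vocabulary size, and radius are illustrative, and the paper's correlogram, PCA, and Haralick descriptors are all derived from counts of this kind.

```python
import numpy as np

def cooccurrence(points, labels, n_labels, radius):
    """Count co-occurrences of visual-word labels for pairs of
    spatio-temporal interest points within a given radius (a symmetric
    correlogram-style matrix)."""
    C = np.zeros((n_labels, n_labels), dtype=int)
    pts = np.asarray(points, dtype=float)
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if np.linalg.norm(pts[i] - pts[j]) <= radius:
                C[labels[i], labels[j]] += 1
                C[labels[j], labels[i]] += 1
    return C

# toy (x, y, t) interest points with vocabulary labels 0/1; the last
# point is spatio-temporally isolated and contributes no pairs
points = [(0, 0, 0), (1, 0, 0), (0, 1, 1), (5, 5, 9)]
labels = [0, 1, 1, 0]
C = cooccurrence(points, labels, n_labels=2, radius=2.0)
```

Unlike a plain bag-of-video-words histogram, this matrix records which visual words appear near each other in space and time, which is the extra signal the paper exploits.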