May 6, 2019

2993 words 15 mins read

Paper Group ANR 326



Semi-supervised Zero-Shot Learning by a Clustering-based Approach

Title Semi-supervised Zero-Shot Learning by a Clustering-based Approach
Authors Seyed Mohsen Shojaee, Mahdieh Soleymani Baghshah
Abstract In some object recognition problems, labeled data may not be available for all categories. Zero-shot learning utilizes auxiliary information (also called signatures) describing each category in order to find a classifier that can recognize samples from categories with no labeled instances. In this paper, we propose a novel semi-supervised zero-shot learning method that works on an embedding space corresponding to abstract deep visual features. We seek a linear transformation on signatures to map them onto the visual features, such that the mapped signatures of the seen classes are close to labeled samples of the corresponding classes and unlabeled data are also close to the mapped signatures of one of the unseen classes. We use the idea that the rich deep visual features provide a representation space in which samples of each class are usually condensed in a cluster. The effectiveness of the proposed method is demonstrated through extensive experiments on four public benchmarks, improving the state-of-the-art prediction accuracy on three of them.
Tasks Object Recognition, Zero-Shot Learning
Published 2016-05-29
URL http://arxiv.org/abs/1605.09016v2
PDF http://arxiv.org/pdf/1605.09016v2.pdf
PWC https://paperswithcode.com/paper/semi-supervised-zero-shot-learning-by-a
Repo
Framework
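The core idea above (a linear map from class signatures into a deep visual feature space, followed by nearest-mapped-signature assignment) can be illustrated in a few lines of NumPy. This is a minimal sketch, not the authors' semi-supervised objective: it fits the map by ridge regression to seen-class feature means and ignores the unlabeled-data term; all array names and sizes are hypothetical.

```python
import numpy as np

def fit_signature_map(S_seen, X_means, lam=1.0):
    """Ridge-regression estimate of a linear map W so that S_seen @ W
    approximates the visual-feature mean of each seen class.
    S_seen: (C_seen, a) class signatures, X_means: (C_seen, d) class means."""
    a = S_seen.shape[1]
    return np.linalg.solve(S_seen.T @ S_seen + lam * np.eye(a), S_seen.T @ X_means)

def predict_unseen(X, S_unseen, W):
    """Assign each sample to the unseen class whose mapped signature is closest."""
    proto = S_unseen @ W                                   # (C_unseen, d) mapped signatures
    d2 = ((X[:, None, :] - proto[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

# toy usage with random data
rng = np.random.default_rng(0)
S_seen, X_means = rng.normal(size=(10, 16)), rng.normal(size=(10, 64))
W = fit_signature_map(S_seen, X_means)
S_unseen = rng.normal(size=(5, 16))
X_test = rng.normal(size=(100, 64))
print(predict_unseen(X_test, S_unseen, W)[:10])
```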

Generative Image Modeling using Style and Structure Adversarial Networks

Title Generative Image Modeling using Style and Structure Adversarial Networks
Authors Xiaolong Wang, Abhinav Gupta
Abstract Current generative frameworks use end-to-end learning and generate images by sampling from a uniform noise distribution. However, these approaches ignore the most basic principle of image formation: images are the product of (a) structure: the underlying 3D model; and (b) style: the texture mapped onto the structure. In this paper, we factorize the image generation process and propose the Style and Structure Generative Adversarial Network (S^2-GAN). Our S^2-GAN has two components: the Structure-GAN generates a surface normal map; the Style-GAN takes the surface normal map as input and generates the 2D image. Apart from a real vs. generated loss function, we use an additional loss based on surface normals computed from the generated images. The two GANs are first trained independently, and then merged together via joint learning. We show that our S^2-GAN model is interpretable, generates more realistic images, and can be used to learn unsupervised RGBD representations.
Tasks Image Generation
Published 2016-03-17
URL http://arxiv.org/abs/1603.05631v2
PDF http://arxiv.org/pdf/1603.05631v2.pdf
PWC https://paperswithcode.com/paper/generative-image-modeling-using-style-and
Repo
Framework
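A minimal PyTorch sketch of the two-stage generator pipeline described above: a structure generator maps noise to a surface-normal map, and a style generator renders an RGB image conditioned on that map. Layer sizes, resolutions, and module names are illustrative assumptions; the discriminators, the surface-normal consistency loss, and joint training are omitted.

```python
import torch
import torch.nn as nn

class StructureGenerator(nn.Module):
    """Maps a noise vector to a coarse surface-normal map (3 channels)."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 128, 4, 1, 0), nn.ReLU(),  # 1x1 -> 4x4
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),     # 4x4 -> 8x8
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),       # 8x8 -> 16x16
        )
    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))

class StyleGenerator(nn.Module):
    """Takes a surface-normal map (plus style noise broadcast as extra channels)
    and renders an RGB image at the same resolution."""
    def __init__(self, z_dim=8):
        super().__init__()
        self.z_dim = z_dim
        self.net = nn.Sequential(
            nn.Conv2d(3 + z_dim, 64, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, 1, 1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh(),
        )
    def forward(self, normals, z):
        z_map = z.view(z.size(0), self.z_dim, 1, 1).expand(-1, -1, *normals.shape[2:])
        return self.net(torch.cat([normals, z_map], dim=1))

# forward pass of the composed pipeline
g_struct, g_style = StructureGenerator(), StyleGenerator()
z1, z2 = torch.randn(4, 100), torch.randn(4, 8)
normals = g_struct(z1)          # (4, 3, 16, 16) surface normals
image = g_style(normals, z2)    # (4, 3, 16, 16) rendered image
print(normals.shape, image.shape)
```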

Variational Lossy Autoencoder

Title Variational Lossy Autoencoder
Authors Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel
Abstract Representation learning seeks to expose certain aspects of observed data in a learned representation that is amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining a Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE model allows us to control what the global latent code can learn and, by designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, so that the VAE only “autoencodes” data in a lossy fashion. In addition, by leveraging autoregressive models as both the prior distribution $p(z)$ and the decoding distribution $p(x|z)$, we can greatly improve the generative modeling performance of VAEs, achieving new state-of-the-art results on the MNIST, OMNIGLOT and Caltech-101 Silhouettes density estimation tasks.
Tasks Density Estimation, Omniglot, Representation Learning
Published 2016-11-08
URL http://arxiv.org/abs/1611.02731v2
PDF http://arxiv.org/pdf/1611.02731v2.pdf
PWC https://paperswithcode.com/paper/variational-lossy-autoencoder
Repo
Framework
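The mechanism that makes the autoencoding "lossy" is architectural: the autoregressive decoder is given only a small local receptive field (e.g. PixelCNN-style masked convolutions), so global structure has to flow through the latent code. Below is a hedged PyTorch sketch of such a local, latent-conditioned decoder; it illustrates the idea rather than the paper's exact architecture, and the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Conv2d):
    """PixelCNN-style masked convolution: each output pixel only sees pixels
    above/left of it (mask type 'A' also hides the centre pixel)."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        kH, kW = self.kernel_size
        mask = torch.ones(kH, kW)
        mask[kH // 2, kW // 2 + (mask_type == 'B'):] = 0
        mask[kH // 2 + 1:, :] = 0
        self.register_buffer('mask', mask[None, None])
    def forward(self, x):
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        self.stride, self.padding)

class LossyDecoder(nn.Module):
    """A deliberately *local* autoregressive decoder conditioned on a latent z:
    small masked kernels limit its receptive field, so global structure must be
    carried by z (the split of information described in the abstract)."""
    def __init__(self, z_dim=32):
        super().__init__()
        self.cond = nn.Linear(z_dim, 32)
        self.conv_a = MaskedConv2d('A', 1, 32, 3, padding=1)
        self.conv_b = MaskedConv2d('B', 32, 32, 3, padding=1)
        self.out = nn.Conv2d(32, 1, 1)
    def forward(self, x, z):
        h = self.conv_a(x) + self.cond(z)[:, :, None, None]
        h = F.relu(self.conv_b(F.relu(h)))
        return self.out(h)   # per-pixel Bernoulli logits

x, z = torch.rand(2, 1, 28, 28), torch.randn(2, 32)
print(LossyDecoder()(x, z).shape)   # (2, 1, 28, 28)
```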

Semi-supervised Learning using Denoising Autoencoders for Brain Lesion Detection and Segmentation

Title Semi-supervised Learning using Denoising Autoencoders for Brain Lesion Detection and Segmentation
Authors Varghese Alex, Kiran Vaidhya, Subramaniam Thirunavukkarasu, Chandrasekharan Kesavdas, Ganapathy Krishnamurthi
Abstract The work presented explores the use of denoising autoencoders (DAE) for brain lesion detection, segmentation and false positive reduction. Stacked denoising autoencoders (SDAE) were pre-trained using a large number of unlabeled patient volumes and fine-tuned with patches drawn from a limited number of patients (n=20, 40, 65). The results show negligible loss in performance even when the SDAE was fine-tuned using 20 patients. Low grade glioma (LGG) segmentation was achieved using a transfer learning approach wherein a network pre-trained with High Grade Glioma (HGG) data was fine-tuned using LGG image patches. The weakly supervised SDAE (for HGG) and the transfer-learning-based LGG network were also shown to generalize well and provide good segmentation on unseen BraTS 2013 & BraTS 2015 test data. A unique contribution is a single-layer DAE, referred to as a novelty detector (ND). The ND was trained to accurately reconstruct non-lesion patches using a mean squared error loss function. The reconstruction error maps of test data were used to identify regions containing lesions. The error maps were shown to assign distinct error distributions to the various constituents of the glioma, enabling localization. The ND learns the non-lesion brain accurately, as it was also shown to provide good segmentation performance on ischemic brain lesions in images from a different database.
Tasks Denoising, Transfer Learning
Published 2016-11-26
URL http://arxiv.org/abs/1611.08664v4
PDF http://arxiv.org/pdf/1611.08664v4.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-using-denoising
Repo
Framework
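A small PyTorch sketch of the novelty-detector idea: a single-hidden-layer denoising autoencoder trained only on non-lesion patches, with per-patch reconstruction error used as a lesion score at test time. The patch size, hidden width, and toy training loop below are assumptions for illustration, not the paper's settings.

```python
import torch
import torch.nn as nn

class NoveltyDetectorDAE(nn.Module):
    """Single-hidden-layer denoising autoencoder trained only on non-lesion
    patches; large reconstruction error at test time flags candidate lesions."""
    def __init__(self, patch_dim=21 * 21, hidden=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(patch_dim, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, patch_dim)
    def forward(self, x, noise_std=0.1):
        x_noisy = x + noise_std * torch.randn_like(x)   # denoising corruption
        return self.dec(self.enc(x_noisy))

def error_map(model, patches):
    """Per-patch mean squared reconstruction error (higher = more 'novel')."""
    with torch.no_grad():
        recon = model(patches, noise_std=0.0)
        return ((recon - patches) ** 2).mean(dim=1)

# toy training loop on random stand-ins for 'healthy' patches
model = NoveltyDetectorDAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
healthy = torch.rand(512, 21 * 21)
for _ in range(50):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(healthy), healthy)
    loss.backward()
    opt.step()
print(error_map(model, torch.rand(8, 21 * 21)))
```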

Temporally Robust Global Motion Compensation by Keypoint-based Congealing

Title Temporally Robust Global Motion Compensation by Keypoint-based Congealing
Authors S. Morteza Safdarnejad, Yousef Atoum, Xiaoming Liu
Abstract Global motion compensation (GMC) removes the impact of camera motion and creates a video in which the background appears static over the progression of time. Various vision problems, such as human activity recognition, background reconstruction, and multi-object tracking can benefit from GMC. Existing GMC algorithms rely on sequentially processing consecutive frames, by estimating the transformation mapping the two frames, and obtaining a composite transformation to a global motion compensated coordinate. Sequential GMC suffers from temporal drift of frames from the accurate global coordinate, due to either error accumulation or sporadic failures of motion estimation at a few frames. We propose a temporally robust global motion compensation (TRGMC) algorithm which performs accurate and stable GMC, despite complicated and long-term camera motion. TRGMC densely connects pairs of frames, by matching local keypoints of each frame. A joint alignment of these frames is formulated as a novel keypoint-based congealing problem, where the transformation of each frame is updated iteratively, such that the spatial coordinates for the start and end points of matched keypoints are identical. Experimental results demonstrate that TRGMC has superior performance in a wide range of scenarios.
Tasks Activity Recognition, Human Activity Recognition, Motion Compensation, Motion Estimation, Multi-Object Tracking, Object Tracking
Published 2016-03-12
URL http://arxiv.org/abs/1603.03968v1
PDF http://arxiv.org/pdf/1603.03968v1.pdf
PWC https://paperswithcode.com/paper/temporally-robust-global-motion-compensation
Repo
Framework
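The joint-alignment step can be pictured as a block-coordinate least-squares problem: holding all other frames fixed, each frame's transformation is updated so that its matched keypoints land on the current positions of their counterparts. The NumPy sketch below implements that idea with affine transforms and synthetic matches; it is a simplified stand-in for the paper's congealing formulation (no keypoint detection, robust weighting, or homographies).

```python
import numpy as np

def joint_align(matches, n_frames, n_iters=20):
    """Very simplified keypoint-based congealing: jointly estimate one affine
    transform per frame so that matched keypoints coincide in a common frame.
    `matches` is a list of (i, j, P, Q) with P, Q arrays of shape (M, 2):
    keypoints P in frame i matched to keypoints Q in frame j.
    Frame 0 is kept fixed as the reference coordinate system."""
    A = [np.hstack([np.eye(2), np.zeros((2, 1))]) for _ in range(n_frames)]
    hom = lambda P: np.hstack([P, np.ones((len(P), 1))])        # (M, 3) homogeneous
    for _ in range(n_iters):
        for i in range(1, n_frames):                            # frame 0 anchored
            src, tgt = [], []
            for (a, b, P, Q) in matches:
                if a == i:
                    src.append(hom(P)); tgt.append(hom(Q) @ A[b].T)
                elif b == i:
                    src.append(hom(Q)); tgt.append(hom(P) @ A[a].T)
            if src:
                S, T = np.vstack(src), np.vstack(tgt)
                A[i] = np.linalg.lstsq(S, T, rcond=None)[0].T   # (2, 3) affine
    return A

# toy example: frame 1 is frame 0 shifted by (5, -3)
rng = np.random.default_rng(1)
P0 = rng.uniform(0, 100, size=(30, 2))
matches = [(0, 1, P0, P0 + np.array([5.0, -3.0]))]
A = joint_align(matches, n_frames=2)
print(np.round(A[1], 2))   # translation part should be close to (-5, 3)
```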

Extreme Stochastic Variational Inference: Distributed and Asynchronous

Title Extreme Stochastic Variational Inference: Distributed and Asynchronous
Authors Jiong Zhang, Parameswaran Raman, Shihao Ji, Hsiang-Fu Yu, S. V. N. Vishwanathan, Inderjit S. Dhillon
Abstract Stochastic variational inference (SVI), the state-of-the-art algorithm for scaling variational inference to large datasets, is inherently serial. Moreover, it requires the parameters to fit in the memory of a single processor; this is problematic when the number of parameters is in the billions. In this paper, we propose extreme stochastic variational inference (ESVI), an asynchronous and lock-free algorithm to perform variational inference for mixture models on massive real-world datasets. ESVI overcomes the limitations of SVI by requiring that each processor only access a subset of the data and a subset of the parameters, thus providing data and model parallelism simultaneously. We demonstrate the effectiveness of ESVI by running Latent Dirichlet Allocation (LDA) on UMBC-3B, a dataset that has a vocabulary of 3 million and a token size of 3 billion. In our experiments, we found that ESVI not only outperforms VI and SVI in wallclock time, but also achieves a better quality solution. In addition, we propose a strategy to speed up computation and save memory when fitting a large number of topics.
Tasks
Published 2016-05-31
URL http://arxiv.org/abs/1605.09499v9
PDF http://arxiv.org/pdf/1605.09499v9.pdf
PWC https://paperswithcode.com/paper/extreme-stochastic-variational-inference
Repo
Framework

Bacterial Foraging Optimized STATCOM for Stability Assessment in Power System

Title Bacterial Foraging Optimized STATCOM for Stability Assessment in Power System
Authors Shiba R. Paital, Prakash K. Ray, Asit Mohanty, Sandipan Patra, Harishchandra Dubey
Abstract This paper presents a study of stability improvement in a single machine connected to an infinite bus (SMIB) power system using a static compensator (STATCOM). The gains of the Proportional-Integral-Derivative (PID) controller in the STATCOM are optimized by a heuristic technique based on Particle Swarm Optimization (PSO). Further, Bacterial Foraging Optimization (BFO) is applied as an alternative heuristic method to select optimal PID controller gains. The performance of the STATCOM with the above soft-computing techniques is studied and compared with the conventional PID controller under various scenarios. The simulation results are accompanied by a quantitative analysis based on performance indices. The analysis clearly demonstrates the robustness of the new scheme in terms of stability and voltage regulation when compared with the conventional PID controller.
Tasks
Published 2016-10-01
URL http://arxiv.org/abs/1610.00001v1
PDF http://arxiv.org/pdf/1610.00001v1.pdf
PWC https://paperswithcode.com/paper/bacterial-foraging-optimized-statcom-for
Repo
Framework
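As a rough illustration of heuristic PID tuning of the kind described above, the sketch below runs a plain global-best PSO over the gains (Kp, Ki, Kd) to minimise an ITAE cost on a toy second-order plant. The plant, cost function, and PSO constants are stand-in assumptions, not the paper's SMIB/STATCOM model; the BFO variant would simply swap in a different search rule around the same cost.

```python
import numpy as np

def pid_cost(gains, dt=0.01, T=5.0):
    """ITAE cost of a PID step response on a toy second-order plant
    y'' + 0.8 y' + y = u (a stand-in for the SMIB/STATCOM dynamics)."""
    Kp, Ki, Kd = gains
    y = yd = integ = 0.0
    e_prev, cost = 1.0, 0.0
    for k in range(int(T / dt)):
        e = 1.0 - y                      # unit step reference
        integ += e * dt
        deriv = (e - e_prev) / dt
        u = Kp * e + Ki * integ + Kd * deriv
        ydd = u - 0.8 * yd - y           # plant dynamics (Euler integration)
        yd += ydd * dt
        y += yd * dt
        cost += (k * dt) * abs(e) * dt   # time-weighted absolute error
        e_prev = e
    return cost

def pso(cost_fn, n_particles=20, n_iters=60, lo=0.0, hi=10.0, seed=0):
    """Plain global-best PSO over the three PID gains (Kp, Ki, Kd)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, size=(n_particles, 3))
    v = np.zeros_like(x)
    pbest, pbest_cost = x.copy(), np.array([cost_fn(p) for p in x])
    gbest = pbest[pbest_cost.argmin()].copy()
    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        costs = np.array([cost_fn(p) for p in x])
        improved = costs < pbest_cost
        pbest[improved], pbest_cost[improved] = x[improved], costs[improved]
        gbest = pbest[pbest_cost.argmin()].copy()
    return gbest, pbest_cost.min()

gains, best = pso(pid_cost)
print("Kp, Ki, Kd =", np.round(gains, 2), " ITAE =", round(best, 4))
```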

PDDL+ Planning via Constraint Answer Set Programming

Title PDDL+ Planning via Constraint Answer Set Programming
Authors Marcello Balduccini, Daniele Magazzeni, Marco Maratea
Abstract PDDL+ is an extension of PDDL that enables modelling planning domains with mixed discrete-continuous dynamics. In this paper we present a new approach to PDDL+ planning based on Constraint Answer Set Programming (CASP), i.e. ASP rules plus numerical constraints. To the best of our knowledge, ours is the first attempt to link PDDL+ planning and logic programming. We provide an encoding of PDDL+ models into CASP problems. The encoding can handle non-linear hybrid domains, and represents a solid basis for applying logic programming to PDDL+ planning. As a case study, we consider the EZCSP CASP solver and obtain promising results on a set of PDDL+ benchmark problems.
Tasks
Published 2016-08-31
URL http://arxiv.org/abs/1609.00030v1
PDF http://arxiv.org/pdf/1609.00030v1.pdf
PWC https://paperswithcode.com/paper/pddl-planning-via-constraint-answer-set
Repo
Framework

Representing Independence Models with Elementary Triplets

Title Representing Independence Models with Elementary Triplets
Authors Jose M. Peña
Abstract In an independence model, the triplets that represent conditional independences between singletons are called elementary. It is known that the elementary triplets represent the independence model unambiguously under some conditions. In this paper, we show how this representation helps in performing some operations on independence models, such as finding the dominant triplets or a minimal independence map of an independence model, computing the union or intersection of a pair of independence models, or performing causal reasoning. For the latter, we rephrase in terms of conditional independences some of Pearl’s results for computing causal effects.
Tasks
Published 2016-12-04
URL http://arxiv.org/abs/1612.01095v1
PDF http://arxiv.org/pdf/1612.01095v1.pdf
PWC https://paperswithcode.com/paper/representing-independence-models-with
Repo
Framework

Local Uncertainty Sampling for Large-Scale Multi-Class Logistic Regression

Title Local Uncertainty Sampling for Large-Scale Multi-Class Logistic Regression
Authors Lei Han, Kean Ming Tan, Ting Yang, Tong Zhang
Abstract A major challenge for building statistical models in the big data era is that the available data volume far exceeds the computational capability. A common approach to this problem is to employ a subsampled dataset that can be handled by the available computational resources. In this paper, we propose a general subsampling scheme for large-scale multi-class logistic regression and examine the variance of the resulting estimator. We show that, asymptotically, the proposed method always achieves a smaller variance than uniform random sampling. Moreover, when the classes are conditionally imbalanced, significant improvement over uniform sampling can be achieved. The empirical performance of the proposed method is compared to other methods on both simulated and real-world datasets, and the results match and confirm our theoretical analysis.
Tasks
Published 2016-04-27
URL http://arxiv.org/abs/1604.08098v3
PDF http://arxiv.org/pdf/1604.08098v3.pdf
PWC https://paperswithcode.com/paper/local-uncertainty-sampling-for-large-scale
Repo
Framework
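The flavour of uncertainty-based subsampling can be sketched with scikit-learn: fit a pilot model on a small uniform subsample, keep each point with probability tied to its predictive uncertainty, and refit with inverse-probability weights. The acceptance rule and constants below are illustrative assumptions and do not reproduce the paper's exact sampling probabilities or theory.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# imbalanced 3-class toy data
n, d = 50_000, 10
X = rng.normal(size=(n, d))
logits = X @ rng.normal(size=(d, 3)) + np.array([3.0, 0.0, -3.0])
y = (logits + rng.gumbel(size=logits.shape)).argmax(axis=1)   # categorical sampling

# 1) pilot model on a small uniform subsample
pilot_idx = rng.choice(n, 5_000, replace=False)
pilot = LogisticRegression(max_iter=1000).fit(X[pilot_idx], y[pilot_idx])

# 2) keep each point with probability proportional to its local uncertainty
#    (1 - max predicted class probability); confidently classified points are mostly dropped
p_max = pilot.predict_proba(X).max(axis=1)
accept_prob = np.clip(5.0 * (1.0 - p_max), 0.02, 1.0)   # 5.0 is an arbitrary budget knob
keep = rng.random(n) < accept_prob

# 3) refit on the subsample, reweighting by inverse acceptance probability
model = LogisticRegression(max_iter=1000).fit(
    X[keep], y[keep], sample_weight=1.0 / accept_prob[keep])
print("kept", keep.sum(), "of", n, "points; full-data accuracy:",
      round(model.score(X, y), 3))
```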

Few-Shot Object Recognition from Machine-Labeled Web Images

Title Few-Shot Object Recognition from Machine-Labeled Web Images
Authors Zhongwen Xu, Linchao Zhu, Yi Yang
Abstract With the tremendous advances of Convolutional Neural Networks (ConvNets) on object recognition, we can now easily obtain reliable machine-labeled annotations from the predictions of off-the-shelf ConvNets. In this work, we present an abstraction-memory-based framework for few-shot learning, building upon machine-labeled image annotations. Our method takes large-scale machine-annotated datasets (e.g., OpenImages) as an external memory bank. In the external memory bank, information is stored in memory slots in key-value form, where the image feature serves as the key and the label embedding serves as the value. When queried by few-shot examples, our model selects visually similar data from the external memory bank, and writes the useful information obtained from related external data into another memory bank, i.e., the abstraction memory. Long Short-Term Memory (LSTM) controllers and attention mechanisms are utilized to ensure that the data written to the abstraction memory is correlated with the query example. The abstraction memory concentrates information from the external memory bank, making few-shot recognition effective. In the experiments, we first confirm that our model can learn to conduct few-shot object recognition on clean human-labeled data from the ImageNet dataset. Then, we demonstrate that with our model, machine-labeled image annotations are a very effective and abundant resource for performing object recognition on novel categories. Experimental results show that our proposed model with machine-labeled annotations achieves strong performance, with only a 1% gap relative to the model trained with human-labeled annotations.
Tasks Few-Shot Learning, Object Recognition
Published 2016-12-19
URL http://arxiv.org/abs/1612.06152v1
PDF http://arxiv.org/pdf/1612.06152v1.pdf
PWC https://paperswithcode.com/paper/few-shot-object-recognition-from-machine
Repo
Framework
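Both memory banks described above rely on a soft key-value read: the query feature attends over the stored keys (image features) and retrieves a weighted mixture of the stored values (label embeddings). The PyTorch sketch below shows only that read operation; the LSTM controllers, the write step into the abstraction memory, and the OpenImages-backed external memory are omitted, and all sizes are made up.

```python
import torch
import torch.nn.functional as F

def memory_read(query, keys, values, temperature=1.0):
    """Soft key-value memory read: attend over memory keys with the query
    feature and return the attention-weighted mixture of the stored values.
    query: (d,), keys: (N, d) image features, values: (N, e) label embeddings."""
    scores = keys @ query / temperature            # (N,) similarity scores
    attn = F.softmax(scores, dim=0)                # attention over memory slots
    return attn @ values, attn                     # (e,), (N,)

# toy external memory: 1000 slots of 512-d keys and 128-d label embeddings
keys = F.normalize(torch.randn(1000, 512), dim=1)
values = torch.randn(1000, 128)
query = F.normalize(torch.randn(512), dim=0)

read, attn = memory_read(query, keys, values, temperature=0.1)
print(read.shape, attn.topk(5).indices.tolist())   # retrieved content + top slots
```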

DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation

Title DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation
Authors Yaroslav Ganin, Daniil Kononenko, Diana Sungatullina, Victor Lempitsky
Abstract In this work, we consider the task of generating highly realistic images of a given face with a redirected gaze. We treat this problem as a specific instance of conditional image generation and suggest a new deep architecture that handles this task very well, as revealed by numerical comparison with prior art and a user study. Our deep architecture performs coarse-to-fine warping with an additional intensity correction of individual pixels. All these operations are performed in a feed-forward manner, and the parameters associated with different operations are learned jointly in an end-to-end fashion. After learning, the resulting neural network can synthesize images with manipulated gaze, where the redirection angle can be selected arbitrarily from a certain range and provided as an input to the network.
Tasks Conditional Image Generation, Image Generation
Published 2016-07-25
URL http://arxiv.org/abs/1607.07215v2
PDF http://arxiv.org/pdf/1607.07215v2.pdf
PWC https://paperswithcode.com/paper/deepwarp-photorealistic-image-resynthesis-for
Repo
Framework
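The warping core of such an architecture is a differentiable bilinear resampler driven by a per-pixel displacement field. Below is a hedged PyTorch sketch built on `grid_sample`; in the actual system the flow field would be predicted coarse-to-fine by a network and followed by per-pixel intensity correction, neither of which is shown here.

```python
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Differentiable bilinear warping: resample `image` (B, C, H, W) according
    to a per-pixel displacement field `flow` (B, H, W, 2) given in pixels."""
    B, C, H, W = image.shape
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32), indexing='ij')
    base = torch.stack([xs, ys], dim=-1).expand(B, H, W, 2)
    coords = base + flow
    # normalise sampling coordinates to [-1, 1] as expected by grid_sample
    grid = torch.empty_like(coords)
    grid[..., 0] = 2.0 * coords[..., 0] / (W - 1) - 1.0
    grid[..., 1] = 2.0 * coords[..., 1] / (H - 1) - 1.0
    return F.grid_sample(image, grid, mode='bilinear', align_corners=True)

# shift an image 3 pixels to the right via a constant flow field
img = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 64, 64, 2)
flow[..., 0] = -3.0    # sample from 3 pixels to the left => content moves right
print(warp(img, flow).shape)
```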

Doubly Convolutional Neural Networks

Title Doubly Convolutional Neural Networks
Authors Shuangfei Zhai, Yu Cheng, Weining Lu, Zhongfei Zhang
Abstract Building large models with parameter sharing accounts for most of the success of deep convolutional neural networks (CNNs). In this paper, we propose doubly convolutional neural networks (DCNNs), which significantly improve the performance of CNNs by further exploring this idea. Instead of allocating a set of convolutional filters that are independently learned, a DCNN maintains groups of filters where filters within each group are translated versions of each other. Practically, a DCNN can be easily implemented by a two-step convolution procedure, which is supported by most modern deep learning libraries. We perform extensive experiments on three image classification benchmarks: CIFAR-10, CIFAR-100 and ImageNet, and show that DCNNs consistently outperform other competing architectures. We have also verified that replacing a convolutional layer with a doubly convolutional layer at any depth of a CNN improves its performance. Moreover, we demonstrate various design choices of DCNNs, showing that a DCNN can serve the dual purpose of building more accurate models and/or reducing the memory footprint without sacrificing accuracy.
Tasks Image Classification
Published 2016-10-30
URL http://arxiv.org/abs/1610.09716v1
PDF http://arxiv.org/pdf/1610.09716v1.pdf
PWC https://paperswithcode.com/paper/doubly-convolutional-neural-networks
Repo
Framework
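One way to realise "groups of filters that are translated versions of each other" is to learn larger meta filters, expand every k x k crop of each meta filter into an effective filter, run a standard convolution, and pool responses within each group. The PyTorch sketch below implements that two-step view with max-pooling over translations; the sizes and the pooling variant are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoublyConv2d(nn.Module):
    """Sketch of a doubly convolutional layer: each learned z x z 'meta filter'
    yields all of its k x k translated crops as effective filters, and the
    responses within a group are max-pooled."""
    def __init__(self, in_ch, n_meta, z=5, k=3):
        super().__init__()
        self.meta = nn.Parameter(torch.randn(n_meta, in_ch, z, z) * 0.1)
        self.z, self.k = z, k
    def forward(self, x):
        n_meta, in_ch, z, k = *self.meta.shape[:2], self.z, self.k
        s = z - k + 1                                             # crops per spatial axis
        # step 1: extract all k x k translated crops of every meta filter
        crops = self.meta.unfold(2, k, 1).unfold(3, k, 1)         # (n, c, s, s, k, k)
        crops = crops.permute(0, 2, 3, 1, 4, 5).reshape(n_meta * s * s, in_ch, k, k)
        # step 2: ordinary convolution with the expanded filter bank
        out = F.conv2d(x, crops, padding=k // 2)                  # (B, n*s*s, H, W)
        B, _, H, W = out.shape
        # pool over the s*s translated filters of each group
        return out.view(B, n_meta, s * s, H, W).max(dim=2).values

layer = DoublyConv2d(in_ch=3, n_meta=16)
print(layer(torch.rand(2, 3, 32, 32)).shape)   # (2, 16, 32, 32)
```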

Unsupervised classification of children’s bodies using currents

Title Unsupervised classification of children’s bodies using currents
Authors Sonia Barahona, Ximo Gual-Arnau, Maria Victoria Ibáñez, Amelia Simó
Abstract Object classification according to shape and size is of key importance in many scientific fields. This work focuses on the case where the size and shape of an object are characterized by a current. A current is a mathematical object which has been proved relevant for modeling geometrical data, like submanifolds, through integration of vector fields along them. As a consequence of choosing a vector-valued Reproducing Kernel Hilbert Space (RKHS) as a test space for integrating manifolds, it is possible to consider that shapes are embedded in this Hilbert space. A vector-valued RKHS is a Hilbert space of vector fields; therefore, it is possible to compute a mean of shapes, or to calculate a distance between two manifolds. This embedding enables us to consider size-and-shape classification algorithms. These algorithms are applied to a 3D database obtained from an anthropometric survey of the Spanish child population, with a potential application to online sales of children’s wear.
Tasks Object Classification
Published 2016-06-06
URL http://arxiv.org/abs/1606.01746v1
PDF http://arxiv.org/pdf/1606.01746v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-classification-of-childrens
Repo
Framework
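For discretised shapes, the currents framework reduces to a closed-form kernel computation: represent each shape by segment (or triangle) centres and length-weighted tangent (or normal) vectors, and take inner products with a Gaussian scalar kernel. The NumPy sketch below does this for 2D polygonal curves as a simple stand-in for the paper's 3D body surfaces; the kernel width and the toy shapes are arbitrary.

```python
import numpy as np

def curve_current(points):
    """Represent a closed polygonal curve as a current: segment centres and
    tangent vectors scaled by segment length."""
    nxt = np.roll(points, -1, axis=0)
    return 0.5 * (points + nxt), nxt - points

def current_inner(c1, t1, c2, t2, sigma=1.0):
    """<C1, C2> in the RKHS of vector fields with a Gaussian scalar kernel:
    sum_ij k(c1_i, c2_j) <t1_i, t2_j>."""
    d2 = ((c1[:, None, :] - c2[None, :, :]) ** 2).sum(-1)
    return (np.exp(-d2 / sigma ** 2) * (t1 @ t2.T)).sum()

def current_distance(P, Q, sigma=1.0):
    cP, tP = curve_current(P)
    cQ, tQ = curve_current(Q)
    d2 = (current_inner(cP, tP, cP, tP, sigma)
          - 2 * current_inner(cP, tP, cQ, tQ, sigma)
          + current_inner(cQ, tQ, cQ, tQ, sigma))
    return np.sqrt(max(d2, 0.0))

# two discretised closed curves: a unit circle and a slightly wider ellipse
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
ellipse = np.stack([1.3 * np.cos(theta), np.sin(theta)], axis=1)
print(round(current_distance(circle, ellipse, sigma=0.5), 4))
print(round(current_distance(circle, circle, sigma=0.5), 4))   # ~0 for identical shapes
```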

k2-means for fast and accurate large scale clustering

Title k2-means for fast and accurate large scale clustering
Authors Eirikur Agustsson, Radu Timofte, Luc Van Gool
Abstract We propose k^2-means, a new clustering method which efficiently copes with large numbers of clusters and achieves low-energy solutions. k^2-means builds upon the standard k-means (Lloyd’s algorithm) and combines a new strategy to accelerate convergence with a new low-time-complexity divisive initialization. The accelerated convergence is achieved by only looking at the k_n nearest clusters and using triangle inequality bounds in the assignment step, while the divisive initialization employs an optimal 2-clustering along a direction. The worst-case time complexity per iteration of our k^2-means is O(nk_nd + k^2d), where d is the dimension of the n data points and k is the number of clusters, and usually n >> k >> k_n. Compared to the O(nkd) complexity of k-means, our k^2-means complexity is significantly lower, at the expense of a slight increase in memory complexity by O(nk_n + k^2). In our extensive experiments k^2-means is order(s) of magnitude faster than standard methods in computing accurate clusterings on several standard datasets and settings with hundreds of clusters and high-dimensional data. Moreover, the proposed divisive initialization generally leads to clustering energies comparable to those achieved with the standard k-means++ initialization, while being significantly faster.
Tasks
Published 2016-05-30
URL http://arxiv.org/abs/1605.09299v1
PDF http://arxiv.org/pdf/1605.09299v1.pdf
PWC https://paperswithcode.com/paper/k2-means-for-fast-and-accurate-large-scale
Repo
Framework
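The speed-up comes mainly from restricting the assignment step to a small candidate set of clusters. The NumPy sketch below keeps only that ingredient: each point is compared against the k_n centroids closest to its currently assigned centroid. The triangle-inequality bounds and the divisive initialization, which the paper also relies on, are omitted, so this is an illustration rather than the published algorithm.

```python
import numpy as np

def k2_means_like(X, k, k_n=5, n_iters=30, seed=0):
    """Simplified k^2-means-style Lloyd iteration: the assignment step only
    considers the k_n centroids nearest to each point's current centroid."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].copy()           # random init
    assign = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
    for _ in range(n_iters):
        # k x k centroid-to-centroid distances -> k_n nearest clusters per cluster
        cc = ((C[:, None] - C[None]) ** 2).sum(-1)
        neigh = np.argsort(cc, axis=1)[:, :k_n]                  # includes the cluster itself
        # assignment step restricted to the candidate neighbourhood
        cand = neigh[assign]                                     # (n, k_n)
        d = ((X[:, None] - C[cand]) ** 2).sum(-1)                # (n, k_n)
        assign = cand[np.arange(len(X)), d.argmin(1)]
        # update step
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                C[j] = pts.mean(axis=0)
    return C, assign

X = np.random.default_rng(1).normal(size=(2000, 16))
C, labels = k2_means_like(X, k=50, k_n=5)
print(C.shape, np.bincount(labels, minlength=50)[:10])
```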