July 27, 2019

Paper Group ANR 750

One-Shot Coresets: The Case of k-Clustering


Title	One-Shot Coresets: The Case of k-Clustering
Authors	Olivier Bachem, Mario Lucic, Silvio Lattanzi
Abstract	Scaling clustering algorithms to massive data sets is a challenging task. Recently, several successful approaches based on data summarization methods, such as coresets and sketches, were proposed. While these techniques provide provably good and small summaries, they are inherently problem dependent - the practitioner has to commit to a fixed clustering objective before even exploring the data. However, can one construct small data summaries for a wide range of clustering problems simultaneously? In this work, we affirmatively answer this question by proposing an efficient algorithm that constructs such one-shot summaries for k-clustering problems while retaining strong theoretical guarantees.
Tasks	Data Summarization
Published	2017-11-27
URL	http://arxiv.org/abs/1711.09649v3
PDF	http://arxiv.org/pdf/1711.09649v3.pdf
PWC	https://paperswithcode.com/paper/one-shot-coresets-the-case-of-k-clustering
Repo
Framework
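To make the notion of a weighted data summary concrete, here is a minimal sketch of a generic importance-sampling coreset for a k-means-style objective. It is not the one-shot construction proposed in the paper. The function name `importance_coreset`, the 50/50 mix of uniform and distance-based sampling, and the choice of the data mean as reference point are all illustrative assumptions.

```python
import random


def importance_coreset(points, m, seed=0):
    """Sample m weighted points (hypothetical helper, not the paper's algorithm).

    Each point is drawn with probability
        q(x) = 0.5 / n + 0.5 * d(x, mean)^2 / sum_of_squared_distances
    and receives weight 1 / (m * q(x)), so weighted sums over the sample
    are unbiased estimates of sums over the full data set.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(dim)]
    sq_dist = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dist)
    if total == 0.0:
        # All points coincide: fall back to uniform sampling.
        probs = [1.0 / n] * n
    else:
        probs = [0.5 / n + 0.5 * d / total for d in sq_dist]
    chosen = rng.choices(range(n), weights=probs, k=m)
    return [(points[i], 1.0 / (m * probs[i])) for i in chosen]
```

A clustering routine that accepts per-point weights can then be run on the returned pairs instead of the full data set.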

R-C3D: Region Convolutional 3D Network for Temporal Activity Detection


Title	R-C3D: Region Convolutional 3D Network for Temporal Activity Detection
Authors	Huijuan Xu, Abir Das, Kate Saenko
Abstract	We address the problem of activity detection in continuous, untrimmed video streams. This is a difficult task that requires extracting meaningful spatio-temporal features to capture activities and accurately localizing the start and end times of each activity. We introduce a new model, Region Convolutional 3D Network (R-C3D), which encodes the video streams using a three-dimensional fully convolutional network, then generates candidate temporal regions containing activities, and finally classifies selected regions into specific activities. Computation is saved due to the sharing of convolutional features between the proposal and the classification pipelines. The entire model is trained end-to-end with jointly optimized localization and classification losses. R-C3D is faster than existing methods (569 frames per second on a single Titan X Maxwell GPU) and achieves state-of-the-art results on THUMOS’14. We further demonstrate that our model is a general activity detection framework that does not rely on assumptions about particular dataset properties by evaluating our approach on ActivityNet and Charades. Our code is available at http://ai.bu.edu/r-c3d/.
Tasks	Action Detection, Activity Detection
Published	2017-03-22
URL	http://arxiv.org/abs/1703.07814v2
PDF	http://arxiv.org/pdf/1703.07814v2.pdf
PWC	https://paperswithcode.com/paper/r-c3d-region-convolutional-3d-network-for
Repo
Framework

Robust Photometric Stereo via Dictionary Learning


Title	Robust Photometric Stereo via Dictionary Learning
Authors	Andrew J. Wagenmaker, Brian E. Moore, Raj Rao Nadakuditi
Abstract	Photometric stereo is a method that seeks to reconstruct the normal vectors of an object from a set of images of the object illuminated under different light sources. While effective in some situations, classical photometric stereo relies on a diffuse surface model that cannot handle objects with complex reflectance patterns, and it is sensitive to non-idealities in the images. In this work, we propose a novel approach to photometric stereo that relies on dictionary learning to produce robust normal vector reconstructions. Specifically, we develop two formulations for applying dictionary learning to photometric stereo. We propose a model that applies dictionary learning to regularize and reconstruct the normal vectors from the images under the classic Lambertian reflectance model. We then generalize this model to explicitly model non-Lambertian objects. We investigate both approaches through extensive experimentation on synthetic and real benchmark datasets and observe state-of-the-art performance compared to existing robust photometric stereo methods.
Tasks	Dictionary Learning
Published	2017-10-24
URL	http://arxiv.org/abs/1710.08873v3
PDF	http://arxiv.org/pdf/1710.08873v3.pdf
PWC	https://paperswithcode.com/paper/robust-photometric-stereo-via-dictionary
Repo
Framework
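For context, the classical Lambertian baseline that the abstract contrasts with can be sketched in a few lines. Under the Lambertian model a pixel's intensity under light direction l is albedo * (l . n), so three non-coplanar lights determine the scaled normal by solving a 3x3 linear system. This is only the textbook baseline for a single pixel, not the dictionary-learning method of the paper. The function names are hypothetical.

```python
import math


def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def lambertian_normal(lights, intensities):
    """Recover (unit normal, albedo) for one pixel from exactly three lights.

    lights: three light vectors (rows), possibly scaled by source intensity.
    intensities: the three observed pixel intensities.
    Solves lights @ g = intensities with Cramer's rule, where g = albedo * n.
    """
    d = det3(lights)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    g = []
    for col in range(3):
        m = [row[:] for row in lights]
        for r in range(3):
            m[r][col] = intensities[r]
        g.append(det3(m) / d)
    albedo = math.sqrt(sum(x * x for x in g))
    if albedo == 0.0:
        raise ValueError("pixel is completely dark; normal is undefined")
    return [x / albedo for x in g], albedo
```

Because this solve trusts every measurement equally, shadows and specular highlights corrupt the result directly, which is the sensitivity the abstract refers to.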

Non-rigid image registration using fully convolutional networks with deep self-supervision


Title	Non-rigid image registration using fully convolutional networks with deep self-supervision
Authors	Hongming Li, Yong Fan
Abstract	We propose a novel non-rigid image registration algorithm that is built upon fully convolutional networks (FCNs) to optimize and learn spatial transformations between pairs of images to be registered. Different from most existing deep learning based image registration methods that learn spatial transformations from training data with known corresponding spatial transformations, our method directly estimates spatial transformations between pairs of images by maximizing an image-wise similarity metric between fixed and deformed moving images, similar to conventional image registration algorithms. At the same time, our method also learns FCNs for encoding the spatial transformations at the same spatial resolution of images to be registered, rather than learning coarse-grained spatial transformation information. The image registration is implemented in a multi-resolution image registration framework to jointly optimize and learn spatial transformations and FCNs at different resolutions with deep self-supervision through typical feedforward and backpropagation computation. Since our method simultaneously optimizes and learns spatial transformations for the image registration, our method can be directly used to register a pair of images, and the registration of a set of images is also a training procedure for FCNs so that the trained FCNs can be directly adopted to register new images by feedforward computation of the learned FCNs without any optimization. The proposed method has been evaluated for registering 3D structural brain magnetic resonance (MR) images and obtained better performance than state-of-the-art image registration algorithms.
Tasks	Image Registration
Published	2017-09-04
URL	http://arxiv.org/abs/1709.00799v1
PDF	http://arxiv.org/pdf/1709.00799v1.pdf
PWC	https://paperswithcode.com/paper/non-rigid-image-registration-using-fully
Repo
Framework

An Improved Oscillating-Error Classifier with Branching


Title	An Improved Oscillating-Error Classifier with Branching
Authors	Kieran Greer
Abstract	This paper extends the earlier work on an oscillating error correction technique. Specifically, it extends the design to include further corrections, by adding new layers to the classifier through a branching method. This technique is still consistent with earlier work and also neural networks in general. With this extended design, the classifier can now achieve the high levels of accuracy reported previously.
Tasks
Published	2017-11-19
URL	https://arxiv.org/abs/1711.07042v3
PDF	https://arxiv.org/pdf/1711.07042v3.pdf
PWC	https://paperswithcode.com/paper/an-improved-oscillating-error-classifier-with
Repo
Framework

Optimal Rates for Learning with Nyström Stochastic Gradient Methods


Title	Optimal Rates for Learning with Nyström Stochastic Gradient Methods
Authors	Junhong Lin, Lorenzo Rosasco
Abstract	In the setting of nonparametric regression, we propose and study a combination of stochastic gradient methods with Nyström subsampling, allowing multiple passes over the data and mini-batches. Generalization error bounds for the studied algorithm are provided. Particularly, optimal learning rates are derived considering different possible choices of the step-size, the mini-batch size, the number of iterations/passes, and the subsampling level. In comparison with state-of-the-art algorithms such as the classic stochastic gradient methods and kernel ridge regression with Nyström, the studied algorithm has advantages on the computational complexity, while achieving the same optimal learning rates. Moreover, our results indicate that using mini-batches can reduce the total computational cost while achieving the same optimal statistical results.
Tasks
Published	2017-10-21
URL	http://arxiv.org/abs/1710.07797v1
PDF	http://arxiv.org/pdf/1710.07797v1.pdf
PWC	https://paperswithcode.com/paper/optimal-rates-for-learning-with-nystrom
Repo
Framework
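The ingredients named in the abstract (a subsampled set of landmark points, mini-batches, multiple passes, a step size) can be illustrated with a small sketch. It restricts the regression function to f(x) = sum_j alpha_j k(x, z_j) over m randomly chosen landmarks z_j and takes plain Euclidean mini-batch gradient steps on the coefficients alpha. This is a simplification for illustration and is not claimed to be the exact update analyzed in the paper. The Gaussian kernel, its width, and all default parameter values are assumptions.

```python
import math
import random


def gaussian_kernel(x, z, sigma=0.2):
    """Gaussian kernel on scalars (kernel choice and width are assumptions)."""
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))


def predict(alpha, landmarks, x):
    return sum(a * gaussian_kernel(x, z) for a, z in zip(alpha, landmarks))


def mse(alpha, landmarks, xs, ys):
    return sum((predict(alpha, landmarks, x) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


def nystrom_sgd(xs, ys, m=5, batch_size=10, passes=50, step=0.2, seed=0):
    """Mini-batch SGD on least squares over the span of m landmark kernels."""
    rng = random.Random(seed)
    landmarks = rng.sample(xs, m)      # subsampling level m
    alpha = [0.0] * m
    indices = list(range(len(xs)))
    for _ in range(passes):            # multiple passes over the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad = [0.0] * m
            for i in batch:
                feats = [gaussian_kernel(xs[i], z) for z in landmarks]
                residual = sum(a * f for a, f in zip(alpha, feats)) - ys[i]
                for j in range(m):
                    grad[j] += residual * feats[j] / len(batch)
            alpha = [a - step * g for a, g in zip(alpha, grad)]
    return alpha, landmarks
```

Each update touches only m kernel evaluations per sample rather than one per training point, which is where the computational saving of subsampling comes from.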

Approximation and Convergence Properties of Generative Adversarial Learning


Title	Approximation and Convergence Properties of Generative Adversarial Learning
Authors	Shuang Liu, Olivier Bousquet, Kamalika Chaudhuri
Abstract	Generative adversarial networks (GAN) approximate a target data distribution by jointly optimizing an objective function through a “two-player game” between a generator and a discriminator. Despite their empirical success, however, two very basic questions on how well they can approximate the target distribution remain unanswered. First, it is not known how restricting the discriminator family affects the approximation quality. Second, while a number of different objective functions have been proposed, we do not understand when convergence to the global minima of the objective function leads to convergence to the target distribution under various notions of distributional convergence. In this paper, we address these questions in a broad and unified setting by defining a notion of adversarial divergences that includes a number of recently proposed objective functions. We show that if the objective function is an adversarial divergence with some additional conditions, then using a restricted discriminator family has a moment-matching effect. Additionally, we show that for objective functions that are strict adversarial divergences, convergence in the objective function implies weak convergence, thus generalizing previous results.
Tasks
Published	2017-05-24
URL	http://arxiv.org/abs/1705.08991v1
PDF	http://arxiv.org/pdf/1705.08991v1.pdf
PWC	https://paperswithcode.com/paper/approximation-and-convergence-properties-of
Repo
Framework

Stochastic Particle Gradient Descent for Infinite Ensembles


Title	Stochastic Particle Gradient Descent for Infinite Ensembles
Authors	Atsushi Nitanda, Taiji Suzuki
Abstract	The superior performance of ensemble methods with infinite models is well known. Most of these methods are based on optimization problems in infinite-dimensional spaces with some regularization, for instance, boosting methods and convex neural networks use $L^1$-regularization with the non-negative constraint. However, due to the difficulty of handling $L^1$-regularization, these problems require early stopping or a rough approximation to solve them inexactly. In this paper, we propose a new ensemble learning method that performs in a space of probability measures, that is, our method can handle the $L^1$-constraint and the non-negative constraint in a rigorous way. Such an optimization is realized by proposing a general purpose stochastic optimization method for learning probability measures via parameterization using transport maps on base models. As a result of running the method, a transport map to output an infinite ensemble is obtained, which forms a residual-type network. From the perspective of functional gradient methods, we give a convergence rate as fast as that of a stochastic optimization method for finite dimensional nonconvex problems. Moreover, we show an interior optimality property of a local optimality condition used in our analysis.
Tasks	Stochastic Optimization
Published	2017-12-14
URL	http://arxiv.org/abs/1712.05438v1
PDF	http://arxiv.org/pdf/1712.05438v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-particle-gradient-descent-for
Repo
Framework

Generating Long-term Trajectories Using Deep Hierarchical Networks


Title	Generating Long-term Trajectories Using Deep Hierarchical Networks
Authors	Stephan Zheng, Yisong Yue, Patrick Lucey
Abstract	We study the problem of modeling spatiotemporal trajectories over long time horizons using expert demonstrations. For instance, in sports, agents often choose action sequences with long-term goals in mind, such as achieving a certain strategic position. Conventional policy learning approaches, such as those based on Markov decision processes, generally fail at learning cohesive long-term behavior in such high-dimensional state spaces, and are only effective when myopic modeling leads to the desired behavior. The key difficulty is that conventional approaches are “shallow” models that only learn a single state-action policy. We instead propose a hierarchical policy class that automatically reasons about both long-term and short-term goals, which we instantiate as a hierarchical neural network. We showcase our approach in a case study on learning to imitate demonstrated basketball trajectories, and show that it generates significantly more realistic trajectories compared to non-hierarchical baselines as judged by professional sports analysts.
Tasks
Published	2017-06-21
URL	http://arxiv.org/abs/1706.07138v1
PDF	http://arxiv.org/pdf/1706.07138v1.pdf
PWC	https://paperswithcode.com/paper/generating-long-term-trajectories-using-deep
Repo
Framework

Super-sparse Learning in Similarity Spaces


Title	Super-sparse Learning in Similarity Spaces
Authors	Ambra Demontis, Marco Melis, Battista Biggio, Giorgio Fumera, Fabio Roli
Abstract	In several applications, input samples are more naturally represented in terms of similarities between each other, rather than in terms of feature vectors. In these settings, machine-learning algorithms can become very computationally demanding, as they may require matching the test samples against a very large set of reference prototypes. To mitigate this issue, different approaches have been developed to reduce the number of required reference prototypes. Current reduction approaches select a small subset of representative prototypes in the space induced by the similarity measure, and then separately train the classification function on the reduced subset. However, decoupling these two steps may not allow reducing the number of prototypes effectively without compromising accuracy. We overcome this limitation by jointly learning the classification function along with an optimal set of virtual prototypes, whose number can be either fixed a priori or optimized according to application-specific criteria. Creating a super-sparse set of virtual prototypes provides much sparser solutions, drastically reducing complexity at test time, at the expense of a slightly increased complexity during training. A much smaller set of prototypes also results in easier-to-interpret decisions. We empirically show that our approach can reduce the test-time complexity of Support Vector Machines, LASSO and ridge regression by up to ten times, almost without affecting their classification accuracy.
Tasks	Sparse Learning
Published	2017-12-17
URL	http://arxiv.org/abs/1712.06131v1
PDF	http://arxiv.org/pdf/1712.06131v1.pdf
PWC	https://paperswithcode.com/paper/super-sparse-learning-in-similarity-spaces
Repo
Framework

The Shape of a Benedictine Monastery: The SaintGall Ontology (Extended Version)


Title	The Shape of a Benedictine Monastery: The SaintGall Ontology (Extended Version)
Authors	Claudia Cantale, Domenico Cantone, Manuela Lupica Rinato, Marianna Nicolosi-Asmundo, Daniele Francesco Santamaria
Abstract	We present an OWL 2 ontology representing the Saint Gall plan, one of the most ancient documents to have reached us intact, which describes the ideal model of a Benedictine monastic complex that inspired the design of many European monasteries.
Tasks
Published	2017-09-08
URL	http://arxiv.org/abs/1709.02618v5
PDF	http://arxiv.org/pdf/1709.02618v5.pdf
PWC	https://paperswithcode.com/paper/the-shape-of-a-benedictine-monastery-the
Repo
Framework

Compressive Statistical Learning with Random Feature Moments


Title	Compressive Statistical Learning with Random Feature Moments
Authors	Rémi Gribonval, Gilles Blanchard, Nicolas Keriven, Yann Traonmilin
Abstract	We describe a general framework, compressive statistical learning, for resource-efficient large-scale learning: the training collection is compressed in one pass into a low-dimensional sketch (a vector of random empirical generalized moments) that captures the information relevant to the considered learning task. A near-minimizer of the risk is computed from the sketch through the solution of a nonlinear least squares problem. We investigate sufficient sketch sizes to control the generalization error of this procedure. The framework is illustrated on compressive clustering, compressive Gaussian mixture modeling with fixed known variance, and compressive PCA.
Tasks
Published	2017-06-22
URL	http://arxiv.org/abs/1706.07180v2
PDF	http://arxiv.org/pdf/1706.07180v2.pdf
PWC	https://paperswithcode.com/paper/compressive-statistical-learning-with-random
Repo
Framework
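The one-pass compression step described in the abstract can be sketched as follows, using random Fourier features as the generalized moments. The choice of Fourier features, the Gaussian frequency distribution, and the function names are illustrative assumptions. The learning step (solving the nonlinear least squares problem on the sketch) is not shown.

```python
import math
import random


def draw_frequencies(m, dim, scale=1.0, seed=0):
    """Draw m random frequency vectors (Gaussian distribution is an assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(m)]


def sketch(points, freqs):
    """Single pass over the data: average cos and sin of random projections.

    Returns (sketch vector of length 2 * m, number of points seen).
    """
    m = len(freqs)
    acc = [0.0] * (2 * m)
    n = 0
    for x in points:
        n += 1
        for j, w in enumerate(freqs):
            phase = sum(wi * xi for wi, xi in zip(w, x))
            acc[j] += math.cos(phase)
            acc[m + j] += math.sin(phase)
    if n == 0:
        raise ValueError("cannot sketch an empty collection")
    return [a / n for a in acc], n


def merge(sketch_a, n_a, sketch_b, n_b):
    """Combine sketches of two disjoint data chunks into one."""
    total = n_a + n_b
    merged = [(n_a * a + n_b * b) / total for a, b in zip(sketch_a, sketch_b)]
    return merged, total
```

Because the sketch is an empirical average, chunks of a stream or shards on different machines can be sketched independently and merged, and the sketch size depends on m rather than on the number of training points.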

Object-Oriented Knowledge Extraction using Universal Exploiters


Title	Object-Oriented Knowledge Extraction using Universal Exploiters
Authors	Dmytro Terletskyi
Abstract	This paper analyzes and extends exploiter-based knowledge extraction methods, which allow new knowledge to be generated from basic knowledge. The main achievement of the paper is a proof of useful features of some universal exploiters, which allow the set of basic classes and the set of basic relations to be extended by a finite set of new classes of objects and relations among them, making it possible to create a complete lattice. The proposed approach makes it possible to compute the number of new classes that can be generated with it and the number of different types that each of the obtained classes describes; to construct a defined hierarchy of classes with a determined subsumption relation; to avoid some problems of inheritance; and to restore basic knowledge within the database more efficiently.
Tasks
Published	2017-09-08
URL	http://arxiv.org/abs/1709.02642v1
PDF	http://arxiv.org/pdf/1709.02642v1.pdf
PWC	https://paperswithcode.com/paper/object-oriented-knowledge-extraction-using
Repo
Framework

A Statistical Model for Simultaneous Template Estimation, Bias Correction, and Registration of 3D Brain Images


Title	A Statistical Model for Simultaneous Template Estimation, Bias Correction, and Registration of 3D Brain Images
Authors	Akshay Pai, Stefan Sommer, Lars Lau Raket, Line Kühnel, Sune Darkner, Lauge Sørensen, Mads Nielsen
Abstract	Template estimation plays a crucial role in computational anatomy since it provides reference frames for performing statistical analysis of the underlying anatomical population variability. While building models for template estimation, variability in sites and image acquisition protocols needs to be accounted for. To account for such variability, we propose a generative template estimation model that makes simultaneous inference of bias fields in individual images, deformations for image registration, and variance hyperparameters. In contrast, existing maximum a posteriori based methods need to rely on either bias-invariant similarity measures or robust image normalization. Results on synthetic and real brain MRI images demonstrate the capability of the model to capture heterogeneity in intensities and provide a reliable template estimation from registration.
Tasks	Image Registration
Published	2017-05-01
URL	http://arxiv.org/abs/1705.00432v1
PDF	http://arxiv.org/pdf/1705.00432v1.pdf
PWC	https://paperswithcode.com/paper/a-statistical-model-for-simultaneous-template
Repo
Framework

Network Dissection: Quantifying Interpretability of Deep Visual Representations


Title	Network Dissection: Quantifying Interpretability of Deep Visual Representations
Authors	David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, Antonio Torralba
Abstract	We propose a general framework called Network Dissection for quantifying the interpretability of latent representations of CNNs by evaluating the alignment between individual hidden units and a set of semantic concepts. Given any CNN model, the proposed method draws on a broad data set of visual concepts to score the semantics of hidden units at each intermediate convolutional layer. The units with semantics are given labels across a range of objects, parts, scenes, textures, materials, and colors. We use the proposed method to test the hypothesis that interpretability of units is equivalent to random linear combinations of units, then we apply our method to compare the latent representations of various networks when trained to solve different supervised and self-supervised training tasks. We further analyze the effect of training iterations, compare networks trained with different initializations, examine the impact of network depth and width, and measure the effect of dropout and batch normalization on the interpretability of deep visual representations. We demonstrate that the proposed method can shed light on characteristics of CNN models and training methods that go beyond measurements of their discriminative power.
Tasks
Published	2017-04-19
URL	http://arxiv.org/abs/1704.05796v1
PDF	http://arxiv.org/pdf/1704.05796v1.pdf
PWC	https://paperswithcode.com/paper/network-dissection-quantifying
Repo
Framework
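The alignment between a hidden unit and a semantic concept mentioned in the abstract can be scored with an intersection-over-union measure between the unit's thresholded activation maps and the concept's segmentation masks. The sketch below assumes that the activation maps have already been resized to the mask resolution and flattened, and that the threshold (for example a high quantile of the unit's activations) is supplied by the caller. The function name and these conventions are assumptions, not details taken from the abstract.

```python
def unit_concept_iou(activations, concept_masks, threshold):
    """IoU between one unit's thresholded activations and one concept.

    activations: list of flattened activation maps, one per image.
    concept_masks: list of flattened boolean masks of the same sizes.
    threshold: activations strictly above this value count as "unit fires".
    Intersection and union are accumulated over the whole image collection.
    """
    intersection = 0
    union = 0
    for act_map, mask in zip(activations, concept_masks):
        for a, c in zip(act_map, mask):
            active = a > threshold
            intersection += int(bool(active and c))
            union += int(bool(active or c))
    return intersection / union if union else 0.0
```

A unit can then be labeled with the concept that gives it the highest score, provided that score exceeds a chosen cutoff.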