Paper Group ANR 104
Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering
Title | Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering |
Authors | Xiaopeng Li, Zhourong Chen, Leonard K. M. Poon, Nevin L. Zhang |
Abstract | We investigate a variant of variational autoencoders where there is a superstructure of discrete latent variables on top of the latent features. In general, our superstructure is a tree structure of multiple super latent variables and it is automatically learned from data. When there is only one latent variable in the superstructure, our model reduces to one that assumes the latent features to be generated from a Gaussian mixture model. We call our model the latent tree variational autoencoder (LTVAE). Whereas previous deep learning methods for clustering produce only one partition of data, LTVAE produces multiple partitions of data, each being given by one super latent variable. This is desirable because high dimensional data usually have many different natural facets and can be meaningfully partitioned in multiple ways. |
Tasks | |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05206v3 |
http://arxiv.org/pdf/1803.05206v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-latent-superstructures-in |
Repo | |
Framework | |
A Fuzzy-Rough based Binary Shuffled Frog Leaping Algorithm for Feature Selection
Title | A Fuzzy-Rough based Binary Shuffled Frog Leaping Algorithm for Feature Selection |
Authors | Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn |
Abstract | Feature selection and attribute reduction are crucial problems, and widely used techniques in the field of machine learning, data mining and pattern recognition to overcome the well-known phenomenon of the Curse of Dimensionality, by either selecting a subset of features or removing unrelated ones. This paper presents a new feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we have employed the fuzzy-rough dependency degree (FRFDD) in the lower approximation-based fuzzy-rough feature selection (L-FRFS) due to its effectiveness in feature selection. As for the search strategy, a new version of a binary shuffled frog leaping algorithm is proposed (B-SFLA). The new feature selection method is obtained by hybridizing the B-SFLA with the FRFDD. Non-parametric statistical tests are conducted to compare the proposed approach with several existing methods over twenty-two datasets, including nine high dimensional and large ones, from the UCI repository. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy. |
Tasks | Feature Selection |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1808.00068v1 |
http://arxiv.org/pdf/1808.00068v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fuzzy-rough-based-binary-shuffled-frog |
Repo | |
Framework | |
Long-term Large-scale Mapping and Localization Using maplab
Title | Long-term Large-scale Mapping and Localization Using maplab |
Authors | Marcin Dymczyk, Marius Fehr, Thomas Schneider, Roland Siegwart |
Abstract | This paper discusses a large-scale and long-term mapping and localization scenario using the maplab open-source framework. We present a brief overview of the specific algorithms in the system that enable building a consistent map from multiple sessions. We then demonstrate that such a map can be reused even a few months later for efficient 6-DoF localization and also new trajectories can be registered within the existing 3D model. The datasets presented in this paper are made publicly available. |
Tasks | |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10994v1 |
http://arxiv.org/pdf/1805.10994v1.pdf | |
PWC | https://paperswithcode.com/paper/long-term-large-scale-mapping-and |
Repo | |
Framework | |
Volumetric performance capture from minimal camera viewpoints
Title | Volumetric performance capture from minimal camera viewpoints |
Authors | Andrew Gilbert, Marco Volino, John Collomosse, Adrian Hilton |
Abstract | We present a convolutional autoencoder that enables high fidelity volumetric reconstructions of human performance to be captured from multi-view video comprising only a small set of camera views. Our method yields similar end-to-end reconstruction error to that of a probabilistic visual hull computed using significantly more (double or more) viewpoints. We use a deep prior implicitly learned by the autoencoder trained over a dataset of view-ablated multi-view video footage of a wide range of subjects and actions. This opens up the possibility of high-end volumetric performance capture in on-set and prosumer scenarios where time or cost prohibit a high witness camera count. |
Tasks | |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.01950v2 |
http://arxiv.org/pdf/1807.01950v2.pdf | |
PWC | https://paperswithcode.com/paper/volumetric-performance-capture-from-minimal |
Repo | |
Framework | |
Infill Criterion for Multimodal Model-Based Optimisation
Title | Infill Criterion for Multimodal Model-Based Optimisation |
Authors | Dirk Surmann, Uwe Ligges, Claus Weihs |
Abstract | Physical systems are modelled and investigated within simulation software in an increasing range of applications. In reality an investigation of the system is often performed by empirical test scenarios which are related to typical situations. Our aim is to derive a method which generates diverse test scenarios each representing a challenging situation for the corresponding physical system. From a mathematical point of view challenging test scenarios correspond to local optima. Hence, we focus on identifying all local optima within mathematical functions. Because simulation runs are usually expensive, we use the model-based optimisation approach with its well-known representative, efficient global optimisation. We derive an infill criterion which focuses on the identification of local optima. The criterion is checked via fifteen different artificial functions in a computer experiment. Our new infill criterion performs better in identifying local optima compared to the expected improvement infill criterion and Latin Hypercube Samples. |
Tasks | |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02118v1 |
http://arxiv.org/pdf/1810.02118v1.pdf | |
PWC | https://paperswithcode.com/paper/infill-criterion-for-multimodal-model-based |
Repo | |
Framework | |
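The expected improvement criterion that the abstract above uses as a baseline has a standard closed form when the surrogate model gives a Gaussian prediction. A minimal sketch for minimisation follows; the function name and arguments are illustrative assumptions, and this is the textbook baseline, not the new infill criterion proposed in the paper.

```python
import math


def expected_improvement(mu, sigma, f_min):
    """Textbook expected improvement for a minimisation problem.

    mu, sigma: predictive mean and standard deviation of the surrogate
    model at a candidate point (hypothetical inputs).
    f_min: best objective value observed so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_min - mu, 0.0)
    z = (f_min - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (f_min - mu) * cdf + sigma * pdf
```

The criterion is large where the predicted mean is low or the uncertainty is high, which is why it tends to concentrate on the global optimum rather than on every local one.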
SMarTplan: a Task Planner for Smart Factories
Title | SMarTplan: a Task Planner for Smart Factories |
Authors | Arthur Bit-Monnot, Francesco Leofante, Luca Pulina, Erika Abraham, Armando Tacchella |
Abstract | Smart factories are on the verge of becoming the new industrial paradigm, wherein optimization permeates all aspects of production, from concept generation to sales. To fully pursue this paradigm, flexibility in the production means as well as in their timely organization is of paramount importance. AI is playing a major role in this transition, but the scenarios encountered in practice might be challenging for current tools. Task planning is one example where AI enables more efficient and flexible operation through an online automated adaptation and rescheduling of the activities to cope with new operational constraints and demands. In this paper we present SMarTplan, a task planner specifically conceived to deal with real-world scenarios in the emerging smart factory paradigm. Including both special-purpose and general-purpose algorithms, SMarTplan is based on current automated reasoning technology and it is designed to tackle complex application domains. In particular, we show its effectiveness on a logistic scenario, by comparing its specialized version with the general purpose one, and extending the comparison to other state-of-the-art task planners. |
Tasks | |
Published | 2018-06-19 |
URL | http://arxiv.org/abs/1806.07135v1 |
http://arxiv.org/pdf/1806.07135v1.pdf | |
PWC | https://paperswithcode.com/paper/smartplan-a-task-planner-for-smart-factories |
Repo | |
Framework | |
Study and development of a Computer-Aided Diagnosis system for classification of chest x-ray images using convolutional neural networks pre-trained for ImageNet and data augmentation
Title | Study and development of a Computer-Aided Diagnosis system for classification of chest x-ray images using convolutional neural networks pre-trained for ImageNet and data augmentation |
Authors | Vinicius Pavanelli Vianna |
Abstract | Convolutional neural networks (ConvNets) are the current standard for image recognition and classification. In the present work we develop a Computer-Aided Diagnosis (CAD) system using ConvNets to classify a dataset of chest x-ray images into two groups: Normal and Pneumonia. The study uses ConvNet models available on the PyTorch platform: AlexNet, SqueezeNet, ResNet and Inception. We initially use three training styles: training completely from scratch with random initialization, using a pre-trained ImageNet model and training only the last layer adapted to our problem (transfer learning), and using a modified pre-trained model and training all of its classifying layers (fine tuning). The last training strategy uses data augmentation techniques, which avoid overfitting problems in ConvNets and yield the best results in this study. |
Tasks | Data Augmentation, Transfer Learning |
Published | 2018-06-03 |
URL | http://arxiv.org/abs/1806.00839v1 |
http://arxiv.org/pdf/1806.00839v1.pdf | |
PWC | https://paperswithcode.com/paper/study-and-development-of-a-computer-aided |
Repo | |
Framework | |
Discovering Topical Interactions in Text-based Cascades using Hidden Markov Hawkes Processes
Title | Discovering Topical Interactions in Text-based Cascades using Hidden Markov Hawkes Processes |
Authors | Srikanta Bedathur, Indrajit Bhattacharya, Jayesh Choudhari, Anirban Dasgupta |
Abstract | Social media conversations unfold based on complex interactions between users, topics and time. While recent models have been proposed to capture network strengths between users, users’ topical preferences and temporal patterns between posting and response times, interaction patterns between topics have not been studied. We propose the Hidden Markov Hawkes Process (HMHP) that incorporates topical Markov Chains within Hawkes processes to jointly model topical interactions along with user-user and user-topic patterns. We propose a Gibbs sampling algorithm for HMHP that jointly infers the network strengths, diffusion paths, the topics of the posts as well as the topic-topic interactions. We show using experiments on real and semi-synthetic data that HMHP is able to generalize better and recover the network strengths, topics and diffusion paths more accurately than state-of-the-art baselines. More interestingly, HMHP finds insightful interactions between topics in real tweets which no existing model is able to do. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04487v1 |
http://arxiv.org/pdf/1809.04487v1.pdf | |
PWC | https://paperswithcode.com/paper/discovering-topical-interactions-in-text |
Repo | |
Framework | |
Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods
Title | Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods |
Authors | Robert M. Freund, Paul Grigas, Rahul Mazumder |
Abstract | Logistic regression is one of the most popular methods in binary classification, wherein estimation of model parameters is carried out by solving the maximum likelihood (ML) optimization problem, and the ML estimator is defined to be the optimal solution of this problem. It is well known that the ML estimator exists when the data is non-separable, but fails to exist when the data is separable. First-order methods are the algorithms of choice for solving large-scale instances of the logistic regression problem. In this paper, we introduce a pair of condition numbers that measure the degree of non-separability or separability of a given dataset in the setting of binary classification, and we study how these condition numbers relate to and inform the properties and the convergence guarantees of first-order methods. When the training data is non-separable, we show that the degree of non-separability naturally enters the analysis and informs the properties and convergence guarantees of two standard first-order methods: steepest descent (for any given norm) and stochastic gradient descent. Expanding on the work of Bach, we also show how the degree of non-separability enters into the analysis of linear convergence of steepest descent (without needing strong convexity), as well as the adaptive convergence of stochastic gradient descent. When the training data is separable, first-order methods rather curiously have good empirical success, which is not well understood in theory. In the case of separable data, we demonstrate how the degree of separability enters into the analysis of $\ell_2$ steepest descent and stochastic gradient descent for delivering approximate-maximum-margin solutions with associated computational guarantees as well. This suggests that first-order methods can lead to statistically meaningful solutions in the separable case, even though the ML solution does not exist. |
Tasks | |
Published | 2018-10-20 |
URL | http://arxiv.org/abs/1810.08727v1 |
http://arxiv.org/pdf/1810.08727v1.pdf | |
PWC | https://paperswithcode.com/paper/condition-number-analysis-of-logistic |
Repo | |
Framework | |
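The claim in the abstract above that the ML estimator fails to exist on separable data can be seen on a toy problem. The sketch below is an illustration only, not the paper's analysis: it assumes the two-point dataset x = [-1, 1], y = [-1, +1], for which the average logistic loss reduces to log(1 + exp(-w)), and runs plain gradient descent. The loss keeps decreasing while the weight grows without bound, so no finite minimiser is reached.

```python
import math


def logistic_loss(w):
    """Average logistic loss on the toy separable dataset x=[-1, 1], y=[-1, +1]."""
    return math.log1p(math.exp(-w))


def gradient_descent_1d(steps, step_size=1.0):
    """Run gradient descent on logistic_loss starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        grad = -1.0 / (1.0 + math.exp(w))  # derivative of log(1 + exp(-w))
        w -= step_size * grad
    return w


w_short = gradient_descent_1d(100)
w_long = gradient_descent_1d(1000)
print(w_short, w_long)  # the second value is larger: the iterate keeps growing
```

The direction of the iterate is still meaningful here (it separates the two points), which matches the abstract's remark that first-order methods can return useful solutions even when the ML solution does not exist.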
Nonconvex and Nonsmooth Sparse Optimization via Adaptively Iterative Reweighted Methods
Title | Nonconvex and Nonsmooth Sparse Optimization via Adaptively Iterative Reweighted Methods |
Authors | Hao Wang, Fan Zhang, Qiong Wu, Yaohua Hu, Yuanming Shi |
Abstract | We present a general formulation of nonconvex and nonsmooth sparse optimization problems with a convex-set constraint, which takes into account most existing types of nonconvex sparsity-inducing terms. It thus brings strong applicability to a wide range of applications. We further design a general algorithmic framework of adaptively iterative reweighted algorithms for solving the nonconvex and nonsmooth sparse optimization problems. This is achieved by solving a sequence of weighted convex penalty subproblems with adaptively updated weights. The first-order optimality condition is then derived and the global convergence results are provided under loose assumptions. This makes our theoretical results a practical tool for analyzing a family of various iteratively reweighted algorithms. In particular, for the iteratively reweighted $\ell_1$-algorithm, global convergence analysis is provided for cases with a diminishing relaxation parameter. For the iteratively reweighted $\ell_2$-algorithm, an adaptively decreasing relaxation parameter is applicable and the existence of a cluster point of the algorithm is established. The effectiveness and efficiency of our proposed formulation and the algorithms are demonstrated in numerical experiments in various sparse optimization problems. |
Tasks | |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10167v1 |
http://arxiv.org/pdf/1810.10167v1.pdf | |
PWC | https://paperswithcode.com/paper/nonconvex-and-nonsmooth-sparse-optimization |
Repo | |
Framework | |
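The idea of solving a sequence of weighted convex subproblems with a diminishing relaxation parameter can be sketched on a simple denoising problem. The example below is an assumption-laden illustration, not the authors' algorithm: it uses the log penalty, the objective 0.5 * ||x - y||^2 + lam * sum(log(|x_i| + eps)), and a geometric schedule for eps, all chosen here for convenience. Each weighted l1 subproblem then separates per coordinate and is solved by soft thresholding.

```python
def soft_threshold(value, threshold):
    """Closed-form minimiser of 0.5 * (x - value)**2 + threshold * |x|."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def reweighted_l1_denoise(y, lam=0.1, eps=1.0, decay=0.5, eps_min=1e-3, iters=30):
    """Iteratively reweighted l1 sketch for the log-penalised denoising problem."""
    x = list(y)
    for _ in range(iters):
        # Linearise the log penalty at the current iterate: weight = 1 / (|x_i| + eps).
        weights = [1.0 / (abs(xi) + eps) for xi in x]
        # Solve the weighted l1 subproblem coordinate by coordinate.
        x = [soft_threshold(yi, lam * wi) for yi, wi in zip(y, weights)]
        # Diminishing relaxation parameter (this schedule is an assumption).
        eps = max(eps * decay, eps_min)
    return x


x_hat = reweighted_l1_denoise([3.0, 0.05, -2.0, 0.01])
print(x_hat)  # small entries are set exactly to zero, large ones are only slightly shrunk
```

Small coordinates receive large weights and are pushed to exactly zero, while large coordinates receive small weights and are barely shrunk, which is the behaviour that makes reweighting attractive compared with a single l1 penalty.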
Can Neural Machine Translation be Improved with User Feedback?
Title | Can Neural Machine Translation be Improved with User Feedback? |
Authors | Julia Kreutzer, Shahram Khadivi, Evgeny Matusov, Stefan Riezler |
Abstract | We present the first real-world application of methods for improving neural machine translation (NMT) with human reinforcement, based on explicit and implicit user feedback collected on the eBay e-commerce platform. Previous work has been confined to simulation experiments, whereas in this paper we work with real logged feedback for offline bandit learning of NMT parameters. We conduct a thorough analysis of the available explicit user judgments—five-star ratings of translation quality—and show that they are not reliable enough to yield significant improvements in bandit learning. In contrast, we successfully utilize implicit task-based feedback collected in a cross-lingual search task to improve task-specific and machine translation quality metrics. |
Tasks | Machine Translation |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05958v1 |
http://arxiv.org/pdf/1804.05958v1.pdf | |
PWC | https://paperswithcode.com/paper/can-neural-machine-translation-be-improved |
Repo | |
Framework | |
Styling with Attention to Details
Title | Styling with Attention to Details |
Authors | Ayushi Dalmia, Sachindra Joshi, Raghavendra Singh, Vikas Raykar |
Abstract | Fashion, by its nature, is driven by style. In this paper, we propose a method that takes into account the style information to complete a given set of selected fashion items with a complementary fashion item. Complementary items are those items that can be worn along with the selected items according to the style. Addressing this problem facilitates the automatic generation of stylish fashion ensembles, leading to a richer shopping experience for users. Recently, there has been a surge of online social websites where fashion enthusiasts post the outfit of the day and other users can like and comment on them. These posts contain a gold-mine of information about style. In this paper, we exploit these posts to train a deep neural network which captures style in an automated manner. We pose the problem of predicting complementary fashion items as a sequence-to-sequence problem where the input is the selected set of fashion items and the output is a complementary fashion item based on the style information learned by the model. We use the encoder-decoder architecture to solve this problem of completing the set of fashion items. We evaluate the goodness of the proposed model through a variety of experiments. We empirically observe that our proposed model outperforms competitive baselines such as the Apriori algorithm by ~28 in terms of accuracy for top-1 recommendation to complete the fashion ensemble. We also perform retrieval-based experiments to understand the ability of the model to learn style and rank the complementary fashion items, and find that using attention in our encoder-decoder model helps in improving the mean reciprocal rank by ~24. Qualitatively, we find the complementary fashion items generated by our proposed model are richer than those of the Apriori algorithm. |
Tasks | |
Published | 2018-07-03 |
URL | http://arxiv.org/abs/1807.01182v1 |
http://arxiv.org/pdf/1807.01182v1.pdf | |
PWC | https://paperswithcode.com/paper/styling-with-attention-to-details |
Repo | |
Framework | |
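The mean reciprocal rank reported in the abstract above is the average, over queries, of one divided by the rank position of the correct item. A minimal sketch follows; the function name and the convention of scoring a missing target as zero are assumptions made for illustration and are not taken from the paper.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1 / rank of each target in its ranked list (0 if absent)."""
    total = 0.0
    for ranking, target in zip(ranked_lists, targets):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)  # ranks start at 1
    return total / len(targets)


# Three queries: target ranked first, ranked second, and not retrieved.
score = mean_reciprocal_rank(
    [["a", "b"], ["b", "a"], ["c", "d"]],
    ["a", "a", "x"],
)
print(score)
```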
An Online-Learning Approach to Inverse Optimization
Title | An Online-Learning Approach to Inverse Optimization |
Authors | Andreas Bärmann, Alexander Martin, Sebastian Pokutta, Oskar Schneider |
Abstract | In this paper, we demonstrate how to learn the objective function of a decision-maker while only observing the problem input data and the decision-maker’s corresponding decisions over multiple rounds. We present exact algorithms for this online version of inverse optimization which converge at a rate of $\mathcal{O}(1/\sqrt{T})$ in the number of observations $T$ and compare their further properties. In particular, they all allow taking decisions which are essentially as good as those of the observed decision-maker already after relatively few iterations, but each is best suited to a different setting. Our approach is based on online learning and works for linear objectives over arbitrary feasible sets for which we have a linear optimization oracle. As such, it generalizes previous approaches based on KKT-system decomposition and dualization. We also introduce several generalizations, such as the approximate learning of non-linear objective functions, dynamically changing as well as parameterized objectives and the case of suboptimal observed decisions. When applied to the stochastic offline case, our algorithms are able to give guarantees on the quality of the learned objectives in expectation. Finally, we show the effectiveness and possible applications of our methods in indicative computational experiments. |
Tasks | |
Published | 2018-10-30 |
URL | https://arxiv.org/abs/1810.12997v2 |
https://arxiv.org/pdf/1810.12997v2.pdf | |
PWC | https://paperswithcode.com/paper/an-online-learning-approach-to-inverse |
Repo | |
Framework | |
Distributed Weight Consolidation: A Brain Segmentation Case Study
Title | Distributed Weight Consolidation: A Brain Segmentation Case Study |
Authors | Patrick McClure, Charles Y. Zheng, Jakub R. Kaczmarzyk, John A. Lee, Satrajit S. Ghosh, Dylan Nielson, Peter Bandettini, Francisco Pereira |
Abstract | Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be the case that derivative datasets or predictive models developed within individual sites can be shared and combined with fewer restrictions. Training on distributed data and combining the resulting networks is often viewed as continual learning, but these methods require networks to be trained sequentially. In this paper, we introduce distributed weight consolidation (DWC), a continual learning method to consolidate the weights of separate neural networks, each trained on an independent dataset. We evaluated DWC with a brain segmentation case study, where we consolidated dilated convolutional neural networks trained on independent structural magnetic resonance imaging (sMRI) datasets from different sites. We found that DWC led to increased performance on test sets from the different sites, while maintaining generalization performance for a very large and completely independent multi-site dataset, compared to an ensemble baseline. |
Tasks | Brain Segmentation, Continual Learning |
Published | 2018-05-28 |
URL | http://arxiv.org/abs/1805.10863v9 |
http://arxiv.org/pdf/1805.10863v9.pdf | |
PWC | https://paperswithcode.com/paper/distributed-weight-consolidation-a-brain |
Repo | |
Framework | |
Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training
Title | Domain Adaptation for Semantic Segmentation via Class-Balanced Self-Training |
Authors | Yang Zou, Zhiding Yu, B. V. K. Vijaya Kumar, Jinsong Wang |
Abstract | Recent deep networks achieved state-of-the-art performance on a variety of semantic segmentation tasks. Despite such progress, these models often face challenges in real-world 'wild tasks' where a large difference between labeled training/source data and unseen test/target data exists. In particular, such a difference is often referred to as the 'domain gap', and could cause significantly decreased performance which cannot be easily remedied by further increasing the representation power. Unsupervised domain adaptation (UDA) seeks to overcome this problem without target domain labels. In this paper, we propose a novel UDA framework based on an iterative self-training procedure, where the problem is formulated as latent variable loss minimization, and can be solved by alternately generating pseudo labels on target data and re-training the model with these labels. On top of self-training, we also propose a novel class-balanced self-training framework to avoid the gradual dominance of large classes on pseudo-label generation, and introduce spatial priors to refine generated labels. Comprehensive experiments show that the proposed methods achieve state-of-the-art semantic segmentation performance under multiple major UDA settings. |
Tasks | Domain Adaptation, Semantic Segmentation, Synthetic-to-Real Translation, Unsupervised Domain Adaptation |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.07911v2 |
http://arxiv.org/pdf/1810.07911v2.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptation-for-semantic-segmentation |
Repo | |
Framework | |
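The class-balancing idea in the abstract above, selecting pseudo-labels per class so that large or easy classes do not dominate, can be sketched with per-class confidence thresholds. The code below is a simplified illustration under stated assumptions, not the authors' implementation: the function name, the `portion` parameter, and the rule of keeping the most confident fraction of predictions within each class are hypothetical choices made here.

```python
import math


def class_balanced_pseudo_labels(probs, portion):
    """Keep the top `portion` most confident predictions within each class.

    probs: list of per-sample class probability lists.
    Returns one entry per sample: the predicted class, or None if ignored.
    """
    preds = [max(range(len(p)), key=p.__getitem__) for p in probs]
    confs = [p[c] for p, c in zip(probs, preds)]

    # One confidence threshold per predicted class, instead of a global one.
    thresholds = {}
    for c in set(preds):
        class_confs = sorted(
            (conf for conf, pred in zip(confs, preds) if pred == c),
            reverse=True,
        )
        keep = max(1, math.ceil(portion * len(class_confs)))
        thresholds[c] = class_confs[keep - 1]

    return [
        pred if conf >= thresholds[pred] else None
        for pred, conf in zip(preds, confs)
    ]


# Class 0 is predicted confidently, class 1 only weakly.
probs = [[0.95, 0.05], [0.9, 0.1], [0.85, 0.15], [0.8, 0.2], [0.45, 0.55], [0.4, 0.6]]
labels = class_balanced_pseudo_labels(probs, portion=0.5)
print(labels)
```

With a single global threshold chosen to keep the three most confident samples, every selected pseudo-label in this toy input would belong to class 0; the per-class thresholds keep a sample of class 1 as well.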