July 29, 2019

Paper Group ANR 23

On Statistical Optimality of Variational Bayes

Title On Statistical Optimality of Variational Bayes
Authors Debdeep Pati, Anirban Bhattacharya, Yun Yang
Abstract The article addresses a long-standing open problem on the justification of using variational Bayes methods for parameter estimation. We provide general conditions for obtaining optimal risk bounds for point estimates acquired from mean-field variational Bayesian inference. The conditions pertain to the existence of certain test functions for the distance metric on the parameter space and minimal assumptions on the prior. A general recipe for verification of the conditions is outlined which is broadly applicable to existing Bayesian models with or without latent variables. As illustrations, specific applications to Latent Dirichlet Allocation and Gaussian mixture models are discussed.
Tasks Bayesian Inference
Published 2017-12-25
URL http://arxiv.org/abs/1712.08983v1
PDF http://arxiv.org/pdf/1712.08983v1.pdf
PWC https://paperswithcode.com/paper/on-statistical-optimality-of-variational
Repo
Framework

Adversarial Robustness: Softmax versus Openmax

Title Adversarial Robustness: Softmax versus Openmax
Authors Andras Rozsa, Manuel Günther, Terrance E. Boult
Abstract Deep neural networks (DNNs) provide state-of-the-art results on various tasks and are widely used in real-world applications. However, it was discovered that machine learning models, including the best performing DNNs, suffer from a fundamental problem: they can unexpectedly and confidently misclassify examples formed by slightly perturbing otherwise correctly recognized inputs. Various approaches have been developed for efficiently generating these so-called adversarial examples, but those mostly rely on ascending the gradient of loss. In this paper, we introduce the novel logits optimized targeting system (LOTS) to directly manipulate deep features captured at the penultimate layer. Using LOTS, we analyze and compare the adversarial robustness of DNNs using the traditional Softmax layer with Openmax, which was designed to provide open set recognition by defining classes derived from deep representations, and is claimed to be more robust to adversarial perturbations. We demonstrate that Openmax provides less vulnerable systems than Softmax to traditional attacks; however, we show that it can be equally susceptible to more sophisticated adversarial generation techniques that directly work on deep representations.
Tasks Open Set Learning
Published 2017-08-05
URL http://arxiv.org/abs/1708.01697v1
PDF http://arxiv.org/pdf/1708.01697v1.pdf
PWC https://paperswithcode.com/paper/adversarial-robustness-softmax-versus-openmax
Repo
Framework

CNN based Learning using Reflection and Retinex Models for Intrinsic Image Decomposition

Title CNN based Learning using Reflection and Retinex Models for Intrinsic Image Decomposition
Authors Anil S. Baslamisli, Hoang-An Le, Theo Gevers
Abstract Most of the traditional work on intrinsic image decomposition relies on deriving priors about scene characteristics. On the other hand, recent research uses deep learning models as in-and-out black boxes and does not consider the well-established, traditional image formation process as the basis of the intrinsic learning process. As a consequence, although current deep learning approaches show superior performance when considering quantitative benchmark results, traditional approaches are still dominant in achieving high qualitative results. In this paper, the aim is to exploit the best of the two worlds. A method is proposed that (1) is empowered by deep learning capabilities, (2) considers a physics-based reflection model to steer the learning process, and (3) exploits the traditional approach to obtain intrinsic images by exploiting reflectance and shading gradient information. The proposed model is fast to compute and allows for the integration of all intrinsic components. To train the new model, an object-centered large-scale dataset with intrinsic ground-truth images is created. The evaluation results demonstrate that the new model outperforms existing methods. Visual inspection shows that the image formation loss function augments color reproduction and the use of gradient information produces sharper edges. Datasets, models and higher resolution images are available at https://ivi.fnwi.uva.nl/cv/retinet.
Tasks Intrinsic Image Decomposition
Published 2017-12-04
URL http://arxiv.org/abs/1712.01056v2
PDF http://arxiv.org/pdf/1712.01056v2.pdf
PWC https://paperswithcode.com/paper/cnn-based-learning-using-reflection-and
Repo
Framework

Toward Geometric Deep SLAM

Title Toward Geometric Deep SLAM
Authors Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich
Abstract We present a point tracking system powered by two deep convolutional neural networks. The first network, MagicPoint, operates on single images and extracts salient 2D points. The extracted points are “SLAM-ready” because they are by design isolated and well-distributed throughout the image. We compare this network against classical point detectors and discover a significant performance gap in the presence of image noise. As transformation estimation is simpler when the detected points are geometrically stable, we designed a second network, MagicWarp, which operates on pairs of point images (outputs of MagicPoint), and estimates the homography that relates the inputs. This transformation engine differs from traditional approaches because it does not use local point descriptors, only point locations. Both networks are trained with simple synthetic data, alleviating the requirement of expensive external camera ground truthing and advanced graphics rendering pipelines. The system is fast and lean, easily running 30+ FPS on a single CPU.
Tasks
Published 2017-07-24
URL http://arxiv.org/abs/1707.07410v1
PDF http://arxiv.org/pdf/1707.07410v1.pdf
PWC https://paperswithcode.com/paper/toward-geometric-deep-slam
Repo
Framework

Of the People: Voting Is More Effective with Representative Candidates

Title Of the People: Voting Is More Effective with Representative Candidates
Authors Yu Cheng, Shaddin Dughmi, David Kempe
Abstract In light of the classic impossibility results of Arrow and Gibbard and Satterthwaite regarding voting with ordinal rules, there has been recent interest in characterizing how well common voting rules approximate the social optimum. In order to quantify the quality of approximation, it is natural to consider the candidates and voters as embedded within a common metric space, and to ask how much further the chosen candidate is from the population as compared to the socially optimal one. We use this metric preference model to explore a fundamental and timely question: does the social welfare of a population improve when candidates are representative of the population? If so, then by how much, and how does the answer depend on the complexity of the metric space? We restrict attention to the most fundamental and common social choice setting: a population of voters, two independently drawn candidates, and a majority rule election. When candidates are not representative of the population, it is known that the candidate selected by the majority rule can be thrice as far from the population as the socially optimal one. We examine how this ratio improves when candidates are drawn independently from the population of voters. Our results are two-fold: When the metric is a line, the ratio improves from $3$ to $4-2\sqrt{2}$, roughly $1.1716$; this bound is tight. When the metric is arbitrary, we show a lower bound of $1.5$ and a constant upper bound strictly better than $2$ on the approximation ratio of the majority rule. The positive result depends in part on the assumption that candidates are independent and identically distributed. However, we show that independence alone is not enough to achieve the upper bound: even when candidates are drawn independently, if the population of candidates can be different from the voters, then an upper bound of $2$ on the approximation is tight.
Tasks
Published 2017-05-04
URL http://arxiv.org/abs/1705.01736v2
PDF http://arxiv.org/pdf/1705.01736v2.pdf
PWC https://paperswithcode.com/paper/of-the-people-voting-is-more-effective-with
Repo
Framework
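
As a rough illustration of the metric preference model described in the abstract above, the sketch below places voters on a line, runs a majority election between two candidates, and compares the winner's social cost with that of the better candidate. The function names, the tie-breaking rule, and the voter positions are hypothetical and not taken from the paper; the code evaluates a single fixed instance and does not reproduce the paper's analysis.

```python
import math

def social_cost(candidate, voters):
    """Sum of distances from one candidate to every voter on the line."""
    return sum(abs(candidate - v) for v in voters)

def majority_winner(a, b, voters):
    """Each voter backs the closer candidate; ties go to `a` (arbitrary choice)."""
    votes_for_a = sum(1 for v in voters if abs(v - a) <= abs(v - b))
    return a if 2 * votes_for_a >= len(voters) else b

def distortion(a, b, voters):
    """Social cost of the majority winner divided by that of the better candidate."""
    winner = majority_winner(a, b, voters)
    best = min(social_cost(a, voters), social_cost(b, voters))
    return social_cost(winner, voters) / best

# Hypothetical population: two voters at 0.0 and three at 0.51.
voters = [0.0, 0.0, 0.51, 0.51, 0.51]

# Candidates placed anywhere on the line (not drawn from the voters).
outside_ratio = distortion(0.0, 1.0, voters)

# Candidates located at voter positions, i.e. representative of the population.
inside_ratio = distortion(0.0, 0.51, voters)

# The tight bound quoted in the abstract for the line metric.
line_bound = 4 - 2 * math.sqrt(2)
```

In this instance the majority winner is roughly 2.27 times as costly as the better candidate when the candidates sit at 0.0 and 1.0, and the ratio drops to 1 when both candidates sit at voter positions. `line_bound` evaluates to about 1.1716, matching the figure in the abstract.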

Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks

Title Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks
Authors Salman Mohammed, Peng Shi, Jimmy Lin
Abstract We examine the problem of question answering over knowledge graphs, focusing on simple questions that can be answered by the lookup of a single fact. Adopting a straightforward decomposition of the problem into entity detection, entity linking, relation prediction, and evidence combination, we explore simple yet strong baselines. On the popular SimpleQuestions dataset, we find that basic LSTMs and GRUs plus a few heuristics yield accuracies that approach the state of the art, and techniques that do not use neural networks also perform reasonably well. These results show that gains from sophisticated deep learning techniques proposed in the literature are quite modest and that some previous models exhibit unnecessary complexity.
Tasks Entity Linking, Knowledge Graphs, Question Answering
Published 2017-12-05
URL http://arxiv.org/abs/1712.01969v2
PDF http://arxiv.org/pdf/1712.01969v2.pdf
PWC https://paperswithcode.com/paper/strong-baselines-for-simple-question
Repo
Framework

Integrating both Visual and Audio Cues for Enhanced Video Caption

Title Integrating both Visual and Audio Cues for Enhanced Video Caption
Authors Wangli Hao, Zhaoxiang Zhang, He Guan, Guibo Zhu
Abstract Video caption refers to automatically generating a descriptive sentence for a specific short video clip, a task that has seen remarkable success recently. However, most of the existing methods focus more on visual information while ignoring the synchronized audio cues. We propose three multimodal deep fusion strategies to maximize the benefits of visual-audio resonance information. The first one explores the impact on cross-modalities feature fusion from low to high order. The second establishes the visual-audio short-term dependency by sharing weights of corresponding front-end networks. The third extends the temporal dependency to long-term through sharing multimodal memory across visual and audio modalities. Extensive experiments have validated the effectiveness of our three cross-modalities fusion strategies on two benchmark datasets, including Microsoft Research Video to Text (MSRVTT) and Microsoft Video Description (MSVD). It is worth mentioning that sharing weights can coordinate visual-audio feature fusion effectively and achieve state-of-the-art performance on both BLEU and METEOR metrics. Furthermore, we first propose a dynamic multimodal feature fusion framework to deal with the case where some modalities are missing. Experimental results demonstrate that even in the audio absence mode, we can still obtain comparable results with the aid of the additional audio modality inference module.
Tasks Video Description
Published 2017-11-22
URL http://arxiv.org/abs/1711.08097v2
PDF http://arxiv.org/pdf/1711.08097v2.pdf
PWC https://paperswithcode.com/paper/integrating-both-visual-and-audio-cues-for
Repo
Framework

Asymmetric Action Abstractions for Multi-Unit Control in Adversarial Real-Time Games

Title Asymmetric Action Abstractions for Multi-Unit Control in Adversarial Real-Time Games
Authors Rubens O. Moraes, Levi H. S. Lelis
Abstract Action abstractions restrict the number of legal actions available during search in multi-unit real-time adversarial games, thus allowing algorithms to focus their search on a set of promising actions. Optimal strategies derived from un-abstracted spaces are guaranteed to be no worse than optimal strategies derived from action-abstracted spaces. In practice, however, due to real-time constraints and the state space size, one is only able to derive good strategies in un-abstracted spaces in small-scale games. In this paper we introduce search algorithms that use an action abstraction scheme we call asymmetric abstraction. Asymmetric abstractions retain the un-abstracted spaces’ theoretical advantage over regularly abstracted spaces while still allowing the search algorithms to derive effective strategies, even in large-scale games. Empirical results on combat scenarios that arise in a real-time strategy game show that our search algorithms are able to substantially outperform state-of-the-art approaches.
Tasks
Published 2017-11-22
URL http://arxiv.org/abs/1711.08101v1
PDF http://arxiv.org/pdf/1711.08101v1.pdf
PWC https://paperswithcode.com/paper/asymmetric-action-abstractions-for-multi-unit
Repo
Framework

Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era

Title Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era
Authors Chao Ning, Fengqi You
Abstract A novel data-driven stochastic robust optimization (DDSRO) framework is proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data. Uncertainty data in large datasets are often collected from various conditions, which are encoded by class labels. Machine learning methods including Dirichlet process mixture model and maximum likelihood estimation are employed for uncertainty modeling. A DDSRO framework is further proposed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem follows a two-stage stochastic programming approach to optimize the expected objective across different data classes; adaptive robust optimization is nested as the inner problem to ensure the robustness of the solution while maintaining computational tractability. A decomposition-based algorithm is further developed to solve the resulting multi-level optimization problem efficiently. Case studies on process network design and planning are presented to demonstrate the applicability of the proposed framework and algorithm.
Tasks
Published 2017-07-28
URL http://arxiv.org/abs/1707.09198v4
PDF http://arxiv.org/pdf/1707.09198v4.pdf
PWC https://paperswithcode.com/paper/data-driven-stochastic-robust-optimization-a
Repo
Framework

Tracking Single-Cells in Overcrowded Bacterial Colonies

Title Tracking Single-Cells in Overcrowded Bacterial Colonies
Authors Athanasios D. Balomenos, Panagiotis Tsakanikas, Elias S. Manolakos
Abstract Cell tracking enables data extraction from time-lapse “cell movies” and promotes modeling biological processes at the single-cell level. We introduce a new fully automated computational strategy to accurately track cells across frames in time-lapse movies. Our method is based on a dynamic neighborhoods formation and matching approach, inspired by motion estimation algorithms for video compression. Moreover, it exploits “divide and conquer” opportunities to effectively solve the challenging cell tracking problem in overcrowded bacterial colonies. Using cell movies generated by different labs, we demonstrate that the accuracy of the proposed method remains very high (exceeds 97%) even when analyzing large overcrowded microbial colonies.
Tasks Motion Estimation, Video Compression
Published 2017-06-22
URL http://arxiv.org/abs/1706.07362v1
PDF http://arxiv.org/pdf/1706.07362v1.pdf
PWC https://paperswithcode.com/paper/tracking-single-cells-in-overcrowded
Repo
Framework

Learning the Exact Topology of Undirected Consensus Networks

Title Learning the Exact Topology of Undirected Consensus Networks
Authors Saurav Talukdar, Deepjyoti Deka, Sandeep Attree, Donatello Materassi, Murti V. Salapaka
Abstract In this article, we present a method to learn the interaction topology of a network of agents undergoing linear consensus updates in a non-invasive manner. Our approach is based on multivariate Wiener filtering, which is known to recover spurious edges apart from the true edges in the topology. The main contribution of this work is to show that in the case of undirected consensus networks, all spurious links obtained using Wiener filtering can be identified using the frequency response of the Wiener filters. Thus, the exact interaction topology of the agents is unveiled. The method presented requires time series measurements of the state of the agents and does not require any knowledge of link weights. To the best of our knowledge, this is the first approach that provably reconstructs the structure of undirected consensus networks with correlated noise. We illustrate the effectiveness of the method developed through numerical simulations as well as experiments on a five-node network of Raspberry Pis.
Tasks Time Series
Published 2017-09-29
URL http://arxiv.org/abs/1710.00032v1
PDF http://arxiv.org/pdf/1710.00032v1.pdf
PWC https://paperswithcode.com/paper/learning-the-exact-topology-of-undirected
Repo
Framework
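
To make "linear consensus updates" concrete, the sketch below simulates a common discrete-time form, in which each agent moves toward its neighbors by a step `eps`. The abstract does not state the update rule, so this form, the ring topology, the step size, and the absence of noise are all assumptions made for illustration. It only generates the kind of state time series the method would consume; it does not implement the Wiener-filter reconstruction.

```python
def consensus_step(x, edges, eps=0.2):
    """One synchronous update: every agent moves toward each neighbor by eps."""
    new_x = list(x)
    for i, j in edges:
        new_x[i] += eps * (x[j] - x[i])
        new_x[j] += eps * (x[i] - x[j])
    return new_x

# Hypothetical undirected ring over five agents.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

# Arbitrary initial states; their average is 3.0.
states = [1.0, 2.0, 3.0, 4.0, 5.0]

# Record the trajectory: this is the time series a topology learner would observe.
trajectory = [states]
for _ in range(200):
    states = consensus_step(states, edges)
    trajectory.append(states)

final_states = trajectory[-1]
```

Because the update is symmetric across each edge, the sum of the states is preserved, and on a connected graph with a small enough `eps` every agent converges to the initial average.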

Expressing Facial Structure and Appearance Information in Frequency Domain for Face Recognition

Title Expressing Facial Structure and Appearance Information in Frequency Domain for Face Recognition
Authors Chollette C. Olisah, Solomon Nunoo, Peter Ofedebe, Ghazali Sulong
Abstract Beneath the uncertain primitive visual features of face images are the primitive intrinsic structural patterns (PISP) essential for characterizing a sample face's discriminative attributes. It is on this basis that this paper presents a simple yet effective facial descriptor formed from derivatives of Gaussian and Gabor wavelets. The new descriptor is coined local edge gradient Gabor magnitude (LEGGM) pattern. LEGGM first uncovers the PISP locked in every pixel by determining the pixel gradient in relation to its neighbors using the derivatives of Gaussians. Then, the resulting output is embedded into the global appearance of the face, which is further processed using Gabor wavelets in order to express its frequency characteristics. Additionally, we adopted various subspace models for dimensionality reduction in order to ascertain the best-fit model for reporting a more effective representation of the LEGGM patterns. The proposed descriptor-based face recognition method is evaluated on three databases: Plastic surgery, LFW, and GT face databases. Through experiments using a base classifier, the efficacy of the proposed method is demonstrated, especially in the case of the plastic surgery database. The heterogeneous database, which we created to typify a real-world scenario, shows that the proposed method is to an extent insensitive to image formation factors, with impressive recognition performance.
Tasks Dimensionality Reduction, Face Recognition
Published 2017-04-28
URL http://arxiv.org/abs/1704.08949v1
PDF http://arxiv.org/pdf/1704.08949v1.pdf
PWC https://paperswithcode.com/paper/expressing-facial-structure-and-appearance
Repo
Framework

Curriculum Dropout

Title Curriculum Dropout
Authors Pietro Morerio, Jacopo Cavazza, Riccardo Volpi, Rene Vidal, Vittorio Murino
Abstract Dropout is a very effective way of regularizing neural networks. Stochastically “dropping out” units with a certain probability discourages over-specific co-adaptations of feature detectors, preventing overfitting and improving network generalization. Besides, Dropout can be interpreted as an approximate model aggregation technique, where an exponential number of smaller networks are averaged in order to get a more powerful ensemble. In this paper, we show that using a fixed dropout probability during training is a suboptimal choice. We thus propose a time scheduling for the probability of retaining neurons in the network. This induces an adaptive regularization scheme that smoothly increases the difficulty of the optimization problem. This idea of “starting easy” and adaptively increasing the difficulty of the learning problem has its roots in curriculum learning and allows one to train better models. Indeed, we prove that our optimization strategy implements a very general curriculum scheme, by gradually adding noise to both the input and intermediate feature representations within the network architecture. Experiments on seven image classification datasets and different network architectures show that our method, named Curriculum Dropout, frequently yields better generalization and, at worst, performs just as well as the standard Dropout method.
Tasks Image Classification
Published 2017-03-18
URL http://arxiv.org/abs/1703.06229v2
PDF http://arxiv.org/pdf/1703.06229v2.pdf
PWC https://paperswithcode.com/paper/curriculum-dropout
Repo
Framework
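
The "time scheduling for the probability of retaining neurons" in the abstract above can be sketched as a function of the training step that starts at 1 (no units dropped) and decays smoothly toward a target retain probability. The exponential form below, the parameter names `target` and `gamma`, and their default values are assumptions chosen for illustration; the abstract does not specify them.

```python
import math
import random

def retain_probability(step, target=0.5, gamma=1e-3):
    """Retain probability that starts at 1.0 and decays smoothly toward `target`."""
    return (1.0 - target) * math.exp(-gamma * step) + target

def dropout(values, p_retain, rng):
    """Inverted dropout: keep each unit with probability p_retain and rescale it."""
    return [v / p_retain if rng.random() < p_retain else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

# Retain probabilities early, midway, and late in training.
schedule = [retain_probability(t) for t in (0, 1000, 5000)]

# At step 0 the retain probability is 1.0, so nothing is dropped ("starting easy").
early = dropout(activations, retain_probability(0), rng)

# Later in training, units are dropped with increasing probability.
late = dropout(activations, retain_probability(5000), rng)
```

A fixed-probability dropout corresponds to calling `dropout` with the same `p_retain` at every step; the schedule replaces that constant with a value that tightens the regularization as training proceeds.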

Simplified Energy Landscape for Modularity Using Total Variation

Title Simplified Energy Landscape for Modularity Using Total Variation
Authors Zachary Boyd, Egil Bae, Xue-Cheng Tai, Andrea L. Bertozzi
Abstract Networks capture pairwise interactions between entities and are frequently used in applications such as social networks, food networks, and protein interaction networks, to name a few. Communities, cohesive groups of nodes, often form in these applications, and identifying them gives insight into the overall organization of the network. One common quality function used to identify community structure is modularity. In Hu et al. [SIAM J. App. Math., 73(6), 2013], it was shown that modularity optimization is equivalent to minimizing a particular nonconvex total variation (TV) based functional over a discrete domain. They solve this problem, assuming the number of communities is known, using a Merriman, Bence, Osher (MBO) scheme. We show that modularity optimization is equivalent to minimizing a convex TV-based functional over a discrete domain, again assuming the number of communities is known. Furthermore, we show that modularity has no convex relaxation satisfying certain natural conditions. We therefore find a manageable non-convex approximation using a Ginzburg-Landau functional, which provably converges to the correct energy in the limit of a certain parameter. We then derive an MBO algorithm that has fewer hand-tuned parameters than in Hu et al. and is 7 times faster at solving the associated diffusion equation, because the underlying discretization is unconditionally stable. Our numerical tests include a hyperspectral video whose associated graph has $2.9 \times 10^7$ edges, which is roughly 37 times larger than was handled in the paper of Hu et al.
Tasks
Published 2017-07-28
URL http://arxiv.org/abs/1707.09285v3
PDF http://arxiv.org/pdf/1707.09285v3.pdf
PWC https://paperswithcode.com/paper/simplified-energy-landscape-for-modularity
Repo
Framework
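
For readers unfamiliar with the quality function named in the abstract above, the sketch below computes the standard Newman-Girvan modularity of a fixed partition of an unweighted, undirected graph. It only evaluates the score; it is not the TV-based or MBO optimization from the paper, and the example graph and helper names are hypothetical.

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj

def modularity(adj, labels):
    """Q = (1 / 2m) * sum over same-community pairs of (A_ij - k_i * k_j / 2m)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                total += adj[i][j] - degrees[i] * degrees[j] / two_m
    return total / two_m

# Hypothetical graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = adjacency_from_edges(6, edges)

# Splitting at the bridge versus putting every node in one community.
q_split = modularity(adj, [0, 0, 0, 1, 1, 1])
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])
```

Here `q_split` works out to 5/14 (about 0.357), while the single-community partition scores 0, which is why modularity maximization prefers the split at the bridge.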

Lexical Features in Coreference Resolution: To be Used With Caution

Title Lexical Features in Coreference Resolution: To be Used With Caution
Authors Nafise Sadat Moosavi, Michael Strube
Abstract Lexical features are a major source of information in state-of-the-art coreference resolvers. Lexical features implicitly model some of the linguistic phenomena at a fine granularity level. They are especially useful for representing the context of mentions. In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. We show that if coreference resolvers mainly rely on lexical features, they can hardly generalize to unseen domains. Furthermore, we show that the current coreference resolution evaluation is clearly flawed by only evaluating on a specific split of a specific dataset in which there is a notable overlap between the training, development and test sets.
Tasks Coreference Resolution
Published 2017-04-22
URL http://arxiv.org/abs/1704.06779v1
PDF http://arxiv.org/pdf/1704.06779v1.pdf
PWC https://paperswithcode.com/paper/lexical-features-in-coreference-resolution-to
Repo
Framework