May 6, 2019

3180 words 15 mins read

Paper Group ANR 213

COCO: Performance Assessment. Estimation of low rank density matrices: bounds in Schatten norms and other distances. Distributed-memory large deformation diffeomorphic 3D image registration. Ternary Neural Networks for Resource-Efficient AI Applications. Frank-Wolfe Algorithms for Saddle Point Problems. How to be Fair and Diverse?. Spoofing detecti …

COCO: Performance Assessment


Title	COCO: Performance Assessment
Authors	Nikolaus Hansen, Anne Auger, Dimo Brockhoff, Dejan Tušar, Tea Tušar
Abstract	We present an any-time performance assessment for benchmarking numerical optimization algorithms in a black-box scenario, applied within the COCO benchmarking platform. The performance assessment is based on runtimes measured in number of objective function evaluations to reach one or several quality indicator target values. We argue that runtime is the only available measure with a generic, meaningful, and quantitative interpretation. We discuss the choice of the target values, runlength-based targets, and the aggregation of results by using simulated restarts, averages, and empirical distribution functions.
Tasks
Published	2016-05-11
URL	http://arxiv.org/abs/1605.03560v1
PDF	http://arxiv.org/pdf/1605.03560v1.pdf
PWC	https://paperswithcode.com/paper/coco-performance-assessment
Repo
Framework

Estimation of low rank density matrices: bounds in Schatten norms and other distances


Title	Estimation of low rank density matrices: bounds in Schatten norms and other distances
Authors	Dong Xia, Vladimir Koltchinskii
Abstract	Let ${\mathcal S}_m$ be the set of all $m\times m$ density matrices (Hermitian positively semi-definite matrices of unit trace). Consider a problem of estimation of an unknown density matrix $\rho\in {\mathcal S}_m$ based on outcomes of $n$ measurements of observables $X_1,\dots, X_n\in {\mathbb H}m$ (${\mathbb H}m$ being the space of $m\times m$ Hermitian matrices) for a quantum system identically prepared $n$ times in state $\rho.$ Outcomes $Y_1,\dots, Y_n$ of such measurements could be described by a trace regression model in which ${\mathbb E}{\rho}(Y_jX_j)={\rm tr}(\rho X_j), j=1,\dots, n.$ The design variables $X_1,\dots, X_n$ are often sampled at random from the uniform distribution in an orthonormal basis ${E_1,\dots, E{m^2}}$ of ${\mathbb H}m$ (such as Pauli basis). The goal is to estimate the unknown density matrix $\rho$ based on the data $(X_1,Y_1), \dots, (X_n,Y_n).$ Let $$ \hat Z:=\frac{m^2}{n}\sum{j=1}^n Y_j X_j $$ and let $\check \rho$ be the projection of $\hat Z$ onto the convex set ${\mathcal S}_m$ of density matrices. It is shown that for estimator $\check \rho$ the minimax lower bounds in classes of low rank density matrices (established earlier) are attained up logarithmic factors for all Schatten $p$-norm distances, $p\in [1,\infty]$ and for Bures version of quantum Hellinger distance. Moreover, for a slightly modified version of estimator $\check \rho$ the same property holds also for quantum relative entropy (Kullback-Leibler) distance between density matrices.
Tasks
Published	2016-04-15
URL	http://arxiv.org/abs/1604.04600v1
PDF	http://arxiv.org/pdf/1604.04600v1.pdf
PWC	https://paperswithcode.com/paper/estimation-of-low-rank-density-matrices
Repo
Framework

Distributed-memory large deformation diffeomorphic 3D image registration


Title	Distributed-memory large deformation diffeomorphic 3D image registration
Authors	Andreas Mang, Amir Gholami, George Biros
Abstract	We present a parallel distributed-memory algorithm for large deformation diffeomorphic registration of volumetric images that produces large isochoric deformations (locally volume preserving). Image registration is a key technology in medical image analysis. Our algorithm uses a partial differential equation constrained optimal control formulation. Finding the optimal deformation map requires the solution of a highly nonlinear problem that involves pseudo-differential operators, biharmonic operators, and pure advection operators both forward and back- ward in time. A key issue is the time to solution, which poses the demand for efficient optimization methods as well as an effective utilization of high performance computing resources. To address this problem we use a preconditioned, inexact, Gauss-Newton- Krylov solver. Our algorithm integrates several components: a spectral discretization in space, a semi-Lagrangian formulation in time, analytic adjoints, different regularization functionals (including volume-preserving ones), a spectral preconditioner, a highly optimized distributed Fast Fourier Transform, and a cubic interpolation scheme for the semi-Lagrangian time-stepping. We demonstrate the scalability of our algorithm on images with resolution of up to $1024^3$ on the “Maverick” and “Stampede” systems at the Texas Advanced Computing Center (TACC). The critical problem in the medical imaging application domain is strong scaling, that is, solving registration problems of a moderate size of $256^3$—a typical resolution for medical images. We are able to solve the registration problem for images of this size in less than five seconds on 64 x86 nodes of TACC’s “Maverick” system.
Tasks	Image Registration
Published	2016-08-11
URL	http://arxiv.org/abs/1608.03630v1
PDF	http://arxiv.org/pdf/1608.03630v1.pdf
PWC	https://paperswithcode.com/paper/distributed-memory-large-deformation
Repo
Framework

Ternary Neural Networks for Resource-Efficient AI Applications


Title	Ternary Neural Networks for Resource-Efficient AI Applications
Authors	Hande Alemdar, Vincent Leroy, Adrien Prost-Boucle, Frédéric Pétrot
Abstract	The computation and storage requirements for Deep Neural Networks (DNNs) are usually high. This issue limits their deployability on ubiquitous computing devices such as smart phones, wearables and autonomous drones. In this paper, we propose ternary neural networks (TNNs) in order to make deep learning more resource-efficient. We train these TNNs using a teacher-student approach based on a novel, layer-wise greedy methodology. Thanks to our two-stage training procedure, the teacher network is still able to use state-of-the-art methods such as dropout and batch normalization to increase accuracy and reduce training time. Using only ternary weights and activations, the student ternary network learns to mimic the behavior of its teacher network without using any multiplication. Unlike its -1,1 binary counterparts, a ternary neural network inherently prunes the smaller weights by setting them to zero during training. This makes them sparser and thus more energy-efficient. We design a purpose-built hardware architecture for TNNs and implement it on FPGA and ASIC. We evaluate TNNs on several benchmark datasets and demonstrate up to 3.1x better energy efficiency with respect to the state of the art while also improving accuracy.
Tasks
Published	2016-09-01
URL	http://arxiv.org/abs/1609.00222v2
PDF	http://arxiv.org/pdf/1609.00222v2.pdf
PWC	https://paperswithcode.com/paper/ternary-neural-networks-for-resource
Repo
Framework

Frank-Wolfe Algorithms for Saddle Point Problems


Title	Frank-Wolfe Algorithms for Saddle Point Problems
Authors	Gauthier Gidel, Tony Jebara, Simon Lacoste-Julien
Abstract	We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems. Remarkably, the method only requires access to linear minimization oracles. Leveraging recent advances in FW optimization, we provide the first proof of convergence of a FW-type saddle point solver over polytopes, thereby partially answering a 30 year-old conjecture. We also survey other convergence results and highlight gaps in the theoretical underpinnings of FW-style algorithms. Motivating applications without known efficient alternatives are explored through structured prediction with combinatorial penalties as well as games over matching polytopes involving an exponential number of constraints.
Tasks	Structured Prediction
Published	2016-10-25
URL	http://arxiv.org/abs/1610.07797v3
PDF	http://arxiv.org/pdf/1610.07797v3.pdf
PWC	https://paperswithcode.com/paper/frank-wolfe-algorithms-for-saddle-point
Repo
Framework

How to be Fair and Diverse?


Title	How to be Fair and Diverse?
Authors	L. Elisa Celis, Amit Deshpande, Tarun Kathuria, Nisheeth K. Vishnoi
Abstract	Due to the recent cases of algorithmic bias in data-driven decision-making, machine learning methods are being put under the microscope in order to understand the root cause of these biases and how to correct them. Here, we consider a basic algorithmic task that is central in machine learning: subsampling from a large data set. Subsamples are used both as an end-goal in data summarization (where fairness could either be a legal, political or moral requirement) and to train algorithms (where biases in the samples are often a source of bias in the resulting model). Consequently, there is a growing effort to modify either the subsampling methods or the algorithms themselves in order to ensure fairness. However, in doing so, a question that seems to be overlooked is whether it is possible to produce fair subsamples that are also adequately representative of the feature space of the data set - an important and classic requirement in machine learning. Can diversity and fairness be simultaneously ensured? We start by noting that, in some applications, guaranteeing one does not necessarily guarantee the other, and a new approach is required. Subsequently, we present an algorithmic framework which allows us to produce both fair and diverse samples. Our experimental results on an image summarization task show marked improvements in fairness without compromising feature diversity by much, giving us the best of both the worlds.
Tasks	Data Summarization, Decision Making
Published	2016-10-23
URL	http://arxiv.org/abs/1610.07183v1
PDF	http://arxiv.org/pdf/1610.07183v1.pdf
PWC	https://paperswithcode.com/paper/how-to-be-fair-and-diverse
Repo
Framework

Spoofing detection under noisy conditions: a preliminary investigation and an initial database


Title	Spoofing detection under noisy conditions: a preliminary investigation and an initial database
Authors	Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li
Abstract	Spoofing detection for automatic speaker verification (ASV), which is to discriminate between live speech and attacks, has received increasing attentions recently. However, all the previous studies have been done on the clean data without significant additive noise. To simulate the real-life scenarios, we perform a preliminary investigation of spoofing detection under additive noisy conditions, and also describe an initial database for this task. The noisy database is based on the ASVspoof challenge 2015 database and generated by artificially adding background noises at different signal-to-noise ratios (SNRs). Five different additive noises are included. Our preliminary results show that using the model trained from clean data, the system performance degrades significantly in noisy conditions. Phase-based feature is more noise robust than magnitude-based features. And the systems perform significantly differ under different noise scenarios.
Tasks	Speaker Verification
Published	2016-02-09
URL	http://arxiv.org/abs/1602.02950v1
PDF	http://arxiv.org/pdf/1602.02950v1.pdf
PWC	https://paperswithcode.com/paper/spoofing-detection-under-noisy-conditions-a
Repo
Framework

Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains


Title	Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains
Authors	Andrew An Bian, Baharan Mirzasoleiman, Joachim M. Buhmann, Andreas Krause
Abstract	Submodular continuous functions are a category of (generally) non-convex/non-concave functions with a wide spectrum of applications. We characterize these functions and demonstrate that they can be maximized efficiently with approximation guarantees. Specifically, i) We introduce the weak DR property that gives a unified characterization of submodularity for all set, integer-lattice and continuous functions; ii) for maximizing monotone DR-submodular continuous functions under general down-closed convex constraints, we propose a Frank-Wolfe variant with $(1-1/e)$ approximation guarantee, and sub-linear convergence rate; iii) for maximizing general non-monotone submodular continuous functions subject to box constraints, we propose a DoubleGreedy algorithm with $1/3$ approximation guarantee. Submodular continuous functions naturally find applications in various real-world settings, including influence and revenue maximization with continuous assignments, sensor energy management, multi-resolution data summarization, facility location, etc. Experimental results show that the proposed algorithms efficiently generate superior solutions compared to baseline algorithms.
Tasks	Data Summarization
Published	2016-06-17
URL	https://arxiv.org/abs/1606.05615v5
PDF	https://arxiv.org/pdf/1606.05615v5.pdf
PWC	https://paperswithcode.com/paper/guaranteed-non-convex-optimization-submodular
Repo
Framework

Semantic Spaces


Title	Semantic Spaces
Authors	Yuri Manin, Matilde Marcolli
Abstract	Any natural language can be considered as a tool for producing large databases (consisting of texts, written, or discursive). This tool for its description in turn requires other large databases (dictionaries, grammars etc.). Nowadays, the notion of database is associated with computer processing and computer memory. However, a natural language resides also in human brains and functions in human communication, from interpersonal to intergenerational one. We discuss in this survey/research paper mathematical, in particular geometric, constructions, which help to bridge these two worlds. In particular, in this paper we consider the Vector Space Model of semantics based on frequency matrices, as used in Natural Language Processing. We investigate underlying geometries, formulated in terms of Grassmannians, projective spaces, and flag varieties. We formulate the relation between vector space models and semantic spaces based on semic axes in terms of projectability of subvarieties in Grassmannians and projective spaces. We interpret Latent Semantics as a geometric flow on Grassmannians. We also discuss how to formulate G"ardenfors’ notion of “meeting of minds” in our geometric setting.
Tasks
Published	2016-05-13
URL	http://arxiv.org/abs/1605.04238v1
PDF	http://arxiv.org/pdf/1605.04238v1.pdf
PWC	https://paperswithcode.com/paper/semantic-spaces
Repo
Framework

Topic Modelling and Event Identification from Twitter Textual Data


Title	Topic Modelling and Event Identification from Twitter Textual Data
Authors	Marina Sokolova, Kanyi Huang, Stan Matwin, Joshua Ramisch, Vera Sazonova, Renee Black, Chris Orwa, Sidney Ochieng, Nanjira Sambuli
Abstract	The tremendous growth of social media content on the Internet has inspired the development of the text analytics to understand and solve real-life problems. Leveraging statistical topic modelling helps researchers and practitioners in better comprehension of textual content as well as provides useful information for further analysis. Statistical topic modelling becomes especially important when we work with large volumes of dynamic text, e.g., Facebook or Twitter datasets. In this study, we summarize the message content of four data sets of Twitter messages relating to challenging social events in Kenya. We use Latent Dirichlet Allocation (LDA) topic modelling to analyze the content. Our study uses two evaluation measures, Normalized Mutual Information (NMI) and topic coherence analysis, to select the best LDA models. The obtained LDA results show that the tool can be effectively used to extract discussion topics and summarize them for further manual analysis
Tasks
Published	2016-08-08
URL	http://arxiv.org/abs/1608.02519v1
PDF	http://arxiv.org/pdf/1608.02519v1.pdf
PWC	https://paperswithcode.com/paper/topic-modelling-and-event-identification-from
Repo
Framework

Text Segmentation using Named Entity Recognition and Co-reference Resolution in English and Greek Texts


Title	Text Segmentation using Named Entity Recognition and Co-reference Resolution in English and Greek Texts
Authors	Pavlina Fragkou
Abstract	In this paper we examine the benefit of performing named entity recognition (NER) and co-reference resolution to an English and a Greek corpus used for text segmentation. The aim here is to examine whether the combination of text segmentation and information extraction can be beneficial for the identification of the various topics that appear in a document. NER was performed manually in the English corpus and was compared with the output produced by publicly available annotation tools while, an already existing tool was used for the Greek corpus. Produced annotations from both corpora were manually corrected and enriched to cover four types of named entities. Co-reference resolution i.e., substitution of every reference of the same instance with the same named entity identifier was subsequently performed. The evaluation, using five text segmentation algorithms for the English corpus and four for the Greek corpus leads to the conclusion that, the benefit highly depends on the segment’s topic, the number of named entity instances appearing in it, as well as the segment’s length.
Tasks	Named Entity Recognition
Published	2016-10-28
URL	http://arxiv.org/abs/1610.09226v1
PDF	http://arxiv.org/pdf/1610.09226v1.pdf
PWC	https://paperswithcode.com/paper/text-segmentation-using-named-entity
Repo
Framework

Exact and Inexact Subsampled Newton Methods for Optimization


Title	Exact and Inexact Subsampled Newton Methods for Optimization
Authors	Raghu Bollapragada, Richard Byrd, Jorge Nocedal
Abstract	The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how to coordinate the accuracy in the gradient and Hessian to yield a superlinear rate of convergence in expectation. The second part of the paper analyzes an inexact Newton method that solves linear systems approximately using the conjugate gradient (CG) method, and that samples the Hessian and not the gradient (the gradient is assumed to be exact). We provide a complexity analysis for this method based on the properties of the CG iteration and the quality of the Hessian approximation, and compare it with a method that employs a stochastic gradient iteration instead of the CG method. We report preliminary numerical results that illustrate the performance of inexact subsampled Newton methods on machine learning applications based on logistic regression.
Tasks	Stochastic Optimization
Published	2016-09-27
URL	http://arxiv.org/abs/1609.08502v1
PDF	http://arxiv.org/pdf/1609.08502v1.pdf
PWC	https://paperswithcode.com/paper/exact-and-inexact-subsampled-newton-methods
Repo
Framework

Building Ensembles of Adaptive Nested Dichotomies with Random-Pair Selection


Title	Building Ensembles of Adaptive Nested Dichotomies with Random-Pair Selection
Authors	Tim Leathart, Bernhard Pfahringer, Eibe Frank
Abstract	A system of nested dichotomies is a method of decomposing a multi-class problem into a collection of binary problems. Such a system recursively splits the set of classes into two subsets, and trains a binary classifier to distinguish between each subset. Even though ensembles of nested dichotomies with random structure have been shown to perform well in practice, using a more sophisticated class subset selection method can be used to improve classification accuracy. We investigate an approach to this problem called random-pair selection, and evaluate its effectiveness compared to other published methods of subset selection. We show that our method outperforms other methods in many cases when forming ensembles of nested dichotomies, and is at least on par in all other cases.
Tasks
Published	2016-04-07
URL	http://arxiv.org/abs/1604.01854v2
PDF	http://arxiv.org/pdf/1604.01854v2.pdf
PWC	https://paperswithcode.com/paper/building-ensembles-of-adaptive-nested
Repo
Framework

From Virtual to Real World Visual Perception using Domain Adaptation – The DPM as Example


Title	From Virtual to Real World Visual Perception using Domain Adaptation – The DPM as Example
Authors	Antonio M. Lopez, Jiaolong Xu, Jose L. Gomez, David Vazquez, German Ros
Abstract	Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual- to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Tasks	Domain Adaptation
Published	2016-12-29
URL	http://arxiv.org/abs/1612.09134v1
PDF	http://arxiv.org/pdf/1612.09134v1.pdf
PWC	https://paperswithcode.com/paper/from-virtual-to-real-world-visual-perception
Repo
Framework

Tensor Decompositions for Identifying Directed Graph Topologies and Tracking Dynamic Networks


Title	Tensor Decompositions for Identifying Directed Graph Topologies and Tracking Dynamic Networks
Authors	Yanning Shen, Brian Baingana, Georgios B. Giannakis
Abstract	Directed networks are pervasive both in nature and engineered systems, often underlying the complex behavior observed in biological systems, microblogs and social interactions over the web, as well as global financial markets. Since their structures are often unobservable, in order to facilitate network analytics, one generally resorts to approaches capitalizing on measurable nodal processes to infer the unknown topology. Structural equation models (SEMs) are capable of incorporating exogenous inputs to resolve inherent directional ambiguities. However, conventional SEMs assume full knowledge of exogenous inputs, which may not be readily available in some practical settings. The present paper advocates a novel SEM-based topology inference approach that entails factorization of a three-way tensor, constructed from the observed nodal data, using the well-known parallel factor (PARAFAC) decomposition. It turns out that second-order piecewise stationary statistics of exogenous variables suffice to identify the hidden topology. Capitalizing on the uniqueness properties inherent to high-order tensor factorizations, it is shown that topology identification is possible under reasonably mild conditions. In addition, to facilitate real-time operation and inference of time-varying networks, an adaptive (PARAFAC) tensor decomposition scheme which tracks the topology-revealing tensor factors is developed. Extensive tests on simulated and real stock quote data demonstrate the merits of the novel tensor-based approach.
Tasks
Published	2016-10-26
URL	http://arxiv.org/abs/1610.08189v1
PDF	http://arxiv.org/pdf/1610.08189v1.pdf
PWC	https://paperswithcode.com/paper/tensor-decompositions-for-identifying
Repo
Framework