Paper Group ANR 702
Graph-based Clustering under Differential Privacy
Title | Graph-based Clustering under Differential Privacy |
Authors | Rafael Pinot, Anne Morvan, Florian Yger, Cédric Gouy-Pailler, Jamal Atif |
Abstract | In this paper, we present the first differentially private clustering method for arbitrary-shaped node clusters in a graph. This algorithm takes as input only an approximate Minimum Spanning Tree (MST) $\mathcal{T}$ released under weight differential privacy constraints from the graph. Then, the underlying nonconvex clustering partition is successfully recovered by performing optimal cuts on $\mathcal{T}$. As opposed to existing methods, our algorithm is theoretically well-motivated. Experiments support our theoretical findings. |
Tasks | |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03831v1 |
http://arxiv.org/pdf/1803.03831v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-clustering-under-differential |
Repo | |
Framework | |
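The clustering step described in the abstract above (recover arbitrary-shaped clusters by cutting a spanning tree) can be illustrated without the privacy mechanism. Below is a minimal, non-private sketch assuming a weighted `networkx` graph; the differentially private MST release and the paper's cut criterion are not reproduced.

```python
import networkx as nx

def mst_cut_clusters(graph: nx.Graph, k: int):
    """Cluster nodes by cutting the k-1 heaviest edges of a spanning tree.

    Non-private illustration only: the paper instead cuts an MST released
    under weight differential privacy constraints.
    """
    mst = nx.minimum_spanning_tree(graph, weight="weight")
    # Removing the k-1 largest-weight edges splits the tree into k components.
    heavy = sorted(mst.edges(data="weight"), key=lambda e: e[2], reverse=True)[: k - 1]
    mst.remove_edges_from((u, v) for u, v, _ in heavy)
    return [set(c) for c in nx.connected_components(mst)]

if __name__ == "__main__":
    g = nx.Graph()
    g.add_weighted_edges_from([(0, 1, 0.1), (1, 2, 0.2), (2, 3, 5.0), (3, 4, 0.1)])
    print(mst_cut_clusters(g, k=2))  # [{0, 1, 2}, {3, 4}]
```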
Re-examination of Bregman functions and new properties of their divergences
Title | Re-examination of Bregman functions and new properties of their divergences |
Authors | Daniel Reem, Simeon Reich, Alvaro De Pierro |
Abstract | The Bregman divergence (Bregman distance, Bregman measure of distance) is a certain useful substitute for a distance, obtained from a well-chosen function (the “Bregman function”). Bregman functions and divergences have been extensively investigated during the last decades and have found applications in optimization, operations research, information theory, nonlinear analysis, machine learning and more. This paper re-examines various aspects related to the theory of Bregman functions and divergences. In particular, it presents many sufficient conditions which allow the construction of Bregman functions in a general setting and introduces new Bregman functions (such as a negative iterated log entropy). Moreover, it sheds new light on several known Bregman functions such as quadratic entropies, the negative Havrda-Charvát-Tsallis entropy, and the negative Boltzmann-Gibbs-Shannon entropy, and it shows that the negative Burg entropy, which is not a Bregman function according to the classical theory but nevertheless is known to have “Bregmanian properties”, can, by our re-examination of the theory, be considered as a Bregman function. Our analysis yields several by-products of independent interest such as the introduction of the concept of relative uniform convexity (a certain generalization of uniform convexity), new properties of uniformly and strongly convex functions, and results in Banach space theory. |
Tasks | |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00641v4 |
http://arxiv.org/pdf/1803.00641v4.pdf | |
PWC | https://paperswithcode.com/paper/re-examination-of-bregman-functions-and-new |
Repo | |
Framework | |
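For reference, the divergence induced by a differentiable convex function $f$ is $D_f(x, y) = f(x) - f(y) - \langle \nabla f(y), x - y \rangle$. The sketch below (our illustration, not code from the paper) instantiates it for the squared Euclidean norm and for the negative Boltzmann-Gibbs-Shannon entropy mentioned in the abstract.

```python
import numpy as np

def bregman(f, grad_f, x, y):
    """D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return f(x) - f(y) - np.dot(grad_f(y), x - y)

# f(x) = ||x||^2 yields the squared Euclidean distance.
sq_norm = lambda x: np.dot(x, x)
sq_norm_grad = lambda x: 2.0 * x

# f(x) = sum_i x_i log x_i (negative entropy) yields the generalized KL divergence.
neg_entropy = lambda x: np.sum(x * np.log(x))
neg_entropy_grad = lambda x: np.log(x) + 1.0

x, y = np.array([0.2, 0.8]), np.array([0.5, 0.5])
print(bregman(sq_norm, sq_norm_grad, x, y))          # equals ||x - y||^2
print(bregman(neg_entropy, neg_entropy_grad, x, y))  # equals KL(x || y) since both sum to 1
```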
Estimation of Inter-Sentiment Correlations Employing Deep Neural Network Models
Title | Estimation of Inter-Sentiment Correlations Employing Deep Neural Network Models |
Authors | Xinzhi Wang, Shengcheng Yuan, Hui Zhang, Yi Liu |
Abstract | This paper focuses on sentiment mining and sentiment correlation analysis of web events. Although neural network models have contributed a lot to mining text information, little attention has been paid to the analysis of inter-sentiment correlations. This paper fills the gap between sentiment calculation and inter-sentiment correlations. In this paper, social emotion is divided into six categories: love, joy, anger, sadness, fear, and surprise. Two deep neural network models are presented for sentiment calculation. Three datasets - the titles, the bodies, and the comments of news articles - are collected, covering both objective and subjective texts of varying lengths (long and short). From each dataset, three kinds of features are extracted: explicit expression, implicit expression, and alphabet characters. The performance of the two models is analyzed with respect to each of the three kinds of features. A controversial phenomenon arises in the interpretation of anger (fn) and love (gd): in subjective text, other emotions are easily mistaken for anger, whereas in objective news bodies and titles, text is easily regarded as conveying love (gd). This suggests that journalists may intend to arouse love when writing news, yet provoke anger once the news is published. This result reflects the complexity and unpredictability of sentiment. |
Tasks | |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09755v1 |
http://arxiv.org/pdf/1811.09755v1.pdf | |
PWC | https://paperswithcode.com/paper/estimation-of-inter-sentiment-correlations |
Repo | |
Framework | |
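The inter-sentiment correlations referred to in the abstract can be read as correlations between per-document scores of the six emotion categories. A hedged sketch with random placeholder scores standing in for the models' predictions (the paper's models and data are not reproduced):

```python
import numpy as np

EMOTIONS = ["love", "joy", "anger", "sadness", "fear", "surprise"]

# Placeholder: rows = documents, columns = predicted strength of each emotion.
# In the paper these scores would come from the two deep neural network models.
rng = np.random.default_rng(0)
scores = rng.random((500, len(EMOTIONS)))

corr = np.corrcoef(scores, rowvar=False)  # 6 x 6 inter-sentiment correlation matrix
for i, a in enumerate(EMOTIONS):
    for j, b in enumerate(EMOTIONS):
        if i < j:
            print(f"corr({a}, {b}) = {corr[i, j]:+.3f}")
```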
NOTE-RCNN: NOise Tolerant Ensemble RCNN for Semi-Supervised Object Detection
Title | NOTE-RCNN: NOise Tolerant Ensemble RCNN for Semi-Supervised Object Detection |
Authors | Jiyang Gao, Jiang Wang, Shengyang Dai, Li-Jia Li, Ram Nevatia |
Abstract | The labeling cost of a large number of bounding boxes is one of the main challenges for training modern object detectors. To reduce the dependence on expensive bounding box annotations, we propose a new semi-supervised object detection formulation, in which a few seed box-level annotations and a large number of image-level annotations are used to train the detector. We adopt a training-mining framework, which is widely used in weakly supervised object detection tasks. However, the mining process inherently introduces various kinds of labelling noise: false negatives, false positives, and inaccurate boundaries, which can be harmful for training standard object detectors (e.g. Faster RCNN). We propose a novel NOise Tolerant Ensemble RCNN (NOTE-RCNN) object detector to handle such noisy labels. Compared to standard Faster RCNN, it contains three highlights: an ensemble of two classification heads and a distillation head to avoid overfitting on noisy labels and improve mining precision, masking the negative sample loss in the box predictor to avoid the harm of false negative labels, and training the box regression head only on seed annotations to eliminate the harm from inaccurate boundaries of mined bounding boxes. We evaluate the methods on the ILSVRC 2013 and MSCOCO 2017 datasets; we observe that the detection accuracy consistently improves as we iterate between mining and training steps, and state-of-the-art performance is achieved. |
Tasks | Object Detection, Weakly Supervised Object Detection |
Published | 2018-12-01 |
URL | http://arxiv.org/abs/1812.00124v1 |
http://arxiv.org/pdf/1812.00124v1.pdf | |
PWC | https://paperswithcode.com/paper/note-rcnn-noise-tolerant-ensemble-rcnn-for |
Repo | |
Framework | |
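Of the three NOTE-RCNN changes, the masking of mined negative (background) labels in the box predictor is easy to isolate. The fragment below is a hypothetical NumPy sketch of that idea, not the authors' implementation; the ensemble and distillation heads and the mining loop are omitted.

```python
import numpy as np

def masked_cross_entropy(logits, labels, is_seed):
    """Cross-entropy that ignores background labels coming from mined boxes.

    logits:  (N, C) class scores, class 0 = background
    labels:  (N,)   assigned class ids
    is_seed: (N,)   True if the label comes from a seed (human) annotation

    Mined background labels may be false negatives, so their loss terms are
    masked out; background labels are trusted only on seed annotations.
    """
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_box = -log_probs[np.arange(len(labels)), labels]
    keep = (labels != 0) | is_seed
    return per_box[keep].mean() if keep.any() else 0.0
```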
Level-Based Analysis of the Univariate Marginal Distribution Algorithm
Title | Level-Based Analysis of the Univariate Marginal Distribution Algorithm |
Authors | Duc-Cuong Dang, Per Kristian Lehre, Phan Trung Hai Nguyen |
Abstract | Estimation of Distribution Algorithms (EDAs) are stochastic heuristics that search for optimal solutions by learning and sampling from probabilistic models. Despite their popularity in real-world applications, there is little rigorous understanding of their performance. Even for the Univariate Marginal Distribution Algorithm (UMDA) – a simple population-based EDA assuming independence between decision variables – the optimisation time on the linear problem OneMax was until recently undetermined. The incomplete theoretical understanding of EDAs is mainly due to lack of appropriate analytical tools. We show that the recently developed level-based theorem for non-elitist populations combined with anti-concentration results yield upper bounds on the expected optimisation time of the UMDA. This approach results in the bound $\mathcal{O}(n\lambda\log \lambda+n^2)$ on two problems, LeadingOnes and BinVal, for population sizes $\lambda>\mu=\Omega(\log n)$, where $\mu$ and $\lambda$ are parameters of the algorithm. We also prove that the UMDA with population sizes $\mu\in \mathcal{O}(\sqrt{n}) \cap \Omega(\log n)$ optimises OneMax in expected time $\mathcal{O}(\lambda n)$, and for larger population sizes $\mu=\Omega(\sqrt{n}\log n)$, in expected time $\mathcal{O}(\lambda\sqrt{n})$. The facility and generality of our arguments suggest that this is a promising approach to derive bounds on the expected optimisation time of EDAs. |
Tasks | |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10038v1 |
http://arxiv.org/pdf/1807.10038v1.pdf | |
PWC | https://paperswithcode.com/paper/level-based-analysis-of-the-univariate |
Repo | |
Framework | |
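The UMDA analysed in the paper is simple enough to state directly. Below is a minimal sketch of the standard UMDA with borders (margins); the population sizes and the OneMax objective are illustrative defaults, not the paper's experimental setup.

```python
import numpy as np

def umda(fitness, n, lam=200, mu=50, generations=500, seed=0):
    """Univariate Marginal Distribution Algorithm on bit strings of length n."""
    rng = np.random.default_rng(seed)
    p = np.full(n, 0.5)                      # marginal probability of a 1 at each position
    lower, upper = 1.0 / n, 1.0 - 1.0 / n    # standard borders keep marginals away from 0/1
    best = None
    for _ in range(generations):
        pop = (rng.random((lam, n)) < p).astype(int)   # sample lambda individuals
        fit = np.array([fitness(x) for x in pop])
        if best is None or fit.max() > fitness(best):
            best = pop[fit.argmax()].copy()
        elite = pop[np.argsort(fit)[::-1][:mu]]        # keep the mu fittest
        p = np.clip(elite.mean(axis=0), lower, upper)  # refit the univariate marginals
    return best

onemax = lambda x: int(x.sum())
print(onemax(umda(onemax, n=100)))  # close to 100 on OneMax
```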
High Granular Operator Spaces, and Less-Contaminated General Rough Mereologies
Title | High Granular Operator Spaces, and Less-Contaminated General Rough Mereologies |
Authors | A. Mani |
Abstract | Granular operator spaces and variants have been introduced and used in theoretical investigations on the foundations of general rough sets by the present author over the last few years. In this research, higher order versions of these are presented uniformly as partial algebraic systems. They are also adapted for practical applications when the data is representable by data table-like structures according to a minimalist schema for avoiding contamination. Issues relating to valuations used in information systems or tables are also addressed. The concept of contamination, introduced and studied by the present author across a number of her papers, concerns the mixing up of information across semantic domains (or domains of discourse). Rough inclusion functions (\textsf{RIF}s), variants, and numeric functions often have a direct or indirect role in contaminating algorithms. Some solutions that seek to replace or avoid them have been proposed and investigated by the present author in some of her earlier papers. Because multiple kinds of solutions are of interest to the contamination problem, granular generalizations of RIFs are proposed and investigated. Interesting representation results are proved, and a core algebraic strategy for generalizing the Skowron-Polkowski style of rough mereology (though for a very different purpose) is formulated. A number of examples have been added to illustrate key parts of the proposal in higher order variants of granular operator spaces. Further algorithms grounded in mereological nearness, suited for decision-making in human-machine interaction contexts, are proposed by the present author. Applications of granular \textsf{RIF}s to partial/soft solutions of the inverse problem are also developed in this paper. |
Tasks | Decision Making |
Published | 2018-11-15 |
URL | https://arxiv.org/abs/1811.06560v4 |
https://arxiv.org/pdf/1811.06560v4.pdf | |
PWC | https://paperswithcode.com/paper/granularity-and-generalized-inclusion |
Repo | |
Framework | |
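For context, a standard rough inclusion function has the set-theoretic form $\nu(A, B) = |A \cap B| / |A|$ for nonempty $A$ and $\nu(\emptyset, B) = 1$. The sketch below shows only this textbook baseline; the paper's granular generalizations of RIFs are not reproduced here.

```python
def rif(a: set, b: set) -> float:
    """Standard rough inclusion: degree to which set A is included in set B."""
    if not a:
        return 1.0  # the empty set is included in everything by convention
    return len(a & b) / len(a)

# Degree to which one granule is included in another.
print(rif({1, 2, 3, 4}, {2, 3, 5}))  # 0.5
print(rif(set(), {1}))               # 1.0
```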
Automatic Full Compilation of Julia Programs and ML Models to Cloud TPUs
Title | Automatic Full Compilation of Julia Programs and ML Models to Cloud TPUs |
Authors | Keno Fischer, Elliot Saba |
Abstract | Google’s Cloud TPUs are a promising new hardware architecture for machine learning workloads. They have powered many of Google’s milestone machine learning achievements in recent years. Google has now made TPUs available for general use on their cloud platform and as of very recently has opened them up further to allow use by non-TensorFlow frontends. We describe a method and implementation for offloading suitable sections of Julia programs to TPUs via this new API and the Google XLA compiler. Our method is able to completely fuse the forward pass of a VGG19 model expressed as a Julia program into a single TPU executable to be offloaded to the device. Our method composes well with existing compiler-based automatic differentiation techniques on Julia code, and we are thus able to also automatically obtain the VGG19 backwards pass and similarly offload it to the TPU. Targeting TPUs using our compiler, we are able to evaluate the VGG19 forward pass on a batch of 100 images in 0.23s which compares favorably to the 52.4s required for the original model on the CPU. Our implementation is less than 1000 lines of Julia, with no TPU specific changes made to the core Julia compiler or any other Julia packages. |
Tasks | |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.09868v1 |
http://arxiv.org/pdf/1810.09868v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-full-compilation-of-julia-programs |
Repo | |
Framework | |
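The paper compiles Julia programs to TPUs through Google's XLA compiler. A rough feel for the same kind of whole-function XLA compilation can be had from Python/JAX, shown below purely as an analogy; this is not the Julia method described above, and the toy network stands in for the fused VGG19 forward pass.

```python
import jax
import jax.numpy as jnp

def forward(params, x):
    """Toy two-layer forward pass; a stand-in for a fused model forward pass."""
    w1, w2 = params
    h = jax.nn.relu(x @ w1)
    return h @ w2

params = (jnp.ones((224, 64)), jnp.ones((64, 10)))
x = jnp.ones((100, 224))             # a batch of 100 inputs

compiled = jax.jit(forward)          # traced once and compiled to a single XLA executable
print(compiled(params, x).shape)     # (100, 10); runs on whichever XLA backend is available
```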
DBpedia NIF: Open, Large-Scale and Multilingual Knowledge Extraction Corpus
Title | DBpedia NIF: Open, Large-Scale and Multilingual Knowledge Extraction Corpus |
Authors | Milan Dojchinovski, Julio Hernandez, Markus Ackermann, Amit Kirschenbaum, Sebastian Hellmann |
Abstract | In the past decade, the DBpedia community has put a significant amount of effort into developing technical infrastructure and methods for efficient extraction of structured information from Wikipedia. These efforts have been primarily focused on harvesting, refining, and publishing semi-structured information found in Wikipedia articles, such as information from infoboxes, categorization information, images, wikilinks, and citations. Nevertheless, a vast amount of valuable information is still contained in the unstructured Wikipedia article texts. In this paper, we present DBpedia NIF - a large-scale and multilingual knowledge extraction corpus. The aim of the dataset is two-fold: to dramatically broaden and deepen the amount of structured information in DBpedia, and to provide a large-scale and multilingual language resource for the development of various NLP and IR tasks. The dataset provides the content of all articles for 128 Wikipedia languages. We describe the dataset creation process and the NLP Interchange Format (NIF) used to model the content, links, and structure of the information in the Wikipedia articles. The dataset has been further enriched with about 25% more links, and selected partitions have been published as Linked Data. Finally, we describe the maintenance and sustainability plans, and selected use cases of the dataset from the TextExt knowledge extraction challenge. |
Tasks | |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.10315v1 |
http://arxiv.org/pdf/1812.10315v1.pdf | |
PWC | https://paperswithcode.com/paper/dbpedia-nif-open-large-scale-and-multilingual |
Repo | |
Framework | |
Forecasting Disease Trajectories in Alzheimer’s Disease Using Deep Learning
Title | Forecasting Disease Trajectories in Alzheimer’s Disease Using Deep Learning |
Authors | Bryan Lim, Mihaela van der Schaar |
Abstract | Joint models for longitudinal and time-to-event data are commonly used in longitudinal studies to forecast disease trajectories over time. Despite the many advantages of joint modeling, the standard forms suffer from limitations that arise from a fixed model specification and computational difficulties when applied to large datasets. We adopt a deep learning approach to address these limitations, enhancing existing methods with the flexibility and scalability of deep neural networks while retaining the benefits of joint modeling. Using data from the Alzheimer’s Disease Neuroimaging Initiative, we show improvements in performance and scalability compared to traditional methods. |
Tasks | |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.03159v1 |
http://arxiv.org/pdf/1807.03159v1.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-disease-trajectories-in |
Repo | |
Framework | |
Probably approximately correct learning of Horn envelopes from queries
Title | Probably approximately correct learning of Horn envelopes from queries |
Authors | Daniel Borchmann, Tom Hanika, Sergei Obiedkov |
Abstract | We propose an algorithm for learning the Horn envelope of an arbitrary domain using an expert, or an oracle, capable of answering certain types of queries about this domain. Attribute exploration from formal concept analysis is a procedure that solves this problem, but the number of queries it may ask is exponential in the size of the resulting Horn formula in the worst case. We recall a well-known polynomial-time algorithm for learning Horn formulas with membership and equivalence queries and modify it to obtain a polynomial-time probably approximately correct algorithm for learning the Horn envelope of an arbitrary domain. |
Tasks | |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.06149v1 |
http://arxiv.org/pdf/1807.06149v1.pdf | |
PWC | https://paperswithcode.com/paper/probably-approximately-correct-learning-of |
Repo | |
Framework | |
Region Detection in Markov Random Fields: Gaussian Case
Title | Region Detection in Markov Random Fields: Gaussian Case |
Authors | Ilya Soloveychik, Vahid Tarokh |
Abstract | We consider the problem of model selection in Gaussian Markov fields in the sample-deficient scenario. The benchmark information-theoretic results in the case of d-regular graphs require the number of samples to be at least proportional to the logarithm of the number of vertices to allow consistent graph recovery. When the number of samples is less than this amount, reliable detection of all edges is impossible. In many applications, it is more important to learn the distribution of the edge (coupling) parameters over the network than the specific locations of the edges. Assuming that the entire graph can be partitioned into a number of spatial regions with similar edge parameters and reasonably regular boundaries, we develop new information-theoretic sample complexity bounds and show that a bounded number of samples can be sufficient to consistently recover these regions. Finally, we introduce and analyze an efficient region growing algorithm capable of recovering the regions with high accuracy. We show that it is consistent and demonstrate its performance benefits in synthetic simulations. |
Tasks | Model Selection |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.03848v8 |
http://arxiv.org/pdf/1802.03848v8.pdf | |
PWC | https://paperswithcode.com/paper/region-detection-in-markov-random-fields |
Repo | |
Framework | |
Entropy Rate Estimation for Markov Chains with Large State Space
Title | Entropy Rate Estimation for Markov Chains with Large State Space |
Authors | Yanjun Han, Jiantao Jiao, Chuan-Zheng Lee, Tsachy Weissman, Yihong Wu, Tiancheng Yu |
Abstract | Estimating the entropy based on data is one of the prototypical problems in distribution property testing and estimation. For estimating the Shannon entropy of a distribution on $S$ elements with independent samples, [Paninski2004] showed that the sample complexity is sublinear in $S$, and [Valiant–Valiant2011] showed that consistent estimation of Shannon entropy is possible if and only if the sample size $n$ far exceeds $\frac{S}{\log S}$. In this paper we consider the problem of estimating the entropy rate of a stationary reversible Markov chain with $S$ states from a sample path of $n$ observations. We show that: (1) As long as the Markov chain mixes not too slowly, i.e., the relaxation time is at most $O(\frac{S}{\ln^3 S})$, consistent estimation is achievable when $n \gg \frac{S^2}{\log S}$. (2) As long as the Markov chain has some slight dependency, i.e., the relaxation time is at least $1+\Omega(\frac{\ln^2 S}{\sqrt{S}})$, consistent estimation is impossible when $n \lesssim \frac{S^2}{\log S}$. Under both assumptions, the optimal estimation accuracy is shown to be $\Theta(\frac{S^2}{n \log S})$. In comparison, the empirical entropy rate requires at least $\Omega(S^2)$ samples to be consistent, even when the Markov chain is memoryless. In addition to synthetic experiments, we also apply the estimators that achieve the optimal sample complexity to estimate the entropy rate of the English language in the Penn Treebank and the Google One Billion Words corpora, which provides a natural benchmark for language modeling and relates it directly to the widely used perplexity measure. |
Tasks | Language Modelling |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07889v3 |
http://arxiv.org/pdf/1802.07889v3.pdf | |
PWC | https://paperswithcode.com/paper/entropy-rate-estimation-for-markov-chains |
Repo | |
Framework | |
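The empirical entropy rate used as the baseline in the comparison is the plug-in estimate $\hat H = -\sum_i \hat\pi_i \sum_j \hat p_{ij} \log \hat p_{ij}$, where $\hat p$ is the empirical transition matrix and $\hat\pi$ the empirical state frequencies. A sketch of this baseline (not the paper's optimal estimator):

```python
import numpy as np

def empirical_entropy_rate(path, num_states):
    """Plug-in entropy rate (in nats) of a Markov chain from a single sample path."""
    counts = np.zeros((num_states, num_states))
    for a, b in zip(path[:-1], path[1:]):
        counts[a, b] += 1
    pi = counts.sum(axis=1) / max(counts.sum(), 1)                 # empirical state frequencies
    p = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)  # empirical transition matrix
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    return float(-(pi * plogp.sum(axis=1)).sum())

# Two-state chain with a known entropy rate as a sanity check.
rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1], [0.2, 0.8]])
path, s = [0], 0
for _ in range(100_000):
    s = rng.choice(2, p=P[s])
    path.append(s)
print(empirical_entropy_rate(path, 2))  # about 0.38 nats for this chain
```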
Face Sketch Synthesis Style Similarity: A New Structure Co-occurrence Texture Measure
Title | Face Sketch Synthesis Style Similarity: A New Structure Co-occurrence Texture Measure |
Authors | Deng-Ping Fan, ShengChuan Zhang, Yu-Huan Wu, Ming-Ming Cheng, Bo Ren, Rongrong Ji, Paul L Rosin |
Abstract | Existing face sketch synthesis (FSS) similarity measures are sensitive to slight image degradation (e.g., noise, blur). However, human perception of the similarity of two sketches will consider both structure and texture as essential factors and is not sensitive to slight (“pixel-level”) mismatches. Consequently, the use of existing similarity measures can lead to better algorithms receiving a lower score than worse algorithms. This unreliable evaluation has significantly hindered the development of the FSS field. To solve this problem, we propose a novel and robust style similarity measure called Scoot-measure (Structure CO-Occurrence Texture Measure), which simultaneously evaluates “block-level” spatial structure and co-occurrence texture statistics. In addition, we further propose 4 new meta-measures and create 2 new datasets to perform a comprehensive evaluation of several widely-used FSS measures on two large databases. Experimental results demonstrate that our measure not only provides a reliable evaluation but also achieves significantly improved performance. Specifically, the study indicated a higher degree (78.8%) of correlation between our measure and human judgment than the best prior measure (58.6%). Our code will be made available. |
Tasks | Face Sketch Synthesis |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.02975v1 |
http://arxiv.org/pdf/1804.02975v1.pdf | |
PWC | https://paperswithcode.com/paper/face-sketch-synthesis-style-similaritya-new |
Repo | |
Framework | |
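The "block-level" co-occurrence texture statistics behind the Scoot idea can be illustrated with plain gray-level co-occurrence matrices computed per block and compared across two sketches. The sketch below uses hypothetical parameters (grid size, quantization levels, offset) and is an illustration only, not the authors' measure.

```python
import numpy as np

def cooccurrence(block, levels=8, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix of one image block."""
    q = (block.astype(float) / 256 * levels).astype(int).clip(0, levels - 1)
    dy, dx = offset
    a = q[: q.shape[0] - dy, : q.shape[1] - dx]
    b = q[dy:, dx:]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)  # count co-occurring gray-level pairs
    return m / max(m.sum(), 1)

def blockwise_similarity(sketch_a, sketch_b, grid=4):
    """Compare two equally sized sketches by block-wise co-occurrence statistics."""
    h, w = sketch_a.shape
    dists = []
    for i in range(grid):
        for j in range(grid):
            sl = np.s_[i * h // grid:(i + 1) * h // grid, j * w // grid:(j + 1) * w // grid]
            dists.append(np.abs(cooccurrence(sketch_a[sl]) - cooccurrence(sketch_b[sl])).sum())
    return 1.0 - np.mean(dists) / 2.0  # 1 = identical block statistics, 0 = maximally different
```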
Causal Identification under Markov Equivalence
Title | Causal Identification under Markov Equivalence |
Authors | Amin Jaber, Jiji Zhang, Elias Bareinboim |
Abstract | Assessing the magnitude of cause-and-effect relations is one of the central challenges found throughout the empirical sciences. The problem of identification of causal effects is concerned with determining whether a causal effect can be computed from a combination of observational data and substantive knowledge about the domain under investigation, which is formally expressed in the form of a causal graph. In many practical settings, however, the knowledge available to the researcher is not strong enough to specify a unique causal graph. Another line of investigation attempts to use observational data to learn a qualitative description of the domain called a Markov equivalence class, which is the collection of causal graphs that share the same set of observed features. In this paper, we marry both approaches and study the problem of causal identification from an equivalence class, represented by a partial ancestral graph (PAG). We start by deriving a set of graphical properties of PAGs that are carried over to their induced subgraphs. We then develop an algorithm to compute the effect of an arbitrary set of variables on an arbitrary outcome set. We show that the algorithm is strictly more powerful than the current state of the art found in the literature. |
Tasks | Causal Identification |
Published | 2018-12-15 |
URL | http://arxiv.org/abs/1812.06209v1 |
http://arxiv.org/pdf/1812.06209v1.pdf | |
PWC | https://paperswithcode.com/paper/causal-identification-under-markov |
Repo | |
Framework | |
Experimental Implementation of a Quantum Autoencoder via Quantum Adders
Title | Experimental Implementation of a Quantum Autoencoder via Quantum Adders |
Authors | Yongcheng Ding, Lucas Lamata, Mikel Sanz, Xi Chen, Enrique Solano |
Abstract | Quantum autoencoders allow for reducing the amount of resources in a quantum computation by mapping the original Hilbert space onto a reduced space with the relevant information. Recently, it was proposed to employ approximate quantum adders to implement quantum autoencoders in quantum technologies. Here, we carry out the experimental implementation of this proposal in the Rigetti cloud quantum computer employing up to three qubits. The experimental fidelities are in good agreement with the theoretical prediction, thus proving the feasibility of realizing quantum autoencoders via quantum adders in state-of-the-art superconducting quantum technologies. |
Tasks | |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10643v2 |
http://arxiv.org/pdf/1807.10643v2.pdf | |
PWC | https://paperswithcode.com/paper/experimental-implementation-of-a-quantum |
Repo | |
Framework | |