Paper Group ANR 989
Learning Mixtures of Smooth Product Distributions: Identifiability and Algorithm
Title | Learning Mixtures of Smooth Product Distributions: Identifiability and Algorithm |
Authors | Nikos Kargas, Nicholas D. Sidiropoulos |
Abstract | We study the problem of learning a mixture model of non-parametric product distributions. The problem of learning a mixture model is that of finding the component distributions along with the mixing weights using observed samples generated from the mixture. The problem is well-studied in the parametric setting, i.e., when the component distributions are members of a parametric family – such as Gaussian distributions. In this work, we focus on multivariate mixtures of non-parametric product distributions and propose a two-stage approach which recovers the component distributions of the mixture under a smoothness condition. Our approach builds upon the identifiability properties of the canonical polyadic (low-rank) decomposition of tensors, in tandem with Fourier and Shannon-Nyquist sampling staples from signal processing. We demonstrate the effectiveness of the approach on synthetic and real datasets. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01156v1 |
http://arxiv.org/pdf/1904.01156v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-mixtures-of-smooth-product |
Repo | |
Framework | |
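The abstract's reliance on the canonical polyadic decomposition can be illustrated in the simpler discrete case. The sketch below is not the authors' two-stage algorithm. It only shows, under assumed sizes (two components, three variables with four states each, all hypothetical), that the joint PMF of a mixture of product distributions is a tensor of rank at most the number of components.

```python
import numpy as np

rng = np.random.default_rng(0)
R, I = 2, 4  # hypothetical sizes: 2 mixture components, 4 states per variable

# Mixing weights (sum to 1).
weights = np.array([0.3, 0.7])

# factors[n][:, r] is the conditional PMF of variable n given component r.
# dirichlet returns shape (R, I); transposing makes each column a PMF.
factors = [rng.dirichlet(np.ones(I), size=R).T for _ in range(3)]

# Joint PMF of the three variables: a sum of R weighted rank-one tensors,
# which is exactly the canonical polyadic (CP) model.
joint = np.einsum("r,ir,jr,kr->ijk", weights, *factors)

# Unfolding the tensor along its first mode gives a matrix of rank at most R.
unfolding_rank = np.linalg.matrix_rank(joint.reshape(I, -1))
```

Recovering `weights` and `factors` from `joint` is the decomposition problem whose identifiability the paper builds on; this sketch covers only the forward model.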
An attempt at beating the 3D U-Net
Title | An attempt at beating the 3D U-Net |
Authors | Fabian Isensee, Klaus H. Maier-Hein |
Abstract | The U-Net is arguably the most successful segmentation architecture in the medical domain. Here we apply a 3D U-Net to the 2019 Kidney and Kidney Tumor Segmentation Challenge and attempt to improve upon it by augmenting it with residual and pre-activation residual blocks. Cross-validation results on the training cases suggest only very minor, barely measurable improvements. Due to marginally higher dice scores, the residual 3D U-Net is chosen for test set prediction. With a Composite Dice score of 91.23 on the test set, our method outperformed all 105 competing teams and won the KiTS2019 challenge by a small margin. |
Tasks | |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02182v2 |
https://arxiv.org/pdf/1908.02182v2.pdf | |
PWC | https://paperswithcode.com/paper/an-attempt-at-beating-the-3d-u-net |
Repo | |
Framework | |
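For readers unfamiliar with the metric named in the abstract, here is a minimal sketch of the standard Dice coefficient between two binary masks. The function name and the convention of returning 1.0 for two empty masks are assumptions made for illustration; the challenge's exact scoring procedure is not given in the abstract.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom

# Toy 1D masks standing in for flattened segmentation volumes.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```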
Learning to Optimize Computational Resources: Frugal Training with Generalization Guarantees
Title | Learning to Optimize Computational Resources: Frugal Training with Generalization Guarantees |
Authors | Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik |
Abstract | Algorithms typically come with tunable parameters that have a considerable impact on the computational resources they consume. Too often, practitioners must hand-tune the parameters, a tedious and error-prone task. A recent line of research provides algorithms that return nearly-optimal parameters from within a finite set. These algorithms can be used when the parameter space is infinite by providing as input a random sample of parameters. This data-independent discretization, however, might miss pockets of nearly-optimal parameters: prior research has presented scenarios where the only viable parameters lie within an arbitrarily small region. We provide an algorithm that learns a finite set of promising parameters from within an infinite set. Our algorithm can help compile a configuration portfolio, or it can be used to select the input to a configuration algorithm for finite parameter spaces. Our approach applies to any configuration problem that satisfies a simple yet ubiquitous structure: the algorithm’s performance is a piecewise constant function of its parameters. Prior research has exhibited this structure in domains from integer programming to clustering. |
Tasks | |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10819v2 |
https://arxiv.org/pdf/1905.10819v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-optimize-computational-resources |
Repo | |
Framework | |
End-to-End Learning Deep CRF models for Multi-Object Tracking
Title | End-to-End Learning Deep CRF models for Multi-Object Tracking |
Authors | Jun Xiang, Ma Chao, Guohan Xu, Jianhua Hou |
Abstract | Existing deep multi-object tracking (MOT) approaches first learn a deep representation to describe target objects and then associate detection results by optimizing a linear assignment problem. Despite demonstrated successes, it is challenging to discriminate target objects under mutual occlusion or to reduce identity switches in crowded scenes. In this paper, we propose learning deep conditional random field (CRF) networks, aiming to model the assignment costs as unary potentials and the long-term dependencies among detection results as pairwise potentials. Specifically, we use a bidirectional long short-term memory (LSTM) network to encode the long-term dependencies. We pose the CRF inference as a recurrent neural network learning process using the standard gradient descent algorithm, where unary and pairwise potentials are jointly optimized in an end-to-end manner. Extensive experimental results on the challenging MOT datasets, including MOT-2015 and MOT-2016, demonstrate that our approach achieves state-of-the-art performance in comparison with published works on both benchmarks. |
Tasks | Multi-Object Tracking, Object Tracking |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12176v1 |
https://arxiv.org/pdf/1907.12176v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-learning-deep-crf-models-for-multi |
Repo | |
Framework | |
Robust Membership Encoding: Inference Attacks and Copyright Protection for Deep Learning
Title | Robust Membership Encoding: Inference Attacks and Copyright Protection for Deep Learning |
Authors | Congzheng Song, Reza Shokri |
Abstract | Machine learning as a service (MLaaS) and algorithm marketplaces are on the rise. Data holders can easily train complex models on their data using learning code provided by third parties. Training accurate ML models requires massive labeled data and advanced learning algorithms. The resulting models are considered the intellectual property of the model owners, and their copyright should be protected. Also, MLaaS needs to be trusted not to embed secret information about the training data into the model, such that it could be later retrieved when the model is deployed. In this paper, we present membership encoding for training deep neural networks and encoding the membership information, i.e. whether a data point is used for training, for a subset of training data. Membership encoding has several applications in different scenarios, including robust watermarking for model copyright protection, and also the risk analysis of stealthy data embedding privacy attacks. Our encoding algorithm can determine the membership of significantly redacted data points, and is also robust to model compression and fine-tuning. It also enables encoding a significant fraction of the training set, with negligible drop in the model’s prediction accuracy. |
Tasks | Model Compression |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12982v2 |
https://arxiv.org/pdf/1909.12982v2.pdf | |
PWC | https://paperswithcode.com/paper/membership-encoding-for-deep-learning |
Repo | |
Framework | |
Learning dynamic polynomial proofs
Title | Learning dynamic polynomial proofs |
Authors | Alhussein Fawzi, Mateusz Malinowski, Hamza Fawzi, Omar Fawzi |
Abstract | Polynomial inequalities lie at the heart of many mathematical disciplines. In this paper, we consider the fundamental computational task of automatically searching for proofs of polynomial inequalities. We adopt the framework of semi-algebraic proof systems that manipulate polynomial inequalities via elementary inference rules that infer new inequalities from the premises. These proof systems are known to be very powerful, but searching for proofs remains a major difficulty. In this work, we introduce a machine learning based method to search for a dynamic proof within these proof systems. We propose a deep reinforcement learning framework that learns an embedding of the polynomials and guides the choice of inference rules, taking the inherent symmetries of the problem as an inductive bias. We compare our approach with powerful and widely-studied linear programming hierarchies based on static proof systems, and show that our method reduces the size of the linear program by several orders of magnitude while also improving performance. These results hence pave the way towards augmenting powerful and well-studied semi-algebraic proof systems with machine learning guiding strategies for enhancing the expressivity of such proof systems. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01681v1 |
https://arxiv.org/pdf/1906.01681v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-dynamic-polynomial-proofs |
Repo | |
Framework | |
DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression
Title | DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression |
Authors | Sian Jin, Sheng Di, Xin Liang, Jiannan Tian, Dingwen Tao, Franck Cappello |
Abstract | DNNs have been quickly and broadly exploited to improve the data analysis quality in many complex science and engineering applications. Today’s DNNs are becoming deeper and wider because of increasing demand on the analysis quality and more and more complex applications to resolve. The wide and deep DNNs, however, require large amounts of resources, significantly restricting their utilization on resource-constrained systems. Although some network simplification methods have been proposed to address this issue, they suffer from either low compression ratios or high compression errors, which may introduce a costly retraining process for the target accuracy. In this paper, we propose DeepSZ: an accuracy-loss bounded neural network compression framework, which involves four key steps: network pruning, error bound assessment, optimization for error bound configuration, and compressed model generation, featuring a high compression ratio and low encoding time. The contribution is three-fold. (1) We develop an adaptive approach to select the feasible error bounds for each layer. (2) We build a model to estimate the overall loss of accuracy based on the accuracy degradation caused by individual decompressed layers. (3) We develop an efficient optimization algorithm to determine the best-fit configuration of error bounds in order to maximize the compression ratio under the user-set accuracy constraint. Experiments show that DeepSZ can compress AlexNet and VGG-16 on the ImageNet by a compression ratio of 46X and 116X, respectively, and compress LeNet-300-100 and LeNet-5 on the MNIST by a compression ratio of 57X and 56X, respectively, with only up to 0.3% loss of accuracy. Compared with other state-of-the-art methods, DeepSZ can improve the compression ratio by up to 1.43X, the DNN encoding performance by up to 4.0X (with four Nvidia Tesla V100 GPUs), and the decoding performance by up to 6.2X. |
Tasks | Network Pruning, Neural Network Compression |
Published | 2019-01-26 |
URL | http://arxiv.org/abs/1901.09124v2 |
http://arxiv.org/pdf/1901.09124v2.pdf | |
PWC | https://paperswithcode.com/paper/deepsz-a-novel-framework-to-compress-deep |
Repo | |
Framework | |
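The idea of an error bound in lossy compression can be sketched with plain uniform quantization. This is an illustrative stand-in, not the SZ compressor or the DeepSZ pipeline; the function names, the sample weights, and the bound of 0.01 are all assumptions. With a bin width of twice the bound, rounding to the nearest bin centre keeps every reconstructed value within the bound of its original (up to floating-point rounding).

```python
import numpy as np

def quantize(values, error_bound):
    """Map each value to the index of its nearest bin of width 2 * error_bound."""
    step = 2.0 * error_bound
    return np.round(np.asarray(values, dtype=np.float64) / step).astype(np.int64)

def dequantize(codes, error_bound):
    """Reconstruct each value as the centre of its bin."""
    return codes * (2.0 * error_bound)

# Hypothetical layer weights and a user-chosen absolute error bound.
weights = np.array([0.013, -0.207, 0.5, -0.0004])
codes = quantize(weights, error_bound=0.01)
restored = dequantize(codes, error_bound=0.01)
max_error = np.max(np.abs(weights - restored))
```

The integer codes are what a subsequent lossless stage would compress; a looser bound yields fewer distinct codes and therefore a higher ratio, which is the trade-off the abstract's per-layer error bound selection addresses.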
Anticipation in collaborative music performance using fuzzy systems: a case study
Title | Anticipation in collaborative music performance using fuzzy systems: a case study |
Authors | Oscar Thörn, Peter Fögel, Peter Knudsen, Luis de Miranda, Alessandro Saffiotti |
Abstract | In order to collaborate and co-create with humans, an AI system must be capable of both reactive and anticipatory behavior. We present a case study of such a system in the domain of musical improvisation. We consider a duo consisting of a human pianist accompanied by an off-the-shelf virtual drummer, and we design an AI system to control the performance parameters of the drummer (e.g., patterns, intensity, or complexity) as a function of what the human pianist is playing. The AI system utilizes a model elicited from the musicians and encoded through fuzzy logic. This paper outlines the methodology, design, and development process of this system. An evaluation in public concerts is upcoming. This case study is seen as a step in the broader investigation of anticipation and creative processes in mixed human-robot, or “anthrobotic” systems. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02155v1 |
https://arxiv.org/pdf/1906.02155v1.pdf | |
PWC | https://paperswithcode.com/paper/anticipation-in-collaborative-music |
Repo | |
Framework | |
Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
Title | Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering |
Authors | Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong |
Abstract | Answering questions that require multi-hop reasoning at web-scale necessitates retrieving multiple evidence documents, one of which often has little lexical or semantic relationship to the question. This paper introduces a new graph-based recurrent retrieval approach that learns to retrieve reasoning paths over the Wikipedia graph to answer multi-hop open-domain questions. Our retriever model trains a recurrent neural network that learns to sequentially retrieve evidence paragraphs in the reasoning path by conditioning on the previously retrieved documents. Our reader model ranks the reasoning paths and extracts the answer span included in the best reasoning path. Experimental results show state-of-the-art results in three open-domain QA datasets, showcasing the effectiveness and robustness of our method. Notably, our method achieves significant improvement in HotpotQA, outperforming the previous best model by more than 14 points. |
Tasks | Question Answering |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10470v2 |
https://arxiv.org/pdf/1911.10470v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-retrieve-reasoning-paths-over-1 |
Repo | |
Framework | |
MedCATTrainer: A Biomedical Free Text Annotation Interface with Active Learning and Research Use Case Specific Customisation
Title | MedCATTrainer: A Biomedical Free Text Annotation Interface with Active Learning and Research Use Case Specific Customisation |
Authors | Thomas Searle, Zeljko Kraljevic, Rebecca Bendayan, Daniel Bean, Richard Dobson |
Abstract | We present MedCATTrainer, an interface for building, improving and customising a given Named Entity Recognition and Linking (NER+L) model for biomedical domain text. NER+L is often used as a first step in deriving value from clinical text. Collecting labelled data for training models is difficult due to the need for specialist domain knowledge. MedCATTrainer offers an interactive web-interface to inspect and improve recognised entities from an underlying NER+L model via active learning. Secondary use of data for clinical research often has task and context specific criteria. MedCATTrainer provides a further interface to define and collect supervised learning training data for researcher specific use cases. Initial results suggest our approach allows for efficient and accurate collection of research use case specific training data. |
Tasks | Active Learning, Named Entity Recognition |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.07322v1 |
https://arxiv.org/pdf/1907.07322v1.pdf | |
PWC | https://paperswithcode.com/paper/medcattrainer-a-biomedical-free-text |
Repo | |
Framework | |
Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings
Title | Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings |
Authors | Zenan Zhai, Dat Quoc Nguyen, Saber A. Akhondi, Camilo Thorne, Christian Druckenbrodt, Trevor Cohn, Michelle Gregory, Karin Verspoor |
Abstract | Chemical patents are an important resource for chemical information. However, few chemical Named Entity Recognition (NER) systems have been evaluated on patent documents, due in part to their structural and linguistic complexity. In this paper, we explore the NER performance of a BiLSTM-CRF model utilising pre-trained word embeddings, character-level word representations and contextualized ELMo word representations for chemical patents. We compare word embeddings pre-trained on biomedical and chemical patent corpora. The effect of tokenizers optimized for the chemical domain on NER performance in chemical patents is also explored. The results on two patent corpora show that contextualized word representations generated from ELMo substantially improve chemical NER performance w.r.t. the current state-of-the-art. We also show that domain-specific resources such as word embeddings trained on chemical patents and chemical-specific tokenizers have a positive impact on NER performance. |
Tasks | Named Entity Recognition, Word Embeddings |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02679v1 |
https://arxiv.org/pdf/1907.02679v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-chemical-named-entity-recognition |
Repo | |
Framework | |
Open Problems in a Logic of Gossips
Title | Open Problems in a Logic of Gossips |
Authors | Krzysztof R. Apt, Dominik Wojtczak |
Abstract | Gossip protocols are programs used in a setting in which each agent holds a secret and the aim is to reach a situation in which all agents know all secrets. Such protocols rely on point-to-point or group communication. Distributed epistemic gossip protocols use epistemic formulas in the component programs for the agents. The advantage of using epistemic logic is that the resulting protocols are very concise and amenable to simple verification. Recently, we introduced a natural modal logic that allows one to express distributed epistemic gossip protocols and to reason about their correctness. We proved that the resulting protocols are implementable and that all aspects of their correctness, including termination, are decidable. To establish these results we showed that both the definition of semantics and of truth of the underlying logic are decidable. We also showed that the analogous results hold for an extension of this logic with the ‘common knowledge’ operator. However, several, often deceptively simple, questions about this logic and the corresponding gossip protocols remain open. The purpose of this paper is to list and elucidate these questions and provide appropriate background information for them in the form of partial or related results. |
Tasks | |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09097v1 |
https://arxiv.org/pdf/1907.09097v1.pdf | |
PWC | https://paperswithcode.com/paper/open-problems-in-a-logic-of-gossips |
Repo | |
Framework | |
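The gossip setting described in the abstract (each agent starts with one secret, and a call between two agents shares everything both know) can be simulated in a few lines. This sketch uses a fixed, hand-written call sequence rather than the epistemic guards the paper studies; the function names and the four-agent schedule are assumptions for illustration.

```python
def run_calls(num_agents, calls):
    """Simulate point-to-point calls; each call merges both agents' secrets."""
    # Agent i initially knows only its own secret, labelled i.
    known = [{agent} for agent in range(num_agents)]
    for caller, callee in calls:
        shared = known[caller] | known[callee]
        known[caller] = set(shared)
        known[callee] = set(shared)
    return known

def all_experts(known):
    """True when every agent knows every secret."""
    everything = set(range(len(known)))
    return all(secrets == everything for secrets in known)

# Hypothetical schedule for four agents: two pairwise calls, then two cross calls.
schedule = [(0, 1), (2, 3), (0, 2), (1, 3)]
finished = all_experts(run_calls(4, schedule))
```

In a distributed epistemic protocol, each agent would instead decide whom to call based on a formula over its own knowledge, and correctness questions such as termination concern all call sequences the guards permit.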
A Halo Merger Tree Generation and Evaluation Framework
Title | A Halo Merger Tree Generation and Evaluation Framework |
Authors | Sandra Robles, Jonathan S. Gómez, Adín Ramírez Rivera, Jenny A. González, Nelson D. Padilla, Diego Dujovne |
Abstract | Semi-analytic models are best suited to compare galaxy formation and evolution theories with observations. These models rely heavily on halo merger trees and their realistic features (i.e., no drastic changes in halo mass or jumps in physical location). Our aim is to provide a new framework for halo merger tree generation that takes advantage of the results of large volume simulations, with a modest computational cost. We treat halo merger tree construction as a matrix generation problem, and propose a Generative Adversarial Network that learns to generate realistic halo merger trees. We evaluate our proposal on merger trees from the EAGLE simulation suite, and show the quality of the generated trees. |
Tasks | |
Published | 2019-06-22 |
URL | https://arxiv.org/abs/1906.09382v1 |
https://arxiv.org/pdf/1906.09382v1.pdf | |
PWC | https://paperswithcode.com/paper/a-halo-merger-tree-generation-and-evaluation |
Repo | |
Framework | |
Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy
Title | Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy |
Authors | Ramtin Keramati, Christoph Dann, Alex Tamkin, Emma Brunskill |
Abstract | While maximizing expected return is the goal in most reinforcement learning approaches, risk-sensitive objectives such as conditional value at risk (CVaR) are more suitable for many high-stakes applications. However, relatively little is known about how to explore to quickly learn policies with good CVaR. In this paper, we present the first algorithm for sample-efficient learning of CVaR-optimal policies in Markov decision processes based on the optimism in the face of uncertainty principle. This method relies on a novel optimistic version of the distributional Bellman operator that moves probability mass from the lower to the upper tail of the return distribution. We prove asymptotic convergence and optimism of this operator for the tabular policy evaluation case. We further demonstrate that our algorithm finds CVaR-optimal policies substantially faster than existing baselines in several simulated environments with discrete and continuous state spaces. |
Tasks | |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01546v1 |
https://arxiv.org/pdf/1911.01546v1.pdf | |
PWC | https://paperswithcode.com/paper/being-optimistic-to-be-conservative-quickly |
Repo | |
Framework | |
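As background for the risk measure in the abstract, the following sketch estimates conditional value at risk from a sample of returns as the mean of the worst alpha fraction. It illustrates the objective only, not the paper's optimistic distributional Bellman operator; the function name and the rounding-up of the tail size are assumptions.

```python
import math
import numpy as np

def empirical_cvar(returns, alpha):
    """Mean of the lowest ceil(alpha * n) returns, with 0 < alpha <= 1."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = np.sort(np.asarray(returns, dtype=np.float64))
    # Assumed convention: round the tail size up so it is never empty.
    tail_size = math.ceil(alpha * len(ordered))
    return float(ordered[:tail_size].mean())

# Toy returns: the worst half is {1, 2}; the full sample gives the plain mean.
sample = [4.0, 1.0, 3.0, 2.0]
worst_half = empirical_cvar(sample, alpha=0.5)
full_mean = empirical_cvar(sample, alpha=1.0)
```

With `alpha=1.0` the estimate reduces to the expected return, which is why CVaR is often described as interpolating between risk-neutral and worst-case objectives.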
Perceptual Image Anomaly Detection
Title | Perceptual Image Anomaly Detection |
Authors | Nina Tuluptceva, Bart Bakker, Irina Fedulova, Anton Konushin |
Abstract | We present a novel method for image anomaly detection, where algorithms that use samples drawn from some distribution of “normal” data aim to detect out-of-distribution (abnormal) samples. Our approach includes a combination of encoder and generator for mapping an image distribution to a predefined latent distribution and vice versa. It leverages Generative Adversarial Networks to learn these data distributions and uses perceptual loss for the detection of image abnormality. To accomplish this goal, we first introduce a new similarity metric, which expresses the perceived similarity between images and is robust to changes in image contrast. Second, we introduce a novel approach for the selection of weights of a multi-objective loss function (image reconstruction and distribution mapping) in the absence of a validation dataset for hyperparameter tuning. After training, our model measures the abnormality of the input image as the perceptual dissimilarity between it and the closest generated image of the modeled data distribution. The proposed approach is extensively evaluated on several publicly available image benchmarks and achieves state-of-the-art performance. |
Tasks | Anomaly Detection, Image Reconstruction |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05904v2 |
https://arxiv.org/pdf/1909.05904v2.pdf | |
PWC | https://paperswithcode.com/paper/perceptual-image-anomaly-detection |
Repo | |
Framework | |