January 31, 2020

3238 words 16 mins read

Paper Group ANR 54

On the Power of Multiple Anonymous Messages. Super-resolution meets machine learning: approximation of measures. Supervised Multiscale Dimension Reduction for Spatial Interaction Networks. CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning. Automatic Segmentation of Muscle Tissue and Inter-muscular …

On the Power of Multiple Anonymous Messages


Title	On the Power of Multiple Anonymous Messages
Authors	Badih Ghazi, Noah Golowich, Ravi Kumar, Rasmus Pagh, Ameya Velingker
Abstract	An exciting new development in differential privacy is the shuffled model, in which an anonymous channel enables circumventing the large errors that are necessary in the local model, while relying on much weaker trust assumptions than in the central model. In this paper, we study basic counting problems in the shuffled model and establish separations between the error that can be achieved in the single-message shuffled model and in the shuffled model with multiple messages per user. For the frequency estimation problem with $n$ users and for a domain of size $B$, we obtain: - A nearly tight lower bound of $\tilde{\Omega}( \min(n^{1/4}, \sqrt{B}))$ on the error in the single-message shuffled model. This implies that the protocols obtained from the amplification via shuffling work of Erlingsson et al. (SODA 2019) and Balle et al. (Crypto 2019) are essentially optimal for single-message protocols. - A nearly tight lower bound of $\Omega\left(\frac{\log{B}}{\log\log{B}}\right)$ on the sample complexity with constant relative error in the single-message shuffled model. This improves on the lower bound of $\Omega(\log^{1/17} B)$ obtained by Cheu et al. (Eurocrypt 2019). - Protocols in the multi-message shuffled model with $\mathrm{poly}(\log{B}, \log{n})$ bits of communication per user and $\mathrm{poly}\log{B}$ error, which provide an exponential improvement on the error compared to what is possible with single-message algorithms. For the related selection problem, we also show a nearly tight sample complexity lower bound of $\Omega(B)$ in the single-message shuffled model. This improves on the $\Omega(B^{1/17})$ lower bound obtained by Cheu et al. (Eurocrypt 2019), and when combined with their $\tilde{O}(\sqrt{B})$-error multi-message algorithm, implies the first separation between single-message and multi-message protocols for this problem.
Tasks
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11358v3
PDF	https://arxiv.org/pdf/1908.11358v3.pdf
PWC	https://paperswithcode.com/paper/private-heavy-hitters-and-range-queries-in
Repo
Framework

Super-resolution meets machine learning: approximation of measures


Title	Super-resolution meets machine learning: approximation of measures
Authors	H. N. Mhaskar
Abstract	The problem of super-resolution in general terms is to recuperate a finitely supported measure $\mu$ given finitely many of its coefficients $\hat{\mu}(k)$ with respect to some orthonormal system. The interesting case concerns situations, where the number of coefficients required is substantially smaller than a power of the reciprocal of the minimal separation among the points in the support of $\mu$. In this paper, we consider the more severe problem of recuperating $\mu$ approximately without any assumption on $\mu$ beyond having a finite total variation. In particular, $\mu$ may be supported on a continuum, so that the minimal separation among the points in the support of $\mu$ is $0$. A variant of this problem is also of interest in machine learning as well as the inverse problem of de-convolution. We define an appropriate notion of a distance between the target measure and its recuperated version, give an explicit expression for the recuperation operator, and estimate the distance between $\mu$ and its approximation. We show that these estimates are the best possible in many different ways. We also explain why for a finitely supported measure the approximation quality of its recuperation is bounded from below if the amount of information is smaller than what is demanded in the super-resolution problem.
Tasks	Super-Resolution
Published	2019-07-10
URL	https://arxiv.org/abs/1907.04895v1
PDF	https://arxiv.org/pdf/1907.04895v1.pdf
PWC	https://paperswithcode.com/paper/super-resolution-meets-machine-learning
Repo
Framework

Supervised Multiscale Dimension Reduction for Spatial Interaction Networks


Title	Supervised Multiscale Dimension Reduction for Spatial Interaction Networks
Authors	Shaobo Han, David B. Dunson
Abstract	We introduce a multiscale supervised dimension reduction method for SPatial Interaction Network (SPIN) data, which consist of a collection of spatially coordinated interactions. This type of predictor arises when the sampling unit of data is composed of a collection of primitive variables, each of them being essentially unique, so that it becomes necessary to group the variables in order to simplify the representation and enhance interpretability. In this paper, we introduce an empirical Bayes approach called spinlets, which first constructs a partitioning tree to guide the reduction over multiple spatial granularities, and then refines the representation of predictors according to the relevance to the response. We consider an inverse Poisson regression model and propose a new multiscale generalized double Pareto prior, which is induced via a tree-structured parameter expansion scheme. Our approach is motivated by an application in soccer analytics, in which we obtain compact vectorial representations and readily interpretable visualizations of the complex network objects, supervised by the response of interest.
Tasks	Dimensionality Reduction
Published	2019-01-01
URL	https://arxiv.org/abs/1901.00172v3
PDF	https://arxiv.org/pdf/1901.00172v3.pdf
PWC	https://paperswithcode.com/paper/supervised-multiscale-dimension-reduction-for
Repo
Framework


Title	CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning
Authors	Chi Zhang, Guosheng Lin, Fayao Liu, Rui Yao, Chunhua Shen
Abstract	Recent progress in semantic segmentation is driven by deep Convolutional Neural Networks and large-scale labeled image datasets. However, data labeling for pixel-wise segmentation is tedious and costly. Moreover, a trained model can only make predictions within a set of pre-defined classes. In this paper, we present CANet, a class-agnostic segmentation network that performs few-shot segmentation on new classes with only a few annotated images available. Our network consists of a two-branch dense comparison module which performs multi-level feature comparison between the support image and the query image, and an iterative optimization module which iteratively refines the predicted results. Furthermore, we introduce an attention mechanism to effectively fuse information from multiple support examples under the setting of k-shot learning. Experiments on PASCAL VOC 2012 show that our method achieves a mean Intersection-over-Union score of 55.4% for 1-shot segmentation and 57.1% for 5-shot segmentation, outperforming state-of-the-art methods by a large margin of 14.6% and 13.2%, respectively.
Tasks	Few-Shot Learning, Semantic Segmentation
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02351v1
PDF	http://arxiv.org/pdf/1903.02351v1.pdf
PWC	https://paperswithcode.com/paper/canet-class-agnostic-segmentation-networks
Repo
Framework

Automatic Segmentation of Muscle Tissue and Inter-muscular Fat in Thigh and Calf MRI Images


Title	Automatic Segmentation of Muscle Tissue and Inter-muscular Fat in Thigh and Calf MRI Images
Authors	Rula Amer, Jannette Nassar, David Bendahan, Hayit Greenspan, Noam Ben-Eliezer
Abstract	Magnetic resonance imaging (MRI) of thigh and calf muscles is one of the most effective techniques for estimating fat infiltration into muscular dystrophies. The infiltration of adipose tissue into the diseased muscle region varies in its severity across, and within, patients. In order to efficiently quantify the infiltration of fat, accurate segmentation of muscle and fat is needed. An estimation of the amount of infiltrated fat is typically done visually by experts. Several algorithmic solutions have been proposed for automatic segmentation. While these methods may work well in mild cases, they struggle in moderate and severe cases due to the high variability in the intensity of infiltration, and the tissue’s heterogeneous nature. To address these challenges, we propose a deep-learning approach, producing robust results with high Dice Similarity Coefficient (DSC) of 0.964, 0.917 and 0.933 for muscle-region, healthy muscle and inter-muscular adipose tissue (IMAT) segmentation, respectively.
Tasks
Published	2019-10-01
URL	https://arxiv.org/abs/1910.04866v1
PDF	https://arxiv.org/pdf/1910.04866v1.pdf
PWC	https://paperswithcode.com/paper/automatic-segmentation-of-muscle-tissue-and
Repo
Framework

Bridging the Gap between Pre-Training and Fine-Tuning for End-to-End Speech Translation


Title	Bridging the Gap between Pre-Training and Fine-Tuning for End-to-End Speech Translation
Authors	Chengyi Wang, Yu Wu, Shujie Liu, Zhenglu Yang, Ming Zhou
Abstract	End-to-end speech translation, a hot topic in recent years, aims to translate a segment of audio into a specific language with an end-to-end model. Conventional approaches employ multi-task learning and pre-training methods for this task, but they suffer from the huge gap between pre-training and fine-tuning. To address these issues, we propose a Tandem Connectionist Encoding Network (TCEN) which bridges the gap by reusing all subnets in fine-tuning, keeping the roles of subnets consistent, and pre-training the attention module. Furthermore, we propose two simple but effective methods to guarantee the speech encoder outputs and the MT encoder inputs are consistent in terms of semantic representation and sequence length. Experimental results show that our model outperforms baselines 2.2 BLEU on a large benchmark dataset.
Tasks	Multi-Task Learning
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07575v3
PDF	https://arxiv.org/pdf/1909.07575v3.pdf
PWC	https://paperswithcode.com/paper/bridging-the-gap-between-pre-training-and
Repo
Framework

RL4health: Crowdsourcing Reinforcement Learning for Knee Replacement Pathway Optimization


Title	RL4health: Crowdsourcing Reinforcement Learning for Knee Replacement Pathway Optimization
Authors	Hao Lu, Mengdi Wang
Abstract	Joint replacement is the most common inpatient surgical treatment in the US. We investigate the clinical pathway optimization for knee replacement, which is a sequential decision process from onset to recovery. Based on episodic claims from previous cases, we view the pathway optimization as an intelligence crowdsourcing problem and learn the optimal decision policy from data by imitating the best expert at every intermediate state. We develop a reinforcement learning-based pipeline that uses value iteration, state compression and aggregation learning, kernel representation and cross validation to predict the best treatment policy. It also provides forecast of the clinical pathway under the optimized policy. Empirical validation shows that the optimized policy reduces the overall cost by 7 percent and reduces the excessive cost premium by 33 percent.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1906.01407v1
PDF	https://arxiv.org/pdf/1906.01407v1.pdf
PWC	https://paperswithcode.com/paper/190601407
Repo
Framework

HinDom: A Robust Malicious Domain Detection System based on Heterogeneous Information Network with Transductive Classification


Title	HinDom: A Robust Malicious Domain Detection System based on Heterogeneous Information Network with Transductive Classification
Authors	Xiaoqing Sun, Mingkai Tong, Jiahai Yang
Abstract	Domain name system (DNS) is a crucial part of the Internet, yet has been widely exploited by cyber attackers. Apart from making static methods like blacklists or sinkholes infeasible, some weasel attackers can even bypass detection systems with machine learning based classifiers. As a solution to this problem, we propose a robust domain detection system named HinDom. Instead of relying on manually selected features, HinDom models the DNS scene as a Heterogeneous Information Network (HIN) consist of clients, domains, IP addresses and their diverse relationships. Besides, the metapath-based transductive classification method enables HinDom to detect malicious domains with only a small fraction of labeled samples. So far as we know, this is the first work to apply HIN in DNS analysis. We build a prototype of HinDom and evaluate it in CERNET2 and TUNET. The results reveal that HinDom is accurate, robust and can identify previously unknown malicious domains.
Tasks
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01590v1
PDF	https://arxiv.org/pdf/1909.01590v1.pdf
PWC	https://paperswithcode.com/paper/hindom-a-robust-malicious-domain-detection
Repo
Framework

TamperNN: Efficient Tampering Detection of Deployed Neural Nets


Title	TamperNN: Efficient Tampering Detection of Deployed Neural Nets
Authors	Erwan Le Merrer, Gilles Tredan
Abstract	Neural networks are powering the deployment of embedded devices and Internet of Things. Applications range from personal assistants to critical ones such as self-driving cars. It has been shown recently that models obtained from neural nets can be trojaned ; an attacker can then trigger an arbitrary model behavior facing crafted inputs. This has a critical impact on the security and reliability of those deployed devices. We introduce novel algorithms to detect the tampering with deployed models, classifiers in particular. In the remote interaction setup we consider, the proposed strategy is to identify markers of the model input space that are likely to change class if the model is attacked, allowing a user to detect a possible tampering. This setup makes our proposal compatible with a wide range of scenarios, such as embedded models, or models exposed through prediction APIs. We experiment those tampering detection algorithms on the canonical MNIST dataset, over three different types of neural nets, and facing five different attacks (trojaning, quantization, fine-tuning, compression and watermarking). We then validate over five large models (VGG16, VGG19, ResNet, MobileNet, DenseNet) with a state of the art dataset (VGGFace2), and report results demonstrating the possibility of an efficient detection of model tampering.
Tasks	Quantization, Self-Driving Cars
Published	2019-03-01
URL	https://arxiv.org/abs/1903.00317v2
PDF	https://arxiv.org/pdf/1903.00317v2.pdf
PWC	https://paperswithcode.com/paper/tampernn-efficient-tampering-detection-of
Repo
Framework

Actor Critic with Differentially Private Critic


Title	Actor Critic with Differentially Private Critic
Authors	Jonathan Lebensold, William Hamilton, Borja Balle, Doina Precup
Abstract	Reinforcement learning algorithms are known to be sample inefficient, and often performance on one task can be substantially improved by leveraging information (e.g., via pre-training) on other related tasks. In this work, we propose a technique to achieve such knowledge transfer in cases where agent trajectories contain sensitive or private information, such as in the healthcare domain. Our approach leverages a differentially private policy evaluation algorithm to initialize an actor-critic model and improve the effectiveness of learning in downstream tasks. We empirically show this technique increases sample efficiency in resource-constrained control problems while preserving the privacy of trajectories collected in an upstream task.
Tasks	Transfer Learning
Published	2019-10-14
URL	https://arxiv.org/abs/1910.05876v1
PDF	https://arxiv.org/pdf/1910.05876v1.pdf
PWC	https://paperswithcode.com/paper/actor-critic-with-differentially-private
Repo
Framework

Assessing Data Quality of Annotations with Krippendorff Alpha For Applications in Computer Vision


Title	Assessing Data Quality of Annotations with Krippendorff Alpha For Applications in Computer Vision
Authors	Joseph Nassar, Viveca Pavon-Harr, Marc Bosch, Ian McCulloh
Abstract	Current supervised deep learning frameworks rely on annotated data for modeling the underlying data distribution of a given task. In particular for computer vision algorithms powered by deep learning, the quality of annotated data is the most critical factor in achieving the desired algorithm performance. Data annotation is, typically, a manual process where the annotator follows guidelines and operates in a best-guess manner. Labeling criteria among annotators can show discrepancies in labeling results. This may impact the algorithm inference performance. Given the popularity and widespread use of deep learning among computer vision, more and more custom datasets are needed to train neural networks to tackle different kinds of tasks. Unfortunately, there is no full understanding of the factors that affect annotated data quality, and how it translates into algorithm performance. In this paper we studied this problem for object detection and recognition.We conducted several data annotation experiments to measure inter annotator agreement and consistency, as well as how the selection of ground truth impacts the perceived algorithm performance.We propose a methodology to monitor the quality of annotations during the labeling of images and how it can be used to measure performance. We also show that neglecting to monitor the annotation process can result in significant loss in algorithm precision. Through these experiments, we observe that knowledge of the labeling process, training data, and ground truth data used for algorithm evaluation are fundamental components to accurately assess trustworthiness of an AI system.
Tasks	Object Detection
Published	2019-12-20
URL	https://arxiv.org/abs/1912.10107v1
PDF	https://arxiv.org/pdf/1912.10107v1.pdf
PWC	https://paperswithcode.com/paper/assessing-data-quality-of-annotations-with
Repo
Framework

Multi-agent Path Finding with Continuous Time Viewed Through Satisfiability Modulo Theories (SMT)


Title	Multi-agent Path Finding with Continuous Time Viewed Through Satisfiability Modulo Theories (SMT)
Authors	Pavel Surynek
Abstract	This paper addresses a variant of multi-agent path finding (MAPF) in continuous space and time. We present a new solving approach based on satisfiability modulo theories (SMT) to obtain makespan optimal solutions. The standard MAPF is a task of navigating agents in an undirected graph from given starting vertices to given goal vertices so that agents do not collide with each other in vertices of the graph. In the continuous version (MAPF$^\mathcal{R}$) agents move in an $n$-dimensional Euclidean space along straight lines that interconnect predefined positions. For simplicity, we work with circular omni-directional agents having constant velocities in the 2D plane. As agents can have different sizes and move smoothly along lines, a non-colliding movement along certain lines with small agents can result in a collision if the same movement is performed with larger agents. Our SMT-based approach for MAPF$^\mathcal{R}$ called SMT-CBS$^\mathcal{R}$ reformulates the Conflict-based Search (CBS) algorithm in terms of SMT concepts. We suggest lazy generation of decision variables and constraints. Each time a new conflict is discovered, the underlying encoding is extended with new variables and constraints to eliminate the conflict. We compared SMT-CBS$^\mathcal{R}$ and adaptations of CBS for the continuous variant of MAPF experimentally.
Tasks	Multi-Agent Path Finding
Published	2019-03-23
URL	http://arxiv.org/abs/1903.09820v1
PDF	http://arxiv.org/pdf/1903.09820v1.pdf
PWC	https://paperswithcode.com/paper/multi-agent-path-finding-with-continuous-time
Repo
Framework

On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman


Title	On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman
Authors	Chao Gao, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor
Abstract	How to best explore in domains with sparse, delayed, and deceptive rewards is an important open problem for reinforcement learning (RL). This paper considers one such domain, the recently-proposed multi-agent benchmark of Pommerman. This domain is very challenging for RL — past work has shown that model-free RL algorithms fail to achieve significant learning without artificially reducing the environment’s complexity. In this paper, we illuminate reasons behind this failure by providing a thorough analysis on the hardness of random exploration in Pommerman. While model-free random exploration is typically futile, we develop a model-based automatic reasoning module that can be used for safer exploration by pruning actions that will surely lead the agent to death. We empirically demonstrate that this module can significantly improve learning.
Tasks
Published	2019-07-26
URL	https://arxiv.org/abs/1907.11788v1
PDF	https://arxiv.org/pdf/1907.11788v1.pdf
PWC	https://paperswithcode.com/paper/on-hard-exploration-for-reinforcement
Repo
Framework

Stable Manipulation in Voting


Title	Stable Manipulation in Voting
Authors	Aditya Anand, Palash Dey
Abstract	We introduce the problem of {\em stable manipulation} where the manipulators need to compute if there exist votes for the manipulators which make their preferred alternative win the election even if the manipulators’ knowledge about others’ votes are little inaccurate, that is, manipulation remains successful even under small perturbation of the non-manipulators’ votes. We show that every scoring rule, maximin, Bucklin, and simplified Bucklin voting rules are stably manipulable in polynomial time for single manipulator. In contrast, stable manipulation becomes intractable for the Copeland$^\alpha$ voting rule for every $\alpha\in[0,1]$ even for single manipulator. However for a constant number of alternatives, we show that the stable manipulation problem is polynomial time solvable for every anonymous and efficient voting rules. Finally we empirically show that the probability that a uniformly random profile is stably manipulable decreases drastically even if manipulator possess little uncertainty about others’ votes.
Tasks
Published	2019-09-07
URL	https://arxiv.org/abs/1909.03162v1
PDF	https://arxiv.org/pdf/1909.03162v1.pdf
PWC	https://paperswithcode.com/paper/stable-manipulation-in-voting
Repo
Framework

A Medical Literature Search System for Identifying Effective Treatments in Precision Medicine


Title	A Medical Literature Search System for Identifying Effective Treatments in Precision Medicine
Authors	Jiaming Qu, Yue Wang
Abstract	The Precision Medicine Initiative states that treatments for a patient should take into account not only the patient’s disease, but his/her specific genetic variation as well. The vast biomedical literature holds the potential for physicians to identify effective treatment options for a cancer patient. However, the complexity and ambiguity of medical terms can result in vocabulary mismatch between the physician’s query and the literature. The physician’s search intent (finding treatments instead of other types of studies) is difficult to explicitly formulate in a query. Therefore, simple ad hot retrieval approach will suffer from low recall and precision. In this paper, we propose a new retrieval system that helps physicians identify effective treatments in precision medicine. Given a cancer patient with a specific disease, genetic variation, and demographic information, the system aims to identify biomedical publications that report effective treatments. We approach this goal from two directions. First, we expand the original disease and gene terms using biomedical knowledge bases to improve recall of the initial retrieval. We then improve precision by promoting treatment-related publications to the top using a machine learning reranker trained on 2017 Text Retrieval Conference Precision Medicine (PM) track corpus. Batch evaluation results on 2018 PM track corpus show that the proposed approach effectively improves both recall and precision, achieving performance comparable to the top entries on the leaderboard of 2018 PM track.
Tasks
Published	2019-04-16
URL	http://arxiv.org/abs/1904.07428v2
PDF	http://arxiv.org/pdf/1904.07428v2.pdf
PWC	https://paperswithcode.com/paper/a-medical-literature-search-system-for
Repo
Framework