Paper Group ANR 1511
A Modern Retrospective on Probabilistic Numerics. Procrustes registration of two-dimensional statistical shape models without correspondences. Fine-grained lesion annotation in CT images with knowledge mined from radiology reports. Persistence Curves: A canonical framework for summarizing persistence diagrams. Visualizing Representational Dynamics …
A Modern Retrospective on Probabilistic Numerics
Title | A Modern Retrospective on Probabilistic Numerics |
Authors | C. J. Oates, T. J. Sullivan |
Abstract | This article attempts to place the emergence of probabilistic numerics as a mathematical-statistical research field within its historical context and to explore how its gradual development can be related both to applications and to a modern formal treatment. We highlight in particular the parallel contributions of Sul’din and Larkin in the 1960s and how their pioneering early ideas have reached a degree of maturity in the intervening period, mediated by paradigms such as average-case analysis and information-based complexity. We provide a subjective assessment of the state of research in probabilistic numerics and highlight some difficulties to be addressed by future work. |
Tasks | |
Published | 2019-01-14 |
URL | https://arxiv.org/abs/1901.04457v3 |
PDF | https://arxiv.org/pdf/1901.04457v3.pdf |
PWC | https://paperswithcode.com/paper/a-modern-retrospective-on-probabilistic |
Repo | |
Framework | |
Procrustes registration of two-dimensional statistical shape models without correspondences
Title | Procrustes registration of two-dimensional statistical shape models without correspondences |
Authors | Alma Eguizabal, Peter J. Schreier, Jürgen Schmidt |
Abstract | Statistical shape models are a useful tool in image processing and computer vision. A Procrustes registration of the contours of the same shape is typically performed to align the training samples in order to learn the statistical shape model. A Procrustes registration between two contours with known correspondences is straightforward. However, these correspondences are not generally available. Manually placed landmarks are often used for correspondence in the design of statistical shape models. However, determining manual landmarks on the contours is time-consuming and often error-prone. One solution to simultaneously find correspondence and registration is the Iterative Closest Point (ICP) algorithm. However, ICP requires an initial position of the contours that is close to registration, and it is not robust against outliers. We propose a new strategy, based on Dynamic Time Warping, that efficiently solves the Procrustes registration problem without correspondences. We study the registration performance on a collection of different shape data sets and show that our technique outperforms competing techniques based on the ICP approach. Our strategy is applied to an ensemble of contours of the same shape as an extension of generalized Procrustes analysis that accounts for a lack of correspondences. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11431v2 |
PDF | https://arxiv.org/pdf/1911.11431v2.pdf |
PWC | https://paperswithcode.com/paper/procrustes-registration-of-two-dimensional |
Repo | |
Framework | |
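The correspondence-based step the abstract calls straightforward can be made concrete. Below is a minimal NumPy sketch of classical 2-D Procrustes alignment via SVD, assuming the two contours are point arrays whose rows are already in correspondence; the function name and toy data are ours, and the paper's actual contribution (recovering correspondences via Dynamic Time Warping) is not reproduced here.

```python
import numpy as np

def procrustes_align(X, Y):
    """Align contour Y to contour X (both (n, 2), rows in correspondence).

    Returns scale s, rotation R, translation t minimizing ||X - (s Y R + t)||_F.
    """
    muX, muY = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - muX, Y - muY
    # Optimal rotation from the SVD of the cross-covariance matrix.
    U, S, Vt = np.linalg.svd(Y0.T @ X0)
    R = U @ Vt
    if np.linalg.det(R) < 0:          # guard against reflections
        U[:, -1] *= -1
        S[-1] *= -1
        R = U @ Vt
    s = S.sum() / (Y0 ** 2).sum()     # optimal isotropic scale
    t = muX - s * (muY @ R)
    return s, R, t

# Toy check: recover a known similarity transform exactly.
rng = np.random.default_rng(0)
Y = rng.normal(size=(50, 2))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
X = 0.5 * Y @ R_true + np.array([1.0, -2.0])
s, R, t = procrustes_align(X, Y)
assert np.allclose(s * Y @ R + t, X, atol=1e-8)
```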
Fine-grained lesion annotation in CT images with knowledge mined from radiology reports
Title | Fine-grained lesion annotation in CT images with knowledge mined from radiology reports |
Authors | Ke Yan, Yifan Peng, Zhiyong Lu, Ronald M. Summers |
Abstract | In radiologists’ routine work, one major task is to read a medical image, e.g., a CT scan, find significant lesions, and write sentences in the radiology report to describe them. In this paper, we study the lesion description or annotation problem as an important step of computer-aided diagnosis (CAD). Given a lesion image, our aim is to predict multiple relevant labels, such as the lesion’s body part, type, and attributes. To address this problem, we define a set of 145 labels based on RadLex to describe a large variety of lesions in the DeepLesion dataset. We directly mine training labels from the lesion’s corresponding sentence in the radiology report, which requires minimal manual effort and is easily generalizable to large data and label sets. A multi-label convolutional neural network with a multi-scale structure and a noise-robust loss is then proposed. Quantitative and qualitative experiments demonstrate the effectiveness of the framework. The average area under the ROC curve on 1,872 test lesions is 0.9083. |
Tasks | |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01505v2 |
PDF | http://arxiv.org/pdf/1903.01505v2.pdf |
PWC | https://paperswithcode.com/paper/fine-grained-lesion-annotation-in-ct-images |
Repo | |
Framework | |
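As a rough illustration of the multi-label setup (145 RadLex labels per lesion), here is a minimal, numerically stable multi-label binary cross-entropy in NumPy with an optional per-label weight mask. The mask is a generic stand-in for down-weighting labels mined unreliably from report sentences; it is our assumption, not the paper's actual noise-robust loss.

```python
import numpy as np

def multilabel_bce(logits, targets, mask=None):
    """Mean binary cross-entropy over labels.

    logits, targets: (batch, n_labels); targets in {0, 1}.
    mask: optional (batch, n_labels) weights in [0, 1]; labels whose presence
    could not be mined reliably from the report can be down-weighted here.
    """
    log_p = -np.logaddexp(0.0, -logits)   # log sigmoid(x), stable
    log_1p = -np.logaddexp(0.0, logits)   # log (1 - sigmoid(x)), stable
    loss = -(targets * log_p + (1 - targets) * log_1p)
    if mask is not None:
        loss = loss * mask
    return loss.mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 145))                  # batch of 2 lesions
targets = (rng.random((2, 145)) < 0.05).astype(float)
print(multilabel_bce(logits, targets))
```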
Persistence Curves: A canonical framework for summarizing persistence diagrams
Title | Persistence Curves: A canonical framework for summarizing persistence diagrams |
Authors | Yu-Min Chung, Austin Lawson |
Abstract | Persistence diagrams are one of the main tools in the field of Topological Data Analysis (TDA). They contain fruitful information about the shape of data. The use of machine learning algorithms on the space of persistence diagrams proves to be challenging as the space is complicated. For that reason, transforming these diagrams in a way that is compatible with machine learning is an important topic currently researched in TDA. In this paper, our main contribution consists of three components. First, we develop a general framework for vectorizing diagrams that we call \textit{Persistence Curves} (PCs). We show that some well-known summaries, such as Betti number curves, the Euler Characteristic Curve, and Persistence Landscapes, fall under the PC framework or are easily derived from it. Second, we provide a theoretical foundation for the stability analysis of PCs. In addition, we propose several new summaries based on the PC framework and investigate their stability. Finally, we demonstrate the practical use of PCs for texture classification on four publicly available texture datasets. We show that our proposed PCs outperform several existing TDA methods. |
Tasks | Texture Classification, Topological Data Analysis |
Published | 2019-04-16 |
URL | https://arxiv.org/abs/1904.07768v2 |
PDF | https://arxiv.org/pdf/1904.07768v2.pdf |
PWC | https://paperswithcode.com/paper/persistence-curves-a-canonical-framework-for |
Repo | |
Framework | |
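The framework admits a compact sketch: fix a grid of filtration values and, at each value, summarize the bars of the diagram that are alive there. A minimal NumPy version follows, with the Betti number curve as the default special case; using a plain sum as the summary statistic and the particular `psi` signature are our simplifications of the general framework.

```python
import numpy as np

def persistence_curve(diagram, ts, psi=lambda b, d, t: np.ones_like(b)):
    """Generic persistence curve: at each t, sum psi(b, d, t) over bars
    (b, d) that are alive at t (b <= t < d).

    psi = 1 yields the Betti number curve; psi = d - b a lifespan-weighted one.
    """
    diagram = np.asarray(diagram, dtype=float)
    out = np.zeros(len(ts))
    for i, t in enumerate(ts):
        alive = (diagram[:, 0] <= t) & (t < diagram[:, 1])
        if alive.any():
            b, d = diagram[alive, 0], diagram[alive, 1]
            out[i] = np.sum(psi(b, d, t))
    return out

dgm = [(0.0, 0.9), (0.2, 0.5), (0.4, 1.2)]
ts = np.linspace(0.0, 1.5, 16)
betti = persistence_curve(dgm, ts)                            # Betti curve
life = persistence_curve(dgm, ts, psi=lambda b, d, t: d - b)  # lifespan curve
```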
Visualizing Representational Dynamics with Multidimensional Scaling Alignment
Title | Visualizing Representational Dynamics with Multidimensional Scaling Alignment |
Authors | Baihan Lin, Marieke Mur, Tim Kietzmann, Nikolaus Kriegeskorte |
Abstract | Representational similarity analysis (RSA) has been shown to be an effective framework to characterize brain-activity profiles and deep neural network activations as representational geometry by computing the pairwise distances of the response patterns as a representational dissimilarity matrix (RDM). However, how to properly analyze and visualize the representational geometry as dynamics over the time course from stimulus onset to offset is not well understood. In this work, we formulated a pipeline to understand representational dynamics with RDM movies and Procrustes-aligned Multidimensional Scaling (pMDS), and applied it to neural recordings of monkey IT cortex. Our results suggest that the multidimensional scaling alignment can genuinely capture the dynamics of the category-specific representation spaces with multiple visualization possibilities, and that object categorization may be hierarchical, multi-staged, and oscillatory (or recurrent). |
Tasks | |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09264v2 |
PDF | https://arxiv.org/pdf/1906.09264v2.pdf |
PWC | https://paperswithcode.com/paper/visualizing-representational-dynamics-with |
Repo | |
Framework | |
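A minimal sketch of the two ingredients the pipeline combines, as we read the abstract: classical (Torgerson) MDS applied to each RDM frame of the movie, then an orthogonal Procrustes rotation of every frame onto the first so the embeddings are comparable over time. The synthetic `rdms` movie below is purely illustrative.

```python
import numpy as np

def classical_mds(rdm, k=2):
    """Classical (Torgerson) MDS: embed an (n, n) dissimilarity matrix in R^k."""
    n = rdm.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (rdm ** 2) @ J              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]              # top-k eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def procrustes_to(ref, X):
    """Rotate/reflect X (n, k) onto ref; removes the rotational ambiguity of
    MDS so successive time points can be compared."""
    U, _, Vt = np.linalg.svd(X.T @ ref)
    return X @ (U @ Vt)

# rdms: (T, n, n) movie of representational dissimilarity matrices.
rng = np.random.default_rng(1)
pts = rng.normal(size=(8, 2))
rdms = np.stack([np.linalg.norm(pts[:, None] - pts[None], axis=-1)] * 3)
frames = [classical_mds(r) for r in rdms]
aligned = [frames[0]] + [procrustes_to(frames[0], f) for f in frames[1:]]
```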
To Relieve Your Headache of Training an MRF, Take AdVIL
Title | To Relieve Your Headache of Training an MRF, Take AdVIL |
Authors | Chongxuan Li, Chao Du, Kun Xu, Max Welling, Jun Zhu, Bo Zhang |
Abstract | We propose a black-box algorithm called {\it Adversarial Variational Inference and Learning} (AdVIL) to perform inference and learning on a general Markov random field (MRF). AdVIL employs two variational distributions to approximately infer the latent variables and estimate the partition function of an MRF, respectively. The two variational distributions provide an estimate of the negative log-likelihood of the MRF as a minimax optimization problem, which is solved by stochastic gradient descent. AdVIL is proven convergent under certain conditions. On one hand, compared with contrastive divergence, AdVIL requires a minimal assumption about the model structure and can deal with a broader family of MRFs. On the other hand, compared with existing black-box methods, AdVIL provides a tighter estimate of the log partition function and achieves much better empirical results. |
Tasks | |
Published | 2019-01-24 |
URL | https://arxiv.org/abs/1901.08400v3 |
PDF | https://arxiv.org/pdf/1901.08400v3.pdf |
PWC | https://paperswithcode.com/paper/adversarial-variational-inference-and |
Repo | |
Framework | |
Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model
Title | Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model |
Authors | Gi-Soo Kim, Myunghee Cho Paik |
Abstract | Contextual multi-armed bandit (MAB) algorithms have shown promise for maximizing cumulative rewards in sequential decision tasks such as news article recommendation systems, web page ad placement algorithms, and mobile health. However, most of the proposed contextual MAB algorithms assume linear relationships between the reward and the context of the action. This paper proposes a new contextual MAB algorithm for a relaxed, semiparametric reward model that supports nonstationarity. The proposed method is less restrictive, easier to implement, and faster than two alternative algorithms that consider the same model, while achieving a tight regret upper bound. We prove that the high-probability upper bound of the regret incurred by the proposed algorithm has the same order as that of the Thompson sampling algorithm for linear reward models. The proposed and existing algorithms are evaluated via simulation and also applied to Yahoo! news article recommendation log data. |
Tasks | Recommendation Systems |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11221v1 |
PDF | http://arxiv.org/pdf/1901.11221v1.pdf |
PWC | https://paperswithcode.com/paper/contextual-multi-armed-bandit-algorithm-for |
Repo | |
Framework | |
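The paper's semiparametric algorithm itself is not reproduced here, but the comparator named in the abstract, Thompson sampling for linear reward models, fits in a few lines of NumPy and makes the regret statement concrete. The toy reward in the loop is an assumption for demonstration only.

```python
import numpy as np

def lin_ts_round(contexts, A, b, v2=1.0, rng=np.random):
    """One round of linear Thompson sampling.

    contexts: (n_arms, d) context of each action; A: (d, d) precision; b: (d,).
    Samples theta from the Gaussian posterior and plays the greedy arm.
    """
    mu = np.linalg.solve(A, b)                  # posterior mean
    theta = rng.multivariate_normal(mu, v2 * np.linalg.inv(A))
    return int(np.argmax(contexts @ theta))

def lin_ts_update(A, b, x, r):
    """Rank-one Bayesian update after observing reward r for context x."""
    return A + np.outer(x, x), b + r * x

d, n_arms = 5, 10
A, b = np.eye(d), np.zeros(d)
rng = np.random.default_rng(2)
for _ in range(100):
    X = rng.normal(size=(n_arms, d))            # fresh contexts each round
    a = lin_ts_round(X, A, b, rng=rng)
    r = 0.1 * X[a].sum() + 0.1 * rng.normal()   # toy linear reward + noise
    A, b = lin_ts_update(A, b, X[a], r)
```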
Shielding Collaborative Learning: Mitigating Poisoning Attacks through Client-Side Detection
Title | Shielding Collaborative Learning: Mitigating Poisoning Attacks through Client-Side Detection |
Authors | Lingchen Zhao, Shengshan Hu, Qian Wang, Jianlin Jiang, Chao Shen, Xiangyang Luo, Pengfei Hu |
Abstract | Collaborative learning allows multiple clients to train a joint model without sharing their data with each other. Each client performs training locally and then submits the model updates to a central server for aggregation. Since the server has no visibility into the process of generating the updates, collaborative learning is vulnerable to poisoning attacks where a malicious client can generate a poisoned update to introduce backdoor functionality to the joint model. The existing solutions for detecting poisoned updates, however, fail to defend against the recently proposed attacks, especially in the non-IID setting. In this paper, we present a novel defense scheme to detect anomalous updates in both IID and non-IID settings. Our key idea is to realize client-side cross-validation, where each update is evaluated over other clients’ local data. The server will adjust the weights of the updates based on the evaluation results when performing aggregation. To adapt to the unbalanced distribution of data in the non-IID setting, a dynamic client allocation mechanism is designed to assign detection tasks to the most suitable clients. During the detection process, we also protect the client-level privacy to prevent malicious clients from stealing the training data of other clients, by integrating differential privacy with our design without degrading the detection performance. Our experimental evaluations on two real-world datasets show that our scheme is significantly robust to two representative poisoning attacks. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13111v2 |
PDF | https://arxiv.org/pdf/1910.13111v2.pdf |
PWC | https://paperswithcode.com/paper/shielding-collaborative-learning-mitigating |
Repo | |
Framework | |
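The core aggregation idea can be sketched: each submitted update is validated on other clients' local data, and the server weights updates by those scores. The softmax weighting and temperature below are our assumptions; the paper's scheme additionally allocates detection tasks dynamically in non-IID settings and integrates differential privacy.

```python
import numpy as np

def weighted_aggregate(updates, scores, temperature=1.0):
    """Server-side aggregation weighted by client-side evaluation scores.

    updates: list of flattened model updates, each of shape (p,).
    scores[i]: mean validation metric of update i measured on *other*
    clients' local data (higher = more trustworthy). A softmax turns the
    scores into aggregation weights, suppressing updates that validate
    poorly elsewhere.
    """
    scores = np.asarray(scores, dtype=float)
    w = np.exp((scores - scores.max()) / temperature)
    w /= w.sum()
    return np.sum([wi * u for wi, u in zip(w, updates)], axis=0)

updates = [np.ones(4), np.ones(4), 10 * np.ones(4)]   # third one is poisoned
scores = [0.91, 0.89, 0.12]                            # ...and validates badly
print(weighted_aggregate(updates, scores, temperature=0.1))
```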
Learning Household Task Knowledge from WikiHow Descriptions
Title | Learning Household Task Knowledge from WikiHow Descriptions |
Authors | Yilun Zhou, Julie A. Shah, Steven Schockaert |
Abstract | Commonsense procedural knowledge is important for AI agents and robots that operate in a human environment. While previous attempts at constructing procedural knowledge are mostly rule- and template-based, recent advances in deep learning provide the possibility of acquiring such knowledge directly from natural language sources. As a first step in this direction, we propose a model to learn embeddings for tasks, as well as the individual steps that need to be taken to solve them, based on WikiHow articles. We learn these embeddings such that they are predictive of both step relevance and step ordering. We also experiment with the use of integer programming for inferring consistent global step orderings from noisy pairwise predictions. |
Tasks | |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06414v1 |
PDF | https://arxiv.org/pdf/1909.06414v1.pdf |
PWC | https://paperswithcode.com/paper/learning-household-task-knowledge-from-1 |
Repo | |
Framework | |
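For the last step, inferring a consistent global step ordering from noisy pairwise predictions, here is a hedged stand-in: the paper uses integer programming, while this sketch brute-forces the same log-likelihood objective over permutations, which is only feasible for a handful of steps but shows what is being optimized.

```python
import numpy as np
from itertools import permutations

def best_global_order(P):
    """Pick the permutation maximizing the total log-probability of its
    implied pairwise orderings, where P[i, j] is the predicted probability
    that step i precedes step j."""
    n = P.shape[0]
    logP = np.log(np.clip(P, 1e-9, 1.0))
    def score(order):
        return sum(logP[a, b] for i, a in enumerate(order) for b in order[i + 1:])
    return max(permutations(range(n)), key=score)

P = np.array([[0.5, 0.9, 0.8],
              [0.1, 0.5, 0.7],
              [0.2, 0.3, 0.5]])
print(best_global_order(P))   # -> (0, 1, 2)
```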
Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning
Title | Health-Informed Policy Gradients for Multi-Agent Reinforcement Learning |
Authors | Ross E. Allen, Javona White Bear, Jayesh K. Gupta, Mykel J. Kochenderfer |
Abstract | This paper proposes a definition of system health in the context of multiple agents optimizing a joint reward function. We use this definition as a credit assignment term in a policy gradient algorithm to distinguish the contributions of individual agents to the global reward. The health-informed credit assignment is then extended to a multi-agent variant of the proximal policy optimization algorithm and demonstrated on simple particle environments that have characteristics such as system health, risk-taking, semi-expendable agents, continuous action spaces, and partial observability. We show significant improvement in learning performance compared to policy gradient methods that do not perform multi-agent credit assignment. |
Tasks | Multi-agent Reinforcement Learning, Policy Gradient Methods |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.01022v3 |
PDF | https://arxiv.org/pdf/1908.01022v3.pdf |
PWC | https://paperswithcode.com/paper/health-informed-policy-gradients-for-multi |
Repo | |
Framework | |
On Distributed Quantization for Classification
Title | On Distributed Quantization for Classification |
Authors | Osama A. Hanna, Yahya H. Ezzeldin, Tara Sadjadpour, Christina Fragouli, Suhas Diggavi |
Abstract | We consider the problem of distributed feature quantization, where the goal is to enable a pretrained classifier at a central node to carry out its classification on features that are gathered from distributed nodes through communication-constrained channels. We propose the design of distributed quantization schemes specifically tailored to the classification task: unlike quantization schemes that help the central node reconstruct the original signal as accurately as possible, our focus is not reconstruction accuracy but correct classification. Our work does not make any a priori distributional assumptions on the data, but instead uses training data for the quantizer design. Our main contributions include: we prove NP-hardness of finding optimal quantizers in the general case; we design an optimal scheme for a special case; and we propose quantization algorithms that leverage discrete neural representations and training data, and can be designed in polynomial time for any number of features, any number of classes, and arbitrary division of features across the distributed nodes. We find that tailoring the quantizers to the classification task can offer significant savings: compared to alternatives, we can achieve more than a factor-of-two reduction in the number of bits communicated, for the same classification accuracy. |
Tasks | Quantization |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00216v1 |
PDF | https://arxiv.org/pdf/1911.00216v1.pdf |
PWC | https://paperswithcode.com/paper/on-distributed-quantization-for |
Repo | |
Framework | |
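A toy version of "quantize for the classifier, not for reconstruction": choose a 1-bit threshold for one feature so that a fixed pretrained classifier stays accurate on the quantized inputs. The greedy per-feature search, the centroid reproduction values, and the linear toy classifier are all our assumptions; the paper proves the general design problem NP-hard and proposes learned quantizers based on discrete neural representations.

```python
import numpy as np

def fit_1bit_threshold(X, y, j, classify, candidates):
    """Pick the 1-bit threshold for feature j that maximizes the accuracy of
    the *pretrained* classifier `classify` on the quantized data; the target
    is correct classification, not faithful reconstruction."""
    best = (-1.0, None)
    for t in candidates:
        low = X[:, j] <= t
        Xq = X.copy()
        # Reproduce each half by its centroid (what the central node sees).
        Xq[low, j], Xq[~low, j] = X[low, j].mean(), X[~low, j].mean()
        acc = float((classify(Xq) == y).mean())
        best = max(best, (acc, t))
    return best  # (accuracy, threshold)

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
w = np.array([1.0, -2.0, 0.5, 0.0])
y = (X @ w > 0).astype(int)
classify = lambda Z: (Z @ w > 0).astype(int)       # toy "pretrained" model
acc, t = fit_1bit_threshold(X, y, 1, classify,
                            np.percentile(X[:, 1], [25, 50, 75]))
print(acc, t)
```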
On the Shattering Coefficient of Supervised Learning Algorithms
Title | On the Shattering Coefficient of Supervised Learning Algorithms |
Authors | Rodrigo Fernandes de Mello |
Abstract | Statistical Learning Theory (SLT) provides the theoretical background to ensure that a supervised algorithm generalizes the mapping $f: \mathcal{X} \to \mathcal{Y}$, given that $f$ is selected from its search space bias $\mathcal{F}$. This formal result depends on the Shattering coefficient function $\mathcal{N}(\mathcal{F},2n)$ to upper bound the empirical risk minimization principle, from which one can estimate the necessary training sample size to ensure probabilistic learning convergence and, most importantly, characterize the capacity of $\mathcal{F}$, including its under- and overfitting abilities on specific target problems. In this context, we propose a new approach to estimate the maximal number of hyperplanes required to shatter a given sample, i.e., to separate every pair of points from one another, based on the recent contributions by Har-Peled and Jones in the dataset-partitioning scenario, and use this foundation to analytically compute the Shattering coefficient function for both binary and multi-class problems. As main contributions, one can use our approach to study the complexity of the search space bias $\mathcal{F}$, estimate training sample sizes, and parametrize the number of hyperplanes a learning algorithm needs to address a supervised task, which is especially appealing for deep neural networks. Experiments illustrate the advantages of our approach while studying the search space $\mathcal{F}$ on synthetic datasets, one toy dataset, and two widely used deep learning benchmarks (MNIST and CIFAR-10). To permit reproducibility and the use of our approach, our source code is made available at~\url{https://bitbucket.org/rodrigo_mello/shattering-rcode}. |
Tasks | |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05461v1 |
PDF | https://arxiv.org/pdf/1911.05461v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-shattering-coefficient-of-supervised |
Repo | |
Framework | |
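For intuition on why hyperplane counts control the Shattering coefficient, Cover's classical function-counting theorem gives the exact number of dichotomies of $n$ points in general position in $\mathbb{R}^d$ realizable by affine hyperplanes. The paper's own bounds (built on Har-Peled and Jones' partitioning results) are not reproduced here; this is only the classical building block.

```python
from math import comb

def hyperplane_dichotomies(n, d):
    """Cover's count: affine hyperplanes realize exactly
    2 * sum_{i=0}^{d} C(n-1, i) of the 2^n possible labelings of n points
    in general position in R^d."""
    return 2 * sum(comb(n - 1, i) for i in range(d + 1))

print(hyperplane_dichotomies(3, 2))  # 8 = 2^3: three points are shattered
print(hyperplane_dichotomies(4, 2))  # 14 < 2^4 = 16: XOR is not realizable
```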
Adversarial Robustness through Local Linearization
Title | Adversarial Robustness through Local Linearization |
Authors | Chongli Qin, James Martens, Sven Gowal, Dilip Krishnan, Krishnamurthy Dvijotham, Alhussein Fawzi, Soham De, Robert Stanforth, Pushmeet Kohli |
Abstract | Adversarial training is an effective methodology for training deep neural networks that are robust against adversarial, norm-bounded perturbations. However, the computational cost of adversarial training grows prohibitively as the size of the model and the number of input dimensions increase. Further, training against less expensive and therefore weaker adversaries produces models that are robust against weak attacks but break down under stronger ones. This is often attributed to the phenomenon of gradient obfuscation: such models have a highly non-linear loss surface in the vicinity of training examples, making it hard for gradient-based attacks to succeed even though adversarial examples still exist. In this work, we introduce a novel regularizer that encourages the loss to behave linearly in the vicinity of the training data, thereby penalizing gradient obfuscation while encouraging robustness. We show via extensive experiments on CIFAR-10 and ImageNet that models trained with our regularizer avoid gradient obfuscation and can be trained significantly faster than with adversarial training. Using this regularizer, we exceed the current state of the art and achieve 47% adversarial accuracy on ImageNet with l-infinity adversarial perturbations of radius 4/255 under an untargeted, strong, white-box attack. Additionally, we match state-of-the-art results for CIFAR-10 at 8/255. |
Tasks | |
Published | 2019-07-04 |
URL | https://arxiv.org/abs/1907.02610v2 |
PDF | https://arxiv.org/pdf/1907.02610v2.pdf |
PWC | https://paperswithcode.com/paper/adversarial-robustness-through-local |
Repo | |
Framework | |
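A hedged PyTorch sketch of the regularizer's central quantity: the gap between the loss at a perturbed input and its first-order Taylor expansion around the clean input. The paper maximizes this gap over a perturbation ball; to keep the sketch short we evaluate it at a single random sign perturbation, which is our simplification, not the paper's inner optimization.

```python
import torch
import torch.nn.functional as F

def local_linearity_gap(model, x, y, eps):
    """|loss(x + delta) - loss(x) - delta . grad_x loss(x)|: how non-linearly
    the loss behaves near x. Small values mean gradient-based attacks see an
    honest (non-obfuscated) loss surface."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    delta = eps * torch.sign(torch.randn_like(x))  # one random corner of the ball
    loss_pert = F.cross_entropy(model(x + delta), y)
    return (loss_pert - loss - (delta * grad).sum()).abs()

# Training-step sketch: penalize non-linearity alongside the clean loss, e.g.
# total = F.cross_entropy(model(x), y) + lam * local_linearity_gap(model, x, y, eps)
```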
Efficient online learning with kernels for adversarial large scale problems
Title | Efficient online learning with kernels for adversarial large scale problems |
Authors | Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi |
Abstract | We are interested in a framework of online learning with kernels for low-dimensional but large-scale and potentially adversarial datasets. We study the computational and theoretical performance of online variations of kernel Ridge regression. Despite its simplicity, the algorithm we study is the first to achieve the optimal regret for a wide range of kernels with a per-round complexity of order $n^\alpha$ with $\alpha < 2$. The algorithm we consider is based on approximating the kernel with the linear span of basis functions. Our contribution is two-fold: 1) For the Gaussian kernel, we propose to build the basis beforehand (independently of the data) through a Taylor expansion. For $d$-dimensional inputs, we provide a (close to) optimal regret of order $O((\log n)^{d+1})$ with per-round time and space complexity $O((\log n)^{2d})$. This makes the algorithm a suitable choice as soon as $n \gg e^d$, which is likely to happen for low-dimensional, large-scale datasets; 2) For general kernels with low effective dimension, the basis functions are updated sequentially, in a data-adaptive fashion, by sampling Nyström points. In this case, our algorithm improves the computational trade-off known for online kernel regression. |
Tasks | |
Published | 2019-02-26 |
URL | https://arxiv.org/abs/1902.09917v2 |
PDF | https://arxiv.org/pdf/1902.09917v2.pdf |
PWC | https://paperswithcode.com/paper/efficient-online-learning-with-kernels-for |
Repo | |
Framework | |
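The data-independent Gaussian basis of contribution 1) is easy to sketch in one dimension. Since $e^{-(x-y)^2/2} = e^{-x^2/2}\,e^{-y^2/2}\sum_{k\ge 0} (xy)^k/k!$, the truncated feature map $\phi_k(x) = e^{-x^2/2}\,x^k/\sqrt{k!}$ satisfies $\phi(x)\cdot\phi(y) \to k(x,y)$. Below, this fixed basis feeds a naive online ridge update; the truncation level, bandwidth, and toy data are our choices, and the paper's efficient updates and regret analysis are not reproduced.

```python
import numpy as np
from math import factorial

def gauss_taylor_features(x, M):
    """Degree-(M-1) Taylor feature map for the 1-D Gaussian kernel
    k(x, y) = exp(-(x - y)^2 / 2): phi_k(x) = exp(-x^2/2) x^k / sqrt(k!)."""
    ks = np.arange(M)
    norms = np.sqrt([factorial(k) for k in ks])
    return np.exp(-x ** 2 / 2)[:, None] * x[:, None] ** ks / norms

# Online ridge regression on the fixed basis (naive O(M^3) per-round solve).
M, lam = 8, 1.0
A, b = lam * np.eye(M), np.zeros(M)
rng = np.random.default_rng(4)
for _ in range(500):
    x = rng.uniform(-1, 1, size=1)
    phi = gauss_taylor_features(x, M)[0]
    y_hat = phi @ np.linalg.solve(A, b)   # predict before observing the label
    y = np.sin(3 * x[0]) + 0.1 * rng.normal()
    A += np.outer(phi, phi)               # rank-one update of the regression
    b += y * phi
```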
Prime Sample Attention in Object Detection
Title | Prime Sample Attention in Object Detection |
Authors | Yuhang Cao, Kai Chen, Chen Change Loy, Dahua Lin |
Abstract | It is a common paradigm in object detection frameworks to treat all samples equally and to maximize performance on average. In this work, we revisit this paradigm through a careful study of how different samples contribute to the overall performance measured in terms of mAP. Our study suggests that the samples in each mini-batch are neither independent nor equally important, and therefore a better classifier on average does not necessarily mean higher mAP. Motivated by this study, we propose the notion of Prime Samples, those that play a key role in driving the detection performance. We further develop a simple yet effective sampling and learning strategy called PrIme Sample Attention (PISA) that directs the focus of the training process towards such samples. Our experiments demonstrate that it is often more effective to focus on prime samples than on hard samples when training a detector. In particular, on the MS COCO dataset, PISA outperforms the random sampling baseline and hard mining schemes, e.g., OHEM and Focal Loss, consistently by around 2% on both single-stage and two-stage detectors, even with a strong ResNeXt-101 backbone. |
Tasks | Object Detection |
Published | 2019-04-09 |
URL | https://arxiv.org/abs/1904.04821v2 |
PDF | https://arxiv.org/pdf/1904.04821v2.pdf |
PWC | https://paperswithcode.com/paper/prime-sample-attention-in-object-detection |
Repo | |
Framework | |
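The reweighting flavor of PISA can be gestured at with a generic rank-to-weight mapping: score each positive sample (e.g., by IoU with its matched ground-truth box), rank the scores, and give top-ranked "prime" samples larger loss weights. The mapping below, including `gamma` and `floor`, is our invention for illustration; PISA's actual IoU-based hierarchical local ranking and reweighting differ.

```python
import numpy as np

def rank_based_weights(scores, gamma=2.0, floor=0.1):
    """Map per-sample importance scores to loss weights that emphasize the
    top-ranked samples: rank 0 (best) gets weight 1, the worst rank gets
    roughly `floor`."""
    scores = np.asarray(scores, dtype=float)
    ranks = np.argsort(np.argsort(-scores))          # 0 = highest score
    n = len(scores)
    return floor + (1 - floor) * ((n - ranks) / n) ** gamma

ious = np.array([0.9, 0.55, 0.7, 0.3])
w = rank_based_weights(ious)
# Weighted training objective: (per_sample_loss * w).mean(), so high-IoU
# "prime" samples dominate the gradient instead of the hardest ones.
```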