Paper Group ANR 294
Soft Triangles for Expert Aggregation. Assessing the Robustness of Bayesian Dark Knowledge to Posterior Uncertainty. Multi-scale deep neural networks for real image super-resolution. Vertex Nomination, Consistent Estimation, and Adversarial Modification. Topic Modeling the Reading and Writing Behavior of Information Foragers. Prostate Cancer Detection using Deep Convolutional Neural Networks. Leveraging Newswire Treebanks for Parsing Conversational Data with Argument Scrambling. Neural Naturalist: Generating Fine-Grained Image Comparisons. The ARIEL-CMU Systems for LoReHLT18. Generalization Bounds for Convolutional Neural Networks. Why ResNet Works? Residuals Generalize. The Canonical Distortion Measure for Vector Quantization and Function Approximation. Constrained Design of Deep Iris Networks. Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias. Multiresolution Transformer Networks: Recurrence is Not Essential for Modeling Hierarchical Structure.
Soft Triangles for Expert Aggregation
Title | Soft Triangles for Expert Aggregation |
Authors | Paul B. Kantor |
Abstract | We consider the problem of eliciting expert assessments of an uncertain parameter. The context is risk control, where there are, in fact, three uncertain parameters to be estimated. Two of these are probabilities, requiring that the experts be guided in the concept of “uncertainty about uncertainty.” We propose a novel formulation for expert estimates, which relies on the range and the median, rather than the variance and the mean. We discuss the process of elicitation, and provide precise formulas for these new distributions. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.01801v1 |
https://arxiv.org/pdf/1909.01801v1.pdf | |
PWC | https://paperswithcode.com/paper/soft-triangles-for-expert-aggregation |
Repo | |
Framework | |
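The range-and-median parameterization can be made concrete with a small numerical sketch. The solver below fits an ordinary triangular distribution so that its median matches the expert's stated median; this is an illustrative assumption, not the paper's closed-form construction, and the function name `soft_triangle` and the elicitation triple are hypothetical.

```python
# Fit a triangular distribution to an expert's (low, median, high)
# assessment by solving for the mode position whose median matches.
from scipy import optimize, stats

def soft_triangle(low, median, high):
    """Triangular distribution on [low, high] whose median matches
    the expert's stated median."""
    scale = high - low

    def median_gap(c):
        # c is the mode's relative position in [0, 1]
        return stats.triang(c, loc=low, scale=scale).ppf(0.5) - median

    c = optimize.brentq(median_gap, 1e-9, 1 - 1e-9)
    return stats.triang(c, loc=low, scale=scale)

expert = soft_triangle(0.1, 0.4, 0.9)
print(expert.median(), expert.mean())  # median ~= 0.4 by construction
```

Note that the median of a triangular law on [low, high] is confined to the band [low + (1 − √½)(high − low), low + √½(high − low)], so the expert's median must fall inside that band for the fit to exist.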
Assessing the Robustness of Bayesian Dark Knowledge to Posterior Uncertainty
Title | Assessing the Robustness of Bayesian Dark Knowledge to Posterior Uncertainty |
Authors | Meet P. Vadera, Benjamin M. Marlin |
Abstract | Bayesian Dark Knowledge is a method for compressing the posterior predictive distribution of a neural network model into a more compact form. Specifically, the method attempts to compress a Monte Carlo approximation to the parameter posterior into a single network representing the posterior predictive distribution. Further, the authors show that this approach is successful in the classification setting using a student network whose architecture matches that of a single network in the teacher ensemble. In this work, we examine the robustness of Bayesian Dark Knowledge to higher levels of posterior uncertainty. We show that using a student network that matches the teacher architecture may fail to yield acceptable performance. We study an approach to close the resulting performance gap by increasing student model capacity. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01724v2 |
https://arxiv.org/pdf/1906.01724v2.pdf | |
PWC | https://paperswithcode.com/paper/assessing-the-robustness-of-bayesian-dark |
Repo | |
Framework | |
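To make the compression step concrete, here is a minimal distillation sketch in the spirit of Bayesian Dark Knowledge: the "teacher" is a set of posterior parameter samples (represented here as a list of trained networks), and a single student is fit to the Monte Carlo average of their predictive distributions. The architectures, optimizer, and loss are assumptions; the original method trains the student online against samples from an SGLD chain rather than a fixed ensemble.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def teacher_predictive(teachers, x):
    # Monte Carlo posterior predictive: average the ensemble's softmaxes.
    probs = torch.stack([F.softmax(t(x), dim=-1) for t in teachers])
    return probs.mean(dim=0)

def distill_step(student, teachers, x, optimizer):
    target = teacher_predictive(teachers, x)
    log_q = F.log_softmax(student(x), dim=-1)
    # KL(teacher || student), the usual distillation objective.
    loss = F.kl_div(log_q, target, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The paper's finding reads naturally in this frame: when the teacher's predictive distribution carries substantial posterior uncertainty, a student with the capacity of a single teacher sample may be unable to represent it.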
Multi-scale deep neural networks for real image super-resolution
Title | Multi-scale deep neural networks for real image super-resolution |
Authors | Shangqi Gao, Xiahai Zhuang |
Abstract | Single image super-resolution (SR) is extremely difficult if the upscaling factors of image pairs are unknown and different from each other, which is common in real image SR. To tackle this difficulty, we develop two multi-scale deep neural networks (MsDNN) in this work. Firstly, due to the high computational complexity in high-resolution spaces, we process an input image mainly in two different downscaling spaces, which can greatly lower GPU memory usage. Then, to reconstruct the details of an image, we design a multi-scale residual network (MsRN) in the downscaling spaces based on residual blocks. In addition, we propose a multi-scale dense network based on dense blocks to compare with MsRN. Finally, our empirical experiments show the robustness of MsDNN for image SR when the upscaling factor is unknown. According to the preliminary results of the NTIRE 2019 image SR challenge, our team (ZXHresearch@fudan) ranks 21st among all participants. The implementation of MsDNN is released at https://github.com/shangqigao/gsq-image-SR |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10698v1 |
http://arxiv.org/pdf/1904.10698v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-scale-deep-neural-networks-for-real |
Repo | |
Framework | |
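A rough PyTorch sketch of the multi-scale idea described above: features are processed in 1/2- and 1/4-scale spaces with residual blocks, then upsampled and fused under a global residual connection. Channel counts, depths, and the fusion rule are assumptions; the repository linked in the abstract contains the real MsDNN.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class TinyMsRN(nn.Module):
    def __init__(self, ch=64, n_blocks=4):
        super().__init__()
        self.head = nn.Conv2d(3, ch, 3, padding=1)
        self.branch2 = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.branch4 = nn.Sequential(*[ResBlock(ch) for _ in range(n_blocks)])
        self.tail = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, x):
        feat = self.head(x)
        # Work in 1/2- and 1/4-scale spaces to save GPU memory.
        f2 = self.branch2(F.interpolate(feat, scale_factor=0.5, mode="bilinear"))
        f4 = self.branch4(F.interpolate(feat, scale_factor=0.25, mode="bilinear"))
        up2 = F.interpolate(f2, size=feat.shape[-2:], mode="bilinear")
        up4 = F.interpolate(f4, size=feat.shape[-2:], mode="bilinear")
        return x + self.tail(up2 + up4)  # global residual
```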
Vertex Nomination, Consistent Estimation, and Adversarial Modification
Title | Vertex Nomination, Consistent Estimation, and Adversarial Modification |
Authors | Joshua Agterberg, Youngser Park, Jonathan Larson, Christopher White, Carey E. Priebe, Vince Lyzinski |
Abstract | Given a pair of graphs $G_1$ and $G_2$ and a vertex set of interest in $G_1$, the vertex nomination problem seeks to find the corresponding vertices of interest in $G_2$ (if they exist) and produce a rank list of the vertices in $G_2$, with the corresponding vertices of interest in $G_2$ concentrating, ideally, at the top of the rank list. In this paper we study the effect of an adversarial contamination model on the performance of a spectral graph embedding-based vertex nomination scheme. In both real and simulated examples, we demonstrate that this vertex nomination scheme performs effectively in the uncontaminated setting; that adversarial network contamination adversely impacts its performance; and that network regularization successfully mitigates the impact of the contamination. In addition to furthering the theoretical basis of consistency in vertex nomination, the adversarial noise model posited herein is grounded in theoretical developments that allow us to frame the role of an adversary in terms of maximal vertex nomination consistency classes. |
Tasks | Graph Embedding |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.01776v2 |
https://arxiv.org/pdf/1905.01776v2.pdf | |
PWC | https://paperswithcode.com/paper/vertex-nomination-consistent-estimation-and |
Repo | |
Framework | |
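An illustrative sketch of a spectral-embedding-based vertex nomination scheme (not the authors' exact procedure): embed both graphs with a d-dimensional adjacency spectral embedding, then rank the vertices of $G_2$ by distance to the centroid of the embedded vertices of interest from $G_1$. This presumes undirected graphs and already-aligned embeddings; in practice an alignment step (e.g., orthogonal Procrustes) is needed, and the regularization studied in the paper would enter on the adjacency matrices.

```python
import numpy as np

def ase(adj, d):
    """Adjacency spectral embedding: top-d eigenvectors, scaled."""
    vals, vecs = np.linalg.eigh(adj)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

def nominate(adj1, adj2, seeds, d=2):
    x1, x2 = ase(adj1, d), ase(adj2, d)
    centroid = x1[seeds].mean(axis=0)
    dists = np.linalg.norm(x2 - centroid, axis=1)
    return np.argsort(dists)  # rank list over the vertices of G2
```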
Topic Modeling the Reading and Writing Behavior of Information Foragers
Title | Topic Modeling the Reading and Writing Behavior of Information Foragers |
Authors | Jaimie Murdock |
Abstract | The general problem of “information foraging” in an environment about which agents have incomplete information has been explored in many fields, including cognitive psychology, neuroscience, economics, finance, ecology, and computer science. In all of these areas, the searcher aims to enhance future performance by surveying enough of existing knowledge to orient themselves in the information space. Individuals can be viewed as conducting a cognitive search in which they must balance exploration of ideas that are novel to them against exploitation of knowledge in domains in which they are already expert. In this dissertation, I present several case studies that demonstrate how reading and writing behaviors interact to construct personal knowledge bases. These studies use LDA topic modeling to represent the information environment of the texts each author read and wrote. Three studies revolve around Charles Darwin. Darwin left detailed records of every book he read for 23 years, from disembarking from the H.M.S. Beagle to just after publication of The Origin of Species. Additionally, he left copies of his drafts before publication. I characterize his reading behavior, then show how that reading behavior interacted with the drafts and subsequent revisions of The Origin of Species, and expand the dataset to include later readings and writings. Then, through a study of Thomas Jefferson’s correspondence, I expand the study to non-book data. Finally, through an examination of neuroscience citation data, I move from individual behavior to collective behavior in constructing an information environment. Together, these studies reveal “the interplay between individual and collective phenomena where innovation takes place” (Tria et al. 2014). |
Tasks | |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00488v1 |
https://arxiv.org/pdf/1907.00488v1.pdf | |
PWC | https://paperswithcode.com/paper/topic-modeling-the-reading-and-writing |
Repo | |
Framework | |
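For readers unfamiliar with the modeling machinery, here is a minimal LDA sketch (using gensim) of the kind used to represent a reader's information environment; the two toy documents are placeholders, not Darwin's actual reading records.

```python
from gensim import corpora, models

documents = [
    ["species", "variation", "selection", "nature"],
    ["geology", "strata", "fossil", "species"],
]
dictionary = corpora.Dictionary(documents)
bow_corpus = [dictionary.doc2bow(doc) for doc in documents]
lda = models.LdaModel(bow_corpus, num_topics=2, id2word=dictionary)

# Each text becomes a distribution over topics; distances between these
# distributions can then quantify exploration vs. exploitation over time.
print(lda[bow_corpus[0]])
```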
Prostate Cancer Detection using Deep Convolutional Neural Networks
Title | Prostate Cancer Detection using Deep Convolutional Neural Networks |
Authors | Sunghwan Yoo, Isha Gujrathi, Masoom A. Haider, Farzad Khalvati |
Abstract | Prostate cancer is one of the most common forms of cancer and the third leading cause of cancer death in North America. As an integrated part of computer-aided detection (CAD) tools, diffusion-weighted magnetic resonance imaging (DWI) has been intensively studied for accurate detection of prostate cancer. With the significant success of deep convolutional neural networks (CNNs) in computer vision tasks such as object detection and segmentation, different CNN architectures are increasingly being investigated in the medical imaging research community as promising solutions for designing more accurate CAD tools for cancer detection. In this work, we developed and implemented an automated CNN-based pipeline for the detection of clinically significant prostate cancer (PCa) for a given axial DWI image and for each patient. DWI images of 427 patients were used as the dataset, which contained 175 patients with PCa and 252 healthy patients. To measure the performance of the proposed pipeline, a test set of 108 (out of 427) patients was set aside and not used in the training phase. The proposed pipeline achieved an area under the receiver operating characteristic curve (AUC) of 0.87 (95% Confidence Interval (CI): 0.84-0.90) and 0.84 (95% CI: 0.76-0.91) at the slice level and patient level, respectively. |
Tasks | Object Detection |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13145v1 |
https://arxiv.org/pdf/1905.13145v1.pdf | |
PWC | https://paperswithcode.com/paper/prostate-cancer-detection-using-deep |
Repo | |
Framework | |
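The reported metric can be reproduced in form (not in value) with a standard bootstrap: compute the AUC on the predictions and resample to obtain a 95% confidence interval. The labels and scores below are synthetic, and the resampling scheme is a common choice rather than necessarily the authors'.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_with_ci(y_true, y_score, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    aucs = []
    n = len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)       # resample with replacement
        if len(set(y_true[idx])) < 2:     # need both classes present
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(aucs, [2.5, 97.5])
    return roc_auc_score(y_true, y_score), (lo, hi)

y = np.array([0, 0, 1, 1, 0, 1, 0, 1])
s = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7])
print(auc_with_ci(y, s))
```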
Leveraging Newswire Treebanks for Parsing Conversational Data with Argument Scrambling
Title | Leveraging Newswire Treebanks for Parsing Conversational Data with Argument Scrambling |
Authors | Riyaz Ahmad Bhat, Irshad Ahmad Bhat, Dipti Misra Sharma |
Abstract | We investigate the problem of parsing conversational data of morphologically-rich languages such as Hindi where argument scrambling occurs frequently. We evaluate a state-of-the-art non-linear transition-based parsing system on a new dataset containing 506 dependency trees for sentences from Bollywood (Hindi) movie scripts and Twitter posts of Hindi monolingual speakers. We show that a dependency parser trained on a newswire treebank is strongly biased towards the canonical structures and degrades when applied to conversational data. Inspired by Transformational Generative Grammar, we mitigate the sampling bias by generating all theoretically possible alternative word orders of a clause from the existing (kernel) structures in the treebank. Training our parser on canonical and transformed structures improves performance on conversational data by around 9% LAS over the baseline newswire parser. |
Tasks | |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.05085v1 |
http://arxiv.org/pdf/1902.05085v1.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-newswire-treebanks-for-parsing |
Repo | |
Framework | |
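A toy, string-level sketch of the augmentation idea: from one kernel order, generate the alternative orders of a verb's arguments. The real transformation operates on dependency trees and respects clause structure; the Hindi clause below is a constructed example.

```python
from itertools import permutations

def scramble(arguments, verb):
    """Yield all argument orders for a verb-final clause (as in Hindi)."""
    for order in permutations(arguments):
        yield list(order) + [verb]

# "raam-ne siitaa-ko kitaab dii" -- Ram gave Sita a book (illustrative).
for clause in scramble(["raam-ne", "siitaa-ko", "kitaab"], "dii"):
    print(" ".join(clause))
```

Training on the union of kernel and scrambled structures is what yields the reported robustness to non-canonical conversational word orders.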
Neural Naturalist: Generating Fine-Grained Image Comparisons
Title | Neural Naturalist: Generating Fine-Grained Image Comparisons |
Authors | Maxwell Forbes, Christine Kaeser-Chen, Piyush Sharma, Serge Belongie |
Abstract | We introduce the new Birds-to-Words dataset of 41k sentences describing fine-grained differences between photographs of birds. The language collected is highly detailed, while remaining understandable to the everyday observer (e.g., “heart-shaped face,” “squat body”). Paragraph-length descriptions naturally adapt to varying levels of taxonomic and visual distance—drawn from a novel stratified sampling approach—with the appropriate level of detail. We propose a new model called Neural Naturalist that uses a joint image encoding and comparative module to generate comparative language, and evaluate the results with humans who must use the descriptions to distinguish real images. Our results indicate promising potential for neural models to explain differences in visual embedding space using natural language, as well as a concrete path for machine learning to aid citizen scientists in their effort to preserve biodiversity. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04101v3 |
https://arxiv.org/pdf/1909.04101v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-naturalist-generating-fine-grained |
Repo | |
Framework | |
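A skeleton of a joint-encode-then-compare architecture of the kind the abstract describes: two images are encoded, fused by a comparative module, and the fused state conditions a language decoder. All sizes, the GRU decoder, and the fusion rule are assumptions, not the Neural Naturalist model itself.

```python
import torch
import torch.nn as nn

class CompareCaptioner(nn.Module):
    def __init__(self, vocab_size=10000, feat=256, hid=512):
        super().__init__()
        self.encoder = nn.Sequential(  # stand-in for a CNN backbone
            nn.Conv2d(3, feat, 7, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.fuse = nn.Linear(2 * feat, hid)   # comparative module
        self.embed = nn.Embedding(vocab_size, hid)
        self.decoder = nn.GRU(hid, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, img1, img2, tokens):
        joint = torch.cat([self.encoder(img1), self.encoder(img2)], dim=-1)
        h0 = torch.tanh(self.fuse(joint)).unsqueeze(0)  # init decoder state
        dec, _ = self.decoder(self.embed(tokens), h0)
        return self.out(dec)  # logits over the comparative vocabulary
```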
The ARIEL-CMU Systems for LoReHLT18
Title | The ARIEL-CMU Systems for LoReHLT18 |
Authors | Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W Black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, Kathleen McKeown |
Abstract | This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech). |
Tasks | Machine Translation |
Published | 2019-02-24 |
URL | http://arxiv.org/abs/1902.08899v1 |
http://arxiv.org/pdf/1902.08899v1.pdf | |
PWC | https://paperswithcode.com/paper/the-ariel-cmu-systems-for-lorehlt18 |
Repo | |
Framework | |
Generalization Bounds for Convolutional Neural Networks
Title | Generalization Bounds for Convolutional Neural Networks |
Authors | Shan Lin, Jingwei Zhang |
Abstract | Convolutional neural networks (CNNs) have achieved breakthrough performances in a wide range of applications including image classification, semantic segmentation, and object detection. Previous research on characterizing the generalization ability of neural networks mostly focuses on fully connected neural networks (FNNs), regarding CNNs as a special case of FNNs without taking into account the special structure of convolutional layers. In this work, we propose a tighter generalization bound for CNNs by exploiting the sparse and permutation structure of their weight matrices. As the generalization bound relies on the spectral norm of weight matrices, we further study spectral norms of three commonly used convolution operations including standard convolution, depthwise convolution, and pointwise convolution. Theoretical and experimental results both demonstrate that our bounds for CNNs are tighter than existing bounds. |
Tasks | Image Classification, Object Detection, Semantic Segmentation |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01487v1 |
https://arxiv.org/pdf/1910.01487v1.pdf | |
PWC | https://paperswithcode.com/paper/generalization-bounds-for-convolutional |
Repo | |
Framework | |
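The bound depends on spectral norms of convolution operations. A generic way to estimate the spectral norm of a linear convolution operator, shown below, is power iteration on conv/conv-transpose pairs; this numerical sketch is not the paper's analytical treatment of standard, depthwise, and pointwise convolutions.

```python
import torch
import torch.nn.functional as F

def conv_spectral_norm(weight, in_shape, n_iter=50):
    """Largest singular value of the operator x -> conv2d(x, weight)."""
    x = torch.randn(1, *in_shape)
    for _ in range(n_iter):
        # One step of power iteration on A^T A, where A is the conv.
        y = F.conv2d(x, weight, padding=1)
        x = F.conv_transpose2d(y, weight, padding=1)
        x = x / x.norm()
    y = F.conv2d(x, weight, padding=1)
    return (y.norm() / x.norm()).item()

w = torch.randn(8, 3, 3, 3) / 3.0
print(conv_spectral_norm(w, in_shape=(3, 32, 32)))
```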
Why ResNet Works? Residuals Generalize
Title | Why ResNet Works? Residuals Generalize |
Authors | Fengxiang He, Tongliang Liu, Dacheng Tao |
Abstract | Residual connections significantly boost the performance of deep neural networks. However, there are few theoretical results that address the influence of residuals on the hypothesis complexity and the generalization ability of deep neural networks. This paper studies the influence of residual connections on the hypothesis complexity of the neural network in terms of the covering number of its hypothesis space. We prove that the upper bound of the covering number is the same as that of chain-like neural networks, if the total numbers of the weight matrices and nonlinearities are fixed, no matter whether they are in the residuals or not. This result demonstrates that residual connections may not increase the hypothesis complexity of the neural network compared with the chain-like counterpart. Based on the upper bound of the covering number, we then obtain an $\mathcal O(1 / \sqrt{N})$ margin-based multi-class generalization bound for ResNet, as an exemplary case of any deep neural network with residual connections. Generalization guarantees for similar state-of-the-art neural network architectures, such as DenseNet and ResNeXt, follow straightforwardly. From our generalization bound, a practical implication follows: to achieve good generalization, we need regularization terms to keep the norms of the weight matrices from growing too large, which justifies the standard technique of weight decay. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01367v1 |
http://arxiv.org/pdf/1904.01367v1.pdf | |
PWC | https://paperswithcode.com/paper/why-resnet-works-residuals-generalize |
Repo | |
Framework | |
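For context, a margin-based bound driven by a covering number has the following textbook shape; it is shown only to make the $\mathcal O(1/\sqrt{N})$ claim concrete, and the paper's constants and covering-number estimate for residual networks differ.

```latex
\[
  \Pr\big[\operatorname{err}(f)\big]
  \;\le\;
  \widehat{R}_\gamma(f)
  \;+\;
  \mathcal{O}\!\left(\sqrt{\frac{\log \mathcal{N}(\mathcal{F},\gamma)}{N}}\right),
\]
% where \widehat{R}_\gamma(f) is the empirical margin loss at margin
% \gamma, N is the sample size, and \mathcal{N}(\mathcal{F},\gamma) is
% a covering number of the hypothesis space \mathcal{F}.
```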
The Canonical Distortion Measure for Vector Quantization and Function Approximation
Title | The Canonical Distortion Measure for Vector Quantization and Function Approximation |
Authors | Jonathan Baxter |
Abstract | To measure the quality of a set of vector quantization points, a means of measuring the distance between a random point and its quantization is required. Common metrics such as the Hamming and Euclidean metrics, while mathematically simple, are inappropriate for comparing natural signals such as speech or images. In this paper it is shown how an environment of functions on an input space $X$ induces a canonical distortion measure (CDM) on $X$. The description “canonical” is justified because it is shown that optimizing the reconstruction error of $X$ with respect to the CDM gives rise to optimal piecewise constant approximations of the functions in the environment. The CDM is calculated in closed form for several different function classes. An algorithm for training neural networks to implement the CDM is presented, along with some encouraging experimental results. |
Tasks | Quantization |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06319v1 |
https://arxiv.org/pdf/1911.06319v1.pdf | |
PWC | https://paperswithcode.com/paper/the-canonical-distortion-measure-for-vector |
Repo | |
Framework | |
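Up to the paper's exact normalization, the induced measure has the form of an expected loss between function values under the environment's distribution $Q$ over functions (a hedged reconstruction from the abstract, not a quotation of the paper's formula):

```latex
\[
  \rho(x, x') \;=\; \mathbb{E}_{f \sim Q}\big[\, \ell\big(f(x), f(x')\big) \,\big],
\]
```

so two inputs are close exactly when the environment's functions cannot distinguish them, which is why quantizing $X$ under $\rho$ yields good piecewise constant approximations of those functions.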
Constrained Design of Deep Iris Networks
Title | Constrained Design of Deep Iris Networks |
Authors | Kien Nguyen, Clinton Fookes, Sridha Sridharan |
Abstract | Despite the promise of recent deep neural networks in the iris recognition setting, there are vital properties of the classic IrisCode that current deep iris networks are largely unable to achieve: a compact model and a small number of computing operations (FLOPs). This paper re-models the iris network design process as a constrained optimization problem which takes model size and computation into account as learning criteria. On one hand, this allows us to fully automate the network design process to search for the best iris network confined to the computation and model compactness constraints. On the other hand, it allows us to investigate the optimality of the classic IrisCode and recent iris networks. It also allows us to learn an optimal iris network and demonstrate state-of-the-art performance with lower computation and memory requirements. |
Tasks | Iris Recognition |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09481v1 |
https://arxiv.org/pdf/1905.09481v1.pdf | |
PWC | https://paperswithcode.com/paper/constrained-design-of-deep-iris-networks |
Repo | |
Framework | |
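A cartoon of the constrained-design objective: score a candidate architecture by validation accuracy while penalizing designs whose FLOPs or parameter counts exceed the budgets. The penalty form, budgets, and function name are assumptions; the paper treats this as a constrained optimization over the architecture search space rather than a fixed penalty.

```python
def constrained_score(accuracy, flops, params,
                      flop_budget=1e8, param_budget=1e6, lam=1.0):
    # Relative budget overruns; zero when the design fits both budgets.
    over = max(0.0, flops / flop_budget - 1.0) + \
           max(0.0, params / param_budget - 1.0)
    return accuracy - lam * over  # maximize over candidate networks

print(constrained_score(0.95, flops=2e8, params=5e5))  # over FLOP budget
```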
Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias
Title | Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias |
Authors | Patrick Forré, Joris M. Mooij |
Abstract | We prove the main rules of causal calculus (also called do-calculus) for i/o structural causal models (ioSCMs), a generalization of a recently proposed general class of non-/linear structural causal models that allow for cycles, latent confounders and arbitrary probability distributions. We also generalize adjustment criteria and formulas from the acyclic setting to the general one (i.e. ioSCMs). Such criteria then allow one to estimate (conditional) causal effects from observational data that was (partially) gathered under selection bias and cycles. This generalizes the backdoor criterion, the selection-backdoor criterion and extensions of these to arbitrary ioSCMs. Together, our results thus enable causal reasoning in the presence of cycles, latent confounders and selection bias. Finally, we extend the ID algorithm for the identification of causal effects to ioSCMs. |
Tasks | |
Published | 2019-01-02 |
URL | https://arxiv.org/abs/1901.00433v2 |
https://arxiv.org/pdf/1901.00433v2.pdf | |
PWC | https://paperswithcode.com/paper/causal-calculus-in-the-presence-of-cycles |
Repo | |
Framework | |
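For reference, these are the three rules being generalized, in their classical acyclic form due to Pearl; the paper proves analogues for ioSCMs with cycles, latent confounders, and selection bias.

```latex
\begin{align*}
&\text{R1 (insertion/deletion of observations):}\\
&\quad P(y \mid \mathrm{do}(x), z, w) = P(y \mid \mathrm{do}(x), w)
  \ \text{ if } Y \perp\!\!\!\perp Z \mid X, W \text{ in } G_{\overline{X}},\\
&\text{R2 (action/observation exchange):}\\
&\quad P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), z, w)
  \ \text{ if } Y \perp\!\!\!\perp Z \mid X, W \text{ in } G_{\overline{X}\underline{Z}},\\
&\text{R3 (insertion/deletion of actions):}\\
&\quad P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), w)
  \ \text{ if } Y \perp\!\!\!\perp Z \mid X, W \text{ in } G_{\overline{X}\,\overline{Z(W)}}.
\end{align*}
% Z(W) denotes the Z-nodes that are not ancestors of W in G with
% incoming edges to X removed.
```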
Multiresolution Transformer Networks: Recurrence is Not Essential for Modeling Hierarchical Structure
Title | Multiresolution Transformer Networks: Recurrence is Not Essential for Modeling Hierarchical Structure |
Authors | Vikas K. Garg, Inderjit S. Dhillon, Hsiang-Fu Yu |
Abstract | The architecture of Transformer is based entirely on self-attention, and has been shown to outperform models that employ recurrence on sequence transduction tasks such as machine translation. The superior performance of Transformer has been attributed to propagating signals over shorter distances, between positions in the input and the output, compared to the recurrent architectures. We establish connections between the dynamics in Transformer and recurrent networks to argue that several factors including gradient flow along an ensemble of multiple weakly dependent paths play a paramount role in the success of Transformer. We then leverage the dynamics to introduce {\em Multiresolution Transformer Networks} as the first architecture that exploits hierarchical structure in data via self-attention. Our models significantly outperform state-of-the-art recurrent and hierarchical recurrent models on two real-world datasets for query suggestion, namely AOL and Amazon. In particular, on AOL data, our model registers at least 20% improvement on each precision score, and over 25% improvement on the BLEU score with respect to the best performing recurrent model. We thus provide strong evidence that recurrence is not essential for modeling hierarchical structure. |
Tasks | Machine Translation |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10408v1 |
https://arxiv.org/pdf/1908.10408v1.pdf | |
PWC | https://paperswithcode.com/paper/multiresolution-transformer-networks |
Repo | |
Framework | |
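Since the architecture above is built entirely on self-attention, here is the scaled dot-product attention primitive it relies on; the multiresolution stacking itself is the paper's contribution and is not reproduced here.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))
Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```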