Paper Group ANR 121
An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models. HiTR: Hierarchical Topic Model Re-estimation for Measuring Topical Diversity of Documents. Adaptive Confidence Smoothing for Generalized Zero-Shot Learning. An Incremental Iterated Response Model of Pragmatics. DeepLogic: Towards End-to-End Differentiable Logical Reasoning. …
An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models
Title | An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models |
Authors | Shudong Hao, Michael J. Paul |
Abstract | Probabilistic topic modeling is a popular choice as the first step of crosslingual tasks to enable knowledge transfer and extract multilingual features. While many multilingual topic models have been developed, their assumptions on the training corpus are quite varied, and it is not clear how well the models can be applied under various training conditions. In this paper, we systematically study the knowledge transfer mechanisms behind different multilingual topic models, and through a broad set of experiments with four models on ten languages, we provide empirical insights that can inform the selection and future development of multilingual topic models. |
Tasks | Topic Models, Transfer Learning |
Published | 2018-10-13 |
URL | https://arxiv.org/abs/1810.05867v2 |
https://arxiv.org/pdf/1810.05867v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-crosslingual-transfer |
Repo | |
Framework | |
HiTR: Hierarchical Topic Model Re-estimation for Measuring Topical Diversity of Documents
Title | HiTR: Hierarchical Topic Model Re-estimation for Measuring Topical Diversity of Documents |
Authors | Hosein Azarbonyad, Mostafa Dehghani, Tom Kenter, Maarten Marx, Jaap Kamps, Maarten de Rijke |
Abstract | A high degree of topical diversity is often considered to be an important characteristic of interesting text documents. A recent proposal for measuring topical diversity identifies three distributions for assessing the diversity of documents: distributions of words within documents, words within topics, and topics within documents. Topic models play a central role in this approach and, hence, their quality is crucial to the efficacy of measuring topical diversity. The quality of topic models is affected by two factors: generality and impurity of topics. General topics only include common information from a background corpus and are assigned to most of the documents. Impure topics contain words that are not related to the topic; impurity lowers the interpretability of topic models, and impure topics are likely to be assigned to documents erroneously. We propose a hierarchical re-estimation process aimed at removing generality and impurity. Our approach has three re-estimation components: (1) document re-estimation, which removes general words from the documents; (2) topic re-estimation, which re-estimates the distribution over words of each topic; and (3) topic assignment re-estimation, which re-estimates for each document its distribution over topics. For measuring the topical diversity of text documents, our HiTR approach improves over the state-of-the-art on the PubMed dataset. |
Tasks | Topic Models |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05436v1 |
http://arxiv.org/pdf/1810.05436v1.pdf | |
PWC | https://paperswithcode.com/paper/hitr-hierarchical-topic-model-re-estimation |
Repo | |
Framework | |
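The first HiTR component, document re-estimation, removes general (background) words from documents before topic modeling. A minimal sketch of that step, assuming a simple frequency-based notion of "general" words and a threshold of our own choosing (the paper's actual re-estimation is probabilistic):

```python
from collections import Counter

def reestimate_documents(docs, general_fraction=0.1):
    # Illustrative sketch of HiTR-style document re-estimation
    # (function name and threshold are our assumptions, not the paper's):
    # drop the most frequent corpus-wide "general" words from every document.
    background = Counter(w for doc in docs for w in doc)
    vocab = [w for w, _ in background.most_common()]
    n_general = max(1, int(general_fraction * len(vocab)))
    general = set(vocab[:n_general])
    return [[w for w in doc if w not in general] for doc in docs]

docs = [["the", "cell", "the", "gene"], ["the", "protein", "gene"]]
cleaned = reestimate_documents(docs, general_fraction=0.2)
# "the" dominates the background corpus and is stripped from both documents.
```

The remaining two components (topic and topic-assignment re-estimation) apply the same idea at the topic-word and document-topic levels.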
Adaptive Confidence Smoothing for Generalized Zero-Shot Learning
Title | Adaptive Confidence Smoothing for Generalized Zero-Shot Learning |
Authors | Yuval Atzmon, Gal Chechik |
Abstract | Generalized zero-shot learning (GZSL) is the problem of learning a classifier where some classes have samples and others are learned from side information, like semantic attributes or text description, in a zero-shot learning fashion (ZSL). Training a single model that operates in these two regimes simultaneously is challenging. Here we describe a probabilistic approach that breaks the model into three modular components, and then combines them in a consistent way. Specifically, our model consists of three classifiers: A “gating” model that makes soft decisions if a sample is from a “seen” class, and two experts: a ZSL expert, and an expert model for seen classes. We address two main difficulties in this approach: How to provide an accurate estimate of the gating probability without any training samples for unseen classes; and how to use an expert’s predictions when it observes samples outside of its domain. The key insight to our approach is to pass information between the three models to improve each one’s accuracy, while maintaining the modular structure. We test our approach, adaptive confidence smoothing (COSMO), on four standard GZSL benchmark datasets and find that it largely outperforms state-of-the-art GZSL models. COSMO is also the first model that closes the gap and surpasses the performance of generative models for GZSL, even though it is a light-weight model that is much easier to train and tune. Notably, COSMO offers a new view for developing zero-shot models. Thanks to COSMO’s modular structure, instead of trying to perform well both on seen and on unseen classes, models can focus on accurate classification of unseen classes, and later consider seen class models. |
Tasks | Zero-Shot Learning |
Published | 2018-12-24 |
URL | https://arxiv.org/abs/1812.09903v3 |
https://arxiv.org/pdf/1812.09903v3.pdf | |
PWC | https://paperswithcode.com/paper/domain-aware-generalized-zero-shot-learning |
Repo | |
Framework | |
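The modular combination the abstract describes can be written as a gated mixture: the gate gives a soft probability that the sample comes from a seen class, and the final class distribution mixes the two experts accordingly. A sketch in our own formulation (not the authors' code, which also smooths the expert outputs):

```python
import numpy as np

def combine_experts(p_seen_gate, p_y_seen_expert, p_y_zsl_expert):
    # Gated mixture of a seen-class expert and a ZSL expert:
    # the gate probability weights each expert's class distribution.
    seen = p_seen_gate * p_y_seen_expert            # mass over seen classes
    unseen = (1.0 - p_seen_gate) * p_y_zsl_expert   # mass over unseen classes
    return np.concatenate([seen, unseen])

# The gate is 70% confident the sample belongs to a seen class.
p = combine_experts(0.7, np.array([0.9, 0.1]), np.array([0.5, 0.5]))
# p is a proper distribution over all (seen + unseen) classes.
```

Because each expert only has to be accurate on its own domain, the pieces can be trained and tuned separately, which is the modularity argument the abstract makes.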
An Incremental Iterated Response Model of Pragmatics
Title | An Incremental Iterated Response Model of Pragmatics |
Authors | Reuben Cohn-Gordon, Noah D. Goodman, Christopher Potts |
Abstract | Recent Iterated Response (IR) models of pragmatics conceptualize language use as a recursive process in which agents reason about each other to increase communicative efficiency. These models are generally defined over complete utterances. However, there is substantial evidence that pragmatic reasoning takes place incrementally during production and comprehension. We address this with an incremental IR model. We compare the incremental and global versions using computational simulations, and we assess the incremental model against existing experimental data and in the TUNA corpus for referring expression generation, showing that the model can capture phenomena out of reach of global versions. |
Tasks | |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00367v2 |
http://arxiv.org/pdf/1810.00367v2.pdf | |
PWC | https://paperswithcode.com/paper/an-incremental-iterated-response-model-of |
Repo | |
Framework | |
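The incremental idea can be illustrated with one iterated-response step over word prefixes: a literal listener interprets partial utterances, and an incremental speaker scores each *next word* by how strongly the listener would then believe in the target referent. This is a toy construction of ours over a two-object lexicon; the paper's model is richer and operates during both production and comprehension:

```python
import numpy as np

# semantics[u][o] = 1 if utterance u is true of object o
semantics = {
    ("green",): [1, 1],            # both objects are green
    ("green", "square"): [1, 0],   # only object 0 is a green square
    ("green", "circle"): [0, 1],   # only object 1 is a green circle
}

def literal_listener(prefix):
    # Belief over objects given that some utterance extending `prefix` is true:
    # uniform over objects consistent with any completion.
    scores = np.zeros(2)
    for u, truth in semantics.items():
        if u[:len(prefix)] == tuple(prefix):
            scores += np.array(truth)
    return scores / scores.sum()

def incremental_speaker(prefix, target):
    # Score each candidate next word by the literal listener's posterior
    # belief in the target referent after hearing prefix + word.
    options = {u[len(prefix)] for u in semantics
               if len(u) > len(prefix) and u[:len(prefix)] == tuple(prefix)}
    scores = {w: literal_listener(list(prefix) + [w])[target] for w in options}
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}

dist = incremental_speaker(["green"], target=0)
# Referring to object 0, the speaker's next word is "square", not "circle".
```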
DeepLogic: Towards End-to-End Differentiable Logical Reasoning
Title | DeepLogic: Towards End-to-End Differentiable Logical Reasoning |
Authors | Nuri Cingillioglu, Alessandra Russo |
Abstract | Combining machine learning with logic-based expert systems to get the best of both worlds is becoming increasingly popular. However, to what extent machine learning can already learn to reason over rule-based knowledge is still an open problem. In this paper, we explore how symbolic logic, defined as logic programs at the character level, can be learned and represented in a high-dimensional vector space by RNN-based iterative neural networks that perform reasoning. We create a new dataset that defines 12 classes of logic programs exemplifying increasing levels of complexity of logical reasoning, and train the networks in an end-to-end fashion to learn whether a logic program entails a given query. We analyse how learning the inference algorithm gives rise to representations of atoms, literals and rules within logic programs, and evaluate against increasing lengths of predicate and constant symbols as well as increasing steps of multi-hop reasoning. |
Tasks | |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07433v3 |
http://arxiv.org/pdf/1805.07433v3.pdf | |
PWC | https://paperswithcode.com/paper/deeplogic-towards-end-to-end-differentiable |
Repo | |
Framework | |
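The ground truth such a network is trained to emulate is classical entailment over definite clauses. A minimal forward-chaining checker (our toy version, propositional only, whereas the dataset uses predicates and constants) makes the learning target concrete:

```python
def entails(rules, facts, query):
    # Forward chaining over definite clauses: repeatedly fire any rule
    # whose body is fully known, until no new atom can be derived.
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return query in known

# Two-hop reasoning: p entails q, and q entails r.
rules = [("q", ["p"]), ("r", ["q"])]
ok = entails(rules, facts=["p"], query="r")
```

Each extra chaining iteration here corresponds to one more "step of multi-hop reasoning" in the paper's evaluation.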
Weakly Supervised Estimation of Shadow Confidence Maps in Fetal Ultrasound Imaging
Title | Weakly Supervised Estimation of Shadow Confidence Maps in Fetal Ultrasound Imaging |
Authors | Qingjie Meng, Matthew Sinclair, Veronika Zimmer, Benjamin Hou, Martin Rajchl, Nicolas Toussaint, Ozan Oktay, Jo Schlemper, Alberto Gomez, James Housden, Jacqueline Matthew, Daniel Rueckert, Julia Schnabel, Bernhard Kainz |
Abstract | Detecting acoustic shadows in ultrasound images is important in many clinical and engineering applications. Real-time feedback of acoustic shadows can guide sonographers to a standardized diagnostic viewing plane with minimal artifacts and can provide additional information for other automatic image analysis algorithms. However, automatically detecting shadow regions using learning-based algorithms is challenging because pixel-wise ground truth annotation of acoustic shadows is subjective and time consuming. In this paper we propose a weakly supervised method for automatic confidence estimation of acoustic shadow regions. Our method is able to generate a dense shadow-focused confidence map. In our method, a shadow-seg module is built to learn general shadow features for shadow segmentation, based on global image-level annotations as well as a small number of coarse pixel-wise shadow annotations. A transfer function is introduced to extend the obtained binary shadow segmentation to a reference confidence map. Additionally, a confidence estimation network is proposed to learn the mapping between input images and the reference confidence maps. This network is able to predict shadow confidence maps directly from input images during inference. We use evaluation metrics such as DICE and inter-class correlation to verify the effectiveness of our method. Our method is more consistent than human annotation, and outperforms the state-of-the-art quantitatively in shadow segmentation and qualitatively in confidence estimation of shadow regions. We further demonstrate the applicability of our method by integrating shadow confidence maps into tasks such as ultrasound image classification, multi-view image fusion and automated biometric measurements. |
Tasks | Image Classification, Shadow Confidence Maps In Ultrasound Imaging |
Published | 2018-11-20 |
URL | https://arxiv.org/abs/1811.08164v3 |
https://arxiv.org/pdf/1811.08164v3.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-estimation-of-shadow |
Repo | |
Framework | |
Generative Adversarial Forests for Better Conditioned Adversarial Learning
Title | Generative Adversarial Forests for Better Conditioned Adversarial Learning |
Authors | Yan Zuo, Gil Avraham, Tom Drummond |
Abstract | In recent times, many of the breakthroughs in various vision-related tasks have revolved around improving learning of deep models; these methods have ranged from network architectural improvements such as Residual Networks, to various forms of regularisation such as Batch Normalisation. In essence, many of these techniques revolve around better conditioning, allowing for deeper and deeper models to be successfully learned. In this paper, we look towards better conditioning Generative Adversarial Networks (GANs) in an unsupervised learning setting. Our method embeds the powerful discriminating capabilities of a decision forest into the discriminator of a GAN. This results in a better conditioned model which learns in an extremely stable way. We demonstrate empirical results which show both clear qualitative and quantitative evidence of the effectiveness of our approach, gaining significant performance improvements over several popular GAN-based approaches on the Oxford Flowers and Aligned Celebrity Faces datasets. |
Tasks | |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05185v1 |
http://arxiv.org/pdf/1805.05185v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-forests-for-better |
Repo | |
Framework | |
Learning to Generate Structured Queries from Natural Language with Indirect Supervision
Title | Learning to Generate Structured Queries from Natural Language with Indirect Supervision |
Authors | Ziwei Bai, Bo Yu, Bowen Wu, Zhuoran Wang, Baoxun Wang |
Abstract | Generating structured query language (SQL) from natural language is an emerging research topic. This paper presents a new learning paradigm that uses indirect supervision from the answers to natural language questions, instead of from SQL queries. This paradigm facilitates the acquisition of training data, thanks to the abundance of question-answer pairs for various domains on the Internet, and avoids the difficult job of SQL annotation. An end-to-end neural model integrated with reinforcement learning is proposed to learn a SQL generation policy within the answer-driven learning paradigm. The model is evaluated on datasets from different domains, including movies and academic publications. Experimental results show that our model outperforms the baseline models. |
Tasks | |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03195v1 |
http://arxiv.org/pdf/1809.03195v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generate-structured-queries-from |
Repo | |
Framework | |
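The indirect supervision signal the abstract describes can be sketched as an answer-matching reward: execute the generated query and reward the policy only when the returned answer matches the gold answer. This is our own minimal rendering of that idea (function names are illustrative), not the paper's training loop:

```python
def answer_reward(predicted_sql, gold_answer, execute):
    # Reward from answers rather than gold SQL: run the generated query
    # and compare results. Unexecutable queries earn no reward.
    try:
        return 1.0 if execute(predicted_sql) == gold_answer else 0.0
    except Exception:
        return 0.0

# A stub executor standing in for a real database engine.
reward = answer_reward("SELECT 1", [1], lambda q: [1])
```

Because only the answer is compared, many syntactically different SQL queries can earn full reward, which is exactly what removes the need for gold SQL annotation.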
Stagewise Training Accelerates Convergence of Testing Error Over SGD
Title | Stagewise Training Accelerates Convergence of Testing Error Over SGD |
Authors | Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang |
Abstract | The stagewise training strategy is widely used for learning neural networks: it runs a stochastic algorithm (e.g., SGD) starting with a relatively large step size (aka learning rate) and geometrically decreases the step size after a number of iterations. It has been observed that stagewise SGD converges much faster than vanilla SGD with a polynomially decaying step size, in terms of both training error and testing error. How to explain this phenomenon, however, has been largely ignored by existing studies. This paper provides some theoretical evidence for explaining this faster convergence. In particular, we consider a stagewise training strategy for minimizing empirical risk that satisfies the Polyak-Łojasiewicz (PL) condition, which has been observed/proved for neural networks and also holds for a broad family of convex functions. For convex loss functions and two classes of “nice-behaviored” non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than vanilla SGD under the PL condition, on both training error and testing error. Experiments on stagewise learning of deep residual networks show that it satisfies one type of non-convexity assumption and can therefore be explained by our theory. Of independent interest, the testing error bounds for the considered non-convex loss functions are dimensionality and norm independent. |
Tasks | |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03934v3 |
http://arxiv.org/pdf/1812.03934v3.pdf | |
PWC | https://paperswithcode.com/paper/stagewise-training-accelerates-convergence-of |
Repo | |
Framework | |
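The two schedules being compared are easy to state. A sketch (stage length, initial step size and decay factor are illustrative values of ours, not the paper's):

```python
def stagewise_lr(t, eta0=0.1, stage_len=100, decay=0.5):
    # Stagewise schedule: constant step size within a stage,
    # geometric shrinkage between stages.
    return eta0 * decay ** (t // stage_len)

def polynomial_lr(t, eta0=0.1):
    # Vanilla SGD baseline: polynomially decaying step size ~ 1/(t+1).
    return eta0 / (t + 1)

lr_at_250 = stagewise_lr(250)   # third stage: 0.1 * 0.5**2 = 0.025
```

The paper's contribution is explaining why the first schedule beats the second in testing error under the PL condition, not the schedules themselves, which are standard practice.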
Differentially Private Confidence Intervals for Empirical Risk Minimization
Title | Differentially Private Confidence Intervals for Empirical Risk Minimization |
Authors | Yue Wang, Daniel Kifer, Jaewoo Lee |
Abstract | The process of data mining with differential privacy produces results that are affected by two types of noise: sampling noise due to data collection and privacy noise that is designed to prevent the reconstruction of sensitive information. In this paper, we consider the problem of designing confidence intervals for the parameters of a variety of differentially private machine learning models. The algorithms can provide confidence intervals that satisfy differential privacy (as well as the more recently proposed concentrated differential privacy) and can be used with existing differentially private mechanisms that train models using objective perturbation and output perturbation. |
Tasks | |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.03794v1 |
http://arxiv.org/pdf/1804.03794v1.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-confidence-intervals |
Repo | |
Framework | |
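One of the mechanisms the paper's confidence intervals are designed to work with is output perturbation: train normally, then release the parameters with noise calibrated to the training procedure's sensitivity. A sketch of that release step only (the paper's contribution, the confidence-interval construction around it, is not shown here):

```python
import numpy as np

def output_perturbation(theta, sensitivity, epsilon, rng):
    # Standard output-perturbation release: add Laplace noise with scale
    # sensitivity / epsilon to each trained parameter. The released vector
    # carries both sampling noise (from the data) and this privacy noise,
    # which is why the paper's intervals must account for both.
    noise = rng.laplace(scale=sensitivity / epsilon, size=theta.shape)
    return theta + noise

rng = np.random.default_rng(0)
private_theta = output_perturbation(np.array([1.0, -2.0]), 0.1, 1.0, rng)
```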
Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications
Title | Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications |
Authors | Fanhua Shang, James Cheng, Yuanyuan Liu, Zhi-Quan Luo, Zhouchen Lin |
Abstract | The heavy-tailed distributions of corrupted outliers and singular values of all channels in low-level vision have proven to be effective priors for many applications, such as background modeling, photometric stereo and image alignment, and they can be well modeled by a hyper-Laplacian. However, the use of such distributions generally leads to challenging non-convex, non-smooth and non-Lipschitz problems, and makes existing algorithms very slow for large-scale applications. Using the analytic solutions to lp-norm minimization for two specific values of p (p=1/2 and p=2/3), we propose two novel bilinear factor matrix norm minimization models for robust principal component analysis. We first define the double nuclear norm and Frobenius/nuclear hybrid norm penalties, and then prove that they are in essence the Schatten-1/2 and 2/3 quasi-norms, respectively, which lead to much more tractable and scalable Lipschitz optimization problems. Our experimental analysis shows that both our methods yield more accurate solutions than the original Schatten quasi-norm minimization, even when the number of observations is very limited. Finally, we apply our penalties to various low-level vision problems, e.g., text removal, moving object detection, image alignment and inpainting, and show that our methods usually outperform the state-of-the-art methods. |
Tasks | Object Detection |
Published | 2018-10-11 |
URL | http://arxiv.org/abs/1810.05186v1 |
http://arxiv.org/pdf/1810.05186v1.pdf | |
PWC | https://paperswithcode.com/paper/bilinear-factor-matrix-norm-minimization-for |
Repo | |
Framework | |
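The quantity the paper's penalties recover is the Schatten-p quasi-norm, which is directly computable from singular values. A sketch of the definition (the paper's point is that its factored penalties reach the same quantity *without* computing an SVD of the full matrix, which is what makes them scalable):

```python
import numpy as np

def schatten_quasi_norm(X, p):
    # ||X||_p = (sum_i sigma_i^p)^(1/p); for p = 1/2 and p = 2/3 this is
    # what the double nuclear and Frobenius/nuclear hybrid penalties
    # are proven to equal.
    sigma = np.linalg.svd(X, compute_uv=False)
    return np.sum(sigma ** p) ** (1.0 / p)

X = np.diag([4.0, 1.0])
val = schatten_quasi_norm(X, 0.5)   # (sqrt(4) + sqrt(1))**2 = 9
```

For p < 1 this is only a quasi-norm (the triangle inequality fails), which is the source of the non-convexity the paper works around.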
Towards Smart City Innovation Under the Perspective of Software-Defined Networking, Artificial Intelligence and Big Data
Title | Towards Smart City Innovation Under the Perspective of Software-Defined Networking, Artificial Intelligence and Big Data |
Authors | Joberto S. B. Martins |
Abstract | Smart city projects address many of the current problems afflicting highly populated areas and cities and, as such, are a target for governments, institutions and private organizations that plan to explore their foreseen advantages. In technical terms, smart city projects present a complex set of requirements, including a large number of users with highly heterogeneous needs. In this scenario, this paper proposes and analyses the impact and perspectives of adopting software-defined networking and artificial intelligence as innovative approaches for smart city project development and deployment. Big data is also considered, as an inherent element of most smart city projects that must be tackled. A layered framework view is proposed, with a discussion of the impact of software-defined networking and machine learning on innovation, followed by a use case that demonstrates the potential benefits of cognitive learning for smart cities. It is argued that the complexity of smart city projects does require new innovative approaches that potentially result in more efficient and intelligent systems. |
Tasks | |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11665v1 |
http://arxiv.org/pdf/1810.11665v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-smart-city-innovation-under-the |
Repo | |
Framework | |
New Insights into Bootstrapping for Bandits
Title | New Insights into Bootstrapping for Bandits |
Authors | Sharan Vaswani, Branislav Kveton, Zheng Wen, Anup Rao, Mark Schmidt, Yasin Abbasi-Yadkori |
Abstract | We investigate the use of bootstrapping in the bandit setting. We first show that the commonly used non-parametric bootstrapping (NPB) procedure can be provably inefficient and establish a near-linear lower bound on the regret incurred by it under the bandit model with Bernoulli rewards. We show that NPB with an appropriate amount of forced exploration can result in sub-linear albeit sub-optimal regret. As an alternative to NPB, we propose a weighted bootstrapping (WB) procedure. For Bernoulli rewards, WB with multiplicative exponential weights is mathematically equivalent to Thompson sampling (TS) and results in near-optimal regret bounds. Similarly, in the bandit setting with Gaussian rewards, we show that WB with additive Gaussian weights achieves near-optimal regret. Beyond these special cases, we show that WB leads to better empirical performance than TS for several reward distributions bounded on $[0,1]$. For the contextual bandit setting, we give practical guidelines that make bootstrapping simple and efficient to implement and result in good empirical performance on real-world datasets. |
Tasks | |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09793v1 |
http://arxiv.org/pdf/1805.09793v1.pdf | |
PWC | https://paperswithcode.com/paper/new-insights-into-bootstrapping-for-bandits |
Repo | |
Framework | |
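The Bernoulli case of weighted bootstrapping (WB) is worth spelling out, since the abstract states it is mathematically equivalent to Thompson sampling. A sketch of the standard construction (our rendering; in particular the use of one pseudo success and one pseudo failure is the usual way to encode a uniform prior):

```python
import numpy as np

def wb_bernoulli_sample(rewards, rng):
    # Weighted bootstrap for a Bernoulli arm: draw an Exp(1) weight per
    # observation (plus one pseudo success and one pseudo failure) and
    # return the weighted mean reward. The ratio of the resulting Gamma
    # sums is a Beta(s+1, f+1) draw — i.e. Thompson sampling under a
    # uniform prior.
    r = np.concatenate([rewards, [1.0, 0.0]])   # pseudo-rewards
    w = rng.exponential(size=r.shape)
    return np.sum(w * r) / np.sum(w)

rng = np.random.default_rng(0)
theta = wb_bernoulli_sample(np.array([1.0, 1.0, 0.0]), rng)   # in (0, 1)
```

A bandit algorithm would draw one such sample per arm each round and pull the arm with the largest sample, exactly as in Thompson sampling.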
DNN or k-NN: That is the Generalize vs. Memorize Question
Title | DNN or k-NN: That is the Generalize vs. Memorize Question |
Authors | Gilad Cohen, Guillermo Sapiro, Raja Giryes |
Abstract | This paper studies the relationship between the classification performed by deep neural networks (DNNs) and the decisions of various classical classifiers, namely k-nearest neighbours (k-NN), support vector machines (SVM) and logistic regression (LR), at various layers of the network. This comparison provides us with new insights into the ability of neural networks to both memorize the training data and generalize to new data at the same time, where k-NN serves as the ideal estimator that perfectly memorizes the data. We show that memorization in non-generalizing networks happens only at the last layers. Moreover, the behavior of DNNs compared to the linear classifiers SVM and LR is much the same on the training and test data regardless of whether the network generalizes. On the other hand, the similarity to k-NN holds only in the absence of overfitting. Our results suggest that k-NN-like behavior of the network on new data is a sign of generalization. Moreover, they show that memorization and generalization, which are traditionally considered to be contradictory, are compatible and complementary. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06822v6 |
http://arxiv.org/pdf/1805.06822v6.pdf | |
PWC | https://paperswithcode.com/paper/dnn-or-k-nn-that-is-the-generalize-vs |
Repo | |
Framework | |
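The probe underlying the comparison is simple: treat a layer's activations as a feature space and run k-NN there, then compare its decisions to the network's own. A minimal sketch of the k-NN side (plain Euclidean 1-NN; the paper's setup covers several layers and classifiers):

```python
import numpy as np

def knn_on_activations(train_acts, train_labels, test_acts, k=1):
    # Classify each test activation by majority vote among its k nearest
    # training activations — the "perfect memorizer" baseline.
    preds = []
    for x in test_acts:
        d = np.linalg.norm(train_acts - x, axis=1)
        nearest = np.argsort(d)[:k]
        votes = train_labels[nearest]
        preds.append(np.bincount(votes).argmax())
    return np.array(preds)

acts = np.array([[0.0, 0.0], [1.0, 1.0]])   # activations of 2 training points
labels = np.array([0, 1])
pred = knn_on_activations(acts, labels, np.array([[0.9, 1.1]]), k=1)
```

Agreement between these predictions and the DNN's own test-set predictions is what the paper reads as a sign of generalization.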
Virtualization of tissue staining in digital pathology using an unsupervised deep learning approach
Title | Virtualization of tissue staining in digital pathology using an unsupervised deep learning approach |
Authors | Amal Lahiani, Jacob Gildenblat, Irina Klaman, Shadi Albarqouni, Nassir Navab, Eldad Klaiman |
Abstract | Histopathological evaluation of tissue samples is a key practice in patient diagnosis and drug development, especially in oncology. Historically, Hematoxylin and Eosin (H&E) has been used by pathologists as a gold standard staining. However, in many cases, various target specific stains, including immunohistochemistry (IHC), are needed in order to highlight specific structures in the tissue. As tissue is scarce and staining procedures are tedious, it would be beneficial to generate images of stained tissue virtually. Virtual staining could also generate in-silico multiplexing of different stains on the same tissue segment. In this paper, we present a sample application that generates FAP-CK virtual IHC images from Ki67-CD8 real IHC images using an unsupervised deep learning approach based on CycleGAN. We also propose a method to deal with tiling artifacts caused by normalization layers and we validate our approach by comparing the results of tissue analysis algorithms for virtual and real images. |
Tasks | |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06415v1 |
http://arxiv.org/pdf/1810.06415v1.pdf | |
PWC | https://paperswithcode.com/paper/virtualization-of-tissue-staining-in-digital |
Repo | |
Framework | |
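Since the approach is based on CycleGAN, its core unpaired-translation constraint is the cycle-consistency loss: translating a Ki67-CD8 image to FAP-CK and back should reproduce the input. A sketch of just that term with stub translators (the real model uses two convolutional generators and adversarial losses as well):

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    # L1 distance between the input and its round-trip translation
    # F(G(x)) — the CycleGAN term that makes unpaired training possible.
    return np.mean(np.abs(F(G(x)) - x))

# Stub generators that happen to be exact inverses, so the loss is zero.
x = np.ones((4, 4))
loss = cycle_consistency_loss(x, lambda a: a * 2, lambda a: a / 2)
```

The paper's additional contribution, handling tiling artifacts from normalization layers, sits on top of this training objective.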