Paper Group ANR 121
An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models. HiTR: Hierarchical Topic Model Re-estimation for Measuring Topical Diversity of Documents. Adaptive Confidence Smoothing for Generalized Zero-Shot Learning. An Incremental Iterated Response Model of Pragmatics. DeepLogic: Towards End-to-End Differentiable Logical Reasoning. …
An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models
Title | An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models |
Authors | Shudong Hao, Michael J. Paul |
Abstract | Probabilistic topic modeling is a popular choice as the first step of crosslingual tasks to enable knowledge transfer and extract multilingual features. While many multilingual topic models have been developed, their assumptions on the training corpus are quite varied, and it is not clear how well the models can be applied under various training conditions. In this paper, we systematically study the knowledge transfer mechanisms behind different multilingual topic models, and through a broad set of experiments with four models on ten languages, we provide empirical insights that can inform the selection and future development of multilingual topic models. |
Tasks | Topic Models, Transfer Learning |
Published | 2018-10-13 |
URL | https://arxiv.org/abs/1810.05867v2 |
https://arxiv.org/pdf/1810.05867v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-crosslingual-transfer |
Repo | |
Framework | |
HiTR: Hierarchical Topic Model Re-estimation for Measuring Topical Diversity of Documents
Title | HiTR: Hierarchical Topic Model Re-estimation for Measuring Topical Diversity of Documents |
Authors | Hosein Azarbonyad, Mostafa Dehghani, Tom Kenter, Maarten Marx, Jaap Kamps, Maarten de Rijke |
Abstract | A high degree of topical diversity is often considered to be an important characteristic of interesting text documents. A recent proposal for measuring topical diversity identifies three distributions for assessing the diversity of documents: distributions of words within documents, words within topics, and topics within documents. Topic models play a central role in this approach and, hence, their quality is crucial to the efficacy of measuring topical diversity. The quality of topic models is affected by two factors: generality and impurity of topics. General topics only include common information from a background corpus and are assigned to most of the documents. Impure topics contain words that are not related to the topic; impurity lowers the interpretability of topic models, and impure topics are likely to be assigned to documents erroneously. We propose a hierarchical re-estimation process aimed at removing generality and impurity. Our approach has three re-estimation components: (1) document re-estimation, which removes general words from the documents; (2) topic re-estimation, which re-estimates the distribution over words of each topic; and (3) topic assignment re-estimation, which re-estimates for each document its distribution over topics. For measuring the topical diversity of text documents, our HiTR approach improves over the state-of-the-art on the PubMed dataset. |
Tasks | Topic Models |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05436v1 |
http://arxiv.org/pdf/1810.05436v1.pdf | |
PWC | https://paperswithcode.com/paper/hitr-hierarchical-topic-model-re-estimation |
Repo | |
Framework | |
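The first HiTR component, document re-estimation, removes general (background) words from documents before topic modeling. A minimal sketch of that step, assuming a simple frequency-based notion of "general" words and a threshold of our own choosing (the paper's actual re-estimation is probabilistic):

```python
from collections import Counter

def reestimate_documents(docs, general_fraction=0.1):
    # Illustrative sketch of HiTR-style document re-estimation
    # (function name and threshold are our assumptions, not the paper's):
    # drop the most frequent corpus-wide "general" words from every document.
    background = Counter(w for doc in docs for w in doc)
    vocab = [w for w, _ in background.most_common()]
    n_general = max(1, int(general_fraction * len(vocab)))
    general = set(vocab[:n_general])
    return [[w for w in doc if w not in general] for doc in docs]

docs = [["the", "cell", "the", "gene"], ["the", "protein", "gene"]]
cleaned = reestimate_documents(docs, general_fraction=0.2)
# "the" dominates the background corpus and is stripped from both documents.
```

The remaining two components (topic and topic-assignment re-estimation) apply the same idea at the topic-word and document-topic levels.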
Adaptive Confidence Smoothing for Generalized Zero-Shot Learning
Title | Adaptive Confidence Smoothing for Generalized Zero-Shot Learning |
Authors | Yuval Atzmon, Gal Chechik |
Abstract | Generalized zero-shot learning (GZSL) is the problem of learning a classifier where some classes have samples and others are learned from side information, like semantic attributes or text description, in a zero-shot learning fashion (ZSL). Training a single model that operates in these two regimes simultaneously is challenging. Here we describe a probabilistic approach that breaks the model into three modular components, and then combines them in a consistent way. Specifically, our model consists of three classifiers: A “gating” model that makes soft decisions if a sample is from a “seen” class, and two experts: a ZSL expert, and an expert model for seen classes. We address two main difficulties in this approach: How to provide an accurate estimate of the gating probability without any training samples for unseen classes; and how to use an expert’s predictions when it observes samples outside of its domain. The key insight to our approach is to pass information between the three models to improve each one’s accuracy, while maintaining the modular structure. We test our approach, adaptive confidence smoothing (COSMO), on four standard GZSL benchmark datasets and find that it largely outperforms state-of-the-art GZSL models. COSMO is also the first model that closes the gap and surpasses the performance of generative models for GZSL, even though it is a light-weight model that is much easier to train and tune. Notably, COSMO offers a new view for developing zero-shot models. Thanks to COSMO’s modular structure, instead of trying to perform well both on seen and on unseen classes, models can focus on accurate classification of unseen classes, and later consider seen class models. |
Tasks | Zero-Shot Learning |
Published | 2018-12-24 |
URL | https://arxiv.org/abs/1812.09903v3 |
https://arxiv.org/pdf/1812.09903v3.pdf | |
PWC | https://paperswithcode.com/paper/domain-aware-generalized-zero-shot-learning |
Repo | |
Framework | |
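The modular combination the abstract describes can be written as a gated mixture: the gate gives a soft probability that the sample comes from a seen class, and the final class distribution mixes the two experts accordingly. A sketch in our own formulation (not the authors' code, which also smooths the expert outputs):

```python
import numpy as np

def combine_experts(p_seen_gate, p_y_seen_expert, p_y_zsl_expert):
    # Gated mixture of a seen-class expert and a ZSL expert:
    # the gate probability weights each expert's class distribution.
    seen = p_seen_gate * p_y_seen_expert            # mass over seen classes
    unseen = (1.0 - p_seen_gate) * p_y_zsl_expert   # mass over unseen classes
    return np.concatenate([seen, unseen])

# The gate is 70% confident the sample belongs to a seen class.
p = combine_experts(0.7, np.array([0.9, 0.1]), np.array([0.5, 0.5]))
# p is a proper distribution over all (seen + unseen) classes.
```

Because each expert only has to be accurate on its own domain, the pieces can be trained and tuned separately, which is the modularity argument the abstract makes.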
An Incremental Iterated Response Model of Pragmatics
Title | An Incremental Iterated Response Model of Pragmatics |
Authors | Reuben Cohn-Gordon, Noah D. Goodman, Christopher Potts |
Abstract | Recent Iterated Response (IR) models of pragmatics conceptualize language use as a recursive process in which agents reason about each other to increase communicative efficiency. These models are generally defined over complete utterances. However, there is substantial evidence that pragmatic reasoning takes place incrementally during production and comprehension. We address this with an incremental IR model. We compare the incremental and global versions using computational simulations, and we assess the incremental model against existing experimental data and in the TUNA corpus for referring expression generation, showing that the model can capture phenomena out of reach of global versions. |
Tasks | |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00367v2 |
http://arxiv.org/pdf/1810.00367v2.pdf | |
PWC | https://paperswithcode.com/paper/an-incremental-iterated-response-model-of |
Repo | |
Framework | |
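The incremental idea can be illustrated with one iterated-response step over word prefixes: a literal listener interprets partial utterances, and an incremental speaker scores each *next word* by how strongly the listener would then believe in the target referent. This is a toy construction of ours over a two-object lexicon; the paper's model is richer and operates during both production and comprehension:

```python
import numpy as np

# semantics[u][o] = 1 if utterance u is true of object o
semantics = {
    ("green",): [1, 1],            # both objects are green
    ("green", "square"): [1, 0],   # only object 0 is a green square
    ("green", "circle"): [0, 1],   # only object 1 is a green circle
}

def literal_listener(prefix):
    # Belief over objects given that some utterance extending `prefix` is true:
    # uniform over objects consistent with any completion.
    scores = np.zeros(2)
    for u, truth in semantics.items():
        if u[:len(prefix)] == tuple(prefix):
            scores += np.array(truth)
    return scores / scores.sum()

def incremental_speaker(prefix, target):
    # Score each candidate next word by the literal listener's posterior
    # belief in the target referent after hearing prefix + word.
    options = {u[len(prefix)] for u in semantics
               if len(u) > len(prefix) and u[:len(prefix)] == tuple(prefix)}
    scores = {w: literal_listener(list(prefix) + [w])[target] for w in options}
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}

dist = incremental_speaker(["green"], target=0)
# Referring to object 0, the speaker's next word is "square", not "circle".
```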
DeepLogic: Towards End-to-End Differentiable Logical Reasoning
Title | DeepLogic: Towards End-to-End Differentiable Logical Reasoning |
Authors | Nuri Cingillioglu, Alessandra Russo |
Abstract | Combining machine learning with logic-based expert systems to get the best of both worlds is becoming increasingly popular. However, to what extent machine learning can already learn to reason over rule-based knowledge is still an open problem. In this paper, we explore how symbolic logic, defined as logic programs at the character level, can be learned and represented in a high-dimensional vector space by RNN-based iterative neural networks that perform reasoning. We create a new dataset that defines 12 classes of logic programs exemplifying increasing levels of complexity of logical reasoning, and train the networks in an end-to-end fashion to learn whether a logic program entails a given query. We analyse how learning the inference algorithm gives rise to representations of atoms, literals and rules within logic programs, and evaluate against increasing lengths of predicate and constant symbols as well as increasing steps of multi-hop reasoning. |
Tasks | |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07433v3 |
http://arxiv.org/pdf/1805.07433v3.pdf | |
PWC | https://paperswithcode.com/paper/deeplogic-towards-end-to-end-differentiable |
Repo | |
Framework | |
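The ground truth such a network is trained to emulate is classical entailment over definite clauses. A minimal forward-chaining checker (our toy version, propositional only, whereas the dataset uses predicates and constants) makes the learning target concrete:

```python
def entails(rules, facts, query):
    # Forward chaining over definite clauses: repeatedly fire any rule
    # whose body is fully known, until no new atom can be derived.
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return query in known

# Two-hop reasoning: p entails q, and q entails r.
rules = [("q", ["p"]), ("r", ["q"])]
ok = entails(rules, facts=["p"], query="r")
```

Each extra chaining iteration here corresponds to one more "step of multi-hop reasoning" in the paper's evaluation.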
Weakly Supervised Estimation of Shadow Confidence Maps in Fetal Ultrasound Imaging
Title | Weakly Supervised Estimation of Shadow Confidence Maps in Fetal Ultrasound Imaging |
Authors | Qingjie Meng, Matthew Sinclair, Veronika Zimmer, Benjamin Hou, Martin Rajchl, Nicolas Toussaint, Ozan Oktay, Jo Schlemper, Alberto Gomez, James Housden, Jacqueline Matthew, Daniel Rueckert, Julia Schnabel, Bernhard Kainz |
Abstract | Detecting acoustic shadows in ultrasound images is important in many clinical and engineering applications. Real-time feedback of acoustic shadows can guide sonographers to a standardized diagnostic viewing plane with minimal artifacts and can provide additional information for other automatic image analysis algorithms. However, automatically detecting shadow regions using learning-based algorithms is challenging because pixel-wise ground truth annotation of acoustic shadows is subjective and time consuming. In this paper we propose a weakly supervised method for automatic confidence estimation of acoustic shadow regions. Our method is able to generate a dense shadow-focused confidence map. In our method, a shadow-seg module is built to learn general shadow features for shadow segmentation, based on global image-level annotations as well as a small number of coarse pixel-wise shadow annotations. A transfer function is introduced to extend the obtained binary shadow segmentation to a reference confidence map. Additionally, a confidence estimation network is proposed to learn the mapping between input images and the reference confidence maps. This network is able to predict shadow confidence maps directly from input images during inference. We use evaluation metrics such as DICE and inter-class correlation to verify the effectiveness of our method. Our method is more consistent than human annotation, and outperforms the state-of-the-art quantitatively in shadow segmentation and qualitatively in confidence estimation of shadow regions. We further demonstrate the applicability of our method by integrating shadow confidence maps into tasks such as ultrasound image classification, multi-view image fusion and automated biometric measurements. |
Tasks | Image Classification, Shadow Confidence Maps In Ultrasound Imaging |
Published | 2018-11-20 |
URL | https://arxiv.org/abs/1811.08164v3 |
https://arxiv.org/pdf/1811.08164v3.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-estimation-of-shadow |
Repo | |
Framework | |
Generative Adversarial Forests for Better Conditioned Adversarial Learning
Title | Generative Adversarial Forests for Better Conditioned Adversarial Learning |
Authors | Yan Zuo, Gil Avraham, Tom Drummond |
Abstract | In recent times, many of the breakthroughs in various vision-related tasks have revolved around improving learning of deep models; these methods have ranged from network architectural improvements such as Residual Networks, to various forms of regularisation such as Batch Normalisation. In essence, many of these techniques revolve around better conditioning, allowing for deeper and deeper models to be successfully learned. In this paper, we look towards better conditioning Generative Adversarial Networks (GANs) in an unsupervised learning setting. Our method embeds the powerful discriminating capabilities of a decision forest into the discriminator of a GAN. This results in a better conditioned model which learns in an extremely stable way. We demonstrate empirical results which show both clear qualitative and quantitative evidence of the effectiveness of our approach, gaining significant performance improvements over several popular GAN-based approaches on the Oxford Flowers and Aligned Celebrity Faces datasets. |
Tasks | |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05185v1 |
http://arxiv.org/pdf/1805.05185v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-forests-for-better |
Repo | |
Framework | |
Learning to Generate Structured Queries from Natural Language with Indirect Supervision
Title | Learning to Generate Structured Queries from Natural Language with Indirect Supervision |
Authors | Ziwei Bai, Bo Yu, Bowen Wu, Zhuoran Wang, Baoxun Wang |
Abstract | Generating structured query language (SQL) from natural language is an emerging research topic. This paper presents a new learning paradigm that uses indirect supervision from the answers to natural language questions, instead of from SQL queries. This paradigm facilitates the acquisition of training data, thanks to the abundance of question-answer pairs for various domains on the Internet, and avoids the difficult job of SQL annotation. An end-to-end neural model integrated with reinforcement learning is proposed to learn a SQL generation policy within the answer-driven learning paradigm. The model is evaluated on datasets from different domains, including movies and academic publications. Experimental results show that our model outperforms the baseline models. |
Tasks | |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03195v1 |
http://arxiv.org/pdf/1809.03195v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generate-structured-queries-from |
Repo | |
Framework | |
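The indirect supervision signal the abstract describes can be sketched as an answer-matching reward: execute the generated query and reward the policy only when the returned answer matches the gold answer. This is our own minimal rendering of that idea (function names are illustrative), not the paper's training loop:

```python
def answer_reward(predicted_sql, gold_answer, execute):
    # Reward from answers rather than gold SQL: run the generated query
    # and compare results. Unexecutable queries earn no reward.
    try:
        return 1.0 if execute(predicted_sql) == gold_answer else 0.0
    except Exception:
        return 0.0

# A stub executor standing in for a real database engine.
reward = answer_reward("SELECT 1", [1], lambda q: [1])
```

Because only the answer is compared, many syntactically different SQL queries can earn full reward, which is exactly what removes the need for gold SQL annotation.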
Stagewise Training Accelerates Convergence of Testing Error Over SGD
Title | Stagewise Training Accelerates Convergence of Testing Error Over SGD |
Authors | Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang |
Abstract | The stagewise training strategy is widely used for learning neural networks: it runs a stochastic algorithm (e.g., SGD) starting with a relatively large step size (aka learning rate) and geometrically decreases the step size after a number of iterations. It has been observed that stagewise SGD converges much faster than vanilla SGD with a polynomially decaying step size, in terms of both training error and testing error. How to explain this phenomenon, however, has been largely ignored by existing studies. This paper provides some theoretical evidence for explaining this faster convergence. In particular, we consider a stagewise training strategy for minimizing empirical risk that satisfies the Polyak-Łojasiewicz (PL) condition, which has been observed/proved for neural networks and also holds for a broad family of convex functions. For convex loss functions and two classes of “nice-behaviored” non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than vanilla SGD under the PL condition, on both training error and testing error. Experiments on stagewise learning of deep residual networks show that it satisfies one type of non-convexity assumption and can therefore be explained by our theory. Of independent interest, the testing error bounds for the considered non-convex loss functions are dimensionality and norm independent. |
Tasks | |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03934v3 |
http://arxiv.org/pdf/1812.03934v3.pdf | |
PWC | https://paperswithcode.com/paper/stagewise-training-accelerates-convergence-of |
Repo | |
Framework | |
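The two schedules being compared are easy to state. A sketch (stage length, initial step size and decay factor are illustrative values of ours, not the paper's):

```python
def stagewise_lr(t, eta0=0.1, stage_len=100, decay=0.5):
    # Stagewise schedule: constant step size within a stage,
    # geometric shrinkage between stages.
    return eta0 * decay ** (t // stage_len)

def polynomial_lr(t, eta0=0.1):
    # Vanilla SGD baseline: polynomially decaying step size ~ 1/(t+1).
    return eta0 / (t + 1)

lr_at_250 = stagewise_lr(250)   # third stage: 0.1 * 0.5**2 = 0.025
```

The paper's contribution is explaining why the first schedule beats the second in testing error under the PL condition, not the schedules themselves, which are standard practice.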
Differentially Private Confidence Intervals for Empirical Risk Minimization
Title | Differentially Private Confidence Intervals for Empirical Risk Minimization |
Authors | Yue Wang, Daniel Kifer, Jaewoo Lee |
Abstract | The process of data mining with differential privacy produces results that are affected by two types of noise: sampling noise due to data collection and privacy noise that is designed to prevent the reconstruction of sensitive information. In this paper, we consider the problem of designing confidence intervals for the parameters of a variety of differentially private machine learning models. The algorithms can provide confidence intervals that satisfy differential privacy (as well as the more recently proposed concentrated differential privacy) and can be used with existing differentially private mechanisms that train models using objective perturbation and output perturbation. |
Tasks | |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.03794v1 |
http://arxiv.org/pdf/1804.03794v1.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-confidence-intervals |
Repo | |
Framework | |
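One of the mechanisms the paper's confidence intervals are designed to work with is output perturbation: train normally, then release the parameters with noise calibrated to the training procedure's sensitivity. A sketch of that release step only (the paper's contribution, the confidence-interval construction around it, is not shown here):

```python
import numpy as np

def output_perturbation(theta, sensitivity, epsilon, rng):
    # Standard output-perturbation release: add Laplace noise with scale
    # sensitivity / epsilon to each trained parameter. The released vector
    # carries both sampling noise (from the data) and this privacy noise,
    # which is why the paper's intervals must account for both.
    noise = rng.laplace(scale=sensitivity / epsilon, size=theta.shape)
    return theta + noise

rng = np.random.default_rng(0)
private_theta = output_perturbation(np.array([1.0, -2.0]), 0.1, 1.0, rng)
```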
Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications
Title | Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications |
Authors | Fanhua Shang, James Cheng, Yuanyuan Liu, Zhi-Quan Luo, Zhouchen Lin |
Abstract | The heavy-tailed distributions of corrupted outliers and singular values of all channels in low-level vision have proven to be effective priors for many applications, such as background modeling, photometric stereo and image alignment, and they can be well modeled by a hyper-Laplacian. However, the use of such distributions generally leads to challenging non-convex, non-smooth and non-Lipschitz problems, and makes existing algorithms very slow for large-scale applications. Using the analytic solutions to lp-norm minimization for two specific values of p (p=1/2 and p=2/3), we propose two novel bilinear factor matrix norm minimization models for robust principal component analysis. We first define the double nuclear norm and Frobenius/nuclear hybrid norm penalties, and then prove that they are in essence the Schatten-1/2 and 2/3 quasi-norms, respectively, which lead to much more tractable and scalable Lipschitz optimization problems. Our experimental analysis shows that both our methods yield more accurate solutions than the original Schatten quasi-norm minimization, even when the number of observations is very limited. Finally, we apply our penalties to various low-level vision problems, e.g., text removal, moving object detection, image alignment and inpainting, and show that our methods usually outperform the state-of-the-art methods. |
Tasks | Object Detection |
Published | 2018-10-11 |
URL | http://arxiv.org/abs/1810.05186v1 |
http://arxiv.org/pdf/1810.05186v1.pdf | |
PWC | https://paperswithcode.com/paper/bilinear-factor-matrix-norm-minimization-for |
Repo | |
Framework | |
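The quantity the paper's penalties recover is the Schatten-p quasi-norm, which is directly computable from singular values. A sketch of the definition (the paper's point is that its factored penalties reach the same quantity *without* computing an SVD of the full matrix, which is what makes them scalable):

```python
import numpy as np

def schatten_quasi_norm(X, p):
    # ||X||_p = (sum_i sigma_i^p)^(1/p); for p = 1/2 and p = 2/3 this is
    # what the double nuclear and Frobenius/nuclear hybrid penalties
    # are proven to equal.
    sigma = np.linalg.svd(X, compute_uv=False)
    return np.sum(sigma ** p) ** (1.0 / p)

X = np.diag([4.0, 1.0])
val = schatten_quasi_norm(X, 0.5)   # (sqrt(4) + sqrt(1))**2 = 9
```

For p < 1 this is only a quasi-norm (the triangle inequality fails), which is the source of the non-convexity the paper works around.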
Towards Smart City Innovation Under the Perspective of Software-Defined Networking, Artificial Intelligence and Big Data
Title | Towards Smart City Innovation Under the Perspective of Software-Defined Networking, Artificial Intelligence and Big Data |
Authors | Joberto S. B. Martins |
Abstract | Smart city projects address many of the current problems afflicting highly populated areas and cities and, as such, are a target for governments, institutions and private organizations that plan to explore their foreseen advantages. In technical terms, smart city projects present a complex set of requirements, including a large number of users with highly heterogeneous needs. In this scenario, this paper proposes and analyses the impact and perspectives of adopting software-defined networking and artificial intelligence as innovative approaches for smart city project development and deployment. Big data is also considered, as an inherent element of most smart city projects that must be tackled. A layered framework view is proposed, with a discussion of the impact of software-defined networking and machine learning on innovation, followed by a use case that demonstrates the potential benefits of cognitive learning for smart cities. It is argued that the complexity of smart city projects does require new innovative approaches that potentially result in more efficient and intelligent systems. |
Tasks | |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11665v1 |
http://arxiv.org/pdf/1810.11665v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-smart-city-innovation-under-the |
Repo | |
Framework | |
New Insights into Bootstrapping for Bandits
Title | New Insights into Bootstrapping for Bandits |
Authors | Sharan Vaswani, Branislav Kveton, Zheng Wen, Anup Rao, Mark Schmidt, Yasin Abbasi-Yadkori |
Abstract | We investigate the use of bootstrapping in the bandit setting. We first show that the commonly used non-parametric bootstrapping (NPB) procedure can be provably inefficient and establish a near-linear lower bound on the regret incurred by it under the bandit model with Bernoulli rewards. We show that NPB with an appropriate amount of forced exploration can result in sub-linear albeit sub-optimal regret. As an alternative to NPB, we propose a weighted bootstrapping (WB) procedure. For Bernoulli rewards, WB with multiplicative exponential weights is mathematically equivalent to Thompson sampling (TS) and results in near-optimal regret bounds. Similarly, in the bandit setting with Gaussian rewards, we show that WB with additive Gaussian weights achieves near-optimal regret. Beyond these special cases, we show that WB leads to better empirical performance than TS for several reward distributions bounded on $[0,1]$. For the contextual bandit setting, we give practical guidelines that make bootstrapping simple and efficient to implement and result in good empirical performance on real-world datasets. |
Tasks | |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09793v1 |
http://arxiv.org/pdf/1805.09793v1.pdf | |
PWC | https://paperswithcode.com/paper/new-insights-into-bootstrapping-for-bandits |
Repo | |
Framework | |
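The Bernoulli case of weighted bootstrapping (WB) is worth spelling out, since the abstract states it is mathematically equivalent to Thompson sampling. A sketch of the standard construction (our rendering; in particular the use of one pseudo success and one pseudo failure is the usual way to encode a uniform prior):

```python
import numpy as np

def wb_bernoulli_sample(rewards, rng):
    # Weighted bootstrap for a Bernoulli arm: draw an Exp(1) weight per
    # observation (plus one pseudo success and one pseudo failure) and
    # return the weighted mean reward. The ratio of the resulting Gamma
    # sums is a Beta(s+1, f+1) draw — i.e. Thompson sampling under a
    # uniform prior.
    r = np.concatenate([rewards, [1.0, 0.0]])   # pseudo-rewards
    w = rng.exponential(size=r.shape)
    return np.sum(w * r) / np.sum(w)

rng = np.random.default_rng(0)
theta = wb_bernoulli_sample(np.array([1.0, 1.0, 0.0]), rng)   # in (0, 1)
```

A bandit algorithm would draw one such sample per arm each round and pull the arm with the largest sample, exactly as in Thompson sampling.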
DNN or k-NN: That is the Generalize vs. Memorize Question
Title | DNN or k-NN: That is the Generalize vs. Memorize Question |
Authors | Gilad Cohen, Guillermo Sapiro, Raja Giryes |
Abstract | This paper studies the relationship between the classification performed by deep neural networks (DNNs) and the decisions of various classical classifiers, namely k-nearest neighbours (k-NN), support vector machines (SVM) and logistic regression (LR), at various layers of the network. This comparison provides us with new insights into the ability of neural networks to both memorize the training data and generalize to new data at the same time, where k-NN serves as the ideal estimator that perfectly memorizes the data. We show that memorization in non-generalizing networks happens only at the last layers. Moreover, the behavior of DNNs compared to the linear classifiers SVM and LR is much the same on the training and test data regardless of whether the network generalizes. On the other hand, the similarity to k-NN holds only in the absence of overfitting. Our results suggest that k-NN-like behavior of the network on new data is a sign of generalization. Moreover, they show that memorization and generalization, which are traditionally considered to be contradictory, are compatible and complementary. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06822v6 |
http://arxiv.org/pdf/1805.06822v6.pdf | |
PWC | https://paperswithcode.com/paper/dnn-or-k-nn-that-is-the-generalize-vs |
Repo | |
Framework | |
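The probe underlying the comparison is simple: treat a layer's activations as a feature space and run k-NN there, then compare its decisions to the network's own. A minimal sketch of the k-NN side (plain Euclidean 1-NN; the paper's setup covers several layers and classifiers):

```python
import numpy as np

def knn_on_activations(train_acts, train_labels, test_acts, k=1):
    # Classify each test activation by majority vote among its k nearest
    # training activations — the "perfect memorizer" baseline.
    preds = []
    for x in test_acts:
        d = np.linalg.norm(train_acts - x, axis=1)
        nearest = np.argsort(d)[:k]
        votes = train_labels[nearest]
        preds.append(np.bincount(votes).argmax())
    return np.array(preds)

acts = np.array([[0.0, 0.0], [1.0, 1.0]])   # activations of 2 training points
labels = np.array([0, 1])
pred = knn_on_activations(acts, labels, np.array([[0.9, 1.1]]), k=1)
```

Agreement between these predictions and the DNN's own test-set predictions is what the paper reads as a sign of generalization.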
Virtualization of tissue staining in digital pathology using an unsupervised deep learning approach
Title | Virtualization of tissue staining in digital pathology using an unsupervised deep learning approach |
Authors | Amal Lahiani, Jacob Gildenblat, Irina Klaman, Shadi Albarqouni, Nassir Navab, Eldad Klaiman |
Abstract | Histopathological evaluation of tissue samples is a key practice in patient diagnosis and drug development, especially in oncology. Historically, Hematoxylin and Eosin (H&E) has been used by pathologists as a gold standard staining. However, in many cases, various target specific stains, including immunohistochemistry (IHC), are needed in order to highlight specific structures in the tissue. As tissue is scarce and staining procedures are tedious, it would be beneficial to generate images of stained tissue virtually. Virtual staining could also generate in-silico multiplexing of different stains on the same tissue segment. In this paper, we present a sample application that generates FAP-CK virtual IHC images from Ki67-CD8 real IHC images using an unsupervised deep learning approach based on CycleGAN. We also propose a method to deal with tiling artifacts caused by normalization layers and we validate our approach by comparing the results of tissue analysis algorithms for virtual and real images. |
Tasks | |
Published | 2018-10-15 |
URL | http://arxiv.org/abs/1810.06415v1 |
http://arxiv.org/pdf/1810.06415v1.pdf | |
PWC | https://paperswithcode.com/paper/virtualization-of-tissue-staining-in-digital |
Repo | |
Framework | |
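Since the approach is based on CycleGAN, its core unpaired-translation constraint is the cycle-consistency loss: translating a Ki67-CD8 image to FAP-CK and back should reproduce the input. A sketch of just that term with stub translators (the real model uses two convolutional generators and adversarial losses as well):

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    # L1 distance between the input and its round-trip translation
    # F(G(x)) — the CycleGAN term that makes unpaired training possible.
    return np.mean(np.abs(F(G(x)) - x))

# Stub generators that happen to be exact inverses, so the loss is zero.
x = np.ones((4, 4))
loss = cycle_consistency_loss(x, lambda a: a * 2, lambda a: a / 2)
```

The paper's additional contribution, handling tiling artifacts from normalization layers, sits on top of this training objective.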