Paper Group ANR 588
On the asymptotic optimality of the comb strategy for prediction with expert advice
Title | On the asymptotic optimality of the comb strategy for prediction with expert advice |
Authors | Erhan Bayraktar, Ibrahim Ekren, Yili Zhang |
Abstract | For the problem of prediction with expert advice in the adversarial setting with geometric stopping, we compute the exact leading order expansion for the long time behavior of the value function. Then, we use this expansion to prove that as conjectured in Gravin et al. [12], the comb strategies are indeed asymptotically optimal for the adversary in the case of 4 experts. |
Tasks | |
Published | 2019-02-06 |
URL | https://arxiv.org/abs/1902.02368v2 |
PDF | https://arxiv.org/pdf/1902.02368v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-asymptotic-optimality-of-the-comb |
Repo | |
Framework | |
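The comb strategy named above assigns, to the experts ranked by cumulative loss, an alternating 0/1 loss pattern (the "teeth" of the comb). Below is a minimal simulation sketch under geometric stopping; the uniform-weights placeholder learner, the tie-breaking, and which ranks form the teeth are our own choices, not taken from the paper:

```python
import random

def comb_losses(cum_losses):
    """Comb adversary: rank experts by cumulative loss and assign
    loss 1 to every other expert in that ranking."""
    order = sorted(range(len(cum_losses)), key=lambda i: cum_losses[i])
    losses = [0] * len(cum_losses)
    for rank, expert in enumerate(order):
        losses[expert] = rank % 2          # alternating teeth of the comb
    return losses

def simulate_regret(n_experts=4, stop_prob=0.01, seed=0):
    """Play until a geometric stopping time; return the regret of a
    uniform-weights learner against the comb adversary."""
    rng = random.Random(seed)
    cum = [0] * n_experts
    learner_loss = 0.0
    while rng.random() > stop_prob:        # geometric stopping
        losses = comb_losses(cum)
        learner_loss += sum(losses) / n_experts
        cum = [c + l for c, l in zip(cum, losses)]
    return learner_loss - min(cum)         # regret vs. best expert

print(simulate_regret())
```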
Visually Grounded Neural Syntax Acquisition
Title | Visually Grounded Neural Syntax Acquisition |
Authors | Haoyue Shi, Jiayuan Mao, Kevin Gimpel, Karen Livescu |
Abstract | We present the Visually Grounded Neural Syntax Learner (VG-NSL), an approach for learning syntactic representations and structures without any explicit supervision. The model learns by looking at natural images and reading paired captions. VG-NSL generates constituency parse trees of texts, recursively composes representations for constituents, and matches them with images. We define concreteness of constituents by their matching scores with images, and use it to guide the parsing of text. Experiments on the MSCOCO data set show that VG-NSL outperforms various unsupervised parsing approaches that do not use visual grounding, in terms of F1 scores against gold parse trees. We find that VG-NSL is much more stable with respect to the choice of random initialization and the amount of training data. We also find that the concreteness acquired by VG-NSL correlates well with a similar measure defined by linguists. Finally, we also apply VG-NSL to multiple languages in the Multi30K data set, showing that our model consistently outperforms prior unsupervised approaches. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.02890v2 |
PDF | https://arxiv.org/pdf/1906.02890v2.pdf |
PWC | https://paperswithcode.com/paper/visually-grounded-neural-syntax-acquisition |
Repo | |
Framework | |
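A toy sketch of the concreteness-guided composition described above: adjacent constituents are merged bottom-up, always picking the pair whose composed vector scores highest against the image. The random embeddings, normalized-sum composition, and dot-product concreteness score below are stand-ins for VG-NSL's learned components:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = {w: rng.normal(size=16) for w in "a cat sat on the mat".split()}
IMAGE = rng.normal(size=16)                    # stand-in image embedding

def compose(u, v):
    s = u + v
    return s / (np.linalg.norm(s) + 1e-8)      # normalized sum as composition

def concreteness(v):
    return float(v @ IMAGE)                    # image-text matching score

def greedy_parse(tokens):
    """Repeatedly merge the adjacent pair whose composed representation
    is most 'concrete' (best image match) into one constituent."""
    spans = [(t, VOCAB[t]) for t in tokens]
    while len(spans) > 1:
        i = max(range(len(spans) - 1),
                key=lambda j: concreteness(compose(spans[j][1], spans[j + 1][1])))
        (lt, lv), (rt, rv) = spans[i], spans[i + 1]
        spans[i:i + 2] = [((lt, rt), compose(lv, rv))]
    return spans[0][0]                         # nested tuples = parse tree

print(greedy_parse("a cat sat on the mat".split()))
```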
Differential Similarity in Higher Dimensional Spaces: Theory and Applications
Title | Differential Similarity in Higher Dimensional Spaces: Theory and Applications |
Authors | L. Thorne McCarty |
Abstract | This paper presents an extension and an elaboration of the theory of differential similarity, which was originally proposed in arXiv:1401.2411 [cs.LG]. The goal is to develop an algorithm for clustering and coding that combines a geometric model with a probabilistic model in a principled way. For simplicity, the geometric model in the earlier paper was restricted to the three-dimensional case. The present paper removes this restriction, and considers the full $n$-dimensional case. Although the mathematical model is the same, the strategies for computing solutions in the $n$-dimensional case are different, and one of the main purposes of this paper is to develop and analyze these strategies. Another main purpose is to devise techniques for estimating the parameters of the model from sample data, again in $n$ dimensions. We evaluate the solution strategies and the estimation techniques by applying them to two familiar real-world examples: the classical MNIST dataset and the CIFAR-10 dataset. |
Tasks | |
Published | 2019-02-10 |
URL | http://arxiv.org/abs/1902.03667v1 |
PDF | http://arxiv.org/pdf/1902.03667v1.pdf |
PWC | https://paperswithcode.com/paper/differential-similarity-in-higher-dimensional |
Repo | |
Framework | |
Morphological Segmentation Inside-Out
Title | Morphological Segmentation Inside-Out |
Authors | Ryan Cotterell, Arun Kumar, Hinrich Schütze |
Abstract | Morphological segmentation has traditionally been modeled with non-hierarchical models, which yield flat segmentations as output. In many cases, however, proper morphological analysis requires hierarchical structure – especially in the case of derivational morphology. In this work, we introduce a discriminative, joint model of morphological segmentation along with the orthographic changes that occur during word formation. To the best of our knowledge, this is the first attempt to approach discriminative segmentation with a context-free model. Additionally, we release an annotated treebank of 7454 English words with constituency parses, encouraging future research in this area. |
Tasks | Morphological Analysis |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04916v1 |
PDF | https://arxiv.org/pdf/1911.04916v1.pdf |
PWC | https://paperswithcode.com/paper/morphological-segmentation-inside-out-1 |
Repo | |
Framework | |
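The context-free view above means a word's segmentation is a binary constituency tree over its substrings, which can be searched with Viterbi-style CKY. A minimal sketch with a toy lexicon-based scorer; the paper's model is discriminative and also handles orthographic changes during word formation, both of which this omits:

```python
from functools import lru_cache

LEXICON = {"un", "lock", "able", "re", "do", "er"}  # toy morpheme inventory

def leaf_score(morph):
    # toy scorer: known morphemes are free, unknown strings pay per character
    return 0.0 if morph in LEXICON else -len(morph)

def best_parse(word):
    """Viterbi CKY over all binary trees whose leaves segment `word`;
    returns (score, tree) with trees as nested tuples."""
    @lru_cache(maxsize=None)
    def chart(i, j):
        best = (leaf_score(word[i:j]), word[i:j])  # span as one morpheme
        for k in range(i + 1, j):                  # or a binary split at k
            ls, lt = chart(i, k)
            rs, rt = chart(k, j)
            if ls + rs > best[0]:
                best = (ls + rs, (lt, rt))
        return best
    return chart(0, len(word))

print(best_parse("unlockable"))   # (0.0, ('un', ('lock', 'able')))
```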
Enabling Explainable Fusion in Deep Learning with Fuzzy Integral Neural Networks
Title | Enabling Explainable Fusion in Deep Learning with Fuzzy Integral Neural Networks |
Authors | Muhammad Aminul Islam, Derek T. Anderson, Anthony J. Pinar, Timothy C. Havens, Grant Scott, James M. Keller |
Abstract | Information fusion is an essential part of numerous engineering systems and biological functions, e.g., human cognition. Fusion occurs at many levels, ranging from the low-level combination of signals to the high-level aggregation of heterogeneous decision-making processes. While the last decade has witnessed an explosion of research in deep learning, fusion in neural networks has not observed the same revolution. Specifically, most neural fusion approaches are ad hoc, poorly understood, distributed rather than localized, and/or offer little to no explainability. Herein, we prove that the fuzzy Choquet integral (ChI), a powerful nonlinear aggregation function, can be represented as a multi-layer network, referred to hereafter as ChIMP. We also put forth an improved ChIMP (iChIMP) that admits stochastic gradient descent-based optimization despite the exponential number of ChI inequality constraints. An additional benefit of ChIMP/iChIMP is that it enables eXplainable AI (XAI). Synthetic validation experiments are provided, and iChIMP is applied to the fusion of a set of heterogeneous-architecture deep models in remote sensing. We show an improvement in model accuracy, and our previously established XAI indices shed light on the quality of our data, model, and its decisions. |
Tasks | Decision Making |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04394v1 |
PDF | https://arxiv.org/pdf/1905.04394v1.pdf |
PWC | https://paperswithcode.com/paper/enabling-explainable-fusion-in-deep-learning |
Repo | |
Framework | |
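The aggregation ChIMP encodes as a network is the discrete Choquet integral: sort the inputs, then weight each by the marginal change in the fuzzy measure over the still-active sources. A minimal evaluation sketch with a fixed measure `g`; the papers' contribution is representing this as trainable layers and handling the monotonicity constraints under SGD, which this does not attempt:

```python
def choquet(x, g):
    """Discrete Choquet integral of inputs x w.r.t. fuzzy measure g.
    g maps frozensets of input indices to [0, 1], with g(empty)=0,
    g(all)=1, and g monotone under set inclusion."""
    order = sorted(range(len(x)), key=lambda i: x[i])     # ascending inputs
    value = 0.0
    for pos, idx in enumerate(order):
        a_now = frozenset(order[pos:])                    # still-active sources
        a_next = frozenset(order[pos + 1:])
        value += x[idx] * (g[a_now] - g.get(a_next, 0.0))
    return value

# two-source example: g encodes trust in each source and in their union
g = {frozenset([0]): 0.7, frozenset([1]): 0.4, frozenset([0, 1]): 1.0}
print(choquet([0.2, 0.9], g))   # 0.48, between the min and max inputs
```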
Face-to-Parameter Translation for Game Character Auto-Creation
Title | Face-to-Parameter Translation for Game Character Auto-Creation |
Authors | Tianyang Shi, Yi Yuan, Changjie Fan, Zhengxia Zou, Zhenwei Shi, Yong Liu |
Abstract | Character customization systems are an important component of Role-Playing Games (RPGs), where players are allowed to edit the facial appearance of their in-game characters with their own preferences rather than using default templates. This paper proposes a method for automatically creating a player's in-game character from an input face photo. We formulate this “artistic creation” process under a facial similarity measurement and parameter searching paradigm by solving an optimization problem over a large set of physically meaningful facial parameters. To effectively minimize the distance between the created face and the real one, two loss functions, i.e., a “discriminative loss” and a “facial content loss”, are specifically designed. As the rendering process of a game engine is not differentiable, a generative network is further introduced as an “imitator” to mimic the physical behavior of the game engine, so that the proposed method can be implemented under a neural style transfer framework and the parameters can be optimized by gradient descent. Experimental results demonstrate that our method achieves a high degree of similarity between the input face photo and the created in-game character in terms of both global appearance and local details. Our method has been deployed in a new game last year and has now been used by players over 1 million times. |
Tasks | Style Transfer |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01064v1 |
PDF | https://arxiv.org/pdf/1909.01064v1.pdf |
PWC | https://paperswithcode.com/paper/face-to-parameter-translation-for-game |
Repo | |
Framework | |
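Because the game engine's renderer is not differentiable, gradients flow through the learned imitator instead. A minimal sketch of that parameter-search loop in PyTorch, assuming hypothetical pretrained modules `imitator` (parameters → rendered face) and `feat` (image → embedding, a single stand-in for the paper's discriminative and facial content losses):

```python
import torch

def fit_parameters(imitator, feat, photo, n_params=200, steps=300, lr=0.05):
    """Gradient-descend facial parameters through a frozen imitator
    network that approximates the game engine's renderer."""
    p = torch.zeros(1, n_params, requires_grad=True)
    target = feat(photo).detach()               # fixed features of the photo
    opt = torch.optim.Adam([p], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        rendered = imitator(torch.sigmoid(p))   # keep parameters in [0, 1]
        loss = torch.nn.functional.l1_loss(feat(rendered), target)
        loss.backward()                         # gradients reach p via imitator
        opt.step()
    return torch.sigmoid(p).detach()            # hand off to the real engine
```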
Divided We Stand: A Novel Residual Group Attention Mechanism for Medical Image Segmentation
Title | Divided We Stand: A Novel Residual Group Attention Mechanism for Medical Image Segmentation |
Authors | Chaitanya Kaul, Nick Pears, Suresh Manandhar |
Abstract | Given that convolutional neural networks extract features via learned convolution kernels, it makes sense to design better kernels, which can in turn lead to better feature extraction. In this paper, we propose a new residual block for convolutional neural networks in the context of medical image segmentation. We combine attention mechanisms with group convolutions to create our group attention mechanism, which forms the fundamental building block of FocusNetAlpha, our convolutional autoencoder. We adapt a hybrid loss based on balanced cross entropy, Tversky loss, and the adaptive logarithmic loss to create a loss function that converges faster and more accurately to the minimum solution. Compared with different residual block variants, we observed a 5.6% increase in IoU on the ISIC 2017 dataset over the basic residual block and a 1.3% increase over the ResNeXt group convolution block. Our results show that FocusNetAlpha achieves state-of-the-art results across all metrics on the ISIC 2018 melanoma segmentation, cell nuclei segmentation, and DRIVE retinal blood vessel segmentation datasets, with fewer parameters and FLOPs. Our code and pre-trained models will be publicly available on GitHub to maximize reproducibility. |
Tasks | Medical Image Segmentation, Semantic Segmentation |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02079v1 |
PDF | https://arxiv.org/pdf/1912.02079v1.pdf |
PWC | https://paperswithcode.com/paper/divided-we-stand-a-novel-residual-group |
Repo | |
Framework | |
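A sketch of two of the three hybrid-loss ingredients named above, for binary masks with sigmoid outputs. The alpha/beta convention for the Tversky term and the equal term weights are our assumptions, and the adaptive logarithmic loss term is omitted:

```python
import torch

def tversky_loss(pred, target, alpha=0.7, beta=0.3, eps=1e-6):
    """Tversky loss: alpha weights false negatives, beta false positives
    (conventions vary; alpha = beta = 0.5 recovers the Dice loss)."""
    pred = pred.reshape(pred.size(0), -1)
    target = target.reshape(target.size(0), -1)
    tp = (pred * target).sum(dim=1)
    fn = ((1 - pred) * target).sum(dim=1)
    fp = (pred * (1 - target)).sum(dim=1)
    return (1 - (tp + eps) / (tp + alpha * fn + beta * fp + eps)).mean()

def balanced_bce(pred, target, eps=1e-6):
    """Cross entropy with the positive class re-weighted by its rarity."""
    w = 1 - target.mean()                        # fraction of background
    return -(w * target * torch.log(pred + eps)
             + (1 - w) * (1 - target) * torch.log(1 - pred + eps)).mean()

def hybrid_loss(pred, target):
    # equal weighting is a guess; the paper also adds an adaptive
    # logarithmic loss term, omitted in this sketch
    return balanced_bce(pred, target) + tversky_loss(pred, target)
```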
Towards Robust Learning-Based Pose Estimation of Noncooperative Spacecraft
Title | Towards Robust Learning-Based Pose Estimation of Noncooperative Spacecraft |
Authors | Tae Ha Park, Sumant Sharma, Simone D’Amico |
Abstract | This work presents a novel Convolutional Neural Network (CNN) architecture and a training procedure to enable robust and accurate pose estimation of a noncooperative spacecraft. First, a new CNN architecture is introduced that scored fourth place in the recent Pose Estimation Challenge hosted by Stanford’s Space Rendezvous Laboratory (SLAB) and the Advanced Concepts Team (ACT) of the European Space Agency (ESA). The proposed architecture first detects the object by regressing a 2D bounding box; then a separate network regresses the 2D locations of the known surface keypoints from an image of the target cropped around the detected Region-of-Interest (RoI). In a single-image pose estimation problem, the extracted 2D keypoints can be used in conjunction with corresponding 3D model coordinates to compute relative pose via the Perspective-n-Point (PnP) problem. These keypoint locations have known correspondences to those in the 3D model, since the CNN is trained to predict them in a pre-defined order, allowing the computationally expensive feature matching process to be bypassed. This work also introduces and explores texture randomization for training CNNs for spaceborne applications. Specifically, Neural Style Transfer (NST) is applied to randomize the texture of the spacecraft in synthetically rendered images. It is shown that training on texture-randomized images of the spacecraft improves the network’s performance on spaceborne images without exposure to them during training. It is also shown that when using the texture-randomized spacecraft images during training, regressing 3D bounding box corners leads to better performance on spaceborne images than regressing surface keypoints, as NST inevitably distorts the spacecraft’s geometric features, to which the surface keypoints are more closely tied. |
Tasks | Pose Estimation, Style Transfer |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00392v1 |
PDF | https://arxiv.org/pdf/1909.00392v1.pdf |
PWC | https://paperswithcode.com/paper/towards-robust-learning-based-pose-estimation |
Repo | |
Framework | |
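A sketch of the PnP step described above: because the CNN predicts keypoints in a fixed order matching the known 3D model points, the relative pose follows directly from `cv2.solvePnP` with no feature matching. The camera intrinsics `K` are assumed known and distortion is assumed zero:

```python
import numpy as np
import cv2

def estimate_pose(keypoints_2d, model_points_3d, K):
    """Recover rotation R (3x3) and translation tvec (3x1) from 2D-3D
    correspondences; needs at least 4 points for EPnP."""
    ok, rvec, tvec = cv2.solvePnP(
        model_points_3d.astype(np.float64),   # (N, 3) model coordinates
        keypoints_2d.astype(np.float64),      # (N, 2) CNN-predicted pixels
        K, distCoeffs=None,
        flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        raise RuntimeError("PnP failed")
    R, _ = cv2.Rodrigues(rvec)                # rotation vector -> matrix
    return R, tvec
```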
Implicit Deep Latent Variable Models for Text Generation
Title | Implicit Deep Latent Variable Models for Text Generation |
Authors | Le Fang, Chunyuan Li, Jianfeng Gao, Wen Dong, Changyou Chen |
Abstract | Deep latent variable models (LVMs) such as the variational auto-encoder (VAE) have recently played an important role in text generation. One key factor is the exploitation of smooth latent structures to guide the generation. However, the representation power of VAEs is limited for two reasons: (1) the Gaussian assumption is often made on the variational posteriors, and meanwhile (2) a notorious “posterior collapse” issue occurs. In this paper, we advocate sample-based representations of variational distributions for natural language, leading to implicit latent features, which provide more flexible representation power than Gaussian-based posteriors. We further develop an LVM to directly match the aggregated posterior to the prior. It can be viewed as a natural extension of VAEs with a regularization of maximizing mutual information, mitigating the “posterior collapse” issue. We demonstrate the effectiveness and versatility of our models in various text generation scenarios, including language modeling, unaligned style transfer, and dialog response generation. The source code to reproduce our experimental results is available on GitHub. |
Tasks | Language Modelling, Latent Variable Models, Style Transfer, Text Generation |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11527v3 |
PDF | https://arxiv.org/pdf/1908.11527v3.pdf |
PWC | https://paperswithcode.com/paper/implicit-deep-latent-variable-models-for-text |
Repo | |
Framework | |
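A sketch of the sample-based posterior idea: an encoder that injects noise defines an implicit (non-Gaussian) variational distribution, and the aggregated posterior can be pushed toward the prior with a sample-based divergence. We use an RBF-kernel MMD here purely as a stand-in for the paper's own matching estimator:

```python
import torch

def rbf_mmd2(x, y, sigma=1.0):
    """Biased MMD^2 estimate with an RBF kernel between two sample sets."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

class ImplicitEncoder(torch.nn.Module):
    """Implicit posterior z = f(x, eps): injected noise makes the
    variational distribution sample-based rather than Gaussian."""
    def __init__(self, d_x, d_z, d_h=128):
        super().__init__()
        self.d_z = d_z
        self.net = torch.nn.Sequential(
            torch.nn.Linear(d_x + d_z, d_h), torch.nn.ReLU(),
            torch.nn.Linear(d_h, d_z))

    def forward(self, x):
        eps = torch.randn(x.size(0), self.d_z)        # injected noise
        return self.net(torch.cat([x, eps], dim=1))

# training-step sketch: reconstruction + aggregated-posterior matching
# z = enc(x); loss = recon(dec(z), x) + rbf_mmd2(z, torch.randn_like(z))
```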
Unsupervised Multi-modal Style Transfer for Cardiac MR Segmentation
Title | Unsupervised Multi-modal Style Transfer for Cardiac MR Segmentation |
Authors | Chen Chen, Cheng Ouyang, Giacomo Tarroni, Jo Schlemper, Huaqi Qiu, Wenjia Bai, Daniel Rueckert |
Abstract | In this work, we present a fully automatic method to segment cardiac structures from late-gadolinium enhanced (LGE) images without using labelled LGE data for training, but instead by transferring the anatomical knowledge and features learned on annotated balanced steady-state free precession (bSSFP) images, which are easier to acquire. Our framework mainly consists of two neural networks: a multi-modal image translation network for style transfer and a cascaded segmentation network for image segmentation. The multi-modal image translation network generates realistic and diverse synthetic LGE images conditioned on a single annotated bSSFP image, forming a synthetic LGE training set. This set is then utilized to fine-tune the segmentation network pre-trained on labelled bSSFP images, achieving the goal of unsupervised LGE image segmentation. In particular, the proposed cascaded segmentation network is able to produce accurate segmentation by taking both shape prior and image appearance into account, achieving an average Dice score of 0.92 for the left ventricle, 0.83 for the myocardium, and 0.88 for the right ventricle on the test set. |
Tasks | Semantic Segmentation, Style Transfer |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07344v3 |
PDF | https://arxiv.org/pdf/1908.07344v3.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-multi-modal-style-transfer-for |
Repo | |
Framework | |
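The per-structure Dice scores reported above (0.92 left ventricle, 0.83 myocardium, 0.88 right ventricle) are computed as below, assuming integer label maps for prediction and ground truth:

```python
import numpy as np

def dice(pred, truth, label):
    """Dice overlap for one structure in two integer label maps."""
    p, t = pred == label, truth == label
    denom = p.sum() + t.sum()
    return 2.0 * np.logical_and(p, t).sum() / denom if denom else float("nan")
```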
DpgMedia2019: A Dutch News Dataset for Partisanship Detection
Title | DpgMedia2019: A Dutch News Dataset for Partisanship Detection |
Authors | Chia-Lun Yeh, Babak Loni, Mariëlle Hendriks, Henrike Reinhardt, Anne Schuth |
Abstract | We present a new Dutch news dataset with labeled partisanship. The dataset contains more than 100K articles that are labeled on the publisher level and 776 articles that were crowdsourced using an internal survey platform and labeled on the article level. In this paper, we document our original motivation, the collection and annotation process, limitations, and applications. |
Tasks | |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02322v1 |
PDF | https://arxiv.org/pdf/1908.02322v1.pdf |
PWC | https://paperswithcode.com/paper/dpgmedia2019-a-dutch-news-dataset-for |
Repo | |
Framework | |
Biometric Presentation Attack Detection: Beyond the Visible Spectrum
Title | Biometric Presentation Attack Detection: Beyond the Visible Spectrum |
Authors | Ruben Tolosana, Marta Gomez-Barrero, Christoph Busch, Javier Ortega-Garcia |
Abstract | The increased need for unattended authentication in multiple scenarios has motivated a wide deployment of biometric systems in the last few years. This has in turn led to the disclosure of security concerns specifically related to biometric systems. Among them, Presentation Attacks (PAs, i.e., attempts to log into the system with a fake biometric characteristic or presentation attack instrument) pose a severe threat to the security of the system: any person could eventually fabricate or order a gummy finger or face mask to impersonate someone else. The biometrics community has thus devoted considerable effort to the development of automatic Presentation Attack Detection (PAD) mechanisms, for instance through the international LivDet competitions. In this context, we present a novel fingerprint PAD scheme based on (i) a new capture device able to acquire images within the short wave infrared (SWIR) spectrum, and (ii) an in-depth analysis of several state-of-the-art techniques based on both handcrafted and deep learning features. The approach is evaluated on a database comprising over 4700 samples, stemming from 562 different subjects and 35 different presentation attack instrument (PAI) species. The results show the soundness of the proposed approach, with a detection equal error rate (D-EER) as low as 1.36% even in a realistic scenario where five different PAI species are considered only for testing purposes (i.e., unknown attacks). |
Tasks | |
Published | 2019-02-28 |
URL | http://arxiv.org/abs/1902.11065v1 |
PDF | http://arxiv.org/pdf/1902.11065v1.pdf |
PWC | https://paperswithcode.com/paper/biometric-presentation-attack-detection |
Repo | |
Framework | |
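A sketch of the D-EER metric quoted above: sweep a decision threshold over the PAD scores and report the error rate where the attack error rate (APCER) and the bona fide error rate (BPCER) cross. The score polarity (higher = more likely bona fide) is our assumption:

```python
import numpy as np

def detection_eer(bona_fide_scores, attack_scores):
    """D-EER from two numpy score arrays: the (approximate) crossing
    point of APCER and BPCER over all candidate thresholds."""
    thresholds = np.sort(np.concatenate([bona_fide_scores, attack_scores]))
    best = (np.inf, None)
    for t in thresholds:
        apcer = np.mean(attack_scores >= t)      # attacks accepted
        bpcer = np.mean(bona_fide_scores < t)    # genuine rejected
        gap = abs(apcer - bpcer)
        if gap < best[0]:
            best = (gap, (apcer + bpcer) / 2)
    return best[1]
```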
Alpha MAML: Adaptive Model-Agnostic Meta-Learning
Title | Alpha MAML: Adaptive Model-Agnostic Meta-Learning |
Authors | Harkirat Singh Behl, Atılım Güneş Baydin, Philip H. S. Torr |
Abstract | Model-agnostic meta-learning (MAML) is a meta-learning technique to train a model on a multitude of learning tasks in a way that primes the model for few-shot learning of new tasks. The MAML algorithm performs well on few-shot learning problems in classification, regression, and fine-tuning of policy gradients in reinforcement learning, but comes with the need for costly hyperparameter tuning for training stability. We address this shortcoming by introducing an extension to MAML, called Alpha MAML, to incorporate an online hyperparameter adaptation scheme that eliminates the need to tune meta-learning and learning rates. Our results with the Omniglot database demonstrate a substantial reduction in the need to tune MAML training hyperparameters and improvement to training stability with less sensitivity to hyperparameter choice. |
Tasks | Few-Shot Learning, Meta-Learning, Omniglot |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07435v1 |
PDF | https://arxiv.org/pdf/1905.07435v1.pdf |
PWC | https://paperswithcode.com/paper/alpha-maml-adaptive-model-agnostic-meta |
Repo | |
Framework | |
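The online adaptation scheme is based on hypergradient descent: nudge a learning rate along the dot product of consecutive gradients. A single-loop sketch of that mechanism only; Alpha MAML applies it to both the inner and meta learning rates inside the full bi-level MAML loop, which is omitted here:

```python
import torch

def hypergradient_sgd(params, loss_fn, data, alpha=0.01, kappa=1e-4):
    """SGD whose learning rate adapts online via hypergradient descent:
    alpha <- alpha + kappa * <grad_t, grad_{t-1}>."""
    prev_grads = None
    for batch in data:
        loss = loss_fn(params, batch)
        grads = torch.autograd.grad(loss, params)
        if prev_grads is not None:
            h = sum((g * pg).sum() for g, pg in zip(grads, prev_grads))
            alpha = alpha + kappa * h.item()     # hypergradient step
        with torch.no_grad():
            for p, g in zip(params, grads):
                p -= alpha * g                   # ordinary SGD step
        prev_grads = [g.detach() for g in grads]
    return params, alpha
```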
Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions
Title | Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions |
Authors | Zhi-Xiu Ye, Zhen-Hua Ling |
Abstract | This paper presents a neural relation extraction method to deal with the noisy training data generated by distant supervision. Previous studies mainly focus on sentence-level de-noising by designing neural networks with intra-bag attentions. In this paper, both intra-bag and inter-bag attentions are considered, in order to deal with noise at the sentence and bag levels, respectively. First, relation-aware bag representations are calculated by weighting sentence embeddings using intra-bag attention. Here, each possible relation is utilized as the query for attention calculation, instead of only the target relation as in conventional methods. Furthermore, the representation of a group of bags in the training set that share the same relation label is calculated by weighting bag representations using a similarity-based inter-bag attention module. Finally, a bag group is utilized as a training sample when building our relation extractor. Experimental results on the New York Times dataset demonstrate the effectiveness of the proposed intra-bag and inter-bag attention modules. Our method also achieves better relation extraction accuracy than state-of-the-art methods on this dataset. |
Tasks | Relation Extraction, Sentence Embeddings |
Published | 2019-03-30 |
URL | http://arxiv.org/abs/1904.00143v1 |
PDF | http://arxiv.org/pdf/1904.00143v1.pdf |
PWC | https://paperswithcode.com/paper/distant-supervision-relation-extraction-with |
Repo | |
Framework | |
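A sketch of the two attention modules described above, under our own simplifying choices: dot-product attention scores, and an inter-bag weight for each bag proportional to its summed similarity to the other bags in the group (one plausible reading of "similarity-based"):

```python
import torch

def intra_bag_attention(sent_embs, rel_embs):
    """Relation-aware bag representations: for every candidate relation
    query, attention-weight the bag's sentence embeddings and pool.
    sent_embs: (n_sent, d); rel_embs: (n_rel, d) -> (n_rel, d)."""
    scores = rel_embs @ sent_embs.T               # (n_rel, n_sent)
    attn = torch.softmax(scores, dim=1)
    return attn @ sent_embs

def inter_bag_attention(bag_reprs):
    """Group representation: weight each bag by how similar it is to the
    other bags sharing the label. bag_reprs: (n_bag, d) -> (d,)."""
    sim = bag_reprs @ bag_reprs.T                 # pairwise similarities
    weights = torch.softmax(sim.sum(dim=1) - sim.diag(), dim=0)
    return weights @ bag_reprs
```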
Grinding the Space: Learning to Classify Against Strategic Agents
Title | Grinding the Space: Learning to Classify Against Strategic Agents |
Authors | Yiling Chen, Yang Liu, Chara Podimata |
Abstract | We study the problem of online learning in strategic classification settings from the perspective of the learner, who repeatedly faces myopically rational strategic agents. We model this interplay as a repeated Stackelberg game: at each timestep the learner first deploys a high-dimensional linear classifier, and an agent, after observing the classifier and according to his underlying utility function and real feature vector, best-responds with a (potentially altered) feature vector. We measure the performance of the learner in terms of Stackelberg regret for her 0-1 loss function. Surprisingly, we prove that in strategic settings like the one considered in this paper, there exist worst-case scenarios where any sequence of actions providing sublinear external regret might result in linear Stackelberg regret, and vice versa. We then provide the Grinder Algorithm, an adaptive discretization algorithm potentially of independent interest in the online learning community, and prove a data-dependent upper bound on its Stackelberg regret given oracle access, while it remains computationally efficient. We also provide a nearly matching lower bound for the problem of strategic classification. We complement our theoretical analysis with simulation results, which suggest that our algorithm outperforms the benchmarks even when given access only to approximation oracles. Our results advance the known state of the art in the growing literature of online learning from revealed preferences, which has so far focused on smoother utility and loss functions from the perspectives of the agents and the learner, respectively. |
Tasks | |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.04004v1 |
PDF | https://arxiv.org/pdf/1911.04004v1.pdf |
PWC | https://paperswithcode.com/paper/grinding-the-space-learning-to-classify |
Repo | |
Framework | |
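A toy version of the agent side of the Stackelberg game above: a myopically rational agent moves his feature vector just across the learner's hyperplane whenever the classification gain outweighs the manipulation cost. The quadratic cost and fixed gain are illustrative assumptions; the paper treats more general utilities:

```python
import numpy as np

def best_response(x, w, b, cost_per_unit=1.0, gain=2.0):
    """Report x unchanged if already classified positive or if moving is
    not worth it; otherwise move just past the boundary w.x + b = 0."""
    margin = w @ x + b
    if margin >= 0:
        return x                                    # already positive
    dist = -margin / np.linalg.norm(w)              # distance to boundary
    if gain > cost_per_unit * dist ** 2:            # utility of manipulating
        return x + (dist + 1e-6) * w / np.linalg.norm(w)
    return x                                        # not worth the cost

w, b = np.array([1.0, -1.0]), -0.5
print(best_response(np.array([0.2, 0.1]), w, b))    # moved across the boundary
```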