Paper Group AWR 89
Subsampling Generative Adversarial Networks: Density Ratio Estimation in Feature Space with Softplus Loss
Title | Subsampling Generative Adversarial Networks: Density Ratio Estimation in Feature Space with Softplus Loss |
Authors | Xin Ding, Z. Jane Wang, William J. Welch |
Abstract | Filtering out unrealistic images from trained generative adversarial networks (GANs) has attracted considerable attention recently. Two density ratio based subsampling methods—Discriminator Rejection Sampling (DRS) and Metropolis-Hastings GAN (MH-GAN)—were recently proposed, and their effectiveness in improving GANs was demonstrated on multiple datasets. However, DRS and MH-GAN are based on discriminator based density ratio estimation (DRE) methods, so they may not work well if the discriminator in the trained GAN is far from optimal. Moreover, they do not apply to some GANs (e.g., MMD-GAN). In this paper, we propose a novel Softplus (SP) loss for DRE. Based on it, we develop a sample-based DRE method in a feature space learned by a specially designed and pre-trained ResNet-34 (DRE-F-SP). We derive the rate of convergence of a density ratio model trained under the SP loss. Then, we propose three different density ratio subsampling methods (DRE-F-SP+RS, DRE-F-SP+MH, and DRE-F-SP+SIR) for GANs based on DRE-F-SP. Our subsampling methods do not rely on the optimality of the discriminator and are suitable for all types of GANs. We empirically show our subsampling approach can substantially outperform DRS and MH-GAN on a synthetic dataset and the CIFAR-10 dataset, using multiple GANs. |
Tasks | |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.10670v5 |
https://arxiv.org/pdf/1909.10670v5.pdf | |
PWC | https://paperswithcode.com/paper/subsampling-generative-adversarial-networks |
Repo | https://github.com/UBCDingXin/DDRE_Sampling_GANs |
Framework | pytorch |
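The abstract above names two of its subsampling steps, rejection sampling (RS) and sampling-importance resampling (SIR), both driven by an estimated density ratio. The following is a minimal sketch of those two generic steps, not the authors' DRE-F-SP implementation. It assumes a 1-D toy setting in which the ratio is known in closed form (target N(0, 1), generator N(0, 2^2)); in the paper the ratio comes from a trained model. All function names are hypothetical.

```python
import math
import random
import statistics


def density_ratio(x):
    """Exact ratio p(x)/q(x) for target p = N(0, 1) and generator q = N(0, 2^2).

    Assumption for illustration only: a real pipeline would estimate this ratio.
    """
    return 2.0 * math.exp(-3.0 * x * x / 8.0)


def rejection_subsample(samples, ratios, rng):
    """Keep each sample with probability ratio / max(ratio)."""
    bound = max(ratios)
    return [x for x, r in zip(samples, ratios) if rng.random() < r / bound]


def sir_subsample(samples, ratios, n_out, rng):
    """Sampling-importance resampling: draw with replacement, weighted by ratio."""
    return rng.choices(samples, weights=ratios, k=n_out)


rng = random.Random(0)
fake = [rng.gauss(0.0, 2.0) for _ in range(20000)]  # stand-in for generator output
ratios = [density_ratio(x) for x in fake]

kept = rejection_subsample(fake, ratios, rng)
resampled = sir_subsample(fake, ratios, 5000, rng)

print("std before:", statistics.pstdev(fake))
print("std after rejection sampling:", statistics.pstdev(kept))
print("std after SIR:", statistics.pstdev(resampled))
```

In this toy case both subsampled sets should have a standard deviation close to 1, the target's, while the raw generator output stays close to 2. Rejection sampling returns a variable number of samples; SIR returns exactly `n_out` but may repeat items.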
Towards a Robust Deep Neural Network in Texts: A Survey
Title | Towards a Robust Deep Neural Network in Texts: A Survey |
Authors | Wenqi Wang, Lina Wang, Run Wang, Zhibo Wang, Aoshuang Ye |
Abstract | Deep neural networks (DNNs) have achieved remarkable success in various tasks (e.g., image classification, speech recognition, and natural language processing). However, research has shown that DNN models are vulnerable to adversarial examples, which cause incorrect predictions by adding imperceptible perturbations to normal inputs. Adversarial examples in the image domain have been well investigated, but research on texts remains limited, and a comprehensive survey of the field is lacking. In this paper, we aim to present a comprehensive understanding of adversarial attacks and corresponding mitigation strategies in texts. Specifically, we first give a taxonomy of adversarial attacks and defenses in texts from the perspective of different natural language processing (NLP) tasks, and then introduce how to build a robust DNN model via testing and verification. Finally, we discuss the existing challenges of adversarial attacks and defenses in texts and present future research directions in this emerging field. |
Tasks | Image Classification, Speech Recognition |
Published | 2019-02-12 |
URL | https://arxiv.org/abs/1902.07285v5 |
https://arxiv.org/pdf/1902.07285v5.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-adversarial-attacks-and-defenses |
Repo | https://github.com/ShuaichiLi/Chinese-sentence-similarity-task |
Framework | none |
Unconstrained Facial Expression Transfer using Style-based Generator
Title | Unconstrained Facial Expression Transfer using Style-based Generator |
Authors | Chao Yang, Ser-Nam Lim |
Abstract | Facial expression transfer and reenactment have been an important research problem given their applications in face editing, image manipulation, and fabricated video generation. We present a novel method for image-based facial expression transfer, leveraging the recent style-based GAN shown to be very effective for creating realistic-looking images. Given two face images, our method can create plausible results that combine the appearance of one image and the expression of the other. To achieve this, we first propose an optimization procedure based on StyleGAN to infer from an image a hierarchical style vector that disentangles different attributes of the face. We further introduce a linear combination scheme that fuses the style vectors of the two given images and generates a new face that combines the expression and appearance of the inputs. Our method can create high-quality synthesis with accurate facial reenactment. Unlike many existing methods, our method does not rely on geometry annotations and can be applied to unconstrained facial images of any identity without the need for retraining, making it feasible to generate large-scale expression-transferred results. |
Tasks | |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/1912.06253v1 |
https://arxiv.org/pdf/1912.06253v1.pdf | |
PWC | https://paperswithcode.com/paper/unconstrained-facial-expression-transfer |
Repo | https://github.com/pacifinapacific/StyleGAN_LatentEditor |
Framework | pytorch |
A New Loss Function for CNN Classifier Based on Pre-defined Evenly-Distributed Class Centroids
Title | A New Loss Function for CNN Classifier Based on Pre-defined Evenly-Distributed Class Centroids |
Authors | Qiuyu Zhu, Pengju Zhang, Xin Ye |
Abstract | With the development of convolutional neural networks (CNNs) in recent years, the network structure has become more and more complex and varied, and has achieved very good results in pattern recognition, image classification, object detection and tracking. For CNNs used for image classification, in addition to the network structure, more and more research is now focusing on improving the loss function, so as to enlarge the inter-class feature differences and reduce the intra-class feature variations as much as possible. Besides the traditional Softmax, typical loss functions include L-Softmax, AM-Softmax, ArcFace, and Center loss. Based on the concept of predefined evenly-distributed class centroids (PEDCC) in the CSAE network, this paper proposes a PEDCC-based loss function called PEDCC-Loss, which can make the inter-class distance maximal and the intra-class distance small enough in hidden feature space. Multiple experiments on image classification and face recognition have shown that our method achieves the best recognition accuracy, and that network training is stable and easy to converge. Code is available at https://github.com/ZLeopard/PEDCC-Loss |
Tasks | Face Recognition, Image Classification, Object Detection |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06008v2 |
http://arxiv.org/pdf/1904.06008v2.pdf | |
PWC | https://paperswithcode.com/paper/a-new-loss-function-for-cnn-classifier-based |
Repo | https://github.com/ZLeopard/PEDCC-Loss |
Framework | pytorch |
Restoring ancient text using deep learning: a case study on Greek epigraphy
Title | Restoring ancient text using deep learning: a case study on Greek epigraphy |
Authors | Yannis Assael, Thea Sommerschield, Jonathan Prag |
Abstract | Ancient history relies on disciplines such as epigraphy, the study of ancient inscribed texts, for evidence of the recorded past. However, these texts, “inscriptions”, are often damaged over the centuries, and illegible parts of the text must be restored by specialists, known as epigraphists. This work presents Pythia, the first ancient text restoration model that recovers missing characters from a damaged text input using deep neural networks. Its architecture is carefully designed to handle long-term context information, and deal efficiently with missing or corrupted character and word representations. To train it, we wrote a non-trivial pipeline to convert PHI, the largest digital corpus of ancient Greek inscriptions, to machine actionable text, which we call PHI-ML. On PHI-ML, Pythia’s predictions achieve a 30.1% character error rate, compared to the 57.3% of human epigraphists. Moreover, in 73.5% of cases the ground-truth sequence was among the Top-20 hypotheses of Pythia, which effectively demonstrates the impact of this assistive method on the field of digital epigraphy, and sets the state-of-the-art in ancient text restoration. |
Tasks | Ancient Text Restoration |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06262v1 |
https://arxiv.org/pdf/1910.06262v1.pdf | |
PWC | https://paperswithcode.com/paper/restoring-ancient-text-using-deep-learning-a |
Repo | https://github.com/sommerschield/ancient-text-restoration |
Framework | none |
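The abstract above reports results as a character error rate (CER). As a rough illustration of that metric, the sketch below computes CER as the Levenshtein edit distance between a predicted and a reference string, divided by the reference length. This is the common textbook definition and an assumption here; the paper's exact evaluation protocol may differ. The function names and example strings are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(prediction, reference):
    """Edit distance normalised by the length of the reference string."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(prediction, reference) / len(reference)


# Hypothetical restoration: one wrong character in an eight-character gap.
reference = "restored"
prediction = "restorad"
print(character_error_rate(prediction, reference))
```

A lower value is better; a CER of 0 means the prediction matches the reference exactly.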
Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation
Title | Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation |
Authors | Jia Li, Wen Su, Zengfu Wang |
Abstract | We rethink a well-known bottom-up approach for multi-person pose estimation and propose an improved one. The improved approach surpasses the baseline significantly thanks to (1) an intuitive yet more sensible representation, which we refer to as body parts, to encode the connection information between keypoints, (2) an improved stacked hourglass network with attention mechanisms, (3) a novel focal L2 loss which is dedicated to hard keypoint and keypoint association (body part) mining, and (4) a robust greedy keypoint assignment algorithm for grouping the detected keypoints into individual poses. Our approach not only works straightforwardly but also outperforms the baseline by about 15% in average precision and is comparable to the state of the art on the MS-COCO test-dev dataset. The code and pre-trained models are publicly available online. |
Tasks | Multi-Person Pose Estimation, Pose Estimation |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10529v1 |
https://arxiv.org/pdf/1911.10529v1.pdf | |
PWC | https://paperswithcode.com/paper/simple-pose-rethinking-and-improving-a-bottom |
Repo | https://github.com/osmr/imgclsmob |
Framework | mxnet |
Deep High-Resolution Representation Learning for Human Pose Estimation
Title | Deep High-Resolution Representation Learning for Human Pose Estimation |
Authors | Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang |
Abstract | This is an official PyTorch implementation of Deep High-Resolution Representation Learning for Human Pose Estimation. In this work, we are interested in the human pose estimation problem with a focus on learning reliable high-resolution representations. Most existing methods recover high-resolution representations from low-resolution representations produced by a high-to-low resolution network. Instead, our proposed network maintains high-resolution representations through the whole process. We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the multi-resolution subnetworks in parallel. We conduct repeated multi-scale fusions such that each of the high-to-low resolution representations receives information from other parallel representations over and over, leading to rich high-resolution representations. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise. We empirically demonstrate the effectiveness of our network through the superior pose estimation results over two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset. The code and models are publicly available at https://github.com/leoxiaobin/deep-high-resolution-net.pytorch. |
Tasks | Instance Segmentation, Keypoint Detection, Multi-Person Pose Estimation, Object Detection, Pose Estimation, Pose Tracking, Representation Learning |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09212v1 |
http://arxiv.org/pdf/1902.09212v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-high-resolution-representation-learning |
Repo | https://github.com/HRNet/HRNet-Facial-Landmark-Detection |
Framework | pytorch |
Joint Effects of Context and User History for Predicting Online Conversation Re-entries
Title | Joint Effects of Context and User History for Predicting Online Conversation Re-entries |
Authors | Xingshan Zeng, Jing Li, Lu Wang, Kam-Fai Wong |
Abstract | As the online world continues its exponential growth, interpersonal communication has come to play an increasingly central role in opinion formation and change. In order to help users better engage with each other online, we study the challenging problem of re-entry prediction: foreseeing whether a user will come back to a conversation they once participated in. We hypothesize that both the context of the ongoing conversations and the users’ previous chatting history will affect their continued interest in future engagement. Specifically, we propose a neural framework with three main layers, modeling context, user history, and the interactions between them respectively, to explore how the conversation context and user chatting history jointly result in their re-entry behavior. We experiment with two large-scale datasets collected from Twitter and Reddit. Results show that our proposed framework with bi-attention achieves an F1 score of 61.1 on Twitter conversations, outperforming the state-of-the-art methods from previous work. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01185v1 |
https://arxiv.org/pdf/1906.01185v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-effects-of-context-and-user-history-for |
Repo | https://github.com/zxshamson/re-entry-prediction |
Framework | pytorch |
Sparse Optimization on Measures with Over-parameterized Gradient Descent
Title | Sparse Optimization on Measures with Over-parameterized Gradient Descent |
Authors | Lenaic Chizat |
Abstract | Minimizing a convex function of a measure with a sparsity-inducing penalty is a typical problem arising, e.g., in sparse spikes deconvolution or two-layer neural networks training. We show that this problem can be solved by discretizing the measure and running non-convex gradient descent on the positions and weights of the particles. For measures on a $d$-dimensional manifold and under some non-degeneracy assumptions, this leads to a global optimization algorithm with a complexity scaling as $\log(1/\epsilon)$ in the desired accuracy $\epsilon$, instead of $\epsilon^{-d}$ for convex methods. The key theoretical tools are a local convergence analysis in Wasserstein space and an analysis of a perturbed mirror descent in the space of measures. Our bounds involve quantities that are exponential in $d$ which is unavoidable under our assumptions. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10300v1 |
https://arxiv.org/pdf/1907.10300v1.pdf | |
PWC | https://paperswithcode.com/paper/sparse-optimization-on-measures-with-over |
Repo | https://github.com/lchizat/2019-sparse-optim-measures |
Framework | none |
Pose Neural Fabrics Search
Title | Pose Neural Fabrics Search |
Authors | Sen Yang, Wankou Yang, Zhen Cui |
Abstract | Neural Architecture Search (NAS) technologies have been successfully applied to find efficient neural architectures for tasks such as image classification and semantic segmentation. However, existing works implement NAS for target tasks independently of domain knowledge and focus only on searching for an architecture to replace the human-designed network in a common pipeline. Can we exploit human prior knowledge to guide NAS? To address this question, we propose a framework, named Pose Neural Fabrics Search (PNFS), introducing prior knowledge of body structure into NAS for human pose estimation. We establish a new neural architecture search space, by parameterizing a Cell-based Neural Fabric, to learn micro as well as macro neural architecture using a differentiable search strategy. To take advantage of part-based structural knowledge of the human body and the learning capability of NAS, global pose constraint relationships are modeled as multiple part representations, each of which is predicted by a personalized Cell-based Neural Fabric. In the part representation, we view human skeleton keypoints as entities by representing them as vectors at image locations, expecting this to capture each keypoint’s feature in a relaxed vector space. The experiments on the MPII and MS-COCO datasets demonstrate that PNFS (code is available at https://github.com/yangsenius/PoseNFS) can achieve comparable performance to state-of-the-art methods, with fewer parameters and lower computational complexity. |
Tasks | Image Classification, Keypoint Detection, Neural Architecture Search, Pose Estimation, Semantic Segmentation |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07068v3 |
https://arxiv.org/pdf/1909.07068v3.pdf | |
PWC | https://paperswithcode.com/paper/pose-neural-fabrics-search |
Repo | https://github.com/yangsenius/Pose-Neural-Fabrics-Search |
Framework | pytorch |
Exact Rate-Distortion in Autoencoders via Echo Noise
Title | Exact Rate-Distortion in Autoencoders via Echo Noise |
Authors | Rob Brekelmans, Daniel Moyer, Aram Galstyan, Greg Ver Steeg |
Abstract | Compression is at the heart of effective representation learning. However, lossy compression is typically achieved through simple parametric models like Gaussian noise to preserve analytic tractability, and the limitations this imposes on learning are largely unexplored. Further, the Gaussian prior assumptions in models such as variational autoencoders (VAEs) provide only an upper bound on the compression rate in general. We introduce a new noise channel, Echo noise, that admits a simple, exact expression for mutual information for arbitrary input distributions. The noise is constructed in a data-driven fashion that does not require restrictive distributional assumptions. With its complex encoding mechanism and exact rate regularization, Echo leads to improved bounds on log-likelihood and dominates $\beta$-VAEs across the achievable range of rate-distortion trade-offs. Further, we show that Echo noise can outperform flow-based methods without the need to train additional distributional transformations. |
Tasks | Representation Learning |
Published | 2019-04-15 |
URL | https://arxiv.org/abs/1904.07199v3 |
https://arxiv.org/pdf/1904.07199v3.pdf | |
PWC | https://paperswithcode.com/paper/exact-rate-distortion-in-autoencoders-via |
Repo | https://github.com/brekelma/echo |
Framework | tf |
Rethinking Self-Attention: An Interpretable Self-Attentive Encoder-Decoder Parser
Title | Rethinking Self-Attention: An Interpretable Self-Attentive Encoder-Decoder Parser |
Authors | Khalil Mrini, Franck Dernoncourt, Trung Bui, Walter Chang, Ndapa Nakashole |
Abstract | Attention mechanisms have improved the performance of NLP tasks while providing an appearance of model interpretability. Self-attention is currently widely used in NLP models; however, it is difficult to interpret due to the numerous attention distributions. We hypothesize that model representations can benefit from label-specific information, while facilitating interpretation of predictions. We introduce the Label Attention Layer: a new form of self-attention where attention heads represent labels. We validate our hypothesis by running experiments in constituency and dependency parsing and show that our new model obtains new state-of-the-art results for both tasks on the English Penn Treebank. Our neural parser obtains 96.34 F1 score for constituency parsing, and 97.33 UAS and 96.29 LAS for dependency parsing. Additionally, our model requires fewer layers, and therefore fewer parameters, compared to existing work. |
Tasks | Constituency Parsing, Dependency Parsing |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03875v1 |
https://arxiv.org/pdf/1911.03875v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-self-attention-an-interpretable |
Repo | https://github.com/KhalilMrini/LAL-Parser |
Framework | none |
Short Text Topic Modeling Techniques, Applications, and Performance: A Survey
Title | Short Text Topic Modeling Techniques, Applications, and Performance: A Survey |
Authors | Qiang Jipeng, Qian Zhenyu, Li Yun, Yuan Yunhao, Wu Xindong |
Abstract | Inferring discriminative and coherent latent topics from short texts is a critical and fundamental task, since many real-world applications require semantic understanding of short texts. Traditional long text topic modeling algorithms (e.g., PLSA and LDA) based on word co-occurrences cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. Therefore, short text topic modeling, which aims at overcoming the problem of sparseness in short texts, has attracted much attention from the machine learning research community in recent years. In this survey, we conduct a comprehensive review of various short text topic modeling techniques proposed in the literature. We present three categories of methods based on Dirichlet multinomial mixture, global word co-occurrences, and self-aggregation, with examples of representative approaches in each category and analysis of their performance on various tasks. We develop the first comprehensive open-source library, called STTM, for use in Java, which integrates all surveyed algorithms within a unified interface, together with benchmark datasets, to facilitate the development of new methods in this research field. Finally, we evaluate these state-of-the-art methods on many real-world datasets and compare their performance against one another and against long text topic modeling algorithms. |
Tasks | |
Published | 2019-04-13 |
URL | http://arxiv.org/abs/1904.07695v1 |
http://arxiv.org/pdf/1904.07695v1.pdf | |
PWC | https://paperswithcode.com/paper/short-text-topic-modeling-techniques |
Repo | https://github.com/qiang2100/STTM |
Framework | none |
OpenML-Python: an extensible Python API for OpenML
Title | OpenML-Python: an extensible Python API for OpenML |
Authors | Matthias Feurer, Jan N. van Rijn, Arlind Kadra, Pieter Gijsbers, Neeratyoy Mallik, Sahithya Ravi, Andreas Müller, Joaquin Vanschoren, Frank Hutter |
Abstract | OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper we introduce OpenML-Python, a client API for Python, opening up the OpenML platform for a wide range of Python-based tools. It provides easy access to all datasets, tasks and experiments on OpenML from within Python. It also provides functionality to conduct machine learning experiments, upload the results to OpenML, and reproduce results which are stored on OpenML. Furthermore, it comes with a scikit-learn plugin and a plugin mechanism to easily integrate other machine learning libraries written in Python into the OpenML ecosystem. Source code and documentation are available at https://github.com/openml/openml-python/. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02490v1 |
https://arxiv.org/pdf/1911.02490v1.pdf | |
PWC | https://paperswithcode.com/paper/openml-python-an-extensible-python-api-for |
Repo | https://github.com/openml/openml-python |
Framework | none |
Global Optimal Path-Based Clustering Algorithm
Title | Global Optimal Path-Based Clustering Algorithm |
Authors | Qidong Liu, Ruisheng Zhang |
Abstract | Combinatorial optimization problems for clustering are known to be NP-hard. Most optimization methods are not able to find the global optimum solution for all datasets. To solve this problem, we propose a global optimal path-based clustering (GOPC) algorithm in this paper. The GOPC algorithm is based on two facts: (1) medoids have the minimum degree in their clusters; (2) the minimax distance between two objects in one cluster is smaller than the minimax distance between objects in different clusters. Extensive experiments are conducted on synthetic and real-world datasets to evaluate the performance of the GOPC algorithm. The results on synthetic datasets show that the GOPC algorithm can recognize all kinds of clusters regardless of their shapes, sizes, or densities. Experimental results on real-world datasets demonstrate the effectiveness and efficiency of the GOPC algorithm. In addition, the GOPC algorithm needs only one parameter, i.e., the number of clusters, which can be estimated by the decision graph. The advantages mentioned above make GOPC a good candidate as a general clustering algorithm. Codes are available at https://github.com/Qidong-Liu/Clustering. |
Tasks | Combinatorial Optimization |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07774v1 |
https://arxiv.org/pdf/1909.07774v1.pdf | |
PWC | https://paperswithcode.com/paper/global-optimal-path-based-clustering |
Repo | https://github.com/Qidong-Liu/Clustering |
Framework | none |
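Fact (2) in the abstract above relies on the minimax distance between two objects: over all paths connecting them, the smallest possible value of the largest single hop. The sketch below computes all-pairs minimax distances with a Floyd-Warshall-style update. It illustrates only this distance and is not the GOPC algorithm itself. The Euclidean base metric, the function name, and the toy points are assumptions.

```python
import math


def minimax_distances(points):
    """All-pairs minimax path distances on the complete Euclidean graph."""
    n = len(points)
    d = [[math.dist(p, q) for q in points] for p in points]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # A path through k costs the larger of its two legs.
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


# Two hypothetical groups on a line: {0, 1, 2} and {10, 11}.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
mm = minimax_distances(points)

print(mm[0][2])  # within the first group
print(mm[3][4])  # within the second group
print(mm[0][4])  # across the two groups
```

Within each group the minimax distance is set by the largest unit-length hop, whereas any path between the groups must cross the gap between 2 and 10, which matches the intuition behind fact (2). The triple loop is cubic in the number of points, so this form suits small examples only.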