January 25, 2020

Paper Group NANR 18

Efficient and Accurate Face Alignment by Global Regression and Cascaded Local Refinement

Title Efficient and Accurate Face Alignment by Global Regression and Cascaded Local Refinement
Authors Jinzhan Su, Zhe Wang, Chunyuan Liao, Haibin Ling
Abstract Despite great advances in facial image alignment in recent years, face alignment algorithms that are both highly accurate and fast still have room to improve, especially for applications where computation resources are limited. Addressing this issue, we propose a new face landmark localization algorithm that combines global regression and local refinement. In particular, for a given image, our algorithm first estimates its global facial shape through a global regression network (GRegNet) and then uses cascaded local refinement networks (LRefNet) to sequentially improve the alignment result. Compared with previous face alignment algorithms, our key innovation is the sharing of low-level features in GRegNet with LRefNet. Such feature sharing not only significantly improves the algorithm's efficiency, but also allows full exploration of the rich locality-sensitive details carried by shallow network layers and consequently boosts the localization accuracy. The advantages of our algorithm are clearly validated in our thorough experiments on four popular face alignment benchmarks: 300-W, AFLW, COFW and WFLW. On all datasets, our algorithm produces state-of-the-art alignment accuracy while enjoying the smallest computational complexity.
Tasks Face Alignment
Published 2019-06-16
URL http://openaccess.thecvf.com/content_CVPRW_2019/html/AMFG/Su_Efficient_and_Accurate_Face_Alignment_by_Global_Regression_and_Cascaded_CVPRW_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPRW_2019/papers/AMFG/Su_Efficient_and_Accurate_Face_Alignment_by_Global_Regression_and_Cascaded_CVPRW_2019_paper.pdf
PWC https://paperswithcode.com/paper/efficient-and-accurate-face-alignment-by
Repo
Framework

Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations

Title Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations
Authors Marco Antonio Sobrevilla Cabezudo, Simon Mille, Thiago Pardo
Abstract This paper presents an exploratory study that aims to evaluate the usefulness of back-translation in Natural Language Generation (NLG) from semantic representations for non-English languages. Specifically, Abstract Meaning Representation and Brazilian Portuguese (BP) are chosen as semantic representation and language, respectively. Two methods (focused on Statistical and Neural Machine Translation) are evaluated on two datasets (one automatically generated and another one human-generated) to compare the performance in a real context. Also, several cuts according to quality measures are performed to evaluate the importance (or not) of the data quality in NLG. Results show that there are still many improvements to be made but this is a promising approach.
Tasks Machine Translation, Text Generation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6313/
PDF https://www.aclweb.org/anthology/D19-6313
PWC https://paperswithcode.com/paper/back-translation-as-strategy-to-tackle-the
Repo
Framework

Learning Shape-Motion Representations from Geometric Algebra Spatio-Temporal Model for Skeleton-Based Action Recognition

Title Learning Shape-Motion Representations from Geometric Algebra Spatio-Temporal Model for Skeleton-Based Action Recognition
Authors Yanshan Li, Rongjie Xia, Xing Liu, Qinghua Huang
Abstract Skeleton-based action recognition has been widely applied in intelligent video surveillance and human behavior analysis. Previous works have successfully applied Convolutional Neural Networks (CNN) to learn spatio-temporal characteristics of the skeleton sequence. However, they focus merely on the coordinates of isolated joints, which ignores the spatial relationships between joints and learns the motion representations only implicitly. To solve these problems, we propose an effective method to learn comprehensive representations from skeleton sequences by using Geometric Algebra. Firstly, a frontal orientation based spatio-temporal model is constructed to represent the spatial configuration and temporal dynamics of skeleton sequences, which is robust against view variations. Then the shape-motion representations, which mutually compensate, are learned to describe skeleton actions comprehensively. Finally, a multi-stream CNN model is applied to extract and fuse deep features from the complementary shape-motion representations. Experimental results on the NTU RGB+D and Northwestern-UCLA datasets consistently verify the superiority of our method.
Tasks Skeleton Based Action Recognition
Published 2019-07-08
URL https://doi.org/10.1109/ICME.2019.00187
PDF https://www.researchgate.net/publication/334997886_Learning_Shape-Motion_Representations_from_Geometric_Algebra_Spatio-Temporal_Model_for_Skeleton-Based_Action_Recognition
PWC https://paperswithcode.com/paper/learning-shape-motion-representations-from
Repo
Framework

Integer Networks for Data Compression with Latent-Variable Models

Title Integer Networks for Data Compression with Latent-Variable Models
Authors Johannes Ballé, Nick Johnston, David Minnen
Abstract We consider the problem of using variational latent-variable models for data compression. For such models to produce a compressed binary sequence, which is the universal data representation in a digital world, the latent representation needs to be subjected to entropy coding. Range coding as an entropy coding technique is optimal, but it can fail catastrophically if the computation of the prior differs even slightly between the sending and the receiving side. Unfortunately, this is a common scenario when floating point math is used and the sender and receiver operate on different hardware or software platforms, as numerical round-off is often platform dependent. We propose using integer networks as a universal solution to this problem, and demonstrate that they enable reliable cross-platform encoding and decoding of images using variational models.
Tasks Latent Variable Models
Published 2019-05-01
URL https://openreview.net/forum?id=S1zz2i0cY7
PDF https://openreview.net/pdf?id=S1zz2i0cY7
PWC https://paperswithcode.com/paper/integer-networks-for-data-compression-with
Repo
Framework
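
The cross-platform problem described in the Integer Networks abstract above comes down to floating-point arithmetic depending on evaluation order, while integer arithmetic does not. The sketch below illustrates only that point, using a toy fixed-point dense layer. The names `SCALE`, `quantize` and `int_dense` are assumptions made for this illustration and are not taken from the paper, whose integer networks involve more than this.

```python
# Floating-point addition is not associative, so two platforms that
# accumulate in a different order can disagree in the last bits.
float_a = (0.1 + 0.2) + 0.3
float_b = 0.1 + (0.2 + 0.3)

SCALE = 1 << 16  # hypothetical fixed-point scale, chosen for illustration


def quantize(x: float) -> int:
    """Map a float to a fixed-point integer (round to nearest)."""
    return round(x * SCALE)


def int_dense(inputs: list[int], weights: list[list[int]]) -> list[int]:
    """Integer-only dense layer: exact accumulation, then rescale by floor division."""
    return [sum(w * x for w, x in zip(row, inputs)) // SCALE for row in weights]


x = [quantize(v) for v in (0.1, 0.2, 0.3)]
w = [[quantize(v) for v in row] for row in ((0.5, -0.25, 1.0), (1.5, 0.75, -0.5))]

forward = int_dense(x, w)
# Same layer with every accumulation order reversed: the integers still agree.
backward = int_dense(x[::-1], [row[::-1] for row in w])

print(float_a == float_b)   # the two float sums differ
print(forward == backward)  # the two integer results are identical
```

Because the integer outputs are bit-identical regardless of accumulation order, a sender and a receiver would derive the same prior from them, which is the property range coding needs.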

Bayesian Graph Convolution LSTM for Skeleton Based Action Recognition

Title Bayesian Graph Convolution LSTM for Skeleton Based Action Recognition
Authors Rui Zhao, Kang Wang, Hui Su, Qiang Ji
Abstract We propose a framework for recognizing human actions from skeleton data by modeling the underlying dynamic process that generates the motion pattern. We capture three major factors that contribute to the complexity of the motion pattern including spatial dependencies among body joints, temporal dependencies of body poses, and variation among subjects in action execution. We utilize graph convolution to extract structure-aware feature representation from pose data by exploiting the skeleton anatomy. Long short-term memory (LSTM) network is then used to capture the temporal dynamics of the data. Finally, the whole model is extended under the Bayesian framework to a probabilistic model in order to better capture the stochasticity and variation in the data. An adversarial prior is developed to regularize the model parameters to improve the generalization of the model. A Bayesian inference problem is formulated to solve the classification task. We demonstrate the benefit of this framework in several benchmark datasets with recognition under various generalization conditions.
Tasks Bayesian Inference, Skeleton Based Action Recognition
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Zhao_Bayesian_Graph_Convolution_LSTM_for_Skeleton_Based_Action_Recognition_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Zhao_Bayesian_Graph_Convolution_LSTM_for_Skeleton_Based_Action_Recognition_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/bayesian-graph-convolution-lstm-for-skeleton
Repo
Framework
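
The graph convolution step mentioned in the Bayesian Graph Convolution LSTM abstract above can be pictured as each joint mixing its own feature with those of the joints it is connected to. The following is a minimal sketch of that idea only, on a made-up five-joint skeleton with one scalar feature per joint. The skeleton layout, `BONES`, and the plain averaging rule are assumptions for illustration; the LSTM and the Bayesian extension are not shown.

```python
# Hypothetical 5-joint skeleton; the indices and bones are chosen for illustration.
BONES = [(0, 1), (1, 2), (1, 3), (1, 4)]
NUM_JOINTS = 5


def neighbourhoods(bones, num_joints):
    """For each joint, the set containing itself and the joints it shares a bone with."""
    hoods = [{j} for j in range(num_joints)]
    for a, b in bones:
        hoods[a].add(b)
        hoods[b].add(a)
    return hoods


def graph_conv(features, weight=1.0):
    """One scalar-feature graph convolution: average over the neighbourhood, then scale."""
    hoods = neighbourhoods(BONES, NUM_JOINTS)
    return [weight * sum(features[k] for k in hood) / len(hood) for hood in hoods]


pose = [1.0, 2.0, 3.0, 4.0, 5.0]  # one scalar per joint, e.g. a single coordinate
smoothed = graph_conv(pose)
print(smoothed)
```

In a sequence model, the output of such a layer for each frame would be fed to the recurrent network in place of the raw joint coordinates.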

A Turkish Dataset for Gender Identification of Twitter Users

Title A Turkish Dataset for Gender Identification of Twitter Users
Authors Erhan Sezerer, Ozan Polatbilek, Selma Tekir
Abstract Author profiling is the identification of an author's gender, age, and language from his/her texts. With the increasing trend of using Twitter as a means to express thought, profiling the gender of an author from his/her tweets has become a challenge. Although several datasets in different languages have been released on this problem, there is still a need for multilingualism. In this work, we propose a dataset of tweets of Turkish Twitter users which are labeled with their gender information. The dataset has 3368 users in the training set and 1924 users in the test set, where each user has 100 tweets. The dataset is publicly available.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4023/
PDF https://www.aclweb.org/anthology/W19-4023
PWC https://paperswithcode.com/paper/a-turkish-dataset-for-gender-identification
Repo
Framework

Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology

Title Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology
Authors
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4200/
PDF https://www.aclweb.org/anthology/W19-4200
PWC https://paperswithcode.com/paper/proceedings-of-the-16th-workshop-on
Repo
Framework

A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation

Title A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
Authors Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma, Udo Kruschwitz
Abstract We present a corpus of anaphoric information (coreference) crowdsourced through a game-with-a-purpose. The corpus, containing annotations for about 108,000 markables, is one of the largest corpora for coreference for English, and one of the largest crowdsourced NLP corpora, but its main feature is the large number of judgments per markable: 20 on average, and over 2.2M in total. This characteristic makes the corpus a unique resource for the study of disagreements on anaphoric interpretation. A second distinctive feature is its rich annotation scheme, covering singletons, expletives, and split-antecedent plurals. Finally, the corpus also comes with labels inferred using a recently proposed probabilistic model of annotation for coreference. The labels are of high quality and make it possible to successfully train a state of the art coreference resolver, including training on singletons and non-referring expressions. The annotation model can also result in more than one label, or no label, being proposed for a markable, thus serving as a baseline method for automatically identifying ambiguous markables. A preliminary analysis of the results is presented.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1176/
PDF https://www.aclweb.org/anthology/N19-1176
PWC https://paperswithcode.com/paper/a-crowdsourced-corpus-of-multiple-judgments
Repo
Framework

Convolutional Approximations to the General Non-Line-of-Sight Imaging Operator

Title Convolutional Approximations to the General Non-Line-of-Sight Imaging Operator
Authors Byeongjoo Ahn, Akshat Dave, Ashok Veeraraghavan, Ioannis Gkioulekas, Aswin C. Sankaranarayanan
Abstract Non-line-of-sight (NLOS) imaging aims to reconstruct scenes outside the field of view of an imaging system. A common approach is to measure the so-called light transients, which facilitates reconstructions through ellipsoidal tomography that involves solving a linear least-squares problem. Unfortunately, the corresponding linear operator is very high-dimensional and lacks structures that facilitate fast solvers, and so the ensuing optimization is a computationally daunting task. We introduce a computationally tractable framework for solving the ellipsoidal tomography problem. Our main observation is that the Gram of the ellipsoidal tomography operator is convolutional, either exactly under certain idealized imaging conditions, or approximately in practice. This, in turn, allows us to obtain the ellipsoidal tomography solution by using efficient deconvolution procedures to solve a linear least-squares problem involving the Gram operator. The computational tractability of our approach also facilitates the use of various regularizers during the deconvolution procedure. We demonstrate the advantages of our framework in a variety of simulated and real experiments.
Tasks
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Ahn_Convolutional_Approximations_to_the_General_Non-Line-of-Sight_Imaging_Operator_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Ahn_Convolutional_Approximations_to_the_General_Non-Line-of-Sight_Imaging_Operator_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/convolutional-approximations-to-the-general
Repo
Framework
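
The reasoning in the non-line-of-sight abstract above (least squares, then a convolutional Gram operator, then deconvolution) can be written out as a short derivation. The symbols are assumptions introduced here, not the paper's notation: $A$ is the tomography operator, $b$ the measured transients, $x$ the scene, $k$ the kernel of the Gram operator, $\mathcal{F}$ the Fourier transform, and $\lambda \ge 0$ a generic Tikhonov-style weight standing in for "various regularizers".

```latex
\begin{align}
\hat{x} &= \arg\min_x \; \|Ax - b\|_2^2
  && \text{(least-squares reconstruction)} \\
A^\top A \, \hat{x} &= A^\top b
  && \text{(normal equations)} \\
k * \hat{x} &= A^\top b
  && \text{(if the Gram operator } A^\top A \text{ is a convolution with } k\text{)} \\
\hat{x} &= \mathcal{F}^{-1}\!\left[
  \frac{\overline{\mathcal{F}(k)} \; \mathcal{F}(A^\top b)}
       {|\mathcal{F}(k)|^2 + \lambda}
\right]
  && \text{(regularized deconvolution)}
\end{align}
```

With $\lambda = 0$ the last line reduces to dividing by $\mathcal{F}(k)$; a positive $\lambda$ keeps the division stable at frequencies where $\mathcal{F}(k)$ is close to zero.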

WSOD2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection

Title WSOD2: Learning Bottom-up and Top-down Objectness Distillation for Weakly-supervised Object Detection
Authors Zhaoyang Zeng, Bei Liu, Jianlong Fu, Hongyang Chao, Lei Zhang
Abstract We study weakly-supervised object detection (WSOD), which plays a vital role in relieving human involvement from object-level annotations. Predominant works integrate region proposal mechanisms with convolutional neural networks (CNN). Although CNN is proficient in extracting discriminative local features, grand challenges still exist to measure the likelihood of a bounding box containing a complete object (i.e., “objectness”). In this paper, we propose a novel WSOD framework with Objectness Distillation (i.e., WSOD2) by designing a tailored training mechanism for weakly-supervised object detection. Multiple regression targets are specifically determined by jointly considering bottom-up (BU) and top-down (TD) objectness from low-level measurement and CNN confidences with an adaptive linear combination. As bounding box regression can facilitate a region proposal learning to approach its regression target with high objectness during training, deep objectness representation learned from bottom-up evidences can be gradually distilled into CNN by optimization. We explore different adaptive training curves for BU/TD objectness, and show that the proposed WSOD2 can achieve state-of-the-art results.
Tasks Object Detection, Weakly Supervised Object Detection
Published 2019-09-11
URL https://arxiv.org/abs/1909.04972
PDF https://arxiv.org/pdf/1909.04972.pdf
PWC https://paperswithcode.com/paper/wsod2-learning-bottom-up-and-top-down
Repo
Framework
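
The "adaptive linear combination" of bottom-up and top-down objectness in the WSOD2 abstract above can be sketched as a weighted sum whose weight changes during training. In the sketch below, the linear decay schedule, the direction of the decay (bottom-up first, CNN confidence later), and all function names are assumptions for illustration; the abstract only says that several adaptive curves were explored.

```python
def combined_objectness(bu: float, td: float, alpha: float) -> float:
    """Linear combination of bottom-up (bu) and top-down (td) objectness."""
    return alpha * bu + (1.0 - alpha) * td


def linear_decay(step: int, total_steps: int) -> float:
    """Hypothetical schedule: weight on bottom-up evidence falls from 1 to 0."""
    return max(0.0, 1.0 - step / total_steps)


def pick_regression_target(proposals, alpha):
    """Choose the proposal with the highest combined objectness as the target."""
    return max(proposals, key=lambda p: combined_objectness(p[1], p[2], alpha))[0]


# (name, bottom-up objectness, top-down CNN confidence); made-up numbers.
proposals = [("box_a", 0.9, 0.2), ("box_b", 0.3, 0.8)]

early = pick_regression_target(proposals, linear_decay(0, 100))
late = pick_regression_target(proposals, linear_decay(100, 100))
print(early, late)
```

Early in training the target follows the low-level measurement, and late in training it follows the network's own confidence, which is one way to read "gradually distilled into CNN".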

NATTACK: A STRONG AND UNIVERSAL GAUSSIAN BLACK-BOX ADVERSARIAL ATTACK

Title NATTACK: A STRONG AND UNIVERSAL GAUSSIAN BLACK-BOX ADVERSARIAL ATTACK
Authors Yandong Li, Lijun Li, Liqiang Wang, Tong Zhang, Boqing Gong
Abstract Recent works find that DNNs are vulnerable to adversarial examples, whose changes from the benign ones are imperceptible and yet lead DNNs to make wrong predictions. One can find various adversarial examples for the same input to a DNN using different attack methods. In other words, there is a population of adversarial examples, instead of only one, for any input to a DNN. By explicitly modeling this adversarial population with a Gaussian distribution, we propose a new black-box attack called NATTACK. The adversarial attack is hence formalized as an optimization problem, which searches the mean of the Gaussian under the guidance of increasing the target DNN’s prediction error. NATTACK achieves 100% attack success rate on six out of eleven recently published defense methods (and greater than 90% for four), all using the same algorithm. Such results are on par with or better than powerful state-of-the-art white-box attacks. While the white-box attacks are often model-specific or defense-specific, the proposed black-box NATTACK is universally applicable to different defenses.
Tasks Adversarial Attack
Published 2019-05-01
URL https://openreview.net/forum?id=ryeoxnRqKQ
PDF https://openreview.net/pdf?id=ryeoxnRqKQ
PWC https://paperswithcode.com/paper/nattack-a-strong-and-universal-gaussian-black
Repo
Framework
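
The NATTACK abstract above describes searching for the mean of a Gaussian over perturbations using only the target model's outputs. The sketch below shows that kind of search on a toy problem with an evolution-strategies style update. It is not the authors' implementation: the linear `margin` function standing in for the black-box model, the hyperparameters, and the L-infinity clipping are all assumptions chosen for illustration.

```python
import random
import statistics

W, B = [1.0, -2.0, 0.5], 0.1  # hypothetical stand-in for the target model


def margin(x):
    """Black-box query: a positive margin means the input is still classified correctly."""
    return sum(w * xi for w, xi in zip(W, x)) + B


def clip(delta, eps):
    """Project a perturbation onto the L-infinity ball of radius eps."""
    return [max(-eps, min(eps, d)) for d in delta]


def gaussian_search(x, eps=0.5, sigma=0.1, lr=0.02, pop=200, steps=50, seed=0):
    """Move the mean of a Gaussian over perturbations so that the margin drops."""
    rng = random.Random(seed)
    mu = [0.0] * len(x)
    for _ in range(steps):
        noises = [[rng.gauss(0.0, 1.0) for _ in x] for _ in range(pop)]
        losses = []
        for noise in noises:
            delta = clip([m + sigma * n for m, n in zip(mu, noise)], eps)
            losses.append(margin([xi + d for xi, d in zip(x, delta)]))
        mean = statistics.fmean(losses)
        std = statistics.pstdev(losses) or 1.0  # avoid dividing by zero
        scores = [(loss - mean) / std for loss in losses]
        for j in range(len(mu)):
            # Score-function estimate of d(expected margin)/d(mu_j), from queries only.
            grad = sum(s * noise[j] for s, noise in zip(scores, noises)) / (pop * sigma)
            mu[j] -= lr * grad  # descend: a lower margin means a larger prediction error
    return clip(mu, eps)


x = [1.0, 0.2, 0.4]
delta = gaussian_search(x)
adversarial = [xi + d for xi, d in zip(x, delta)]
print(margin(x), margin(adversarial))
```

The search never looks at `W` or `B` directly; it only calls `margin`, which is what makes the setting black-box.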

iADAATPA Project: Pangeanic use cases

Title iADAATPA Project: Pangeanic use cases
Authors Mercedes García-Martínez, Amando Estela, Laurent Bié, Alexandre Helle, Manuel Herranz
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-6719/
PDF https://www.aclweb.org/anthology/W19-6719
PWC https://paperswithcode.com/paper/iadaatpa-project-pangeanic-use-cases
Repo
Framework

Modeling Five Sentence Quality Representations by Finding Latent Spaces Produced with Deep Long Short-Memory Models

Title Modeling Five Sentence Quality Representations by Finding Latent Spaces Produced with Deep Long Short-Memory Models
Authors Pablo Rivas
Abstract We present a study in which we train neural models that approximate rules that assess the quality of English sentences. We modeled five rules using deep LSTMs trained over a dataset of sentences whose quality is evaluated under such rules. Preliminary results suggest the neural architecture can model such rules to high accuracy.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/papers/W/W19/W19-3610/
PDF https://www.aclweb.org/anthology/W19-3610
PWC https://paperswithcode.com/paper/modeling-five-sentence-quality
Repo
Framework

Towards Consistent Performance on Atari using Expert Demonstrations

Title Towards Consistent Performance on Atari using Expert Demonstrations
Authors Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin
Abstract Despite significant advances in the field of deep Reinforcement Learning (RL), today’s algorithms still fail to learn human-level policies consistently over a set of diverse tasks such as Atari 2600 games. We identify three key challenges that any algorithm needs to master in order to perform well on all games: processing diverse reward distributions, reasoning over long time horizons, and exploring efficiently. In this paper, we propose an algorithm that addresses each of these challenges and is able to learn human-level policies on nearly all Atari games. A new transformed Bellman operator allows our algorithm to process rewards of varying densities and scales; an auxiliary temporal consistency loss allows us to train stably using a discount factor of 0.999 (instead of 0.99) extending the effective planning horizon by an order of magnitude; and we ease the exploration problem by using human demonstrations that guide the agent towards rewarding states. When tested on a set of 42 Atari games, our algorithm exceeds the performance of an average human on 40 games using a common set of hyper parameters.
Tasks Atari Games
Published 2019-05-01
URL https://openreview.net/forum?id=BkfPnoActQ
PDF https://openreview.net/pdf?id=BkfPnoActQ
PWC https://paperswithcode.com/paper/towards-consistent-performance-on-atari-using
Repo
Framework
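
Two claims in the Atari abstract above lend themselves to a quick numerical check: that moving the discount factor from 0.99 to 0.999 lengthens the effective horizon by an order of magnitude, and that a transformed Bellman operator can squash rewards of very different scales. The sketch below uses the common rule of thumb that the horizon is 1 / (1 - gamma). The specific squashing function `h`, its inverse, and the value of `EPS` are assumptions not stated in the abstract and are included only to show what such a transform can look like.

```python
import math

EPS = 1e-2  # assumed value; not stated in the abstract


def effective_horizon(gamma: float) -> float:
    """Rule-of-thumb planning horizon of a discount factor: 1 / (1 - gamma)."""
    return 1.0 / (1.0 - gamma)


def h(z: float) -> float:
    """Assumed squashing function: compresses large values, nearly linear near 0."""
    return math.copysign(math.sqrt(abs(z) + 1.0) - 1.0, z) + EPS * z


def h_inv(x: float) -> float:
    """Closed-form inverse of h, obtained by solving a quadratic in sqrt(|z| + 1)."""
    y = (math.sqrt(1.0 + 4.0 * EPS * (abs(x) + 1.0 + EPS)) - 1.0) / (2.0 * EPS)
    return math.copysign(y * y - 1.0, x)


def transformed_target(reward: float, gamma: float, next_q: float) -> float:
    """One-step target computed in squashed space: h(r + gamma * h_inv(Q(s', a')))."""
    return h(reward + gamma * h_inv(next_q))


ratio = effective_horizon(0.999) / effective_horizon(0.99)
print(round(ratio, 6))                  # about 10: one order of magnitude
print(h(1.0), h(1000.0))                # a 1000x larger return maps to a far smaller gap
print(transformed_target(1.0, 0.999, h(50.0)))
```

Learning values in the squashed space keeps targets in a similar numeric range across games, while `h_inv` recovers the original scale when the bootstrap term is added to the raw reward.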

Block Annotation: Better Image Annotation With Sub-Image Decomposition

Title Block Annotation: Better Image Annotation With Sub-Image Decomposition
Authors Hubert Lin, Paul Upchurch, Kavita Bala
Abstract Image datasets with high-quality pixel-level annotations are valuable for semantic segmentation: labelling every pixel in an image ensures that rare classes and small objects are annotated. However, full-image annotations are expensive, with experts spending up to 90 minutes per image. We propose block sub-image annotation as a replacement for full-image annotation. Despite the attention cost of frequent task switching, we find that block annotations can be crowdsourced at higher quality compared to full-image annotation with equal monetary cost using existing annotation tools developed for full-image annotation. Surprisingly, we find that 50% pixels annotated with blocks allows semantic segmentation to achieve equivalent performance to 100% pixels annotated. Furthermore, as little as 12% of pixels annotated allows performance as high as 98% of the performance with dense annotation. In weakly-supervised settings, block annotation outperforms existing methods by 3-4% (absolute) given equivalent annotation time. To recover the necessary global structure for applications such as characterizing spatial context and affordance relationships, we propose an effective method to inpaint block-annotated images with high-quality labels without additional human effort. As such, fewer annotations can also be used for these applications compared to full-image annotation.
Tasks Semantic Segmentation
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Lin_Block_Annotation_Better_Image_Annotation_With_Sub-Image_Decomposition_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Lin_Block_Annotation_Better_Image_Annotation_With_Sub-Image_Decomposition_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/block-annotation-better-image-annotation-with
Repo
Framework
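
The "50% of pixels annotated with blocks" setting in the Block Annotation abstract above can be made concrete with a small sketch. It assumes, for illustration only, that the image is cut into equal square blocks and that every other block is selected in a checkerboard pattern; the abstract does not specify the selection pattern, and the function names are hypothetical.

```python
def checkerboard_blocks(rows: int, cols: int) -> list[tuple[int, int]]:
    """Select every other block, like the dark squares of a checkerboard."""
    return [(r, c) for r in range(rows) for c in range(cols) if (r + c) % 2 == 0]


def annotated_fraction(height: int, width: int, block: int, selected) -> float:
    """Fraction of image pixels covered by the selected blocks.

    Assumes height and width are exact multiples of the block size.
    """
    return len(selected) * block * block / (height * width)


HEIGHT, WIDTH, BLOCK = 256, 256, 32  # made-up image and block sizes
selected = checkerboard_blocks(HEIGHT // BLOCK, WIDTH // BLOCK)
fraction = annotated_fraction(HEIGHT, WIDTH, BLOCK, selected)
print(len(selected), fraction)
```

Selecting fewer blocks per image lowers the fraction in the same way, which is how a budget such as 12% of pixels would be expressed in this toy setup.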