February 2, 2020

Paper Group AWR 9

AdvKnn: Adversarial Attacks On K-Nearest Neighbor Classifiers With Approximate Gradients

Title AdvKnn: Adversarial Attacks On K-Nearest Neighbor Classifiers With Approximate Gradients
Authors Xiaodan Li, Yuefeng Chen, Yuan He, Hui Xue
Abstract Deep neural networks have been shown to be vulnerable to adversarial examples: maliciously crafted examples that can trigger the target model to misbehave through imperceptible perturbations. Existing attack methods for k-nearest neighbor (kNN) based algorithms either require large perturbations or are not applicable for large k. To handle this problem, this paper proposes a new method called AdvKNN for evaluating the adversarial robustness of kNN-based models. First, we propose a deep kNN block that approximates the output of kNN methods and is differentiable, so it can provide gradients for attacks to cross the decision boundary with small distortions. Second, we propose a new consistency learning objective over distributions rather than classifications, which makes the attack effective against distribution-based methods. Extensive experimental results indicate that the proposed method significantly outperforms the state of the art in terms of both attack success rate and the size of the added perturbations.
Tasks
Published 2019-11-15
URL https://arxiv.org/abs/1911.06591v2
PDF https://arxiv.org/pdf/1911.06591v2.pdf
PWC https://paperswithcode.com/paper/advknn-adversarial-attacks-on-k-nearest
Repo https://github.com/fiona-lxd/AdvKnn
Framework pytorch
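
The differentiable kNN approximation at the heart of the abstract can be illustrated in a few lines: replace hard neighbor voting with a temperature-controlled softmax over distances so that gradients flow back to the query. This is a hedged sketch of the general technique, not the authors' exact deep kNN block; the temperature value is an assumption.

```python
import torch
import torch.nn.functional as F

def soft_knn_logits(query, support, support_labels, num_classes, temperature=0.1):
    """Differentiable stand-in for kNN voting (a sketch, not the paper's block).

    query:          (B, D) feature vectors under attack
    support:        (N, D) training features the kNN classifier votes over
    support_labels: (N,)   integer labels of the support set
    """
    dists = torch.cdist(query, support) ** 2          # (B, N) squared distances
    weights = F.softmax(-dists / temperature, dim=1)  # closer points vote more
    one_hot = F.one_hot(support_labels, num_classes).float()  # (N, C)
    return weights @ one_hot                          # (B, C) soft class scores

# Gradients of these soft logits w.r.t. `query` can drive a gradient-based
# attack step (e.g. FGSM-style) across the kNN decision boundary.
```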

Structured Variational Inference in Unstable Gaussian Process State Space Models

Title Structured Variational Inference in Unstable Gaussian Process State Space Models
Authors Sebastian Curi, Silvan Melchior, Felix Berkenkamp, Andreas Krause
Abstract We propose a new variational inference algorithm for learning in Gaussian Process State-Space Models (GPSSMs). Our algorithm enables learning of unstable and partially observable systems, where previous algorithms fail. Our main algorithmic contribution is a novel approximate posterior that can be calculated efficiently using a single forward and backward pass along the training trajectories. The forward-backward pass is inspired by Kalman smoothing for linear dynamical systems but generalizes to GPSSMs. Our second contribution is a modification of the conditioning step that effectively lowers the Kalman gain. This modification is crucial to attaining good test performance where no measurements are available. Finally, we show experimentally that our learning algorithm performs well in stable and unstable real systems with hidden states.
Tasks Gaussian Processes
Published 2019-07-16
URL https://arxiv.org/abs/1907.07035v2
PDF https://arxiv.org/pdf/1907.07035v2.pdf
PWC https://paperswithcode.com/paper/structured-variational-inference-in-unstable
Repo https://github.com/silvanmelchior/CBF-SSM
Framework none
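
In the linear-Gaussian special case, the "lowered Kalman gain" modification described in the abstract can be caricatured as scaling the gain before the update. A hedged sketch under that simplification; the `damping` factor and its placement are assumptions, not the paper's GPSSM conditioning step.

```python
import numpy as np

def damped_kalman_update(mu, P, y, H, R, damping=0.5):
    """One conditioning step with a deliberately lowered Kalman gain.

    mu, P: prior mean and covariance;  y: measurement;
    H, R:  observation matrix and noise covariance;
    damping in [0, 1] shrinks the gain (0 ignores the measurement entirely).
    """
    S = H @ P @ H.T + R                          # innovation covariance
    K = damping * (P @ H.T @ np.linalg.inv(S))   # lowered gain
    mu_new = mu + K @ (y - H @ mu)
    P_new = (np.eye(len(mu)) - K @ H) @ P
    return mu_new, P_new
```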

Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning

Title Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning
Authors Baoyuan Wu, Weidong Chen, Yanbo Fan, Yong Zhang, Jinlong Hou, Jie Liu, Tong Zhang
Abstract In existing visual representation learning tasks, deep convolutional neural networks (CNNs) are often trained on images annotated with single tags, such as ImageNet. However, a single tag cannot describe all the important content of an image, and some useful visual information may be wasted during training. In this work, we propose to train CNNs on images annotated with multiple tags, to enhance the quality of the visual representation of the trained CNN model. To this end, we build a large-scale multi-label image database with 18M images and 11K categories, dubbed Tencent ML-Images. We efficiently train the ResNet-101 model with multi-label outputs on Tencent ML-Images, taking 90 hours for 60 epochs, based on a large-scale distributed deep learning framework, i.e., TFplus. The good quality of the visual representation of the Tencent ML-Images checkpoint is verified through three transfer learning tasks: single-label image classification on ImageNet and Caltech-256, object detection on PASCAL VOC 2007, and semantic segmentation on PASCAL VOC 2012. The Tencent ML-Images database, the checkpoints of ResNet-101, and all the training code have been released at https://github.com/Tencent/tencent-ml-images. We expect it to promote other vision tasks in the research and industry communities.
Tasks Image Classification, Object Detection, Representation Learning, Semantic Segmentation, Transfer Learning
Published 2019-01-07
URL https://arxiv.org/abs/1901.01703v7
PDF https://arxiv.org/pdf/1901.01703v7.pdf
PWC https://paperswithcode.com/paper/tencent-ml-images-a-large-scale-multi-label
Repo https://github.com/Tencent/tencent-ml-images
Framework tf
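
The multi-label objective the abstract describes amounts to an independent sigmoid per tag instead of a single softmax. A minimal sketch in PyTorch for illustration only; the released code is TensorFlow-based, and the exact head size should be taken from the repo (the ~11K figure below is the abstract's rounded count).

```python
import torch
import torch.nn as nn
import torchvision

NUM_CLASSES = 11000  # placeholder for the ~11K ML-Images categories

# ResNet-101 with a multi-label head: each tag gets its own sigmoid output.
model = torchvision.models.resnet101(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
criterion = nn.BCEWithLogitsLoss()  # per-class binary cross-entropy

images = torch.randn(4, 3, 224, 224)   # dummy batch
targets = torch.zeros(4, NUM_CLASSES)
targets[:, [3, 42]] = 1.0              # an image may carry multiple tags
loss = criterion(model(images), targets)
loss.backward()
```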

Interpreting the Latent Space of GANs for Semantic Face Editing

Title Interpreting the Latent Space of GANs for Semantic Face Editing
Authors Yujun Shen, Jinjin Gu, Xiaoou Tang, Bolei Zhou
Abstract Despite the recent advances of Generative Adversarial Networks (GANs) in high-fidelity image synthesis, it remains poorly understood how GANs map a latent code sampled from a random distribution to a photo-realistic image. Previous work assumes the latent space learned by GANs follows a distributed representation but observes the vector arithmetic phenomenon. In this work, we propose a novel framework, called InterFaceGAN, for semantic face editing by interpreting the latent semantics learned by GANs. In this framework, we conduct a detailed study of how different semantics are encoded in the latent space of GANs for face synthesis. We find that the latent code of well-trained generative models actually learns a disentangled representation after linear transformations. We explore the disentanglement between various semantics and manage to decouple some entangled semantics with subspace projection, leading to more precise control of facial attributes. Besides manipulating gender, age, expression, and the presence of eyeglasses, we can even vary the face pose as well as correct artifacts accidentally generated by GAN models. The proposed method is further applied to achieve real image manipulation when combined with GAN inversion methods or encoder-based models. Extensive results suggest that learning to synthesize faces spontaneously brings a disentangled and controllable facial attribute representation.
Tasks Face Generation, Image Generation
Published 2019-07-25
URL https://arxiv.org/abs/1907.10786v3
PDF https://arxiv.org/pdf/1907.10786v3.pdf
PWC https://paperswithcode.com/paper/interpreting-the-latent-space-of-gans-for
Repo https://github.com/ShenYujun/InterFaceGAN
Framework tf
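
The editing operations the abstract describes reduce to simple vector algebra in latent space: move along a boundary normal to change an attribute, and project one direction off another to decouple entangled attributes. A sketch of those two operations, with randomly generated stand-ins for the learned directions.

```python
import numpy as np

def edit(z, direction, alpha):
    """Move a latent code along a semantic direction: z' = z + alpha * n."""
    n = direction / np.linalg.norm(direction)
    return z + alpha * n

def decouple(primary, conditioned):
    """Subspace projection: strip from `primary` its component along
    `conditioned`, so editing one attribute perturbs the other less."""
    n1 = primary / np.linalg.norm(primary)
    n2 = conditioned / np.linalg.norm(conditioned)
    return n1 - (n1 @ n2) * n2

# In InterFaceGAN the directions are normals of linear boundaries fit on
# labeled latents; here we use random stand-ins for illustration.
z = np.random.randn(512)
age, glasses = np.random.randn(512), np.random.randn(512)
z_older = edit(z, decouple(age, glasses), alpha=3.0)
```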

Pseudo Random Number Generation: a Reinforcement Learning approach

Title Pseudo Random Number Generation: a Reinforcement Learning approach
Authors Luca Pasqualini, Maurizio Parton
Abstract Pseudo-Random Number Generators (PRNGs) are algorithms designed to generate long sequences of statistically uncorrelated numbers, i.e., Pseudo-Random Numbers (PRNs). These numbers are widely employed in mid-level cryptography and in software applications. Test suites are used to evaluate the quality of PRNGs by checking statistical properties of the generated sequences. Machine learning techniques are often used to break these generators, for instance by approximating a certain generator or a certain sequence with a neural network. But what about using machine learning to generate PRNGs? This paper proposes a Reinforcement Learning (RL) approach to the task of generating PRNGs from scratch by learning a policy to solve an N-dimensional navigation problem. In this context, N is the length of the period of the generated sequence, and the policy is iteratively improved using the average value of an appropriate test suite run over that period. The aim of this work is to demonstrate the feasibility of the proposed approach, to compare it with classical methods, and to lay the foundation of a research path that combines RL and PRNGs.
Tasks
Published 2019-12-15
URL https://arxiv.org/abs/1912.11531v1
PDF https://arxiv.org/pdf/1912.11531v1.pdf
PWC https://paperswithcode.com/paper/pseudo-random-number-generation-a
Repo https://github.com/InsaneMonster/pasqualini2019prngrl
Framework tf
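
The RL framing in the abstract can be sketched as an episodic environment: the agent emits one number per step, and the terminal reward scores the whole period with a statistical test. The monobit-style balance check below is a cheap stand-in of my own, not the paper's actual test suite.

```python
import numpy as np

class PRNGEnv:
    """Toy episodic environment for PRNG generation (framing sketch only)."""

    def __init__(self, period=256):
        self.period = period
        self.seq = []

    def reset(self):
        self.seq = []
        return np.zeros(1, dtype=np.float32)

    def step(self, action):
        self.seq.append(int(action) % 256)        # one 8-bit output per step
        done = len(self.seq) == self.period
        reward = self._score() if done else 0.0   # scored once, over the period
        obs = np.array([len(self.seq) / self.period], dtype=np.float32)
        return obs, reward, done, {}

    def _score(self):
        # Stand-in statistic: how balanced are the generated bits?
        bits = np.unpackbits(np.array(self.seq, dtype=np.uint8))
        return 1.0 - 2.0 * abs(float(bits.mean()) - 0.5)  # 1.0 = balanced
```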

TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents

Title TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
Authors Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
Abstract We introduce a new approach to generative data-driven dialogue systems (e.g. chatbots) called TransferTransfo, which combines a transfer learning based training scheme with a high-capacity Transformer model. Fine-tuning is performed using a multi-task objective that combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over current state-of-the-art end-to-end conversational models such as memory-augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state of the art, with respective perplexity, Hits@1 and F1 metrics of 16.28 (45% absolute improvement), 80.7 (46% absolute improvement) and 19.5 (20% absolute improvement).
Tasks Information Retrieval, Transfer Learning
Published 2019-01-23
URL http://arxiv.org/abs/1901.08149v2
PDF http://arxiv.org/pdf/1901.08149v2.pdf
PWC https://paperswithcode.com/paper/transfertransfo-a-transfer-learning-approach
Repo https://github.com/cerebroai/AskIt
Framework pytorch
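
The multi-task fine-tuning objective combines several losses into one scalar. A minimal sketch of one common instantiation, pairing a language-modeling loss with a next-utterance classification loss from a double-headed Transformer; the coefficients are assumptions, not values taken from the paper.

```python
import torch

def transfertransfo_loss(lm_loss, mc_loss, lm_coef=2.0, mc_coef=1.0):
    """Weighted sum of language-modeling and next-utterance classification
    losses (coefficient values are assumed, not the paper's)."""
    return lm_coef * lm_loss + mc_coef * mc_loss

# Stand-in loss values; in practice both come from the fine-tuned model.
print(transfertransfo_loss(torch.tensor(3.1), torch.tensor(0.07)))
```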

Does Order Matter? An Empirical Study on Generating Multiple Keyphrases as a Sequence

Title Does Order Matter? An Empirical Study on Generating Multiple Keyphrases as a Sequence
Authors Rui Meng, Xingdi Yuan, Tong Wang, Peter Brusilovsky, Adam Trischler, Daqing He
Abstract Recently, concatenating multiple keyphrases as a target sequence has been proposed as a new learning paradigm for keyphrase generation. Existing studies concatenate target keyphrases in different orders, but no study has examined the effects of ordering on model behavior. In this paper, we propose several orderings for concatenation and inspect the important factors for training a successful keyphrase generation model. By running comprehensive comparisons, we observe one preferable ordering and summarize a number of empirical findings and challenges, which can shed light on future research in this line of work.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.03590v3
PDF https://arxiv.org/pdf/1909.03590v3.pdf
PWC https://paperswithcode.com/paper/does-order-matter-an-empirical-study-on
Repo https://github.com/memray/OpenNMT-kpg-release
Framework pytorch
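
The experimental variable in this paper is purely how gold keyphrases are ordered before concatenation into one decoder target. A sketch of that construction; the separator token and ordering names are assumptions (the paper defines its own set of orderings).

```python
import random

SEP = "<sep>"  # assumed separator token

def make_target(keyphrases, source_text, ordering="pres-abs"):
    """Join keyphrases into a single target sequence under one ordering."""
    if ordering == "random":
        keyphrases = random.sample(keyphrases, len(keyphrases))
    elif ordering == "alphabetical":
        keyphrases = sorted(keyphrases)
    elif ordering == "pres-abs":
        # phrases present in the source first, by first occurrence, then absent
        present = sorted((p for p in keyphrases if p in source_text),
                         key=source_text.find)
        absent = [p for p in keyphrases if p not in source_text]
        keyphrases = present + absent
    return f" {SEP} ".join(keyphrases)

doc = "neural keyphrase generation with sequence to sequence models"
print(make_target(["copy mechanism", "keyphrase generation"], doc))
```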

Geometry Sharing Network for 3D Point Cloud Classification and Segmentation

Title Geometry Sharing Network for 3D Point Cloud Classification and Segmentation
Authors Mingye Xu, Zhipeng Zhou, Yu Qiao
Abstract Despite recent progress on classifying 3D point clouds with deep CNNs, large geometric transformations such as rotation and translation remain a challenging problem and harm final classification performance. To address this challenge, we propose the Geometry Sharing Network (GS-Net), which effectively learns point descriptors with holistic context to enhance robustness to geometric transformations. Compared with previous 3D point CNNs, which perform convolution on nearby points, GS-Net can aggregate point features in a more global way. Specifically, GS-Net consists of Geometry Similarity Connection (GSC) modules, which exploit an Eigen-Graph to group distant points with similar and relevant geometric information, and aggregate features from nearest neighbors in both Euclidean space and eigenvalue space. This design allows GS-Net to efficiently capture both local and holistic geometric features such as symmetry, curvature, convexity, and connectivity. Theoretically, we show that the nearest neighbors of each point in eigenvalue space are invariant to rotation and translation. We conduct extensive experiments on the public datasets ModelNet40 and ShapeNet Part. Experiments demonstrate that GS-Net achieves state-of-the-art performance on major datasets (93.3% on ModelNet40) and is more robust to geometric transformations.
Tasks
Published 2019-12-23
URL https://arxiv.org/abs/1912.10644v1
PDF https://arxiv.org/pdf/1912.10644v1.pdf
PWC https://paperswithcode.com/paper/geometry-sharing-network-for-3d-point-cloud
Repo https://github.com/MingyeXu/GS-Net
Framework pytorch
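
The rotation- and translation-invariance claim rests on a simple fact: eigenvalues of a local covariance matrix do not change when the cloud is rotated or shifted. A sketch of computing such per-point eigenvalue signatures and querying neighbors in eigenvalue space; this illustrates the Eigen-Graph idea, not the GSC module itself.

```python
import numpy as np

def eigenvalue_signatures(points, k=16):
    """Per-point descriptors from eigenvalues of the local covariance.

    points: (N, 3) cloud. The signatures are invariant to rotating or
    translating the whole cloud, so neighbors in this space share local
    shape, not position.
    """
    n = len(points)
    sigs = np.empty((n, 3))
    for i in range(n):
        idx = np.argsort(((points - points[i]) ** 2).sum(1))[:k]  # kNN in R^3
        cov = np.cov(points[idx].T)                               # 3x3 covariance
        sigs[i] = np.sort(np.linalg.eigvalsh(cov))[::-1]          # descending
    return sigs

cloud = np.random.rand(128, 3)
sigs = eigenvalue_signatures(cloud)
# Nearest neighbors of point 0 in eigenvalue space (may be distant in R^3):
print(np.argsort(((sigs - sigs[0]) ** 2).sum(1))[:8])
```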

An efficient solution for semantic segmentation: ShuffleNet V2 with atrous separable convolutions

Title An efficient solution for semantic segmentation: ShuffleNet V2 with atrous separable convolutions
Authors Sercan Türkmen, Janne Heikkilä
Abstract Assigning a label to each pixel in an image, namely semantic segmentation, has been an important task in computer vision, with applications in autonomous driving, robotic navigation, localization, and scene understanding. Fully convolutional neural networks have proved to be a successful solution for the task over the years, but most of the work focuses primarily on accuracy. In this paper, we present a computationally efficient approach to semantic segmentation that achieves a high mean intersection over union (mIoU) of 70.33% on the Cityscapes challenge. The proposed network is capable of running in real time on mobile devices. In addition, we make our code and model weights publicly available.
Tasks Autonomous Driving, Scene Understanding, Semantic Segmentation
Published 2019-02-20
URL http://arxiv.org/abs/1902.07476v2
PDF http://arxiv.org/pdf/1902.07476v2.pdf
PWC https://paperswithcode.com/paper/an-efficient-solution-for-semantic
Repo https://github.com/sercant/mobile-segmentation
Framework tf
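
The block named in the title, an atrous separable convolution, factorizes a dilated 3x3 convolution into a depthwise and a pointwise step. A generic sketch; channel counts and the BN/ReLU placement are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class AtrousSeparableConv(nn.Module):
    """Depthwise dilated 3x3 conv followed by a 1x1 pointwise conv."""

    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=dilation, dilation=dilation,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 116, 64, 128)  # e.g. a ShuffleNet V2 stage output
print(AtrousSeparableConv(116, 256, dilation=6)(x).shape)
```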

Towards Stable Symbol Grounding with Zero-Suppressed State AutoEncoder

Title Towards Stable Symbol Grounding with Zero-Suppressed State AutoEncoder
Authors Masataro Asai, Hiroshi Kajino
Abstract While classical planning has been an active branch of AI, its applicability is limited to tasks precisely modeled by humans. Fully automated high-level agents should instead be able to find a symbolic representation of an unknown environment without supervision; otherwise they exhibit the knowledge-acquisition bottleneck. Meanwhile, Latplan (Asai and Fukunaga 2018) partially resolves the bottleneck with a neural network called the State AutoEncoder (SAE). The SAE obtains a propositional representation of image-based puzzle domains with unsupervised learning, generates a state space, and performs classical planning. In this paper, we identify the problematic, stochastic behavior of the SAE-produced propositions as a new sub-problem of the symbol grounding problem: the symbol stability problem. Informally, symbols are stable when their referents (e.g. propositional values) do not change under small perturbations of the observation, and unstable symbols are harmful to symbolic reasoning. We analyze the problem in Latplan both formally and empirically, and propose the “Zero-Suppressed SAE”, an enhancement that stabilizes the propositions using the idea of the closed-world assumption as a prior for NN optimization. We show that it finds more stable propositions and more compact representations, resulting in an improved success rate for Latplan. It is robust across various hyperparameters, eases the tuning effort, and provides a weight-pruning capability as a side effect.
Tasks
Published 2019-03-27
URL http://arxiv.org/abs/1903.11277v1
PDF http://arxiv.org/pdf/1903.11277v1.pdf
PWC https://paperswithcode.com/paper/towards-stable-symbol-grounding-with-zero
Repo https://github.com/guicho271828/latplan
Framework tf
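
The closed-world prior in the abstract can be read as a regularizer that pushes latent propositions toward false unless reconstruction needs them. A hedged sketch of such a penalty; the weight and exact form are assumptions, not the paper's formulation.

```python
import torch

def zero_suppress_loss(recon_loss, z, alpha=0.7):
    """Reconstruction loss plus a penalty on 'true' propositions.

    z holds (approximately binary) propositions in [0, 1]; alpha is an
    assumed weight. The penalty biases unused propositions toward 0.
    """
    return recon_loss + alpha * z.mean()

z = torch.sigmoid(torch.randn(32, 100))  # stand-in propositional layer
recon = torch.tensor(0.25)               # stand-in reconstruction loss
print(zero_suppress_loss(recon, z))
```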

Merging Weak and Active Supervision for Semantic Parsing

Title Merging Weak and Active Supervision for Semantic Parsing
Authors Ansong Ni, Pengcheng Yin, Graham Neubig
Abstract A semantic parser maps natural language commands (NLs) from users to executable meaning representations (MRs), which are later executed in a certain environment to obtain user-desired results. Fully supervised training of such a parser requires NL/MR pairs annotated by domain experts, which makes them expensive to collect. However, weakly supervised semantic parsers are learned only from pairs of NL and expected execution results, leaving the MRs latent. While weak supervision is cheaper to acquire, learning from this input poses difficulties: it demands that parsers search a large space with a very weak learning signal, and it is hard to avoid spurious MRs that achieve the correct answer in the wrong way. These factors lead to a performance gap between parsers trained in weakly and fully supervised settings. To bridge this gap, we examine the intersection of weak supervision and active learning, which allows the learner to actively select examples and query for manual annotations as extra supervision to improve a model trained under weak supervision. We study different active learning heuristics for selecting examples to query, and various forms of extra supervision for such queries. We evaluate the effectiveness of our method on two different datasets. Experiments on WikiSQL show that by annotating only 1.8% of examples, we improve over a state-of-the-art weakly supervised baseline by 6.4%, achieving an accuracy of 79.0%, which is only 1.3% away from the model trained with full supervision. Experiments on WikiTableQuestions with human annotators show that our method can improve performance with only 100 active queries, especially for weakly supervised parsers learned from a cold start.
Tasks Active Learning, Semantic Parsing
Published 2019-11-29
URL https://arxiv.org/abs/1911.12986v1
PDF https://arxiv.org/pdf/1911.12986v1.pdf
PWC https://paperswithcode.com/paper/merging-weak-and-active-supervision-for
Repo https://github.com/niansong1996/wassp
Framework none
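
One family of heuristics the abstract alludes to is uncertainty-based selection: spend the annotation budget on the examples the weakly supervised parser is least confident about. A sketch with a hypothetical confidence function; the paper compares several heuristics, and this is not necessarily the best one it found.

```python
def select_for_annotation(examples, confidence, budget):
    """Pick the `budget` examples with the lowest parser confidence.

    `confidence` maps an example to e.g. the top hypothesis probability;
    both the name and the measure are hypothetical stand-ins.
    """
    return sorted(examples, key=confidence)[:budget]

examples = ["q1", "q2", "q3", "q4"]
conf = {"q1": 0.9, "q2": 0.2, "q3": 0.5, "q4": 0.7}  # stand-in scores
print(select_for_annotation(examples, conf.get, budget=2))  # ['q2', 'q3']
```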

Compositional Semantic Parsing Across Graphbanks

Title Compositional Semantic Parsing Across Graphbanks
Authors Matthias Lindemann, Jonas Groschwitz, Alexander Koller
Abstract Most semantic parsers that map sentences to graph-based meaning representations are hand-designed for specific graphbanks. We present a compositional neural semantic parser which achieves, for the first time, competitive accuracies across a diverse range of graphbanks. Incorporating BERT embeddings and multi-task learning improves the accuracy further, setting new states of the art on DM, PAS, PSD, AMR 2015 and EDS.
Tasks Multi-Task Learning, Semantic Parsing
Published 2019-06-27
URL https://arxiv.org/abs/1906.11746v2
PDF https://arxiv.org/pdf/1906.11746v2.pdf
PWC https://paperswithcode.com/paper/compositional-semantic-parsing-across
Repo https://github.com/coli-saar/am-parser
Framework pytorch

Generative Models for Low-Rank Video Representation and Reconstruction

Title Generative Models for Low-Rank Video Representation and Reconstruction
Authors Rakib Hyder, M. Salman Asif
Abstract Finding a compact representation of videos is an essential component of almost every problem related to video processing or understanding. In this paper, we propose a generative model to learn compact latent codes that can efficiently represent and reconstruct a video sequence from its missing or under-sampled measurements. We use a generative network that is trained to map a compact code into an image. We first demonstrate that if a video sequence belongs to the range of the pretrained generative network, then we can recover it by estimating the underlying compact latent codes. Then we demonstrate that even if the video sequence does not belong to the range of a pretrained network, we can still recover the true video sequence by jointly updating the latent codes and the weights of the generative network. To avoid overfitting in our model, we regularize the recovery problem by imposing low-rank and similarity constraints on the latent codes of the neighboring frames in the video sequence. We use our methods to recover a variety of videos from compressive measurements at different compression rates. We also demonstrate that we can generate missing frames in a video sequence by interpolating the latent codes of the observed frames in the low-dimensional space.
Tasks
Published 2019-02-25
URL http://arxiv.org/abs/1902.11132v1
PDF http://arxiv.org/pdf/1902.11132v1.pdf
PWC https://paperswithcode.com/paper/generative-models-for-low-rank-video
Repo https://github.com/CSIPlab/gmlr
Framework pytorch
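
The recovery problem in the abstract is an optimization over latent codes (and, when the video is out of range, generator weights) under low-rank and similarity constraints. A hedged sketch of that loop; the measurement operator `A`, loss weights, and optimizer settings are assumptions.

```python
import torch

def recover_video(generator, measurements, A, n_frames, z_dim=128,
                  steps=500, lam=0.1, rank_weight=0.05):
    """Fit per-frame latent codes and generator weights to compressive
    measurements (sketch; `A` is an assumed measurement operator)."""
    z = torch.randn(n_frames, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z] + list(generator.parameters()), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        frames = generator(z)                        # (T, C, H, W)
        data_fit = ((A(frames) - measurements) ** 2).mean()
        nuclear = torch.linalg.svdvals(z).sum()      # low-rank surrogate
        smooth = ((z[1:] - z[:-1]) ** 2).mean()      # neighboring-frame similarity
        loss = data_fit + rank_weight * nuclear + lam * smooth
        loss.backward()
        opt.step()
    return z.detach()
```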

A Sketch-Based System for Semantic Parsing

Title A Sketch-Based System for Semantic Parsing
Authors Zechang Li, Yuxuan Lai, Yuxi Xie, Yansong Feng, Dongyan Zhao
Abstract This paper presents our semantic parsing system for the open-domain semantic parsing evaluation task in NLPCC 2019. Many previous works formulate semantic parsing as a sequence-to-sequence (seq2seq) problem. Instead, we treat the task as a sketch-based problem in a coarse-to-fine (coarse2fine) fashion. The sketch is a high-level structure of the logical form, exclusive of low-level details such as entities and predicates. In this way, we are able to optimize each part individually. Specifically, we decompose the process into three stages: sketch classification determines the high-level structure, while entity labeling and a matching network fill in the missing details. Moreover, we adopt the seq2seq method to evaluate logical-form candidates from an overall perspective. The co-occurrence relationship between predicates and entities contributes to the reranking as well. Our submitted system achieves an exact match accuracy of 82.53% on the full test set and 47.83% on the hard test subset, taking 3rd place in NLPCC 2019 Shared Task 2. After optimizing parameters, the network structure, and sampling, the accuracy reaches 84.47% on the full test set and 63.08% on the hard test subset. (Our code and data are available at https://github.com/zechagl/NLPCC2019-Semantic-Parsing.)
Tasks Semantic Parsing
Published 2019-09-02
URL https://arxiv.org/abs/1909.00574v2
PDF https://arxiv.org/pdf/1909.00574v2.pdf
PWC https://paperswithcode.com/paper/a-sketch-based-system-for-semantic-parsing
Repo https://github.com/zechagl/NLPCC2019-Semantic-Parsing
Framework tf
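
The coarse-to-fine decomposition in the abstract is a three-stage pipeline with a final reranking pass. A structural sketch only; all component names are placeholders for the models described, not the repo's actual API.

```python
def parse(utterance, sketch_clf, entity_labeler, matcher, reranker):
    """Sketch of the coarse2fine pipeline (component names hypothetical)."""
    sketch = sketch_clf(utterance)          # 1. classify high-level structure
    entities = entity_labeler(utterance)    # 2. label entity mentions
    candidates = matcher(sketch, entities)  # 3. fill predicates via matching
    # Rerank complete logical-form candidates; the paper combines a seq2seq
    # score with predicate/entity co-occurrence features.
    return max(candidates, key=reranker)
```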

SParC: Cross-Domain Semantic Parsing in Context

Title SParC: Cross-Domain Semantic Parsing in Context
Authors Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev
Abstract We present SParC, a dataset for cross-domain Semantic Parsing in Context that consists of 4,298 coherent question sequences (12k+ individual questions annotated with SQL queries). It is obtained from controlled user interactions with 200 complex databases over 138 domains. We provide an in-depth analysis of SParC and show that it introduces new challenges compared to existing datasets. SParC (1) demonstrates complex contextual dependencies, (2) has greater semantic diversity, and (3) requires generalization to unseen domains due to its cross-domain nature and the unseen databases at test time. We experiment with two state-of-the-art text-to-SQL models adapted to the context-dependent, cross-domain setup. The best model obtains an exact match accuracy of 20.2% over all questions and less than 10% over all interaction sequences, indicating that the cross-domain setting and the contextual phenomena of the dataset present significant challenges for future research. The dataset, baselines, and leaderboard are released at https://yale-lily.github.io/sparc.
Tasks Semantic Parsing, Text-To-Sql
Published 2019-06-05
URL https://arxiv.org/abs/1906.02285v1
PDF https://arxiv.org/pdf/1906.02285v1.pdf
PWC https://paperswithcode.com/paper/sparc-cross-domain-semantic-parsing-in
Repo https://github.com/ryanzhumich/sparc_atis_pytorch
Framework pytorch