Paper Group NANR 13
Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders. Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection. Tübingen-Oslo Team at the VarDial 2018 Evaluation Campaign: An Analysis of N-gram Features in Language Variety Identification. Sentence Compre …
Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders
Title | Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders |
Authors | Panna Felsen, Patrick Lucey, Sujoy Ganguly |
Abstract | Simultaneously and accurately forecasting the behavior of many interacting agents is imperative for computer vision applications to be widely deployed (e.g., autonomous vehicles, security, surveillance, sports). In this paper, we present a technique using a conditional variational autoencoder which learns a model that “personalizes” prediction to individual agent behavior within a group representation. Given the volume of data available and its adversarial nature, we focus on the sport of basketball and show that our approach efficiently predicts context-specific agent motions. We find that our model generates results that are three times as accurate as previous state-of-the-art approaches (5.74 ft vs. 17.95 ft). |
Tasks | Autonomous Vehicles |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Panna_Felsen_Where_Will_They_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Panna_Felsen_Where_Will_They_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/where-will-they-go-predicting-fine-grained |
Repo | |
Framework | |
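The abstract above leaves the CVAE mechanics implicit. Below is a minimal, hypothetical PyTorch sketch of the general pattern it describes: an encoder infers a latent code from the observed history together with the future to be predicted, and a decoder reconstructs the future from the history plus a sample of that code. All layer sizes, the MLP architecture, and the flattened trajectory encoding are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    def __init__(self, hist_dim, fut_dim, z_dim=32, hidden=128):
        super().__init__()
        # Encoder q(z | history, future): infers a latent "intent" code.
        self.encoder = nn.Sequential(
            nn.Linear(hist_dim + fut_dim, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)
        # Decoder p(future | history, z): predicts motion from context + sample.
        self.decoder = nn.Sequential(
            nn.Linear(hist_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, fut_dim),
        )

    def forward(self, hist, fut):
        h = self.encoder(torch.cat([hist, fut], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.decoder(torch.cat([hist, z], dim=-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon, kl

# Toy shapes: 10 agents, 5 observed and 10 future timesteps, (x, y) coords,
# flattened per example (purely an assumed encoding).
model = TrajectoryCVAE(hist_dim=10 * 5 * 2, fut_dim=10 * 10 * 2)
hist, fut = torch.randn(4, 100), torch.randn(4, 200)
recon, kl = model(hist, fut)
loss = nn.functional.mse_loss(recon, fut) + 0.1 * kl
```

At test time one would drop the encoder and sample z from the prior, which is how a CVAE produces diverse plausible futures for the same observed history.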
Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection
Title | Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection |
Authors | Hao Chen, Youfu Li |
Abstract | How to incorporate cross-modal complementarity sufficiently is the cornerstone question for RGB-D salient object detection. Previous works mainly address this issue by simply concatenating multi-modal features or combining unimodal predictions. In this paper, we answer this question from two perspectives: (1) We argue that if the complementary part can be modelled more explicitly, the cross-modal complement is likely to be better captured. To this end, we design a novel complementarity-aware fusion (CA-Fuse) module for use with a Convolutional Neural Network (CNN). By introducing cross-modal residual functions and complementarity-aware supervisions in each CA-Fuse module, the problem of learning complementary information from the paired modality is explicitly posed as asymptotically approximating the residual function. (2) We explore the complement across all levels. By cascading the CA-Fuse modules and adding level-wise supervision densely from deep to shallow, the cross-level complement can be selected and combined progressively. The proposed RGB-D fusion network disambiguates both cross-modal and cross-level fusion processes and enables more complete fusion results. Experiments on public datasets show the effectiveness of the proposed CA-Fuse module and the RGB-D salient object detection network. |
Tasks | Object Detection, Salient Object Detection |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_Progressively_Complementarity-Aware_Fusion_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_Progressively_Complementarity-Aware_Fusion_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/progressively-complementarity-aware-fusion |
Repo | |
Framework | |
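As a reading aid, here is a hedged sketch of the cross-modal residual idea the abstract describes: each modality's feature map is augmented by a residual predicted from the paired modality, so the network only has to learn the complement. Channel counts, kernel sizes, and the final 1x1 fusion are assumptions for illustration, not the paper's exact CA-Fuse module.

```python
import torch
import torch.nn as nn

class CAFuseSketch(nn.Module):
    """Cross-modal residual fusion: each stream learns only the complement
    the other modality is missing (a sketch, not the paper's module)."""
    def __init__(self, channels):
        super().__init__()
        self.rgb_to_depth = nn.Conv2d(channels, channels, 3, padding=1)
        self.depth_to_rgb = nn.Conv2d(channels, channels, 3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f_rgb, f_depth):
        # Residual functions: approximate what the paired modality lacks.
        rgb_enh = f_rgb + self.depth_to_rgb(f_depth)
        depth_enh = f_depth + self.rgb_to_depth(f_rgb)
        return self.fuse(torch.cat([rgb_enh, depth_enh], dim=1))

f_rgb, f_depth = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
fused = CAFuseSketch(64)(f_rgb, f_depth)  # (1, 64, 32, 32)
```

In the paper this runs at every level of the decoder with level-wise supervision; the sketch shows a single level only.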
T"ubingen-Oslo Team at the VarDial 2018 Evaluation Campaign: An Analysis of N-gram Features in Language Variety Identification
Title | T"ubingen-Oslo Team at the VarDial 2018 Evaluation Campaign: An Analysis of N-gram Features in Language Variety Identification |
Authors | {\c{C}}a{\u{g}}r{\i} {\c{C}}{"o}ltekin, Taraka Rama, Verena Blaschke |
Abstract | This paper describes our systems for the VarDial 2018 evaluation campaign. We participated in all language identification tasks, namely, Arabic dialect identification (ADI), German dialect identification (GDI), discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). In all of the tasks, we only used textual transcripts (not using audio features for ADI). We submitted system runs based on support vector machine classifiers (SVMs) with bag of character and word n-grams as features, and gated bidirectional recurrent neural networks (RNNs) using units of characters and words. Our SVM models outperformed our RNN models in all tasks, obtaining the first place on the DFS task, third place on the ADI task, and second place on others according to the official rankings. As well as describing the models we used in the shared task participation, we present an analysis of the n-gram features used by the SVM models in each task, and also report additional results (that were run after the official competition deadline) on the GDI surprise dialect track. |
Tasks | Document Classification, Language Identification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3906/ |
PWC | https://paperswithcode.com/paper/ta14bingen-oslo-team-at-the-vardial-2018 |
Repo | |
Framework | |
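The winning configuration is the classic recipe the abstract names: bags of character and word n-grams feeding a linear SVM. A minimal scikit-learn version of the character-n-gram variant is below; the exact n-gram ranges and feature weighting the team used are assumptions here.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Character 1-5 gram features with a linear SVM (ranges/weighting assumed).
clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 5), sublinear_tf=True),
    LinearSVC(),
)
texts = ["mir geht's guet", "es geht mir gut"]  # toy dialect samples
labels = ["dialect_a", "dialect_b"]
clf.fit(texts, labels)
print(clf.predict(["wie geht's dir"]))
```

A parallel word-level vectorizer can be stacked alongside the character one; the paper's analysis focuses on which of these n-gram features each task's model actually relies on.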
Sentence Compression for Arbitrary Languages via Multilingual Pivoting
Title | Sentence Compression for Arbitrary Languages via Multilingual Pivoting |
Authors | Jonathan Mallinson, Rico Sennrich, Mirella Lapata |
Abstract | In this paper we advocate the use of bilingual corpora, which are abundantly available, for training sentence compression models. Our approach borrows much of its machinery from neural machine translation and leverages bilingual pivoting: compressions are obtained by translating a source string into a foreign language and then back-translating it into the source while controlling the translation length. Our model can be trained for any language as long as a bilingual corpus is available, and it performs arbitrary rewrites without access to compression-specific data. We release Moss, a new parallel Multilingual Compression dataset for English, German, and French, which can be used to evaluate compression models across languages and genres. |
Tasks | Machine Translation, Sentence Compression, Text Generation, Text Summarization |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1267/ |
PWC | https://paperswithcode.com/paper/sentence-compression-for-arbitrary-languages |
Repo | |
Framework | |
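The pivoting idea in the abstract reduces to a small amount of control flow: translate into the pivot language, then back-translate under a length cap. The sketch below shows only that plumbing, with caller-supplied translation functions standing in for the paper's NMT models; the `ratio` parameter and the toy "translators" in the demo are purely illustrative.

```python
def compress(sentence, translate_fwd, translate_back, ratio=0.7):
    """Bilingual-pivoting compression sketch. translate_fwd / translate_back
    are caller-supplied NMT functions; translate_back must honor a max_len
    cap (in a real system, via length-constrained beam search)."""
    pivoted = translate_fwd(sentence)
    max_len = max(1, int(ratio * len(sentence.split())))
    return translate_back(pivoted, max_len=max_len)

# Plumbing check with identity "translators" (stand-ins for real NMT models):
out = compress(
    "the quick brown fox jumps over the lazy dog",
    translate_fwd=lambda s: s,
    translate_back=lambda s, max_len: " ".join(s.split()[:max_len]),
)
print(out)  # "the quick brown fox jumps over"  (9 words * 0.7 -> 6 words)
```

The interesting work happens inside `translate_back`: because the decoder must stay within the length budget, it is forced to rewrite rather than merely truncate.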
Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence
Title | Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence |
Authors | Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, Joseph J. Lim |
Abstract | In this paper, we address the task of multi-view novel view synthesis, where we are interested in synthesizing a target image with an arbitrary camera pose from given source images. We propose an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision. Specifically, our model consists of a flow prediction module and a pixel generation module to directly leverage information presented in source views as well as hallucinate missing pixels from statistical priors. To merge the predictions produced by the two modules given multi-view source images, we introduce a self-learned confidence aggregation mechanism. We evaluate our model on images rendered from 3D object models as well as real and synthesized scenes. We demonstrate that our model is able to achieve state-of-the-art results as well as progressively improve its predictions when more source images are available. |
Tasks | Novel View Synthesis |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Shao-Hua_Sun_Multi-view_to_Novel_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Shao-Hua_Sun_Multi-view_to_Novel_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/multi-view-to-novel-view-synthesizing-novel |
Repo | |
Framework | |
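The self-learned confidence aggregation in the abstract amounts to a per-pixel weighted average of the two module outputs. A minimal sketch, assuming per-pixel confidence maps and a softmax normalization (the normalization choice is an assumption; in the paper the confidences are predicted by the network itself):

```python
import torch

def aggregate(flow_pred, pixel_pred, flow_conf, pixel_conf):
    """Confidence-weighted merge of the two module outputs (a sketch)."""
    w = torch.softmax(torch.stack([flow_conf, pixel_conf]), dim=0)
    return w[0] * flow_pred + w[1] * pixel_pred

flow_pred = torch.rand(1, 3, 64, 64)   # pixels warped from source views
pixel_pred = torch.rand(1, 3, 64, 64)  # pixels hallucinated from priors
flow_conf = torch.rand(1, 1, 64, 64)   # per-pixel self-learned confidence
pixel_conf = torch.rand(1, 1, 64, 64)
novel_view = aggregate(flow_pred, pixel_pred, flow_conf, pixel_conf)
```

With multiple source views, each view contributes its own prediction/confidence pair and the softmax simply runs over more entries, which is what lets results improve as sources are added.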
Modeling Facial Geometry Using Compositional VAEs
Title | Modeling Facial Geometry Using Compositional VAEs |
Authors | Timur Bagautdinov, Chenglei Wu, Jason Saragih, Pascal Fua, Yaser Sheikh |
Abstract | We propose a method for learning non-linear face geometry representations using deep generative models. Our model is a variational autoencoder with multiple levels of hidden variables, where lower layers capture global geometry and higher ones encode more local deformations. Based on this, we propose a new parameterization of facial geometry that naturally decomposes the structure of the human face into a set of semantically meaningful levels of detail. This parameterization enables us to perform model fitting while capturing varying levels of detail under different types of geometrical constraints. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Bagautdinov_Modeling_Facial_Geometry_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Bagautdinov_Modeling_Facial_Geometry_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/modeling-facial-geometry-using-compositional |
Repo | |
Framework | |
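A hedged sketch of the multi-level latent idea: one latent level decodes coarse global geometry and another adds local residual detail, each with its own KL term. The two-level MLP version below is a simplification under those assumptions; the paper's compositional, convolutional structure and mesh representation are not reproduced.

```python
import torch
import torch.nn as nn

class TwoLevelVAE(nn.Module):
    """Two latent levels: a global code for coarse geometry plus a local
    code for fine deformations (a simplified sketch of the hierarchy)."""
    def __init__(self, n_verts=100, z_global=8, z_local=32, hidden=256):
        super().__init__()
        d = n_verts * 3
        self.enc = nn.Sequential(nn.Linear(d, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({
            "global": nn.Linear(hidden, 2 * z_global),
            "local": nn.Linear(hidden, 2 * z_local),
        })
        self.dec_global = nn.Linear(z_global, d)   # coarse shape
        self.dec_local = nn.Linear(z_local, d)     # residual detail

    def forward(self, x):
        h = self.enc(x)
        kl, zs = 0.0, {}
        for name, head in self.heads.items():
            mu, logvar = head(h).chunk(2, dim=-1)
            zs[name] = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            kl = kl + (-0.5 * (1 + logvar - mu**2 - logvar.exp())).sum(-1).mean()
        recon = self.dec_global(zs["global"]) + self.dec_local(zs["local"])
        return recon, kl

x = torch.randn(4, 300)  # 100 vertices, xyz, flattened
recon, kl = TwoLevelVAE()(x)
```

Model fitting at a chosen level of detail then corresponds to optimizing only the latents up to that level while freezing or zeroing the finer ones.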
Generating Summaries of Sets of Consumer Products: Learning from Experiments
Title | Generating Summaries of Sets of Consumer Products: Learning from Experiments |
Authors | Kittipitch Kuptavanich, Ehud Reiter, Kees Van Deemter, Advaith Siddharthan |
Abstract | We explored the task of creating a textual summary describing a large set of objects characterised by a small number of features, using an e-commerce dataset. When a set of consumer products is large and varied, it can be difficult for a consumer to understand how the products in the set differ; consequently, it can be challenging to choose the most suitable product from the set. To assist consumers, we generated high-level summaries of product sets. Two generation algorithms are presented, discussed, and evaluated with human users. Our evaluation results suggest a positive contribution to consumers' understanding of the domain. |
Tasks | Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6548/ |
PWC | https://paperswithcode.com/paper/generating-summaries-of-sets-of-consumer |
Repo | |
Framework | |
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
Title | Learning Robust Rewards with Adversarial Inverse Reinforcement Learning |
Authors | Justin Fu, Katie Luo, Sergey Levine |
Abstract | Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering. Deep reinforcement learning methods can remove the need for explicit engineering of policy or value features, but still require a manually specified reward function. Inverse reinforcement learning holds the promise of automatic reward acquisition, but has proven exceptionally difficult to apply to large, high-dimensional problems with unknown dynamics. In this work, we propose AIRL, a practical and scalable inverse reinforcement learning algorithm based on an adversarial reward learning formulation that is competitive with direct imitation learning algorithms. Additionally, we show that AIRL is able to recover portable reward functions that are robust to changes in dynamics, enabling us to learn policies even under significant variation in the environment seen during training. |
Tasks | Decision Making, Imitation Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkHywl-A- |
PDF | https://openreview.net/pdf?id=rkHywl-A- |
PWC | https://paperswithcode.com/paper/learning-robust-rewards-with-adverserial |
Repo | |
Framework | |
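The core of AIRL is the discriminator's special form, D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)), which ties the discriminator to a learned reward-like function f. A small, numerically stable sketch of that computation (the surrounding training loop, networks, and the state-only reward decomposition are not shown):

```python
import torch

def airl_log_discriminator(f_value, log_pi):
    """AIRL discriminator form in log space:
    log D = f - logsumexp([f, log pi(a|s)])."""
    return f_value - torch.logsumexp(torch.stack([f_value, log_pi]), dim=0)

f_value = torch.tensor([0.5, -1.2])  # f_theta(s, a) for two samples
log_pi = torch.tensor([-2.3, -0.1])  # policy log-probs for the same (s, a)
log_D = airl_log_discriminator(f_value, log_pi)
# Policy reward log D - log(1 - D); algebraically this equals f - log pi.
reward = log_D - torch.log1p(-log_D.exp())
```

Because the reward reduces to f minus the policy's log-probability, training the policy against it drives pi toward exp(f), which is what allows f to recover a portable reward.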
End-to-End Learning of Task-Oriented Dialogs
Title | End-to-End Learning of Task-Oriented Dialogs |
Authors | Bing Liu, Ian Lane |
Abstract | In this thesis proposal, we address the limitations of the conventional pipeline design of task-oriented dialog systems and propose end-to-end learning solutions. We design a neural network based dialog system that is able to robustly track dialog state, interface with knowledge bases, and incorporate structured query results into system responses to successfully complete task-oriented dialogs. In learning such neural network based dialog systems, we propose hybrid offline training and online interactive learning methods. We introduce a multi-task learning method for pre-training the dialog agent in a supervised manner using task-oriented dialog corpora. The supervised-trained agent can be further improved by interacting with users and learning online from user demonstrations and feedback with imitation and reinforcement learning. To address the sample efficiency issue with online policy learning, we further propose a method that combines the learning-from-user and learning-from-simulation approaches to improve online interactive learning efficiency. |
Tasks | Multi-Task Learning, Spoken Language Understanding |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-4010/ |
PWC | https://paperswithcode.com/paper/end-to-end-learning-of-task-oriented-dialogs |
Repo | |
Framework | |
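The hybrid regime in the abstract pairs supervised imitation of dialog corpora with online policy-gradient updates from user feedback. A toy skeleton of both update steps is below; the tiny MLP policy, the state encoding, and the scalar reward handling are placeholders, not the proposal's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical skeleton: offline supervised pre-training, then online
# REINFORCE fine-tuning with user feedback as the reward signal.
policy = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def supervised_step(state, gold_action):
    """Imitate corpus actions (offline pre-training)."""
    loss = nn.functional.cross_entropy(policy(state), gold_action)
    opt.zero_grad(); loss.backward(); opt.step()

def reinforce_step(state, reward):
    """One online policy-gradient update from user feedback."""
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()
    loss = -dist.log_prob(action).mean() * reward
    opt.zero_grad(); loss.backward(); opt.step()
    return action

supervised_step(torch.randn(8, 16), torch.randint(0, 4, (8,)))
reinforce_step(torch.randn(1, 16), reward=1.0)
```

The proposal's learning-from-simulation component would supply additional `reinforce_step` calls from a user simulator, which is what improves sample efficiency before real users are involved.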
PEMT for the Public Sector - Evolution of a Solution
Title | PEMT for the Public Sector - Evolution of a Solution |
Authors | Konstantine Boukhvalov, Sandy Hogg |
Abstract | |
Tasks | |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1919/ |
PWC | https://paperswithcode.com/paper/pemt-for-the-public-sector-evolution-of-a |
Repo | |
Framework | |
GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets
Title | GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets |
Authors | Jinsung Yoon, James Jordon, Mihaela van der Schaar |
Abstract | Estimating individualized treatment effects (ITE) is a challenging task due to the need for an individual’s potential outcomes to be learned from biased data and without having access to the counterfactuals. We propose a novel method for inferring ITE based on the Generative Adversarial Nets (GANs) framework. Our method, termed Generative Adversarial Nets for inference of Individualized Treatment Effects (GANITE), is motivated by the possibility that we can capture the uncertainty in the counterfactual distributions by attempting to learn them using a GAN. We generate proxies of the counterfactual outcomes using a counterfactual generator, G, and then pass these proxies to an ITE generator, I, in order to train it. By modeling both of these using the GAN framework, we are able to infer based on the factual data, while still accounting for the unseen counterfactuals. We test our method on three real-world datasets (with both binary and multiple treatments) and show that GANITE outperforms state-of-the-art methods. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=ByKWUeWA- |
PDF | https://openreview.net/pdf?id=ByKWUeWA- |
PWC | https://paperswithcode.com/paper/ganite-estimation-of-individualized-treatment |
Repo | |
Framework | |
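Structurally, GANITE is two blocks: a counterfactual generator G that imputes the unobserved outcomes from (x, t, y_factual), and an ITE generator I trained on the completed outcome vectors. The sketch below shows only that data flow under assumed sizes and simple MLPs; the adversarial losses and discriminators that actually drive GANITE's training are omitted.

```python
import torch
import torch.nn as nn

x_dim, n_treat = 10, 2
G = nn.Sequential(nn.Linear(x_dim + n_treat + 1, 64), nn.ReLU(),
                  nn.Linear(64, n_treat))  # proxies for all potential outcomes
I = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(),
                  nn.Linear(64, n_treat))  # potential outcomes from x alone

x = torch.randn(32, x_dim)
t = nn.functional.one_hot(torch.randint(0, n_treat, (32,)), n_treat).float()
y_f = torch.randn(32, 1)                   # factual outcome only

y_all = G(torch.cat([x, t, y_f], dim=-1))  # imputed outcome vector
y_complete = t * y_f + (1 - t) * y_all     # keep the factual entry where known
y_hat = I(x)
ite_loss = nn.functional.mse_loss(y_hat, y_complete.detach())
# (In GANITE both G and I are trained adversarially against discriminators;
# those GAN losses are omitted from this sketch.)
```

The division of labor matters: G may peek at the factual outcome, but I must predict both potential outcomes from covariates alone, which is what makes it usable for new individuals.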
CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization
Title | CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization |
Authors | Frederick Tung, Greg Mori |
Abstract | Deep neural networks enable state-of-the-art accuracy on visual recognition tasks such as image classification and object detection. However, modern deep networks contain millions of learned weights; a more efficient utilization of computation resources would assist in a variety of deployment scenarios, from embedded platforms with resource constraints to computing clusters running ensembles of networks. In this paper, we combine network pruning and weight quantization in a single learning framework that performs pruning and quantization jointly, and in parallel with fine-tuning. This allows us to take advantage of the complementary nature of pruning and quantization and to recover from premature pruning errors, which is not possible with current two-stage approaches. Our proposed CLIP-Q method (Compression Learning by In-Parallel Pruning-Quantization) compresses AlexNet by 51-fold, GoogLeNet by 10-fold, and ResNet-50 by 15-fold, while preserving the uncompressed network accuracies on ImageNet. |
Tasks | Image Classification, Network Pruning, Object Detection, Quantization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Tung_CLIP-Q_Deep_Network_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Tung_CLIP-Q_Deep_Network_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/clip-q-deep-network-compression-learning-by |
Repo | |
Framework | |
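A single prune-and-quantize pass is easy to sketch: zero the smallest-magnitude weights, then snap the survivors to a small codebook. What makes CLIP-Q work is repeating this in parallel with fine-tuning on the full-precision weights so early decisions can be revisited; the uniform codebook below is an assumption (the paper learns its quantization levels).

```python
import torch

def clip_q_step(w, prune_frac=0.5, n_levels=4):
    """One in-parallel prune + quantize pass (a sketch of the idea)."""
    flat = w.abs().flatten()
    thresh = flat.kthvalue(max(1, int(prune_frac * flat.numel()))).values
    mask = w.abs() > thresh                     # pruning decision
    kept = w[mask]
    # Uniform codebook over the surviving range (paper: learned levels).
    codebook = torch.linspace(kept.min().item(), kept.max().item(), n_levels)
    nearest = (kept.unsqueeze(1) - codebook).abs().argmin(dim=1)
    q = torch.zeros_like(w)
    q[mask] = codebook[nearest]                 # quantization decision
    return q, mask

w = torch.randn(64, 64)
q, mask = clip_q_step(w)
print(mask.float().mean().item(), q.unique().numel())  # ~0.5 kept, <=5 values
```

Running this on a snapshot of `w` each training step, while gradients keep updating the full-precision `w`, is the "in-parallel" aspect: a weight pruned or badly quantized early can re-enter the network later.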
Unsupervised Hierarchical Video Prediction
Title | Unsupervised Hierarchical Video Prediction |
Authors | Nevan Wichers, Dumitru Erhan, Honglak Lee |
Abstract | Much recent research has been devoted to video prediction and generation, but mostly for short-scale time horizons. The hierarchical video prediction method by Villegas et al. (2017) is an example of a state-of-the-art method for long-term video prediction. However, their method has limited applicability in practical settings as it requires a ground truth pose (e.g., poses of joints of a human) at training time. This paper presents a long-term hierarchical video prediction model that does not have such a restriction. We show that the network learns its own higher-level structure (e.g., pose-equivalent hidden variables) that works better in cases where the ground truth pose does not fully capture all of the information needed to predict the next frame. This method gives sharper results than other video prediction methods which do not require a ground truth pose, and its efficiency is shown on the Humans 3.6M and Robot Pushing datasets. |
Tasks | Video Prediction |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkmtTJZCb |
PDF | https://openreview.net/pdf?id=rkmtTJZCb |
PWC | https://paperswithcode.com/paper/unsupervised-hierarchical-video-prediction |
Repo | |
Framework | |
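The abstract's key point is that the high-level state is learned rather than supervised by ground-truth pose. A minimal sketch of such a hierarchy follows: an encoder maps frames to a latent "pose", an LSTM predicts that latent forward in time, and a decoder renders frames from it. All sizes and the simple MLP encoder/decoder are assumptions, not the paper's networks.

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128), nn.ReLU(),
                    nn.Linear(128, 16))        # frame -> learned latent "pose"
predictor = nn.LSTM(16, 16, batch_first=True)  # latent dynamics
dec = nn.Sequential(nn.Linear(16, 128), nn.ReLU(),
                    nn.Linear(128, 64 * 64))   # latent -> frame

frames = torch.randn(2, 5, 64, 64)              # (batch, time, H, W)
latents = enc(frames.reshape(-1, 64, 64)).reshape(2, 5, 16)
pred_latents, _ = predictor(latents[:, :-1])    # predict the next latents
pred_frames = dec(pred_latents).reshape(2, 4, 64, 64)
loss = nn.functional.mse_loss(pred_frames, frames[:, 1:])
```

For long horizons, the predictor is rolled out on its own latent outputs; because only the compact latent is propagated, errors accumulate in a low-dimensional space rather than in pixels.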
Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification
Title | Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification |
Authors | Wenguan Wang, Yuanlu Xu, Jianbing Shen, Song-Chun Zhu |
Abstract | This paper proposes a knowledge-guided fashion network to solve the problem of visual fashion analysis, e.g., fashion landmark localization and clothing category classification. The suggested fashion model is leveraged with high-level human knowledge in this domain. We propose two important fashion grammars: (i) dependency grammar capturing kinematics-like relation, and (ii) symmetry grammar accounting for the bilateral symmetry of clothes. We introduce Bidirectional Convolutional Recurrent Neural Networks (BCRNNs) for efficiently approaching message passing over grammar topologies, and producing regularized landmark layouts. For enhancing clothing category classification, our fashion network is encoded with two novel attention mechanisms, i.e., landmark-aware attention and category-driven attention. The former enforces our network to focus on the functional parts of clothes, and learns domain-knowledge centered representations, leading to a supervised attention mechanism. The latter is goal-driven, which directly enhances task-related features and can be learned in an implicit, top-down manner. Experimental results on large-scale fashion datasets demonstrate the superior performance of our fashion grammar network. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Attentive_Fashion_Grammar_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Attentive_Fashion_Grammar_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/attentive-fashion-grammar-network-for-fashion |
Repo | |
Framework | |
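A hedged reading of the landmark-aware attention: predicted landmark heatmaps are collapsed into a spatial map that re-weights the clothing features, steering the classifier toward functional parts. The sketch below is a simplification of the paper's mechanism, and the residual formulation is an assumption.

```python
import torch

def landmark_aware_attention(features, landmark_heatmaps):
    """Supervised-attention sketch: fuse landmark heatmaps into one spatial
    map and use it to re-weight features (a simplification of the paper)."""
    attn = torch.sigmoid(landmark_heatmaps.sum(dim=1, keepdim=True))
    return features * attn + features  # residual re-weighting

features = torch.randn(1, 256, 28, 28)  # backbone features
heatmaps = torch.rand(1, 8, 28, 28)     # 8 predicted landmark heatmaps
attended = landmark_aware_attention(features, heatmaps)
```

Because the heatmaps themselves are trained against landmark annotations, the attention is supervised, in contrast to the category-driven attention, which the paper learns implicitly from the classification objective.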
Supervised and Unsupervised Methods for Robust Separation of Section Titles and Prose Text in Web Documents
Title | Supervised and Unsupervised Methods for Robust Separation of Section Titles and Prose Text in Web Documents |
Authors | Abhijith Athreya Mysore Gopinath, Shomir Wilson, Norman Sadeh |
Abstract | The text in many web documents is organized into a hierarchy of section titles and corresponding prose content, a structure which provides potentially exploitable information on discourse structure and topicality. However, this organization is generally discarded during text collection, and collecting it is not straightforward: the same visual organization can be implemented in a myriad of different ways in the underlying HTML. To remedy this, we present a flexible system for automatically extracting the hierarchical section titles and prose organization of web documents irrespective of differences in HTML representation. This system uses features from syntax, semantics, discourse and markup to build two models which classify HTML text into section titles and prose text. When tested on three different domains of web text, our domain-independent system achieves an overall precision of 0.82 and a recall of 0.98. The domain-dependent variation produces very high precision (0.99) at the expense of recall (0.75). These results exhibit a robust level of accuracy suitable for enhancing question answering, information extraction, and summarization. |
Tasks | Information Retrieval, Question Answering |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1099/ |
PWC | https://paperswithcode.com/paper/supervised-and-unsupervised-methods-for |
Repo | |
Framework | |
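The title-vs-prose classifier can be approximated with standard text features over each HTML node. The scikit-learn stand-in below uses only word and character n-grams; the paper's markup, syntax, semantics, and discourse features are omitted, so this is an illustration rather than a reproduction of either of its two models.

```python
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Minimal stand-in: word + character n-gram features per HTML text node.
clf = make_pipeline(
    FeatureUnion([
        ("words", TfidfVectorizer(analyzer="word")),
        ("chars", TfidfVectorizer(analyzer="char", ngram_range=(2, 4))),
    ]),
    LogisticRegression(max_iter=1000),
)
nodes = ["1. Introduction",
         "We present a flexible system for extracting section structure."]
labels = ["title", "prose"]
clf.fit(nodes, labels)
print(clf.predict(["2. Related Work"]))
```

Once nodes are labeled, reconstructing the hierarchy is a matter of nesting each prose span under the nearest preceding title, which is the organization the paper recovers irrespective of the underlying HTML.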