Paper Group NANR 13
Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders. Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection. Tübingen-Oslo Team at the VarDial 2018 Evaluation Campaign: An Analysis of N-gram Features in Language Variety Identification. Sentence Compre …
Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders
Title | Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders |
Authors | Panna Felsen, Patrick Lucey, Sujoy Ganguly |
Abstract | Simultaneously and accurately forecasting the behavior of many interacting agents is imperative for computer vision applications to be widely deployed (e.g., autonomous vehicles, security, surveillance, sports). In this paper, we present a technique using a conditional variational autoencoder which learns a model that “personalizes” prediction to individual agent behavior within a group representation. Given the volume of data available and its adversarial nature, we focus on the sport of basketball and show that our approach efficiently predicts context-specific agent motions. We find that our model generates results that are three times as accurate as previous state-of-the-art approaches (5.74 ft vs. 17.95 ft). |
Tasks | Autonomous Vehicles |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Panna_Felsen_Where_Will_They_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Panna_Felsen_Where_Will_They_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/where-will-they-go-predicting-fine-grained |
Repo | |
Framework | |
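The abstract above leaves the CVAE mechanics implicit. Below is a minimal, hypothetical PyTorch sketch of the general pattern it describes: an encoder infers a latent code from the observed history together with the future to be predicted, and a decoder reconstructs the future from the history plus a sample of that code. All layer sizes, the MLP architecture, and the flattened trajectory encoding are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class TrajectoryCVAE(nn.Module):
    def __init__(self, hist_dim, fut_dim, z_dim=32, hidden=128):
        super().__init__()
        # Encoder q(z | history, future): infers a latent "intent" code.
        self.encoder = nn.Sequential(
            nn.Linear(hist_dim + fut_dim, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)
        # Decoder p(future | history, z): predicts motion from context + sample.
        self.decoder = nn.Sequential(
            nn.Linear(hist_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, fut_dim),
        )

    def forward(self, hist, fut):
        h = self.encoder(torch.cat([hist, fut], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.decoder(torch.cat([hist, z], dim=-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon, kl

# Toy shapes: 10 agents, 5 observed and 10 future timesteps, (x, y) coords,
# flattened per example (purely an assumed encoding).
model = TrajectoryCVAE(hist_dim=10 * 5 * 2, fut_dim=10 * 10 * 2)
hist, fut = torch.randn(4, 100), torch.randn(4, 200)
recon, kl = model(hist, fut)
loss = nn.functional.mse_loss(recon, fut) + 0.1 * kl
```

At test time one would drop the encoder and sample z from the prior, which is how a CVAE produces diverse plausible futures for the same observed history.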
Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection
Title | Progressively Complementarity-Aware Fusion Network for RGB-D Salient Object Detection |
Authors | Hao Chen, Youfu Li |
Abstract | How to incorporate cross-modal complementarity sufficiently is the cornerstone question for RGB-D salient object detection. Previous works mainly address this issue by simply concatenating multi-modal features or combining unimodal predictions. In this paper, we answer this question from two perspectives: (1) We argue that if the complementary part can be modelled more explicitly, the cross-modal complement is likely to be better captured. To this end, we design a novel complementarity-aware fusion (CA-Fuse) module for use with a Convolutional Neural Network (CNN). By introducing cross-modal residual functions and complementarity-aware supervisions in each CA-Fuse module, the problem of learning complementary information from the paired modality is explicitly posed as asymptotically approximating the residual function. (2) We explore the complement across all levels. By cascading the CA-Fuse modules and adding level-wise supervision densely from deep to shallow, the cross-level complement can be selected and combined progressively. The proposed RGB-D fusion network disambiguates both cross-modal and cross-level fusion processes and enables more complete fusion results. Experiments on public datasets show the effectiveness of the proposed CA-Fuse module and the RGB-D salient object detection network. |
Tasks | Object Detection, Salient Object Detection |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_Progressively_Complementarity-Aware_Fusion_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_Progressively_Complementarity-Aware_Fusion_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/progressively-complementarity-aware-fusion |
Repo | |
Framework | |
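As a reading aid, here is a hedged sketch of the cross-modal residual idea the abstract describes: each modality's feature map is augmented by a residual predicted from the paired modality, so the network only has to learn the complement. Channel counts, kernel sizes, and the final 1x1 fusion are assumptions for illustration, not the paper's exact CA-Fuse module.

```python
import torch
import torch.nn as nn

class CAFuseSketch(nn.Module):
    """Cross-modal residual fusion: each stream learns only the complement
    the other modality is missing (a sketch, not the paper's module)."""
    def __init__(self, channels):
        super().__init__()
        self.rgb_to_depth = nn.Conv2d(channels, channels, 3, padding=1)
        self.depth_to_rgb = nn.Conv2d(channels, channels, 3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f_rgb, f_depth):
        # Residual functions: approximate what the paired modality lacks.
        rgb_enh = f_rgb + self.depth_to_rgb(f_depth)
        depth_enh = f_depth + self.rgb_to_depth(f_rgb)
        return self.fuse(torch.cat([rgb_enh, depth_enh], dim=1))

f_rgb, f_depth = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
fused = CAFuseSketch(64)(f_rgb, f_depth)  # (1, 64, 32, 32)
```

In the paper this runs at every level of the decoder with level-wise supervision; the sketch shows a single level only.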
T"ubingen-Oslo Team at the VarDial 2018 Evaluation Campaign: An Analysis of N-gram Features in Language Variety Identification
Title | T"ubingen-Oslo Team at the VarDial 2018 Evaluation Campaign: An Analysis of N-gram Features in Language Variety Identification |
Authors | {\c{C}}a{\u{g}}r{\i} {\c{C}}{"o}ltekin, Taraka Rama, Verena Blaschke |
Abstract | This paper describes our systems for the VarDial 2018 evaluation campaign. We participated in all language identification tasks, namely, Arabic dialect identification (ADI), German dialect identification (GDI), discriminating between Dutch and Flemish in Subtitles (DFS), and Indo-Aryan Language Identification (ILI). In all of the tasks, we only used textual transcripts (not using audio features for ADI). We submitted system runs based on support vector machine classifiers (SVMs) with bag of character and word n-grams as features, and gated bidirectional recurrent neural networks (RNNs) using units of characters and words. Our SVM models outperformed our RNN models in all tasks, obtaining the first place on the DFS task, third place on the ADI task, and second place on others according to the official rankings. As well as describing the models we used in the shared task participation, we present an analysis of the n-gram features used by the SVM models in each task, and also report additional results (that were run after the official competition deadline) on the GDI surprise dialect track. |
Tasks | Document Classification, Language Identification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3906/ |
PWC | https://paperswithcode.com/paper/ta14bingen-oslo-team-at-the-vardial-2018 |
Repo | |
Framework | |
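The winning configuration is the classic recipe the abstract names: bags of character and word n-grams feeding a linear SVM. A minimal scikit-learn version of the character-n-gram variant is below; the exact n-gram ranges and feature weighting the team used are assumptions here.

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Character 1-5 gram features with a linear SVM (ranges/weighting assumed).
clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 5), sublinear_tf=True),
    LinearSVC(),
)
texts = ["mir geht's guet", "es geht mir gut"]  # toy dialect samples
labels = ["dialect_a", "dialect_b"]
clf.fit(texts, labels)
print(clf.predict(["wie geht's dir"]))
```

A parallel word-level vectorizer can be stacked alongside the character one; the paper's analysis focuses on which of these n-gram features each task's model actually relies on.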
Sentence Compression for Arbitrary Languages via Multilingual Pivoting
Title | Sentence Compression for Arbitrary Languages via Multilingual Pivoting |
Authors | Jonathan Mallinson, Rico Sennrich, Mirella Lapata |
Abstract | In this paper we advocate the use of bilingual corpora, which are abundantly available, for training sentence compression models. Our approach borrows much of its machinery from neural machine translation and leverages bilingual pivoting: compressions are obtained by translating a source string into a foreign language and then back-translating it into the source while controlling the translation length. Our model can be trained for any language as long as a bilingual corpus is available, and it performs arbitrary rewrites without access to compression-specific data. We release Moss, a new parallel Multilingual Compression dataset for English, German, and French, which can be used to evaluate compression models across languages and genres. |
Tasks | Machine Translation, Sentence Compression, Text Generation, Text Summarization |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1267/ |
PWC | https://paperswithcode.com/paper/sentence-compression-for-arbitrary-languages |
Repo | |
Framework | |
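The pivoting idea in the abstract reduces to a small amount of control flow: translate into the pivot language, then back-translate under a length cap. The sketch below shows only that plumbing, with caller-supplied translation functions standing in for the paper's NMT models; the `ratio` parameter and the toy "translators" in the demo are purely illustrative.

```python
def compress(sentence, translate_fwd, translate_back, ratio=0.7):
    """Bilingual-pivoting compression sketch. translate_fwd / translate_back
    are caller-supplied NMT functions; translate_back must honor a max_len
    cap (in a real system, via length-constrained beam search)."""
    pivoted = translate_fwd(sentence)
    max_len = max(1, int(ratio * len(sentence.split())))
    return translate_back(pivoted, max_len=max_len)

# Plumbing check with identity "translators" (stand-ins for real NMT models):
out = compress(
    "the quick brown fox jumps over the lazy dog",
    translate_fwd=lambda s: s,
    translate_back=lambda s, max_len: " ".join(s.split()[:max_len]),
)
print(out)  # "the quick brown fox jumps over"  (9 words * 0.7 -> 6 words)
```

The interesting work happens inside `translate_back`: because the decoder must stay within the length budget, it is forced to rewrite rather than merely truncate.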
Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence
Title | Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence |
Authors | Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, Joseph J. Lim |
Abstract | In this paper, we address the task of multi-view novel view synthesis, where we are interested in synthesizing a target image with an arbitrary camera pose from given source images. We propose an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision. Specifically, our model consists of a flow prediction module and a pixel generation module to directly leverage information presented in source views as well as hallucinate missing pixels from statistical priors. To merge the predictions produced by the two modules given multi-view source images, we introduce a self-learned confidence aggregation mechanism. We evaluate our model on images rendered from 3D object models as well as real and synthesized scenes. We demonstrate that our model is able to achieve state-of-the-art results as well as progressively improve its predictions when more source images are available. |
Tasks | Novel View Synthesis |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Shao-Hua_Sun_Multi-view_to_Novel_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Shao-Hua_Sun_Multi-view_to_Novel_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/multi-view-to-novel-view-synthesizing-novel |
Repo | |
Framework | |
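The self-learned confidence aggregation in the abstract amounts to a per-pixel weighted average of the two module outputs. A minimal sketch, assuming per-pixel confidence maps and a softmax normalization (the normalization choice is an assumption; in the paper the confidences are predicted by the network itself):

```python
import torch

def aggregate(flow_pred, pixel_pred, flow_conf, pixel_conf):
    """Confidence-weighted merge of the two module outputs (a sketch)."""
    w = torch.softmax(torch.stack([flow_conf, pixel_conf]), dim=0)
    return w[0] * flow_pred + w[1] * pixel_pred

flow_pred = torch.rand(1, 3, 64, 64)   # pixels warped from source views
pixel_pred = torch.rand(1, 3, 64, 64)  # pixels hallucinated from priors
flow_conf = torch.rand(1, 1, 64, 64)   # per-pixel self-learned confidence
pixel_conf = torch.rand(1, 1, 64, 64)
novel_view = aggregate(flow_pred, pixel_pred, flow_conf, pixel_conf)
```

With multiple source views, each view contributes its own prediction/confidence pair and the softmax simply runs over more entries, which is what lets results improve as sources are added.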
Modeling Facial Geometry Using Compositional VAEs
Title | Modeling Facial Geometry Using Compositional VAEs |
Authors | Timur Bagautdinov, Chenglei Wu, Jason Saragih, Pascal Fua, Yaser Sheikh |
Abstract | We propose a method for learning non-linear face geometry representations using deep generative models. Our model is a variational autoencoder with multiple levels of hidden variables, where lower layers capture global geometry and higher ones encode more local deformations. Based on this, we propose a new parameterization of facial geometry that naturally decomposes the structure of the human face into a set of semantically meaningful levels of detail. This parameterization enables us to perform model fitting while capturing varying levels of detail under different types of geometrical constraints. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Bagautdinov_Modeling_Facial_Geometry_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Bagautdinov_Modeling_Facial_Geometry_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/modeling-facial-geometry-using-compositional |
Repo | |
Framework | |
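A hedged sketch of the multi-level latent idea: one latent level decodes coarse global geometry and another adds local residual detail, each with its own KL term. The two-level MLP version below is a simplification under those assumptions; the paper's compositional, convolutional structure and mesh representation are not reproduced.

```python
import torch
import torch.nn as nn

class TwoLevelVAE(nn.Module):
    """Two latent levels: a global code for coarse geometry plus a local
    code for fine deformations (a simplified sketch of the hierarchy)."""
    def __init__(self, n_verts=100, z_global=8, z_local=32, hidden=256):
        super().__init__()
        d = n_verts * 3
        self.enc = nn.Sequential(nn.Linear(d, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({
            "global": nn.Linear(hidden, 2 * z_global),
            "local": nn.Linear(hidden, 2 * z_local),
        })
        self.dec_global = nn.Linear(z_global, d)   # coarse shape
        self.dec_local = nn.Linear(z_local, d)     # residual detail

    def forward(self, x):
        h = self.enc(x)
        kl, zs = 0.0, {}
        for name, head in self.heads.items():
            mu, logvar = head(h).chunk(2, dim=-1)
            zs[name] = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            kl = kl + (-0.5 * (1 + logvar - mu**2 - logvar.exp())).sum(-1).mean()
        recon = self.dec_global(zs["global"]) + self.dec_local(zs["local"])
        return recon, kl

x = torch.randn(4, 300)  # 100 vertices, xyz, flattened
recon, kl = TwoLevelVAE()(x)
```

Model fitting at a chosen level of detail then corresponds to optimizing only the latents up to that level while freezing or zeroing the finer ones.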
Generating Summaries of Sets of Consumer Products: Learning from Experiments
Title | Generating Summaries of Sets of Consumer Products: Learning from Experiments |
Authors | Kittipitch Kuptavanich, Ehud Reiter, Kees Van Deemter, Advaith Siddharthan |
Abstract | We explored the task of creating a textual summary describing a large set of objects characterised by a small number of features, using an e-commerce dataset. When a set of consumer products is large and varied, it can be difficult for a consumer to understand how the products in the set differ; consequently, it can be challenging to choose the most suitable product from the set. To assist consumers, we generated high-level summaries of product sets. Two generation algorithms are presented, discussed, and evaluated with human users. Our evaluation results suggest a positive contribution to consumers' understanding of the domain. |
Tasks | Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6548/ |
PWC | https://paperswithcode.com/paper/generating-summaries-of-sets-of-consumer |
Repo | |
Framework | |
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
Title | Learning Robust Rewards with Adversarial Inverse Reinforcement Learning |
Authors | Justin Fu, Katie Luo, Sergey Levine |
Abstract | Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering. Deep reinforcement learning methods can remove the need for explicit engineering of policy or value features, but still require a manually specified reward function. Inverse reinforcement learning holds the promise of automatic reward acquisition, but has proven exceptionally difficult to apply to large, high-dimensional problems with unknown dynamics. In this work, we propose AIRL, a practical and scalable inverse reinforcement learning algorithm based on an adversarial reward learning formulation that is competitive with direct imitation learning algorithms. Additionally, we show that AIRL is able to recover portable reward functions that are robust to changes in dynamics, enabling us to learn policies even under significant variation in the environment seen during training. |
Tasks | Decision Making, Imitation Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkHywl-A- |
PDF | https://openreview.net/pdf?id=rkHywl-A- |
PWC | https://paperswithcode.com/paper/learning-robust-rewards-with-adverserial |
Repo | |
Framework | |
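The core of AIRL is the discriminator's special form, D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)), which ties the discriminator to a learned reward-like function f. A small, numerically stable sketch of that computation (the surrounding training loop, networks, and the state-only reward decomposition are not shown):

```python
import torch

def airl_log_discriminator(f_value, log_pi):
    """AIRL discriminator form in log space:
    log D = f - logsumexp([f, log pi(a|s)])."""
    return f_value - torch.logsumexp(torch.stack([f_value, log_pi]), dim=0)

f_value = torch.tensor([0.5, -1.2])  # f_theta(s, a) for two samples
log_pi = torch.tensor([-2.3, -0.1])  # policy log-probs for the same (s, a)
log_D = airl_log_discriminator(f_value, log_pi)
# Policy reward log D - log(1 - D); algebraically this equals f - log pi.
reward = log_D - torch.log1p(-log_D.exp())
```

Because the reward reduces to f minus the policy's log-probability, training the policy against it drives pi toward exp(f), which is what allows f to recover a portable reward.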
End-to-End Learning of Task-Oriented Dialogs
Title | End-to-End Learning of Task-Oriented Dialogs |
Authors | Bing Liu, Ian Lane |
Abstract | In this thesis proposal, we address the limitations of the conventional pipeline design of task-oriented dialog systems and propose end-to-end learning solutions. We design a neural network based dialog system that is able to robustly track dialog state, interface with knowledge bases, and incorporate structured query results into system responses to successfully complete task-oriented dialogs. In learning such neural network based dialog systems, we propose hybrid offline training and online interactive learning methods. We introduce a multi-task learning method for pre-training the dialog agent in a supervised manner using task-oriented dialog corpora. The supervised-trained agent can be further improved by interacting with users and learning online from user demonstrations and feedback with imitation and reinforcement learning. To address the sample efficiency issue with online policy learning, we further propose a method that combines the learning-from-user and learning-from-simulation approaches to improve online interactive learning efficiency. |
Tasks | Multi-Task Learning, Spoken Language Understanding |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-4010/ |
PWC | https://paperswithcode.com/paper/end-to-end-learning-of-task-oriented-dialogs |
Repo | |
Framework | |
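The hybrid regime in the abstract pairs supervised imitation of dialog corpora with online policy-gradient updates from user feedback. A toy skeleton of both update steps is below; the tiny MLP policy, the state encoding, and the scalar reward handling are placeholders, not the proposal's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical skeleton: offline supervised pre-training, then online
# REINFORCE fine-tuning with user feedback as the reward signal.
policy = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 4))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def supervised_step(state, gold_action):
    """Imitate corpus actions (offline pre-training)."""
    loss = nn.functional.cross_entropy(policy(state), gold_action)
    opt.zero_grad(); loss.backward(); opt.step()

def reinforce_step(state, reward):
    """One online policy-gradient update from user feedback."""
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()
    loss = -dist.log_prob(action).mean() * reward
    opt.zero_grad(); loss.backward(); opt.step()
    return action

supervised_step(torch.randn(8, 16), torch.randint(0, 4, (8,)))
reinforce_step(torch.randn(1, 16), reward=1.0)
```

The proposal's learning-from-simulation component would supply additional `reinforce_step` calls from a user simulator, which is what improves sample efficiency before real users are involved.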
PEMT for the Public Sector - Evolution of a Solution
Title | PEMT for the Public Sector - Evolution of a Solution |
Authors | Konstantine Boukhvalov, Sandy Hogg |
Abstract | |
Tasks | |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1919/ |
PWC | https://paperswithcode.com/paper/pemt-for-the-public-sector-evolution-of-a |
Repo | |
Framework | |
GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets
Title | GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets |
Authors | Jinsung Yoon, James Jordon, Mihaela van der Schaar |
Abstract | Estimating individualized treatment effects (ITE) is a challenging task due to the need for an individual’s potential outcomes to be learned from biased data and without having access to the counterfactuals. We propose a novel method for inferring ITE based on the Generative Adversarial Nets (GANs) framework. Our method, termed Generative Adversarial Nets for inference of Individualized Treatment Effects (GANITE), is motivated by the possibility that we can capture the uncertainty in the counterfactual distributions by attempting to learn them using a GAN. We generate proxies of the counterfactual outcomes using a counterfactual generator, G, and then pass these proxies to an ITE generator, I, in order to train it. By modeling both of these using the GAN framework, we are able to infer based on the factual data, while still accounting for the unseen counterfactuals. We test our method on three real-world datasets (with both binary and multiple treatments) and show that GANITE outperforms state-of-the-art methods. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=ByKWUeWA- |
PDF | https://openreview.net/pdf?id=ByKWUeWA- |
PWC | https://paperswithcode.com/paper/ganite-estimation-of-individualized-treatment |
Repo | |
Framework | |
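Structurally, GANITE is two blocks: a counterfactual generator G that imputes the unobserved outcomes from (x, t, y_factual), and an ITE generator I trained on the completed outcome vectors. The sketch below shows only that data flow under assumed sizes and simple MLPs; the adversarial losses and discriminators that actually drive GANITE's training are omitted.

```python
import torch
import torch.nn as nn

x_dim, n_treat = 10, 2
G = nn.Sequential(nn.Linear(x_dim + n_treat + 1, 64), nn.ReLU(),
                  nn.Linear(64, n_treat))  # proxies for all potential outcomes
I = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(),
                  nn.Linear(64, n_treat))  # potential outcomes from x alone

x = torch.randn(32, x_dim)
t = nn.functional.one_hot(torch.randint(0, n_treat, (32,)), n_treat).float()
y_f = torch.randn(32, 1)                   # factual outcome only

y_all = G(torch.cat([x, t, y_f], dim=-1))  # imputed outcome vector
y_complete = t * y_f + (1 - t) * y_all     # keep the factual entry where known
y_hat = I(x)
ite_loss = nn.functional.mse_loss(y_hat, y_complete.detach())
# (In GANITE both G and I are trained adversarially against discriminators;
# those GAN losses are omitted from this sketch.)
```

The division of labor matters: G may peek at the factual outcome, but I must predict both potential outcomes from covariates alone, which is what makes it usable for new individuals.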
CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization
Title | CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization |
Authors | Frederick Tung, Greg Mori |
Abstract | Deep neural networks enable state-of-the-art accuracy on visual recognition tasks such as image classification and object detection. However, modern deep networks contain millions of learned weights; a more efficient utilization of computation resources would assist in a variety of deployment scenarios, from embedded platforms with resource constraints to computing clusters running ensembles of networks. In this paper, we combine network pruning and weight quantization in a single learning framework that performs pruning and quantization jointly, and in parallel with fine-tuning. This allows us to take advantage of the complementary nature of pruning and quantization and to recover from premature pruning errors, which is not possible with current two-stage approaches. Our proposed CLIP-Q method (Compression Learning by In-Parallel Pruning-Quantization) compresses AlexNet by 51-fold, GoogLeNet by 10-fold, and ResNet-50 by 15-fold, while preserving the uncompressed network accuracies on ImageNet. |
Tasks | Image Classification, Network Pruning, Object Detection, Quantization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Tung_CLIP-Q_Deep_Network_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Tung_CLIP-Q_Deep_Network_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/clip-q-deep-network-compression-learning-by |
Repo | |
Framework | |
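A single prune-and-quantize pass is easy to sketch: zero the smallest-magnitude weights, then snap the survivors to a small codebook. What makes CLIP-Q work is repeating this in parallel with fine-tuning on the full-precision weights so early decisions can be revisited; the uniform codebook below is an assumption (the paper learns its quantization levels).

```python
import torch

def clip_q_step(w, prune_frac=0.5, n_levels=4):
    """One in-parallel prune + quantize pass (a sketch of the idea)."""
    flat = w.abs().flatten()
    thresh = flat.kthvalue(max(1, int(prune_frac * flat.numel()))).values
    mask = w.abs() > thresh                     # pruning decision
    kept = w[mask]
    # Uniform codebook over the surviving range (paper: learned levels).
    codebook = torch.linspace(kept.min().item(), kept.max().item(), n_levels)
    nearest = (kept.unsqueeze(1) - codebook).abs().argmin(dim=1)
    q = torch.zeros_like(w)
    q[mask] = codebook[nearest]                 # quantization decision
    return q, mask

w = torch.randn(64, 64)
q, mask = clip_q_step(w)
print(mask.float().mean().item(), q.unique().numel())  # ~0.5 kept, <=5 values
```

Running this on a snapshot of `w` each training step, while gradients keep updating the full-precision `w`, is the "in-parallel" aspect: a weight pruned or badly quantized early can re-enter the network later.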
Unsupervised Hierarchical Video Prediction
Title | Unsupervised Hierarchical Video Prediction |
Authors | Nevan Wichers, Dumitru Erhan, Honglak Lee |
Abstract | Much recent research has been devoted to video prediction and generation, but mostly for short-scale time horizons. The hierarchical video prediction method by Villegas et al. (2017) is an example of a state-of-the-art method for long-term video prediction. However, their method has limited applicability in practical settings as it requires a ground truth pose (e.g., poses of joints of a human) at training time. This paper presents a long-term hierarchical video prediction model that does not have such a restriction. We show that the network learns its own higher-level structure (e.g., pose-equivalent hidden variables) that works better in cases where the ground truth pose does not fully capture all of the information needed to predict the next frame. This method gives sharper results than other video prediction methods which do not require a ground truth pose, and its efficiency is shown on the Humans 3.6M and Robot Pushing datasets. |
Tasks | Video Prediction |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkmtTJZCb |
PDF | https://openreview.net/pdf?id=rkmtTJZCb |
PWC | https://paperswithcode.com/paper/unsupervised-hierarchical-video-prediction |
Repo | |
Framework | |
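The abstract's key point is that the high-level state is learned rather than supervised by ground-truth pose. A minimal sketch of such a hierarchy follows: an encoder maps frames to a latent "pose", an LSTM predicts that latent forward in time, and a decoder renders frames from it. All sizes and the simple MLP encoder/decoder are assumptions, not the paper's networks.

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128), nn.ReLU(),
                    nn.Linear(128, 16))        # frame -> learned latent "pose"
predictor = nn.LSTM(16, 16, batch_first=True)  # latent dynamics
dec = nn.Sequential(nn.Linear(16, 128), nn.ReLU(),
                    nn.Linear(128, 64 * 64))   # latent -> frame

frames = torch.randn(2, 5, 64, 64)              # (batch, time, H, W)
latents = enc(frames.reshape(-1, 64, 64)).reshape(2, 5, 16)
pred_latents, _ = predictor(latents[:, :-1])    # predict the next latents
pred_frames = dec(pred_latents).reshape(2, 4, 64, 64)
loss = nn.functional.mse_loss(pred_frames, frames[:, 1:])
```

For long horizons, the predictor is rolled out on its own latent outputs; because only the compact latent is propagated, errors accumulate in a low-dimensional space rather than in pixels.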
Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification
Title | Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification |
Authors | Wenguan Wang, Yuanlu Xu, Jianbing Shen, Song-Chun Zhu |
Abstract | This paper proposes a knowledge-guided fashion network to solve the problem of visual fashion analysis, e.g., fashion landmark localization and clothing category classification. The suggested fashion model is leveraged with high-level human knowledge in this domain. We propose two important fashion grammars: (i) dependency grammar capturing kinematics-like relation, and (ii) symmetry grammar accounting for the bilateral symmetry of clothes. We introduce Bidirectional Convolutional Recurrent Neural Networks (BCRNNs) for efficiently approaching message passing over grammar topologies, and producing regularized landmark layouts. For enhancing clothing category classification, our fashion network is encoded with two novel attention mechanisms, i.e., landmark-aware attention and category-driven attention. The former enforces our network to focus on the functional parts of clothes, and learns domain-knowledge centered representations, leading to a supervised attention mechanism. The latter is goal-driven, which directly enhances task-related features and can be learned in an implicit, top-down manner. Experimental results on large-scale fashion datasets demonstrate the superior performance of our fashion grammar network. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Attentive_Fashion_Grammar_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Attentive_Fashion_Grammar_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/attentive-fashion-grammar-network-for-fashion |
Repo | |
Framework | |
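A hedged reading of the landmark-aware attention: predicted landmark heatmaps are collapsed into a spatial map that re-weights the clothing features, steering the classifier toward functional parts. The sketch below is a simplification of the paper's mechanism, and the residual formulation is an assumption.

```python
import torch

def landmark_aware_attention(features, landmark_heatmaps):
    """Supervised-attention sketch: fuse landmark heatmaps into one spatial
    map and use it to re-weight features (a simplification of the paper)."""
    attn = torch.sigmoid(landmark_heatmaps.sum(dim=1, keepdim=True))
    return features * attn + features  # residual re-weighting

features = torch.randn(1, 256, 28, 28)  # backbone features
heatmaps = torch.rand(1, 8, 28, 28)     # 8 predicted landmark heatmaps
attended = landmark_aware_attention(features, heatmaps)
```

Because the heatmaps themselves are trained against landmark annotations, the attention is supervised, in contrast to the category-driven attention, which the paper learns implicitly from the classification objective.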
Supervised and Unsupervised Methods for Robust Separation of Section Titles and Prose Text in Web Documents
Title | Supervised and Unsupervised Methods for Robust Separation of Section Titles and Prose Text in Web Documents |
Authors | Abhijith Athreya Mysore Gopinath, Shomir Wilson, Norman Sadeh |
Abstract | The text in many web documents is organized into a hierarchy of section titles and corresponding prose content, a structure which provides potentially exploitable information on discourse structure and topicality. However, this organization is generally discarded during text collection, and collecting it is not straightforward: the same visual organization can be implemented in a myriad of different ways in the underlying HTML. To remedy this, we present a flexible system for automatically extracting the hierarchical section titles and prose organization of web documents irrespective of differences in HTML representation. This system uses features from syntax, semantics, discourse and markup to build two models which classify HTML text into section titles and prose text. When tested on three different domains of web text, our domain-independent system achieves an overall precision of 0.82 and a recall of 0.98. The domain-dependent variation produces very high precision (0.99) at the expense of recall (0.75). These results exhibit a robust level of accuracy suitable for enhancing question answering, information extraction, and summarization. |
Tasks | Information Retrieval, Question Answering |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1099/ |
PWC | https://paperswithcode.com/paper/supervised-and-unsupervised-methods-for |
Repo | |
Framework | |
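The title-vs-prose classifier can be approximated with standard text features over each HTML node. The scikit-learn stand-in below uses only word and character n-grams; the paper's markup, syntax, semantics, and discourse features are omitted, so this is an illustration rather than a reproduction of either of its two models.

```python
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Minimal stand-in: word + character n-gram features per HTML text node.
clf = make_pipeline(
    FeatureUnion([
        ("words", TfidfVectorizer(analyzer="word")),
        ("chars", TfidfVectorizer(analyzer="char", ngram_range=(2, 4))),
    ]),
    LogisticRegression(max_iter=1000),
)
nodes = ["1. Introduction",
         "We present a flexible system for extracting section structure."]
labels = ["title", "prose"]
clf.fit(nodes, labels)
print(clf.predict(["2. Related Work"]))
```

Once nodes are labeled, reconstructing the hierarchy is a matter of nesting each prose span under the nearest preceding title, which is the organization the paper recovers irrespective of the underlying HTML.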