Paper Group NAWR 1
Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models. A Prism Module for Semantic Disentanglement in Name Entity Recognition. ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features. IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstr …
Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models
Title | Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models |
Authors | Tao Yu, Christopher M. De Sa |
Abstract | Hyperbolic embeddings achieve excellent performance when embedding hierarchical data structures like synonym or type hierarchies, but they can be limited by numerical error when ordinary floating-point numbers are used to represent points in hyperbolic space. Standard models such as the Poincaré disk and the Lorentz model have unbounded numerical error as points get far from the origin. To address this, we propose a new model which uses an integer-based tiling to represent *any* point in hyperbolic space with provably bounded numerical error. This allows us to learn high-precision embeddings without using BigFloats, and enables us to store the resulting embeddings with fewer bits. We evaluate our tiling-based model empirically, and show that it can both compress hyperbolic embeddings (down to 2% of a Poincaré embedding on WordNet Nouns) and learn more accurate embeddings on real-world datasets. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8476-numerically-accurate-hyperbolic-embeddings-using-tiling-based-models |
http://papers.nips.cc/paper/8476-numerically-accurate-hyperbolic-embeddings-using-tiling-based-models.pdf | |
PWC | https://paperswithcode.com/paper/numerically-accurate-hyperbolic-embeddings |
Repo | https://github.com/ydtydr/HyperbolicTiling_Learning |
Framework | pytorch |
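To ground the numerical-error claim, here is a minimal numpy sketch (mine, not the authors' code) of the standard Poincaré-ball distance. Deep-hierarchy nodes are embedded near the boundary of the disk, where the (1 - ||x||^2) factors underflow; the tiling-based model is designed to keep error bounded in exactly this regime.

```python
import numpy as np

def poincare_dist(x, y):
    """Poincare-ball distance: arcosh(1 + 2|x-y|^2 / ((1-|x|^2)(1-|y|^2))).
    All arithmetic is kept in the input dtype to show precision effects."""
    one, two = x.dtype.type(1), x.dtype.type(2)
    num = two * np.sum((x - y) ** 2)
    den = (one - np.sum(x ** 2)) * (one - np.sum(y ** 2))
    return np.arccosh(one + num / den)

# Near the boundary (|x| -> 1) the float32 denominator underflows to 0,
# and the last case below degrades to inf while float64 is still finite.
with np.errstate(divide="ignore"):
    for r in [0.9, 0.999, 0.999999, 1.0 - 1e-8]:
        x, y = np.array([r, 0.0]), np.array([0.0, r])
        print(r,
              poincare_dist(x, y),                                   # float64
              poincare_dist(x.astype(np.float32), y.astype(np.float32)))
```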
A Prism Module for Semantic Disentanglement in Name Entity Recognition
Title | A Prism Module for Semantic Disentanglement in Name Entity Recognition |
Authors | Kun Liu, Shen Li, Daqi Zheng, Zhengdong Lu, Sheng Gao, Si Li |
Abstract | Natural Language Processing has been perplexed for many years by the problem that multiple semantics are mixed inside a word, even with the help of context. To solve this problem, we propose a prism module to disentangle the semantic aspects of words and reduce noise at the input layer of a model. In the prism module, some words are selectively replaced with task-related semantic aspects; these denoised word representations can then be fed into downstream tasks, making those tasks easier. In addition, we introduce a structure to train this module jointly with the downstream model without additional data. This module can be easily integrated into the downstream model and significantly improves the performance of baselines on the named entity recognition (NER) task. The ablation analysis demonstrates the rationality of the method. As a side effect, the proposed method also provides a way to visualize the contribution of each word. |
Tasks | Named Entity Recognition |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1532/ |
https://www.aclweb.org/anthology/P19-1532 | |
PWC | https://paperswithcode.com/paper/a-prism-module-for-semantic-disentanglement |
Repo | https://github.com/liukun95/Prism-Module |
Framework | tf |
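A hypothetical PyTorch sketch of the core idea as the abstract describes it: each word embedding is softly replaced by a mixture of learned, task-related "aspect" vectors. The gating/attention design, dimensions, and aspect count below are my assumptions, not the paper's architecture (the authors' TensorFlow code is in the repo above).

```python
import torch
import torch.nn as nn

class PrismModule(nn.Module):
    """Sketch: per token, interpolate between the original word embedding
    and a mixture of task-related aspect vectors, via a learned gate."""
    def __init__(self, emb_dim, n_aspects):
        super().__init__()
        self.aspects = nn.Parameter(torch.randn(n_aspects, emb_dim))
        self.gate = nn.Linear(emb_dim, 1)           # keep word vs. use aspects
        self.attn = nn.Linear(emb_dim, n_aspects)   # which aspect(s) to use

    def forward(self, emb):                          # emb: (batch, seq, dim)
        aspect_mix = torch.softmax(self.attn(emb), dim=-1) @ self.aspects
        g = torch.sigmoid(self.gate(emb))            # (batch, seq, 1)
        return g * emb + (1 - g) * aspect_mix        # denoised representation

x = torch.randn(2, 5, 64)
print(PrismModule(64, 8)(x).shape)                   # torch.Size([2, 5, 64])
```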
ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features
Title | ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features |
Authors | Yue Wu, Wael AbdAlmageed, Premkumar Natarajan |
Abstract | To fight against real-life image forgery, which commonly involves different types and combined manipulations, we propose a unified deep neural architecture called ManTra-Net. Unlike many existing solutions, ManTra-Net is an end-to-end network that performs both detection and localization without extra preprocessing and postprocessing. ManTra-Net is a fully convolutional network and handles images of arbitrary sizes and many known forgery types such as splicing, copy-move, removal, enhancement, and even unknown types. This paper has three salient contributions. We design a simple yet effective self-supervised learning task to learn robust image manipulation traces from classifying 385 image manipulation types. Further, we formulate the forgery localization problem as a local anomaly detection problem, design a Z-score feature to capture local anomalies, and propose a novel long short-term memory solution to assess local anomalies. Finally, we carefully conduct ablation experiments to systematically optimize the proposed network design. Our extensive experimental results demonstrate the generalizability, robustness and superiority of ManTra-Net, not only in single types of manipulations/forgeries, but also in their complicated combinations. |
Tasks | Anomaly Detection |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_ManTra-Net_Manipulation_Tracing_Network_for_Detection_and_Localization_of_Image_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_ManTra-Net_Manipulation_Tracing_Network_for_Detection_and_Localization_of_Image_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/mantra-net-manipulation-tracing-network-for |
Repo | https://github.com/ISICV/ManTraNet |
Framework | tf |
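A hedged numpy sketch of the Z-score anomaly feature named in the abstract: measure how far each pixel's manipulation-trace feature deviates from the image-wide dominant feature. The actual network computes such deviations over several local window sizes and feeds them to an LSTM-based decision module; this sketch uses global statistics only.

```python
import numpy as np

def zscore_feature(feat, eps=1e-6):
    """feat: (H, W, C) manipulation-trace features. Returns a per-pixel
    deviation from the dominant (image-wide mean) feature, in units of
    the image-wide standard deviation."""
    mu = feat.mean(axis=(0, 1), keepdims=True)       # dominant feature
    sigma = feat.std(axis=(0, 1), keepdims=True)
    return (feat - mu) / (sigma + eps)

feat = np.random.rand(64, 64, 8)
feat[20:30, 20:30] += 3.0            # a "foreign" region stands out
print(np.abs(zscore_feature(feat)).mean(axis=-1).max())
```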
IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction
Title | IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction |
Authors | Dominic Jack, Frederic Maire, Sareh Shirazi, Anders Eriksson |
Abstract | Inferring 3D scene information from 2D observations is an open problem in computer vision. We propose using a deep-learning based energy minimization framework to learn a consistency measure between 2D observations and a proposed world model, and demonstrate that this framework can be trained end-to-end to produce consistent and realistic inferences. We evaluate the framework on human pose estimation and voxel-based object reconstruction benchmarks and show competitive results can be achieved with relatively shallow networks with drastically fewer learned parameters and floating point operations than conventional deep-learning approaches. |
Tasks | Object Reconstruction, Pose Estimation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Jack_IGE-Net_Inverse_Graphics_Energy_Networks_for_Human_Pose_Estimation_and_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Jack_IGE-Net_Inverse_Graphics_Energy_Networks_for_Human_Pose_Estimation_and_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/ige-net-inverse-graphics-energy-networks-for |
Repo | https://github.com/jackd/ige |
Framework | tf |
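A toy PyTorch sketch of energy-based inference as described in the abstract: a small network scores the consistency between a 2D observation and a candidate 3D state, and inference refines the state by gradient descent on that energy. The architecture, dimensions, and step count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Learned consistency measure E(observation, state); sizes are illustrative.
energy = nn.Sequential(nn.Linear(2 + 3, 32), nn.ReLU(), nn.Linear(32, 1))

obs = torch.randn(1, 2)                          # a 2D observation
state = torch.zeros(1, 3, requires_grad=True)    # candidate 3D world state
opt = torch.optim.SGD([state], lr=0.1)

for _ in range(20):                              # inner-loop inference
    opt.zero_grad()
    e = energy(torch.cat([obs, state], dim=-1)).sum()
    e.backward()
    opt.step()

print("refined state:", state.detach())
print("final energy:", energy(torch.cat([obs, state], dim=-1)).item())
```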
STREETS: A Novel Camera Network Dataset for Traffic Flow
Title | STREETS: A Novel Camera Network Dataset for Traffic Flow |
Authors | Corey Snyder, Minh Do |
Abstract | In this paper, we introduce STREETS, a novel traffic flow dataset from publicly available web cameras in the suburbs of Chicago, IL. We seek to address the limitations of existing datasets in this area. Many such datasets lack a coherent traffic network graph to describe the relationship between sensors. The datasets that do provide a graph depict traffic flow in urban population centers or highway systems and use costly sensors like induction loops. These contexts differ from that of a suburban traffic body. Our dataset provides over 4 million still images across 2.5 months and one hundred web cameras in suburban Lake County, IL. We divide the cameras into two distinct communities described by directed graphs and count vehicles to track traffic statistics. Our goal is to give researchers a benchmark dataset for exploring the capabilities of inexpensive and non-invasive sensors like web cameras to understand complex traffic bodies in communities of any size. We present benchmarking tasks and baseline results for one such task to guide how future work may use our dataset. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9213-streets-a-novel-camera-network-dataset-for-traffic-flow |
http://papers.nips.cc/paper/9213-streets-a-novel-camera-network-dataset-for-traffic-flow.pdf | |
PWC | https://paperswithcode.com/paper/streets-a-novel-camera-network-dataset-for |
Repo | https://github.com/corey-snyder/STREETS |
Framework | tf |
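A hypothetical sketch (invented camera and road names, not the dataset's actual schema) of representing one camera community as a directed graph carrying per-camera traffic statistics, in the spirit of the dataset description:

```python
import networkx as nx

# Invented node/edge names for illustration; see the repo for real data.
g = nx.DiGraph()
g.add_edge("cam_A", "cam_B", road="northbound arterial")
g.add_edge("cam_B", "cam_C", road="northbound arterial")
g.nodes["cam_A"]["vehicle_count"] = 12   # per-image count from a detector
print(list(g.successors("cam_B")))       # downstream cameras of cam_B
```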
Enriched Feature Guided Refinement Network for Object Detection
Title | Enriched Feature Guided Refinement Network for Object Detection |
Authors | Jing Nie, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao |
Abstract | We propose a single-stage detection framework that jointly tackles the problem of multi-scale object detection and class imbalance. Rather than designing deeper networks, we introduce a simple yet effective feature enrichment scheme to produce multi-scale contextual features. We further introduce a cascaded refinement scheme which first instills multi-scale contextual features into the prediction layers of the single-stage detector in order to enrich their discriminative power for multi-scale detection. Second, the cascaded refinement scheme counters the class imbalance problem by refining the anchors and enriched features to improve classification and regression. Experiments are performed on two benchmarks: PASCAL VOC and MS COCO. For a 320x320 input on the MS COCO test-dev, our detector achieves state-of-the-art single-stage detection accuracy with a COCO AP of 33.2 in the case of single-scale inference, while operating at 21 milliseconds on a Titan XP GPU. For a 512x512 input on the MS COCO test-dev, our approach obtains an absolute gain of 1.6% in terms of COCO AP, compared to the best reported single-stage results [5]. Source code and models are available at: https://github.com/Ranchentx/EFGRNet. |
Tasks | Object Detection |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Nie_Enriched_Feature_Guided_Refinement_Network_for_Object_Detection_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Nie_Enriched_Feature_Guided_Refinement_Network_for_Object_Detection_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/enriched-feature-guided-refinement-network |
Repo | https://github.com/Ranchentx/EFGRNet |
Framework | pytorch |
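A hedged PyTorch sketch of a feature-enrichment scheme in the spirit of the abstract: parallel dilated convolutions gather multi-scale context and are fused back into the detector's prediction-layer features. The exact operators in EFGRNet may differ; see the linked repo.

```python
import torch
import torch.nn as nn

class FeatureEnrichment(nn.Module):
    """Sketch: multi-scale context via dilated convs, fused residually."""
    def __init__(self, ch):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in (1, 2, 4))
        self.fuse = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        ctx = torch.cat([b(x) for b in self.branches], dim=1)
        return x + self.fuse(ctx)        # enrich while keeping original features

print(FeatureEnrichment(64)(torch.randn(1, 64, 32, 32)).shape)
```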
Scalable Deep Generative Relational Model with High-Order Node Dependence
Title | Scalable Deep Generative Relational Model with High-Order Node Dependence |
Authors | Xuhui Fan, Bin Li, Caoyuan Li, Scott Sisson, Ling Chen |
Abstract | In this work, we propose a probabilistic framework for relational data modelling and latent structure exploration. Given the possible feature information for the nodes in a network, our model builds up a deep architecture that can approximate the possible nonlinear mappings between the nodes’ feature information and latent representations. For each node, we incorporate all its neighborhoods’ high-order structure information to generate latent representations, such that these latent representations are “smooth” in terms of the network. Since the latent representations are generated from Dirichlet distributions, we further develop a data augmentation trick to enable efficient Gibbs sampling for the Ber-Poisson likelihood with Dirichlet random variables. Our model is readily applicable to large sparse networks, as its computational cost scales with the number of positive links in the network. The superior performance of our model is demonstrated through improved link prediction performance on a range of real-world datasets. |
Tasks | Data Augmentation, Link Prediction |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9428-scalable-deep-generative-relational-model-with-high-order-node-dependence |
http://papers.nips.cc/paper/9428-scalable-deep-generative-relational-model-with-high-order-node-dependence.pdf | |
PWC | https://paperswithcode.com/paper/scalable-deep-generative-relational-model |
Repo | https://github.com/xuhuifan/SDREM |
Framework | none |
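A numpy sketch of the Ber-Poisson likelihood the abstract refers to: with nonnegative latent factors, a link from i to j occurs with probability 1 - exp(-z_i^T Λ z_j), so zero entries are cheap to handle and the cost scales with the positive links. Sizes and priors below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

K, N = 4, 6
Z = rng.dirichlet(np.ones(K), size=N)      # per-node Dirichlet latents
Lam = rng.gamma(1.0, 1.0, size=(K, K))     # nonnegative interaction weights

rate = Z @ Lam @ Z.T                       # Poisson rates for all pairs
p_link = 1 - np.exp(-rate)                 # Ber-Poisson link probability
A = rng.random((N, N)) < p_link            # a sampled adjacency matrix
print(p_link.round(2))
```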
Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing
Title | Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing |
Authors | Xindi Wang, Robert E. Mercer |
Abstract | The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of adopting human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the normal use of the text from the abstract and title for model training, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments curated from 257,590 full text articles together with the articles’ MEDLINE/PubMed MeSH terms is publicly available. |
Tasks | Document Classification, Text Classification |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5018/ |
https://www.aclweb.org/anthology/W19-5018 | |
PWC | https://paperswithcode.com/paper/incorporating-figure-captions-and-descriptive |
Repo | https://github.com/xdwang0726/Mesh |
Framework | none |
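A hedged PyTorch sketch of a multichannel TextCNN in the spirit of the paper: one channel per text source (e.g., title+abstract, captions, caption-associated paragraphs), convolved, max-pooled, and concatenated before a multi-label MeSH classifier. Channel count, filter sizes, and dimensions are my assumptions.

```python
import torch
import torch.nn as nn

class MultichannelTextCNN(nn.Module):
    """Sketch: shared conv filter bank applied to each text channel."""
    def __init__(self, emb_dim=100, n_filters=64, n_labels=50, n_channels=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, k) for k in (3, 4, 5))
        self.out = nn.Linear(n_channels * 3 * n_filters, n_labels)

    def forward(self, channels):   # list of (batch, emb_dim, seq_len)
        pooled = [c(x).max(dim=-1).values
                  for x in channels for c in self.convs]
        return self.out(torch.cat(pooled, dim=-1))   # multi-label logits

xs = [torch.randn(2, 100, 40) for _ in range(3)]
print(MultichannelTextCNN()(xs).shape)               # torch.Size([2, 50])
```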
Kernel Modeling Super-Resolution on Real Low-Resolution Images
Title | Kernel Modeling Super-Resolution on Real Low-Resolution Images |
Authors | Ruofan Zhou, Sabine Susstrunk |
Abstract | Deep convolutional neural networks (CNNs), trained on corresponding pairs of high- and low-resolution images, achieve state-of-the-art performance in single-image super-resolution and surpass previous signal-processing based approaches. However, their performance is limited when applied to real photographs. The reason lies in their training data: low-resolution (LR) images are obtained by bicubic interpolation of the corresponding high-resolution (HR) images. The applied convolution kernel significantly differs from real-world camera-blur. Consequently, while current CNNs well super-resolve bicubic-downsampled LR images, they often fail on camera-captured LR images. To improve generalization and robustness of deep super-resolution CNNs on real photographs, we present a kernel modeling super-resolution network (KMSR) that incorporates blur-kernel modeling in the training. Our proposed KMSR consists of two stages: we first build a pool of realistic blur-kernels with a generative adversarial network (GAN) and then we train a super-resolution network with HR and corresponding LR images constructed with the generated kernels. Our extensive experimental validations demonstrate the effectiveness of our single-image super-resolution approach on photographs with unknown blur-kernels. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Zhou_Kernel_Modeling_Super-Resolution_on_Real_Low-Resolution_Images_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Zhou_Kernel_Modeling_Super-Resolution_on_Real_Low-Resolution_Images_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/kernel-modeling-super-resolution-on-real-low |
Repo | https://github.com/IVRL/Kernel-Modeling-Super-Resolution |
Framework | pytorch |
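A minimal numpy/scipy sketch of the second-stage training-pair construction described in the abstract: blur an HR image with a kernel, then downsample to get a realistic LR counterpart. Here a Gaussian kernel stands in for a kernel drawn from the paper's GAN-generated pool.

```python
import numpy as np
from scipy.ndimage import convolve

def degrade(hr, kernel, scale=2):
    """Blur with a realistic kernel, then downsample: the (HR, LR) pair
    used to train the super-resolution network."""
    blurred = convolve(hr, kernel, mode="reflect")
    return blurred[::scale, ::scale]

# Placeholder for a kernel sampled from the GAN kernel pool.
t = np.arange(-3, 4)
g = np.exp(-(t ** 2) / 2.0)
kernel = np.outer(g, g)
kernel /= kernel.sum()

hr = np.random.rand(64, 64)
lr = degrade(hr, kernel)
print(hr.shape, lr.shape)   # (64, 64) (32, 32)
```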
Transfer Learning via Minimizing the Performance Gap Between Domains
Title | Transfer Learning via Minimizing the Performance Gap Between Domains |
Authors | Boyu Wang, Jorge Mendez, Mingbo Cai, Eric Eaton |
Abstract | We propose a new principle for transfer learning, based on a straightforward intuition: if two domains are similar to each other, the model trained on one domain should also perform well on the other domain, and vice versa. To formalize this intuition, we define the performance gap as a measure of the discrepancy between the source and target domains. We derive generalization bounds for the instance weighting approach to transfer learning, showing that the performance gap can be viewed as an algorithm-dependent regularizer, which controls the model complexity. Our theoretical analysis provides new insight into transfer learning and motivates a set of general, principled rules for designing new instance weighting schemes for transfer learning. These rules lead to gapBoost, a novel and principled boosting approach for transfer learning. Our experimental evaluation on benchmark data sets shows that gapBoost significantly outperforms previous boosting-based transfer learning algorithms. |
Tasks | Transfer Learning |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9249-transfer-learning-via-minimizing-the-performance-gap-between-domains |
http://papers.nips.cc/paper/9249-transfer-learning-via-minimizing-the-performance-gap-between-domains.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-via-minimizing-the |
Repo | https://github.com/bwang-ml/gapBoost |
Framework | none |
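A toy sklearn sketch of the performance-gap intuition only (this is not the gapBoost algorithm): train on each domain, score on the other, and read the summed cross-domain error as a rough discrepancy estimate between source and target.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Two synthetic "domains" with the same feature space.
Xs, ys = make_classification(n_samples=200, random_state=0)
Xt, yt = make_classification(n_samples=200, random_state=1, flip_y=0.2)

s2t = LogisticRegression(max_iter=1000).fit(Xs, ys).score(Xt, yt)
t2s = LogisticRegression(max_iter=1000).fit(Xt, yt).score(Xs, ys)
gap = (1 - s2t) + (1 - t2s)   # cross-domain error as a discrepancy proxy
print(f"gap estimate: {gap:.3f}")
```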
DHER: Hindsight Experience Replay for Dynamic Goals
Title | DHER: Hindsight Experience Replay for Dynamic Goals |
Authors | Meng Fang, Cheng Zhou, Bei Shi, Boqing Gong, Jia Xu, Tong Zhang |
Abstract | Dealing with sparse rewards is one of the most important challenges in reinforcement learning (RL), especially when a goal is dynamic (e.g., to grasp a moving object). Hindsight experience replay (HER) has been shown to be an effective solution for handling sparse rewards with fixed goals. However, it does not account for dynamic goals in its vanilla form and, as a result, even degrades the performance of existing off-policy RL algorithms when the goal changes over time. In this paper, we present Dynamic Hindsight Experience Replay (DHER), a novel approach for tasks with dynamic goals in the presence of sparse rewards. DHER automatically assembles successful experiences from two relevant failures and can be used to enhance an arbitrary off-policy RL algorithm when the tasks’ goals are dynamic. We evaluate DHER on tasks of robotic manipulation and moving object tracking, and transfer the policies from simulation to physical robots. Extensive comparison and ablation studies demonstrate the superiority of our approach, showing that DHER is a crucial ingredient to enable RL to solve tasks with dynamic goals in manipulation and grid world domains. |
Tasks | Object Tracking |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Byf5-30qFX |
https://openreview.net/pdf?id=Byf5-30qFX | |
PWC | https://paperswithcode.com/paper/dher-hindsight-experience-replay-for-dynamic |
Repo | https://github.com/mengf1/DHER |
Framework | none |
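A hedged sketch of DHER's experience assembly: search two failed episodes for a step where one episode's achieved goal coincides with the other's desired (moving) goal, then relabel the first episode with the second's goal trajectory so the assembled episode ends in a pseudo-success. The episode format below (dicts of per-step arrays) is an assumption.

```python
import numpy as np

def assemble_dher(ep_i, ep_j, tol=1e-3):
    """Given two failed episodes, return a relabeled copy of ep_i whose
    desired goals are borrowed from ep_j's goal trajectory, aligned so
    the final achieved goal matches the final desired goal."""
    for m, ag in enumerate(ep_i["achieved_goal"]):
        for n, dg in enumerate(ep_j["desired_goal"]):
            if n >= m and np.linalg.norm(ag - dg) < tol:
                new = {k: np.asarray(v)[: m + 1] for k, v in ep_i.items()}
                new["desired_goal"] = np.asarray(
                    ep_j["desired_goal"])[n - m : n + 1]
                return new   # relabeled episode that now ends in success
    return None              # no usable match between these failures
```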
Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations
Title | Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations |
Authors | Kevin Smith, Lingjie Mei, Shunyu Yao, Jiajun Wu, Elizabeth Spelke, Josh Tenenbaum, Tomer Ullman |
Abstract | From infancy, humans have expectations about how objects will move and interact. Even young children expect objects not to move through one another, teleport, or disappear. They are surprised by mismatches between physical expectations and perceptual observations, even in unfamiliar scenes with completely novel objects. A model that exhibits human-like understanding of physics should be similarly surprised, and adjust its beliefs accordingly. We propose ADEPT, a model that uses a coarse (approximate geometry) object-centric representation for dynamic 3D scene understanding. Inference integrates deep recognition networks, extended probabilistic physical simulation, and particle filtering for forming predictions and expectations across occlusion. We also present a new test set for measuring violations of physical expectations, using a range of scenarios derived from developmental psychology. We systematically compare ADEPT, baseline models, and human expectations on this test set. ADEPT outperforms standard network architectures in discriminating physically implausible scenes, and often performs this discrimination at the same level as people. |
Tasks | Scene Understanding |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9100-modeling-expectation-violation-in-intuitive-physics-with-coarse-probabilistic-object-representations |
http://papers.nips.cc/paper/9100-modeling-expectation-violation-in-intuitive-physics-with-coarse-probabilistic-object-representations.pdf | |
PWC | https://paperswithcode.com/paper/modeling-expectation-violation-in-intuitive |
Repo | https://github.com/JerryLingjieMei/ADEPT-Model-Release |
Framework | pytorch |
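A toy numpy sketch of the surprise signal in a particle-filtering framework: particles carry beliefs about object state, a trivial stochastic dynamics model propagates them, and surprise is the negative log-likelihood of the observation under the particle ensemble. ADEPT's recognition networks, physics simulator, and object representation are far richer than this.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 500
state = rng.normal(0.0, 1.0, size=n)        # particle beliefs about position
for obs in [0.1, 0.2, 5.0]:                 # the last observation "teleports"
    state = state + rng.normal(0.0, 0.1, n)           # stochastic dynamics
    w = np.exp(-0.5 * (obs - state) ** 2)             # observation likelihood
    surprise = -np.log(w.mean() + 1e-12)              # expectation violation
    w /= w.sum()
    state = rng.choice(state, size=n, p=w)            # resample posterior
    print(f"obs={obs}: surprise={surprise:.2f}")
```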
Rethinking Planar Homography Estimation Using Perspective Fields
Title | Rethinking Planar Homography Estimation Using Perspective Fields |
Authors | Rui Zeng, Simon Denman, Sridha Sridharan, Clinton Fookes |
Abstract | Planar homography estimation refers to the problem of computing a bijective linear mapping of pixels between two images. While this problem has been studied with convolutional neural networks (CNNs), existing methods simply regress the location of the four corners using a dense layer preceded by a fully-connected layer. This vector representation damages the spatial structure of the corners since they have a clear spatial order. Moreover, four points are the minimum required to compute the homography, and so such an approach is susceptible to perturbation. In this paper, we propose a conceptually simple, reliable, and general framework for homography estimation. In contrast to previous works, we formulate this problem as a perspective field (PF), which models the essence of the homography: pixel-to-pixel bijection. The PF is naturally learned by the proposed fully convolutional residual network, PFNet, to keep the spatial order of each pixel. Moreover, since every pixel’s displacement can be obtained from the PF, it enables robust homography estimation by utilizing dense correspondences. Our experiments demonstrate the proposed method outperforms traditional correspondence-based approaches and state-of-the-art CNN approaches in terms of accuracy while also having a smaller network size. In addition, the new parameterization of this task is general and can be implemented by any fully convolutional network (FCN) architecture. |
Tasks | Homography Estimation |
Published | 2019-05-26 |
URL | https://link.springer.com/chapter/10.1007/978-3-030-20876-9_36 |
https://eprints.qut.edu.au/126933/ | |
PWC | https://paperswithcode.com/paper/rethinking-planar-homography-estimation-using |
Repo | https://github.com/ruizengalways/PFNet |
Framework | tf |
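The perspective field itself follows directly from a homography, which makes this parameterization easy to reproduce: map every pixel through H, dehomogenize, and subtract the original coordinates. A numpy sketch:

```python
import numpy as np

def perspective_field(H, height, width):
    """Per-pixel displacement field induced by homography H: the dense
    pixel-to-pixel mapping that PFNet regresses instead of corner offsets."""
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # (3, H*W)
    mapped = H @ pts
    mapped = mapped[:2] / mapped[2]                  # dehomogenize
    return (mapped - pts[:2]).reshape(2, height, width)  # (dx, dy) per pixel

H = np.array([[1.0, 0.02, 3.0],
              [0.0, 1.00, -2.0],
              [1e-4, 0.0, 1.0]])
print(perspective_field(H, 8, 8)[:, 0, 0])   # displacement at pixel (0, 0)
```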
Bayesian Joint Estimation of Multiple Graphical Models
Title | Bayesian Joint Estimation of Multiple Graphical Models |
Authors | Lingrui Gan, Xinming Yang, Naveen Narisetty, Feng Liang |
Abstract | In this paper, we propose a novel Bayesian group regularization method based on the spike and slab Lasso priors for jointly estimating multiple graphical models. The proposed method can be used to estimate the common sparsity structure underlying the graphical models while capturing potential heterogeneity of the precision matrices corresponding to those models. Our theoretical results show that the proposed method enjoys the optimal rate of convergence in $\ell_\infty$ norm for estimation consistency and has a strong structure recovery guarantee even when the signal strengths over different graphs are heterogeneous. Through simulation studies and an application to the capital bike-sharing network data, we demonstrate the competitive performance of our method compared to existing alternatives. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9173-bayesian-joint-estimation-of-multiple-graphical-models |
http://papers.nips.cc/paper/9173-bayesian-joint-estimation-of-multiple-graphical-models.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-joint-estimation-of-multiple |
Repo | https://github.com/xinming104/GemBag |
Framework | none |
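A numpy sketch of a spike-and-slab Lasso log-prior of the kind the abstract builds on: a mixture of a sharp Laplace spike (shrinks entries to zero) and a diffuse Laplace slab (leaves real edges nearly unpenalized). The paper places a group version of this over entries across multiple precision matrices; the hyperparameters here are illustrative.

```python
import numpy as np

def ss_lasso_logprior(theta, lam_spike=20.0, lam_slab=1.0, pi=0.5):
    """Log density of a two-component Laplace mixture at theta."""
    def laplace(lam):
        return 0.5 * lam * np.exp(-lam * np.abs(theta))
    return np.log(pi * laplace(lam_spike) + (1 - pi) * laplace(lam_slab))

# Small entries fall under the spike; large entries escape to the slab.
for t in [0.0, 0.05, 1.0]:
    print(t, ss_lasso_logprior(np.array(t)))
```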
Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation
Title | Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation |
Authors | Ying-Cong Chen, Xiaogang Xu, Zhuotao Tian, Jiaya Jia |
Abstract | Generative adversarial networks have achieved great success in unpaired image-to-image translation. Cycle consistency allows modeling the relationship between two distinct domains without paired data. In this paper, we propose an alternative framework, as an extension of latent space interpolation, to consider the intermediate region between two domains during translation. It is based on the fact that in a flat and smooth latent space, there exist many paths that connect two sample points. Properly selecting paths makes it possible to change only certain image attributes, which is useful for generating intermediate images between the two domains. We also show that this framework can be applied to multi-domain and multi-modal translation. Extensive experiments manifest its generality and applicability to various tasks. |
Tasks | Image-to-Image Translation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Chen_Homomorphic_Latent_Space_Interpolation_for_Unpaired_Image-To-Image_Translation_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Chen_Homomorphic_Latent_Space_Interpolation_for_Unpaired_Image-To-Image_Translation_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/homomorphic-latent-space-interpolation-for |
Repo | https://github.com/yingcong/HomoInterpGAN |
Framework | pytorch |
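A minimal PyTorch-tensor sketch of attribute-selective latent interpolation as the abstract describes: interpolate only the latent components tied to a chosen attribute, keep the rest fixed, and (in the full model) decode each intermediate point. The encoder/decoder and the attribute-to-dimension mapping are placeholders, not the paper's model.

```python
import torch

def interpolate(z_src, z_tgt, attr_mask, alpha):
    """Move alpha of the way from z_src to z_tgt, only on masked dims."""
    return z_src + alpha * attr_mask * (z_tgt - z_src)

z_a, z_b = torch.randn(1, 8), torch.randn(1, 8)   # encodings of two images
mask = torch.tensor([[1., 1., 0., 0., 0., 0., 0., 0.]])  # assumed attribute dims
for alpha in (0.0, 0.5, 1.0):     # intermediate points along the chosen path
    print(interpolate(z_a, z_b, mask, alpha))
```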