January 25, 2020

Paper Group NAWR 1

Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models

Title Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models
Authors Tao Yu, Christopher M. De Sa
Abstract Hyperbolic embeddings achieve excellent performance when embedding hierarchical data structures like synonym or type hierarchies, but they can be limited by numerical error when ordinary floating-point numbers are used to represent points in hyperbolic space. Standard models such as the Poincaré disk and the Lorentz model have unbounded numerical error as points get far from the origin. To address this, we propose a new model which uses an integer-based tiling to represent any point in hyperbolic space with provably bounded numerical error. This allows us to learn high-precision embeddings without using BigFloats, and enables us to store the resulting embeddings with fewer bits. We evaluate our tiling-based model empirically, and show that it can both compress hyperbolic embeddings (down to 2% of a Poincaré embedding on WordNet Nouns) and learn more accurate embeddings on real-world datasets.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8476-numerically-accurate-hyperbolic-embeddings-using-tiling-based-models
PDF http://papers.nips.cc/paper/8476-numerically-accurate-hyperbolic-embeddings-using-tiling-based-models.pdf
PWC https://paperswithcode.com/paper/numerically-accurate-hyperbolic-embeddings
Repo https://github.com/ydtydr/HyperbolicTiling_Learning
Framework pytorch
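The failure mode the abstract describes is easy to reproduce: the Poincaré distance formula divides by 1 - x^2, which float64 rounds to zero well before x reaches the disk boundary. A minimal 1-D sketch of the problem the paper fixes (not of its tiling-based solution):

```python
import math

def poincare_dist(x, y):
    """Geodesic distance between two 1-D points in the Poincare disk:
    d = arccosh(1 + 2(x - y)^2 / ((1 - x^2)(1 - y^2)))."""
    return math.acosh(1 + 2 * (x - y) ** 2 / ((1 - x * x) * (1 - y * y)))

print(poincare_dist(0.0, 0.9))   # well conditioned near the origin

# Far from the origin the representation breaks down: a point at
# hyperbolic distance ~40 needs 1 - x^2 resolved below 1e-17, but in
# float64 the subtraction rounds to exactly zero.
x = 1 - 1e-17
print(x == 1.0)                  # True -> the denominator underflows
```

The tiling model in the paper sidesteps this by storing an integer tile index plus a local offset, so the error stays bounded everywhere.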

A Prism Module for Semantic Disentanglement in Name Entity Recognition

Title A Prism Module for Semantic Disentanglement in Name Entity Recognition
Authors Kun Liu, Shen Li, Daqi Zheng, Zhengdong Lu, Sheng Gao, Si Li
Abstract Natural Language Processing has been perplexed for many years by the problem that multiple semantics are mixed inside a word, even with the help of context. To solve this problem, we propose a prism module to disentangle the semantic aspects of words and reduce noise at the input layer of a model. In the prism module, some words are selectively replaced with task-related semantic aspects; these denoised word representations can then be fed into downstream models to make their tasks easier. Besides, we also introduce a structure to train this module jointly with the downstream model without additional data. This module can be easily integrated into the downstream model and significantly improves the performance of baselines on the named entity recognition (NER) task. The ablation analysis demonstrates the rationality of the method. As a side effect, the proposed method also provides a way to visualize the contribution of each word.
Tasks Named Entity Recognition
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1532/
PDF https://www.aclweb.org/anthology/P19-1532
PWC https://paperswithcode.com/paper/a-prism-module-for-semantic-disentanglement
Repo https://github.com/liukun95/Prism-Module
Framework tf
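The word-replacement idea can be sketched with a toy aspect lexicon. The lexicon and tag names below are invented for illustration; in the paper the replacement is selective and trained jointly with the downstream model:

```python
# Hypothetical aspect lexicon: maps surface words to a coarse,
# task-related semantic aspect (entries are made up for illustration).
ASPECTS = {
    "paris": "<LOCATION>",
    "london": "<LOCATION>",
    "monday": "<DATE>",
    "google": "<ORG>",
}

def prism_denoise(tokens, keep=frozenset()):
    """Selectively replace tokens with their semantic aspect, leaving
    task-critical words (e.g. the entity mention itself) untouched."""
    return [t if t in keep else ASPECTS.get(t.lower(), t) for t in tokens]

print(prism_denoise(["Flights", "from", "Paris", "on", "Monday"]))
# ['Flights', 'from', '<LOCATION>', 'on', '<DATE>']
```

Feeding such denoised sequences to an NER tagger is the "prism" effect the abstract describes: irrelevant lexical variation is collapsed into a few task-related aspects.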

ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features

Title ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features
Authors Yue Wu, Wael AbdAlmageed, Premkumar Natarajan
Abstract To fight against real-life image forgery, which commonly involves different types and combined manipulations, we propose a unified deep neural architecture called ManTra-Net. Unlike many existing solutions, ManTra-Net is an end-to-end network that performs both detection and localization without extra preprocessing and postprocessing. ManTra-Net is a fully convolutional network and handles images of arbitrary sizes and many known forgery types such as splicing, copy-move, removal, enhancement, and even unknown types. This paper has three salient contributions. We design a simple yet effective self-supervised learning task to learn robust image manipulation traces from classifying 385 image manipulation types. Further, we formulate the forgery localization problem as a local anomaly detection problem, design a Z-score feature to capture local anomaly, and propose a novel long short-term memory solution to assess local anomalies. Finally, we carefully conduct ablation experiments to systematically optimize the proposed network design. Our extensive experimental results demonstrate the generalizability, robustness and superiority of ManTra-Net, not only in single types of manipulations/forgeries, but also in their complicated combinations.
Tasks Anomaly Detection
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_ManTra-Net_Manipulation_Tracing_Network_for_Detection_and_Localization_of_Image_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_ManTra-Net_Manipulation_Tracing_Network_for_Detection_and_Localization_of_Image_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/mantra-net-manipulation-tracing-network-for
Repo https://github.com/ISICV/ManTraNet
Framework tf
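The local-anomaly formulation admits a compact sketch: Z-score each pixel's manipulation-trace features against the image's dominant statistics. The paper uses learned, windowed statistics assessed by an LSTM; whole-image statistics below are our simplifying assumption:

```python
import numpy as np

def zscore_anomaly_map(feat, eps=1e-6):
    """Per-pixel Z-score of a feature map against whole-image statistics;
    high values flag pixels whose manipulation traces deviate from the
    dominant (presumed authentic) region. feat: (H, W, C) array."""
    mu = feat.mean(axis=(0, 1), keepdims=True)
    sigma = feat.std(axis=(0, 1), keepdims=True)
    z = (feat - mu) / (sigma + eps)
    return np.linalg.norm(z, axis=-1)   # (H, W) anomaly score

rng = np.random.default_rng(0)
feat = rng.normal(0.0, 1.0, size=(16, 16, 4))
feat[4:8, 4:8] += 5.0                  # a "forged" patch with shifted traces
score = zscore_anomaly_map(feat)
# The forged patch stands out in the score map.
```

Thresholding such a map gives a localization mask; the network's job is to learn trace features for which genuine regions really are statistically homogeneous.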

IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction

Title IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction
Authors Dominic Jack, Frederic Maire, Sareh Shirazi, Anders Eriksson
Abstract Inferring 3D scene information from 2D observations is an open problem in computer vision. We propose using a deep-learning based energy minimization framework to learn a consistency measure between 2D observations and a proposed world model, and demonstrate that this framework can be trained end-to-end to produce consistent and realistic inferences. We evaluate the framework on human pose estimation and voxel-based object reconstruction benchmarks and show competitive results can be achieved with relatively shallow networks with drastically fewer learned parameters and floating point operations than conventional deep-learning approaches.
Tasks Object Reconstruction, Pose Estimation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Jack_IGE-Net_Inverse_Graphics_Energy_Networks_for_Human_Pose_Estimation_and_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Jack_IGE-Net_Inverse_Graphics_Energy_Networks_for_Human_Pose_Estimation_and_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/ige-net-inverse-graphics-energy-networks-for
Repo https://github.com/jackd/ige
Framework tf
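Inference in this framework is gradient descent on a learned consistency energy between the world model and the 2-D observation. A toy quadratic stands in for the learned energy below; the linear "renderer" A is our stand-in, not the paper's network:

```python
import numpy as np

def infer_by_energy_descent(energy_grad, z0, lr=0.1, steps=500):
    """Generic inference loop: descend the gradient of a consistency
    energy E(z, observation) to find a world model z that explains
    the 2-D observation."""
    z = z0.copy()
    for _ in range(steps):
        z -= lr * energy_grad(z)
    return z

# Toy stand-in for a learned energy: E(z) = ||A z - obs||^2, where A
# plays the role of a (here linear) rendering of the world model z.
A = np.array([[1.0, 0.5], [0.0, 1.0]])
obs = np.array([2.0, 1.0])
grad = lambda z: 2 * A.T @ (A @ z - obs)

z_hat = infer_by_energy_descent(grad, np.zeros(2))
# z_hat approximately satisfies A @ z_hat == obs
```

The paper's point is that when the energy itself is learned end-to-end, shallow networks suffice, because the descent loop does part of the inferential work.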

STREETS: A Novel Camera Network Dataset for Traffic Flow

Title STREETS: A Novel Camera Network Dataset for Traffic Flow
Authors Corey Snyder, Minh Do
Abstract In this paper, we introduce STREETS, a novel traffic flow dataset from publicly available web cameras in the suburbs of Chicago, IL. We seek to address the limitations of existing datasets in this area. Many such datasets lack a coherent traffic network graph to describe the relationship between sensors. The datasets that do provide a graph depict traffic flow in urban population centers or highway systems and use costly sensors like induction loops. These contexts differ from that of a suburban traffic body. Our dataset provides over 4 million still images across 2.5 months and one hundred web cameras in suburban Lake County, IL. We divide the cameras into two distinct communities described by directed graphs and count vehicles to track traffic statistics. Our goal is to give researchers a benchmark dataset for exploring the capabilities of inexpensive and non-invasive sensors like web cameras to understand complex traffic bodies in communities of any size. We present benchmarking tasks and baseline results for one such task to guide how future work may use our dataset.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9213-streets-a-novel-camera-network-dataset-for-traffic-flow
PDF http://papers.nips.cc/paper/9213-streets-a-novel-camera-network-dataset-for-traffic-flow.pdf
PWC https://paperswithcode.com/paper/streets-a-novel-camera-network-dataset-for
Repo https://github.com/corey-snyder/STREETS
Framework tf

Enriched Feature Guided Refinement Network for Object Detection

Title Enriched Feature Guided Refinement Network for Object Detection
Authors Jing Nie, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
Abstract We propose a single-stage detection framework that jointly tackles the problem of multi-scale object detection and class imbalance. Rather than designing deeper networks, we introduce a simple yet effective feature enrichment scheme to produce multi-scale contextual features. We further introduce a cascaded refinement scheme which first instills multi-scale contextual features into the prediction layers of the single-stage detector in order to enrich their discriminative power for multi-scale detection. Second, the cascaded refinement scheme counters the class imbalance problem by refining the anchors and enriched features to improve classification and regression. Experiments are performed on two benchmarks: PASCAL VOC and MS COCO. For a 320x320 input on the MS COCO test-dev, our detector achieves state-of-the-art single-stage detection accuracy with a COCO AP of 33.2 in the case of single-scale inference, while operating at 21 milliseconds on a Titan XP GPU. For a 512x512 input on the MS COCO test-dev, our approach obtains an absolute gain of 1.6% in terms of COCO AP, compared to the best reported single-stage results [5]. Source code and models are available at: https://github.com/Ranchentx/EFGRNet.
Tasks Object Detection
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Nie_Enriched_Feature_Guided_Refinement_Network_for_Object_Detection_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Nie_Enriched_Feature_Guided_Refinement_Network_for_Object_Detection_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/enriched-feature-guided-refinement-network
Repo https://github.com/Ranchentx/EFGRNet
Framework pytorch

Scalable Deep Generative Relational Model with High-Order Node Dependence

Title Scalable Deep Generative Relational Model with High-Order Node Dependence
Authors Xuhui Fan, Bin Li, Caoyuan Li, Scott Sisson, Ling Chen
Abstract In this work, we propose a probabilistic framework for relational data modelling and latent structure exploration. Given the possible feature information for the nodes in a network, our model builds up a deep architecture that can approximate the possible nonlinear mappings between the nodes' feature information and latent representations. For each node, we incorporate all its neighborhoods' high-order structure information to generate its latent representation, such that these latent representations are "smooth" in terms of the network. Since the latent representations are generated from Dirichlet distributions, we further develop a data augmentation trick to enable efficient Gibbs sampling for the Ber-Poisson likelihood with Dirichlet random variables. Our model readily applies to large sparse networks, as its computational cost scales with the number of positive links in the network. The superior performance of our model is demonstrated through improved link prediction performance on a range of real-world datasets.
Tasks Data Augmentation, Link Prediction
Published 2019-12-01
URL http://papers.nips.cc/paper/9428-scalable-deep-generative-relational-model-with-high-order-node-dependence
PDF http://papers.nips.cc/paper/9428-scalable-deep-generative-relational-model-with-high-order-node-dependence.pdf
PWC https://paperswithcode.com/paper/scalable-deep-generative-relational-model
Repo https://github.com/xuhuifan/SDREM
Framework none

Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing

Title Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing
Authors Xindi Wang, Robert E. Mercer
Abstract The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of adopting human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the normal use of the text from the abstract and title for model training, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments curated from 257,590 full text articles together with the articles' MEDLINE/PubMed MeSH terms is publicly available.
Tasks Document Classification, Text Classification
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5018/
PDF https://www.aclweb.org/anthology/W19-5018
PWC https://paperswithcode.com/paper/incorporating-figure-captions-and-descriptive
Repo https://github.com/xdwang0726/Mesh
Framework none

Kernel Modeling Super-Resolution on Real Low-Resolution Images

Title Kernel Modeling Super-Resolution on Real Low-Resolution Images
Authors Ruofan Zhou, Sabine Susstrunk
Abstract Deep convolutional neural networks (CNNs), trained on corresponding pairs of high- and low-resolution images, achieve state-of-the-art performance in single-image super-resolution and surpass previous signal-processing based approaches. However, their performance is limited when applied to real photographs. The reason lies in their training data: low-resolution (LR) images are obtained by bicubic interpolation of the corresponding high-resolution (HR) images. The applied convolution kernel significantly differs from real-world camera-blur. Consequently, while current CNNs well super-resolve bicubic-downsampled LR images, they often fail on camera-captured LR images. To improve generalization and robustness of deep super-resolution CNNs on real photographs, we present a kernel modeling super-resolution network (KMSR) that incorporates blur-kernel modeling in the training. Our proposed KMSR consists of two stages: we first build a pool of realistic blur-kernels with a generative adversarial network (GAN) and then we train a super-resolution network with HR and corresponding LR images constructed with the generated kernels. Our extensive experimental validations demonstrate the effectiveness of our single-image super-resolution approach on photographs with unknown blur-kernels.
Tasks Image Super-Resolution, Super-Resolution
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Zhou_Kernel_Modeling_Super-Resolution_on_Real_Low-Resolution_Images_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Zhou_Kernel_Modeling_Super-Resolution_on_Real_Low-Resolution_Images_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/kernel-modeling-super-resolution-on-real-low
Repo https://github.com/IVRL/Kernel-Modeling-Super-Resolution
Framework pytorch
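The training-pair construction can be sketched as blur-then-subsample. A fixed Gaussian kernel stands in here for the GAN-sampled realistic kernels the paper actually uses:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.2):
    """Normalized 2-D Gaussian blur kernel."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

def degrade(hr, kernel, scale=2):
    """LR = downsample(blur(HR)): convolve with a blur kernel, then
    subsample. KMSR builds its HR/LR training pairs this way, but
    with realistic kernels drawn from a kernel-pool GAN instead of
    the synthetic Gaussian used here."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(hr, ((ph, ph), (pw, pw)), mode="reflect")
    out = np.zeros_like(hr, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + hr.shape[0], j:j + hr.shape[1]]
    return out[::scale, ::scale]

hr = np.random.default_rng(0).random((8, 8))
lr = degrade(hr, gaussian_kernel())
print(lr.shape)   # (4, 4)
```

Training on pairs whose blur matches real camera optics, rather than bicubic interpolation, is the entire point of the kernel-modeling stage.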

Transfer Learning via Minimizing the Performance Gap Between Domains

Title Transfer Learning via Minimizing the Performance Gap Between Domains
Authors Boyu Wang, Jorge Mendez, Mingbo Cai, Eric Eaton
Abstract We propose a new principle for transfer learning, based on a straightforward intuition: if two domains are similar to each other, the model trained on one domain should also perform well on the other domain, and vice versa. To formalize this intuition, we define the performance gap as a measure of the discrepancy between the source and target domains. We derive generalization bounds for the instance weighting approach to transfer learning, showing that the performance gap can be viewed as an algorithm-dependent regularizer, which controls the model complexity. Our theoretical analysis provides new insight into transfer learning and motivates a set of general, principled rules for designing new instance weighting schemes for transfer learning. These rules lead to gapBoost, a novel and principled boosting approach for transfer learning. Our experimental evaluation on benchmark data sets shows that gapBoost significantly outperforms previous boosting-based transfer learning algorithms.
Tasks Transfer Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/9249-transfer-learning-via-minimizing-the-performance-gap-between-domains
PDF http://papers.nips.cc/paper/9249-transfer-learning-via-minimizing-the-performance-gap-between-domains.pdf
PWC https://paperswithcode.com/paper/transfer-learning-via-minimizing-the
Repo https://github.com/bwang-ml/gapBoost
Framework none
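The performance-gap intuition can be estimated directly: train on each domain, evaluate on the other, and compare against in-domain error. The nearest-centroid classifier below is our illustrative choice, not the paper's:

```python
import numpy as np

def nearest_centroid_fit(X, y):
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def nearest_centroid_err(model, X, y):
    classes, cents = model
    d = ((X[:, None, :] - cents[None]) ** 2).sum(-1)
    return float((classes[d.argmin(axis=1)] != y).mean())

def performance_gap(src, tgt):
    """Hypothetical instantiation of the paper's intuition: how much
    cross-domain error exceeds in-domain error for models trained on
    each domain (the classifier choice here is ours)."""
    (Xs, ys), (Xt, yt) = src, tgt
    ms, mt = nearest_centroid_fit(Xs, ys), nearest_centroid_fit(Xt, yt)
    cross = nearest_centroid_err(ms, Xt, yt) + nearest_centroid_err(mt, Xs, ys)
    within = nearest_centroid_err(ms, Xs, ys) + nearest_centroid_err(mt, Xt, yt)
    return cross - within

X = np.array([[0., 0.], [1., 1.], [0., .1], [1., .9]])
y = np.array([0, 1, 0, 1])
print(performance_gap((X, y), (X, y)))             # 0.0 for identical domains
print(performance_gap((X, y), (X + [5., 0.], y)))  # larger for shifted domains
```

In the paper this discrepancy acts as an algorithm-dependent regularizer that drives the instance weights in gapBoost.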

DHER: Hindsight Experience Replay for Dynamic Goals

Title DHER: Hindsight Experience Replay for Dynamic Goals
Authors Meng Fang, Cheng Zhou, Bei Shi, Boqing Gong, Jia Xu, Tong Zhang
Abstract Dealing with sparse rewards is one of the most important challenges in reinforcement learning (RL), especially when a goal is dynamic (e.g., to grasp a moving object). Hindsight experience replay (HER) has been shown an effective solution to handling sparse rewards with fixed goals. However, it does not account for dynamic goals in its vanilla form and, as a result, even degrades the performance of existing off-policy RL algorithms when the goal is changing over time. In this paper, we present Dynamic Hindsight Experience Replay (DHER), a novel approach for tasks with dynamic goals in the presence of sparse rewards. DHER automatically assembles successful experiences from two relevant failures and can be used to enhance an arbitrary off-policy RL algorithm when the tasks' goals are dynamic. We evaluate DHER on tasks of robotic manipulation and moving object tracking, and transfer the policies from simulation to physical robots. Extensive comparison and ablation studies demonstrate the superiority of our approach, showing that DHER is a crucial ingredient to enable RL to solve tasks with dynamic goals in manipulation and grid world domains.
Tasks Object Tracking
Published 2019-05-01
URL https://openreview.net/forum?id=Byf5-30qFX
PDF https://openreview.net/pdf?id=Byf5-30qFX
PWC https://paperswithcode.com/paper/dher-hindsight-experience-replay-for-dynamic
Repo https://github.com/mengf1/DHER
Framework none
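The assembly of one success from two failures can be sketched as a search for a timestep where one episode's achieved goal meets the other's desired-goal trajectory. This is a simplification of the paper's algorithm, not a faithful reimplementation:

```python
def dher_assemble(ep_a, ep_b, match):
    """Sketch of DHER-style experience assembly: ep_a failed but
    *achieved* some states; ep_b failed but its *desired* (moving)
    goal trajectory passes through one of them. Steps are dicts with
    'achieved' and 'desired' goal entries; match() decides whether an
    achieved state can stand in for a desired goal."""
    for i, sa in enumerate(ep_a):
        for j, sb in enumerate(ep_b):
            if j <= i and match(sa["achieved"], sb["desired"]):
                shift = i - j
                # Relabel ep_a's tail with ep_b's desired-goal
                # trajectory; the final step now counts as a success.
                return [{"achieved": ep_a[t]["achieved"],
                         "desired": ep_b[t - shift]["desired"]}
                        for t in range(shift, i + 1)]
    return None

# Two failures: ep_a's gripper visits positions 0, 1, 2 while its own
# goal (9) is never met; ep_b's moving goal visits 5, 2, 7.
ep_a = [{"achieved": p, "desired": 9} for p in (0, 1, 2)]
ep_b = [{"achieved": 0, "desired": g} for g in (5, 2, 7)]
new_ep = dher_assemble(ep_a, ep_b, lambda a, g: a == g)
print(new_ep[-1])   # {'achieved': 2, 'desired': 2} -> a success
```

The spliced episode can then be replayed with a sparse reward exactly as HER would replay a relabeled fixed-goal episode.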

Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations

Title Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations
Authors Kevin Smith, Lingjie Mei, Shunyu Yao, Jiajun Wu, Elizabeth Spelke, Josh Tenenbaum, Tomer Ullman
Abstract From infancy, humans have expectations about how objects will move and interact. Even young children expect objects not to move through one another, teleport, or disappear. They are surprised by mismatches between physical expectations and perceptual observations, even in unfamiliar scenes with completely novel objects. A model that exhibits human-like understanding of physics should be similarly surprised, and adjust its beliefs accordingly. We propose ADEPT, a model that uses a coarse (approximate geometry) object-centric representation for dynamic 3D scene understanding. Inference integrates deep recognition networks, extended probabilistic physical simulation, and particle filtering for forming predictions and expectations across occlusion. We also present a new test set for measuring violations of physical expectations, using a range of scenarios derived from developmental psychology. We systematically compare ADEPT, baseline models, and human expectations on this test set. ADEPT outperforms standard network architectures in discriminating physically implausible scenes, and often performs this discrimination at the same level as people.
Tasks Scene Understanding
Published 2019-12-01
URL http://papers.nips.cc/paper/9100-modeling-expectation-violation-in-intuitive-physics-with-coarse-probabilistic-object-representations
PDF http://papers.nips.cc/paper/9100-modeling-expectation-violation-in-intuitive-physics-with-coarse-probabilistic-object-representations.pdf
PWC https://paperswithcode.com/paper/modeling-expectation-violation-in-intuitive
Repo https://github.com/JerryLingjieMei/ADEPT-Model-Release
Framework pytorch
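The surprise signal can be sketched with a bare-bones particle filter over a 1-D object state. The paper's version runs an extended physics simulator over approximate 3-D geometry; everything below is a toy stand-in:

```python
import numpy as np

def surprise_step(particles, observe, dynamics, rng):
    """One step of a drastically simplified ADEPT-style loop: propagate
    coarse object-state particles through a physics prior, score the new
    observation under them, and read 'surprise' off as the negative log
    of the average observation likelihood."""
    particles = dynamics(particles) + rng.normal(0, 0.05, size=particles.shape)
    likelihoods = observe(particles)
    surprise = -np.log(likelihoods.mean() + 1e-12)
    # Resample in proportion to likelihood to stay on plausible states.
    w = likelihoods / likelihoods.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], surprise

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 0.1, size=200)   # beliefs about a 1-D position
dynamics = lambda p: p + 1.0                 # objects keep moving right
gauss = lambda p, obs: np.exp(-0.5 * ((p - obs) / 0.2) ** 2)

# Expected observation (object near 1.0) vs. a "teleport" (object at 5.0):
_, s_expected = surprise_step(particles, lambda p: gauss(p, 1.0), dynamics, rng)
_, s_violation = surprise_step(particles, lambda p: gauss(p, 5.0), dynamics, rng)
# The physically implausible observation is far more surprising.
```

High surprise on teleporting or vanishing objects is exactly the behavioral signature the paper compares against infants' violation-of-expectation responses.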

Rethinking Planar Homography Estimation Using Perspective Fields

Title Rethinking Planar Homography Estimation Using Perspective Fields
Authors Rui Zeng, Simon Denman, Sridha Sridharan, Clinton Fookes
Abstract Planar homography estimation refers to the problem of computing a bijective linear mapping of pixels between two images. While this problem has been studied with convolutional neural networks (CNNs), existing methods simply regress the location of the four corners using a dense layer preceded by a fully-connected layer. This vector representation damages the spatial structure of the corners since they have a clear spatial order. Moreover, four points are the minimum required to compute the homography, and so such an approach is susceptible to perturbation. In this paper, we propose a conceptually simple, reliable, and general framework for homography estimation. In contrast to previous works, we formulate this problem as a perspective field (PF), which models the essence of the homography: the pixel-to-pixel bijection. The PF is naturally learned by the proposed fully convolutional residual network, PFNet, to keep the spatial order of each pixel. Moreover, since every pixel's displacement can be obtained from the PF, it enables robust homography estimation by utilizing dense correspondences. Our experiments demonstrate the proposed method outperforms traditional correspondence-based approaches and state-of-the-art CNN approaches in terms of accuracy while also having a smaller network size. In addition, the new parameterization of this task is general and can be implemented by any fully convolutional network (FCN) architecture.
Tasks Homography Estimation
Published 2019-05-26
URL https://link.springer.com/chapter/10.1007/978-3-030-20876-9_36
PDF https://eprints.qut.edu.au/126933/
PWC https://paperswithcode.com/paper/rethinking-planar-homography-estimation-using
Repo https://github.com/ruizengalways/PFNet
Framework tf
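The perspective-field parameterization is just the dense displacement a homography induces on the pixel grid; computing the regression target is a few lines (the network itself is not reproduced here):

```python
import numpy as np

def perspective_field(H, height, width):
    """Dense pixel-to-pixel bijection induced by a 3x3 homography H:
    the displacement of every pixel's image under H. This is the
    quantity a PF-style network regresses instead of four corner
    offsets."""
    v, u = np.mgrid[0:height, 0:width]
    pts = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(float)
    warped = H @ pts
    warped = warped[:2] / warped[2]            # perspective divide
    return (warped - pts[:2]).reshape(2, height, width)  # (du, dv)

# A pure translation by (3, -1) gives a constant field:
H = np.array([[1., 0., 3.], [0., 1., -1.], [0., 0., 1.]])
pf = perspective_field(H, 4, 5)
print(pf[0].min(), pf[0].max())   # 3.0 3.0
print(pf[1].min(), pf[1].max())   # -1.0 -1.0
```

Going the other way, a predicted field yields one correspondence per pixel, so the homography can be recovered by least squares over far more than the minimal four points.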

Bayesian Joint Estimation of Multiple Graphical Models

Title Bayesian Joint Estimation of Multiple Graphical Models
Authors Lingrui Gan, Xinming Yang, Naveen Narisetty, Feng Liang
Abstract In this paper, we propose a novel Bayesian group regularization method based on the spike and slab Lasso priors for jointly estimating multiple graphical models. The proposed method can be used to estimate the common sparsity structure underlying the graphical models while capturing potential heterogeneity of the precision matrices corresponding to those models. Our theoretical results show that the proposed method enjoys the optimal rate of convergence in $\ell_\infty$ norm for estimation consistency and has a strong structure recovery guarantee even when the signal strengths over different graphs are heterogeneous. Through simulation studies and an application to the capital bike-sharing network data, we demonstrate the competitive performance of our method compared to existing alternatives.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9173-bayesian-joint-estimation-of-multiple-graphical-models
PDF http://papers.nips.cc/paper/9173-bayesian-joint-estimation-of-multiple-graphical-models.pdf
PWC https://paperswithcode.com/paper/bayesian-joint-estimation-of-multiple
Repo https://github.com/xinming104/GemBag
Framework none

Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation

Title Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation
Authors Ying-Cong Chen, Xiaogang Xu, Zhuotao Tian, Jiaya Jia
Abstract Generative adversarial networks have achieved great success in unpaired image-to-image translation. Cycle consistency allows modeling the relationship between two distinct domains without paired data. In this paper, we propose an alternative framework, as an extension of latent space interpolation, to consider the intermediate region between two domains during translation. It is based on the fact that in a flat and smooth latent space, there exist many paths that connect two sample points. Properly selecting paths makes it possible to change only certain image attributes, which is useful for generating intermediate images between the two domains. We also show that this framework can be applied to multi-domain and multi-modal translation. Extensive experiments manifest its generality and applicability to various tasks.
Tasks Image-to-Image Translation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Chen_Homomorphic_Latent_Space_Interpolation_for_Unpaired_Image-To-Image_Translation_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Chen_Homomorphic_Latent_Space_Interpolation_for_Unpaired_Image-To-Image_Translation_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/homomorphic-latent-space-interpolation-for
Repo https://github.com/yingcong/HomoInterpGAN
Framework pytorch
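Path selection in a flat latent space can be sketched as interpolating only along chosen coordinates. The mask below is our stand-in for the paper's learned attribute directions, and the encoder/decoder networks are omitted:

```python
import numpy as np

def attribute_path(z_src, z_dst, mask, steps=5):
    """Move from z_src toward z_dst only along the latent coordinates
    selected by `mask`: one simple way to realize 'properly selecting
    paths changes only certain attributes'."""
    delta = (z_dst - z_src) * mask
    return [z_src + t * delta for t in np.linspace(0.0, 1.0, steps)]

z_a = np.array([0.0, 0.0, 0.0])   # e.g. latent code of a source image
z_b = np.array([1.0, 1.0, 1.0])   # latent code in the target domain
mask = np.array([1.0, 0.0, 0.0])  # touch only the first attribute

path = attribute_path(z_a, z_b, mask)
# Intermediate codes change coordinate 0 and leave the rest fixed;
# decoding them would yield images between the two domains.
print(path[-1])   # [1. 0. 0.]
```

The masked endpoint differs from z_b in the untouched coordinates, which is the point: the translation changes the selected attribute while preserving the rest.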