Paper Group NAWR 1
Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models. A Prism Module for Semantic Disentanglement in Name Entity Recognition. ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features. IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstr …
Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models
Title | Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models |
Authors | Tao Yu, Christopher M. De Sa |
Abstract | Hyperbolic embeddings achieve excellent performance when embedding hierarchical data structures like synonym or type hierarchies, but they can be limited by numerical error when ordinary floating-point numbers are used to represent points in hyperbolic space. Standard models such as the Poincaré disk and the Lorentz model have unbounded numerical error as points get far from the origin. To address this, we propose a new model which uses an integer-based tiling to represent *any* point in hyperbolic space with provably bounded numerical error. This allows us to learn high-precision embeddings without using BigFloats, and enables us to store the resulting embeddings with fewer bits. We evaluate our tiling-based model empirically, and show that it can both compress hyperbolic embeddings (down to 2% of a Poincaré embedding on WordNet Nouns) and learn more accurate embeddings on real-world datasets. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8476-numerically-accurate-hyperbolic-embeddings-using-tiling-based-models |
http://papers.nips.cc/paper/8476-numerically-accurate-hyperbolic-embeddings-using-tiling-based-models.pdf | |
PWC | https://paperswithcode.com/paper/numerically-accurate-hyperbolic-embeddings |
Repo | https://github.com/ydtydr/HyperbolicTiling_Learning |
Framework | pytorch |
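To ground the numerical-error claim, here is a minimal numpy sketch (mine, not the authors' code) of the standard Poincaré-ball distance. Deep-hierarchy nodes are embedded near the boundary of the disk, where the (1 - ||x||^2) factors underflow; the tiling-based model is designed to keep error bounded in exactly this regime.

```python
import numpy as np

def poincare_dist(x, y):
    """Poincare-ball distance: arcosh(1 + 2|x-y|^2 / ((1-|x|^2)(1-|y|^2))).
    All arithmetic is kept in the input dtype to show precision effects."""
    one, two = x.dtype.type(1), x.dtype.type(2)
    num = two * np.sum((x - y) ** 2)
    den = (one - np.sum(x ** 2)) * (one - np.sum(y ** 2))
    return np.arccosh(one + num / den)

# Near the boundary (|x| -> 1) the float32 denominator underflows to 0,
# and the last case below degrades to inf while float64 is still finite.
with np.errstate(divide="ignore"):
    for r in [0.9, 0.999, 0.999999, 1.0 - 1e-8]:
        x, y = np.array([r, 0.0]), np.array([0.0, r])
        print(r,
              poincare_dist(x, y),                                   # float64
              poincare_dist(x.astype(np.float32), y.astype(np.float32)))
```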
A Prism Module for Semantic Disentanglement in Name Entity Recognition
Title | A Prism Module for Semantic Disentanglement in Name Entity Recognition |
Authors | Kun Liu, Shen Li, Daqi Zheng, Zhengdong Lu, Sheng Gao, Si Li |
Abstract | Natural Language Processing has been perplexed for many years by the problem that multiple semantics are mixed inside a word, even with the help of context. To solve this problem, we propose a prism module to disentangle the semantic aspects of words and reduce noise at the input layer of a model. In the prism module, some words are selectively replaced with task-related semantic aspects; these denoised word representations can then be fed into downstream tasks, making those tasks easier. In addition, we introduce a structure to train this module jointly with the downstream model without additional data. This module can be easily integrated into the downstream model and significantly improves the performance of baselines on the named entity recognition (NER) task. The ablation analysis demonstrates the rationality of the method. As a side effect, the proposed method also provides a way to visualize the contribution of each word. |
Tasks | Named Entity Recognition |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1532/ |
https://www.aclweb.org/anthology/P19-1532 | |
PWC | https://paperswithcode.com/paper/a-prism-module-for-semantic-disentanglement |
Repo | https://github.com/liukun95/Prism-Module |
Framework | tf |
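A hypothetical PyTorch sketch of the core idea as the abstract describes it: each word embedding is softly replaced by a mixture of learned, task-related "aspect" vectors. The gating/attention design, dimensions, and aspect count below are my assumptions, not the paper's architecture (the authors' TensorFlow code is in the repo above).

```python
import torch
import torch.nn as nn

class PrismModule(nn.Module):
    """Sketch: per token, interpolate between the original word embedding
    and a mixture of task-related aspect vectors, via a learned gate."""
    def __init__(self, emb_dim, n_aspects):
        super().__init__()
        self.aspects = nn.Parameter(torch.randn(n_aspects, emb_dim))
        self.gate = nn.Linear(emb_dim, 1)           # keep word vs. use aspects
        self.attn = nn.Linear(emb_dim, n_aspects)   # which aspect(s) to use

    def forward(self, emb):                          # emb: (batch, seq, dim)
        aspect_mix = torch.softmax(self.attn(emb), dim=-1) @ self.aspects
        g = torch.sigmoid(self.gate(emb))            # (batch, seq, 1)
        return g * emb + (1 - g) * aspect_mix        # denoised representation

x = torch.randn(2, 5, 64)
print(PrismModule(64, 8)(x).shape)                   # torch.Size([2, 5, 64])
```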
ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features
Title | ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features |
Authors | Yue Wu, Wael AbdAlmageed, Premkumar Natarajan |
Abstract | To fight against real-life image forgery, which commonly involves different types and combined manipulations, we propose a unified deep neural architecture called ManTra-Net. Unlike many existing solutions, ManTra-Net is an end-to-end network that performs both detection and localization without extra preprocessing and postprocessing. ManTra-Net is a fully convolutional network and handles images of arbitrary sizes and many known forgery types such as splicing, copy-move, removal, enhancement, and even unknown types. This paper has three salient contributions. We design a simple yet effective self-supervised learning task to learn robust image manipulation traces from classifying 385 image manipulation types. Further, we formulate the forgery localization problem as a local anomaly detection problem, design a Z-score feature to capture local anomalies, and propose a novel long short-term memory solution to assess local anomalies. Finally, we carefully conduct ablation experiments to systematically optimize the proposed network design. Our extensive experimental results demonstrate the generalizability, robustness and superiority of ManTra-Net, not only in single types of manipulations/forgeries, but also in their complicated combinations. |
Tasks | Anomaly Detection |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_ManTra-Net_Manipulation_Tracing_Network_for_Detection_and_Localization_of_Image_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_ManTra-Net_Manipulation_Tracing_Network_for_Detection_and_Localization_of_Image_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/mantra-net-manipulation-tracing-network-for |
Repo | https://github.com/ISICV/ManTraNet |
Framework | tf |
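A hedged numpy sketch of the Z-score anomaly feature named in the abstract: measure how far each pixel's manipulation-trace feature deviates from the image-wide dominant feature. The actual network computes such deviations over several local window sizes and feeds them to an LSTM-based decision module; this sketch uses global statistics only.

```python
import numpy as np

def zscore_feature(feat, eps=1e-6):
    """feat: (H, W, C) manipulation-trace features. Returns a per-pixel
    deviation from the dominant (image-wide mean) feature, in units of
    the image-wide standard deviation."""
    mu = feat.mean(axis=(0, 1), keepdims=True)       # dominant feature
    sigma = feat.std(axis=(0, 1), keepdims=True)
    return (feat - mu) / (sigma + eps)

feat = np.random.rand(64, 64, 8)
feat[20:30, 20:30] += 3.0            # a "foreign" region stands out
print(np.abs(zscore_feature(feat)).mean(axis=-1).max())
```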
IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction
Title | IGE-Net: Inverse Graphics Energy Networks for Human Pose Estimation and Single-View Reconstruction |
Authors | Dominic Jack, Frederic Maire, Sareh Shirazi, Anders Eriksson |
Abstract | Inferring 3D scene information from 2D observations is an open problem in computer vision. We propose using a deep-learning based energy minimization framework to learn a consistency measure between 2D observations and a proposed world model, and demonstrate that this framework can be trained end-to-end to produce consistent and realistic inferences. We evaluate the framework on human pose estimation and voxel-based object reconstruction benchmarks and show competitive results can be achieved with relatively shallow networks with drastically fewer learned parameters and floating point operations than conventional deep-learning approaches. |
Tasks | Object Reconstruction, Pose Estimation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Jack_IGE-Net_Inverse_Graphics_Energy_Networks_for_Human_Pose_Estimation_and_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Jack_IGE-Net_Inverse_Graphics_Energy_Networks_for_Human_Pose_Estimation_and_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/ige-net-inverse-graphics-energy-networks-for |
Repo | https://github.com/jackd/ige |
Framework | tf |
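A toy PyTorch sketch of energy-based inference as described in the abstract: a small network scores the consistency between a 2D observation and a candidate 3D state, and inference refines the state by gradient descent on that energy. The architecture, dimensions, and step count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Learned consistency measure E(observation, state); sizes are illustrative.
energy = nn.Sequential(nn.Linear(2 + 3, 32), nn.ReLU(), nn.Linear(32, 1))

obs = torch.randn(1, 2)                          # a 2D observation
state = torch.zeros(1, 3, requires_grad=True)    # candidate 3D world state
opt = torch.optim.SGD([state], lr=0.1)

for _ in range(20):                              # inner-loop inference
    opt.zero_grad()
    e = energy(torch.cat([obs, state], dim=-1)).sum()
    e.backward()
    opt.step()

print("refined state:", state.detach())
print("final energy:", energy(torch.cat([obs, state], dim=-1)).item())
```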
STREETS: A Novel Camera Network Dataset for Traffic Flow
Title | STREETS: A Novel Camera Network Dataset for Traffic Flow |
Authors | Corey Snyder, Minh Do |
Abstract | In this paper, we introduce STREETS, a novel traffic flow dataset from publicly available web cameras in the suburbs of Chicago, IL. We seek to address the limitations of existing datasets in this area. Many such datasets lack a coherent traffic network graph to describe the relationship between sensors. The datasets that do provide a graph depict traffic flow in urban population centers or highway systems and use costly sensors like induction loops. These contexts differ from that of a suburban traffic body. Our dataset provides over 4 million still images across 2.5 months and one hundred web cameras in suburban Lake County, IL. We divide the cameras into two distinct communities described by directed graphs and count vehicles to track traffic statistics. Our goal is to give researchers a benchmark dataset for exploring the capabilities of inexpensive and non-invasive sensors like web cameras to understand complex traffic bodies in communities of any size. We present benchmarking tasks and baseline results for one such task to guide how future work may use our dataset. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9213-streets-a-novel-camera-network-dataset-for-traffic-flow |
http://papers.nips.cc/paper/9213-streets-a-novel-camera-network-dataset-for-traffic-flow.pdf | |
PWC | https://paperswithcode.com/paper/streets-a-novel-camera-network-dataset-for |
Repo | https://github.com/corey-snyder/STREETS |
Framework | tf |
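A hypothetical sketch (invented camera and road names, not the dataset's actual schema) of representing one camera community as a directed graph carrying per-camera traffic statistics, in the spirit of the dataset description:

```python
import networkx as nx

# Invented node/edge names for illustration; see the repo for real data.
g = nx.DiGraph()
g.add_edge("cam_A", "cam_B", road="northbound arterial")
g.add_edge("cam_B", "cam_C", road="northbound arterial")
g.nodes["cam_A"]["vehicle_count"] = 12   # per-image count from a detector
print(list(g.successors("cam_B")))       # downstream cameras of cam_B
```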
Enriched Feature Guided Refinement Network for Object Detection
Title | Enriched Feature Guided Refinement Network for Object Detection |
Authors | Jing Nie, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao |
Abstract | We propose a single-stage detection framework that jointly tackles the problem of multi-scale object detection and class imbalance. Rather than designing deeper networks, we introduce a simple yet effective feature enrichment scheme to produce multi-scale contextual features. We further introduce a cascaded refinement scheme which first instills multi-scale contextual features into the prediction layers of the single-stage detector in order to enrich their discriminative power for multi-scale detection. Second, the cascaded refinement scheme counters the class imbalance problem by refining the anchors and enriched features to improve classification and regression. Experiments are performed on two benchmarks: PASCAL VOC and MS COCO. For a 320x320 input on the MS COCO test-dev, our detector achieves state-of-the-art single-stage detection accuracy with a COCO AP of 33.2 in the case of single-scale inference, while operating at 21 milliseconds on a Titan XP GPU. For a 512x512 input on the MS COCO test-dev, our approach obtains an absolute gain of 1.6% in terms of COCO AP, compared to the best reported single-stage results [5]. Source code and models are available at: https://github.com/Ranchentx/EFGRNet. |
Tasks | Object Detection |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Nie_Enriched_Feature_Guided_Refinement_Network_for_Object_Detection_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Nie_Enriched_Feature_Guided_Refinement_Network_for_Object_Detection_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/enriched-feature-guided-refinement-network |
Repo | https://github.com/Ranchentx/EFGRNet |
Framework | pytorch |
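A hedged PyTorch sketch of a feature-enrichment scheme in the spirit of the abstract: parallel dilated convolutions gather multi-scale context and are fused back into the detector's prediction-layer features. The exact operators in EFGRNet may differ; see the linked repo.

```python
import torch
import torch.nn as nn

class FeatureEnrichment(nn.Module):
    """Sketch: multi-scale context via dilated convs, fused residually."""
    def __init__(self, ch):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in (1, 2, 4))
        self.fuse = nn.Conv2d(3 * ch, ch, 1)

    def forward(self, x):
        ctx = torch.cat([b(x) for b in self.branches], dim=1)
        return x + self.fuse(ctx)        # enrich while keeping original features

print(FeatureEnrichment(64)(torch.randn(1, 64, 32, 32)).shape)
```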
Scalable Deep Generative Relational Model with High-Order Node Dependence
Title | Scalable Deep Generative Relational Model with High-Order Node Dependence |
Authors | Xuhui Fan, Bin Li, Caoyuan Li, Scott Sisson, Ling Chen |
Abstract | In this work, we propose a probabilistic framework for relational data modelling and latent structure exploration. Given the possible feature information for the nodes in a network, our model builds up a deep architecture that can approximate the possible nonlinear mappings between the nodes’ feature information and latent representations. For each node, we incorporate all its neighborhoods’ high-order structure information to generate latent representations, such that these latent representations are “smooth” in terms of the network. Since the latent representations are generated from Dirichlet distributions, we further develop a data augmentation trick to enable efficient Gibbs sampling for the Ber-Poisson likelihood with Dirichlet random variables. Our model is readily applicable to large sparse networks, as its computational cost scales with the number of positive links in the network. The superior performance of our model is demonstrated through improved link prediction performance on a range of real-world datasets. |
Tasks | Data Augmentation, Link Prediction |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9428-scalable-deep-generative-relational-model-with-high-order-node-dependence |
http://papers.nips.cc/paper/9428-scalable-deep-generative-relational-model-with-high-order-node-dependence.pdf | |
PWC | https://paperswithcode.com/paper/scalable-deep-generative-relational-model |
Repo | https://github.com/xuhuifan/SDREM |
Framework | none |
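A numpy sketch of the Ber-Poisson likelihood the abstract refers to: with nonnegative latent factors, a link from i to j occurs with probability 1 - exp(-z_i^T Λ z_j), so zero entries are cheap to handle and the cost scales with the positive links. Sizes and priors below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

K, N = 4, 6
Z = rng.dirichlet(np.ones(K), size=N)      # per-node Dirichlet latents
Lam = rng.gamma(1.0, 1.0, size=(K, K))     # nonnegative interaction weights

rate = Z @ Lam @ Z.T                       # Poisson rates for all pairs
p_link = 1 - np.exp(-rate)                 # Ber-Poisson link probability
A = rng.random((N, N)) < p_link            # a sampled adjacency matrix
print(p_link.round(2))
```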
Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing
Title | Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing |
Authors | Xindi Wang, Robert E. Mercer |
Abstract | The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of adopting human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the normal use of the text from the abstract and title for model training, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments curated from 257,590 full text articles together with the articles’ MEDLINE/PubMed MeSH terms is publicly available. |
Tasks | Document Classification, Text Classification |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5018/ |
https://www.aclweb.org/anthology/W19-5018 | |
PWC | https://paperswithcode.com/paper/incorporating-figure-captions-and-descriptive |
Repo | https://github.com/xdwang0726/Mesh |
Framework | none |
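A hedged PyTorch sketch of a multichannel TextCNN in the spirit of the paper: one channel per text source (e.g., title+abstract, captions, caption-associated paragraphs), convolved, max-pooled, and concatenated before a multi-label MeSH classifier. Channel count, filter sizes, and dimensions are my assumptions.

```python
import torch
import torch.nn as nn

class MultichannelTextCNN(nn.Module):
    """Sketch: shared conv filter bank applied to each text channel."""
    def __init__(self, emb_dim=100, n_filters=64, n_labels=50, n_channels=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, k) for k in (3, 4, 5))
        self.out = nn.Linear(n_channels * 3 * n_filters, n_labels)

    def forward(self, channels):   # list of (batch, emb_dim, seq_len)
        pooled = [c(x).max(dim=-1).values
                  for x in channels for c in self.convs]
        return self.out(torch.cat(pooled, dim=-1))   # multi-label logits

xs = [torch.randn(2, 100, 40) for _ in range(3)]
print(MultichannelTextCNN()(xs).shape)               # torch.Size([2, 50])
```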
Kernel Modeling Super-Resolution on Real Low-Resolution Images
Title | Kernel Modeling Super-Resolution on Real Low-Resolution Images |
Authors | Ruofan Zhou, Sabine Susstrunk |
Abstract | Deep convolutional neural networks (CNNs), trained on corresponding pairs of high- and low-resolution images, achieve state-of-the-art performance in single-image super-resolution and surpass previous signal-processing based approaches. However, their performance is limited when applied to real photographs. The reason lies in their training data: low-resolution (LR) images are obtained by bicubic interpolation of the corresponding high-resolution (HR) images. The applied convolution kernel significantly differs from real-world camera-blur. Consequently, while current CNNs well super-resolve bicubic-downsampled LR images, they often fail on camera-captured LR images. To improve generalization and robustness of deep super-resolution CNNs on real photographs, we present a kernel modeling super-resolution network (KMSR) that incorporates blur-kernel modeling in the training. Our proposed KMSR consists of two stages: we first build a pool of realistic blur-kernels with a generative adversarial network (GAN) and then we train a super-resolution network with HR and corresponding LR images constructed with the generated kernels. Our extensive experimental validations demonstrate the effectiveness of our single-image super-resolution approach on photographs with unknown blur-kernels. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Zhou_Kernel_Modeling_Super-Resolution_on_Real_Low-Resolution_Images_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Zhou_Kernel_Modeling_Super-Resolution_on_Real_Low-Resolution_Images_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/kernel-modeling-super-resolution-on-real-low |
Repo | https://github.com/IVRL/Kernel-Modeling-Super-Resolution |
Framework | pytorch |
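A minimal numpy/scipy sketch of the second-stage training-pair construction described in the abstract: blur an HR image with a kernel, then downsample to get a realistic LR counterpart. Here a Gaussian kernel stands in for a kernel drawn from the paper's GAN-generated pool.

```python
import numpy as np
from scipy.ndimage import convolve

def degrade(hr, kernel, scale=2):
    """Blur with a realistic kernel, then downsample: the (HR, LR) pair
    used to train the super-resolution network."""
    blurred = convolve(hr, kernel, mode="reflect")
    return blurred[::scale, ::scale]

# Placeholder for a kernel sampled from the GAN kernel pool.
t = np.arange(-3, 4)
g = np.exp(-(t ** 2) / 2.0)
kernel = np.outer(g, g)
kernel /= kernel.sum()

hr = np.random.rand(64, 64)
lr = degrade(hr, kernel)
print(hr.shape, lr.shape)   # (64, 64) (32, 32)
```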
Transfer Learning via Minimizing the Performance Gap Between Domains
Title | Transfer Learning via Minimizing the Performance Gap Between Domains |
Authors | Boyu Wang, Jorge Mendez, Mingbo Cai, Eric Eaton |
Abstract | We propose a new principle for transfer learning, based on a straightforward intuition: if two domains are similar to each other, the model trained on one domain should also perform well on the other domain, and vice versa. To formalize this intuition, we define the performance gap as a measure of the discrepancy between the source and target domains. We derive generalization bounds for the instance weighting approach to transfer learning, showing that the performance gap can be viewed as an algorithm-dependent regularizer, which controls the model complexity. Our theoretical analysis provides new insight into transfer learning and motivates a set of general, principled rules for designing new instance weighting schemes for transfer learning. These rules lead to gapBoost, a novel and principled boosting approach for transfer learning. Our experimental evaluation on benchmark data sets shows that gapBoost significantly outperforms previous boosting-based transfer learning algorithms. |
Tasks | Transfer Learning |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9249-transfer-learning-via-minimizing-the-performance-gap-between-domains |
http://papers.nips.cc/paper/9249-transfer-learning-via-minimizing-the-performance-gap-between-domains.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-via-minimizing-the |
Repo | https://github.com/bwang-ml/gapBoost |
Framework | none |
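A toy sklearn sketch of the performance-gap intuition only (this is not the gapBoost algorithm): train on each domain, score on the other, and read the summed cross-domain error as a rough discrepancy estimate between source and target.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Two synthetic "domains" with the same feature space.
Xs, ys = make_classification(n_samples=200, random_state=0)
Xt, yt = make_classification(n_samples=200, random_state=1, flip_y=0.2)

s2t = LogisticRegression(max_iter=1000).fit(Xs, ys).score(Xt, yt)
t2s = LogisticRegression(max_iter=1000).fit(Xt, yt).score(Xs, ys)
gap = (1 - s2t) + (1 - t2s)   # cross-domain error as a discrepancy proxy
print(f"gap estimate: {gap:.3f}")
```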
DHER: Hindsight Experience Replay for Dynamic Goals
Title | DHER: Hindsight Experience Replay for Dynamic Goals |
Authors | Meng Fang, Cheng Zhou, Bei Shi, Boqing Gong, Jia Xu, Tong Zhang |
Abstract | Dealing with sparse rewards is one of the most important challenges in reinforcement learning (RL), especially when a goal is dynamic (e.g., to grasp a moving object). Hindsight experience replay (HER) has been shown to be an effective solution for handling sparse rewards with fixed goals. However, it does not account for dynamic goals in its vanilla form and, as a result, even degrades the performance of existing off-policy RL algorithms when the goal changes over time. In this paper, we present Dynamic Hindsight Experience Replay (DHER), a novel approach for tasks with dynamic goals in the presence of sparse rewards. DHER automatically assembles successful experiences from two relevant failures and can be used to enhance an arbitrary off-policy RL algorithm when the tasks’ goals are dynamic. We evaluate DHER on tasks of robotic manipulation and moving object tracking, and transfer the policies from simulation to physical robots. Extensive comparison and ablation studies demonstrate the superiority of our approach, showing that DHER is a crucial ingredient to enable RL to solve tasks with dynamic goals in manipulation and grid world domains. |
Tasks | Object Tracking |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Byf5-30qFX |
https://openreview.net/pdf?id=Byf5-30qFX | |
PWC | https://paperswithcode.com/paper/dher-hindsight-experience-replay-for-dynamic |
Repo | https://github.com/mengf1/DHER |
Framework | none |
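A hedged sketch of DHER's experience assembly: search two failed episodes for a step where one episode's achieved goal coincides with the other's desired (moving) goal, then relabel the first episode with the second's goal trajectory so the assembled episode ends in a pseudo-success. The episode format below (dicts of per-step arrays) is an assumption.

```python
import numpy as np

def assemble_dher(ep_i, ep_j, tol=1e-3):
    """Given two failed episodes, return a relabeled copy of ep_i whose
    desired goals are borrowed from ep_j's goal trajectory, aligned so
    the final achieved goal matches the final desired goal."""
    for m, ag in enumerate(ep_i["achieved_goal"]):
        for n, dg in enumerate(ep_j["desired_goal"]):
            if n >= m and np.linalg.norm(ag - dg) < tol:
                new = {k: np.asarray(v)[: m + 1] for k, v in ep_i.items()}
                new["desired_goal"] = np.asarray(
                    ep_j["desired_goal"])[n - m : n + 1]
                return new   # relabeled episode that now ends in success
    return None              # no usable match between these failures
```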
Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations
Title | Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations |
Authors | Kevin Smith, Lingjie Mei, Shunyu Yao, Jiajun Wu, Elizabeth Spelke, Josh Tenenbaum, Tomer Ullman |
Abstract | From infancy, humans have expectations about how objects will move and interact. Even young children expect objects not to move through one another, teleport, or disappear. They are surprised by mismatches between physical expectations and perceptual observations, even in unfamiliar scenes with completely novel objects. A model that exhibits human-like understanding of physics should be similarly surprised, and adjust its beliefs accordingly. We propose ADEPT, a model that uses a coarse (approximate geometry) object-centric representation for dynamic 3D scene understanding. Inference integrates deep recognition networks, extended probabilistic physical simulation, and particle filtering for forming predictions and expectations across occlusion. We also present a new test set for measuring violations of physical expectations, using a range of scenarios derived from developmental psychology. We systematically compare ADEPT, baseline models, and human expectations on this test set. ADEPT outperforms standard network architectures in discriminating physically implausible scenes, and often performs this discrimination at the same level as people. |
Tasks | Scene Understanding |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9100-modeling-expectation-violation-in-intuitive-physics-with-coarse-probabilistic-object-representations |
http://papers.nips.cc/paper/9100-modeling-expectation-violation-in-intuitive-physics-with-coarse-probabilistic-object-representations.pdf | |
PWC | https://paperswithcode.com/paper/modeling-expectation-violation-in-intuitive |
Repo | https://github.com/JerryLingjieMei/ADEPT-Model-Release |
Framework | pytorch |
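A toy numpy sketch of the surprise signal in a particle-filtering framework: particles carry beliefs about object state, a trivial stochastic dynamics model propagates them, and surprise is the negative log-likelihood of the observation under the particle ensemble. ADEPT's recognition networks, physics simulator, and object representation are far richer than this.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 500
state = rng.normal(0.0, 1.0, size=n)        # particle beliefs about position
for obs in [0.1, 0.2, 5.0]:                 # the last observation "teleports"
    state = state + rng.normal(0.0, 0.1, n)           # stochastic dynamics
    w = np.exp(-0.5 * (obs - state) ** 2)             # observation likelihood
    surprise = -np.log(w.mean() + 1e-12)              # expectation violation
    w /= w.sum()
    state = rng.choice(state, size=n, p=w)            # resample posterior
    print(f"obs={obs}: surprise={surprise:.2f}")
```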
Rethinking Planar Homography Estimation Using Perspective Fields
Title | Rethinking Planar Homography Estimation Using Perspective Fields |
Authors | Rui Zeng, Simon Denman, Sridha Sridharan, Clinton Fookes |
Abstract | Planar homography estimation refers to the problem of computing a bijective linear mapping of pixels between two images. While this problem has been studied with convolutional neural networks (CNNs), existing methods simply regress the location of the four corners using a dense layer preceded by a fully-connected layer. This vector representation damages the spatial structure of the corners since they have a clear spatial order. Moreover, four points are the minimum required to compute the homography, and so such an approach is susceptible to perturbation. In this paper, we propose a conceptually simple, reliable, and general framework for homography estimation. In contrast to previous works, we formulate this problem as a perspective field (PF), which models the essence of the homography: pixel-to-pixel bijection. The PF is naturally learned by the proposed fully convolutional residual network, PFNet, to keep the spatial order of each pixel. Moreover, since every pixel’s displacement can be obtained from the PF, it enables robust homography estimation by utilizing dense correspondences. Our experiments demonstrate the proposed method outperforms traditional correspondence-based approaches and state-of-the-art CNN approaches in terms of accuracy while also having a smaller network size. In addition, the new parameterization of this task is general and can be implemented by any fully convolutional network (FCN) architecture. |
Tasks | Homography Estimation |
Published | 2019-05-26 |
URL | https://link.springer.com/chapter/10.1007/978-3-030-20876-9_36 |
https://eprints.qut.edu.au/126933/ | |
PWC | https://paperswithcode.com/paper/rethinking-planar-homography-estimation-using |
Repo | https://github.com/ruizengalways/PFNet |
Framework | tf |
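The perspective field itself follows directly from a homography, which makes this parameterization easy to reproduce: map every pixel through H, dehomogenize, and subtract the original coordinates. A numpy sketch:

```python
import numpy as np

def perspective_field(H, height, width):
    """Per-pixel displacement field induced by homography H: the dense
    pixel-to-pixel mapping that PFNet regresses instead of corner offsets."""
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # (3, H*W)
    mapped = H @ pts
    mapped = mapped[:2] / mapped[2]                  # dehomogenize
    return (mapped - pts[:2]).reshape(2, height, width)  # (dx, dy) per pixel

H = np.array([[1.0, 0.02, 3.0],
              [0.0, 1.00, -2.0],
              [1e-4, 0.0, 1.0]])
print(perspective_field(H, 8, 8)[:, 0, 0])   # displacement at pixel (0, 0)
```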
Bayesian Joint Estimation of Multiple Graphical Models
Title | Bayesian Joint Estimation of Multiple Graphical Models |
Authors | Lingrui Gan, Xinming Yang, Naveen Narisetty, Feng Liang |
Abstract | In this paper, we propose a novel Bayesian group regularization method based on the spike and slab Lasso priors for jointly estimating multiple graphical models. The proposed method can be used to estimate the common sparsity structure underlying the graphical models while capturing potential heterogeneity of the precision matrices corresponding to those models. Our theoretical results show that the proposed method enjoys the optimal rate of convergence in $\ell_\infty$ norm for estimation consistency and has a strong structure recovery guarantee even when the signal strengths over different graphs are heterogeneous. Through simulation studies and an application to the capital bike-sharing network data, we demonstrate the competitive performance of our method compared to existing alternatives. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9173-bayesian-joint-estimation-of-multiple-graphical-models |
http://papers.nips.cc/paper/9173-bayesian-joint-estimation-of-multiple-graphical-models.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-joint-estimation-of-multiple |
Repo | https://github.com/xinming104/GemBag |
Framework | none |
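A numpy sketch of a spike-and-slab Lasso log-prior of the kind the abstract builds on: a mixture of a sharp Laplace spike (shrinks entries to zero) and a diffuse Laplace slab (leaves real edges nearly unpenalized). The paper places a group version of this over entries across multiple precision matrices; the hyperparameters here are illustrative.

```python
import numpy as np

def ss_lasso_logprior(theta, lam_spike=20.0, lam_slab=1.0, pi=0.5):
    """Log density of a two-component Laplace mixture at theta."""
    def laplace(lam):
        return 0.5 * lam * np.exp(-lam * np.abs(theta))
    return np.log(pi * laplace(lam_spike) + (1 - pi) * laplace(lam_slab))

# Small entries fall under the spike; large entries escape to the slab.
for t in [0.0, 0.05, 1.0]:
    print(t, ss_lasso_logprior(np.array(t)))
```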
Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation
Title | Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation |
Authors | Ying-Cong Chen, Xiaogang Xu, Zhuotao Tian, Jiaya Jia |
Abstract | Generative adversarial networks have achieved great success in unpaired image-to-image translation. Cycle consistency allows modeling the relationship between two distinct domains without paired data. In this paper, we propose an alternative framework, as an extension of latent space interpolation, to consider the intermediate region between two domains during translation. It is based on the fact that in a flat and smooth latent space, there exist many paths that connect two sample points. Properly selecting paths makes it possible to change only certain image attributes, which is useful for generating intermediate images between the two domains. We also show that this framework can be applied to multi-domain and multi-modal translation. Extensive experiments manifest its generality and applicability to various tasks. |
Tasks | Image-to-Image Translation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Chen_Homomorphic_Latent_Space_Interpolation_for_Unpaired_Image-To-Image_Translation_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Chen_Homomorphic_Latent_Space_Interpolation_for_Unpaired_Image-To-Image_Translation_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/homomorphic-latent-space-interpolation-for |
Repo | https://github.com/yingcong/HomoInterpGAN |
Framework | pytorch |
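A minimal PyTorch-tensor sketch of attribute-selective latent interpolation as the abstract describes: interpolate only the latent components tied to a chosen attribute, keep the rest fixed, and (in the full model) decode each intermediate point. The encoder/decoder and the attribute-to-dimension mapping are placeholders, not the paper's model.

```python
import torch

def interpolate(z_src, z_tgt, attr_mask, alpha):
    """Move alpha of the way from z_src to z_tgt, only on masked dims."""
    return z_src + alpha * attr_mask * (z_tgt - z_src)

z_a, z_b = torch.randn(1, 8), torch.randn(1, 8)   # encodings of two images
mask = torch.tensor([[1., 1., 0., 0., 0., 0., 0., 0.]])  # assumed attribute dims
for alpha in (0.0, 0.5, 1.0):     # intermediate points along the chosen path
    print(interpolate(z_a, z_b, mask, alpha))
```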