Paper Group AWR 219
Joint Bootstrapping Machines for High Confidence Relation Extraction. Merging datasets through deep learning. Geometry-Based Data Generation. Point Cloud GAN. Improving Annotation for 3D Pose Dataset of Fine-Grained Object Categories. Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization. RankGAN: A Maximum Margin Ranking GAN for Generating Faces. Dual Encoding for Zero-Example Video Retrieval. Generative Modeling using the Sliced Wasserstein Distance. iSPA-Net: Iterative Semantic Pose Alignment Network. Complex-valued Neural Networks with Non-parametric Activation Functions. NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval. Adviser Networks: Learning What Question to Ask for Human-In-The-Loop Viewpoint Estimation. APPLE Picker: Automatic Particle Picking, a Low-Effort Cryo-EM Framework. Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation.
Joint Bootstrapping Machines for High Confidence Relation Extraction
Title | Joint Bootstrapping Machines for High Confidence Relation Extraction |
Authors | Pankaj Gupta, Benjamin Roth, Hinrich Schütze |
Abstract | Semi-supervised bootstrapping techniques for relationship extraction from text iteratively expand a set of initial seed instances. Due to the lack of labeled data, a key challenge in bootstrapping is semantic drift: if a false positive instance is added during an iteration, then all following iterations are contaminated. We introduce BREX, a new bootstrapping method that protects against such contamination by highly effective confidence assessment. This is achieved by using entity and template seeds jointly (as opposed to just one as in previous work), by expanding entities and templates in parallel and in a mutually constraining fashion in each iteration, and by introducing higher-quality similarity measures for templates. Experimental results show that BREX achieves an F1 that is 0.13 (0.87 vs. 0.74) better than the state of the art for four relationships. |
Tasks | Relation Extraction, Relationship Extraction (Distant Supervised) |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00254v1 |
PDF | http://arxiv.org/pdf/1805.00254v1.pdf |
PWC | https://paperswithcode.com/paper/joint-bootstrapping-machines-for-high |
Repo | https://github.com/pgcool/Joint-Bootstrapping-Machines |
Framework | none |
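The joint confidence gate at the heart of BREX can be summarized in a few lines. Below is a minimal, hypothetical sketch of one bootstrapping loop in which an instance is promoted only when both its entity pair and its template look similar to existing seeds; Jaccard similarity stands in for the paper's higher-quality template similarity measures, and the threshold is illustrative:

```python
def jaccard(a, b):
    # crude stand-in for the paper's similarity measures
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def bootstrap(instances, seed_pairs, seed_templates, iterations=5, threshold=0.7):
    """instances: list of (entity_pair, template_tokens) tuples extracted from a corpus."""
    pairs, templates = set(seed_pairs), set(seed_templates)
    for _ in range(iterations):
        new_pairs, new_templates = set(), set()
        for pair, template in instances:
            # entity view and template view each score the candidate...
            pair_conf = max(jaccard(pair, p) for p in pairs)
            tmpl_conf = max(jaccard(template, t) for t in templates)
            # ...and BOTH must be confident, which is the mutual constraint
            # that guards against semantic drift
            if min(pair_conf, tmpl_conf) >= threshold:
                new_pairs.add(pair)
                new_templates.add(template)
        pairs |= new_pairs
        templates |= new_templates
    return pairs, templates
```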
Merging datasets through deep learning
Title | Merging datasets through deep learning |
Authors | Kavitha Srinivas, Abraham Gale, Julian Dolby |
Abstract | Merging datasets is a key operation for data analytics. A frequent requirement for merging is joining across columns that have different surface forms for the same entity (e.g., the name of a person might be represented as “Douglas Adams” or “Adams, Douglas”). Similarly, ontology alignment can require recognizing distinct surface forms of the same entity, especially when ontologies are independently developed. However, data management systems are currently limited to performing merges based on string equality, or at best using string similarity. We propose an approach to performing merges based on deep learning models. Our approach depends on (a) creating a deep learning model that maps surface forms of an entity into a set of vectors such that alternate forms for the same entity are closest in vector space, (b) indexing these vectors using a nearest neighbors algorithm to find the forms that can be potentially joined together. To build these models, we had to adapt techniques from metric learning due to the characteristics of the data; specifically we describe novel sample selection techniques and loss functions that work for this problem. To evaluate our approach, we used Wikidata as ground truth and built models from datasets with approximately 1.1M people’s names (200K identities) and 130K company names (70K identities). We developed models that allow for joins with precision@1 of .75-.81 and recall of .74-.81. We make the models available for aligning people or companies across multiple datasets. |
Tasks | Metric Learning |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01604v1 |
PDF | http://arxiv.org/pdf/1809.01604v1.pdf |
PWC | https://paperswithcode.com/paper/merging-datasets-through-deep-learning |
Repo | https://github.com/yehudagale/fuzzyjoiner |
Framework | tf |
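The join step the abstract describes — embed surface forms as vectors, then index them with a nearest-neighbors algorithm — can be sketched as follows. Character-bigram count vectors stand in here for the learned metric embedding; in the paper the vectors would come from the trained deep model:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import NearestNeighbors

left = ["Douglas Adams", "Ada Lovelace"]
right = ["Adams, Douglas", "Lovelace, Ada", "Alan Turing"]

# toy embedding: character bigram counts (the paper uses a learned model)
vec = CountVectorizer(analyzer="char", ngram_range=(2, 2)).fit(left + right)
L, R = vec.transform(left), vec.transform(right)

# index the right-hand table, then join each left name to its nearest form
nn = NearestNeighbors(n_neighbors=1, metric="cosine").fit(R)
dist, idx = nn.kneighbors(L)
for name, d, i in zip(left, dist.ravel(), idx.ravel()):
    print(f"{name!r} -> {right[i]!r} (cosine distance {d:.2f})")
```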
Geometry-Based Data Generation
Title | Geometry-Based Data Generation |
Authors | Ofir Lindenbaum, Jay S. Stanley III, Guy Wolf, Smita Krishnaswamy |
Abstract | Many generative models attempt to replicate the density of their input data. However, this approach is often undesirable, since data density is highly affected by sampling biases, noise, and artifacts. We propose a method called SUGAR (Synthesis Using Geometrically Aligned Random-walks) that uses a diffusion process to learn a manifold geometry from the data. Then, it generates new points evenly along the manifold by pulling randomly generated points into its intrinsic structure using a diffusion kernel. SUGAR equalizes the density along the manifold by selectively generating points in sparse areas of the manifold. We demonstrate how the approach corrects sampling biases and artifacts, while also revealing intrinsic patterns (e.g. progression) and relations in the data. The method is applicable for correcting missing data, finding hypothetical data points, and learning relationships between data features. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04927v4 |
PDF | http://arxiv.org/pdf/1802.04927v4.pdf |
PWC | https://paperswithcode.com/paper/geometry-based-data-generation |
Repo | https://github.com/KrishnaswamyLab/SUGAR |
Framework | none |
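A toy numpy sketch of the density-equalizing idea, under simplifying assumptions (a fixed-bandwidth Gaussian kernel, a heuristic per-point generation count, and a single diffusion step; the paper's construction is more careful):

```python
import numpy as np

rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(t), np.sin(t)] + 0.05 * rng.standard_normal((200, 2))  # noisy circle

def gaussian_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

sigma = 0.3
K = gaussian_kernel(X, X, sigma)
density = K.sum(1)                               # kernel density estimate per point
n_new = (density.max() - density).astype(int)    # generate more points where sparse

# scatter candidates around low-density points...
Y = np.concatenate([x + sigma * rng.standard_normal((k, 2))
                    for x, k in zip(X, n_new) if k > 0])
# ...then pull them into the manifold with one step of the diffusion operator
P = gaussian_kernel(Y, X, sigma)
P /= P.sum(1, keepdims=True)                     # row-stochastic diffusion kernel
Y_pulled = P @ X
```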
Point Cloud GAN
Title | Point Cloud GAN |
Authors | Chun-Liang Li, Manzil Zaheer, Yang Zhang, Barnabas Poczos, Ruslan Salakhutdinov |
Abstract | Generative Adversarial Networks (GANs) can achieve promising performance on learning complex data distributions on different types of data. In this paper, we first show that a straightforward extension of existing GAN algorithms is not applicable to point clouds, because the constraint required for discriminators is undefined for set data. We propose a twofold modification to the GAN algorithm for learning to generate point clouds (PC-GAN). First, we combine ideas from hierarchical Bayesian modeling and implicit generative models by learning a hierarchical and interpretable sampling process. A key component of our method is that we train a posterior inference network for the hidden variables. Second, instead of using only the state-of-the-art Wasserstein GAN objective, we propose a sandwiching objective, which results in a tighter Wasserstein distance estimate than the commonly used dual form. Thereby, PC-GAN defines a generic framework that can incorporate many existing GAN algorithms. We validate our claims on the ModelNet40 benchmark dataset. Using the distance between generated point clouds and true meshes as a metric, we find that PC-GAN trained with the sandwiching objective achieves better results on test data than existing methods. Moreover, as a byproduct, PC-GAN learns versatile latent representations of point clouds, which can achieve competitive performance with other unsupervised learning algorithms on the object recognition task. Lastly, we also provide studies on generating unseen classes of objects and transforming images to point clouds, which demonstrate the compelling generalization capability and potential of PC-GAN. |
Tasks | Object Recognition |
Published | 2018-10-13 |
URL | http://arxiv.org/abs/1810.05795v1 |
PDF | http://arxiv.org/pdf/1810.05795v1.pdf |
PWC | https://paperswithcode.com/paper/point-cloud-gan |
Repo | https://github.com/chunliangli/Point-Cloud-GAN |
Framework | tf |
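The sandwiching objective combines a dual-form lower bound on the Wasserstein distance with an upper bound. The sketch below is a hedged illustration of that idea, not the paper's exact construction: the lower bound comes from a WGAN-style critic (assumed approximately 1-Lipschitz), the upper bound from the cost of an explicit batch matching, and the 0.5/0.5 mixing weight is an assumption:

```python
import torch
from scipy.optimize import linear_sum_assignment

def lower_bound(critic, real, fake):
    # Kantorovich dual form: any (approximately) 1-Lipschitz critic lower-bounds W1
    return critic(real).mean() - critic(fake).mean()

def upper_bound(real, fake):
    # the cost of any explicit coupling upper-bounds W1 on the batch; here an
    # optimal bipartite matching between two equal-size batches
    cost = torch.cdist(real, fake)
    rows, cols = linear_sum_assignment(cost.detach().cpu().numpy())
    return cost[torch.as_tensor(rows), torch.as_tensor(cols)].mean()

def sandwiched_w1(critic, real, fake, lam=0.5):
    # lam is an assumed mixing weight, not the paper's
    return lam * upper_bound(real, fake) + (1 - lam) * lower_bound(critic, real, fake)
```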
Improving Annotation for 3D Pose Dataset of Fine-Grained Object Categories
Title | Improving Annotation for 3D Pose Dataset of Fine-Grained Object Categories |
Authors | Yaming Wang, Xiao Tan, Yi Yang, Ziyu Li, Xiao Liu, Feng Zhou, Larry S. Davis |
Abstract | Existing 3D pose datasets of object categories are limited to generic object types and lack fine-grained information. In this work, we introduce a new large-scale dataset that consists of 409 fine-grained categories and 31,881 images with accurate 3D pose annotation. Specifically, we augment three existing fine-grained object recognition datasets (StanfordCars, CompCars and FGVC-Aircraft) by finding a specific 3D model for each sub-category from ShapeNet and manually annotating each 2D image by adjusting a full set of 7 continuous perspective parameters. Since the fine-grained shapes allow 3D models to better fit the images, we further improve the annotation quality by initializing from the human annotation and conducting local search of the pose parameters with the objective of maximizing the IoUs between the projected mask and the segmentation reference estimated from state-of-the-art deep Convolutional Neural Networks (CNNs). We provide full statistics of the annotations with qualitative and quantitative comparisons suggesting that our dataset can be a complementary source for studying 3D pose estimation. The dataset can be downloaded at http://users.umiacs.umd.edu/~wym/3dpose.html. |
Tasks | 3D Pose Estimation, Object Recognition, Pose Estimation |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.09263v1 |
PDF | http://arxiv.org/pdf/1810.09263v1.pdf |
PWC | https://paperswithcode.com/paper/improving-annotation-for-3d-pose-dataset-of |
Repo | https://github.com/yangyi02/3d_pose_fine_grained |
Framework | none |
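The refinement step — local search over the 7 pose parameters to maximize IoU between the projected mask and a CNN segmentation — can be sketched as a simple stochastic hill climb. `render_mask` is a hypothetical user-supplied renderer, and the step size and iteration budget are illustrative:

```python
import numpy as np

def iou(a, b):
    # intersection-over-union of two boolean masks
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def refine_pose(params, seg_mask, render_mask, steps=200, scale=0.01, seed=0):
    """Greedy local search starting from the human annotation.

    params: the 7 continuous perspective parameters
    render_mask: callable mapping params -> projected boolean mask (user-supplied)
    seg_mask: CNN segmentation reference (boolean mask)
    """
    rng = np.random.default_rng(seed)
    params = np.asarray(params, dtype=float)
    best = iou(render_mask(params), seg_mask)
    for _ in range(steps):
        cand = params + scale * rng.standard_normal(params.shape)
        score = iou(render_mask(cand), seg_mask)
        if score > best:          # accept only improving perturbations
            params, best = cand, score
    return params, best
```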
Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization
Title | Improving DNN Robustness to Adversarial Attacks using Jacobian Regularization |
Authors | Daniel Jakubovitz, Raja Giryes |
Abstract | Deep neural networks have lately shown tremendous performance in various applications including vision and speech processing tasks. However, alongside their ability to perform these tasks with such high accuracy, it has been shown that they are highly susceptible to adversarial attacks: a small change in the input can cause the network to err with high confidence. This phenomenon exposes an inherent fault in these networks and their ability to generalize well. For this reason, providing robustness to adversarial attacks is an important challenge in network training, which has led to extensive research. In this work, we suggest a novel, theoretically inspired approach to improve network robustness. Our method applies regularization using the Frobenius norm of the Jacobian of the network, as a post-processing step after regular training has finished. We demonstrate empirically that it leads to enhanced robustness with a minimal change in the original network’s accuracy. |
Tasks | |
Published | 2018-03-23 |
URL | https://arxiv.org/abs/1803.08680v4 |
PDF | https://arxiv.org/pdf/1803.08680v4.pdf |
PWC | https://paperswithcode.com/paper/improving-dnn-robustness-to-adversarial |
Repo | https://github.com/danieljakubovitz/Jacobian_Regularization |
Framework | tf |
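The regularizer itself is easy to state: penalize the squared Frobenius norm of the network's input-output Jacobian during a post-training fine-tuning phase. A minimal PyTorch sketch — exact computation by looping over output dimensions; large output spaces would typically use a random-projection approximation instead, and the weight 1e-2 is illustrative:

```python
import torch

def jacobian_frobenius_sq(model, x):
    # squared Frobenius norm of dy/dx, averaged over the batch
    x = x.clone().requires_grad_(True)
    y = model(x)                          # (B, C) logits
    total = 0.0
    for c in range(y.shape[1]):
        # per-sample gradient of output c w.r.t. the input: one Jacobian row
        g, = torch.autograd.grad(y[:, c].sum(), x, create_graph=True)
        total = total + (g ** 2).sum()
    return total / x.shape[0]

# post-processing fine-tuning step: task loss plus the Jacobian penalty
# loss = criterion(model(x), target) + 1e-2 * jacobian_frobenius_sq(model, x)
```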
RankGAN: A Maximum Margin Ranking GAN for Generating Faces
Title | RankGAN: A Maximum Margin Ranking GAN for Generating Faces |
Authors | Rahul Dey, Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides |
Abstract | We present a new stage-wise learning paradigm for training generative adversarial networks (GANs). The goal of our work is to progressively strengthen the discriminator, and thus the generator, with each subsequent stage without changing the network architecture. We call this proposed method RankGAN. We first propose a margin-based loss for the GAN discriminator. We then extend it to a margin-based ranking loss to train the multiple stages of RankGAN. We focus on face images from the CelebA dataset in our work and show visual as well as quantitative improvements in face generation and completion tasks over other GAN approaches, including WGAN and LSGAN. |
Tasks | Face Generation |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.08196v1 |
PDF | http://arxiv.org/pdf/1812.08196v1.pdf |
PWC | https://paperswithcode.com/paper/rankgan-a-maximum-margin-ranking-gan-for |
Repo | https://github.com/human-analysis/RankGAN |
Framework | pytorch |
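A hedged sketch of the margin-based ranking idea: at stage k the discriminator is pushed to rank real images above the current generator's samples, and those above an earlier stage's samples, each by a margin. The pairwise hinge form and the single margin value are simplifying assumptions:

```python
import torch
import torch.nn.functional as F

def margin_rank_loss(d_real, d_curr, d_prev, margin=1.0):
    """Ranking loss over discriminator scores for one stage.

    d_real: scores on real images
    d_curr: scores on the current-stage generator's samples
    d_prev: scores on an earlier stage's samples
    """
    # hinge terms enforcing d_real >= d_curr + margin and d_curr >= d_prev + margin
    loss_real_curr = F.relu(margin - (d_real - d_curr)).mean()
    loss_curr_prev = F.relu(margin - (d_curr - d_prev)).mean()
    return loss_real_curr + loss_curr_prev
```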
Dual Encoding for Zero-Example Video Retrieval
Title | Dual Encoding for Zero-Example Video Retrieval |
Authors | Jianfeng Dong, Xirong Li, Chaoxi Xu, Shouling Ji, Yuan He, Gang Yang, Xun Wang |
Abstract | This paper attacks the challenging problem of zero-example video retrieval. In such a retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described in natural language text with no visual example provided. Given videos as sequences of frames and queries as sequences of words, an effective sequence-to-sequence cross-modal matching is required. The majority of existing methods are concept based, extracting relevant concepts from queries and videos and accordingly establishing associations between the two modalities. In contrast, this paper takes a concept-free approach, proposing a dual deep encoding network that encodes videos and queries into powerful dense representations of their own. Dual encoding is conceptually simple, practically effective and end-to-end. As experiments on three benchmarks, i.e., MSR-VTT and the TRECVID 2016 and 2017 Ad-hoc Video Search tasks, show, the proposed solution establishes a new state-of-the-art for zero-example video retrieval. |
Tasks | Video Retrieval |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06181v3 |
PDF | http://arxiv.org/pdf/1809.06181v3.pdf |
PWC | https://paperswithcode.com/paper/dual-dense-encoding-for-zero-example-video |
Repo | https://github.com/danieljf24/dual_encoding |
Framework | pytorch |
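A drastically simplified sketch of the concept-free dual encoding setup: one sequence encoder per modality maps frames or words into a common space where retrieval is cosine similarity. A single GRU stands in for the paper's multi-level (mean pooling + biGRU + CNN) encoders, and the feature dimensions are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqEncoder(nn.Module):
    """Encode a variable-length sequence into one unit-length vector."""
    def __init__(self, in_dim, out_dim=512):
        super().__init__()
        self.rnn = nn.GRU(in_dim, out_dim, batch_first=True)

    def forward(self, seq):               # (B, T, in_dim)
        _, h = self.rnn(seq)
        return F.normalize(h[-1], dim=-1) # unit-length code in the common space

video_enc = SeqEncoder(2048)              # per-frame CNN features (assumed 2048-d)
text_enc = SeqEncoder(300)                # word embeddings (assumed 300-d)

v = video_enc(torch.randn(4, 30, 2048))   # 4 candidate videos, 30 frames each
q = text_enc(torch.randn(1, 8, 300))      # 1 natural-language query, 8 words
scores = q @ v.t()                        # cosine similarities in the common space
print(scores.argmax().item())             # index of the best-matching video
```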
Generative Modeling using the Sliced Wasserstein Distance
Title | Generative Modeling using the Sliced Wasserstein Distance |
Authors | Ishan Deshpande, Ziyu Zhang, Alexander Schwing |
Abstract | Generative Adversarial Nets (GANs) are very successful at modeling distributions from given samples, even in the high-dimensional case. However, their formulation is also known to be hard to optimize and often not stable. While this is particularly true for early GAN formulations, there has been significant empirically motivated and theoretically founded progress to improve stability, for instance, by using the Wasserstein distance rather than the Jensen-Shannon divergence. Here, we consider an alternative formulation for generative modeling based on random projections which, in its simplest form, results in a single objective rather than a saddle-point formulation. By augmenting this approach with a discriminator we improve its accuracy. We found our approach to be significantly more stable compared to even the improved Wasserstein GAN. Further, unlike the traditional GAN loss, the loss formulated in our method is a good measure of the actual distance between the distributions and, for the first time for GAN training, we are able to show estimates for the same. |
Tasks | |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.11188v1 |
PDF | http://arxiv.org/pdf/1803.11188v1.pdf |
PWC | https://paperswithcode.com/paper/generative-modeling-using-the-sliced |
Repo | https://github.com/ishansd/swg |
Framework | tf |
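The single-objective formulation is compact enough to show directly: project both sample sets onto random directions, sort the one-dimensional projections, and average the squared transport costs. A minimal sketch, assuming equal batch sizes; the projection count is illustrative:

```python
import torch

def sliced_wasserstein_sq(x, y, n_proj=64):
    """Sliced Wasserstein-2 distance (squared) between two equal-size batches."""
    d = x.shape[1]
    theta = torch.randn(d, n_proj, device=x.device)
    theta = theta / theta.norm(dim=0, keepdim=True)  # random unit directions
    # in 1-D, optimal transport just matches sorted samples
    px, _ = torch.sort(x @ theta, dim=0)
    py, _ = torch.sort(y @ theta, dim=0)
    return ((px - py) ** 2).mean()

# generator step (no saddle point): loss = sliced_wasserstein_sq(G(z), real_batch)
```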
iSPA-Net: Iterative Semantic Pose Alignment Network
Title | iSPA-Net: Iterative Semantic Pose Alignment Network |
Authors | Jogendra Nath Kundu, Aditya Ganeshan, Rahul M. V., Aditya Prakash, R. Venkatesh Babu |
Abstract | Understanding and extracting 3D information of objects from monocular 2D images is a fundamental problem in computer vision. In the task of 3D object pose estimation, recent data driven deep neural network based approaches suffer from scarcity of real images with 3D keypoint and pose annotations. Drawing inspiration from human cognition, where the annotators use a 3D CAD model as structural reference to acquire ground-truth viewpoints for real images; we propose an iterative Semantic Pose Alignment Network, called iSPA-Net. Our approach focuses on exploiting semantic 3D structural regularity to solve the task of fine-grained pose estimation by predicting viewpoint difference between a given pair of images. Such image comparison based approach also alleviates the problem of data scarcity and hence enhances scalability of the proposed approach for novel object categories with minimal annotation. The fine-grained object pose estimator is also aided by correspondence of learned spatial descriptor of the input image pair. The proposed pose alignment framework enjoys the faculty to refine its initial pose estimation in consecutive iterations by utilizing an online rendering setup along with effectiveness of a non-uniform bin classification of pose-difference. This enables iSPA-Net to achieve state-of-the-art performance on various real image viewpoint estimation datasets. Further, we demonstrate effectiveness of the approach for multiple applications. First, we show results for active object viewpoint localization to capture images from similar pose considering only a single image as pose reference. Second, we demonstrate the ability of the learned semantic correspondence to perform unsupervised part-segmentation transfer using only a single part-annotated 3D template model per object class. To encourage reproducible research, we have released the codes for our proposed algorithm. |
Tasks | Pose Estimation, Viewpoint Estimation |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01134v1 |
PDF | http://arxiv.org/pdf/1808.01134v1.pdf |
PWC | https://paperswithcode.com/paper/ispa-net-iterative-semantic-pose-alignment |
Repo | https://github.com/val-iisc/iSPA-Net |
Framework | none |
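A hedged sketch of the iterative alignment loop: render the current estimate, predict the binned viewpoint difference between the input image and the rendering, apply it, and repeat. `render`, `predict_delta_bin`, and the non-uniform bin centers below are hypothetical placeholders for the paper's components:

```python
import numpy as np

# non-uniform bins: finer near zero, coarser for large pose differences
BIN_CENTERS = np.array([-60, -30, -15, -5, 0, 5, 15, 30, 60], dtype=float)

def align_pose(image, pose0, render, predict_delta_bin, iters=4):
    """Iteratively refine a pose estimate against a reference rendering.

    render: callable pose -> rendered reference image (online rendering setup)
    predict_delta_bin: callable (image, reference) -> one bin index per angle
    """
    pose = np.asarray(pose0, dtype=float)        # e.g. (azimuth, elevation)
    for _ in range(iters):
        reference = render(pose)
        bins = predict_delta_bin(image, reference)
        pose = pose + BIN_CENTERS[np.asarray(bins)]  # apply predicted difference
    return pose
```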
Complex-valued Neural Networks with Non-parametric Activation Functions
Title | Complex-valued Neural Networks with Non-parametric Activation Functions |
Authors | Simone Scardapane, Steven Van Vaerenbergh, Amir Hussain, Aurelio Uncini |
Abstract | Complex-valued neural networks (CVNNs) are a powerful modeling tool for domains where data can be naturally interpreted in terms of complex numbers. However, several analytical properties of the complex domain (e.g., holomorphicity) make the design of CVNNs a more challenging task than their real counterpart. In this paper, we consider the problem of flexible activation functions (AFs) in the complex domain, i.e., AFs endowed with sufficient degrees of freedom to adapt their shape given the training data. While this problem has received considerable attention in the real case, a very limited literature exists for CVNNs, where most activation functions are generally developed in a split fashion (i.e., by considering the real and imaginary parts of the activation separately) or with simple phase-amplitude techniques. Leveraging the recently proposed kernel activation functions (KAFs), and related advances in the design of complex-valued kernels, we propose the first fully complex, non-parametric activation function for CVNNs, which is based on a kernel expansion with a fixed dictionary that can be implemented efficiently on vectorized hardware. Several experiments on common use cases, including prediction and channel equalization, validate our proposal when compared to real-valued neural networks and CVNNs with fixed activation functions. |
Tasks | |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.08026v1 |
PDF | http://arxiv.org/pdf/1802.08026v1.pdf |
PWC | https://paperswithcode.com/paper/complex-valued-neural-networks-with-non |
Repo | https://github.com/omrijsharon/torchlex |
Framework | pytorch |
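A sketch of what a non-parametric complex activation can look like: a kernel expansion over a fixed dictionary grid in the complex plane with trainable complex mixing coefficients. A real Gaussian of the complex distance is used here as a simplification of the complex-valued kernels discussed in the paper, and the grid size and bandwidth are illustrative:

```python
import torch
import torch.nn as nn

class ComplexKAF(nn.Module):
    """Kernel activation over a fixed complex-plane dictionary (simplified)."""
    def __init__(self, n_dict=20, span=2.0, gamma=1.0):
        super().__init__()
        side = torch.linspace(-span, span, n_dict)
        re, im = torch.meshgrid(side, side, indexing="ij")
        self.register_buffer("dict", torch.complex(re, im).reshape(-1))  # fixed grid
        # trainable complex coefficients (in practice initialized to mimic
        # a reasonable starting activation rather than zeros)
        self.alpha = nn.Parameter(torch.zeros(n_dict * n_dict, dtype=torch.cfloat))
        self.gamma = gamma

    def forward(self, z):                                  # complex tensor
        diff = z.unsqueeze(-1) - self.dict                 # broadcast over dictionary
        k = torch.exp(-self.gamma * diff.abs() ** 2)       # real Gaussian kernel
        return (k.to(torch.cfloat) * self.alpha).sum(-1)   # complex mixture

act = ComplexKAF()
out = act(torch.randn(8, dtype=torch.cfloat))
```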
NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval
Title | NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval |
Authors | Canjia Li, Yingfei Sun, Ben He, Le Wang, Kai Hui, Andrew Yates, Le Sun, Jungang Xu |
Abstract | Pseudo-relevance feedback (PRF) is commonly used to boost the performance of traditional information retrieval (IR) models by using top-ranked documents to identify and weight new query terms, thereby reducing the effect of query-document vocabulary mismatches. While neural retrieval models have recently demonstrated strong results for ad-hoc retrieval, combining them with PRF is not straightforward due to incompatibilities between existing PRF approaches and neural architectures. To bridge this gap, we propose an end-to-end neural PRF framework that can be used with existing neural IR models by embedding different neural models as building blocks. Extensive experiments on two standard test collections confirm the effectiveness of the proposed NPRF framework in improving the performance of two state-of-the-art neural IR models. |
Tasks | Ad-Hoc Information Retrieval, Information Retrieval |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12936v1 |
PDF | http://arxiv.org/pdf/1810.12936v1.pdf |
PWC | https://paperswithcode.com/paper/nprf-a-neural-pseudo-relevance-feedback |
Repo | https://github.com/ucasir/NPRF |
Framework | tf |
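A hedged sketch of the scoring scheme the abstract implies: rerank a document by combining its own query score with neural similarities to the top-k pseudo-relevant feedback documents, each weighted by that feedback document's query score. The linear combination and the `rel` / `neural_sim` callables are illustrative assumptions; the paper learns the framework end to end with embedded neural IR models:

```python
def nprf_score(query, doc, ranked_docs, rel, neural_sim, k=10, alpha=0.5):
    """Pseudo-relevance-feedback rescoring of one candidate document.

    ranked_docs: initial ranking for the query
    rel: callable (query, doc) -> relevance score of the base retrieval model
    neural_sim: callable (doc, doc) -> neural document-document similarity
    """
    feedback = ranked_docs[:k]                   # top-k pseudo-relevant documents
    weights = [rel(query, f) for f in feedback]  # weight by each doc's own query score
    total = sum(weights) or 1.0
    prf = sum(w * neural_sim(f, doc) for w, f in zip(weights, feedback)) / total
    # interpolate the direct query-document score with the feedback evidence
    return alpha * rel(query, doc) + (1 - alpha) * prf
```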
Adviser Networks: Learning What Question to Ask for Human-In-The-Loop Viewpoint Estimation
Title | Adviser Networks: Learning What Question to Ask for Human-In-The-Loop Viewpoint Estimation |
Authors | Mohamed El Banani, Jason J. Corso |
Abstract | Humans have an unparalleled visual intelligence and can overcome visual ambiguities that machines currently cannot. Recent works have shown that incorporating guidance from humans during inference for monocular viewpoint-estimation can help overcome difficult cases in which the computer-alone would have otherwise failed. These hybrid intelligence approaches are hence gaining traction. However, deciding what question to ask the human at inference time remains an open question for these problems. We address this question by formulating it as an Adviser Problem: can we learn a mapping from the input to a specific question to ask the human to maximize the expected positive impact to the overall task? We formulate a solution to the adviser problem for viewpoint estimation using a deep network where the question asks for the location of a keypoint in the input image. We show that by using the Adviser Network’s recommendations, the model and the human together outperform the previous hybrid-intelligence state-of-the-art by 3.7%, and the computer-only state-of-the-art by 5.28% absolute. |
Tasks | Viewpoint Estimation |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01666v3 |
PDF | http://arxiv.org/pdf/1802.01666v3.pdf |
PWC | https://paperswithcode.com/paper/adviser-networks-learning-what-question-to |
Repo | https://github.com/mbanani/adviser_networks |
Framework | pytorch |
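A sketch of the adviser idea under toy assumptions: a network scores every candidate question (here, which of a fixed set of keypoints to ask the human to localize) and the system asks the highest-scoring one. The backbone and keypoint count are illustrative stand-ins, not the paper's architecture:

```python
import torch
import torch.nn as nn

N_KEYPOINTS = 12                                  # candidate questions (assumed)

# toy image -> question-score network; the paper uses a deep CNN
adviser = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, N_KEYPOINTS),
)

image = torch.randn(1, 3, 224, 224)
# ask the question with the highest predicted benefit to viewpoint estimation
question = adviser(image).argmax(dim=1).item()
print(f"ask the human to localize keypoint {question}")
```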
APPLE Picker: Automatic Particle Picking, a Low-Effort Cryo-EM Framework
Title | APPLE Picker: Automatic Particle Picking, a Low-Effort Cryo-EM Framework |
Authors | Ayelet Heimowitz, Joakim Andén, Amit Singer |
Abstract | Particle picking is a crucial first step in the computational pipeline of single-particle cryo-electron microscopy (cryo-EM). Selecting particles from the micrographs is difficult, especially for small particles with low contrast. As high-resolution reconstruction typically requires hundreds of thousands of particles, manually picking that many particles is often too time-consuming. While semi-automated particle picking is currently a popular approach, it may suffer from introducing manual bias into the selection process. In addition, semi-automated particle picking is still somewhat time-consuming. This paper presents the APPLE (Automatic Particle Picking with Low user Effort) picker, a simple and novel approach for fast, accurate, and fully automatic particle picking. While our approach was inspired by template matching, it is completely template-free. This approach is evaluated on publicly available datasets containing micrographs of $\beta$-galactosidase and keyhole limpet hemocyanin projections. |
Tasks | |
Published | 2018-02-01 |
URL | http://arxiv.org/abs/1802.00469v2 |
PDF | http://arxiv.org/pdf/1802.00469v2.pdf |
PWC | https://paperswithcode.com/paper/apple-picker-automatic-particle-picking-a-low |
Repo | https://github.com/PrincetonUniversity/APPLEpicker |
Framework | none |
Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation
Title | Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation |
Authors | Xiang Kong, Qizhe Xie, Zihang Dai, Eduard Hovy |
Abstract | Mixture of Softmaxes (MoS) has been shown to be effective at addressing the expressiveness limitation of Softmax-based models. Despite this known advantage, MoS is held back in practice by its large memory and computation consumption, due to the need to compute multiple Softmaxes. In this work, we set out to unleash the power of MoS in practical applications by investigating improved word coding schemes, which could effectively reduce the vocabulary size and hence relieve the memory and computation burden. We show that both BPE and our proposed Hybrid-LightRNN lead to improved encoding mechanisms that can halve the time and memory consumption of MoS without performance losses. With MoS, we achieve an improvement of 1.5 BLEU on the IWSLT 2014 German-to-English corpus and an improvement of 0.76 CIDEr on image captioning. Moreover, on the larger WMT 2014 machine translation dataset, our MoS-boosted Transformer yields a 29.5 BLEU score for English-to-German and a 42.1 BLEU score for English-to-French, outperforming the single-Softmax Transformer by 0.8 and 0.4 BLEU respectively and achieving the state-of-the-art result on the WMT 2014 English-to-German task. |
Tasks | Image Captioning, Machine Translation, Text Generation |
Published | 2018-09-25 |
URL | https://arxiv.org/abs/1809.09296v2 |
PDF | https://arxiv.org/pdf/1809.09296v2.pdf |
PWC | https://paperswithcode.com/paper/fast-and-simple-mixture-of-softmaxes-with-bpe |
Repo | https://github.com/shawnkx/Fast-MoS |
Framework | tf |
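Since MoS itself is what the coding schemes accelerate, a minimal PyTorch head makes the cost concrete: K vocabulary-sized softmaxes must be computed and mixed per position, which is why shrinking the vocabulary with BPE or Hybrid-LightRNN directly cuts time and memory. A minimal sketch with illustrative sizes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoS(nn.Module):
    """Mixture of Softmaxes output head."""
    def __init__(self, d_model, vocab, k=4):
        super().__init__()
        self.k = k
        self.prior = nn.Linear(d_model, k)            # mixture weights pi_k
        self.latent = nn.Linear(d_model, k * d_model) # K context vectors
        self.decoder = nn.Linear(d_model, vocab)      # shared output projection

    def forward(self, h):                             # (B, d_model)
        pi = F.softmax(self.prior(h), dim=-1)         # (B, K)
        z = torch.tanh(self.latent(h)).view(-1, self.k, h.shape[-1])
        probs = F.softmax(self.decoder(z), dim=-1)    # (B, K, V): K full softmaxes
        return (pi.unsqueeze(-1) * probs).sum(1)      # convex mixture over components

head = MoS(d_model=256, vocab=10000)
p = head(torch.randn(2, 256))                         # each row sums to 1
```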