October 20, 2019

Paper Group ANR 96

Holographic Neural Architectures


Title	Holographic Neural Architectures
Authors	Tariq Daouda, Jeremie Zumer, Claude Perreault, Sébastien Lemieux
Abstract	Representation learning is at the heart of what makes deep learning effective. In this work, we introduce a new framework for representation learning that we call “Holographic Neural Architectures” (HNAs). In the same way that an observer can experience the 3D structure of a holographed object by looking at its hologram from several angles, HNAs derive Holographic Representations from the training set. These representations can then be explored by moving along a continuous bounded single dimension. We show that HNAs can be used to make generative networks, state-of-the-art regression models and that they are inherently highly resistant to noise. Finally, we argue that because of their denoising abilities and their capacity to generalize well from very few examples, models based upon HNAs are particularly well suited for biological applications where training examples are rare or noisy.
Tasks	Denoising, Representation Learning
Published	2018-06-04
URL	http://arxiv.org/abs/1806.00931v1
PDF	http://arxiv.org/pdf/1806.00931v1.pdf
PWC	https://paperswithcode.com/paper/holographic-neural-architectures
Repo
Framework

Graph Memory Networks for Molecular Activity Prediction


Title	Graph Memory Networks for Molecular Activity Prediction
Authors	Trang Pham, Truyen Tran, Svetha Venkatesh
Abstract	Molecular activity prediction is critical in drug design. Machine learning techniques such as kernel methods and random forests have been successful for this task. These models require fixed-size feature vectors as input, while molecules are variable in size and structure. As a result, fixed-size fingerprint representations handle substructures of large molecules poorly. In addition, molecular activity tests, so-called BioAssays, cover relatively few tested molecules due to their complexity. Here we approach the problem through deep neural networks, as they are flexible in modeling structured data such as grids, sequences and graphs. We train multiple BioAssays using a multi-task learning framework, which combines information from multiple sources to improve prediction performance, especially on small datasets. We propose Graph Memory Network (GraphMem), a memory-augmented neural network to model the graph structure in molecules. GraphMem consists of a recurrent controller coupled with an external memory whose cells dynamically interact and change through a multi-hop reasoning process. Applied to the molecules, the dynamic interactions enable an iterative refinement of the representation of molecular graphs with multiple bond types. GraphMem is capable of jointly training on multiple datasets by using a task-specific query fed to the controller as an input. We demonstrate the effectiveness of the proposed model for separately and jointly training on more than 100K measurements, spanning across 9 BioAssay activity tests.
Tasks	Activity Prediction, Multi-Task Learning
Published	2018-01-08
URL	http://arxiv.org/abs/1801.02622v2
PDF	http://arxiv.org/pdf/1801.02622v2.pdf
PWC	https://paperswithcode.com/paper/graph-memory-networks-for-molecular-activity
Repo
Framework

Racial Faces in-the-Wild: Reducing Racial Bias by Information Maximization Adaptation Network


Title	Racial Faces in-the-Wild: Reducing Racial Bias by Information Maximization Adaptation Network
Authors	Mei Wang, Weihong Deng, Jiani Hu, Xunqiang Tao, Yaohai Huang
Abstract	Racial bias is an important issue in biometrics, but has not been thoroughly studied in deep face recognition. In this paper, we first contribute a dedicated dataset called the Racial Faces in-the-Wild (RFW) database, on which we firmly validated the racial bias of four commercial APIs and four state-of-the-art (SOTA) algorithms. Then, we further present a solution using deep unsupervised domain adaptation and propose a deep information maximization adaptation network (IMAN) to alleviate this bias by using Caucasian as the source domain and other races as target domains. This unsupervised method simultaneously aligns the global distribution to decrease the race gap at the domain level, and learns discriminative target representations at the cluster level. A novel mutual information loss is proposed to further enhance the discriminative ability of the network output without label information. Extensive experiments on RFW, GBU, and IJB-A databases show that IMAN successfully learns features that generalize well across different races and across different databases.
Tasks	Domain Adaptation, Face Recognition, Face Verification, Unsupervised Domain Adaptation
Published	2018-12-01
URL	https://arxiv.org/abs/1812.00194v2
PDF	https://arxiv.org/pdf/1812.00194v2.pdf
PWC	https://paperswithcode.com/paper/racial-faces-in-the-wild-reducing-racial-bias
Repo
Framework

Real-time 3D Pose Estimation with a Monocular Camera Using Deep Learning and Object Priors On an Autonomous Racecar


Title	Real-time 3D Pose Estimation with a Monocular Camera Using Deep Learning and Object Priors On an Autonomous Racecar
Authors	Ankit Dhall
Abstract	We propose a complete pipeline that detects objects and simultaneously estimates the pose of multiple object instances using just a single image. A novel “keypoint regression” scheme with a cross-ratio term is introduced that exploits prior information about the object’s shape and size to regress and find specific feature points. Further, a priori 3D information about the object is used to match 2D-3D correspondences and accurately estimate object positions up to a distance of 15m. A detailed discussion of the results and an in-depth analysis of the pipeline are presented. The pipeline runs efficiently on a low-powered Jetson TX2 and is deployed as part of the perception pipeline on a real-time autonomous vehicle cruising at a top speed of 54 km/hr.
Tasks	3D Pose Estimation, Object Detection, Pose Estimation
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10548v1
PDF	http://arxiv.org/pdf/1809.10548v1.pdf
PWC	https://paperswithcode.com/paper/real-time-3d-pose-estimation-with-a-monocular
Repo
Framework
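
The cross-ratio term in the abstract above builds on a projective invariant: the cross-ratio of four collinear points does not change under perspective projection. The following is a minimal sketch of that invariance only, assuming one common convention for the cross-ratio on 1D coordinates along the line. The function names and the sample projective map are illustrative assumptions and are not taken from the paper.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear points given as 1D coordinates.

    Convention assumed here: ((c - a) * (d - b)) / ((c - b) * (d - a)).
    """
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective_map(x, h):
    """Apply a 1D projective transform x -> (h00*x + h01) / (h10*x + h11)."""
    (h00, h01), (h10, h11) = h
    return (h00 * x + h01) / (h10 * x + h11)


# Four collinear points and an arbitrary invertible projective map (illustrative).
points = [0.0, 1.0, 2.0, 4.0]
h = ((2.0, 1.0), (0.5, 3.0))

projected = [projective_map(x, h) for x in points]
before = cross_ratio(*points)
after = cross_ratio(*projected)

# The two values agree up to floating-point error.
print(before, after)
```

Because the value is unchanged by projection, a deviation from the known cross-ratio of an object's keypoints can serve as a penalty on predicted image keypoints; how the paper weights such a term is not stated in the abstract.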

Active Distribution Learning from Indirect Samples


Title	Active Distribution Learning from Indirect Samples
Authors	Samarth Gupta, Gauri Joshi, Osman Yağan
Abstract	This paper studies the problem of learning the probability distribution $P_X$ of a discrete random variable $X$ using indirect and sequential samples. At each time step, we choose one of the possible $K$ functions, $g_1, \ldots, g_K$ and observe the corresponding sample $g_i(X)$. The goal is to estimate the probability distribution of $X$ by using a minimum number of such sequential samples. This problem has several real-world applications including inference under non-precise information and privacy-preserving statistical estimation. We establish necessary and sufficient conditions on the functions $g_1, \ldots, g_K$ under which asymptotically consistent estimation is possible. We also derive lower bounds on the estimation error as a function of total samples and show that it is order-wise achievable. Leveraging these results, we propose an iterative algorithm that i) chooses the function to observe at each step based on past observations; and ii) combines the obtained samples to estimate $P_X$. The performance of this algorithm is investigated numerically under various scenarios, and shown to outperform baseline approaches.
Tasks
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05334v1
PDF	http://arxiv.org/pdf/1808.05334v1.pdf
PWC	https://paperswithcode.com/paper/active-distribution-learning-from-indirect
Repo
Framework
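
To make the setting in the abstract above concrete, here is a small sketch of estimating a distribution from indirect samples. It assumes a three-valued $X$ and two hypothetical indicator functions `g1` and `g2`, and it picks functions in a fixed, non-adaptive order. It is not the adaptive algorithm proposed in the paper; it only shows how observations of $g_i(X)$ can be combined into an estimate of $P_X$ when the functions jointly identify it.

```python
import random


def sample_x(p, rng):
    """Draw one value of X from the distribution p over {0, ..., len(p) - 1}."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]


# Hypothetical indirect observations: each reveals only one bit about X.
def g1(x):
    return int(x == 0)


def g2(x):
    return int(x == 1)


def estimate_distribution(p_true, n_per_function, seed=0):
    """Estimate P_X for a 3-valued X from samples of g1(X) and g2(X).

    Each time step draws a fresh X and observes only g_i(X), never X itself.
    """
    rng = random.Random(seed)
    hits = []
    for g in (g1, g2):
        hits.append(sum(g(sample_x(p_true, rng)) for _ in range(n_per_function)))
    p0 = hits[0] / n_per_function  # estimate of P(X = 0) from g1
    p1 = hits[1] / n_per_function  # estimate of P(X = 1) from g2
    return [p0, p1, 1.0 - p0 - p1]  # remaining mass goes to X = 2


p_true = [0.5, 0.3, 0.2]
p_hat = estimate_distribution(p_true, n_per_function=20000)
print(p_hat)
```

With only `g1` available, the masses of $X = 1$ and $X = 2$ could not be separated, which is the kind of identifiability question the paper's conditions address.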

Discovering Power Laws in Entity Length


Title	Discovering Power Laws in Entity Length
Authors	Xiaoshi Zhong, Erik Cambria, Jagath C. Rajapakse
Abstract	This paper presents a discovery that the length of the entities in various datasets follows a family of scale-free power law distributions. The concept of entity here broadly includes the named entity, entity mention, time expression, aspect term, and domain-specific entity that are well investigated in natural language processing and related areas. The entity length denotes the number of words in an entity. The power law distributions in entity length possess the scale-free property and have well-defined means and finite variances. We explain the phenomenon of power laws in entity length by the principle of least effort in communication and the preferential mechanism.
Tasks
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03325v3
PDF	http://arxiv.org/pdf/1811.03325v3.pdf
PWC	https://paperswithcode.com/paper/discovering-power-laws-in-entity-length
Repo
Framework
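
As a rough illustration of how a power law in entity length might be checked, the sketch below fits the exponent by least squares on a log-log plot of length counts. This is an assumed diagnostic, not the fitting procedure used in the paper, and the counts are synthetic. For a distribution with $p(k) \propto k^{-\alpha}$, the mean is finite when $\alpha > 2$ and the variance is finite when $\alpha > 3$.

```python
import math


def fit_power_law_exponent(length_counts):
    """Estimate alpha in count(k) ~ k^(-alpha) by log-log least squares.

    length_counts maps an entity length (number of words) to its frequency.
    """
    pairs = [(math.log(k), math.log(c)) for k, c in length_counts.items() if c > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return -sxy / sxx  # the slope is -alpha


# Synthetic counts that follow an exact power law with alpha = 3.5.
synthetic_counts = {k: 1e6 * k ** -3.5 for k in range(1, 11)}
alpha_hat = fit_power_law_exponent(synthetic_counts)
print(alpha_hat)
```

On real data, log-log regression is known to be a biased estimator, so a maximum-likelihood fit would be the more careful choice.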

Automatic Face Aging in Videos via Deep Reinforcement Learning


Title	Automatic Face Aging in Videos via Deep Reinforcement Learning
Authors	Chi Nhan Duong, Khoa Luu, Kha Gia Quach, Nghia Nguyen, Eric Patterson, Tien D. Bui, Ngan Le
Abstract	This paper presents a novel approach to automatically synthesize age-progressed facial images in video sequences using Deep Reinforcement Learning. The proposed method models facial structures and the longitudinal face-aging process of given subjects coherently across video frames. The approach is optimized using a long-term reward, Reinforcement Learning function with deep feature extraction from Deep Convolutional Neural Network. Unlike previous age-progression methods that are only able to synthesize an aged likeness of a face from a single input image, the proposed approach is capable of age-progressing facial likenesses in videos with consistently synthesized facial features across frames. In addition, the deep reinforcement learning method guarantees preservation of the visual identity of input faces after age-progression. Results on videos of our newly collected aging face AGFW-v2 database demonstrate the advantages of the proposed solution in terms of quality of age-progressed faces, temporal smoothness, and cross-age face verification.
Tasks	Face Verification
Published	2018-11-27
URL	http://arxiv.org/abs/1811.11082v2
PDF	http://arxiv.org/pdf/1811.11082v2.pdf
PWC	https://paperswithcode.com/paper/automatic-face-aging-in-videos-via-deep
Repo
Framework

Pathological Analysis of Stress Urinary Incontinence in Females using Artificial Neural Networks


Title	Pathological Analysis of Stress Urinary Incontinence in Females using Artificial Neural Networks
Authors	Mojtaba Barzegari, Bahman Vahidi, Mohammad Reza Safarinejad, Marzieh Hashemipour
Abstract	Objectives: To mathematically investigate urethral pressure and influencing parameters of stress urinary incontinence (SUI) in women, with focus on the clinical aspects of the mathematical modeling. Method: Several patients’ data are extracted from UPP and urodynamic documents and their relation and affinities are modeled using an artificial neural network (ANN) model. The studied parameter is urethral pressure as a function of two variables: the age of the patient and the position in which the pressure was measured across the urethra (normalized length). Results: The ANN-generated surface, showing the relation between the chosen parameters and the urethral pressure in the studied patients, is more efficient than the surface generated by conventional mathematical methods for clinical analysis, with multi-sample analysis being obtained. For example, in elderly people, there are many low-pressure zones throughout the urethra length, indicating that there is more incontinence in old age. Conclusion: The predictions of urethral pressure made by the trained neural network model in relation to the studied effective parameters can be used to build a medical assistance system in order to help clinicians diagnose urinary incontinence problems more efficiently.
Tasks
Published	2018-03-04
URL	http://arxiv.org/abs/1803.01843v1
PDF	http://arxiv.org/pdf/1803.01843v1.pdf
PWC	https://paperswithcode.com/paper/pathological-analysis-of-stress-urinary
Repo
Framework

DRPose3D: Depth Ranking in 3D Human Pose Estimation


Title	DRPose3D: Depth Ranking in 3D Human Pose Estimation
Authors	Min Wang, Xipeng Chen, Wentao Liu, Chen Qian, Liang Lin, Lizhuang Ma
Abstract	In this paper, we propose a two-stage depth ranking based method (DRPose3D) to tackle the problem of 3D human pose estimation. Unlike accurate 3D positions, depth rankings can be identified intuitively by humans and learned more easily by a deep neural network by solving classification problems. Moreover, depth ranking contains rich 3D information. It prevents the 2D-to-3D pose regression in two-stage methods from being ill-posed. In our method, firstly, we design a Pairwise Ranking Convolutional Neural Network (PRCNN) to extract depth rankings of human joints from images. Secondly, a coarse-to-fine 3D Pose Network (DPNet) is proposed to estimate 3D poses from both depth rankings and 2D human joint locations. Additionally, to improve the generality of our model, we introduce a statistical method to augment depth rankings. Our approach outperforms the state-of-the-art methods in the Human3.6M benchmark for all three testing protocols, indicating that depth ranking is an essential geometric feature which can be learned to improve the 3D pose estimation.
Tasks	3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation
Published	2018-05-23
URL	http://arxiv.org/abs/1805.08973v2
PDF	http://arxiv.org/pdf/1805.08973v2.pdf
PWC	https://paperswithcode.com/paper/drpose3d-depth-ranking-in-3d-human-pose
Repo
Framework
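
The depth ranking idea in the abstract above turns depth estimation into pairwise classification: for each pair of joints, which one is closer to the camera? The sketch below builds such a pairwise ranking matrix from known joint depths. The three-way encoding (1, -1, 0) and the `tolerance` parameter are assumptions made for illustration; the paper's exact label encoding is not given in the abstract.

```python
def pairwise_depth_ranking(depths, tolerance=0.0):
    """Build a pairwise ranking matrix from per-joint depths.

    ranking[i][j] is  1 if joint i is closer to the camera than joint j,
                     -1 if joint i is farther away,
                      0 if the two depths differ by at most `tolerance`.
    """
    n = len(depths)
    ranking = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            diff = depths[i] - depths[j]
            if diff < -tolerance:
                ranking[i][j] = 1
            elif diff > tolerance:
                ranking[i][j] = -1
    return ranking


# Depths in metres for three joints (made-up values).
joint_depths = [2.0, 2.5, 2.05]
ranking = pairwise_depth_ranking(joint_depths, tolerance=0.1)
for row in ranking:
    print(row)
```

A matrix like this could serve as the classification target for a ranking network, with the tolerance absorbing pairs whose order is too close to call.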

Co-Representation Learning For Classification and Novel Class Detection via Deep Networks


Title	Co-Representation Learning For Classification and Novel Class Detection via Deep Networks
Authors	Zhuoyi Wang, Zelun Kong, Hemeng Tao, Swarup Chandra, Latifur Khan
Abstract	One of the key challenges of performing label prediction over a data stream concerns the emergence of instances belonging to unobserved class labels over time. Previously, this problem has been addressed by detecting such instances and using them for appropriate classifier adaptation. The fundamental aspect of a novel-class detection strategy relies on the ability to compare observed instances in order to discriminate them into known and unknown classes. Therefore, studies in the past have proposed various metrics suitable for comparison over the observed feature space. Unfortunately, these similarity measures fail to reliably identify distinct regions in observed feature spaces useful for class discrimination and novel-class detection, especially in streams containing high-dimensional data instances such as images and texts. In this paper, we address this key challenge by proposing a semi-supervised multi-task learning framework called \sysname{} which aims to intrinsically search for a latent space suitable for detecting labels of instances from both known and unknown classes. We empirically measure the performance of \sysname{} over multiple real-world image and text datasets and demonstrate its superiority by comparing its performance with existing semi-supervised methods.
Tasks	Multi-Task Learning, Representation Learning
Published	2018-11-13
URL	http://arxiv.org/abs/1811.05141v2
PDF	http://arxiv.org/pdf/1811.05141v2.pdf
PWC	https://paperswithcode.com/paper/co-representation-learning-for-classification
Repo
Framework

Defoiling Foiled Image Captions


Title	Defoiling Foiled Image Captions
Authors	Pranava Madhyastha, Josiah Wang, Lucia Specia
Abstract	We address the task of detecting foiled image captions, i.e. identifying whether a caption contains a word that has been deliberately replaced by a semantically similar word, thus rendering it inaccurate with respect to the image being described. Solving this problem should in principle require a fine-grained understanding of images to detect linguistically valid perturbations in captions. In such contexts, encoding sufficiently descriptive image information becomes a key challenge. In this paper, we demonstrate that it is possible to solve this task using simple, interpretable yet powerful representations based on explicit object information. Our models achieve state-of-the-art performance on a standard dataset, with scores exceeding those achieved by humans on the task. We also measure the upper-bound performance of our models using gold standard annotations. Our analysis reveals that the simpler model performs well even without image information, suggesting that the dataset contains strong linguistic bias.
Tasks	Image Captioning
Published	2018-05-16
URL	http://arxiv.org/abs/1805.06549v1
PDF	http://arxiv.org/pdf/1805.06549v1.pdf
PWC	https://paperswithcode.com/paper/defoiling-foiled-image-captions
Repo
Framework

Two-stream convolutional networks for end-to-end learning of self-driving cars


Title	Two-stream convolutional networks for end-to-end learning of self-driving cars
Authors	Nelson Fernandez
Abstract	We propose a methodology to extend the concept of Two-Stream Convolutional Networks to perform end-to-end learning for self-driving cars with temporal cues. The system has the ability to learn spatiotemporal features by simultaneously mapping raw images and pre-calculated optical flows directly to steering commands. Although optical flows encode temporal-rich information, we found that 2D-CNNs are prone to capturing features only as spatial representations. We show how the use of Multitask Learning favors the learning of temporal features via inductive transfer from a shared spatiotemporal representation. Preliminary results demonstrate a competitive improvement of 30% in prediction accuracy and stability compared to widely used regression methods trained on the Comma.ai dataset.
Tasks	Self-Driving Cars
Published	2018-11-13
URL	http://arxiv.org/abs/1811.05785v2
PDF	http://arxiv.org/pdf/1811.05785v2.pdf
PWC	https://paperswithcode.com/paper/two-stream-convolutional-networks-for-end-to
Repo
Framework

A framework with updateable joint images re-ranking for Person Re-identification


Title	A framework with updateable joint images re-ranking for Person Re-identification
Authors	Mingyue Yuan, Dong Yin, Jingwen Ding, Yuhao Luo, Zhipeng Zhou, Chengfeng Zhu, Rui Zhang
Abstract	Person re-identification plays an important role in realistic video surveillance with increasing demand for public safety. In this paper, we propose a novel framework with rules for updating images for person re-identification in real-world surveillance systems. First, an Image Pool is generated by using a mean-shift tracking method to automatically select video frame fragments of the target person. Second, features extracted from the Image Pool by a convolutional network work together to re-rank the original ranking list of the main image, and matching results are generated. In addition, updating rules are designed for replacing images in the Image Pool when a new image in the video system satisfies our update criterion. These rules fall into two categories: if the new image is from the same camera as the previously updated image, it will replace one of the assist images; otherwise, it will replace the main image directly. Experiments are conducted on Market-1501, iLIDS-VID and PRID-2011 and our ITSD datasets to validate that our framework outperforms on rank-1 accuracy and mAP for person re-identification. Furthermore, the update ability of our framework provides a consistently remarkable accuracy rate in real-world surveillance systems.
Tasks	Person Re-Identification
Published	2018-03-08
URL	http://arxiv.org/abs/1803.02983v1
PDF	http://arxiv.org/pdf/1803.02983v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-with-updateable-joint-images-re
Repo
Framework

Ring Migration Topology Helps Bypassing Local Optima


Title	Ring Migration Topology Helps Bypassing Local Optima
Authors	Clemens Frahnow, Timo Kötzing
Abstract	Running several evolutionary algorithms in parallel and occasionally exchanging good solutions is referred to as island models. The idea is that the independence of the different islands leads to diversity, thus possibly exploring the search space better. Many theoretical analyses so far have found a complete (or sufficiently quickly expanding) topology as underlying migration graph most efficient for optimization, even though a quick dissemination of individuals leads to a loss of diversity. We suggest a simple fitness function FORK with two local optima parametrized by $r \geq 2$ and a scheme for composite fitness functions. We show that, while the (1+1) EA gets stuck in a bad local optimum and incurs a run time of $\Theta(n^{2r})$ fitness evaluations on FORK, island models with a complete topology can achieve a run time of $\Theta(n^{1.5r})$ by making use of rare migrations in order to explore the search space more effectively. Finally, the ring topology, making use of rare migrations and a large diameter, can achieve a run time of $\tilde{\Theta}(n^r)$, the black box complexity of FORK. This shows that the ring topology can be preferable over the complete topology in order to maintain diversity.
Tasks
Published	2018-06-04
URL	http://arxiv.org/abs/1806.01128v1
PDF	http://arxiv.org/pdf/1806.01128v1.pdf
PWC	https://paperswithcode.com/paper/ring-migration-topology-helps-bypassing-local
Repo
Framework
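
The island model described in the abstract above can be sketched in a few lines: each island runs a (1+1) EA, and every so often the islands exchange their current solutions along a migration graph. The sketch below contrasts the neighbor sets of a ring and a complete topology. Since the abstract does not define the FORK function, OneMax is used as a stand-in fitness function; all names and parameters are illustrative assumptions.

```python
import random


def ring_neighbors(i, n):
    """Neighbors of island i on a ring of n islands."""
    return [(i - 1) % n, (i + 1) % n]


def complete_neighbors(i, n):
    """Neighbors of island i when every island is connected to every other."""
    return [j for j in range(n) if j != i]


def onemax(bits):
    """Stand-in fitness: number of ones (FORK is not defined in the abstract)."""
    return sum(bits)


def mutate(bits, rng):
    """Standard bit mutation: flip each bit independently with probability 1/n."""
    n = len(bits)
    return [b ^ (rng.random() < 1.0 / n) for b in bits]


def run_islands(num_islands, n_bits, steps, migrate_every, neighbors, seed=0):
    rng = random.Random(seed)
    islands = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(num_islands)]
    for t in range(1, steps + 1):
        # One (1+1) EA step per island: keep the child if it is not worse.
        for i, parent in enumerate(islands):
            child = mutate(parent, rng)
            if onemax(child) >= onemax(parent):
                islands[i] = child
        # Rare migration: each island adopts its best neighbor if strictly better.
        if t % migrate_every == 0:
            snapshot = [list(ind) for ind in islands]
            for i in range(num_islands):
                best = max((snapshot[j] for j in neighbors(i, num_islands)), key=onemax)
                if onemax(best) > onemax(islands[i]):
                    islands[i] = list(best)
    return [onemax(ind) for ind in islands]


final_fitness = run_islands(4, 20, 2000, 50, ring_neighbors)
print(final_fitness)
```

On a ring, a good solution needs many migration rounds to reach distant islands, which is the diversity-preserving effect of a large diameter that the abstract refers to. Passing `complete_neighbors` instead spreads it in a single round.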

Nonparametric Topic Modeling with Neural Inference


Title	Nonparametric Topic Modeling with Neural Inference
Authors	Xuefei Ning, Yin Zheng, Zhuxi Jiang, Yu Wang, Huazhong Yang, Junzhou Huang
Abstract	This work focuses on combining nonparametric topic models with Auto-Encoding Variational Bayes (AEVB). Specifically, we first propose iTM-VAE, where the topics are treated as trainable parameters and the document-specific topic proportions are obtained by a stick-breaking construction. The inference of iTM-VAE is modeled by neural networks such that it can be computed in a simple feed-forward manner. We also describe how to introduce a hyper-prior into iTM-VAE so as to model the uncertainty of the prior parameter. Actually, the hyper-prior technique is quite general and we show that it can be applied to other AEVB based models to alleviate the collapse-to-prior problem elegantly. Moreover, we also propose HiTM-VAE, where the document-specific topic distributions are generated in a hierarchical manner. HiTM-VAE is even more flexible and can generate topic distributions with better variability. Experimental results on 20News and Reuters RCV1-V2 datasets show that the proposed models outperform the state-of-the-art baselines significantly. The advantages of the hyper-prior technique and the hierarchical model construction are also confirmed by experiments.
Tasks	Topic Models
Published	2018-06-18
URL	http://arxiv.org/abs/1806.06583v1
PDF	http://arxiv.org/pdf/1806.06583v1.pdf
PWC	https://paperswithcode.com/paper/nonparametric-topic-modeling-with-neural
Repo
Framework
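
The stick-breaking construction mentioned in the abstract above turns a sequence of fractions in [0, 1] into topic proportions: each fraction breaks off a share of whatever remains of a unit-length stick. The sketch below shows the truncated construction on fixed numbers. In the paper the fractions would come from the inference network; here they are hard-coded, and the function name is an assumption.

```python
def stick_breaking(fractions):
    """Turn break fractions v_1..v_K into weights pi_k = v_k * prod_{j<k} (1 - v_j).

    Returns the weights and the leftover stick length after K breaks.
    """
    remaining = 1.0
    weights = []
    for v in fractions:
        if not 0.0 <= v <= 1.0:
            raise ValueError("break fractions must lie in [0, 1]")
        weights.append(v * remaining)  # break off a share of what is left
        remaining *= 1.0 - v           # shrink the remaining stick
    return weights, remaining


weights, leftover = stick_breaking([0.5, 0.5, 0.5])
print(weights, leftover)
```

The weights plus the leftover always sum to one, so a truncated model can assign the leftover to the last topic to obtain a proper distribution.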