January 25, 2020

3196 words 16 mins read

Paper Group NAWR 17

Realtime and Accurate 3D Eye Gaze Capture with DCNN-based Iris and Pupil Segmentation. Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters. ACIQ: Analytical Clipping for Integer Quantization of neural networks. Bayesian Hierarchical Dynamic Model for Human Action Recognition. Learning Generalizable Device Placement Algorit …

Realtime and Accurate 3D Eye Gaze Capture with DCNN-based Iris and Pupil Segmentation

Title Realtime and Accurate 3D Eye Gaze Capture with DCNN-based Iris and Pupil Segmentation
Authors Zhiyong Wang, Jinxiang Chai, Shihong Xia
Abstract This paper presents a realtime and accurate method for 3D eye gaze tracking with a monocular RGB camera. Our key idea is to train a deep convolutional neural network (DCNN) that automatically extracts the iris and pupil pixels of each eye from input images. To achieve this goal, we combine the power of U-Net and SqueezeNet to train an efficient convolutional neural network for pixel classification. In addition, we track the 3D eye gaze state in the Maximum A Posteriori (MAP) framework, which sequentially searches for the most likely state of the 3D eye gaze at each frame. When eye blinking occurs, the eye gaze tracker may produce inaccurate results. We further extend the convolutional neural network for eye-closure detection in order to improve the robustness and accuracy of the eye gaze tracker. Our system runs in realtime on desktop PCs and smartphones. We have evaluated our system on live videos and Internet videos, and our results demonstrate that the system is robust and accurate for various genders, races, lighting conditions, poses, shapes and facial expressions. A comparison against Wang et al. [3] shows that our method advances the state of the art in 3D eye tracking using a single RGB camera.
Tasks Eye Tracking
Published 2019-08-28
URL https://ieeexplore.ieee.org/document/8818661
PDF https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8818661
PWC https://paperswithcode.com/paper/realtime-and-accurate-3d-eye-gaze-capture
Repo https://github.com/1996scarlet/Laser-Eye
Framework mxnet
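
The segmentation network itself is beyond a short snippet, but one downstream step is easy to illustrate. Below is a minimal sketch (my own illustration, not the authors' pipeline) of estimating the 2D pupil center from a binary segmentation mask via its centroid, the kind of measurement a MAP-style 3D gaze estimator could consume; an empty mask stands in for a detected blink.

```python
import numpy as np

def pupil_center(mask):
    """mask: HxW array of 0/1 pupil pixels. Returns the (x, y) centroid, or None."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:          # no pupil pixels, e.g. the eye is closed
        return None
    return xs.mean(), ys.mean()

mask = np.zeros((8, 8))
mask[3:6, 2:5] = 1            # a toy 3x3 pupil blob
print(pupil_center(mask))     # (3.0, 4.0)
```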

Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters

Title Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters
Authors Alberto Maria Metelli, Amarildo Likmeta, Marcello Restelli
Abstract How does the uncertainty of the value function propagate when performing temporal difference learning? In this paper, we address this question by proposing a Bayesian framework in which we employ approximate posterior distributions to model the uncertainty of the value function and Wasserstein barycenters to propagate it across state-action pairs. Leveraging these tools, we present an algorithm, Wasserstein Q-Learning (WQL), first in the tabular case, and then show how it can be extended to continuous domains. Furthermore, we prove that, under mild assumptions, a slight variation of WQL enjoys desirable theoretical properties in the tabular setting. Finally, we present an experimental campaign to show the effectiveness of WQL on finite problems, compared to several RL algorithms, some of which are specifically designed for exploration, along with some preliminary results on Atari games.
Tasks Atari Games, Q-Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/8685-propagating-uncertainty-in-reinforcement-learning-via-wasserstein-barycenters
PDF http://papers.nips.cc/paper/8685-propagating-uncertainty-in-reinforcement-learning-via-wasserstein-barycenters.pdf
PWC https://paperswithcode.com/paper/propagating-uncertainty-in-reinforcement
Repo https://github.com/albertometelli/wql
Framework none
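
To make the update rule concrete, here is a minimal tabular sketch of the idea described in the abstract, under assumptions of my own (Gaussian posteriors per state-action pair, Thompson-style exploration), not the authors' released code. For one-dimensional Gaussians, the 2-Wasserstein barycenter with weights (1 - lr, lr) interpolates both the means and the standard deviations, which is what the update below does.

```python
import numpy as np

n_states, n_actions = 5, 2
gamma, lr = 0.99, 0.1
mean = np.zeros((n_states, n_actions))   # posterior mean of Q(s, a)
std = np.ones((n_states, n_actions))     # posterior std: the propagated uncertainty
rng = np.random.default_rng(0)

def act(s):
    """Exploration by sampling one Q-value per action from its posterior."""
    return int(np.argmax(rng.normal(mean[s], std[s])))

def update(s, a, r, s_next):
    a_star = int(np.argmax(mean[s_next]))             # greedy action in the next state
    target_mean = r + gamma * mean[s_next, a_star]    # Gaussian TD target distribution
    target_std = gamma * std[s_next, a_star]
    # Wasserstein-2 barycenter of current posterior and target, weights (1 - lr, lr):
    mean[s, a] = (1 - lr) * mean[s, a] + lr * target_mean
    std[s, a] = (1 - lr) * std[s, a] + lr * target_std
```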

ACIQ: Analytical Clipping for Integer Quantization of neural networks

Title ACIQ: Analytical Clipping for Integer Quantization of neural networks
Authors Ron Banner, Yury Nahshan, Elad Hoffer, Daniel Soudry
Abstract We analyze the trade-off between quantization noise and clipping distortion in low precision networks. We identify the statistics of various tensors, and derive exact expressions for the mean-square-error degradation due to clipping. By optimizing these expressions, we show marked improvements over standard quantization schemes that normally avoid clipping. For example, simply by choosing accurate clipping values, an accuracy improvement of more than 40% is obtained when quantizing VGG-16 to 4 bits of precision. Our results have many applications for the quantization of neural networks at both training and inference time.
Tasks Quantization
Published 2019-05-01
URL https://openreview.net/forum?id=B1x33sC9KQ
PDF https://openreview.net/pdf?id=B1x33sC9KQ
PWC https://paperswithcode.com/paper/aciq-analytical-clipping-for-integer
Repo https://github.com/submission2019/AnalyticalScaleForIntegerQuantization
Framework pytorch
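
The trade-off the abstract refers to is easy to reproduce numerically. The following is a minimal sketch (not the paper's code; the analytical derivation is the paper's contribution) that clips a tensor to [-alpha, alpha], quantizes it to 4 bits, and reports the mean-square error as alpha varies: a small alpha means heavy clipping distortion, a large alpha means a coarse grid and heavy quantization noise, and some intermediate alpha minimizes the total error.

```python
import numpy as np

def clip_and_quantize(x, alpha, num_bits=4):
    """Symmetric uniform quantization of x after clipping to [-alpha, alpha]."""
    x_clipped = np.clip(x, -alpha, alpha)
    scale = alpha / (2 ** (num_bits - 1) - 1)   # 4 bits -> integer levels -7..7
    return np.round(x_clipped / scale) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)                # stand-in for a weight/activation tensor

for alpha in (1.0, 2.0, 3.0, 4.0, 5.0):
    mse = np.mean((x - clip_and_quantize(x, alpha)) ** 2)
    print(f"alpha={alpha:.1f}  mse={mse:.5f}")
```

ACIQ's point is that the minimizing alpha can be derived in closed form from the tensor's statistics instead of being searched for.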

Bayesian Hierarchical Dynamic Model for Human Action Recognition

Title Bayesian Hierarchical Dynamic Model for Human Action Recognition
Authors Rui Zhao, Wanru Xu, Hui Su, Qiang Ji
Abstract Human action recognition remains a challenging task, partially due to the presence of large variations in the execution of actions. To address this issue, we propose a probabilistic model called the Hierarchical Dynamic Model (HDM). Leveraging a Bayesian framework, the model parameters are allowed to vary across different sequences of data, which increases the capacity of the model to adapt to intra-class variations in both the spatial and temporal extent of actions. Meanwhile, the generative learning process allows the model to preserve the distinctive dynamic pattern for each action class. Through Bayesian inference, we are able to quantify the uncertainty of the classification, providing insight during the decision process. Compared to state-of-the-art methods, our method not only achieves competitive recognition performance within individual datasets but also shows better generalization capability across different datasets. Experiments conducted on data with missing values also show the robustness of the proposed method.
Tasks Bayesian Inference, Multimodal Activity Recognition, Skeleton Based Action Recognition, Temporal Action Localization
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Zhao_Bayesian_Hierarchical_Dynamic_Model_for_Human_Action_Recognition_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhao_Bayesian_Hierarchical_Dynamic_Model_for_Human_Action_Recognition_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/bayesian-hierarchical-dynamic-model-for-human
Repo https://github.com/rort1989/HDM
Framework none

Learning Generalizable Device Placement Algorithms for Distributed Machine Learning

Title Learning Generalizable Device Placement Algorithms for Distributed Machine Learning
Authors Ravichandra Addanki, Shaileshh Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, Mohammad Alizadeh
Abstract We present Placeto, a reinforcement learning (RL) approach to efficiently find device placements for distributed neural network training. Unlike prior approaches that only find a device placement for a specific computation graph, Placeto can learn generalizable device placement policies that can be applied to any graph. We propose two key ideas in our approach: (1) we represent the policy as performing iterative placement improvements, rather than outputting a placement in one shot; (2) we use graph embeddings to capture relevant information about the structure of the computation graph, without relying on node labels for indexing. These ideas allow Placeto to train efficiently and generalize to unseen graphs. Our experiments show that Placeto requires up to 6.1x fewer training steps to find placements that are on par with or better than the best placements found by prior approaches. Moreover, Placeto is able to learn a generalizable placement policy for any given family of graphs that can be used without any re-training to predict optimized placements for unseen graphs from the same family. This eliminates the huge overhead incurred by the prior RL approaches whose lack of generalizability necessitates re-training from scratch every time a new graph is to be placed.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8653-learning-generalizable-device-placement-algorithms-for-distributed-machine-learning
PDF http://papers.nips.cc/paper/8653-learning-generalizable-device-placement-algorithms-for-distributed-machine-learning.pdf
PWC https://paperswithcode.com/paper/learning-generalizable-device-placement
Repo https://github.com/aravic/generalizable-device-placement
Framework tf

Generative Models for Graph-Based Protein Design

Title Generative Models for Graph-Based Protein Design
Authors John Ingraham, Vikas Garg, Regina Barzilay, Tommi Jaakkola
Abstract Engineered proteins offer the potential to solve many problems in biomedicine, energy, and materials science, but creating designs that succeed is difficult in practice. A significant aspect of this challenge is the complex coupling between protein sequence and 3D structure, with the task of finding a viable design often referred to as the inverse protein folding problem. We develop relational language models for protein sequences that directly condition on a graph specification of the target structure. Our approach efficiently captures the complex dependencies in proteins by focusing on those that are long-range in sequence but local in 3D space. Our framework significantly improves in both speed and robustness over conventional and deep-learning-based methods for structure-based protein sequence design, and takes a step toward rapid and targeted biomolecular design with the aid of deep generative models.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9711-generative-models-for-graph-based-protein-design
PDF http://papers.nips.cc/paper/9711-generative-models-for-graph-based-protein-design.pdf
PWC https://paperswithcode.com/paper/generative-models-for-graph-based-protein
Repo https://github.com/jingraham/neurips19-graph-protein-design
Framework pytorch

An adaptive homeostatic algorithm for the unsupervised learning of visual features

Title An adaptive homeostatic algorithm for the unsupervised learning of visual features
Authors Victor Boutin, Angelo Franciosini, Laurent Perrinet
Abstract The formation of structure in the brain, that is, of the connections between cells within neural populations, is by and large an unsupervised learning process: the emergence of this architecture is mostly self-organized. In the primary visual cortex of mammals, for example, one may observe during development the formation of cells selective to localized, oriented features. This leads to the development of a rough representation of contours of the retinal image in area V1. We modeled these mechanisms using sparse Hebbian learning algorithms. These algorithms alternate a coding step to encode the information with a learning step to find the proper encoder. A major difficulty faced by these algorithms is to deduce a good representation while the encoders are still immature, and to learn good encoders from a non-optimal representation. To address this problem, we propose to introduce a new regulation process between learning and coding, called homeostasis. Our homeostasis is compatible with a neuro-mimetic architecture and allows for the fast emergence of localized filters sensitive to orientation. The key to this algorithm lies in a simple adaptation mechanism based on non-linear functions that reconciles the antagonistic processes that occur at the coding and learning time scales. We tested this unsupervised algorithm with this homeostasis rule for a range of existing unsupervised learning algorithms coupled with different neural coding algorithms. In addition, we propose a simplification of this optimal homeostasis rule by implementing a simple heuristic on the probability of activation of neurons. Compared to the optimal homeostasis rule, we show that this heuristic allows us to implement a more rapid unsupervised learning algorithm while retaining most of its effectiveness. These results demonstrate the potential application of such a strategy in machine learning, and we illustrate this with one result in a convolutional neural network.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SyMras0cFQ
PDF https://openreview.net/pdf?id=SyMras0cFQ
PWC https://paperswithcode.com/paper/an-adaptive-homeostatic-algorithm-for-the
Repo https://github.com/bicv/SHL_scripts
Framework none
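
As a toy illustration of the heuristic variant mentioned at the end of the abstract, the sketch below (a simplification of mine, not the released SHL scripts) keeps a running estimate of each unit's activation probability and uses a multiplicative gain so that, in a winner-take-all competition with built-in bias, all units end up selected with roughly equal probability.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_steps, eta = 8, 5000, 0.01
target = 1.0 / n_units                    # desired activation probability per unit
gain = np.ones(n_units)                   # homeostatic gains
p_hat = np.full(n_units, target)          # running activation-probability estimates

for _ in range(n_steps):
    match = rng.random(n_units) * np.linspace(1.0, 2.0, n_units)  # biased "correlations"
    winner = np.argmax(gain * match)      # gain modulates the competition
    active = np.zeros(n_units)
    active[winner] = 1.0
    p_hat = (1 - eta) * p_hat + eta * active
    gain *= np.exp(eta * (target - p_hat))   # boost rarely active units, damp busy ones

print(np.round(p_hat, 3))   # roughly uniform despite the built-in bias
```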

Event Cameras, Contrast Maximization and Reward Functions: An Analysis

Title Event Cameras, Contrast Maximization and Reward Functions: An Analysis
Authors Timo Stoffregen, Lindsay Kleeman
Abstract Event cameras asynchronously report timestamped changes in pixel intensity and offer advantages over conventional raster scan cameras in terms of low-latency, low-redundancy sensing and high dynamic range. In recent years, much of the research in event-based vision has focused on performing tasks such as optic flow estimation, moving object segmentation, feature tracking, camera rotation estimation and more, through contrast maximization. In contrast maximization, events are warped along motion trajectories, whose parameters depend on the quantity being estimated, to some time t_ref. The parameters are then scored by some reward function of the accumulated events at t_ref. The versatility of this approach has led to a flurry of research in recent years, but no in-depth study of the reward chosen during optimization has yet been made. In this work we examine the choice of reward used in contrast maximization, propose a classification of different rewards, and show how a reward can be constructed that is more robust to noise and aperture uncertainty. We validate our work experimentally by predicting optical flow and comparing to ground-truth data.
Tasks Event-based vision, Optical Flow Estimation, Semantic Segmentation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Stoffregen_Event_Cameras_Contrast_Maximization_and_Reward_Functions_An_Analysis_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/event-cameras-contrast-maximization-and
Repo https://github.com/TimoStoff/events_contrast_maximization
Framework pytorch
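
The core loop of contrast maximization is compact enough to sketch. Below is a toy version (mine, not the authors' code) that warps simulated events to a reference time under a candidate optic-flow hypothesis, accumulates them into an image, and scores the hypothesis with one of the classic rewards the paper analyzes, the variance of the accumulated image; the correct flow yields the sharpest image and hence the highest reward.

```python
import numpy as np

def contrast_reward(xs, ys, ts, flow, t_ref=0.0, shape=(64, 64)):
    """Warp events to t_ref under flow = (vx, vy) px/s and return the image variance."""
    vx, vy = flow
    xw = np.round(xs - (ts - t_ref) * vx).astype(int)
    yw = np.round(ys - (ts - t_ref) * vy).astype(int)
    keep = (xw >= 0) & (xw < shape[1]) & (yw >= 0) & (yw < shape[0])
    img = np.zeros(shape)
    np.add.at(img, (yw[keep], xw[keep]), 1.0)   # image of warped events
    return img.var()

rng = np.random.default_rng(0)
ts = rng.uniform(0.0, 1.0, 5000)
xs = 32 + 3.0 * ts + rng.normal(0, 0.5, ts.size)   # a point moving at 3 px/s in x
ys = 32 + rng.normal(0, 0.5, ts.size)

flows = [(vx, vy) for vx in range(-5, 6) for vy in range(-5, 6)]
best = max(flows, key=lambda f: contrast_reward(xs, ys, ts, f))
print("estimated flow:", best)    # close to (3, 0)
```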

Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders

Title Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders
Authors Edgar Schonfeld, Sayna Ebrahimi, Samarth Sinha, Trevor Darrell, Zeynep Akata
Abstract Many approaches in generalized zero-shot learning rely on cross-modal mapping between the image feature space and the class embedding space. As labeled images are expensive, one direction is to augment the dataset by generating either images or image features. However, the former misses fine-grained details and the latter requires learning a mapping associated with class embeddings. In this work, we take feature generation one step further and propose a model where a shared latent space of image features and class embeddings is learned by modality-specific aligned variational autoencoders. This leaves us with the required discriminative information about the image and classes in the latent features, on which we train a softmax classifier. The key to our approach is that we align the distributions learned from images and from side-information to construct latent features that contain the essential multi-modal information associated with unseen classes. We evaluate our learned latent features on several benchmark datasets, i.e. CUB, SUN, AWA1 and AWA2, and establish a new state of the art on generalized zero-shot as well as on few-shot learning. Moreover, our results on ImageNet with various zero-shot splits show that our latent features generalize well in large-scale settings.
Tasks Few-Shot Learning, Zero-Shot Learning
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Schonfeld_Generalized_Zero-_and_Few-Shot_Learning_via_Aligned_Variational_Autoencoders_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Schonfeld_Generalized_Zero-_and_Few-Shot_Learning_via_Aligned_Variational_Autoencoders_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/generalized-zero-and-few-shot-learning-via-1
Repo https://github.com/edgarschnfld/CADA-VAE-PyTorch
Framework pytorch
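
A minimal sketch of the alignment objective suggested by the abstract follows (the exact loss form is my assumption, not a quote of the released code): two modality-specific encoders produce diagonal-Gaussian latents, and a distribution-alignment term, here the closed-form 2-Wasserstein distance between the two Gaussians, pulls the image-feature latent and the class-embedding latent of the same class toward each other.

```python
import numpy as np

def gaussian_w2(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between diagonal Gaussians N(mu1, diag(sigma1^2)), N(mu2, diag(sigma2^2))."""
    return np.sqrt(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2))

# Toy latents from an image-feature encoder and a class-embedding encoder:
mu_img, sig_img = np.array([0.2, -0.1]), np.array([0.5, 0.4])
mu_cls, sig_cls = np.array([0.3, 0.0]), np.array([0.6, 0.3])
alignment_loss = gaussian_w2(mu_img, sig_img, mu_cls, sig_cls)
print(alignment_loss)   # minimized when the two latent distributions coincide
```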

Learning RoI Transformer for Oriented Object Detection in Aerial Images

Title Learning RoI Transformer for Oriented Object Detection in Aerial Images
Authors Jian Ding, Nan Xue, Yang Long, Gui-Song Xia, Qikai Lu
Abstract Object detection in aerial images is an active yet challenging task in computer vision because of the bird’s-eye view perspective, the highly complex backgrounds, and the variant appearances of objects. Especially when detecting densely packed objects in aerial images, methods relying on horizontal proposals for common object detection often introduce mismatches between the Regions of Interest (RoIs) and objects. This leads to the common misalignment between the final object classification confidence and localization accuracy. In this paper, we propose a RoI Transformer to address these problems. The core idea of the RoI Transformer is to apply spatial transformations on RoIs and learn the transformation parameters under the supervision of oriented bounding box (OBB) annotations. The RoI Transformer is lightweight and can be easily embedded into detectors for oriented object detection. Simply applying the RoI Transformer to Light-Head R-CNN achieves state-of-the-art performance on two common and challenging aerial datasets, i.e., DOTA and HRSC2016, with a negligible reduction in detection speed. Our RoI Transformer exceeds deformable Position-Sensitive RoI pooling when oriented bounding-box annotations are available. Extensive experiments have also validated the flexibility and effectiveness of our RoI Transformer.
Tasks Object Classification, Object Detection, Object Detection In Aerial Images
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Ding_Learning_RoI_Transformer_for_Oriented_Object_Detection_in_Aerial_Images_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Ding_Learning_RoI_Transformer_for_Oriented_Object_Detection_in_Aerial_Images_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/learning-roi-transformer-for-oriented-object
Repo https://github.com/dingjiansw101/RoITransformer_DOTA
Framework mxnet
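
To give a flavor of the supervised transformation step, here is a minimal sketch of decoding an oriented box from a horizontal RoI and a learned offset vector; the parameterization (center shifts scaled by the RoI size, log-scaled width and height, additive angle) is a common convention and an assumption of mine, not taken from the paper's code.

```python
import numpy as np

def decode_oriented(hroi, deltas):
    """hroi: horizontal RoI (cx, cy, w, h); deltas: learned offsets (dx, dy, dw, dh, dtheta)."""
    x, y, w, h = hroi
    dx, dy, dw, dh, dtheta = deltas
    cx = x + dx * w                      # shift the center relative to the RoI size
    cy = y + dy * h
    return cx, cy, w * np.exp(dw), h * np.exp(dh), dtheta   # oriented box (cx, cy, w, h, theta)

print(decode_oriented((50.0, 40.0, 20.0, 10.0), (0.1, -0.05, 0.2, 0.0, np.pi / 8)))
```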

Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks

Title Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks
Authors Aaron Voelker, Ivana Kajić, Chris Eliasmith
Abstract We propose a novel memory cell for recurrent neural networks that dynamically maintains information across long windows of time using relatively few resources. The Legendre Memory Unit~(LMU) is mathematically derived to orthogonalize its continuous-time history – doing so by solving $d$ coupled ordinary differential equations~(ODEs), whose phase space linearly maps onto sliding windows of time via the Legendre polynomials up to degree $d - 1$. Backpropagation across LMUs outperforms equivalently-sized LSTMs on a chaotic time-series prediction task, improves memory capacity by two orders of magnitude, and significantly reduces training and inference times. LMUs can efficiently handle temporal dependencies spanning $100\text{,}000$ time-steps, converge rapidly, and use few internal state-variables to learn complex functions spanning long windows of time – exceeding state-of-the-art performance among RNNs on permuted sequential MNIST. These results are due to the network’s disposition to learn scale-invariant features independently of step size. Backpropagation through the ODE solver allows each layer to adapt its internal time-step, enabling the network to learn task-relevant time-scales. We demonstrate that LMU memory cells can be implemented using $m$ recurrently-connected Poisson spiking neurons, $\mathcal{O}( m )$ time and memory, with error scaling as $\mathcal{O}( d / \sqrt{m} )$. We discuss implementations of LMUs on analog and digital neuromorphic hardware.
Tasks Time Series, Time Series Prediction
Published 2019-12-01
URL http://papers.nips.cc/paper/9689-legendre-memory-units-continuous-time-representation-in-recurrent-neural-networks
PDF http://papers.nips.cc/paper/9689-legendre-memory-units-continuous-time-representation-in-recurrent-neural-networks.pdf
PWC https://paperswithcode.com/paper/legendre-memory-units-continuous-time
Repo https://github.com/abr/neurips2019
Framework tf
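
The linear memory at the heart of the LMU is small enough to write down. The sketch below builds the (A, B) matrices of the Legendre delay system as defined in the paper and advances the memory state with a simple Euler step; the Euler discretization is my simplification (the paper uses a more accurate discretization such as zero-order hold).

```python
import numpy as np

def lmu_matrices(order, theta):
    """Continuous-time (A, B) whose state holds Legendre coefficients of a sliding window of length theta."""
    q = np.arange(order)
    r = (2 * q + 1)[:, None] / theta
    i, j = np.meshgrid(q, q, indexing="ij")
    A = np.where(i < j, -1.0, (-1.0) ** (i - j + 1)) * r
    B = ((-1.0) ** q)[:, None] * r
    return A, B

order, theta, dt = 6, 4.0, 0.01
A, B = lmu_matrices(order, theta)
x = np.zeros(order)                          # memory state
for t in range(1000):
    u = np.sin(0.05 * t)                     # toy input signal
    x = x + dt * (A @ x + B.ravel() * u)     # Euler step of x' = A x + B u
print(np.round(x, 3))                        # compressed history of the last theta time units
```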

Poincare Glove: Hyperbolic Word Embeddings

Title Poincare Glove: Hyperbolic Word Embeddings
Authors Alexandru Tifrea*, Gary Becigneul*, Octavian-Eugen Ganea*
Abstract Words are not created equal. In fact, they form an aristocratic graph with a latent hierarchical structure that the next generation of unsupervised learned word embeddings should reveal. In this paper, justified by the notion of delta-hyperbolicity or tree-likeliness of a space, we propose to embed words in a Cartesian product of hyperbolic spaces, which we theoretically connect to the Gaussian word embeddings and their Fisher geometry. This connection allows us to introduce a novel principled hypernymy score for word embeddings. Moreover, we adapt the well-known GloVe algorithm to learn unsupervised word embeddings in this type of Riemannian manifold. We further explain how to solve the analogy task using the Riemannian parallel transport that generalizes vector arithmetic to this new type of geometry. Empirically, based on extensive experiments, we show that our embeddings, trained unsupervised, are the first to simultaneously outperform strong and popular baselines on the tasks of similarity, analogy and hypernymy detection. In particular, for word hypernymy, we obtain a new state of the art in fully unsupervised WBLESS classification accuracy.
Tasks Learning Word Embeddings, Word Embeddings
Published 2019-05-01
URL https://openreview.net/forum?id=Ske5r3AqK7
PDF https://openreview.net/pdf?id=Ske5r3AqK7
PWC https://paperswithcode.com/paper/poincare-glove-hyperbolic-word-embeddings-1
Repo https://github.com/alex-tifrea/poincare_glove
Framework none
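
For readers unfamiliar with the geometry, here is a minimal sketch (illustrative, not the authors' training code) of the distance function on the Poincare ball in which the embeddings live; points near the boundary are exponentially far from the origin, which is what lets hierarchies (general words near the center, specific words near the rim) embed with low distortion.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between points u, v strictly inside the unit ball."""
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq / (denom + eps))

a = np.array([0.10, 0.20])
b = np.array([0.50, -0.30])
c = np.array([0.90, 0.40])          # close to the boundary
print(poincare_distance(a, b), poincare_distance(a, c))
```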

Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments

Title Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments
Authors Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
Abstract In this paper, we attempt to answer the question of whether neural network models can learn numeracy, which is the ability to predict the magnitude of a numeral at some specific position in a text description. A large benchmark dataset, called Numeracy-600K, is provided for the novel task. We explore several neural network models including CNN, GRU, BiGRU, CRNN, CNN-capsule, GRU-capsule, and BiGRU-capsule in the experiments. The results show that the BiGRU model gets the best micro-averaged F1 score of 80.16%, and the GRU-capsule model gets the best macro-averaged F1 score of 64.71%. Besides discussing the challenges through comprehensive experiments, we also present an important application scenario, i.e., detecting exaggerated information, for the task.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1635/
PDF https://www.aclweb.org/anthology/P19-1635
PWC https://paperswithcode.com/paper/numeracy-600k-learning-numeracy-for-detecting
Repo https://github.com/aistairc/Numeracy-600K
Framework none
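
The task setup can be illustrated with how magnitude labels might be constructed; the bucketing below is my reading of "predict the magnitude of a numeral" and not the released preprocessing.

```python
import math

def magnitude_class(value, n_classes=8):
    """Order-of-magnitude bucket: 0 for values < 10, 1 for < 100, ..., capped at n_classes - 1."""
    if value < 1:
        return 0
    return min(int(math.floor(math.log10(value))), n_classes - 1)

for v in (7, 42, 380, 12_500):
    print(v, magnitude_class(v))   # 0, 1, 2, 4
```

A sequence model (e.g. the BiGRU mentioned above) is then trained to predict this class for a masked numeral from its surrounding market-comment text.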

Unsupervised Graph Association for Person Re-Identification

Title Unsupervised Graph Association for Person Re-Identification
Authors Jinlin Wu, Yang Yang, Hao Liu, Shengcai Liao, Zhen Lei, Stan Z. Li
Abstract In this paper, we propose an unsupervised graph association (UGA) framework to learn the underlying view-invariant representations from video pedestrian tracklets. The core of UGA is mining the underlying cross-view associations and reducing the damage caused by noisy associations. To this end, UGA adopts a two-stage training strategy: (1) an intra-camera learning stage and (2) an inter-camera learning stage. The former learns an intra-camera representation for each camera, while the latter builds a cross-view graph (CVG) to associate different cameras. By doing this, we can learn view-invariant representations for all persons. Extensive experiments and ablation studies on seven re-id datasets demonstrate the superiority of the proposed UGA over most state-of-the-art unsupervised and domain adaptation re-id methods.
Tasks Domain Adaptation, Person Re-Identification
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Wu_Unsupervised_Graph_Association_for_Person_Re-Identification_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Wu_Unsupervised_Graph_Association_for_Person_Re-Identification_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/unsupervised-graph-association-for-person-re
Repo https://github.com/yichuan9527/Unsupervised-Graph-Association-for-Person-Re-identification
Framework none

Optimal Pricing in Repeated Posted-Price Auctions with Different Patience of the Seller and the Buyer

Title Optimal Pricing in Repeated Posted-Price Auctions with Different Patience of the Seller and the Buyer
Authors Arsenii Vanunts, Alexey Drutsa
Abstract We study revenue-optimizing pricing algorithms for repeated posted-price auctions where a seller interacts with a single strategic buyer who holds a fixed private valuation. When the participants discount their cumulative utilities unequally, we show that the optimal constant pricing (which offers the Myerson price) is no longer optimal. In the case of a more patient seller, we propose a novel multidimensional optimization functional, a generalization of the one used to determine Myerson’s price. This functional allows us to find the optimal algorithm and to boost the revenue of the optimal static pricing via an efficient low-dimensional approximation. Numerical experiments are provided to support our results.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8380-optimal-pricing-in-repeated-posted-price-auctions-with-different-patience-of-the-seller-and-the-buyer
PDF http://papers.nips.cc/paper/8380-optimal-pricing-in-repeated-posted-price-auctions-with-different-patience-of-the-seller-and-the-buyer.pdf
PWC https://paperswithcode.com/paper/optimal-pricing-in-repeated-posted-price
Repo https://github.com/theonlybars/neurips-2019-rppa
Framework none
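
As a reference point for the pricing discussion, here is a minimal numeric sketch (illustrative only) of the Myerson price mentioned in the abstract: for a single buyer with valuation CDF F, the optimal static posted price maximizes the expected revenue p * (1 - F(p)).

```python
import numpy as np

def myerson_price(cdf, grid):
    """Grid search for the price maximizing p * (1 - F(p))."""
    revenue = grid * (1.0 - cdf(grid))
    return grid[np.argmax(revenue)]

grid = np.linspace(0.0, 1.0, 10_001)
uniform_cdf = lambda p: np.clip(p, 0.0, 1.0)   # valuations uniform on [0, 1]
print(myerson_price(uniform_cdf, grid))        # ~0.5
```

The paper's point is that when the seller and the buyer discount differently, repeating this constant price is no longer revenue-optimal.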