February 1, 2020

3284 words · 16 min read

Paper Group AWR 250


Intention Recognition of Pedestrians and Cyclists by 2D Pose Estimation

Title Intention Recognition of Pedestrians and Cyclists by 2D Pose Estimation
Authors Zhijie Fang, Antonio M. López
Abstract Anticipating the intentions of vulnerable road users (VRUs) such as pedestrians and cyclists is critical for performing safe and comfortable driving maneuvers. This is the case for human driving and, thus, should be taken into account by systems providing any level of driving assistance, from advanced driver assistance systems (ADAS) to fully autonomous vehicles (AVs). In this paper, we show how the latest advances in monocular vision-based human pose estimation, i.e. those relying on deep Convolutional Neural Networks (CNNs), make it possible to recognize the intentions of such VRUs. In the case of cyclists, we assume that they follow traffic rules and indicate future maneuvers with arm signals. In the case of pedestrians, no such indications can be assumed. Instead, we hypothesize that a pedestrian's walking pattern allows us to determine whether he/she intends to cross the road in the path of the ego-vehicle, so that the ego-vehicle must maneuver accordingly (e.g. slowing down or stopping). In this paper, we show how the same methodology can be used for recognizing both pedestrians' and cyclists' intentions. For pedestrians, we perform experiments on the JAAD dataset. For cyclists, we did not find an analogous dataset, so we created our own by acquiring and annotating videos, which we share with the research community. Overall, the proposed pipeline provides new state-of-the-art results on the intention recognition of VRUs.
Tasks Autonomous Vehicles, Intent Detection, Pose Estimation
Published 2019-10-09
URL https://arxiv.org/abs/1910.03858v1
PDF https://arxiv.org/pdf/1910.03858v1.pdf
PWC https://paperswithcode.com/paper/intention-recognition-of-pedestrians-and
Repo https://github.com/VRU-intention/casr
Framework none
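As a rough illustration of the front end of such a pipeline (not the authors' code), the sketch below turns a short window of CNN-estimated 2D keypoints into a fixed-length, translation-invariant feature vector that a downstream intention classifier could consume. The 14-frame window and 17-joint COCO-style skeleton are assumptions for illustration.

```python
import numpy as np

def pose_window_features(keypoints, window=14):
    """Stack the last `window` frames of 2D keypoints into one feature vector.

    keypoints: array of shape (T, J, 2) -- T frames, J joints, (x, y) each.
    Returns a flat vector of length window * J * 2 (zero-padded if T < window).
    """
    T, J, _ = keypoints.shape
    buf = np.zeros((window, J, 2))
    take = min(T, window)
    buf[-take:] = keypoints[-take:]
    # Subtract each frame's joint centroid so the feature encodes the gait
    # pattern rather than the absolute image position of the person.
    buf -= buf.mean(axis=1, keepdims=True)
    return buf.reshape(-1)

# Toy usage: 20 frames of a 17-joint skeleton.
rng = np.random.default_rng(0)
seq = rng.normal(size=(20, 17, 2))
feat = pose_window_features(seq)  # shape (14 * 17 * 2,) = (476,)
```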

The iMaterialist Fashion Attribute Dataset

Title The iMaterialist Fashion Attribute Dataset
Authors Sheng Guo, Weilin Huang, Xiao Zhang, Prasanna Srikhanta, Yin Cui, Yuan Li, Matthew R. Scott, Hartwig Adam, Serge Belongie
Abstract Large-scale image databases such as ImageNet have significantly advanced image classification and other visual recognition tasks. However, most of these datasets are constructed only for single-label and coarse object-level classification. For real-world applications, multiple labels and fine-grained categories are often needed, yet very few such datasets exist publicly, especially large-scale, high-quality ones. In this work, we contribute to the community a new dataset called iMaterialist Fashion Attribute (iFashion-Attribute) to address this problem in the fashion domain. The dataset is constructed from over one million fashion images with a label space that includes 8 groups of 228 fine-grained attributes in total. Each image is annotated by experts with multiple, high-quality fashion attributes. The result is the first known million-scale multi-label and fine-grained image dataset. We conduct extensive experiments and provide baseline results with modern deep Convolutional Neural Networks (CNNs). Additionally, we demonstrate that models pre-trained on iFashion-Attribute achieve superior transfer-learning performance on fashion-related tasks compared with pre-training on ImageNet or other fashion datasets. Data is available at: https://github.com/visipedia/imat_fashion_comp
Tasks Image Classification, Transfer Learning
Published 2019-06-13
URL https://arxiv.org/abs/1906.05750v2
PDF https://arxiv.org/pdf/1906.05750v2.pdf
PWC https://paperswithcode.com/paper/the-imaterialist-fashion-attribute-dataset
Repo https://github.com/visipedia/imat_fashion_comp
Framework none
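A multi-label attribute dataset like this one calls for a different training objective than single-label classification: each of the 228 attributes becomes an independent binary decision. A minimal, framework-agnostic sketch of the standard binary-cross-entropy-with-logits loss such a baseline would use (not the paper's exact training recipe):

```python
import numpy as np

def multilabel_bce(logits, targets):
    """Binary cross-entropy over independent sigmoid outputs, one per attribute.

    logits, targets: arrays of shape (batch, n_attributes); targets in {0, 1}.
    Each attribute is its own binary decision rather than one softmax class.
    """
    # Numerically stable form: max(x, 0) - x*t + log(1 + exp(-|x|))
    return float(np.mean(np.maximum(logits, 0) - logits * targets
                         + np.log1p(np.exp(-np.abs(logits)))))

logits = np.array([[2.0, -1.0], [0.0, 3.0]])
targets = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = multilabel_bce(logits, targets)  # small, since logits mostly agree
```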

DELTA: DEep Learning Transfer using Feature Map with Attention for Convolutional Networks

Title DELTA: DEep Learning Transfer using Feature Map with Attention for Convolutional Networks
Authors Xingjian Li, Haoyi Xiong, Hanchao Wang, Yuxuan Rao, Liping Liu, Jun Huan
Abstract Transfer learning through fine-tuning a neural network pre-trained on an extremely large dataset, such as ImageNet, can significantly accelerate training, while accuracy is frequently bottlenecked by the limited size of the new target dataset. To address this problem, regularization methods that constrain the outer-layer weights of the target network using the starting point as a reference (SPAR) have been studied. In this paper, we propose a novel regularized transfer-learning framework, DELTA, namely DEep Learning Transfer using Feature Map with Attention. Instead of constraining the weights of the neural network, DELTA aims to preserve the outer-layer outputs of the target network. Specifically, in addition to minimizing the empirical loss, DELTA aligns the outer-layer outputs of the two networks by constraining a subset of feature maps, precisely selected by attention that is learned in a supervised manner. We evaluate DELTA against state-of-the-art algorithms, including L2 and L2-SP. The experimental results show that our proposed method outperforms these baselines with higher accuracy on new tasks.
Tasks Transfer Learning
Published 2019-01-26
URL http://arxiv.org/abs/1901.09229v2
PDF http://arxiv.org/pdf/1901.09229v2.pdf
PWC https://paperswithcode.com/paper/delta-deep-learning-transfer-using-feature
Repo https://github.com/lixingjian/DELTA
Framework pytorch
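The core idea, aligning feature-map outputs rather than weights, can be sketched in a few lines. Below is a hypothetical numpy version of the attention-weighted discrepancy term (the real DELTA learns the per-channel attention weights in a supervised manner; here they are simply given):

```python
import numpy as np

def delta_regularizer(fm_target, fm_source, attention):
    """Attention-weighted discrepancy between target and source feature maps.

    fm_target, fm_source: feature maps of shape (C, H, W).
    attention: per-channel weights of shape (C,); channels deemed more
    important for the target task are preserved more strongly.
    """
    diffs = ((fm_target - fm_source) ** 2).sum(axis=(1, 2))  # per-channel L2^2
    return float((attention * diffs).sum())

# Toy feature maps: 3 channels of 4x4 activations.
rng = np.random.default_rng(1)
src = rng.normal(size=(3, 4, 4))
tgt = src + 0.1 * rng.normal(size=(3, 4, 4))
att = np.array([0.5, 0.3, 0.2])
reg = delta_regularizer(tgt, src, att)
# The full objective would then be: empirical_loss + lambda * reg
```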

Context Model for Pedestrian Intention Prediction using Factored Latent-Dynamic Conditional Random Fields

Title Context Model for Pedestrian Intention Prediction using Factored Latent-Dynamic Conditional Random Fields
Authors Satyajit Neogi, Michael Hoy, Kang Dang, Hang Yu, Justin Dauwels
Abstract Smooth handling of pedestrian interactions is a key requirement for Autonomous Vehicles (AVs) and Advanced Driver Assistance Systems (ADAS). Such systems call for early and accurate prediction of a pedestrian's crossing/not-crossing behaviour in front of the vehicle. Existing approaches to pedestrian behaviour prediction make use of pedestrian motion, his/her location in the scene, and static context variables such as traffic lights, zebra crossings, etc. We stress the necessity of early prediction for smooth operation of such systems, and introduce the influence of vehicle interactions on pedestrian intention for this purpose. In this paper, we show a discernible advance in prediction time aided by the inclusion of such vehicle-interaction context. We apply our methods to two different datasets: one collected in-house (the NTU dataset) and a public real-life benchmark (the JAAD dataset). We also propose a generic graphical model, Factored Latent-Dynamic Conditional Random Fields (FLDCRF), for single- and multi-label sequence prediction as well as joint interaction-modeling tasks. FLDCRF outperforms Long Short-Term Memory (LSTM) networks across the datasets ($\sim$100 sequences per dataset) over identical time-series features. While the existing best system predicts pedestrian stopping behaviour with 70% accuracy 0.38 seconds before the actual event, our system achieves such accuracy at least 0.9 seconds on average before the actual event across datasets.
Tasks Autonomous Vehicles, Time Series
Published 2019-07-27
URL https://arxiv.org/abs/1907.11881v2
PDF https://arxiv.org/pdf/1907.11881v2.pdf
PWC https://paperswithcode.com/paper/context-model-for-pedestrian-intention
Repo https://github.com/satyajitneogiju/FLDCRF-for-sequence-labeling
Framework none

Correlation Clustering with Same-Cluster Queries Bounded by Optimal Cost

Title Correlation Clustering with Same-Cluster Queries Bounded by Optimal Cost
Authors Barna Saha, Sanjay Subramanian
Abstract Several clustering frameworks with interactive (semi-supervised) queries have been studied in the past. Recently, clustering with same-cluster queries has become popular. An algorithm in this setting has access to an oracle with full knowledge of an optimal clustering, and the algorithm can ask the oracle queries of the form, “Does the optimal clustering put vertices $ u $ and $ v $ in the same cluster?” Due to its simplicity, this querying model can easily be implemented in real crowd-sourcing platforms and has attracted a lot of recent work. In this paper, we study the popular correlation clustering problem (Bansal et al., 2002) under this framework. Given a complete graph $G=(V,E)$ with positive and negative edge labels, the correlation clustering objective is to compute a graph clustering that minimizes the total number of disagreements, that is, the number of negative intra-cluster edges plus positive inter-cluster edges. Let $ C_{OPT} $ be the number of disagreements made by the optimal clustering. We present algorithms for correlation clustering whose error and query bounds are parameterized by $C_{OPT}$ rather than by the number of clusters. Indeed, a good clustering must have small $C_{OPT}$. Specifically, we present an efficient algorithm that recovers an exact optimal clustering using at most $2C_{OPT}$ queries and an efficient algorithm that outputs a $2$-approximation using at most $C_{OPT}$ queries. In addition, we show that, under a plausible complexity assumption, no polynomial-time algorithm can achieve an approximation ratio better than $1+\alpha$ for an absolute constant $\alpha > 0$ with $o(C_{OPT})$ queries. We extensively evaluate our methods on several synthetic and real-world datasets using real crowd-sourced oracles, and compare our approach against several known correlation clustering algorithms.
Tasks Graph Clustering
Published 2019-08-14
URL https://arxiv.org/abs/1908.04976v1
PDF https://arxiv.org/pdf/1908.04976v1.pdf
PWC https://paperswithcode.com/paper/correlation-clustering-with-same-cluster
Repo https://github.com/sanjayss34/corr-clust-query-esa2019
Framework none
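The querying model is simple enough to sketch. The toy algorithm below greedily builds clusters by asking a same-cluster oracle about one representative per cluster; it illustrates the oracle interface only, and does not attempt the paper's $C_{OPT}$-bounded algorithms, which additionally exploit the $\pm$ edge labels.

```python
def cluster_with_oracle(vertices, same_cluster):
    """Greedy clustering driven by a same-cluster oracle.

    same_cluster(u, v) answers whether the optimal clustering puts u and v
    together. Each new vertex is compared against one representative per
    existing cluster, so the query count is O(n * #clusters).
    """
    clusters = []  # each cluster is a list; clusters[i][0] is its representative
    for v in vertices:
        for c in clusters:
            if same_cluster(c[0], v):
                c.append(v)
                break
        else:
            clusters.append([v])
    return clusters

# Toy oracle: the ground-truth clusters are the evens and the odds.
oracle = lambda u, v: (u % 2) == (v % 2)
print(cluster_with_oracle(range(6), oracle))  # [[0, 2, 4], [1, 3, 5]]
```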

JNET: Learning User Representations via Joint Network Embedding and Topic Embedding

Title JNET: Learning User Representations via Joint Network Embedding and Topic Embedding
Authors Lin Gong, Lu Lin, Weihao Song, Hongning Wang
Abstract User representation learning is vital for capturing diverse user preferences, but it is also challenging because user intents are latent and scattered among complex and heterogeneous modalities of user-generated data, and are thus not directly measurable. Inspired by the concept of the user schema in social psychology, we take a new perspective on user representation learning by constructing a shared latent space that captures the dependency among different modalities of user-generated data. Both users and topics are embedded in the same space to encode users' social connections and text content, facilitating joint modeling of the different modalities via a probabilistic generative framework. We evaluated the proposed solution on large collections of Yelp reviews and StackOverflow discussion posts, with their associated network structures. The proposed model outperformed several state-of-the-art topic-modeling-based user models in predictive power on unseen documents, and state-of-the-art network-embedding-based user models in link-prediction quality on unseen nodes. The learnt user representations also prove useful in content recommendation, e.g., expert finding on StackOverflow.
Tasks Link Prediction, Network Embedding, Representation Learning
Published 2019-12-01
URL https://arxiv.org/abs/1912.00465v1
PDF https://arxiv.org/pdf/1912.00465v1.pdf
PWC https://paperswithcode.com/paper/jnet-learning-user-representations-via-joint
Repo https://github.com/Linda-sunshine/JNET
Framework none

Learning to Predict Robot Keypoints Using Artificially Generated Images

Title Learning to Predict Robot Keypoints Using Artificially Generated Images
Authors Christoph Heindl, Sebastian Zambal, Josef Scharinger
Abstract This work treats robot keypoint estimation on color images as a supervised machine learning task. We propose the use of probabilistically created renderings to overcome the lack of labeled real images. Rather than sampling from stationary distributions, our approach introduces a feedback mechanism that constantly adapts the probability distributions according to current training progress. Initial results show that our approach achieves near-human-level accuracy on real images. Additionally, we demonstrate that the feedback mechanism leads to fewer required training steps while maintaining the same model quality on synthetic datasets.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.01879v1
PDF https://arxiv.org/pdf/1907.01879v1.pdf
PWC https://paperswithcode.com/paper/learning-to-predict-robot-keypoints-using
Repo https://github.com/cheind/pytorch-blender
Framework pytorch
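One way to picture the feedback mechanism: discretize a render parameter into bins and shift sampling mass toward bins where training loss is currently highest. The update rule below is a hypothetical sketch in the spirit of the paper, not the authors' exact scheme.

```python
import numpy as np

def adapt_render_distribution(probs, bin_losses, lr=0.5):
    """Shift sampling probability toward render-parameter bins where the
    model currently performs worst (higher loss -> sampled more often).

    probs: current sampling distribution over parameter bins.
    bin_losses: mean training loss recently observed per bin.
    """
    target = bin_losses / bin_losses.sum()  # loss-proportional target
    new = (1 - lr) * probs + lr * target    # smoothed step toward the target
    return new / new.sum()

probs = np.full(4, 0.25)                    # start uniform over 4 bins
losses = np.array([0.1, 0.1, 0.1, 0.7])     # bin 3 is hardest right now
new_probs = adapt_render_distribution(probs, losses)
# Bin 3 now receives almost half the sampling mass.
```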

From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization

Title From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization
Authors Krzysztof Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang
Abstract We present ASEBO, a new algorithm for optimizing high-dimensional blackbox functions. ASEBO adapts to the geometry of the function and learns, on the fly, optimal sets of sensing directions used to probe it. It addresses the exploration-exploitation trade-off of blackbox optimization with expensive queries by continuously learning the bias of the lower-dimensional model used to approximate gradients of smoothings of the function, via compressed-sensing and contextual-bandit methods. To obtain this model, it leverages techniques from the emerging theory of active subspaces in the novel context of ES blackbox optimization. As a result, ASEBO learns the dynamically changing intrinsic dimensionality of the gradient space and adapts to the hardness of different stages of the optimization without external supervision. Consequently, it leads to more sample-efficient blackbox optimization than state-of-the-art algorithms. We provide theoretical results and empirically test ASEBO's advantages over other methods by evaluating it on a set of reinforcement learning policy optimization tasks as well as on functions from the recently open-sourced Nevergrad library.
Tasks Multi-Armed Bandits
Published 2019-03-07
URL https://arxiv.org/abs/1903.04268v3
PDF https://arxiv.org/pdf/1903.04268v3.pdf
PWC https://paperswithcode.com/paper/adaptive-sample-efficient-blackbox
Repo https://github.com/jparkerholder/ASEBO
Framework none
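The building block ASEBO adapts is the evolution-strategies gradient estimator: probe the blackbox along sensing directions and accumulate finite-difference slopes. The sketch below shows only that estimator with the directions passed in; ASEBO's contribution, selecting those directions from an active subspace of past gradients, is not reproduced here.

```python
import numpy as np

def es_gradient(f, x, directions, sigma=0.1):
    """Antithetic ES gradient estimate: probe f along each sensing direction
    and accumulate the finite-difference slope times the direction. With an
    orthonormal basis this recovers the exact gradient for quadratics.
    """
    g = np.zeros_like(x)
    for d in directions:
        g += (f(x + sigma * d) - f(x - sigma * d)) / (2 * sigma) * d
    return g

f = lambda x: -np.sum(x ** 2)        # maximize; optimum at the origin
x = np.array([1.0, -2.0, 0.5])
grad = es_gradient(f, x, np.eye(3))  # ~ [-2, 4, -1], the gradient of -||x||^2
```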

Expressive power of tensor-network factorizations for probabilistic modeling, with applications from hidden Markov models to quantum machine learning

Title Expressive power of tensor-network factorizations for probabilistic modeling, with applications from hidden Markov models to quantum machine learning
Authors Ivan Glasser, Ryan Sweke, Nicola Pancotti, Jens Eisert, J. Ignacio Cirac
Abstract Tensor-network techniques have enjoyed outstanding success in physics, and have recently attracted attention in machine learning, both as a tool for the formulation of new learning algorithms and for enhancing the mathematical understanding of existing methods. Inspired by these developments, and the natural correspondence between tensor networks and probabilistic graphical models, we provide a rigorous analysis of the expressive power of various tensor-network factorizations of discrete multivariate probability distributions. These factorizations include non-negative tensor-trains/MPS, which are in correspondence with hidden Markov models, and Born machines, which are naturally related to local quantum circuits. When used to model probability distributions, they exhibit tractable likelihoods and admit efficient learning algorithms. Interestingly, we prove that there exist probability distributions for which there are unbounded separations between the resource requirements of some of these tensor-network factorizations. Particularly surprising is the fact that using complex instead of real tensors can lead to an arbitrarily large reduction in the number of parameters of the network. Additionally, we introduce locally purified states (LPS), a new factorization inspired by techniques for the simulation of quantum systems, with provably better expressive power than all other representations considered. The ramifications of this result are explored through numerical experiments. Our findings imply that LPS should be considered over hidden Markov models, and furthermore provide guidelines for the design of local quantum circuits for probabilistic modeling.
Tasks Quantum Machine Learning, Tensor Networks
Published 2019-07-08
URL https://arxiv.org/abs/1907.03741v2
PDF https://arxiv.org/pdf/1907.03741v2.pdf
PWC https://paperswithcode.com/paper/expressive-power-of-tensor-network
Repo https://github.com/glivan/tensor_networks_for_probabilistic_modeling
Framework none
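To make the tensor-train/MPS factorization concrete: the (unnormalized) probability of a string is a chain of matrix products, one matrix per symbol at each site. A minimal numpy sketch for binary variables, with non-negative cores as in the HMM-equivalent case:

```python
import numpy as np

def tt_prob(cores, bits):
    """Unnormalized probability of a bit string under a non-negative
    tensor-train / MPS factorization: p(x) ∝ A_1[x_1] @ ... @ A_n[x_n].

    cores: list of arrays of shape (2, r_in, r_out); boundary ranks are 1.
    """
    v = np.ones((1, 1))
    for core, b in zip(cores, bits):
        v = v @ core[b]          # select this site's matrix by the symbol
    return float(v[0, 0])

# Three binary sites with bond dimension 3.
rng = np.random.default_rng(2)
cores = [np.abs(rng.normal(size=(2, 1, 3))),
         np.abs(rng.normal(size=(2, 3, 3))),
         np.abs(rng.normal(size=(2, 3, 1)))]
# Summing over all 8 strings gives the normalizer Z.
Z = sum(tt_prob(cores, (a, b, c))
        for a in (0, 1) for b in (0, 1) for c in (0, 1))
p = tt_prob(cores, (0, 1, 0)) / Z   # normalized probability of "010"
```

A Born machine replaces the non-negative entries with arbitrary (possibly complex) ones and squares the modulus of the contraction.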

How to best use Syntax in Semantic Role Labelling

Title How to best use Syntax in Semantic Role Labelling
Authors Yufei Wang, Mark Johnson, Stephen Wan, Yifang Sun, Wei Wang
Abstract There are many different ways in which external information might be used in an NLP task. This paper investigates how external syntactic information can be used most effectively in the Semantic Role Labeling (SRL) task. We evaluate three different ways of encoding syntactic parses and three different ways of injecting them into a state-of-the-art neural ELMo-based SRL sequence labelling model. We show that using a constituency representation as input features improves performance the most, achieving a new state-of-the-art for non-ensemble SRL models on the in-domain CoNLL’05 and CoNLL’12 benchmarks.
Tasks Semantic Role Labeling
Published 2019-06-01
URL https://arxiv.org/abs/1906.00266v1
PDF https://arxiv.org/pdf/1906.00266v1.pdf
PWC https://paperswithcode.com/paper/190600266
Repo https://github.com/GaryYufei/bestParseSRL
Framework tf

End-to-End Denoising of Dark Burst Images Using Recurrent Fully Convolutional Networks

Title End-to-End Denoising of Dark Burst Images Using Recurrent Fully Convolutional Networks
Authors Di Zhao, Lan Ma, Songnan Li, Dahai Yu
Abstract When taking photos in dim-light environments, the small amount of incoming light makes the shot images extremely dark and noisy, with colors that fail to reflect the real-world scene. Under these conditions, traditional single-image denoising methods are largely ineffective. One common idea is to take multiple frames of the same scene to enhance the signal-to-noise ratio. This paper proposes a recurrent fully convolutional network (RFCN) that processes bursts of photos taken under extremely low-light conditions and produces denoised images with improved brightness. Our model maps raw burst images directly to sRGB outputs, either producing a single best image or generating a multi-frame denoised image sequence. This process proves capable of accomplishing the low-level task of denoising as well as the high-level tasks of color correction and enhancement, all performed end-to-end by our network. Our method achieves better results than state-of-the-art methods. In addition, we have applied a model trained on one type of camera, without fine-tuning, to photos captured by different cameras and obtained similar end-to-end enhancements.
Tasks Denoising, Image Denoising
Published 2019-04-16
URL http://arxiv.org/abs/1904.07483v1
PDF http://arxiv.org/pdf/1904.07483v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-denoising-of-dark-burst-images
Repo https://github.com/z-bingo/Recurrent-Fully-Convolutional-Networks
Framework pytorch
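The recurrent structure, carrying a state across the frames of a burst, can be illustrated without the learned network. The sketch below stands in for the RFCN with a trivial exponential moving average; it shows why multi-frame fusion suppresses noise, not how the paper's model computes its update.

```python
import numpy as np

def fuse_burst(frames, alpha=0.3):
    """Recurrently fuse a burst of noisy frames into one estimate.

    A hypothetical stand-in for the learned recurrence: the state is
    carried frame to frame and blended with each new observation, so
    independent noise averages out across the burst.
    """
    state = frames[0].astype(float)
    outputs = [state.copy()]
    for frame in frames[1:]:
        state = (1 - alpha) * state + alpha * frame  # recurrent update
        outputs.append(state.copy())
    return outputs  # per-frame denoised sequence; outputs[-1] is the final image

rng = np.random.default_rng(3)
clean = np.ones((8, 8))
burst = [clean + 0.5 * rng.normal(size=(8, 8)) for _ in range(6)]
fused = fuse_burst(burst)
# fused[-1] is closer to `clean` than any single noisy frame.
```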

ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring

Title ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring
Authors David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel
Abstract We improve the recently-proposed “MixMatch” semi-supervised learning algorithm by introducing two new techniques: distribution alignment and augmentation anchoring. Distribution alignment encourages the marginal distribution of predictions on unlabeled data to be close to the marginal distribution of ground-truth labels. Augmentation anchoring feeds multiple strongly augmented versions of an input into the model and encourages each output to be close to the prediction for a weakly-augmented version of the same input. To produce strong augmentations, we propose a variant of AutoAugment which learns the augmentation policy while the model is being trained. Our new algorithm, dubbed ReMixMatch, is significantly more data-efficient than prior work, requiring between $5\times$ and $16\times$ less data to reach the same accuracy. For example, on CIFAR-10 with 250 labeled examples we reach $93.73\%$ accuracy (compared to MixMatch’s accuracy of $93.58\%$ with $4{,}000$ examples) and a median accuracy of $84.92\%$ with just four labels per class. We make our code and data open-source at https://github.com/google-research/remixmatch.
Tasks Image Classification, Semi-Supervised Image Classification
Published 2019-11-21
URL https://arxiv.org/abs/1911.09785v2
PDF https://arxiv.org/pdf/1911.09785v2.pdf
PWC https://paperswithcode.com/paper/remixmatch-semi-supervised-learning-with-1
Repo https://github.com/google-research/remixmatch
Framework tf
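Distribution alignment itself is a one-liner: scale each prediction by the ratio of the ground-truth label marginal to a running average of the model's predictions, then renormalize. A small numpy sketch of that step (the running mean would normally be maintained over recent batches):

```python
import numpy as np

def distribution_alignment(pred, running_mean, label_marginal):
    """Align a prediction on unlabeled data to the label marginal.

    pred: model's class probabilities for one unlabeled example.
    running_mean: running average of the model's predictions.
    label_marginal: marginal distribution of the ground-truth labels.
    """
    aligned = pred * (label_marginal / running_mean)
    return aligned / aligned.sum()

pred = np.array([0.7, 0.2, 0.1])             # model over-predicts class 0
running = np.array([0.6, 0.2, 0.2])          # model's average prediction
marginal = np.array([1 / 3, 1 / 3, 1 / 3])   # true labels are balanced
aligned = distribution_alignment(pred, running, marginal)
# Class 0's share drops and the under-predicted classes are boosted.
```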

Extensional Higher-Order Paramodulation in Leo-III

Title Extensional Higher-Order Paramodulation in Leo-III
Authors Alexander Steen, Christoph Benzmüller
Abstract Leo-III is an automated theorem prover for extensional type theory with Henkin semantics and choice. Reasoning with primitive equality is enabled by adapting paramodulation-based proof search to higher-order logic. The prover may cooperate with multiple external specialist reasoning systems such as first-order provers and SMT solvers. Leo-III is compatible with the TPTP/TSTP framework for input formats, reporting results and proofs, and standardized communication between reasoning systems, enabling e.g. proof reconstruction from within proof assistants such as Isabelle/HOL. Leo-III supports reasoning in polymorphic first-order and higher-order logic, in all normal quantified modal logics, as well as in different deontic logics. Its development initiated the ongoing extension of the TPTP infrastructure to reasoning within non-classical logics.
Tasks
Published 2019-07-26
URL https://arxiv.org/abs/1907.11501v1
PDF https://arxiv.org/pdf/1907.11501v1.pdf
PWC https://paperswithcode.com/paper/extensional-higher-order-paramodulation-in
Repo https://github.com/leoprover/Leo-III
Framework none

Learning Smooth Representation for Unsupervised Domain Adaptation

Title Learning Smooth Representation for Unsupervised Domain Adaptation
Authors Guanyu Cai, Yuqin Wang, Lianghua He
Abstract In unsupervised domain adaptation, existing methods have achieved remarkable performance, but few pay attention to the Lipschitz constraint. Prior work has shown that, beyond reducing the divergence between distributions, satisfying Lipschitz continuity guarantees an error bound for the target distribution. In this paper, we adopt this principle and extend it to a deep end-to-end model. We define a measure named the local smooth discrepancy to quantify the Lipschitzness of the target distribution in a pointwise way. Further, several critical factors affecting the error bound are taken into account in our proposed optimization strategy to ensure effectiveness and stability. Empirical evidence shows that the proposed method is comparable or superior to state-of-the-art methods, and that our modifications are important for its validity.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2019-05-26
URL https://arxiv.org/abs/1905.10748v3
PDF https://arxiv.org/pdf/1905.10748v3.pdf
PWC https://paperswithcode.com/paper/190510748
Repo https://github.com/CuthbertCai/SRDA
Framework pytorch
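A pointwise Lipschitzness measure of this kind can be sketched generically: perturb the input slightly and record how far the model's output moves relative to the perturbation size. This is a hypothetical reading of the idea, not the paper's exact formula.

```python
import numpy as np

def local_smooth_discrepancy(f, x, eps=0.1, n_probes=8, seed=0):
    """Pointwise Lipschitzness estimate at x: the worst observed ratio
    ||f(x + d) - f(x)|| / ||d|| over random perturbations d of norm eps.
    Large values flag points where the model is not locally smooth.
    """
    rng = np.random.default_rng(seed)
    base = f(x)
    worst = 0.0
    for _ in range(n_probes):
        d = rng.normal(size=x.shape)
        d = eps * d / np.linalg.norm(d)   # random perturbation of norm eps
        worst = max(worst, np.linalg.norm(f(x + d) - base) / eps)
    return worst

smooth = lambda x: 0.5 * x               # linear map, Lipschitz constant 0.5
x = np.array([1.0, 2.0])
lsd = local_smooth_discrepancy(smooth, x)  # ~ 0.5 for this linear map
```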

Multiple Partitions Aligned Clustering

Title Multiple Partitions Aligned Clustering
Authors Zhao Kang, Zipeng Guo, Shudong Huang, Siying Wang, Wenyu Chen, Yuanzhang Su, Zenglin Xu
Abstract Multi-view clustering is an important yet challenging task due to the difficulty of integrating the information from multiple representations. Most existing multi-view clustering methods explore the heterogeneous information in the space where the data points lie. Such common practice may cause significant information loss because of unavoidable noise or inconsistency among views. Since different views admit the same cluster structure, the natural space in which to integrate them is the space of partitions. Orthogonal to existing techniques, in this paper we propose to leverage multi-view information by fusing partitions. Specifically, we align each partition to form a consensus cluster indicator matrix through a distinct rotation matrix. Moreover, a weight is assigned to each view to account for differences in the clustering capacity of views. Finally, the basic partitions, weights, and consensus clustering are jointly learned in a unified framework. We demonstrate the effectiveness of our approach on several real datasets, where significant improvement is found over other state-of-the-art multi-view clustering methods.
Tasks
Published 2019-09-13
URL https://arxiv.org/abs/1909.06008v1
PDF https://arxiv.org/pdf/1909.06008v1.pdf
PWC https://paperswithcode.com/paper/multiple-partitions-aligned-clustering
Repo https://github.com/sckangz/mPAC
Framework none
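The per-view alignment step, finding the rotation that maps one view's partition onto the consensus indicator, is the classic orthogonal Procrustes problem, solvable in closed form with one SVD. A small sketch of that single step (the paper's full method additionally learns the weights and the consensus jointly):

```python
import numpy as np

def align_partition(H, consensus):
    """Best orthogonal matrix R aligning one view's cluster indicator H to
    the consensus indicator: the Procrustes solution R = U @ Vt from the
    SVD of H.T @ consensus, minimizing ||H @ R - consensus||_F.
    """
    U, _, Vt = np.linalg.svd(H.T @ consensus)
    return U @ Vt

# Toy check: a rotated indicator matrix is mapped back onto the consensus.
consensus = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
theta = np.pi / 5
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
H = consensus @ R_true          # the "view" observes a rotated partition
R = align_partition(H, consensus)
# H @ R recovers the consensus indicator exactly.
```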