July 29, 2019


Paper Group AWR 184

Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters. Associative Domain Adaptation. Automated Hate Speech Detection and the Problem of Offensive Language. Cross-Validation with Confidence. Convolutional Dictionary Learning via Local Processing. Attention-based Extraction of Structured Inform …

Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters


Title	Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters
Authors	Lucas Beyer, Stefan Breuers, Vitaly Kurin, Bastian Leibe
Abstract	With the rise of end-to-end learning through deep learning, person detectors and re-identification (ReID) models have recently become very strong. Multi-camera multi-target (MCMT) tracking has not fully gone through this transformation yet. We intend to take another step in this direction by presenting a theoretically principled way of integrating ReID with tracking formulated as an optimal Bayes filter. This conveniently side-steps the need for data-association and opens up a direct path from full images to the core of the tracker. While the results are still sub-par, we believe that this new, tight integration opens many interesting research opportunities and leads the way towards full end-to-end tracking from raw pixels.
Tasks
Published	2017-05-12
URL	http://arxiv.org/abs/1705.04608v2
PDF	http://arxiv.org/pdf/1705.04608v2.pdf
PWC	https://paperswithcode.com/paper/towards-a-principled-integration-of-multi
Repo	https://github.com/VisualComputingInstitute/towards-reid-tracking
Framework	none
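
To make the filtering idea in the abstract above concrete, here is a minimal sketch of a generic discrete Bayes filter (one predict step, one update step) on a four-cell grid. It is not the authors' tracker: the grid, the transition matrix, and the likelihood values are made-up illustrations, and the likelihood only stands in for whatever score a ReID model might supply.

```python
def predict(belief, transition):
    """Propagate the belief one step; transition[i][j] = P(next=j | current=i)."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(belief, likelihood):
    """Reweight the predicted belief by a per-cell likelihood and renormalise."""
    unnormalised = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("measurement is impossible under the predicted belief")
    return [u / total for u in unnormalised]


# Hypothetical toy setup: a target drifting to the right along four cells.
belief = [0.25, 0.25, 0.25, 0.25]
transition = [
    [0.8, 0.2, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.0, 0.0, 0.8, 0.2],
    [0.0, 0.0, 0.0, 1.0],
]
# Assumed stand-in for a ReID-style score: cell 1 looks most like the target.
likelihood = [0.1, 0.7, 0.1, 0.1]

posterior = update(predict(belief, transition), likelihood)
```

Because the filter keeps a full distribution over cells and only reweights it, no hard assignment of detections to tracks is made in this sketch.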

Associative Domain Adaptation


Title	Associative Domain Adaptation
Authors	Philip Haeusser, Thomas Frerix, Alexander Mordvintsev, Daniel Cremers
Abstract	We propose associative domain adaptation, a novel technique for end-to-end domain adaptation with neural networks, the task of inferring class labels for an unlabeled target domain based on the statistical properties of a labeled source domain. Our training scheme follows the paradigm that in order to effectively derive class labels for the target domain, a network should produce statistically domain invariant embeddings, while minimizing the classification error on the labeled source domain. We accomplish this by reinforcing associations between source and target data directly in embedding space. Our method can easily be added to any existing classification network with no structural and almost no computational overhead. We demonstrate the effectiveness of our approach on various benchmarks and achieve state-of-the-art results across the board with a generic convolutional neural network architecture not specifically tuned to the respective tasks. Finally, we show that the proposed association loss produces embeddings that are more effective for domain adaptation compared to methods employing maximum mean discrepancy as a similarity measure in embedding space.
Tasks	Domain Adaptation
Published	2017-08-02
URL	http://arxiv.org/abs/1708.00938v1
PDF	http://arxiv.org/pdf/1708.00938v1.pdf
PWC	https://paperswithcode.com/paper/associative-domain-adaptation
Repo	https://github.com/stes/torch-associative
Framework	pytorch
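
One way to read "reinforcing associations between source and target data directly in embedding space" is as a round trip: start at a labeled source embedding, step to a target embedding, step back, and reward landing on a source embedding of the same class. The sketch below implements such a round-trip ("walker") loss in plain Python. The function names, the dot-product similarity, and the toy embeddings are assumptions for illustration; the paper and the linked repository define the actual objective, which may include further terms.

```python
import math


def softmax_rows(matrix):
    """Row-wise softmax with the usual max-subtraction for stability."""
    result = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def walker_loss(source, labels, target):
    """Cross-entropy between round-trip probabilities and a same-class target."""
    similarity = matmul(source, transpose(target))      # shape (n_source, n_target)
    p_source_to_target = softmax_rows(similarity)
    p_target_to_source = softmax_rows(transpose(similarity))
    round_trip = matmul(p_source_to_target, p_target_to_source)
    loss = 0.0
    for i, label in enumerate(labels):
        same = [j for j, other in enumerate(labels) if other == label]
        # Uniform target distribution over source points sharing the label.
        loss -= sum(math.log(round_trip[i][j] + 1e-12) for j in same) / len(same)
    return loss / len(labels)


# Hypothetical embeddings: two source classes, two unlabeled target points.
source = [[5.0, 0.0], [0.0, 5.0]]
labels = [0, 1]
aligned_target = [[5.0, 0.0], [0.0, 5.0]]
collapsed_target = [[1.0, 1.0], [1.0, 1.0]]

loss_aligned = walker_loss(source, labels, aligned_target)
loss_collapsed = walker_loss(source, labels, collapsed_target)
```

In this toy case the aligned target embeddings give a loss near zero, while the collapsed ones make every round trip equally likely and are penalised.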

Automated Hate Speech Detection and the Problem of Offensive Language


Title	Automated Hate Speech Detection and the Problem of Offensive Language
Authors	Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber
Abstract	A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We use a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, only offensive language, and those with neither. We train a multi-class classifier to distinguish between these different categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.
Tasks	Hate Speech Detection
Published	2017-03-11
URL	http://arxiv.org/abs/1703.04009v1
PDF	http://arxiv.org/pdf/1703.04009v1.pdf
PWC	https://paperswithcode.com/paper/automated-hate-speech-detection-and-the
Repo	https://github.com/renuka-fernando/sinhalese_language_racism_detection
Framework	tf

Cross-Validation with Confidence


Title	Cross-Validation with Confidence
Authors	Jing Lei
Abstract	Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to select overfitting models because they ignore the uncertainty in the testing sample. We develop a new, statistically principled inference tool based on cross-validation that takes into account the uncertainty in the testing sample. This new method outputs a set of highly competitive candidate models containing the best one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for regularizing tuning parameter selection, the method can provide a further trade-off between prediction accuracy and model interpretability. We demonstrate the performance of the proposed method in several simulated and real data examples.
Tasks	Model Selection
Published	2017-03-23
URL	http://arxiv.org/abs/1703.07904v2
PDF	http://arxiv.org/pdf/1703.07904v2.pdf
PWC	https://paperswithcode.com/paper/cross-validation-with-confidence
Repo	https://github.com/tim-coleman/CVC_Caret
Framework	none

Convolutional Dictionary Learning via Local Processing


Title	Convolutional Dictionary Learning via Local Processing
Authors	Vardan Papyan, Yaniv Romano, Jeremias Sulam, Michael Elad
Abstract	Convolutional Sparse Coding (CSC) is an increasingly popular model in the signal and image processing communities, tackling some of the limitations of traditional patch-based sparse representations. Although several works have addressed the dictionary learning problem under this model, these relied on an ADMM formulation in the Fourier domain, losing the sense of locality and the relation to the traditional patch-based sparse pursuit. A recent work suggested a novel theoretical analysis of this global model, providing guarantees that rely on a localized sparsity measure. Herein, we extend this local-global relation by showing how one can efficiently solve the convolutional sparse pursuit problem and train the filters involved, while operating locally on image patches. Our approach provides an intuitive algorithm that can leverage standard techniques from the sparse representations field. The proposed method is fast to train, simple to implement, and flexible enough that it can be easily deployed in a variety of applications. We demonstrate the proposed training scheme for image inpainting and image separation, while achieving state-of-the-art results.
Tasks	Dictionary Learning, Image Inpainting
Published	2017-05-09
URL	http://arxiv.org/abs/1705.03239v1
PDF	http://arxiv.org/pdf/1705.03239v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-dictionary-learning-via-local
Repo	https://github.com/qu-arx/arx-inf
Framework	none

Attention-based Extraction of Structured Information from Street View Imagery


Title	Attention-based Extraction of Structured Information from Street View Imagery
Authors	Zbigniew Wojna, Alex Gorban, Dar-Shyang Lee, Kevin Murphy, Qian Yu, Yeqing Li, Julian Ibarz
Abstract	We present a neural network model - based on CNNs, RNNs and a novel attention mechanism - which achieves 84.2% accuracy on the challenging French Street Name Signs (FSNS) dataset, significantly outperforming the previous state of the art (Smith’16), which achieved 72.46%. Furthermore, our new method is much simpler and more general than the previous approach. To demonstrate the generality of our model, we show that it also performs well on an even more challenging dataset derived from Google Street View, in which the goal is to extract business names from store fronts. Finally, we study the speed/accuracy tradeoff that results from using CNN feature extractors of different depths. Surprisingly, we find that deeper is not always better (in terms of accuracy, as well as speed). Our resulting model is simple, accurate and fast, allowing it to be used at scale on a variety of challenging real-world text extraction problems.
Tasks	Optical Character Recognition
Published	2017-04-11
URL	http://arxiv.org/abs/1704.03549v4
PDF	http://arxiv.org/pdf/1704.03549v4.pdf
PWC	https://paperswithcode.com/paper/attention-based-extraction-of-structured
Repo	https://github.com/tensorflow/models/tree/master/research/attention_ocr
Framework	tf

The Uncertainty Bellman Equation and Exploration


Title	The Uncertainty Bellman Equation and Exploration
Authors	Brendan O’Donoghue, Ian Osband, Remi Munos, Volodymyr Mnih
Abstract	We consider the exploration/exploitation problem in reinforcement learning. For exploitation, it is well known that the Bellman equation connects the value at any time-step to the expected value at subsequent time-steps. In this paper we consider a similar uncertainty Bellman equation (UBE), which connects the uncertainty at any time-step to the expected uncertainties at subsequent time-steps, thereby extending the potential exploratory benefit of a policy beyond individual time-steps. We prove that the unique fixed point of the UBE yields an upper bound on the variance of the posterior distribution of the Q-values induced by any policy. This bound can be much tighter than traditional count-based bonuses that compound standard deviation rather than variance. Importantly, and unlike several existing approaches to optimism, this method scales naturally to large systems with complex generalization. Substituting our UBE-exploration strategy for ε-greedy improves DQN performance on 51 out of 57 games in the Atari suite.
Tasks
Published	2017-09-15
URL	http://arxiv.org/abs/1709.05380v4
PDF	http://arxiv.org/pdf/1709.05380v4.pdf
PWC	https://paperswithcode.com/paper/the-uncertainty-bellman-equation-and
Repo	https://github.com/practical-rl-study/schedules
Framework	none
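
The abstract above describes a Bellman-style recursion for uncertainty with a unique fixed point. As a rough illustration, the sketch below iterates a tabular recursion of the assumed form u = ν + γ² · P u, where ν is a local uncertainty per state-action pair and P is the pair-to-pair transition matrix under a fixed policy. The exact form of the equation, the γ² factor, and all names and numbers here are assumptions made for the example and should be checked against the paper.

```python
def solve_uncertainty_recursion(local_uncertainty, transition, gamma, iterations=200):
    """Fixed-point iteration for u = nu + gamma^2 * P u (assumed tabular form).

    local_uncertainty[i] is the local term nu for state-action pair i;
    transition[i][j] is the probability of moving from pair i to pair j
    under a fixed policy. Both are hypothetical inputs for this sketch.
    """
    n = len(local_uncertainty)
    u = [0.0] * n
    for _ in range(iterations):
        u = [
            local_uncertainty[i]
            + gamma ** 2 * sum(transition[i][j] * u[j] for j in range(n))
            for i in range(n)
        ]
    return u


# A single pair that loops onto itself: the fixed point is nu / (1 - gamma^2).
self_loop = solve_uncertainty_recursion([1.0], [[1.0]], gamma=0.5)

# Pair 0 is uncertain locally and leads to pair 1, which is fully known.
chain = solve_uncertainty_recursion([1.0, 0.0], [[0.0, 1.0], [0.0, 1.0]], gamma=0.9)
```

The point of the recursion is that uncertainty at later pairs flows back to earlier ones, so a pair with little local uncertainty can still carry a large value of u if it leads somewhere poorly explored.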

A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning


Title	A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning
Authors	Marco Fraccaro, Simon Kamronn, Ulrich Paquet, Ole Winther
Abstract	This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world. We introduce the Kalman variational auto-encoder, a framework for unsupervised learning of sequential data that disentangles two latent representations: an object’s representation, coming from a recognition model, and a latent state describing its dynamics. As a result, the evolution of the world can be imagined and missing data imputed, both without the need to generate high dimensional frames at each time step. The model is trained end-to-end on videos of a variety of simulated physical systems, and outperforms competing methods in generative and missing data imputation tasks.
Tasks	Imputation
Published	2017-10-16
URL	http://arxiv.org/abs/1710.05741v2
PDF	http://arxiv.org/pdf/1710.05741v2.pdf
PWC	https://paperswithcode.com/paper/a-disentangled-recognition-and-nonlinear
Repo	https://github.com/simonkamronn/kvae
Framework	tf

Tractable Clustering of Data on the Curve Manifold


Title	Tractable Clustering of Data on the Curve Manifold
Authors	Stephen Tierney, Junbin Gao, Yi Guo, Zheng Zhang
Abstract	In machine learning it is common to interpret each data point as a vector in Euclidean space. However, the data may actually be functional, i.e., each data point is a function of some variable such as time and the function is discretely sampled. The naive treatment of functional data as traditional multivariate data can lead to poor performance since the algorithms are ignoring the correlation in the curvature of each function. In this paper we propose a tractable method to cluster functional data or curves by adapting the Euclidean Low-Rank Representation (LRR) to the curve manifold. Experimental evaluation on synthetic and real data reveals that this method massively outperforms prior clustering methods in both speed and accuracy when clustering functional data.
Tasks
Published	2017-04-13
URL	http://arxiv.org/abs/1704.03963v1
PDF	http://arxiv.org/pdf/1704.03963v1.pdf
PWC	https://paperswithcode.com/paper/tractable-clustering-of-data-on-the-curve
Repo	https://github.com/sjtrny/curveLRR
Framework	none

A Unified Joint Matrix Factorization Framework for Data Integration


Title	A Unified Joint Matrix Factorization Framework for Data Integration
Authors	Lihua Zhang, Shihua Zhang
Abstract	Nonnegative matrix factorization (NMF) is a powerful tool in exploratory data analysis for discovering hidden features and part-based patterns in high-dimensional data. NMF and its variants have been successfully applied in diverse fields such as pattern recognition, signal processing, data mining, and bioinformatics. Recently, NMF has been extended to analyze multiple matrices simultaneously. However, a unified framework is still lacking. In this paper, we introduce a sparse multiple relationship data regularized joint matrix factorization (JMF) framework and two adapted prediction models for pattern recognition and data integration. Next, we present four update algorithms to solve this framework. The merits and demerits of these algorithms are systematically explored. Furthermore, extensive computational experiments using both synthetic data and real data demonstrate the effectiveness of the JMF framework and related algorithms on pattern recognition and data mining.
Tasks
Published	2017-07-25
URL	http://arxiv.org/abs/1707.08183v1
PDF	http://arxiv.org/pdf/1707.08183v1.pdf
PWC	https://paperswithcode.com/paper/a-unified-joint-matrix-factorization
Repo	https://github.com/dugzzuli/jmf
Framework	none
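
For readers unfamiliar with the building block the abstract above starts from, here is a minimal sketch of plain single-matrix NMF with the classic multiplicative updates for the squared (Frobenius) error. It is not the JMF framework or any of the paper's four update algorithms; the helper names, the small epsilon guard, and the toy matrix are assumptions for illustration.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf_step(v, w, h, eps=1e-9):
    """One round of multiplicative updates: H first, then W."""
    wt = transpose(w)
    num, den = matmul(wt, v), matmul(matmul(wt, w), h)
    h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(len(h[0]))]
         for i in range(len(h))]
    ht = transpose(h)
    num, den = matmul(v, ht), matmul(w, matmul(h, ht))
    w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(len(w[0]))]
         for i in range(len(w))]
    return w, h


# Hypothetical rank-1 nonnegative matrix and a rank-1 factorization.
v = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
w = [[1.0], [1.0], [1.0]]
h = [[1.0, 1.0]]

error_before = squared_error(v, w, h)
for _ in range(20):
    w, h = nmf_step(v, w, h)
error_after = squared_error(v, w, h)
```

The updates only multiply by nonnegative ratios, so factors that start nonnegative stay nonnegative without any explicit projection.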

Deal or No Deal? End-to-End Learning for Negotiation Dialogues


Title	Deal or No Deal? End-to-End Learning for Negotiation Dialogues
Authors	Mike Lewis, Denis Yarats, Yann N. Dauphin, Devi Parikh, Dhruv Batra
Abstract	Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions. Negotiations require complex communication and reasoning skills, but success is easy to measure, making this an interesting task for AI. We gather a large dataset of human-human negotiations on a multi-issue bargaining task, where agents who cannot observe each other’s reward functions must reach an agreement (or a deal) via natural language dialogue. For the first time, we show it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states. We also introduce dialogue rollouts, in which the model plans ahead by simulating possible complete continuations of the conversation, and find that this technique dramatically improves performance. Our code and dataset are publicly available (https://github.com/facebookresearch/end-to-end-negotiator).
Tasks
Published	2017-06-16
URL	http://arxiv.org/abs/1706.05125v1
PDF	http://arxiv.org/pdf/1706.05125v1.pdf
PWC	https://paperswithcode.com/paper/deal-or-no-deal-end-to-end-learning-for
Repo	https://github.com/facebookresearch/end-to-end-negotiator
Framework	pytorch
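
The dialogue rollouts mentioned in the abstract above can be pictured as a simple planning loop: for each candidate utterance, simulate several complete continuations of the conversation, average the final reward, and say the candidate with the best average. The sketch below shows only that loop. The `simulate` callable, the candidate strings, and the reward values are hypothetical placeholders and are not taken from the linked repository.

```python
import random


def choose_by_rollouts(candidates, simulate, num_rollouts=10, rng=None):
    """Pick the candidate whose simulated continuations score best on average.

    simulate(candidate, rng) is a hypothetical function that plays the
    dialogue to the end after `candidate` and returns the final reward.
    """
    rng = rng or random.Random(0)
    best_candidate, best_score = None, float("-inf")
    for candidate in candidates:
        total = sum(simulate(candidate, rng) for _ in range(num_rollouts))
        score = total / num_rollouts
        if score > best_score:
            best_candidate, best_score = candidate, score
    return best_candidate, best_score


# Made-up reward table standing in for a learned dialogue model.
expected_reward = {
    "i take everything": 1.0,
    "you take the hat, i take the books": 5.0,
    "no deal": 3.0,
}


def toy_simulate(candidate, rng):
    # Deterministic stand-in; a real simulator would sample a continuation.
    return expected_reward[candidate]


choice, score = choose_by_rollouts(list(expected_reward), toy_simulate)
```

With a stochastic simulator, the number of rollouts per candidate trades planning cost against the noise in each average.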

Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations


Title	Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations
Authors	Maziar Raissi, Paris Perdikaris, George Em Karniadakis
Abstract	We introduce physics informed neural networks – neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this two part treatise, we present our developments in the context of solving two main classes of problems: data-driven solution and data-driven discovery of partial differential equations. Depending on the nature and arrangement of the available data, we devise two distinct classes of algorithms, namely continuous time and discrete time models. The resulting neural networks form a new class of data-efficient universal function approximators that naturally encode any underlying physical laws as prior information. In this first part, we demonstrate how these networks can be used to infer solutions to partial differential equations, and obtain physics-informed surrogate models that are fully differentiable with respect to all input coordinates and free parameters.
Tasks
Published	2017-11-28
URL	http://arxiv.org/abs/1711.10561v1
PDF	http://arxiv.org/pdf/1711.10561v1.pdf
PWC	https://paperswithcode.com/paper/physics-informed-deep-learning-part-i-data
Repo	https://github.com/pierremtb/PINNs-TF2.0
Framework	tf
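
The core idea in the abstract above, fitting data while penalising violations of a differential equation, can be illustrated without a neural network. The sketch below scores a candidate function u(t) against the toy ODE du/dt = -u by adding a data-misfit term to a mean squared residual at a few collocation points. Everything here is an assumption for illustration: the ODE, the points, and the use of central finite differences as a stand-in for the automatic differentiation a real physics-informed network would use.

```python
import math


def physics_informed_loss(u, data_points, collocation_points, h=1e-4):
    """Data misfit plus squared residual of du/dt + u = 0 (toy ODE)."""
    data_term = sum((u(t) - y) ** 2 for t, y in data_points) / len(data_points)

    def residual(t):
        # Central difference as a stand-in for automatic differentiation.
        du_dt = (u(t + h) - u(t - h)) / (2.0 * h)
        return du_dt + u(t)

    physics_term = sum(residual(t) ** 2 for t in collocation_points) / len(
        collocation_points
    )
    return data_term + physics_term


# One observed point (the initial condition) and four collocation points.
data_points = [(0.0, 1.0)]
collocation_points = [0.25, 0.5, 0.75, 1.0]

loss_exact = physics_informed_loss(lambda t: math.exp(-t), data_points, collocation_points)
loss_linear = physics_informed_loss(lambda t: 1.0 - t, data_points, collocation_points)
```

Both candidates match the single data point, so only the physics term separates them: the exact solution has a residual near zero, while the straight line does not.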

Deep Image Matting


Title	Deep Image Matting
Authors	Ning Xu, Brian Price, Scott Cohen, Thomas Huang
Abstract	Image matting is a fundamental computer vision problem and has many applications. Previous algorithms have poor performance when an image has similar foreground and background colors or complicated textures. The main reasons are that prior methods 1) only use low-level features and 2) lack high-level context. In this paper, we propose a novel deep learning based algorithm that can tackle both these problems. Our deep model has two parts. The first part is a deep convolutional encoder-decoder network that takes an image and the corresponding trimap as inputs and predicts the alpha matte of the image. The second part is a small convolutional network that refines the alpha matte predictions of the first network to have more accurate alpha values and sharper edges. In addition, we also create a large-scale image matting dataset including 49300 training images and 1000 testing images. We evaluate our algorithm on the image matting benchmark, our testing set, and a wide variety of real images. Experimental results clearly demonstrate the superiority of our algorithm over previous methods.
Tasks	Image Matting
Published	2017-03-10
URL	http://arxiv.org/abs/1703.03872v3
PDF	http://arxiv.org/pdf/1703.03872v3.pdf
PWC	https://paperswithcode.com/paper/deep-image-matting
Repo	https://github.com/brendanvonhofe/telescope-nn
Framework	pytorch
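
As background for the alpha matte mentioned in the abstract above, matting is usually framed through the compositing equation I = αF + (1 − α)B, where F and B are foreground and background colors and α is the per-pixel opacity. The sketch below applies that equation to a single RGB pixel and inverts it for the easy case where F and B are known. The function names and pixel values are illustrative assumptions; the paper's network instead predicts α from the image and trimap.

```python
def composite(foreground, background, alpha):
    """Blend one RGB pixel: I = alpha * F + (1 - alpha) * B."""
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))


def alpha_from_known_colors(pixel, foreground, background):
    """Least-squares alpha for one pixel when F and B are both known."""
    diff_fb = [f - b for f, b in zip(foreground, background)]
    diff_ib = [i - b for i, b in zip(pixel, background)]
    denom = sum(d * d for d in diff_fb)
    if denom == 0.0:
        raise ValueError("foreground and background are identical")
    return sum(x * y for x, y in zip(diff_ib, diff_fb)) / denom


red, blue = (255.0, 0.0, 0.0), (0.0, 0.0, 255.0)
pixel = composite(red, blue, 0.25)
recovered = alpha_from_known_colors(pixel, red, blue)
```

The error case in the second function mirrors the failure mode the abstract names: when foreground and background colors are similar, color alone cannot determine α.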

Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations


Title	Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations
Authors	Maziar Raissi, Paris Perdikaris, George Em Karniadakis
Abstract	We introduce physics informed neural networks – neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this second part of our two-part treatise, we focus on the problem of data-driven discovery of partial differential equations. Depending on whether the available data is scattered in space-time or arranged in fixed temporal snapshots, we introduce two main classes of algorithms, namely continuous time and discrete time models. The effectiveness of our approach is demonstrated using a wide range of benchmark problems in mathematical physics, including conservation laws, incompressible fluid flow, and the propagation of nonlinear shallow-water waves.
Tasks
Published	2017-11-28
URL	http://arxiv.org/abs/1711.10566v1
PDF	http://arxiv.org/pdf/1711.10566v1.pdf
PWC	https://paperswithcode.com/paper/physics-informed-deep-learning-part-ii-data
Repo	https://github.com/pierremtb/PINNs-TF2.0
Framework	tf

InfoVAE: Information Maximizing Variational Autoencoders


Title	InfoVAE: Information Maximizing Variational Autoencoders
Authors	Shengjia Zhao, Jiaming Song, Stefano Ermon
Abstract	A key advance in learning generative models is the use of amortized inference distributions that are jointly trained with the models. We find that existing training objectives for variational autoencoders can lead to inaccurate amortized inference distributions and, in some cases, improving the objective provably degrades the inference quality. In addition, it has been observed that variational autoencoders tend to ignore the latent variables when combined with a decoding distribution that is too flexible. We again identify the cause in existing training criteria and propose a new class of objectives (InfoVAE) that mitigate these problems. We show that our model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution. Through extensive qualitative and quantitative analyses, we demonstrate that our models outperform competing approaches on multiple performance metrics.
Tasks
Published	2017-06-07
URL	http://arxiv.org/abs/1706.02262v3
PDF	http://arxiv.org/pdf/1706.02262v3.pdf
PWC	https://paperswithcode.com/paper/infovae-information-maximizing-variational
Repo	https://github.com/zacheberhart/Maximum-Mean-Discrepancy-Variational-Autoencoder
Framework	tf
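
The repository linked above refers to maximum mean discrepancy (MMD) in its name, although the abstract itself does not say which divergence the objective uses. As a hedged illustration of that quantity only, the sketch below computes a simple biased estimate of squared MMD between two sample sets with a Gaussian kernel. The kernel choice, bandwidth, and sample values are assumptions, and this is not the InfoVAE objective.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""

    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical 2-D samples: one set, and the same set shifted far away.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
shifted = [(x + 5.0, y + 5.0) for x, y in samples]

mmd_same = mmd_squared(samples, samples)
mmd_shifted = mmd_squared(samples, shifted)
```

In a variational autoencoder setting, such a term would typically compare encoded latent codes with samples from the prior; that usage is an assumption here, not something the abstract states.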