Paper Group AWR 184
Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters
Title | Towards a Principled Integration of Multi-Camera Re-Identification and Tracking through Optimal Bayes Filters |
Authors | Lucas Beyer, Stefan Breuers, Vitaly Kurin, Bastian Leibe |
Abstract | With the rise of end-to-end learning through deep learning, person detectors and re-identification (ReID) models have recently become very strong. Multi-camera multi-target (MCMT) tracking has not fully gone through this transformation yet. We intend to take another step in this direction by presenting a theoretically principled way of integrating ReID with tracking formulated as an optimal Bayes filter. This conveniently side-steps the need for data-association and opens up a direct path from full images to the core of the tracker. While the results are still sub-par, we believe that this new, tight integration opens many interesting research opportunities and leads the way towards full end-to-end tracking from raw pixels. |
Tasks | |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.04608v2 |
http://arxiv.org/pdf/1705.04608v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-a-principled-integration-of-multi |
Repo | https://github.com/VisualComputingInstitute/towards-reid-tracking |
Framework | none |
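The abstract above formulates tracking as an optimal Bayes filter. As a rough illustration of that general recursion only (not the authors' implementation), here is a minimal discrete Bayes filter over a toy three-cell world. The grid, the `transition` matrix, and the `likelihood` values are all hypothetical; in the paper's setting the measurement likelihood would presumably come from a ReID model rather than being hand-written.

```python
def predict(belief, transition):
    """Prediction step: push the belief through the motion model.

    transition[i][j] is the (assumed) probability of moving from cell i to cell j.
    """
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(belief, likelihood):
    """Measurement step: weight each cell by its likelihood, then renormalise."""
    unnormalised = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalised)
    if total == 0.0:
        raise ValueError("measurement has zero probability under the current belief")
    return [u / total for u in unnormalised]


# Hypothetical toy data: a uniform prior over three cells.
prior = [1 / 3, 1 / 3, 1 / 3]
transition = [
    [0.8, 0.2, 0.0],
    [0.0, 0.8, 0.2],
    [0.2, 0.0, 0.8],
]
# Hypothetical per-cell measurement likelihood (e.g. a similarity score).
likelihood = [0.1, 0.7, 0.2]

posterior = update(predict(prior, transition), likelihood)
most_likely_cell = max(range(len(posterior)), key=posterior.__getitem__)
```

Because the filter keeps a full belief over locations and reweights it with the measurement likelihood, no explicit assignment of detections to tracks appears anywhere in the loop, which is the property the abstract refers to as side-stepping data association.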
Associative Domain Adaptation
Title | Associative Domain Adaptation |
Authors | Philip Haeusser, Thomas Frerix, Alexander Mordvintsev, Daniel Cremers |
Abstract | We propose associative domain adaptation, a novel technique for end-to-end domain adaptation with neural networks, the task of inferring class labels for an unlabeled target domain based on the statistical properties of a labeled source domain. Our training scheme follows the paradigm that in order to effectively derive class labels for the target domain, a network should produce statistically domain invariant embeddings, while minimizing the classification error on the labeled source domain. We accomplish this by reinforcing associations between source and target data directly in embedding space. Our method can easily be added to any existing classification network with no structural and almost no computational overhead. We demonstrate the effectiveness of our approach on various benchmarks and achieve state-of-the-art results across the board with a generic convolutional neural network architecture not specifically tuned to the respective tasks. Finally, we show that the proposed association loss produces embeddings that are more effective for domain adaptation compared to methods employing maximum mean discrepancy as a similarity measure in embedding space. |
Tasks | Domain Adaptation |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00938v1 |
http://arxiv.org/pdf/1708.00938v1.pdf | |
PWC | https://paperswithcode.com/paper/associative-domain-adaptation |
Repo | https://github.com/stes/torch-associative |
Framework | pytorch |
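The abstract compares the association loss against methods that use maximum mean discrepancy (MMD) as a similarity measure in embedding space. For readers unfamiliar with that baseline, the following is a minimal sketch of a biased squared-MMD estimate with a Gaussian (RBF) kernel. The kernel choice, the `bandwidth` value, and the toy embeddings are assumptions made for illustration; the paper's own association loss is not shown here.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two vectors (bandwidth is an assumed hyperparameter)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of embeddings."""

    def mean_kernel(a, b):
        return sum(rbf_kernel(p, q, bandwidth) for p in a for q in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical 2-D embeddings: one source batch and two candidate target batches.
source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
target_near = [[0.05, 0.0], [0.0, 0.1], [-0.05, -0.1]]
target_far = [[3.0, 3.0], [3.1, 2.9], [2.9, 3.1]]

mmd_near = mmd_squared(source, target_near)
mmd_far = mmd_squared(source, target_far)
```

A small value indicates that the two batches are hard to tell apart under the chosen kernel, which is why MMD is used as a domain-invariance penalty.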
Automated Hate Speech Detection and the Problem of Offensive Language
Title | Automated Hate Speech Detection and the Problem of Offensive Language |
Authors | Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber |
Abstract | A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We use a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither. We train a multi-class classifier to distinguish between these different categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify. |
Tasks | Hate Speech Detection |
Published | 2017-03-11 |
URL | http://arxiv.org/abs/1703.04009v1 |
http://arxiv.org/pdf/1703.04009v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-hate-speech-detection-and-the |
Repo | https://github.com/renuka-fernando/sinhalese_language_racism_detection |
Framework | tf |
Cross-Validation with Confidence
Title | Cross-Validation with Confidence |
Authors | Jing Lei |
Abstract | Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to select overfitting models because they ignore the uncertainty in the testing sample. We develop a new, statistically principled inference tool based on cross-validation that takes into account the uncertainty in the testing sample. This new method outputs a set of highly competitive candidate models containing the best one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for regularizing tuning parameter selection, the method can provide a further trade-off between prediction accuracy and model interpretability. We demonstrate the performance of the proposed method in several simulated and real data examples. |
Tasks | Model Selection |
Published | 2017-03-23 |
URL | http://arxiv.org/abs/1703.07904v2 |
http://arxiv.org/pdf/1703.07904v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-validation-with-confidence |
Repo | https://github.com/tim-coleman/CVC_Caret |
Framework | none |
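For context, the traditional procedure the abstract criticises picks the single candidate with the lowest cross-validated error. The sketch below shows that baseline only, with two hypothetical candidates (a constant predictor and a least-squares line) on made-up data; it does not implement the confidence-set method proposed in the paper.

```python
def fit_mean(xs, ys):
    """Candidate 1: predict the training mean everywhere."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares line in closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def kfold_cv_error(fit, xs, ys, k=5):
    """Mean squared prediction error, each point held out exactly once."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        total += sum((ys[i] - model(xs[i])) ** 2 for i in test_idx)
    return total / n


# Hypothetical data: a linear trend with a small deterministic perturbation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(xs)]

cv_errors = {
    "mean": kfold_cv_error(fit_mean, xs, ys),
    "line": kfold_cv_error(fit_line, xs, ys),
}
selected = min(cv_errors, key=cv_errors.get)
```

The `min` call is the step that discards all information about how uncertain each error estimate is; per the abstract, the proposed method instead returns a set of candidates that contains the best model with guaranteed probability.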
Convolutional Dictionary Learning via Local Processing
Title | Convolutional Dictionary Learning via Local Processing |
Authors | Vardan Papyan, Yaniv Romano, Jeremias Sulam, Michael Elad |
Abstract | Convolutional Sparse Coding (CSC) is an increasingly popular model in the signal and image processing communities, tackling some of the limitations of traditional patch-based sparse representations. Although several works have addressed the dictionary learning problem under this model, these relied on an ADMM formulation in the Fourier domain, losing the sense of locality and the relation to the traditional patch-based sparse pursuit. A recent work suggested a novel theoretical analysis of this global model, providing guarantees that rely on a localized sparsity measure. Herein, we extend this local-global relation by showing how one can efficiently solve the convolutional sparse pursuit problem and train the filters involved, while operating locally on image patches. Our approach provides an intuitive algorithm that can leverage standard techniques from the sparse representations field. The proposed method is fast to train, simple to implement, and flexible enough that it can be easily deployed in a variety of applications. We demonstrate the proposed training scheme for image inpainting and image separation, while achieving state-of-the-art results. |
Tasks | Dictionary Learning, Image Inpainting |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.03239v1 |
http://arxiv.org/pdf/1705.03239v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-dictionary-learning-via-local |
Repo | https://github.com/qu-arx/arx-inf |
Framework | none |
Attention-based Extraction of Structured Information from Street View Imagery
Title | Attention-based Extraction of Structured Information from Street View Imagery |
Authors | Zbigniew Wojna, Alex Gorban, Dar-Shyang Lee, Kevin Murphy, Qian Yu, Yeqing Li, Julian Ibarz |
Abstract | We present a neural network model - based on CNNs, RNNs and a novel attention mechanism - which achieves 84.2% accuracy on the challenging French Street Name Signs (FSNS) dataset, significantly outperforming the previous state of the art (Smith’16), which achieved 72.46%. Furthermore, our new method is much simpler and more general than the previous approach. To demonstrate the generality of our model, we show that it also performs well on an even more challenging dataset derived from Google Street View, in which the goal is to extract business names from store fronts. Finally, we study the speed/accuracy tradeoff that results from using CNN feature extractors of different depths. Surprisingly, we find that deeper is not always better (in terms of accuracy, as well as speed). Our resulting model is simple, accurate and fast, allowing it to be used at scale on a variety of challenging real-world text extraction problems. |
Tasks | Optical Character Recognition |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03549v4 |
http://arxiv.org/pdf/1704.03549v4.pdf | |
PWC | https://paperswithcode.com/paper/attention-based-extraction-of-structured |
Repo | https://github.com/tensorflow/models/tree/master/research/attention_ocr |
Framework | tf |
The Uncertainty Bellman Equation and Exploration
Title | The Uncertainty Bellman Equation and Exploration |
Authors | Brendan O’Donoghue, Ian Osband, Remi Munos, Volodymyr Mnih |
Abstract | We consider the exploration/exploitation problem in reinforcement learning. For exploitation, it is well known that the Bellman equation connects the value at any time-step to the expected value at subsequent time-steps. In this paper we consider a similar uncertainty Bellman equation (UBE), which connects the uncertainty at any time-step to the expected uncertainties at subsequent time-steps, thereby extending the potential exploratory benefit of a policy beyond individual time-steps. We prove that the unique fixed point of the UBE yields an upper bound on the variance of the posterior distribution of the Q-values induced by any policy. This bound can be much tighter than traditional count-based bonuses that compound standard deviation rather than variance. Importantly, and unlike several existing approaches to optimism, this method scales naturally to large systems with complex generalization. Substituting our UBE-exploration strategy for ε-greedy improves DQN performance on 51 out of 57 games in the Atari suite. |
Tasks | |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05380v4 |
http://arxiv.org/pdf/1709.05380v4.pdf | |
PWC | https://paperswithcode.com/paper/the-uncertainty-bellman-equation-and |
Repo | https://github.com/practical-rl-study/schedules |
Framework | none |
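To make the idea of propagating uncertainty across time-steps concrete, here is a small tabular sketch. It assumes a finite-horizon recursion of the form u = ν + P u, where ν is a per-pair local uncertainty and P is the expected transition between state-action pairs under a fixed policy. The function name, the toy numbers, and this exact form are illustrative assumptions, not the paper's algorithm or its DQN variant.

```python
def propagate_uncertainty(local_uncertainty, transition, horizon):
    """Backward recursion: uncertainty now = local term + expected uncertainty later.

    local_uncertainty[i] is the assumed local uncertainty of state-action pair i.
    transition[i][j] is the assumed probability of reaching pair j from pair i
    under the policy being evaluated.
    """
    n = len(local_uncertainty)
    u = [0.0] * n  # no uncertainty is accumulated beyond the horizon
    for _ in range(horizon):
        u = [
            local_uncertainty[i] + sum(transition[i][j] * u[j] for j in range(n))
            for i in range(n)
        ]
    return u


# Hypothetical two-pair system that alternates deterministically.
local_uncertainty = [1.0, 0.0]
transition = [
    [0.0, 1.0],
    [1.0, 0.0],
]

u = propagate_uncertainty(local_uncertainty, transition, horizon=3)
```

In this toy system the second pair has no local uncertainty of its own, yet it ends with a non-zero value because it leads to an uncertain successor. That is the sense in which the exploratory benefit extends beyond individual time-steps.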
A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning
Title | A Disentangled Recognition and Nonlinear Dynamics Model for Unsupervised Learning |
Authors | Marco Fraccaro, Simon Kamronn, Ulrich Paquet, Ole Winther |
Abstract | This paper takes a step towards temporal reasoning in a dynamically changing video, not in the pixel space that constitutes its frames, but in a latent space that describes the non-linear dynamics of the objects in its world. We introduce the Kalman variational auto-encoder, a framework for unsupervised learning of sequential data that disentangles two latent representations: an object’s representation, coming from a recognition model, and a latent state describing its dynamics. As a result, the evolution of the world can be imagined and missing data imputed, both without the need to generate high dimensional frames at each time step. The model is trained end-to-end on videos of a variety of simulated physical systems, and outperforms competing methods in generative and missing data imputation tasks. |
Tasks | Imputation |
Published | 2017-10-16 |
URL | http://arxiv.org/abs/1710.05741v2 |
http://arxiv.org/pdf/1710.05741v2.pdf | |
PWC | https://paperswithcode.com/paper/a-disentangled-recognition-and-nonlinear |
Repo | https://github.com/simonkamronn/kvae |
Framework | tf |
Tractable Clustering of Data on the Curve Manifold
Title | Tractable Clustering of Data on the Curve Manifold |
Authors | Stephen Tierney, Junbin Gao, Yi Guo, Zheng Zhang |
Abstract | In machine learning it is common to interpret each data point as a vector in Euclidean space. However, the data may actually be functional, i.e., each data point is a function of some variable such as time and the function is discretely sampled. The naive treatment of functional data as traditional multivariate data can lead to poor performance since the algorithms ignore the correlation in the curvature of each function. In this paper we propose a tractable method to cluster functional data or curves by adapting the Euclidean Low-Rank Representation (LRR) to the curve manifold. Experimental evaluation on synthetic and real data reveals that this method massively outperforms prior clustering methods in both speed and accuracy when clustering functional data. |
Tasks | |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.03963v1 |
http://arxiv.org/pdf/1704.03963v1.pdf | |
PWC | https://paperswithcode.com/paper/tractable-clustering-of-data-on-the-curve |
Repo | https://github.com/sjtrny/curveLRR |
Framework | none |
A Unified Joint Matrix Factorization Framework for Data Integration
Title | A Unified Joint Matrix Factorization Framework for Data Integration |
Authors | Lihua Zhang, Shihua Zhang |
Abstract | Nonnegative matrix factorization (NMF) is a powerful tool in exploratory data analysis for discovering hidden features and part-based patterns in high-dimensional data. NMF and its variants have been successfully applied in diverse fields such as pattern recognition, signal processing, data mining, and bioinformatics. Recently, NMF has been extended to analyze multiple matrices simultaneously. However, a unified framework is still lacking. In this paper, we introduce a sparse multiple relationship data regularized joint matrix factorization (JMF) framework and two adapted prediction models for pattern recognition and data integration. Next, we present four update algorithms to solve this framework. The merits and demerits of these algorithms are systematically explored. Furthermore, extensive computational experiments using both synthetic data and real data demonstrate the effectiveness of the JMF framework and related algorithms on pattern recognition and data mining. |
Tasks | |
Published | 2017-07-25 |
URL | http://arxiv.org/abs/1707.08183v1 |
http://arxiv.org/pdf/1707.08183v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-joint-matrix-factorization |
Repo | https://github.com/dugzzuli/jmf |
Framework | none |
Deal or No Deal? End-to-End Learning for Negotiation Dialogues
Title | Deal or No Deal? End-to-End Learning for Negotiation Dialogues |
Authors | Mike Lewis, Denis Yarats, Yann N. Dauphin, Devi Parikh, Dhruv Batra |
Abstract | Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions. Negotiations require complex communication and reasoning skills, but success is easy to measure, making this an interesting task for AI. We gather a large dataset of human-human negotiations on a multi-issue bargaining task, where agents who cannot observe each other’s reward functions must reach an agreement (or a deal) via natural language dialogue. For the first time, we show it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states. We also introduce dialogue rollouts, in which the model plans ahead by simulating possible complete continuations of the conversation, and find that this technique dramatically improves performance. Our code and dataset are publicly available (https://github.com/facebookresearch/end-to-end-negotiator). |
Tasks | |
Published | 2017-06-16 |
URL | http://arxiv.org/abs/1706.05125v1 |
http://arxiv.org/pdf/1706.05125v1.pdf | |
PWC | https://paperswithcode.com/paper/deal-or-no-deal-end-to-end-learning-for |
Repo | https://github.com/facebookresearch/end-to-end-negotiator |
Framework | pytorch |
Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations
Title | Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations |
Authors | Maziar Raissi, Paris Perdikaris, George Em Karniadakis |
Abstract | We introduce physics informed neural networks – neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this two part treatise, we present our developments in the context of solving two main classes of problems: data-driven solution and data-driven discovery of partial differential equations. Depending on the nature and arrangement of the available data, we devise two distinct classes of algorithms, namely continuous time and discrete time models. The resulting neural networks form a new class of data-efficient universal function approximators that naturally encode any underlying physical laws as prior information. In this first part, we demonstrate how these networks can be used to infer solutions to partial differential equations, and obtain physics-informed surrogate models that are fully differentiable with respect to all input coordinates and free parameters. |
Tasks | |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10561v1 |
http://arxiv.org/pdf/1711.10561v1.pdf | |
PWC | https://paperswithcode.com/paper/physics-informed-deep-learning-part-i-data |
Repo | https://github.com/pierremtb/PINNs-TF2.0 |
Framework | tf |
Deep Image Matting
Title | Deep Image Matting |
Authors | Ning Xu, Brian Price, Scott Cohen, Thomas Huang |
Abstract | Image matting is a fundamental computer vision problem and has many applications. Previous algorithms have poor performance when an image has similar foreground and background colors or complicated textures. The main reasons are that prior methods 1) only use low-level features and 2) lack high-level context. In this paper, we propose a novel deep learning based algorithm that can tackle both these problems. Our deep model has two parts. The first part is a deep convolutional encoder-decoder network that takes an image and the corresponding trimap as inputs and predicts the alpha matte of the image. The second part is a small convolutional network that refines the alpha matte predictions of the first network to have more accurate alpha values and sharper edges. In addition, we also create a large-scale image matting dataset including 49300 training images and 1000 testing images. We evaluate our algorithm on the image matting benchmark, our testing set, and a wide variety of real images. Experimental results clearly demonstrate the superiority of our algorithm over previous methods. |
Tasks | Image Matting |
Published | 2017-03-10 |
URL | http://arxiv.org/abs/1703.03872v3 |
http://arxiv.org/pdf/1703.03872v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-image-matting |
Repo | https://github.com/brendanvonhofe/telescope-nn |
Framework | pytorch |
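The alpha matte mentioned in the abstract is the per-pixel blending weight in the standard compositing model, in which an observed pixel is alpha times the foreground colour plus (1 - alpha) times the background colour. The sketch below applies that model to a single RGB pixel. The function name and colour values are hypothetical, and the example shows only how a predicted matte would be used, not how the network estimates it.

```python
def composite(alpha, foreground, background):
    """Blend one pixel: alpha * foreground + (1 - alpha) * background, per channel."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(foreground, background))


# Hypothetical colours in [0, 1]: a red foreground over a blue background.
red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)

opaque_pixel = composite(1.0, red, blue)       # pure foreground
transparent_pixel = composite(0.0, red, blue)  # pure background
soft_edge_pixel = composite(0.5, red, blue)    # e.g. a hair strand or motion blur
```

Matting is the inverse problem: given only the blended pixel, and a trimap marking regions as definite foreground, definite background, or unknown, recover alpha. The problem is under-determined, which helps explain why cases with similar foreground and background colours are difficult.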
Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations
Title | Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations |
Authors | Maziar Raissi, Paris Perdikaris, George Em Karniadakis |
Abstract | We introduce physics informed neural networks – neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this second part of our two-part treatise, we focus on the problem of data-driven discovery of partial differential equations. Depending on whether the available data is scattered in space-time or arranged in fixed temporal snapshots, we introduce two main classes of algorithms, namely continuous time and discrete time models. The effectiveness of our approach is demonstrated using a wide range of benchmark problems in mathematical physics, including conservation laws, incompressible fluid flow, and the propagation of nonlinear shallow-water waves. |
Tasks | |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10566v1 |
http://arxiv.org/pdf/1711.10566v1.pdf | |
PWC | https://paperswithcode.com/paper/physics-informed-deep-learning-part-ii-data |
Repo | https://github.com/pierremtb/PINNs-TF2.0 |
Framework | tf |
InfoVAE: Information Maximizing Variational Autoencoders
Title | InfoVAE: Information Maximizing Variational Autoencoders |
Authors | Shengjia Zhao, Jiaming Song, Stefano Ermon |
Abstract | A key advance in learning generative models is the use of amortized inference distributions that are jointly trained with the models. We find that existing training objectives for variational autoencoders can lead to inaccurate amortized inference distributions and, in some cases, improving the objective provably degrades the inference quality. In addition, it has been observed that variational autoencoders tend to ignore the latent variables when combined with a decoding distribution that is too flexible. We again identify the cause in existing training criteria and propose a new class of objectives (InfoVAE) that mitigate these problems. We show that our model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution. Through extensive qualitative and quantitative analyses, we demonstrate that our models outperform competing approaches on multiple performance metrics. |
Tasks | |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.02262v3 |
http://arxiv.org/pdf/1706.02262v3.pdf | |
PWC | https://paperswithcode.com/paper/infovae-information-maximizing-variational |
Repo | https://github.com/zacheberhart/Maximum-Mean-Discrepancy-Variational-Autoencoder |
Framework | tf |