January 25, 2020

3551 words 17 mins read

Paper Group ANR 1660

Pedestrian Tracking by Probabilistic Data Association and Correspondence Embeddings. Thompson Sampling with Approximate Inference. Deep Adaptive Input Normalization for Time Series Forecasting. Selection via Proxy: Efficient Data Selection for Deep Learning. StRE: Self Attentive Edit Quality Prediction in Wikipedia. Techniques for Inferring Context …

Pedestrian Tracking by Probabilistic Data Association and Correspondence Embeddings

Title Pedestrian Tracking by Probabilistic Data Association and Correspondence Embeddings
Authors Borna Bićanić, Marin Oršić, Ivan Marković, Siniša Šegvić, Ivan Petrović
Abstract This paper studies the interplay between kinematics (position and velocity) and appearance cues for establishing correspondences in multi-target pedestrian tracking. We investigate tracking-by-detection approaches based on a deep learning detector, joint integrated probabilistic data association (JIPDA), and appearance-based tracking of deep correspondence embeddings. We first addressed the fixed-camera setup by fine-tuning a convolutional detector for accurate pedestrian detection and combining it with kinematic-only JIPDA. The resulting submission ranked first on the 3DMOT2015 benchmark. However, in sequences with a moving camera and unknown ego-motion, we achieved the best results by replacing kinematic cues with global nearest neighbor tracking of deep correspondence embeddings. We trained the embeddings by fine-tuning features from the second block of ResNet-18 using an angular loss extended by a margin term. We note that integrating deep correspondence embeddings directly in JIPDA did not bring significant improvement. It appears that the geometry of deep correspondence embeddings for soft data association needs further investigation in order to obtain the best of both worlds.
Tasks Pedestrian Detection
Published 2019-07-16
URL https://arxiv.org/abs/1907.07045v1
PDF https://arxiv.org/pdf/1907.07045v1.pdf
PWC https://paperswithcode.com/paper/pedestrian-tracking-by-probabilistic-data
Repo
Framework
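
A minimal sketch (not the authors' code) of the global-nearest-neighbor association step described in the abstract above, assuming tracks and detections are already represented by L2-normalized correspondence embeddings; the cosine-distance cost and the gating threshold are illustrative choices.

```python
# Sketch: global nearest neighbour association of detections to tracks
# using cosine distance between correspondence embeddings.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_embs, det_embs, max_cost=0.3):
    """track_embs: (T, D), det_embs: (N, D), both L2-normalized."""
    cost = 1.0 - track_embs @ det_embs.T          # cosine distance, shape (T, N)
    rows, cols = linear_sum_assignment(cost)      # Hungarian algorithm
    matches = [(t, d) for t, d in zip(rows, cols) if cost[t, d] <= max_cost]
    unmatched_dets = set(range(det_embs.shape[0])) - {d for _, d in matches}
    return matches, sorted(unmatched_dets)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tracks = rng.normal(size=(3, 128))
    tracks /= np.linalg.norm(tracks, axis=1, keepdims=True)
    dets = np.vstack([tracks[1], tracks[0], rng.normal(size=(1, 128))])
    dets /= np.linalg.norm(dets, axis=1, keepdims=True)
    print(associate(tracks, dets))
```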

Thompson Sampling with Approximate Inference

Title Thompson Sampling with Approximate Inference
Authors My Phan, Yasin Abbasi-Yadkori, Justin Domke
Abstract We study the effects of approximate inference on the performance of Thompson sampling in the $k$-armed bandit problems. Thompson sampling is a successful algorithm for online decision-making but requires posterior inference, which often must be approximated in practice. We show that even small constant inference error (in $\alpha$-divergence) can lead to poor performance (linear regret) due to under-exploration (for $\alpha<1$) or over-exploration (for $\alpha>0$) by the approximation. While for $\alpha > 0$ this is unavoidable, for $\alpha \leq 0$ the regret can be improved by adding a small amount of forced exploration even when the inference error is a large constant.
Tasks Decision Making
Published 2019-08-14
URL https://arxiv.org/abs/1908.04970v2
PDF https://arxiv.org/pdf/1908.04970v2.pdf
PWC https://paperswithcode.com/paper/thompson-sampling-and-approximate-inference
Repo
Framework
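
A toy illustration of the phenomenon discussed above: Thompson sampling on a Bernoulli bandit where the posterior is deliberately over-concentrated (a stand-in for an under-exploring approximate posterior), with a small amount of forced exploration mixed in. The sharpening factor and exploration rate are illustrative and this is not the paper's construction.

```python
# Sketch: Thompson sampling on a Bernoulli bandit with a deliberately
# over-concentrated (approximate) posterior, plus forced exploration.
import numpy as np

def run(true_means, horizon=5000, concentrate=5.0, forced_eps=0.01, seed=0):
    rng = np.random.default_rng(seed)
    k = len(true_means)
    alpha = np.ones(k)   # Beta posterior parameters per arm
    beta = np.ones(k)
    regret = 0.0
    best = max(true_means)
    for _ in range(horizon):
        if rng.random() < forced_eps:            # forced exploration
            arm = int(rng.integers(k))
        else:
            # "approximate" posterior: sharpen the Beta by a constant factor,
            # mimicking an under-exploring approximation
            samples = rng.beta(alpha * concentrate, beta * concentrate)
            arm = int(np.argmax(samples))
        reward = float(rng.random() < true_means[arm])
        alpha[arm] += reward
        beta[arm] += 1.0 - reward
        regret += best - true_means[arm]
    return regret

if __name__ == "__main__":
    print("cumulative regret:", run([0.5, 0.52, 0.6]))
```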

Deep Adaptive Input Normalization for Time Series Forecasting

Title Deep Adaptive Input Normalization for Time Series Forecasting
Authors Nikolaos Passalis, Anastasios Tefas, Juho Kanniainen, Moncef Gabbouj, Alexandros Iosifidis
Abstract Deep Learning (DL) models can be used to tackle time series analysis tasks with great success. However, the performance of DL models can degenerate rapidly if the data are not appropriately normalized. This issue is even more apparent when DL is used for financial time series forecasting tasks, where the non-stationary and multimodal nature of the data poses significant challenges and severely affects the performance of DL models. In this work, a simple, yet effective, neural layer, that is capable of adaptively normalizing the input time series, while taking into account the distribution of the data, is proposed. The proposed layer is trained in an end-to-end fashion using back-propagation and leads to significant performance improvements compared to other evaluated normalization schemes. The proposed method differs from traditional normalization methods since it learns how to perform normalization for a given task instead of using a fixed normalization scheme. At the same time, it can be directly applied to any new time series without requiring re-training. The effectiveness of the proposed method is demonstrated using a large-scale limit order book dataset, as well as a load forecasting dataset.
Tasks Load Forecasting, Time Series, Time Series Analysis, Time Series Forecasting
Published 2019-02-21
URL https://arxiv.org/abs/1902.07892v2
PDF https://arxiv.org/pdf/1902.07892v2.pdf
PWC https://paperswithcode.com/paper/deep-adaptive-input-normalization-for-price
Repo
Framework
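
A simplified sketch of the idea behind the layer described above: the per-sample shift and scale are predicted from summary statistics of the input window rather than being fixed. Written in PyTorch; the layer sizes and the exact form of the scale gate are illustrative, and this is not the authors' implementation.

```python
# Sketch of an adaptive input-normalization layer in the spirit of DAIN:
# per-sample shift and scale are *learned functions* of summary statistics.
import torch
import torch.nn as nn

class AdaptiveInputNorm(nn.Module):
    def __init__(self, n_features):
        super().__init__()
        # linear maps from per-feature summary statistics to shift / scale
        self.shift = nn.Linear(n_features, n_features, bias=False)
        self.scale = nn.Linear(n_features, n_features, bias=False)

    def forward(self, x):                          # x: (batch, time, features)
        mean = x.mean(dim=1)                       # (batch, features)
        a = self.shift(mean)                       # adaptive shift
        x = x - a.unsqueeze(1)
        std = x.std(dim=1) + 1e-8                  # (batch, features)
        b = torch.sigmoid(self.scale(std)) + 1e-3  # adaptive, positive scale
        return x / b.unsqueeze(1)

x = torch.randn(4, 50, 10) * 100 + 7   # badly scaled toy series
print(AdaptiveInputNorm(10)(x).shape)  # torch.Size([4, 50, 10])
```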

Selection via Proxy: Efficient Data Selection for Deep Learning

Title Selection via Proxy: Efficient Data Selection for Deep Learning
Authors Cody Coleman, Christopher Yeh, Stephen Mussmann, Baharan Mirzasoleiman, Peter Bailis, Percy Liang, Jure Leskovec, Matei Zaharia
Abstract Data selection methods, such as active learning and core-set selection, are useful tools for machine learning on large datasets. However, they can be prohibitively expensive to apply in deep learning because they depend on feature representations that need to be learned. In this work, we show that we can greatly improve the computational efficiency by using a small proxy model to perform data selection (e.g., selecting data points to label for active learning). By removing hidden layers from the target model, using smaller architectures, and training for fewer epochs, we create proxies that are an order of magnitude faster to train. Although these small proxy models have higher error rates, we find that they empirically provide useful signals for data selection. We evaluate this “selection via proxy” (SVP) approach on several data selection tasks across five datasets: CIFAR10, CIFAR100, ImageNet, Amazon Review Polarity, and Amazon Review Full. For active learning, applying SVP can give an order of magnitude improvement in data selection runtime (i.e., the time it takes to repeatedly train and select points) without significantly increasing the final error (often within 0.1%). For core-set selection on CIFAR10, proxies that are over 10x faster to train than their larger, more accurate targets can remove up to 50% of the data without harming the final accuracy of the target, leading to a 1.6x end-to-end training time improvement.
Tasks Active Learning
Published 2019-06-26
URL https://arxiv.org/abs/1906.11829v3
PDF https://arxiv.org/pdf/1906.11829v3.pdf
PWC https://paperswithcode.com/paper/selection-via-proxy-efficient-data-selection
Repo
Framework
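
A minimal sketch of the "selection via proxy" idea from the abstract above: rank unlabeled points by the proxy model's uncertainty and hand the top-k to labeling (or to the large target model). The proxy here is a small logistic regression on synthetic data, which is only an illustrative stand-in for the cheaper-to-train proxy networks used in the paper.

```python
# Sketch: use a cheap proxy model's predictive entropy to select data points.
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_via_proxy(X_labeled, y_labeled, X_pool, k=100):
    proxy = LogisticRegression(max_iter=200).fit(X_labeled, y_labeled)
    probs = proxy.predict_proba(X_pool)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)   # uncertainty score
    return np.argsort(-entropy)[:k]                          # most uncertain first

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Xl, yl = rng.normal(size=(200, 20)), rng.integers(0, 2, size=200)
    pool = rng.normal(size=(1000, 20))
    print(select_via_proxy(Xl, yl, pool, k=5))
```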

StRE: Self Attentive Edit Quality Prediction in Wikipedia

Title StRE: Self Attentive Edit Quality Prediction in Wikipedia
Authors Soumya Sarkar, Bhanu Prakash Reddy, Sandipan Sikdar, Animesh Mukherjee
Abstract Wikipedia can easily be justified as a behemoth, considering the sheer volume of content that is added or removed every minute to its several projects. This creates an immense scope in the field of natural language processing for developing automated tools for content moderation and review. In this paper we propose Self Attentive Revision Encoder (StRE) which leverages orthographic similarity of lexical units toward predicting the quality of new edits. In contrast to existing propositions which primarily employ features like page reputation, editor activity or rule based heuristics, we utilize the textual content of the edits which, we believe, contains superior signatures of their quality. More specifically, we deploy deep encoders to generate representations of the edits from their text content, which we then leverage to infer quality. We further contribute a novel dataset containing 21M revisions across 32K Wikipedia pages and demonstrate that StRE outperforms existing methods by a significant margin of at least 17% and at most 103%. Our pretrained model achieves this result after retraining on a set as small as 20% of the edits in a wikipage. This, to the best of our knowledge, is also the first attempt to employ deep language models in the enormous domain of automated content moderation and review in Wikipedia.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04678v1
PDF https://arxiv.org/pdf/1906.04678v1.pdf
PWC https://paperswithcode.com/paper/stre-self-attentive-edit-quality-prediction
Repo
Framework
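
A rough sketch of a self-attentive edit encoder of the kind described above: embed the tokens of an edit, apply self-attention, pool, and emit a quality score. Vocabulary size, dimensions, and the sigmoid quality head are illustrative assumptions, not the StRE implementation.

```python
# Sketch: self-attentive encoder that scores the quality of an edit's text.
import torch
import torch.nn as nn

class EditQualityScorer(nn.Module):
    def __init__(self, vocab_size=5000, dim=64, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.head = nn.Linear(dim, 1)

    def forward(self, token_ids):                       # (batch, seq_len)
        h = self.embed(token_ids)
        h, _ = self.attn(h, h, h)                       # self-attention over the edit
        return torch.sigmoid(self.head(h.mean(dim=1)))  # quality in (0, 1)

tokens = torch.randint(0, 5000, (2, 30))
print(EditQualityScorer()(tokens).shape)   # torch.Size([2, 1])
```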

Techniques for Inferring Context-Free Lindenmayer Systems With Genetic Algorithm

Title Techniques for Inferring Context-Free Lindenmayer Systems With Genetic Algorithm
Authors Jason Bernard, Ian McQuillan
Abstract Lindenmayer systems (L-systems) are a formal grammar system, where the most notable feature is a set of rewriting rules that are used to replace every symbol in a string in parallel; by repeating this process, a sequence of strings is produced. Some symbols in the strings may be interpreted as instructions for simulation software. Thus, the sequence can be used to model the steps of a process. Currently, creating an L-system for a specific process is done by hand by experts through much effort. The inductive inference problem attempts to infer an L-system from such a sequence of strings generated by an unknown system; this can be thought of as an intermediate step to inferring from a sequence of images. This paper evaluates and analyzes different genetic algorithm encoding schemes and mathematical properties for the L-system inductive inference problem. A new tool, the Plant Model Inference Tool for Context-Free L-systems (PMIT-D0L) is implemented based on these techniques. PMIT-D0L has been successfully evaluated on 28 known L-systems, with alphabets up to 31 symbols and a total sum of 281 symbols across the rewriting rules. PMIT-D0L can infer even the largest of these L-systems in less than a few seconds.
Tasks
Published 2019-05-15
URL https://arxiv.org/abs/1906.08860v1
PDF https://arxiv.org/pdf/1906.08860v1.pdf
PWC https://paperswithcode.com/paper/techniques-for-inferring-context-free
Repo
Framework
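
A small sketch of the two core ingredients of GA-based D0L-system inference described above: parallel rewriting under a candidate rule set, and a fitness function that compares the derived strings against an observed sequence. The GA loop, encoding schemes, and the toy target system are omitted or invented for illustration.

```python
# Sketch: parallel D0L rewriting and a string-matching fitness for GA inference.
def derive(axiom, rules, steps):
    """Apply context-free rewriting rules to every symbol in parallel."""
    strings = [axiom]
    for _ in range(steps):
        strings.append("".join(rules.get(c, c) for c in strings[-1]))
    return strings

def fitness(candidate_rules, observed):
    """Count positions at which the derived sequence matches the observation."""
    derived = derive(observed[0], candidate_rules, len(observed) - 1)
    return sum(a == b for d, o in zip(derived, observed) for a, b in zip(d, o))

observed = derive("A", {"A": "AB", "B": "A"}, 5)       # toy target (Fibonacci words)
print(fitness({"A": "AB", "B": "A"}, observed))        # perfect candidate
print(fitness({"A": "AA", "B": "B"}, observed))        # worse candidate
```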

A Spatial-Temporal Decomposition Based Deep Neural Network for Time Series Forecasting

Title A Spatial-Temporal Decomposition Based Deep Neural Network for Time Series Forecasting
Authors Reza Asadi, Amelia Regan
Abstract Spatial time series forecasting problems arise in a broad range of applications, such as environmental and transportation problems. These problems are challenging because of the existence of specific spatial, short-term and long-term patterns, and the curse of dimensionality. In this paper, we propose a deep neural network framework for large-scale spatial time series forecasting problems. We explicitly designed the neural network architecture for capturing various types of patterns. In preprocessing, a time series decomposition method is applied to separately feed short-term, long-term and spatial patterns into different components of a neural network. A fuzzy clustering method finds clusters of neighboring time series based on the similarity of their residuals, as these can be meaningful short-term patterns for spatial time series. In the neural network architecture, each kernel of a multi-kernel convolution layer is applied to a cluster of time series to extract short-term features in neighboring areas. The output of the convolution layer is concatenated with the trends and followed by a convolution-LSTM layer to capture long-term patterns in larger regional areas. To make a robust prediction when faced with missing data, an unsupervised pretrained denoising autoencoder reconstructs the output of the model in a fine-tuning step. The experimental results illustrate that the model outperforms baseline and state-of-the-art models on a traffic flow prediction dataset.
Tasks Denoising, Time Series, Time Series Forecasting
Published 2019-02-02
URL http://arxiv.org/abs/1902.00636v1
PDF http://arxiv.org/pdf/1902.00636v1.pdf
PWC https://paperswithcode.com/paper/a-spatial-temporal-decomposition-based-deep
Repo
Framework
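
A minimal sketch of the decomposition step described in the abstract above: split each series into a long-term trend (here a simple moving average) and a short-term residual, then use residual similarity as a crude basis for clustering neighboring series. The window size and the correlation-based similarity are illustrative assumptions, not the paper's exact preprocessing.

```python
# Sketch: moving-average trend/residual decomposition for spatial time series.
import numpy as np

def decompose(series, window=24):
    """series: (n_locations, T). Returns (trend, residual) of the same shape."""
    kernel = np.ones(window) / window
    trend = np.vstack([np.convolve(s, kernel, mode="same") for s in series])
    return trend, series - trend

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(500)
    series = np.sin(2 * np.pi * t / 100) + 0.1 * rng.normal(size=(5, 500))
    trend, resid = decompose(series)
    # residual correlation as a (crude) similarity for clustering neighbours
    print(np.corrcoef(resid).round(2))
```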

Deep Rigid Instance Scene Flow

Title Deep Rigid Instance Scene Flow
Authors Wei-Chiu Ma, Shenlong Wang, Rui Hu, Yuwen Xiong, Raquel Urtasun
Abstract In this paper we tackle the problem of scene flow estimation in the context of self-driving. We leverage deep learning techniques as well as strong priors, since in our application domain the motion of the scene can be composed of the motion of the robot and the 3D motion of the actors in the scene. We formulate the problem as energy minimization in a deep structured model, which can be solved efficiently on the GPU by unrolling a Gauss-Newton solver. Our experiments on the challenging KITTI scene flow dataset show that we outperform the state-of-the-art by a very large margin, while being 800 times faster.
Tasks Scene Flow Estimation
Published 2019-04-18
URL http://arxiv.org/abs/1904.08913v1
PDF http://arxiv.org/pdf/1904.08913v1.pdf
PWC https://paperswithcode.com/paper/deep-rigid-instance-scene-flow
Repo
Framework
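
For context, a generic sketch of one unrolled Gauss-Newton step, the kind of inner solver the abstract above refers to, shown here on a toy linear least-squares residual rather than on the actual scene-flow energy.

```python
# Sketch: a single Gauss-Newton update for a least-squares energy.
import numpy as np

def gauss_newton_step(residual_fn, jacobian_fn, x, damping=1e-6):
    r = residual_fn(x)                       # (m,)
    J = jacobian_fn(x)                       # (m, n)
    H = J.T @ J + damping * np.eye(x.size)   # Gauss-Newton approximate Hessian
    return x - np.linalg.solve(H, J.T @ r)

# toy problem: fit x to residuals r_i = a_i . x - b_i
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([5.0, 11.0, 17.0])              # exact solution x = [1, 2]
x = np.zeros(2)
for _ in range(5):                           # "unrolled" iterations
    x = gauss_newton_step(lambda v: A @ v - b, lambda v: A, x)
print(x.round(3))                            # ~[1. 2.]
```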

Smart IoT Cameras for Crowd Analysis based on augmentation for automatic pedestrian detection, simulation and annotation

Title Smart IoT Cameras for Crowd Analysis based on augmentation for automatic pedestrian detection, simulation and annotation
Authors Antoine Rimboux, Rob Dupre, Thomas Lagkas, Panagiotis Sarigiannidis, Paolo Remagnino, Vasileios Argyriou
Abstract Smart video sensors for applications related to surveillance and security are IoT-based as they use the Internet for various purposes. Such applications include crowd behaviour monitoring and advanced decision support systems operating and transmitting information over the Internet. The analysis of crowd and pedestrian behaviour is an important task for smart IoT cameras and, in particular, video processing. In order to provide related behavioural models, simulation and tracking approaches have been considered in the literature. In both cases ground truth is essential to train deep models and provide a meaningful quantitative evaluation. We propose a framework for crowd simulation and automatic data generation and annotation that supports multiple cameras and multiple targets. The proposed approach is based on synthetically generated human agents, augmented frames and compositing techniques combined with path finding and planning methods. A number of popular crowd and pedestrian data sets were used to validate the model, and scenarios related to annotation and simulation were considered.
Tasks Pedestrian Detection
Published 2019-06-06
URL https://arxiv.org/abs/1906.03994v1
PDF https://arxiv.org/pdf/1906.03994v1.pdf
PWC https://paperswithcode.com/paper/smart-iot-cameras-for-crowd-analysis-based-on
Repo
Framework
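
A bare-bones sketch of the augmentation idea mentioned above: composite a synthetic agent sprite (with an alpha mask) onto a background frame and emit its bounding box as a free annotation. The sprite, frame, and placement below are synthetic placeholders, not assets from the paper's framework.

```python
# Sketch: alpha-composite a synthetic agent onto a frame and record its bbox.
import numpy as np

def composite(frame, sprite_rgba, top_left):
    """frame: (H, W, 3) uint8; sprite_rgba: (h, w, 4) uint8; returns (frame, bbox)."""
    y, x = top_left
    h, w = sprite_rgba.shape[:2]
    alpha = sprite_rgba[..., 3:4].astype(np.float32) / 255.0
    region = frame[y:y + h, x:x + w].astype(np.float32)
    blended = alpha * sprite_rgba[..., :3] + (1 - alpha) * region
    frame[y:y + h, x:x + w] = blended.astype(np.uint8)
    bbox = (x, y, x + w, y + h)              # automatic ground-truth annotation
    return frame, bbox

frame = np.zeros((240, 320, 3), dtype=np.uint8)
sprite = np.full((60, 20, 4), 255, dtype=np.uint8)   # opaque white "pedestrian"
_, bbox = composite(frame, sprite, top_left=(100, 150))
print(bbox)   # (150, 100, 170, 160)
```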

Nonlinear Multiview Analysis: Identifiability and Neural Network-assisted Implementation

Title Nonlinear Multiview Analysis: Identifiability and Neural Network-assisted Implementation
Authors Qi Lyu, Xiao Fu
Abstract Multiview analysis aims at extracting shared latent components from data samples that are acquired in different domains, e.g., image, text, and audio. Classic multiview analysis, e.g., canonical correlation analysis (CCA), tackles this problem via matching the linearly transformed views in a certain latent domain. More recently, powerful nonlinear learning tools such as kernel methods and neural networks are utilized for enhancing the classic CCA. However, unlike linear CCA whose theoretical aspects are clearly understood, nonlinear CCA approaches are largely intuition-driven. In particular, it is unclear under what conditions the shared latent components across the views can be identified—while identifiability plays an essential role in many applications. In this work, we revisit nonlinear multiview analysis and address both the theoretical and computational aspects. Our work leverages a useful nonlinear model, namely, the post-nonlinear model, from the nonlinear mixture separation literature. Combining with multiview data, we take a nonlinear multiview mixture learning viewpoint, which is a natural extension of the classic generative models for linear CCA. From there, we derive a learning criterion. We show that minimizing this criterion leads to identification of the latent shared components up to certain ambiguities, under reasonable conditions. Our derivation and formulation also offer new insights and interpretations to existing deep neural network-based CCA formulations. On the computation side, we propose an effective algorithm with simple and scalable update rules. A series of simulations and real-data experiments corroborate our theoretical analysis.
Tasks
Published 2019-09-19
URL https://arxiv.org/abs/1909.09177v2
PDF https://arxiv.org/pdf/1909.09177v2.pdf
PWC https://paperswithcode.com/paper/neural-network-assisted-nonlinear-multiview
Repo
Framework
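
For contrast with the nonlinear, identifiability-focused method described above, here is a minimal numpy sketch of the classic linear-CCA baseline (matching linearly transformed views). The regularization constant and the synthetic shared-latent example are illustrative.

```python
# Sketch: classic linear CCA via whitening + SVD of the cross-covariance.
import numpy as np

def linear_cca(X, Y, reg=1e-6):
    """X: (n, dx), Y: (n, dy). Returns the canonical correlations."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = X.shape[0]
    Sxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / n

    def inv_sqrt(S):
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)     # canonical correlations

rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 2))                    # shared latent components
X = z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(1000, 5))
Y = z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(1000, 4))
print(linear_cca(X, Y).round(2))                  # two correlations close to 1
```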

Rademacher complexity and spin glasses: A link between the replica and statistical theories of learning

Title Rademacher complexity and spin glasses: A link between the replica and statistical theories of learning
Authors Alia Abbara, Benjamin Aubin, Florent Krzakala, Lenka Zdeborová
Abstract Statistical learning theory provides bounds of the generalization gap, using in particular the Vapnik-Chervonenkis dimension and the Rademacher complexity. An alternative approach, mainly studied in the statistical physics literature, is the study of generalization in simple synthetic-data models. Here we discuss the connections between these approaches and focus on the link between the Rademacher complexity in statistical learning and the theories of generalization for typical-case synthetic models from statistical physics, involving quantities known as Gardner capacity and ground state energy. We show that in these models the Rademacher complexity is closely related to the ground state energy computed by replica theories. Using this connection, one may reinterpret many results of the literature as rigorous Rademacher bounds in a variety of models in the high-dimensional statistics limit. Somewhat surprisingly, we also show that statistical learning theory provides predictions for the behavior of the ground-state energies in some full replica symmetry breaking models.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02729v1
PDF https://arxiv.org/pdf/1912.02729v1.pdf
PWC https://paperswithcode.com/paper/rademacher-complexity-and-spin-glasses-a-link
Repo
Framework
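
For reference, the empirical Rademacher complexity discussed in the abstract above is the standard textbook quantity below; the notation is generic rather than taken from the paper.

```latex
% Empirical Rademacher complexity of a function class \mathcal{F} on a sample
% S = (x_1, \dots, x_n), with i.i.d. Rademacher signs \sigma_i \in \{-1, +1\}:
\mathcal{R}_S(\mathcal{F}) \;=\; \mathbb{E}_{\sigma}\left[ \sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i \, f(x_i) \right]
```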

Deferred Neural Rendering: Image Synthesis using Neural Textures

Title Deferred Neural Rendering: Image Synthesis using Neural Textures
Authors Justus Thies, Michael Zollhöfer, Matthias Nießner
Abstract The modern computer graphics pipeline can synthesize images at remarkable visual quality; however, it requires well-defined, high-quality 3D content as input. In this work, we explore the use of imperfect 3D content, for instance, obtained from photo-metric reconstructions with noisy and incomplete surface geometry, while still aiming to produce photo-realistic (re-)renderings. To address this challenging problem, we introduce Deferred Neural Rendering, a new paradigm for image synthesis that combines the traditional graphics pipeline with learnable components. Specifically, we propose Neural Textures, which are learned feature maps that are trained as part of the scene capture process. Similar to traditional textures, neural textures are stored as maps on top of 3D mesh proxies; however, the high-dimensional feature maps contain significantly more information, which can be interpreted by our new deferred neural rendering pipeline. Both neural textures and deferred neural renderer are trained end-to-end, enabling us to synthesize photo-realistic images even when the original 3D content was imperfect. In contrast to traditional, black-box 2D generative neural networks, our 3D representation gives us explicit control over the generated output, and allows for a wide range of application domains. For instance, we can synthesize temporally-consistent video re-renderings of recorded 3D scenes as our representation is inherently embedded in 3D space. This way, neural textures can be utilized to coherently re-render or manipulate existing video content in both static and dynamic environments at real-time rates. We show the effectiveness of our approach in several experiments on novel view synthesis, scene editing, and facial reenactment, and compare to state-of-the-art approaches that leverage the standard graphics pipeline as well as conventional generative neural networks.
Tasks Image Generation, Novel View Synthesis
Published 2019-04-28
URL http://arxiv.org/abs/1904.12356v1
PDF http://arxiv.org/pdf/1904.12356v1.pdf
PWC https://paperswithcode.com/paper/deferred-neural-rendering-image-synthesis
Repo
Framework
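
A minimal sketch of the core neural-texture operation described above: sample a learnable feature map at the UV coordinates rasterized from the mesh proxy, yielding per-pixel features for a deferred neural renderer. The texture size, channel count, and random UV map are placeholders, not the authors' setup.

```python
# Sketch: sampling a learnable neural texture at rasterized UV coordinates.
import torch
import torch.nn.functional as F

C, H, W = 16, 256, 256                       # feature channels, texture resolution
neural_texture = torch.nn.Parameter(torch.randn(1, C, H, W) * 0.01)

# uv: (1, H_out, W_out, 2) in [-1, 1], as produced by rasterizing the proxy mesh
uv = torch.rand(1, 120, 160, 2) * 2 - 1

features = F.grid_sample(neural_texture, uv, mode="bilinear", align_corners=True)
print(features.shape)                        # torch.Size([1, 16, 120, 160])
# `features` would then be fed to the deferred neural rendering network.
```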

On the Benefits of Populations on the Exploitation Speed of Standard Steady-State Genetic Algorithms

Title On the Benefits of Populations on the Exploitation Speed of Standard Steady-State Genetic Algorithms
Authors Dogan Corus, Pietro S. Oliveto
Abstract It is generally accepted that populations are useful for the global exploration of multi-modal optimisation problems. Indeed, several theoretical results are available showing such advantages over single-trajectory search heuristics. In this paper we provide evidence that evolving populations via crossover and mutation may also benefit the optimisation time for hillclimbing unimodal functions. In particular, we prove bounds on the expected runtime of the standard ($\mu$+1) GA for OneMax that are lower than its unary black box complexity and decrease in the leading constant with the population size up to $\mu=O(\sqrt{\log n})$. Our analysis suggests that the optimal mutation strategy is to flip two bits most of the time. To achieve the results we provide two interesting contributions to the theory of randomised search heuristics: 1) A novel application of drift analysis which compares absorption times of different Markov chains without defining an explicit potential function. 2) The inversion of fundamental matrices to calculate the absorption times of the Markov chains. The latter strategy was previously proposed in the literature but to the best of our knowledge this is the first time it has been used to show non-trivial bounds on expected runtimes.
Tasks
Published 2019-03-26
URL http://arxiv.org/abs/1903.10976v1
PDF http://arxiv.org/pdf/1903.10976v1.pdf
PWC https://paperswithcode.com/paper/on-the-benefits-of-populations-on-the
Repo
Framework
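
A simple sketch of the algorithm analysed above: the standard steady-state ($\mu$+1) GA on OneMax with uniform crossover and standard bit mutation. The mutation constant c is illustrative (c = 2 gives an expected two bit flips, consistent with the analysis quoted above), and the replacement rule is one common convention, not necessarily the paper's exact variant.

```python
# Sketch: steady-state (mu+1) GA on OneMax with crossover and bit mutation.
import numpy as np

def onemax_mu_plus_one_ga(n=100, mu=8, c=2.0, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(mu, n))
    evals = 0
    while pop.sum(axis=1).max() < n:
        p1, p2 = pop[rng.integers(mu)], pop[rng.integers(mu)]
        mask = rng.integers(0, 2, size=n).astype(bool)
        child = np.where(mask, p1, p2)                    # uniform crossover
        flips = rng.random(n) < c / n                     # standard bit mutation
        child = np.where(flips, 1 - child, child)
        worst = int(np.argmin(pop.sum(axis=1)))
        if child.sum() >= pop[worst].sum():               # replace a worst individual
            pop[worst] = child
        evals += 1
    return evals

print("fitness evaluations to optimum:", onemax_mu_plus_one_ga())
```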

Non-monotonic Logical Reasoning Guiding Deep Learning for Explainable Visual Question Answering

Title Non-monotonic Logical Reasoning Guiding Deep Learning for Explainable Visual Question Answering
Authors Heather Riley, Mohan Sridharan
Abstract State of the art algorithms for many pattern recognition problems rely on deep network models. Training these models requires a large labeled dataset and considerable computational resources. Also, it is difficult to understand the working of these learned models, limiting their use in some critical applications. Towards addressing these limitations, our architecture draws inspiration from research in cognitive systems, and integrates the principles of commonsense logical reasoning, inductive learning, and deep learning. In the context of answering explanatory questions about scenes and the underlying classification problems, the architecture uses deep networks for extracting features from images and for generating answers to queries. Between these deep networks, it embeds components for non-monotonic logical reasoning with incomplete commonsense domain knowledge, and for decision tree induction. It also incrementally learns and reasons with previously unknown constraints governing the domain’s states. We evaluated the architecture in the context of datasets of simulated and real-world images, and a simulated robot computing, executing, and providing explanatory descriptions of plans. Experimental results indicate that in comparison with an “end to end” architecture of deep networks, our architecture provides better accuracy on classification problems when the training dataset is small, comparable accuracy with larger datasets, and more accurate answers to explanatory questions. Furthermore, incremental acquisition of previously unknown constraints improves the ability to answer explanatory questions, and extending non-monotonic logical reasoning to support planning and diagnostics improves the reliability and efficiency of computing and executing plans on a simulated robot.
Tasks Question Answering, Visual Question Answering
Published 2019-09-23
URL https://arxiv.org/abs/1909.10650v1
PDF https://arxiv.org/pdf/1909.10650v1.pdf
PWC https://paperswithcode.com/paper/non-monotonic-logical-reasoning-guiding-deep
Repo
Framework
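
A sketch of just one component mentioned in the abstract above, inducing a decision tree over (pre-extracted) deep image features to answer a simple classification query; the interpretable tree structure is what supports explanatory answers. The features, labels, and tree depth below are synthetic placeholders, not the paper's pipeline.

```python
# Sketch: decision tree induction over stand-in deep features.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 8))                          # stand-in for CNN features
labels = (features[:, 0] + features[:, 3] > 0).astype(int)    # stand-in concept

tree = DecisionTreeClassifier(max_depth=3).fit(features, labels)
# The induced tree is human-readable, which supports explanatory answers.
print(export_text(tree, feature_names=[f"f{i}" for i in range(8)]))
```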

Detecting Text in the Wild with Deep Character Embedding Network

Title Detecting Text in the Wild with Deep Character Embedding Network
Authors Jiaming Liu, Chengquan Zhang, Yipeng Sun, Junyu Han, Errui Ding
Abstract Most text detection methods hypothesize texts are horizontal or multi-oriented and thus define quadrangles as the basic detection unit. However, text in the wild is usually perspectively distorted or curved, which can not be easily tackled by existing approaches. In this paper, we propose a deep character embedding network (CENet) which simultaneously predicts the bounding boxes of characters and their embedding vectors, thus making text detection a simple clustering task in the character embedding space. The proposed method does not require strong assumptions of forming a straight line on general text detection, which provides flexibility on arbitrarily curved or perspectively distorted text. For character detection task, a dense prediction subnetwork is designed to obtain the confidence score and bounding boxes of characters. For character embedding task, a subnet is trained with contrastive loss to project detected characters into embedding space. The two tasks share a backbone CNN from which the multi-scale feature maps are extracted. The final text regions can be easily achieved by a thresholding process on character confidence and embedding distance of character pairs. We evaluated our method on ICDAR13, ICDAR15, MSRA-TD500, and Total-Text. The proposed method achieves state-of-the-art or comparable performance on all these datasets, and shows substantial improvement in the irregular-text datasets, i.e. Total-Text.
Tasks
Published 2019-01-02
URL http://arxiv.org/abs/1901.00363v1
PDF http://arxiv.org/pdf/1901.00363v1.pdf
PWC https://paperswithcode.com/paper/detecting-text-in-the-wild-with-deep
Repo
Framework
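
A minimal sketch of the post-processing described above: group detected characters into text regions by thresholding pairwise distances in the embedding space (a simple union-find clustering). The toy embeddings and the threshold are synthetic, not the CENet pipeline.

```python
# Sketch: cluster character detections by embedding distance with union-find.
import numpy as np

def cluster_characters(embeddings, threshold=0.5):
    n = len(embeddings)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(embeddings[i] - embeddings[j]) < threshold:
                parent[find(i)] = find(j)      # merge clusters
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

embs = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.05, 0.05]])
print(cluster_characters(embs))   # e.g. [[0, 1, 4], [2, 3]]
```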