July 28, 2019

Paper Group ANR 279

Robust Tracking Using Region Proposal Networks

Title Robust Tracking Using Region Proposal Networks
Authors Jimmy Ren, Zhiyang Yu, Jianbo Liu, Rui Zhang, Wenxiu Sun, Jiahao Pang, Xiaohao Chen, Qiong Yan
Abstract Recent advances in visual tracking showed that deep Convolutional Neural Networks (CNN) trained for image classification can be strong feature extractors for discriminative trackers. However, due to the drastic difference between image classification and tracking, extra treatments such as model ensemble and feature engineering must be carried out to bridge the two domains. Such procedures are either time-consuming or hard to generalize well across datasets. In this paper we discovered that the internal structure of the Region Proposal Network (RPN)'s top-layer features can be utilized for robust visual tracking. We showed that this property has to be unleashed by a novel loss function which simultaneously considers classification accuracy and bounding box quality. Without ensemble and any extra treatment on feature maps, our proposed method achieved state-of-the-art results on several large-scale benchmarks including OTB50, OTB100 and VOT2016. We will make our code publicly available.
Tasks Feature Engineering, Image Classification, Visual Tracking
Published 2017-05-30
URL http://arxiv.org/abs/1705.10447v1
PDF http://arxiv.org/pdf/1705.10447v1.pdf
PWC https://paperswithcode.com/paper/robust-tracking-using-region-proposal
Repo
Framework

Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk

Title Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk
Authors Paul Hand, Vladislav Voroninski
Abstract We examine the theoretical properties of enforcing priors provided by generative deep neural networks via empirical risk minimization. In particular we consider two models, one in which the task is to invert a generative neural network given access to its last layer and another in which the task is to invert a generative neural network given only compressive linear observations of its last layer. We establish that in both cases, in suitable regimes of network layer sizes and under a randomness assumption on the network weights, the non-convex objective function given by empirical risk minimization does not have any spurious stationary points. That is, we establish that with high probability, at any point away from small neighborhoods around two scalar multiples of the desired solution, there is a descent direction. Hence, there are no local minima, saddle points, or other stationary points outside these neighborhoods. These results constitute the first theoretical guarantees which establish the favorable global geometry of these non-convex optimization problems, and they bridge the gap between the empirical success of enforcing deep generative priors and a rigorous understanding of non-linear inverse problems.
Tasks
Published 2017-05-22
URL http://arxiv.org/abs/1705.07576v3
PDF http://arxiv.org/pdf/1705.07576v3.pdf
PWC https://paperswithcode.com/paper/global-guarantees-for-enforcing-deep
Repo
Framework

Matroids Hitting Sets and Unsupervised Dependency Grammar Induction

Title Matroids Hitting Sets and Unsupervised Dependency Grammar Induction
Authors Nicholas Harvey, Vahab Mirrokni, David Karger, Virginia Savova, Leonid Peshkin
Abstract This paper formulates a novel problem on graphs: find the minimal subset of edges in a fully connected graph, such that the resulting graph contains all spanning trees for a set of specified sub-graphs. This formulation is motivated by an unsupervised grammar induction problem from computational linguistics. We present a reduction to some known problems and algorithms from graph theory, provide computational complexity results, and describe an approximation algorithm.
Tasks Dependency Grammar Induction
Published 2017-05-24
URL http://arxiv.org/abs/1705.08992v2
PDF http://arxiv.org/pdf/1705.08992v2.pdf
PWC https://paperswithcode.com/paper/matroids-hitting-sets-and-unsupervised
Repo
Framework

Video Highlight Prediction Using Audience Chat Reactions

Title Video Highlight Prediction Using Audience Chat Reactions
Authors Cheng-Yang Fu, Joon Lee, Mohit Bansal, Alexander C. Berg
Abstract Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures.
Tasks League of Legends
Published 2017-07-26
URL http://arxiv.org/abs/1707.08559v1
PDF http://arxiv.org/pdf/1707.08559v1.pdf
PWC https://paperswithcode.com/paper/video-highlight-prediction-using-audience
Repo
Framework

Flow Navigation by Smart Microswimmers via Reinforcement Learning

Title Flow Navigation by Smart Microswimmers via Reinforcement Learning
Authors Simona Colabrese, Kristian Gustavsson, Antonio Celani, Luca Biferale
Abstract Smart active particles can acquire some limited knowledge of the fluid environment from simple mechanical cues and exert a control on their preferred steering direction. Their goal is to learn the best way to navigate by exploiting the underlying flow whenever possible. As an example, we focus our attention on smart gravitactic swimmers. These are active particles whose task is to reach the highest altitude within some time horizon, given the constraints enforced by fluid mechanics. By means of numerical experiments, we show that swimmers indeed learn nearly optimal strategies just by experience. A reinforcement learning algorithm allows particles to learn effective strategies even in difficult situations when, in the absence of control, they would end up being trapped by flow structures. These strategies are highly nontrivial and cannot be easily guessed in advance. This Letter illustrates the potential of reinforcement learning algorithms to model adaptive behavior in complex flows and paves the way towards the engineering of smart microswimmers that solve difficult navigation problems.
Tasks
Published 2017-01-30
URL http://arxiv.org/abs/1701.08848v3
PDF http://arxiv.org/pdf/1701.08848v3.pdf
PWC https://paperswithcode.com/paper/flow-navigation-by-smart-microswimmers-via
Repo
Framework

Comparison of echo state network output layer classification methods on noisy data

Title Comparison of echo state network output layer classification methods on noisy data
Authors Ashley Prater
Abstract Echo state networks are a recently developed type of recurrent neural network where the internal layer is fixed with random weights, and only the output layer is trained on specific data. Echo state networks are increasingly being used to process spatiotemporal data in real-world settings, including speech recognition, event detection, and robot control. A strength of echo state networks is the simple method used to train the output layer - typically a collection of linear readout weights found using a least squares approach. Although straightforward to train and having a low computational cost to use, this method may not yield acceptable accuracy performance on noisy data. This study compares the performance of three echo state network output layer methods to perform classification on noisy data: using trained linear weights, using sparse trained linear weights, and using trained low-rank approximations of reservoir states. The methods are investigated experimentally on both synthetic and natural datasets. The experiments suggest that using regularized least squares to train linear output weights is superior on data with low noise, but using the low-rank approximations may significantly improve accuracy on datasets contaminated with higher noise levels.
Tasks Speech Recognition
Published 2017-03-13
URL http://arxiv.org/abs/1703.04496v1
PDF http://arxiv.org/pdf/1703.04496v1.pdf
PWC https://paperswithcode.com/paper/comparison-of-echo-state-network-output-layer
Repo
Framework

Estimation of the volume of the left ventricle from MRI images using deep neural networks

Title Estimation of the volume of the left ventricle from MRI images using deep neural networks
Authors Fangzhou Liao, Xi Chen, Xiaolin Hu, Sen Song
Abstract Segmenting the human left ventricle (LV) in magnetic resonance imaging (MRI) images and calculating its volume are important for diagnosing cardiac diseases. In 2016, Kaggle organized a competition to estimate the volume of the LV from MRI images. The dataset consisted of a large number of cases, but only provided systole and diastole volumes as labels. We designed a system based on neural networks to solve this problem. It began with a detector combined with a neural network classifier for detecting regions of interest (ROIs) containing LV chambers. Then a deep neural network named hypercolumns fully convolutional network was used to segment the LV in ROIs. The 2D segmentation results were integrated across different images to estimate the volume. With ground-truth volume labels, this model was trained end-to-end. To improve the result, an additional dataset with only segmentation labels was used. The model was trained alternately on these two datasets with different types of teaching signals. We also proposed a variance estimation method for the final prediction. Our algorithm ranked 4th on the test set in this competition.
Tasks
Published 2017-02-13
URL http://arxiv.org/abs/1702.03833v1
PDF http://arxiv.org/pdf/1702.03833v1.pdf
PWC https://paperswithcode.com/paper/estimation-of-the-volume-of-the-left
Repo
Framework
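The abstract above says that 2D segmentation results are integrated across images to estimate the volume. One simple way to picture that step is slice summation: count the segmented pixels in each slice and multiply by the pixel area and the slice spacing. The sketch below is an illustration under that assumption only; it is not the authors' trained, end-to-end integration, and the function name `estimate_volume_ml` and its parameters are hypothetical.

```python
def estimate_volume_ml(masks, pixel_area_mm2, slice_spacing_mm):
    """Approximate a volume from a stack of binary 2D masks.

    masks: list of 2D lists of 0/1 values, one per MRI slice (hypothetical input).
    pixel_area_mm2: area covered by one pixel, in square millimetres.
    slice_spacing_mm: distance between consecutive slices, in millimetres.
    Returns the volume in millilitres (1 mL = 1000 mm^3).
    """
    total_mm3 = 0.0
    for mask in masks:
        # Number of pixels labelled as left ventricle in this slice.
        segmented_pixels = sum(sum(row) for row in mask)
        # Treat each slice as a slab of constant cross-section.
        total_mm3 += segmented_pixels * pixel_area_mm2 * slice_spacing_mm
    return total_mm3 / 1000.0


if __name__ == "__main__":
    slices = [
        [[1, 1], [1, 1]],  # 4 segmented pixels
        [[1, 0], [0, 1]],  # 2 segmented pixels
    ]
    print(estimate_volume_ml(slices, pixel_area_mm2=2.0, slice_spacing_mm=10.0))
```

The slab approximation ignores partial-volume effects at the apex and base, which is one reason a learned integration step may do better than plain summation.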

Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis

Title Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis
Authors Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra
Abstract Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite number of loss functions. The present paper proposes a Riemannian stochastic quasi-Newton algorithm with variance reduction (R-SQN-VR). The key challenges of averaging, adding, and subtracting multiple gradients are addressed with notions of retraction and vector transport. We present convergence analyses of R-SQN-VR on both non-convex and retraction-convex functions under retraction and vector transport operators. The proposed algorithm is evaluated on the Karcher mean computation on the symmetric positive-definite manifold and the low-rank matrix completion on the Grassmann manifold. In all cases, the proposed algorithm outperforms the state-of-the-art Riemannian batch and stochastic gradient algorithms.
Tasks Low-Rank Matrix Completion, Matrix Completion
Published 2017-03-15
URL http://arxiv.org/abs/1703.04890v3
PDF http://arxiv.org/pdf/1703.04890v3.pdf
PWC https://paperswithcode.com/paper/riemannian-stochastic-quasi-newton-algorithm
Repo
Framework

Feed Forward and Backward Run in Deep Convolution Neural Network

Title Feed Forward and Backward Run in Deep Convolution Neural Network
Authors Pushparaja Murugan
Abstract Convolution Neural Networks (CNN), also known as ConvNets, are widely used in many visual imagery applications, object classification, and speech recognition. After the implementation and demonstration of the deep convolution neural network in ImageNet classification in 2012 by Krizhevsky, the architecture of the deep Convolution Neural Network has attracted many researchers. This has led to major developments in deep learning frameworks such as TensorFlow, Caffe, Keras, and Theano. Though implementing deep learning is quite possible by employing these frameworks, the mathematical theory and concepts are harder for new learners and practitioners to understand. This article is intended to provide an overview of the ConvNet architecture and to explain the mathematical theory behind it, including activation functions, loss functions, and feedforward and backward propagation. In this article, a grey scale image is taken as the input image, ReLU and Sigmoid activation functions are considered for developing the architecture, and a cross-entropy loss function is used for computing the difference between the predicted value and the actual value. The architecture is developed in such a way that it can contain one convolution layer, one pooling layer, and multiple dense layers.
Tasks Object Classification, Speech Recognition
Published 2017-11-09
URL http://arxiv.org/abs/1711.03278v1
PDF http://arxiv.org/pdf/1711.03278v1.pdf
PWC https://paperswithcode.com/paper/feed-forward-and-backward-run-in-deep
Repo
Framework
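The building blocks named in the abstract above (a convolution layer, a pooling layer, ReLU and Sigmoid activations, and a cross-entropy loss) can be written out in a few lines of plain Python. This is a minimal forward-pass sketch, not the article's own derivation: the function names are hypothetical, the pooling uses non-overlapping windows, and the loss is the binary form of cross-entropy, chosen here as an assumption because it pairs naturally with a sigmoid output.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def conv2d_valid(image, kernel):
    """'Valid' 2D convolution as used in ConvNets (technically cross-correlation:
    the kernel is not flipped). image and kernel are 2D lists of numbers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Max pooling over non-overlapping size x size windows (remainder dropped)."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def binary_cross_entropy(predicted, target, eps=1e-12):
    """Cross-entropy between a predicted probability and a 0/1 target."""
    p = min(max(predicted, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


if __name__ == "__main__":
    grey_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    features = conv2d_valid(grey_image, [[1, 0], [0, 1]])
    activated = [[relu(v) for v in row] for row in features]
    print(max_pool(activated))
```

In a full network the pooled values would be flattened and fed through the dense layers before the loss is computed; those layers are omitted here to keep the sketch short.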

Leak Event Identification in Water Systems Using High Order CRF

Title Leak Event Identification in Water Systems Using High Order CRF
Authors Qing Han, Wentao Zhu, Yang Shi
Abstract Today, detection of anomalous events in civil infrastructures (e.g. water pipe breaks and leaks) is time-consuming and often takes hours or days. Pipe breakage, as one of the most frequent types of failure of water networks, often causes community disruptions ranging from temporary interruptions in services to extended loss of business and relocation of residents. In this project, we design and implement a two-phase approach for leak event identification, which leverages dynamic data from multiple information sources including IoT sensing data (pressure values and/or flow rates), geophysical data (water systems), and human inputs (tweets posted on Twitter). In the approach, a high order Conditional Random Field (CRF) is constructed that enforces predictions based on IoT observations to be consistent with human inputs, improving the performance of event identification. Considering the physical water network as a graph, a CRF model is built and learned by the Structured Support Vector Machine (SSVM) using node features such as water pressure and flow rate. After that, we built the high order CRF system by enforcing Twitter leakage detection information. An optimal inference algorithm is proposed for the adapted high order CRF model. Experimental results show the effectiveness of our system.
Tasks
Published 2017-03-12
URL http://arxiv.org/abs/1703.04170v1
PDF http://arxiv.org/pdf/1703.04170v1.pdf
PWC https://paperswithcode.com/paper/leak-event-identification-in-water-systems
Repo
Framework

Hard Mixtures of Experts for Large Scale Weakly Supervised Vision

Title Hard Mixtures of Experts for Large Scale Weakly Supervised Vision
Authors Sam Gross, Marc’Aurelio Ranzato, Arthur Szlam
Abstract Training convolutional networks (CNNs) that fit on a single GPU with minibatch stochastic gradient descent has become effective in practice. However, there is still no effective method for training large CNNs that do not fit in the memory of a few GPU cards, or for parallelizing CNN training. In this work we show that a simple hard mixture of experts model can be efficiently trained to good effect on large scale hashtag (multilabel) prediction tasks. Mixture of experts models are not new (Jacobs et al. 1991, Collobert et al. 2003), but in the past, researchers have had to devise sophisticated methods to deal with data fragmentation. We show empirically that modern weakly supervised data sets are large enough to support naive partitioning schemes where each data point is assigned to a single expert. Because the experts are independent, training them in parallel is easy, and evaluation is cheap for the size of the model. Furthermore, we show that we can use a single decoding layer for all the experts, allowing a unified feature embedding space. We demonstrate that it is feasible (and in fact relatively painless) to train far larger models than could be practically trained with standard CNN architectures, and that the extra capacity can be well used on current datasets.
Tasks
Published 2017-04-20
URL http://arxiv.org/abs/1704.06363v1
PDF http://arxiv.org/pdf/1704.06363v1.pdf
PWC https://paperswithcode.com/paper/hard-mixtures-of-experts-for-large-scale
Repo
Framework

Toward Crowd-Sensitive Path Planning

Title Toward Crowd-Sensitive Path Planning
Authors Anoop Aroor, Susan L. Epstein
Abstract If a robot can predict crowds in parts of its environment that are inaccessible to its sensors, then it can plan to avoid them. This paper proposes a fast, online algorithm that learns average crowd densities in different areas. It also describes how these densities can be incorporated into existing navigation architectures. In simulation across multiple challenging crowd scenarios, the robot reaches its target faster, travels less, and risks fewer collisions than if it were to plan with the traditional A* algorithm.
Tasks
Published 2017-10-16
URL http://arxiv.org/abs/1710.05503v1
PDF http://arxiv.org/pdf/1710.05503v1.pdf
PWC https://paperswithcode.com/paper/toward-crowd-sensitive-path-planning
Repo
Framework
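The abstract above describes feeding learned crowd densities into an existing planner and comparing against traditional A*. A rough way to illustrate the idea is to run A* on a grid where entering a cell costs one step plus a penalty proportional to that cell's crowd density. The sketch below assumes exactly that cost model; it is not the paper's algorithm, and `plan_path`, `crowd_weight`, and the grid representation are hypothetical. Setting `crowd_weight=0` recovers plain shortest-path A*.

```python
import heapq


def plan_path(density, start, goal, crowd_weight=0.0):
    """A* over a 4-connected grid with a crowd-density penalty.

    density: 2D list of non-negative average crowd densities per cell.
    start, goal: (row, col) tuples.
    Step cost into a cell is 1 + crowd_weight * density[cell] (an assumed model).
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(density), len(density[0])

    def heuristic(cell):
        # Manhattan distance never overestimates because every step costs >= 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale queue entry; a cheaper route was found already
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            new_cost = cost + 1.0 + crowd_weight * density[nr][nc]
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


if __name__ == "__main__":
    crowd = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]  # a dense crowd in the centre cell
    print(plan_path(crowd, (1, 0), (1, 2)))                    # straight through
    print(plan_path(crowd, (1, 0), (1, 2), crowd_weight=1.0))  # detours around
```

With the penalty enabled the planner accepts a longer route to stay out of the crowded cell, which is the trade-off the abstract evaluates in simulation.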

Ranking Median Regression: Learning to Order through Local Consensus

Title Ranking Median Regression: Learning to Order through Local Consensus
Authors Stephan Clémençon, Anna Korba, Eric Sibony
Abstract This article is devoted to the problem of predicting the value taken by a random permutation $\Sigma$, describing the preferences of an individual over a set of numbered items $\{1, \ldots, n\}$ say, based on the observation of an input/explanatory r.v. $X$ (e.g. characteristics of the individual), when error is measured by the Kendall $\tau$ distance. In the probabilistic formulation of the ‘Learning to Order’ problem we propose, which extends the framework for statistical Kemeny ranking aggregation developed in [CKS17], this boils down to recovering conditional Kemeny medians of $\Sigma$ given $X$ from i.i.d. training examples $(X_1, \Sigma_1), \ldots, (X_N, \Sigma_N)$. For this reason, this statistical learning problem is referred to as ranking median regression here. Our contribution is twofold. We first propose a probabilistic theory of ranking median regression: the set of optimal elements is characterized, the performance of empirical risk minimizers is investigated in this context and situations where fast learning rates can be achieved are also exhibited. Next we introduce the concept of local consensus/median, in order to derive efficient methods for ranking median regression. The major advantage of this local learning approach lies in its close connection with the widely studied Kemeny aggregation problem. From an algorithmic perspective, this makes it possible to build predictive rules for ranking median regression by implementing efficient techniques for (approximate) Kemeny median computations at a local level in a tractable manner. In particular, versions of $k$-nearest neighbor and tree-based methods, tailored to ranking median regression, are investigated. Accuracy of piecewise constant ranking median regression rules is studied under a specific smoothness assumption for $\Sigma$'s conditional distribution given $X$.
Tasks
Published 2017-10-31
URL http://arxiv.org/abs/1711.00070v2
PDF http://arxiv.org/pdf/1711.00070v2.pdf
PWC https://paperswithcode.com/paper/ranking-median-regression-learning-to-order
Repo
Framework
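Two notions from the abstract above are easy to make concrete: the Kendall $\tau$ distance between two rankings (the number of item pairs they order differently) and a Kemeny median (a ranking minimizing the summed Kendall $\tau$ distance to a collection of rankings). The sketch below computes both; the function names are hypothetical, and the median is found by brute force over all permutations, so it is only usable for a handful of items and is not one of the efficient local methods the paper studies.

```python
from itertools import combinations, permutations


def kendall_tau_distance(sigma, pi):
    """Count the item pairs that the two rankings order differently.

    sigma, pi: sequences listing the same items from most to least preferred.
    """
    if sorted(sigma) != sorted(pi):
        raise ValueError("both rankings must contain the same items")
    rank_sigma = {item: r for r, item in enumerate(sigma)}
    rank_pi = {item: r for r, item in enumerate(pi)}
    return sum(
        1
        for a, b in combinations(sigma, 2)
        # A negative product means a and b appear in opposite orders.
        if (rank_sigma[a] - rank_sigma[b]) * (rank_pi[a] - rank_pi[b]) < 0
    )


def kemeny_median(rankings):
    """Brute-force Kemeny median of a non-empty list of rankings.

    Tries every permutation, so the cost grows as n! in the number of items.
    """
    items = sorted(rankings[0])
    return min(
        permutations(items),
        key=lambda candidate: sum(kendall_tau_distance(candidate, r) for r in rankings),
    )


if __name__ == "__main__":
    print(kendall_tau_distance([1, 2, 3], [3, 2, 1]))
    print(kemeny_median([[1, 2, 3], [1, 2, 3], [3, 2, 1]]))
```

A local method in the spirit of the abstract would apply `kemeny_median` only to the rankings of training points near a query $X$, for example its $k$ nearest neighbors.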

Development of a N-type GM-PHD Filter for Multiple Target, Multiple Type Visual Tracking

Title Development of a N-type GM-PHD Filter for Multiple Target, Multiple Type Visual Tracking
Authors Nathanael L. Baisa, Andrew Wallace
Abstract We propose a new framework that extends the standard Probability Hypothesis Density (PHD) filter for multiple targets having $N\geq2$ different types based on Random Finite Set theory, taking into account not only background clutter, but also confusions among detections of different target types, which are in general different in character from background clutter. Under Gaussianity and linearity assumptions, our framework extends the existing Gaussian mixture (GM) implementation of the standard PHD filter to create an N-type GM-PHD filter. The methodology is applied to real video sequences by integrating object detectors’ information into this filter for two scenarios. For both cases, Munkres’s variant of the Hungarian assignment algorithm is used to associate tracked target identities between frames. This approach is evaluated and compared to both raw detection and independent GM-PHD filters using the Optimal Sub-pattern Assignment metric and discrimination rate. This shows the improved performance of our strategy on real video sequences.
Tasks Visual Tracking
Published 2017-05-31
URL http://arxiv.org/abs/1706.00672v5
PDF http://arxiv.org/pdf/1706.00672v5.pdf
PWC https://paperswithcode.com/paper/development-of-a-n-type-gm-phd-filter-for
Repo
Framework

Image Restoration from Patch-based Compressed Sensing Measurement

Title Image Restoration from Patch-based Compressed Sensing Measurement
Authors Guangtao Nie, Ying Fu, Yinqiang Zheng, Hua Huang
Abstract A series of methods have been proposed to reconstruct an image from compressively sensed random measurements, but most of them have high time complexity and are inappropriate for patch-based compressed sensing capture, because of serious blocky artifacts in their restoration results. In this paper, we present a non-iterative image reconstruction method for patch-based compressively sensed random measurements. Our method features two cascaded networks based on residual convolution neural networks to learn end-to-end full image restoration, and is capable of reconstructing image patches and removing the blocky effect with low time cost. Experimental results on synthetic and real data show that our method outperforms state-of-the-art compressive sensing (CS) reconstruction methods with patch-based CS measurement. To demonstrate the effectiveness of our method in a more general setting, we apply the de-block process in our method to JPEG compression artifact removal and achieve outstanding performance as well.
Tasks Compressive Sensing, Image Reconstruction, Image Restoration
Published 2017-06-02
URL http://arxiv.org/abs/1706.00597v1
PDF http://arxiv.org/pdf/1706.00597v1.pdf
PWC https://paperswithcode.com/paper/image-restoration-from-patch-based-compressed
Repo
Framework