January 30, 2020

2899 words 14 mins read

Paper Group ANR 213


CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification

Title CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification
Authors Zheng Tang, Milind Naphade, Ming-Yu Liu, Xiaodong Yang, Stan Birchfield, Shuo Wang, Ratnesh Kumar, David Anastasiu, Jenq-Neng Hwang
Abstract Urban traffic optimization using traffic cameras as sensors is driving the need to advance state-of-the-art multi-target multi-camera (MTMC) tracking. This work introduces CityFlow, a city-scale traffic camera dataset consisting of more than 3 hours of synchronized HD videos from 40 cameras across 10 intersections, with the longest distance between two simultaneous cameras being 2.5 km. To the best of our knowledge, CityFlow is the largest-scale dataset in terms of spatial coverage and the number of cameras/videos in an urban environment. The dataset contains more than 200K annotated bounding boxes covering a wide range of scenes, viewing angles, vehicle models, and urban traffic flow conditions. Camera geometry and calibration information are provided to aid spatio-temporal analysis. In addition, a subset of the benchmark is made available for the task of image-based vehicle re-identification (ReID). We conducted an extensive experimental evaluation of baselines/state-of-the-art approaches in MTMC tracking, multi-target single-camera (MTSC) tracking, object detection, and image-based ReID on this dataset, analyzing the impact of different network architectures, loss functions, spatio-temporal models and their combinations on task effectiveness. An evaluation server is launched with the release of our benchmark at the 2019 AI City Challenge (https://www.aicitychallenge.org/) that allows researchers to compare the performance of their newest techniques. We expect this dataset to catalyze research in this field, propel the state-of-the-art forward, and lead to deployed traffic optimization(s) in the real world.
Tasks Calibration, Object Detection, Vehicle Re-Identification
Published 2019-03-21
URL http://arxiv.org/abs/1903.09254v4
PDF http://arxiv.org/pdf/1903.09254v4.pdf
PWC https://paperswithcode.com/paper/cityflow-a-city-scale-benchmark-for-multi
Repo
Framework
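
As a rough illustration of how the image-based ReID subset of such a benchmark is typically scored, the sketch below computes rank-1 accuracy over embedding vectors. This is a hedged, minimal example: the benchmark's own evaluation server (mentioned in the abstract) uses its own protocol, which this toy snippet does not reproduce, and all feature shapes and identity counts here are made up.

import numpy as np

def rank1_accuracy(query_feats, gallery_feats, query_ids, gallery_ids):
    # For each query, take the nearest gallery embedding by Euclidean distance
    # and check whether it carries the same identity label.
    dists = np.linalg.norm(query_feats[:, None, :] - gallery_feats[None, :, :], axis=2)
    nearest = gallery_ids[np.argmin(dists, axis=1)]
    return float(np.mean(nearest == query_ids))

rng = np.random.default_rng(0)
queries, gallery = rng.standard_normal((20, 128)), rng.standard_normal((200, 128))
acc = rank1_accuracy(queries, gallery, rng.integers(0, 10, 20), rng.integers(0, 10, 200))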

Reusable neural skill embeddings for vision-guided whole body movement and object manipulation

Title Reusable neural skill embeddings for vision-guided whole body movement and object manipulation
Authors Josh Merel, Saran Tunyasuvunakool, Arun Ahuja, Yuval Tassa, Leonard Hasenclever, Vu Pham, Tom Erez, Greg Wayne, Nicolas Heess
Abstract Both in simulation settings and robotics, there is an ambition to produce flexible control systems that can enable complex bodies to perform dynamic locomotion and natural object manipulation. In previous work, we developed a framework to train locomotor skills and reuse these skills for whole-body visuomotor tasks. Here, we extend this line of work to tasks involving whole body movement as well as visually guided manipulation of objects. This setting poses novel challenges in terms of task specification, exploration, and generalization. We develop an integrated approach consisting of a flexible motor primitive module, demonstrations, an instructed training regime as well as curricula in the form of task variations. We demonstrate the utility of our approach for solving challenging whole body tasks that require joint locomotion and manipulation, and characterize its behavioral robustness. We also provide a high-level overview video, see https://youtu.be/t0RDGSnE3cM .
Tasks
Published 2019-11-15
URL https://arxiv.org/abs/1911.06636v1
PDF https://arxiv.org/pdf/1911.06636v1.pdf
PWC https://paperswithcode.com/paper/reusable-neural-skill-embeddings-for-vision
Repo
Framework

Progressive Retinex: Mutually Reinforced Illumination-Noise Perception Network for Low Light Image Enhancement

Title Progressive Retinex: Mutually Reinforced Illumination-Noise Perception Network for Low Light Image Enhancement
Authors Yang Wang, Yang Cao, Zheng-Jun Zha, Jing Zhang, Zhiwei Xiong, Wei Zhang, Feng Wu
Abstract Contrast enhancement and noise removal are coupled problems for low-light image enhancement. Existing Retinex-based methods do not take this coupling into consideration, resulting in under- or over-smoothing of the enhanced images. To address this issue, this paper presents a novel progressive Retinex framework, in which the illumination and noise of a low-light image are perceived in a mutually reinforced manner, leading to low-light enhancement results with reduced noise. Specifically, two fully pointwise convolutional neural networks are devised to model the statistical regularities of ambient light and image noise respectively, and to leverage them as constraints that facilitate the mutual learning process. The proposed method not only suppresses the interference caused by the ambiguity between tiny textures and image noise, but also greatly improves computational efficiency. Moreover, to address the shortage of training data, we propose an image synthesis strategy based on the camera imaging model, which generates color images corrupted by illumination-dependent noise. Experimental results on both synthetic and real low-light images demonstrate the superiority of the proposed approach over state-of-the-art (SOTA) low-light enhancement methods.
Tasks Image Enhancement, Image Generation, Low-Light Image Enhancement
Published 2019-11-26
URL https://arxiv.org/abs/1911.11323v1
PDF https://arxiv.org/pdf/1911.11323v1.pdf
PWC https://paperswithcode.com/paper/progressive-retinex-mutually-reinforced
Repo
Framework
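
For context, the Retinex model this line of work builds on assumes an observed image S is the element-wise product of reflectance R and illumination L, i.e. S = R * L. The sketch below is a classical single-scale Retinex-style enhancement, not the paper's mutually reinforced illumination-noise network; the smoothing scale sigma and the floor eps are arbitrary illustrative choices.

import numpy as np
from scipy.ndimage import gaussian_filter

def naive_retinex_enhance(img, sigma=15.0, eps=1e-3):
    # img: float image in [0, 1] with shape (H, W, 3).
    # Rough illumination estimate: channel-wise max, smoothed spatially.
    illum = gaussian_filter(img.max(axis=2), sigma=sigma)
    illum = np.clip(illum, eps, 1.0)
    # Recover an approximate reflectance by dividing out the estimated illumination.
    reflectance = img / illum[..., None]
    return np.clip(reflectance, 0.0, 1.0)

low_light = (np.random.default_rng(0).random((256, 256, 3)) * 0.2).astype(np.float32)
enhanced = naive_retinex_enhance(low_light)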

QUOTIENT: Two-Party Secure Neural Network Training and Prediction

Title QUOTIENT: Two-Party Secure Neural Network Training and Prediction
Authors Nitin Agrawal, Ali Shahin Shamsabadi, Matt J. Kusner, Adrià Gascón
Abstract Recently, there has been a wealth of effort devoted to the design of secure protocols for machine learning tasks. Much of this is aimed at enabling secure prediction from highly accurate Deep Neural Networks (DNNs). However, as DNNs are trained on data, a key question is how such models can also be trained securely. The few prior works on secure DNN training have focused either on designing custom protocols for existing training algorithms, or on developing tailored training algorithms and then applying generic secure protocols. In this work, we investigate the advantages of designing training algorithms alongside a novel secure protocol, incorporating optimizations on both fronts. We present QUOTIENT, a new method for discretized training of DNNs, along with a customized secure two-party protocol for it. QUOTIENT incorporates key components of state-of-the-art DNN training such as layer normalization and adaptive gradient methods, and improves upon the state-of-the-art in DNN training in two-party computation. Compared to prior work, we obtain an improvement of 50X in WAN time and 6% in absolute accuracy.
Tasks
Published 2019-07-08
URL https://arxiv.org/abs/1907.03372v1
PDF https://arxiv.org/pdf/1907.03372v1.pdf
PWC https://paperswithcode.com/paper/quotient-two-party-secure-neural-network
Repo
Framework
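
Protocols in this family keep every tensor additively secret-shared between the two parties, so neither side ever holds a value in the clear. The snippet below shows only that sharing layer over a 32-bit ring, as a hedged sketch; QUOTIENT's contribution lies in the secure arithmetic built on top of such shares (oblivious-transfer-based multiplication, normalization, adaptive gradients), none of which is reproduced here.

import numpy as np

RING = 2 ** 32  # arithmetic over a power-of-two ring, common in two-party protocols

def share(x, rng):
    # Split an integer tensor into two additive shares: x = s0 + s1 (mod RING).
    s0 = rng.integers(0, RING, size=x.shape, dtype=np.uint64)
    s1 = (x.astype(np.uint64) - s0) % RING
    return s0, s1

def reconstruct(s0, s1):
    return (s0 + s1) % RING

rng = np.random.default_rng(0)
weights = np.array([3, 7, 11], dtype=np.uint64)
w0, w1 = share(weights, rng)            # party 0 holds w0, party 1 holds w1
assert np.array_equal(reconstruct(w0, w1), weights)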

Self-supervised Training of Proposal-based Segmentation via Background Prediction

Title Self-supervised Training of Proposal-based Segmentation via Background Prediction
Authors Isinsu Katircioglu, Helge Rhodin, Victor Constantin, Jörg Spörri, Mathieu Salzmann, Pascal Fua
Abstract While supervised object detection methods achieve impressive accuracy, they generalize poorly to images whose appearance significantly differs from the data they have been trained on. To address this in scenarios where annotating data is prohibitively expensive, we introduce a self-supervised approach to object detection and segmentation, able to work with monocular images captured with a moving camera. At the heart of our approach lies the observation that segmentation and background reconstruction are linked tasks, and the idea that, because we observe a structured scene, background regions can be re-synthesized from their surroundings, whereas regions depicting the object cannot. We therefore encode this intuition as a self-supervised loss function that we exploit to train a proposal-based segmentation network. To account for the discrete nature of object proposals, we develop a Monte Carlo-based training strategy that allows us to explore the large space of object proposals. Our experiments demonstrate that our approach yields accurate detections and segmentations in images that visually depart from those of standard benchmarks, outperforming existing self-supervised methods and approaching weakly supervised ones that exploit large annotated datasets.
Tasks Object Detection
Published 2019-07-18
URL https://arxiv.org/abs/1907.08051v1
PDF https://arxiv.org/pdf/1907.08051v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-training-of-proposal-based
Repo
Framework
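
The supervision signal rests on the observation that background can be re-synthesized from its surroundings while object regions cannot. The sketch below substitutes a crude Gaussian inpainting for the paper's learned background reconstruction, so it only conveys the shape of the loss: a proposal whose interior is poorly explained by its surroundings scores high. The smoothing scale and toy data are arbitrary assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def background_prediction_score(image, mask, sigma=5.0):
    # image: (H, W) grayscale in [0, 1]; mask: (H, W) in {0, 1}, 1 = proposed object region.
    context = image * (1.0 - mask)
    # "Re-synthesize" the masked region from its surroundings via normalized smoothing.
    filled = gaussian_filter(context, sigma) / (gaussian_filter(1.0 - mask, sigma) + 1e-6)
    # High reconstruction error inside the mask suggests the region is not background.
    return float(np.mean(((image - filled) ** 2)[mask > 0.5])) if mask.any() else 0.0

rng = np.random.default_rng(0)
img = rng.random((64, 64))
mask = np.zeros((64, 64)); mask[20:40, 20:40] = 1.0
score = background_prediction_score(img, mask)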

DSTP-RNN: a dual-stage two-phase attention-based recurrent neural networks for long-term and multivariate time series prediction

Title DSTP-RNN: a dual-stage two-phase attention-based recurrent neural networks for long-term and multivariate time series prediction
Authors Yeqi Liu, Chuanyang Gong, Ling Yang, Yingyi Chen
Abstract Long-term prediction of multivariate time series is an important but challenging problem. The key to solving it is to capture the spatial correlations at the same time step, the spatio-temporal relationships across different time steps, and the long-term dependence of the temporal relationships between different series. Attention-based recurrent neural networks (RNNs) can effectively represent the dynamic spatio-temporal relationships between exogenous series and the target series, but they only perform well in one-step and short-term prediction. In this paper, inspired by the human attention mechanism, including the dual-stage two-phase (DSTP) model and the influence mechanism of target and non-target information, we propose DSTP-based RNN (DSTP-RNN) and DSTP-RNN-2 for long-term time series prediction. Specifically, we first propose the DSTP-based structure to enhance the spatial correlations between exogenous series: the first phase produces strong but decentralized response weights, while the second phase yields stationary and concentrated response weights. Second, we employ multiple attention mechanisms on the target series to strengthen the long-term dependence. Finally, we study the performance of the deep spatial attention mechanism and provide experiments and interpretation. Our methods outperform nine baseline methods on four datasets from the fields of energy, finance, environment, and medicine.
Tasks Time Series, Time Series Prediction
Published 2019-04-16
URL http://arxiv.org/abs/1904.07464v1
PDF http://arxiv.org/pdf/1904.07464v1.pdf
PWC https://paperswithcode.com/paper/dstp-rnn-a-dual-stage-two-phase-attention
Repo
Framework
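
At the core of this family of models is attention over the exogenous driving series, which weights each series' contribution to the encoder state. The NumPy sketch below shows a single-phase version of such input attention; the paper stacks two phases and applies further attention over the target series, and every dimension and weight matrix here is an illustrative stand-in rather than anything taken from the paper.

import numpy as np

def input_attention(h, exo, W, U, v):
    # h:   (m,)    previous encoder hidden state
    # exo: (n, T)  n exogenous series, each of length T
    # Returns softmax attention weights over the n series.
    scores = np.array([v @ np.tanh(W @ h + U @ exo[k]) for k in range(exo.shape[0])])
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

m, n, T, d = 8, 5, 10, 16
rng = np.random.default_rng(0)
alpha = input_attention(rng.standard_normal(m), rng.standard_normal((n, T)),
                        rng.standard_normal((d, m)), rng.standard_normal((d, T)),
                        rng.standard_normal(d))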

Iterative and Adaptive Sampling with Spatial Attention for Black-Box Model Explanations

Title Iterative and Adaptive Sampling with Spatial Attention for Black-Box Model Explanations
Authors Bhavan Vasu, Chengjiang Long
Abstract Deep neural networks have achieved great success in many real-world applications, yet it remains unclear and difficult to explain their decision-making process to an end user. In this paper, we address the explainable-AI problem for deep neural networks with our proposed framework, named IASSA, which uses an iterative and adaptive sampling module to generate an importance map indicating how salient each pixel is for the model's prediction. We employ an affinity matrix calculated on multi-level deep-learning features to explore long-range pixel-to-pixel correlations, which can shift the saliency values guided by our long-range and parameter-free spatial attention. Extensive experiments on the MS-COCO dataset show that our proposed approach matches or exceeds the performance of state-of-the-art black-box explanation methods.
Tasks Decision Making
Published 2019-12-18
URL https://arxiv.org/abs/1912.08387v1
PDF https://arxiv.org/pdf/1912.08387v1.pdf
PWC https://paperswithcode.com/paper/iterative-and-adaptive-sampling-with-spatial
Repo
Framework
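
One way to read the affinity-based attention is as propagating saliency mass along long-range feature similarities. The toy sketch below builds a cosine-similarity affinity matrix from per-pixel features, row-normalizes it, and uses it to shift an importance map; it is a simplification of, not a re-implementation of, the paper's parameter-free spatial attention, and the feature and map sizes are arbitrary.

import numpy as np

def propagate_saliency(features, saliency):
    # features: (H*W, C) per-pixel deep features; saliency: (H*W,) current importance map.
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    affinity = np.maximum(f @ f.T, 0.0)              # non-negative pairwise similarity
    affinity /= affinity.sum(axis=1, keepdims=True)  # row-normalize into a transition matrix
    return affinity @ saliency

rng = np.random.default_rng(0)
shifted = propagate_saliency(rng.standard_normal((64, 32)), rng.random(64))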

EdgeSegNet: A Compact Network for Semantic Segmentation

Title EdgeSegNet: A Compact Network for Semantic Segmentation
Authors Zhong Qiu Lin, Brendan Chwyl, Alexander Wong
Abstract In this study, we introduce EdgeSegNet, a compact deep convolutional neural network for the task of semantic segmentation. A human-machine collaborative design strategy is leveraged to create EdgeSegNet, where principled network design prototyping is coupled with machine-driven design exploration to create networks with customized module-level macroarchitecture and microarchitecture designs tailored for the task. Experimental results showed that EdgeSegNet can achieve semantic segmentation accuracy comparable with much larger and computationally complex networks (>20x smaller model size than RefineNet) while achieving an inference speed of ~38.5 FPS on an NVIDIA Jetson AGX Xavier. As such, the proposed EdgeSegNet is well-suited for low-power edge scenarios.
Tasks Semantic Segmentation
Published 2019-05-10
URL https://arxiv.org/abs/1905.04222v1
PDF https://arxiv.org/pdf/1905.04222v1.pdf
PWC https://paperswithcode.com/paper/edgesegnet-a-compact-network-for-semantic
Repo
Framework

Empirical Bayes Regret Minimization

Title Empirical Bayes Regret Minimization
Authors Chih-Wei Hsu, Branislav Kveton, Ofer Meshi, Martin Mladenov, Csaba Szepesvari
Abstract Most existing bandit algorithms are designed to have low regret on any problem instance. While celebrated, this approach is often conservative in practice because it ignores many intricate properties of problem instances. In this work, we pioneer the idea of minimizing an empirical approximation to the Bayes regret, the expected regret with respect to a distribution of problems. We focus on a tractable instance of this problem, the confidence interval and posterior width tuning, and propose a computationally and sample efficient algorithm for solving it. The tuning algorithm is analyzed and extensively evaluated in multi-armed, linear, and generalized linear bandits. We report several-fold reductions in regret for state-of-the-art bandit algorithms, simply by optimizing over a sample from a distribution.
Tasks
Published 2019-04-04
URL https://arxiv.org/abs/1904.02664v3
PDF https://arxiv.org/pdf/1904.02664v3.pdf
PWC https://paperswithcode.com/paper/empirical-bayes-regret-minimization
Repo
Framework
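
Concretely, the recipe is: sample problem instances from the assumed distribution, run the bandit algorithm with a candidate tuning parameter on each, and keep the parameter with the lowest average regret. The toy example below tunes the confidence width of UCB1 on sampled Bernoulli bandits; the horizon, candidate widths, and instance distribution are invented for illustration, and the paper's algorithm and analysis are considerably more refined.

import numpy as np

def run_ucb(means, horizon, width, rng):
    # UCB1 with a tunable confidence width; returns the realized pseudo-regret.
    k = len(means)
    counts, sums, regret = np.zeros(k), np.zeros(k), 0.0
    for t in range(horizon):
        if t < k:
            arm = t                                   # pull each arm once to initialize
        else:
            ucb = sums / counts + width * np.sqrt(np.log(t + 1) / counts)
            arm = int(np.argmax(ucb))
        sums[arm] += rng.binomial(1, means[arm])
        counts[arm] += 1
        regret += means.max() - means[arm]
    return regret

rng = np.random.default_rng(0)
instances = [rng.uniform(0.2, 0.8, size=5) for _ in range(50)]   # sampled problems
widths = [0.1, 0.5, 1.0, 2.0]                                    # candidate tunings
best = min(widths, key=lambda c: np.mean([run_ucb(mu, 2000, c, rng) for mu in instances]))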

Sorted Top-k in Rounds

Title Sorted Top-k in Rounds
Authors Mark Braverman, Jieming Mao, Yuval Peres
Abstract We consider the sorted top-$k$ problem whose goal is to recover the top-$k$ items with the correct order out of $n$ items using pairwise comparisons. In many applications, multiple rounds of interaction can be costly. We restrict our attention to algorithms with a constant number of rounds $r$ and try to minimize the sample complexity, i.e. the number of comparisons. When the comparisons are noiseless, we characterize how the optimal sample complexity depends on the number of rounds (up to a polylogarithmic factor for general $r$ and up to a constant factor for $r=1$ or 2). In particular, the sample complexity is $\Theta(n^2)$ for $r=1$, $\Theta(n\sqrt{k} + n^{4/3})$ for $r=2$ and $\tilde{\Theta}\left(n^{2/r} k^{(r-1)/r} + n\right)$ for $r \geq 3$. We extend our results of sorted top-$k$ to the noisy case where each comparison is correct with probability $2/3$. When $r=1$ or 2, we show that the sample complexity gets an extra $\Theta(\log(k))$ factor when we transition from the noiseless case to the noisy case. We also prove new results for top-$k$ and sorting in the noisy case. We believe our techniques can be generally useful for understanding the trade-off between round complexities and sample complexities of rank aggregation problems.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.05208v1
PDF https://arxiv.org/pdf/1906.05208v1.pdf
PWC https://paperswithcode.com/paper/sorted-top-k-in-rounds
Repo
Framework
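
To get a feel for these bounds, plug in some illustrative numbers (constants and log factors ignored): with n = 10^6 items and k = 100, one round costs on the order of n^2 = 10^12 comparisons, two rounds on the order of n*sqrt(k) + n^(4/3), roughly 10^7 + 10^8, and three rounds on the order of n^(2/3) * k^(2/3) + n, roughly 2 x 10^5 + 10^6. Each additional round of adaptivity thus buys orders of magnitude in sample complexity.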

Document classification methods

Title Document classification methods
Authors Madjid Khalilian, Shiva Hassanzadeh
Abstract Information collected by users across different fields requires appropriate management and organization so that it can be structured in a standard way and retrieved quickly and easily. Document classification is a conventional method for separating texts by subject across scientific articles, web pages, and digital libraries. Many methods and techniques have been proposed for document classification, each with its own advantages and deficiencies. In this paper, several unsupervised and supervised document classification methods are studied and compared.
Tasks Document Classification
Published 2019-09-16
URL https://arxiv.org/abs/1909.07368v1
PDF https://arxiv.org/pdf/1909.07368v1.pdf
PWC https://paperswithcode.com/paper/document-classification-methods
Repo
Framework

Benchmarking the Neural Linear Model for Regression

Title Benchmarking the Neural Linear Model for Regression
Authors Sebastian W. Ober, Carl Edward Rasmussen
Abstract The neural linear model is a simple adaptive Bayesian linear regression method that has recently been used in a number of problems ranging from Bayesian optimization to reinforcement learning. Despite its apparent successes in these settings, to the best of our knowledge there has been no systematic exploration of its capabilities on simple regression tasks. In this work we characterize these on the UCI datasets, a popular benchmark for Bayesian regression models, as well as on the recently introduced UCI “gap” datasets, which are better tests of out-of-distribution uncertainty. We demonstrate that the neural linear model is a simple method that shows generally good performance on these tasks, but at the cost of requiring good hyperparameter tuning.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08416v1
PDF https://arxiv.org/pdf/1912.08416v1.pdf
PWC https://paperswithcode.com/paper/benchmarking-the-neural-linear-model-for
Repo
Framework
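
The model itself is easy to state: train a network, freeze it, and place a Bayesian linear regression on its last-hidden-layer features. The NumPy sketch below implements only that second stage, with random features standing in for a trained network's activations and the prior and noise precisions (alpha, beta) fixed arbitrarily rather than tuned, which the abstract notes is where the method's performance actually hinges.

import numpy as np

def neural_linear_posterior(phi, y, alpha=1.0, beta=25.0):
    # Bayesian linear regression on fixed features phi(x).
    # alpha: prior precision on the weights, beta: observation noise precision.
    d = phi.shape[1]
    precision = alpha * np.eye(d) + beta * phi.T @ phi
    cov = np.linalg.inv(precision)
    mean = beta * cov @ phi.T @ y
    return mean, cov

def predict(phi_star, mean, cov, beta=25.0):
    mu = phi_star @ mean
    var = 1.0 / beta + np.einsum('ij,jk,ik->i', phi_star, cov, phi_star)
    return mu, var            # predictive mean and variance per test point

rng = np.random.default_rng(0)
phi = rng.standard_normal((100, 16))                 # stand-in for learned features
y = phi @ rng.standard_normal(16) + 0.2 * rng.standard_normal(100)
mean, cov = neural_linear_posterior(phi, y)
mu, var = predict(phi[:5], mean, cov)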

Informing Unsupervised Pretraining with External Linguistic Knowledge

Title Informing Unsupervised Pretraining with External Linguistic Knowledge
Authors Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš
Abstract Unsupervised pretraining models have been shown to facilitate a wide range of downstream applications. These models, however, still encode only the distributional knowledge, incorporated through language modeling objectives. In this work, we complement the encoded distributional knowledge with external lexical knowledge. We generalize the recently proposed (state-of-the-art) unsupervised pretraining model BERT to a multi-task learning setting: we couple BERT’s masked language modeling and next sentence prediction objectives with the auxiliary binary word relation classification, through which we inject clean linguistic knowledge into the model. Our initial experiments suggest that our “linguistically-informed” BERT (LIBERT) yields performance gains over the linguistically-blind “vanilla” BERT on several language understanding tasks.
Tasks Language Modelling, Multi-Task Learning, Relation Classification
Published 2019-09-05
URL https://arxiv.org/abs/1909.02339v1
PDF https://arxiv.org/pdf/1909.02339v1.pdf
PWC https://paperswithcode.com/paper/informing-unsupervised-pretraining-with
Repo
Framework
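
The mechanism is plain multi-task learning: a shared encoder serves both the pretraining objective and an auxiliary binary word-relation classifier, and the two losses are summed. The PyTorch sketch below shows that wiring with tiny stand-in modules; it is not BERT, and the layer sizes, inputs, and unweighted loss sum are assumptions made purely for illustration.

import torch
import torch.nn as nn

class MultiTaskToy(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(32, hidden), nn.ReLU())   # shared encoder
        self.main_head = nn.Linear(hidden, 10)            # stand-in for the LM objective
        self.relation_head = nn.Linear(2 * hidden, 2)     # binary word-relation classifier

    def forward(self, x, pair):
        h, hp = self.encoder(x), self.encoder(pair)
        return self.main_head(h), self.relation_head(torch.cat([h, hp], dim=-1))

model = MultiTaskToy()
x, pair = torch.randn(8, 32), torch.randn(8, 32)
main_logits, rel_logits = model(x, pair)
loss = nn.functional.cross_entropy(main_logits, torch.randint(0, 10, (8,))) \
     + nn.functional.cross_entropy(rel_logits, torch.randint(0, 2, (8,)))
loss.backward()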

Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting

Title Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting
Authors Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins, Chicheng Zhang
Abstract We study contextual bandit learning with an abstract policy class and continuous action space. We obtain two qualitatively different regret bounds: one competes with a smoothed version of the policy class under no continuity assumptions, while the other requires standard Lipschitz assumptions. Both bounds exhibit data-dependent “zooming” behavior and, with no tuning, yield improved guarantees for benign problems. We also study adapting to unknown smoothness parameters, establishing a price-of-adaptivity and deriving optimal adaptive algorithms that require no additional information.
Tasks Multi-Armed Bandits
Published 2019-02-05
URL https://arxiv.org/abs/1902.01520v3
PDF https://arxiv.org/pdf/1902.01520v3.pdf
PWC https://paperswithcode.com/paper/contextual-bandits-with-continuous-actions
Repo
Framework
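
The smoothing device can be made concrete: treat a deterministic continuous-action policy as if it played uniformly in a band of width 2h around its chosen action, which gives it a well-defined density and lets standard importance weighting go through. The sketch below evaluates such a smoothed policy off-policy on synthetic data logged uniformly on [0, 1]; it illustrates the smoothing idea only, not the paper's online algorithms, zooming behavior, or adaptivity results, and it ignores boundary truncation of the band.

import numpy as np

def smoothed_ips_value(logged, policy, h):
    # logged: iterable of (context, action, reward, logging_density) tuples.
    # The smoothed policy's density at a logged action is 1/(2h) inside the band, else 0.
    total = 0.0
    for x, a, r, p_log in logged:
        density = (1.0 / (2 * h)) if abs(a - policy(x)) <= h else 0.0
        total += r * density / p_log
    return total / len(logged)

rng = np.random.default_rng(0)
logged = [(x, rng.uniform(0.0, 1.0), rng.random(), 1.0) for x in rng.random(1000)]
value = smoothed_ips_value(logged, lambda x: 0.5 * x + 0.25, h=0.1)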

Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training

Title Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
Authors Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, David Nellans, Puneet Gupta
Abstract Deploying deep learning (DL) models across multiple compute devices to train large and complex models continues to grow in importance because of the demand for faster and more frequent training. Data parallelism (DP) is the most widely used parallelization strategy, but as the number of devices in data parallel training grows, so does the communication overhead between devices. Additionally, a larger aggregate batch size per step leads to statistical efficiency loss, i.e., a larger number of epochs are required to converge to a desired accuracy. These factors affect overall training time and beyond a certain number of devices, the speedup from leveraging DP begins to scale poorly. In addition to DP, each training step can be accelerated by exploiting model parallelism (MP). This work explores hybrid parallelization, where each data parallel worker is comprised of more than one device, across which the model dataflow graph (DFG) is split using MP. We show that at scale, hybrid training will be more effective at minimizing end-to-end training time than exploiting DP alone. We project that for Inception-V3, GNMT, and BigLSTM, the hybrid strategy provides an end-to-end training speedup of at least 26.5%, 8%, and 22% respectively compared to what DP alone can achieve at scale.
Tasks
Published 2019-07-30
URL https://arxiv.org/abs/1907.13257v1
PDF https://arxiv.org/pdf/1907.13257v1.pdf
PWC https://paperswithcode.com/paper/optimizing-multi-gpu-parallelization
Repo
Framework
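
The trade-off the paper explores can be sketched with a toy cost model: splitting each data-parallel worker across several GPUs (model parallelism) shortens per-step compute and shrinks the number of workers that must synchronize gradients, at the price of extra intra-worker communication. Every number below is invented purely for illustration and has no connection to the paper's measurements; the point is the shape of the comparison, not the values.

def step_time(mp_devices, dp_workers, compute, allreduce_per_worker, mp_overhead):
    # Toy model: compute splits across mp_devices with a per-split overhead,
    # and gradient synchronization is assumed to grow with the number of workers.
    worker_compute = compute / mp_devices + mp_overhead * (mp_devices - 1)
    gradient_sync = allreduce_per_worker * dp_workers
    return worker_compute + gradient_sync

total_gpus = 16
for mp in (1, 2, 4):                       # mp == 1 is pure data parallelism
    dp = total_gpus // mp
    print(f"MP={mp} DP={dp} step_time={step_time(mp, dp, 100.0, 2.0, 8.0):.1f}")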