January 28, 2020

Paper Group ANR 1006

Cross-referencing Social Media and Public Surveillance Camera Data for Disaster Response. Deep Learning and Control Algorithms of Direct Perception for Autonomous Driving. DCEF: Deep Collaborative Encoder Framework for Unsupervised Clustering. Butterfly Transform: An Efficient FFT Based Neural Architecture Design. A Deep Q-Learning Method for Downl …


Title	Cross-referencing Social Media and Public Surveillance Camera Data for Disaster Response
Authors	Chittayong Surakitbanharn, Calvin Yau, Guizhen Wang, Aniesh Chawla, Yinuo Pan, Zhaoya Sun, Sam Yellin, David Ebert, Yung-Hsiang Lu, George K. Thiruvathukal
Abstract	Physical media (like surveillance cameras) and social media (like Instagram and Twitter) may both be useful in attaining on-the-ground information during an emergency or disaster situation. However, the intersection and reliability of both surveillance cameras and social media during a natural disaster are not fully understood. To address this gap, we tested whether social media is of utility when physical surveillance cameras went off-line during Hurricane Irma in 2017. Specifically, we collected and compared geo-tagged Instagram and Twitter posts in the state of Florida during times and in areas where public surveillance cameras went off-line. We report social media content and frequency to determine the utility for emergency managers or first responders during a natural disaster.
Tasks
Published	2019-01-19
URL	http://arxiv.org/abs/1901.06459v1
PDF	http://arxiv.org/pdf/1901.06459v1.pdf
PWC	https://paperswithcode.com/paper/cross-referencing-social-media-and-public
Repo
Framework

Deep Learning and Control Algorithms of Direct Perception for Autonomous Driving


Title	Deep Learning and Control Algorithms of Direct Perception for Autonomous Driving
Authors	Der-Hau Lee, Kuan-Lin Chen, Kuan-Han Liou, Chang-Lun Liu, Jinn-Liang Liu
Abstract	Based on the direct perception paradigm of autonomous driving, we investigate and modify the CNNs (convolutional neural networks) AlexNet and GoogLeNet that map an input image to a few perception indicators (heading angle, distances to preceding cars, and distance to road centerline) for estimating driving affordances in highway traffic. We also design a controller with these indicators and the short-range sensor information of TORCS (the open racing car simulator) for driving simulated cars to avoid collisions. We collect a set of images from a TORCS camera in various driving scenarios, train these CNNs using the dataset, test them in unseen traffic, and find that they perform better than earlier algorithms and controllers in terms of training efficiency and driving stability. Source code and data are available on our website.
Tasks	Autonomous Driving
Published	2019-10-26
URL	https://arxiv.org/abs/1910.12031v2
PDF	https://arxiv.org/pdf/1910.12031v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-and-control-algorithms-of
Repo
Framework

DCEF: Deep Collaborative Encoder Framework for Unsupervised Clustering


Title	DCEF: Deep Collaborative Encoder Framework for Unsupervised Clustering
Authors	Jielei Chu, Hongjun Wang, Jing Liu, Zeng Yu, Tianrui Li
Abstract	Collaborative representation is a popular feature learning approach whose encoding process is assisted by various types of information. In this paper, we propose a collaborative representation restricted Boltzmann Machine (CRRBM) for modeling binary data and a collaborative representation Gaussian restricted Boltzmann Machine (CRGRBM) for modeling real-valued data by applying a collaborative representation strategy in the encoding procedure. We utilize Locality Sensitive Hashing (LSH) to generate similar sample subsets of the instance and observed feature set simultaneously from input data. Hence, we can obtain some mini blocks, which come from the intersection of instance and observed feature subsets. Then we integrate Contrastive Divergence and Bregman Divergence methods with mini blocks to optimize our CRRBM and CRGRBM models. In their training process, the complex collaborative relationships between multiple instances and features are fused into the hidden layer encoding. Hence, these encodings have dual characteristics of concealment and cooperation. Here, we develop two deep collaborative encoder frameworks (DCEF) based on the CRRBM and CRGRBM models: one is a DCEF with Gaussian linear visible units (GDCEF) for modeling real-valued data, and the other is a DCEF with binary visible units (BDCEF) for modeling binary data. We explore the collaborative representation capability of the hidden features in every layer of the GDCEF and BDCEF frameworks, especially in the deepest hidden layer. The experimental results show that the GDCEF and BDCEF frameworks outperform the classic Autoencoder framework on unsupervised clustering tasks on the MSRA-MM2.0 and UCI datasets, respectively.
Tasks
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05173v1
PDF	https://arxiv.org/pdf/1906.05173v1.pdf
PWC	https://paperswithcode.com/paper/dcef-deep-collaborative-encoder-framework-for
Repo
Framework

Butterfly Transform: An Efficient FFT Based Neural Architecture Design


Title	Butterfly Transform: An Efficient FFT Based Neural Architecture Design
Authors	Keivan Alizadeh, Ali Farhadi, Mohammad Rastegari
Abstract	In this paper, we introduce the Butterfly Transform (BFT), a lightweight channel fusion method that reduces the computational complexity of point-wise convolutions from O(n^2) of conventional solutions to O(n log n) with respect to the number of channels while improving the accuracy of the networks under the same range of FLOPs. The proposed BFT generalizes the Discrete Fourier Transform in a way that its parameters are learned at training time. Our experimental evaluations show that replacing channel fusion modules with BFT results in significant accuracy gains at similar FLOPs across a wide range of network architectures. For example, replacing channel fusion convolutions with BFT offers 3% absolute top-1 improvement for MobileNetV1-0.25 and 2.5% for ShuffleNet V2-0.5 while maintaining the same number of FLOPs. Notably, the ShuffleNet-V2+BFT outperforms state-of-the-art architecture search methods MNasNet and FBNet. We also show that the structure imposed by BFT has interesting properties that ensure the efficacy of the resulting network.
Tasks	Neural Architecture Search
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02256v1
PDF	https://arxiv.org/pdf/1906.02256v1.pdf
PWC	https://paperswithcode.com/paper/butterfly-transform-an-efficient-fft-based
Repo
Framework
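
The O(n log n) cost quoted in the abstract above comes from the FFT-style butterfly structure. The following is a minimal sketch of that idea, not the authors' implementation: the pairing rule (channel i with channel i XOR 2^s at stage s), the two weights per output, and the helper names `butterfly_fuse`, `random_weights`, and `multiply_counts` are all assumptions made for illustration.

```python
import random

def butterfly_fuse(x, weights):
    """Mix n channel values (n a power of two) through log2(n) butterfly stages.

    weights[s][i] is a pair (a, b); stage s computes
    out[i] = a * x[i] + b * x[i XOR 2**s], so each stage costs 2n multiplies.
    """
    n = len(x)
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("number of channels must be a power of two")
    for s in range(stages):
        stride = 1 << s
        x = [weights[s][i][0] * x[i] + weights[s][i][1] * x[i ^ stride]
             for i in range(n)]
    return x

def random_weights(n, seed=0):
    """Hypothetical stand-in for learned parameters: one (a, b) pair per channel per stage."""
    rng = random.Random(seed)
    stages = n.bit_length() - 1
    return [[(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(stages)]

def multiply_counts(n):
    """Multiplies per spatial position: (butterfly, dense point-wise convolution)."""
    return 2 * n * (n.bit_length() - 1), n * n

n = 8
fused = butterfly_fuse([1.0] * n, random_weights(n))
print(len(fused), multiply_counts(n), multiply_counts(64))
```

After log2(n) stages every output channel depends on every input channel, which is why the structure can stand in for a dense channel-mixing layer while using far fewer multiplies as n grows.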

A Deep Q-Learning Method for Downlink Power Allocation in Multi-Cell Networks


Title	A Deep Q-Learning Method for Downlink Power Allocation in Multi-Cell Networks
Authors	Kazi Ishfaq Ahmed, Ekram Hossain
Abstract	Optimal resource allocation is a fundamental challenge for dense and heterogeneous wireless networks with massive wireless connections. Because of the non-convex nature of the optimization problem, it is computationally demanding to obtain the optimal resource allocation. Recently, deep reinforcement learning (DRL) has emerged as a promising technique in solving non-convex optimization problems. Unlike deep learning (DL), DRL does not require any optimal/near-optimal training dataset, which is either unavailable or computationally expensive to generate as synthetic data. In this paper, we propose a novel centralized DRL-based downlink power allocation scheme for a multi-cell system intending to maximize the total network throughput. Specifically, we apply a deep Q-learning (DQL) approach to achieve a near-optimal power allocation policy. For benchmarking the proposed approach, we use a Genetic Algorithm (GA) to obtain a near-optimal power allocation solution. Simulation results show that the proposed DRL-based power allocation scheme performs better compared to the conventional power allocation schemes in a multi-cell scenario.
Tasks	Q-Learning
Published	2019-04-30
URL	http://arxiv.org/abs/1904.13032v1
PDF	http://arxiv.org/pdf/1904.13032v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-q-learning-method-for-downlink-power
Repo
Framework
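
For readers unfamiliar with the Q-learning rule that deep Q-learning builds on, here is a minimal tabular sketch. It is not the paper's scheme: the states `s0` and `s1`, the actions `low` and `high` (standing in for transmit power levels), the reward, and the values of `alpha` and `gamma` are hypothetical, and a deep Q-learning agent would replace the table with a neural network.

```python
def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]

# Hypothetical two-state table; actions stand in for discrete power levels.
q = {
    "s0": {"low": 0.0, "high": 0.0},
    "s1": {"low": 1.0, "high": 2.0},
}
new_value = q_update(q, "s0", "high", reward=1.0, next_state="s1")
print(new_value)
```

In a power allocation setting the reward would typically be tied to network throughput, but how the paper defines states and rewards is not stated in the abstract.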

Inductive Bias-driven Reinforcement Learning For Efficient Schedules in Heterogeneous Clusters


Title	Inductive Bias-driven Reinforcement Learning For Efficient Schedules in Heterogeneous Clusters
Authors	Subho S Banerjee, Saurabh Jha, Ravishankar K. Iyer
Abstract	The problem of scheduling workloads onto heterogeneous processors (e.g., CPUs, GPUs, FPGAs) is of fundamental importance in modern datacenters. Most current approaches rely on building application/system-specific heuristics that have to be reinvented on a case-by-case basis. This can be prohibitively expensive and is untenable going forward. In this paper, we propose a domain-driven reinforcement learning (RL) model for scheduling that can be broadly applied to a large class of heterogeneous processors. The key novelty of our approach is (i) the RL model; and (ii) the significant reduction of training data (using domain knowledge) and training time (using sampling-based end-to-end gradient propagation). We demonstrate the approach using real-world GPU and FPGA accelerated applications to produce scheduling policies that significantly outperform hand-tuned heuristics.
Tasks
Published	2019-09-04
URL	https://arxiv.org/abs/1909.02119v1
PDF	https://arxiv.org/pdf/1909.02119v1.pdf
PWC	https://paperswithcode.com/paper/inductive-bias-driven-reinforcement-learning
Repo
Framework

Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation


Title	Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation
Authors	Dmitry Akimov
Abstract	In this paper, we describe the physics-based environment of the NeurIPS 2019 Learning to Move - Walk Around challenge and present our solution to this competition, which scored 1303.727 mean reward points and took 3rd place. Our method combines recent advances from both continuous- and discrete-action space reinforcement learning, such as Soft Actor-Critic and Recurrent Experience Replay in Distributed Reinforcement Learning. We trained our agent in two stages: to move somewhere at the first stage and to follow the target velocity field at the second stage. We also introduce a novel Q-function split technique, which we believe facilitates the task of training an agent, allows critic pretraining and reusing it for solving harder problems, and mitigates reward shaping design efforts.
Tasks
Published	2019-11-29
URL	https://arxiv.org/abs/1911.13056v1
PDF	https://arxiv.org/pdf/1911.13056v1.pdf
PWC	https://paperswithcode.com/paper/distributed-soft-actor-critic-with
Repo
Framework

Lesson Learnt: Modularization of Deep Networks Allow Cross-Modality Reuse


Title	Lesson Learnt: Modularization of Deep Networks Allow Cross-Modality Reuse
Authors	Weilin Fu, Lennart Husvogt, Stefan Ploner, James G. Fujimoto, Andreas Maier
Abstract	Fundus photography and Optical Coherence Tomography Angiography (OCT-A) are two commonly used modalities in ophthalmic imaging. With the development of deep learning algorithms, fundus image processing, especially retinal vessel segmentation, has been extensively studied. Built upon the known operator theory, interpretable deep network pipelines with well-defined modules have been constructed on fundus images. In this work, we firstly train a modularized network pipeline for the task of retinal vessel segmentation on the fundus database DRIVE. The pretrained preprocessing module from the pipeline is then directly transferred onto OCT-A data for image quality enhancement without further fine-tuning. Output images show that the preprocessing net can balance the contrast, suppress noise and thereby produce vessel trees with improved connectivity in both image modalities. The visual impression is confirmed by an observer study with five OCT-A experts. Statistics of the grades by the experts indicate that the transferred module improves both the image quality and the diagnostic quality. Our work provides an example that modules within network pipelines that are built upon the known operator theory facilitate cross-modality reuse without additional training or transfer learning.
Tasks	Retinal Vessel Segmentation, Transfer Learning
Published	2019-11-05
URL	https://arxiv.org/abs/1911.02080v1
PDF	https://arxiv.org/pdf/1911.02080v1.pdf
PWC	https://paperswithcode.com/paper/lesson-learnt-modularization-of-deep-networks
Repo
Framework

A new constraint programming model and a linear programming-based adaptive large neighborhood search for the vehicle routing problem with synchronization constraints


Title	A new constraint programming model and a linear programming-based adaptive large neighborhood search for the vehicle routing problem with synchronization constraints
Authors	Minh Hoàng Hà, Tat Dat Nguyen, Thinh Nguyen Duy, Hoang Giang Pham, Thuy Do, Louis-Martin Rousseau
Abstract	We consider a vehicle routing problem which seeks to minimize cost subject to time window and synchronization constraints. In this problem, the fleet of vehicles is categorized into regular and special vehicles. Some customers require both vehicles’ services, whose starting service times at the customer are synchronized. Despite its important real-world application, this problem has rarely been studied in the literature. To solve the problem, we propose a Constraint Programming (CP) model and an Adaptive Large Neighborhood Search (ALNS) in which the design of insertion operators is based on solving linear programming (LP) models to check the insertion feasibility. A number of acceleration techniques are also proposed to significantly reduce the computational time. The computational experiments show that our new CP model finds better solutions than an existing CP-based ALNS, when used on small instances with 25 customers and with a much shorter running time. Our LP-based ALNS dominates the CP-based ALNS in terms of solution quality, providing solutions with better objective values, on average, for all instance classes. This demonstrates the advantage of using linear programming instead of constraint programming when dealing with a variant of vehicle routing problems with relatively tight constraints, which is often considered to be more favorable for CP-based methods.
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.13513v1
PDF	https://arxiv.org/pdf/1910.13513v1.pdf
PWC	https://paperswithcode.com/paper/a-new-constraint-programming-model-and-a
Repo
Framework

Investigation on N-gram Approximated RNNLMs for Recognition of Morphologically Rich Speech


Title	Investigation on N-gram Approximated RNNLMs for Recognition of Morphologically Rich Speech
Authors	Balázs Tarján, György Szaszák, Tibor Fegyó, Péter Mihajlik
Abstract	Recognition of Hungarian conversational telephone speech is challenging due to the informal style and morphological richness of the language. A Recurrent Neural Network Language Model (RNNLM) can provide a remedy for the high perplexity of the task; however, two-pass decoding introduces a considerable processing delay. In order to eliminate this delay, we investigate approaches aiming at the complexity reduction of RNNLM while preserving its accuracy. We compare the performance of conventional back-off n-gram language models (BNLM), BNLM approximation of RNNLMs (RNN-BNLM) and RNN n-grams in terms of perplexity and word error rate (WER). Morphological richness is often addressed by using statistically derived subwords - morphs - in the language models, hence our investigations are extended to morph-based models as well. We found that using RNN-BNLMs, 40% of the RNNLM perplexity reduction can be recovered, which is roughly equal to the performance of an RNN 4-gram model. Combining morph-based modeling and approximation of RNNLM, we were able to achieve 8% relative WER reduction and preserve real-time operation of our conversational telephone speech recognition system.
Tasks	Language Modelling, Speech Recognition
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06407v3
PDF	https://arxiv.org/pdf/1907.06407v3.pdf
PWC	https://paperswithcode.com/paper/investigation-on-n-gram-approximated-rnnlms
Repo
Framework
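
The abstract above compares language models by perplexity and reports that an approximation "recovers" a share of the RNNLM's perplexity reduction. A minimal sketch of both quantities follows. The function names and all numeric inputs are hypothetical and are not taken from the paper; "recovered fraction" is interpreted here as the approximated model's perplexity gain over the baseline divided by the full RNNLM's gain, which is an assumption about the authors' definition.

```python
import math

def perplexity(token_probs):
    """Perplexity of a token sequence given the model's probability for each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

def recovered_fraction(ppl_baseline, ppl_approx, ppl_rnnlm):
    """Share of the baseline-to-RNNLM perplexity reduction kept by the approximation."""
    return (ppl_baseline - ppl_approx) / (ppl_baseline - ppl_rnnlm)

# Made-up numbers purely for illustration.
uniform_ppl = perplexity([0.25, 0.25, 0.25, 0.25])
share = recovered_fraction(ppl_baseline=100.0, ppl_approx=92.0, ppl_rnnlm=80.0)
print(uniform_ppl, share)
```

A model that assigns probability 1/k to every token has perplexity k, which makes the uniform case a convenient sanity check.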

Fully Quantized Transformer for Machine Translation


Title	Fully Quantized Transformer for Machine Translation
Authors	Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
Abstract	State-of-the-art neural machine translation methods employ massive amounts of parameters. Drastically reducing computational costs of such methods without affecting performance has been up to this point unsuccessful. To this end, we propose FullyQT: an all-inclusive quantization strategy for the Transformer. To the best of our knowledge, we are the first to show that it is possible to avoid any loss in translation quality with a fully quantized Transformer. Indeed, compared to full precision, our 8-bit models achieve equal or higher BLEU scores on most tasks. Comparing ourselves to all previously proposed methods, we achieve state-of-the-art quantization results.
Tasks	Machine Translation, Quantization
Published	2019-10-17
URL	https://arxiv.org/abs/1910.10485v3
PDF	https://arxiv.org/pdf/1910.10485v3.pdf
PWC	https://paperswithcode.com/paper/fully-quantized-transformer-for-improved
Repo
Framework
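
To make the idea of 8-bit quantization concrete, here is a generic uniform (min/max) quantizer. It is a sketch under stated assumptions and is not the FullyQT scheme, whose details are not given in the abstract; the function names `quantize` and `dequantize` and the sample values are hypothetical.

```python
def quantize(values, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] using a min/max range."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Guard against a zero range when all values are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [c * scale + lo for c in codes]

weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize(weights)
restored = dequantize(codes, scale, lo)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, max_error)
```

With this scheme the round-trip error per value is bounded by half a quantization step, which is the kind of precision loss a quantization-aware method has to keep from degrading translation quality.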

Normal Estimation for 3D Point Clouds via Local Plane Constraint and Multi-scale Selection


Title	Normal Estimation for 3D Point Clouds via Local Plane Constraint and Multi-scale Selection
Authors	Jun Zhou, Hua Huang, Bin Liu, Xiuping Liu
Abstract	In this paper, we propose a normal estimation method for unstructured 3D point clouds. In this method, a feature constraint mechanism called Local Plane Features Constraint (LPFC) is used and then a multi-scale selection strategy is introduced. The LPFC can be used in a single-scale point network architecture for a more stable normal estimation of the unstructured 3D point clouds. In particular, it can partly overcome the influence of noise on a large sampling scale compared to the other methods which only use regression loss for normal estimation. In more detail, a subnetwork is built after the point-wise feature extraction layers of the network, and it gives more constraints to each point of the local patch via a binary classifier at the end. Then we use multi-task optimization to train the normal estimation and local plane classification tasks simultaneously. Also, to integrate the advantages of multi-scale results, a scale selection strategy is adopted, which is a data-driven approach for selecting the optimal scale around each point and encourages subnetwork specialization. Specifically, we employed a subnetwork called Scale Estimation Network to extract scale weight information from multi-scale features. More analysis is given about the relations between noise levels, local boundary, and scales in the experiment. These relationships can be a better guide to choosing particular scales for a particular model. Besides, the experimental result shows that our network can distinguish the points on the fitting plane accurately and this can be used to guide the normal estimation, and our multi-scale method can improve the results well. Compared to some state-of-the-art surface normal estimators, our method is robust to noise and can achieve competitive results.
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.08537v1
PDF	https://arxiv.org/pdf/1910.08537v1.pdf
PWC	https://paperswithcode.com/paper/normal-estimation-for-3d-point-clouds-via
Repo
Framework

Graphon Estimation from Partially Observed Network Data


Title	Graphon Estimation from Partially Observed Network Data
Authors	Soumendu Sundar Mukherjee, Sayak Chakrabarti
Abstract	We consider estimating the edge-probability matrix of a network generated from a graphon model when the full network is not observed—only some overlapping subgraphs are. We extend the neighbourhood smoothing (NBS) algorithm of Zhang et al. (2017) to this missing-data set-up and show experimentally that, for a wide range of graphons, the extended NBS algorithm achieves significantly smaller error rates than standard graphon estimation algorithms such as vanilla neighbourhood smoothing (NBS), universal singular value thresholding (USVT), blockmodel approximation, matrix completion, etc. We also show that the extended NBS algorithm is much more robust to missing data.
Tasks	Graphon Estimation, Matrix Completion
Published	2019-06-02
URL	https://arxiv.org/abs/1906.00494v2
PDF	https://arxiv.org/pdf/1906.00494v2.pdf
PWC	https://paperswithcode.com/paper/190600494
Repo
Framework

Improving Differentiable Neural Computers Through Memory Masking, De-allocation, and Link Distribution Sharpness Control


Title	Improving Differentiable Neural Computers Through Memory Masking, De-allocation, and Link Distribution Sharpness Control
Authors	Róbert Csordás, Jürgen Schmidhuber
Abstract	The Differentiable Neural Computer (DNC) can learn algorithmic and question answering tasks. An analysis of its internal activation patterns reveals three problems: Most importantly, the lack of key-value separation makes the address distribution resulting from content-based look-up noisy and flat, since the value influences the score calculation, although only the key should. Second, DNC’s de-allocation of memory results in aliasing, which is a problem for content-based look-up. Third, chaining memory reads with the temporal linkage matrix exponentially degrades the quality of the address distribution. Our proposed fixes of these problems yield improved performance on arithmetic tasks, and also improve the mean error rate on the bAbI question answering dataset by 43%.
Tasks	Question Answering
Published	2019-04-23
URL	http://arxiv.org/abs/1904.10278v1
PDF	http://arxiv.org/pdf/1904.10278v1.pdf
PWC	https://paperswithcode.com/paper/improving-differentiable-neural-computers-1
Repo
Framework
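
The first problem named in the abstract above, that value content flattens the address distribution of content-based look-up, can be reproduced in a few lines. The sketch below uses cosine-similarity addressing with a softmax and an optional mask that restricts the comparison to the key part of each memory cell. The mask form, the sharpness `beta`, and the toy memory layout (two key dimensions followed by two value dimensions) are assumptions for illustration, not the paper's exact formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def content_address(memory, key, beta=5.0, mask=None):
    """Softmax over beta-scaled cosine similarities between the key and each memory row.

    If a mask is given, both the key and the rows are multiplied by it first,
    so masked-out dimensions do not influence the scores.
    """
    if mask is None:
        mask = [1.0] * len(key)
    masked_key = [k * m for k, m in zip(key, mask)]
    scores = [beta * cosine([c * m for c, m in zip(row, mask)], masked_key)
              for row in memory]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy memory: first two entries act as the key, last two as the stored value.
memory = [[1.0, 0.0, 5.0, 5.0],
          [0.0, 1.0, 5.0, 5.0]]
key = [1.0, 0.0, 0.0, 0.0]

flat = content_address(memory, key)
sharp = content_address(memory, key, mask=[1.0, 1.0, 0.0, 0.0])
print(flat, sharp)
```

Without the mask, the large value entries dominate each row's norm and pull the two scores together; with the mask, the look-up concentrates on the row whose key matches.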

Non-imaging single-pixel sensing with optimized binary modulation


Title	Non-imaging single-pixel sensing with optimized binary modulation
Authors	Hao Fu, Liheng Bian, Jun Zhang
Abstract	The conventional high-level sensing techniques require high-fidelity images as input to extract target features, which are produced by either complex imaging hardware or high-complexity reconstruction algorithms. In this letter, we propose single-pixel sensing (SPS) that performs high-level sensing directly from coupled measurements of a single-pixel detector, without the conventional image acquisition and reconstruction process. The technique consists of three steps: binary light modulation that can be physically implemented at ~22 kHz, single-pixel coupled detection with a wide working spectrum and high signal-to-noise ratio, and end-to-end deep-learning-based sensing that reduces both hardware and software complexity. Besides, the binary modulation is trained and optimized together with the sensing network, which ensures the fewest required measurements and optimal sensing accuracy. The effectiveness of SPS is demonstrated on the handwritten-digit classification task of the MNIST dataset, and 96.68% classification accuracy at ~1 kHz is achieved. The reported single-pixel sensing technique is a novel framework for highly efficient machine intelligence.
Tasks	Image Classification
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11498v2
PDF	https://arxiv.org/pdf/1909.11498v2.pdf
PWC	https://paperswithcode.com/paper/non-imaging-single-pixel-sensing-with
Repo
Framework
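
The "coupled measurements" in the abstract above can be pictured as inner products between the scene and a sequence of binary modulation patterns, each producing one detector reading. The sketch below shows only that measurement step, with a hypothetical 2x2 scene and hand-picked patterns; in the paper the patterns are learned jointly with the sensing network, which is not reproduced here.

```python
def single_pixel_measurements(image, patterns):
    """Return one detector reading per binary pattern.

    Each reading is the sum of the pixel intensities that the pattern lets through.
    """
    flat = [pixel for row in image for pixel in row]
    for pattern in patterns:
        if len(pattern) != len(flat):
            raise ValueError("pattern size must match the number of pixels")
        if any(bit not in (0, 1) for bit in pattern):
            raise ValueError("patterns must be binary")
    return [sum(bit * pixel for bit, pixel in zip(pattern, flat))
            for pattern in patterns]

# Hypothetical 2x2 scene and three binary patterns (row-major order).
image = [[1, 2],
         [3, 4]]
patterns = [
    [1, 0, 1, 0],  # left column only
    [0, 1, 0, 1],  # right column only
    [1, 1, 1, 1],  # everything
]
readings = single_pixel_measurements(image, patterns)
print(readings)
```

A classifier would then take the vector of readings as input directly, so the number of patterns, rather than the number of pixels, sets the measurement budget.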