April 2, 2020


Paper Group ANR 118

Closed-loop Parameter Identification of Linear Dynamical Systems through the Lens of Feedback Channel Coding Theory. A Big Data Enabled Channel Model for 5G Wireless Communication Systems. Learning Cross-domain Generalizable Features by Representation Disentanglement. Meta Segmentation Network for Ultra-Resolution Medical Images. Variational infere …

Closed-loop Parameter Identification of Linear Dynamical Systems through the Lens of Feedback Channel Coding Theory


Title	Closed-loop Parameter Identification of Linear Dynamical Systems through the Lens of Feedback Channel Coding Theory
Authors	Ali Reza Pedram, Takashi Tanaka
Abstract	This paper considers the problem of closed-loop identification of linear scalar systems with Gaussian process noise, where the system input is determined by a deterministic state feedback policy. The regularized least-square estimate (LSE) algorithm is adopted, seeking to find the best estimate of unknown model parameters based on noiseless measurements of the state. We are interested in the fundamental limitation of the rate at which unknown parameters can be learned, in the sense of the D-optimality scalarization criterion subject to a quadratic control cost. We first establish a novel connection between a closed-loop identification problem of interest and a channel coding problem involving an additive white Gaussian noise (AWGN) channel with feedback and a certain structural constraint. Based on this connection, we show that the learning rate is fundamentally upper bounded by the capacity of the corresponding AWGN channel. Although the optimal design of the feedback policy remains challenging, we derive conditions under which the upper bound is achieved. Finally, we show that the obtained upper bound implies that super-linear convergence is unattainable for any choice of the policy.
Tasks
Published	2020-03-27
URL	https://arxiv.org/abs/2003.12548v1
PDF	https://arxiv.org/pdf/2003.12548v1.pdf
PWC	https://paperswithcode.com/paper/closed-loop-parameter-identification-of
Repo
Framework
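
The setting in the abstract above (a scalar linear system with Gaussian process noise, a deterministic state feedback policy, and a regularized least-squares estimate) can be illustrated with a minimal sketch. The model form `x[t+1] = a*x[t] + b*u[t] + w[t]`, the alternating feedback gains, the regularization weight `lam`, and all function names are assumptions made for illustration; this is not the paper's policy design. The log-determinant of the regularized information matrix is reported as the D-optimality-style score.

```python
import numpy as np

def simulate_and_estimate(a=0.9, b=0.5, T=20000, lam=1.0, seed=0):
    """Regularized least-squares estimate of (a, b) from noiseless state measurements.

    Assumed model (illustrative): x[t+1] = a*x[t] + b*u[t] + w[t], w[t] ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    V = lam * np.eye(2)   # regularized information matrix: lam*I + sum(phi phi^T)
    s = np.zeros(2)       # running sum of phi * x[t+1]
    for t in range(T):
        # Hypothetical deterministic policy: state feedback with an alternating gain,
        # so that the regressors [x, u] do not stay on a single line.
        k = 0.2 if t % 2 == 0 else 0.8
        u = -k * x
        x_next = a * x + b * u + rng.normal()
        phi = np.array([x, u])
        V += np.outer(phi, phi)
        s += phi * x_next
        x = x_next
    theta_hat = np.linalg.solve(V, s)      # regularized LSE of [a, b]
    _, logdet = np.linalg.slogdet(V)       # D-optimality-style score
    return theta_hat, logdet

theta_hat, logdet = simulate_and_estimate()
print("estimate of [a, b]:", theta_hat, "log det:", logdet)
```

A constant-gain policy `u = -k*x` would make the two regressors collinear, leaving `a` and `b` separately unidentifiable without the regularizer, which is why the sketch varies the gain over time.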

A Big Data Enabled Channel Model for 5G Wireless Communication Systems


Title	A Big Data Enabled Channel Model for 5G Wireless Communication Systems
Authors	Jie Huang, Cheng-Xiang Wang, Lu Bai, Jian Sun, Yang Yang, Jie Li, Olav Tirkkonen, Ming-Tuo Zhou
Abstract	The standardization process of fifth-generation (5G) wireless communications has recently been accelerated, and the first commercial 5G services would be provided as early as 2018. The growing number of smartphones, new complex scenarios, large frequency bands, massive antenna elements, and dense small cells will generate big datasets and bring 5G communications to the era of big data. This paper investigates various applications of big data analytics, especially machine learning algorithms, in wireless communications and channel modeling. We propose a big data and machine learning enabled wireless channel model framework. The proposed channel model is based on artificial neural networks (ANNs), including a feed-forward neural network (FNN) and a radial basis function neural network (RBF-NN). The input parameters are transmitter (Tx) and receiver (Rx) coordinates, Tx-Rx distance, and carrier frequency, while the output parameters are channel statistical properties, including the received power, root mean square (RMS) delay spread (DS), and RMS angle spreads (ASs). Datasets used to train and test the ANNs are collected from both real channel measurements and a geometry based stochastic model (GBSM). Simulation results show good performance and indicate that machine learning algorithms can be powerful analytical tools for future measurement-based wireless channel modeling.
Tasks
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12561v1
PDF	https://arxiv.org/pdf/2002.12561v1.pdf
PWC	https://paperswithcode.com/paper/a-big-data-enabled-channel-model-for-5g
Repo
Framework

Learning Cross-domain Generalizable Features by Representation Disentanglement


Title	Learning Cross-domain Generalizable Features by Representation Disentanglement
Authors	Qingjie Meng, Daniel Rueckert, Bernhard Kainz
Abstract	Deep learning models exhibit limited generalizability across different domains. Specifically, transferring knowledge from available entangled domain features (source/target domain) and categorical features to new unseen categorical features in a target domain is an interesting and difficult problem that is rarely discussed in the current literature. This problem is essential for many real-world applications such as improving diagnostic classification or prediction in medical imaging. To address this problem, we propose Mutual-Information-based Disentangled Neural Networks (MIDNet) to extract generalizable features that enable transferring knowledge to unseen categorical features in target domains. The proposed MIDNet is developed as a semi-supervised learning paradigm to alleviate the dependency on labeled data. This is important for practical applications where data annotation requires rare expertise as well as intense time and labor. We demonstrate our method on handwritten digits datasets and a fetal ultrasound dataset for image classification tasks. Experiments show that our method outperforms the state-of-the-art and achieves expected performance with sparsely labeled data.
Tasks	Image Classification
Published	2020-02-29
URL	https://arxiv.org/abs/2003.00321v1
PDF	https://arxiv.org/pdf/2003.00321v1.pdf
PWC	https://paperswithcode.com/paper/learning-cross-domain-generalizable-features
Repo
Framework

Meta Segmentation Network for Ultra-Resolution Medical Images


Title	Meta Segmentation Network for Ultra-Resolution Medical Images
Authors	Tong Wu, Yuan Xie, Yanyun Qu, Bicheng Dai, Shuxin Chen
Abstract	Despite recent progress on semantic segmentation, there still exist huge challenges in medical ultra-resolution image segmentation. The methods based on multi-branch structure can strike a good balance between computational burden and segmentation accuracy. However, the fusion structure in these methods must be designed elaborately to achieve desirable results, which leads to model redundancy. In this paper, we propose Meta Segmentation Network (MSN) to solve this challenging problem. With the help of meta-learning, the fusion module of MSN is quite simple but effective. MSN can quickly generate the weights of fusion layers through a simple meta-learner, requiring only a few training samples and epochs to converge. In addition, to avoid learning all branches from scratch, we further introduce a particular weight sharing mechanism to realize fast knowledge adaptation and share the weights among multiple branches, resulting in improved performance and a significant reduction in parameters. The experimental results on two challenging ultra-resolution medical datasets, BACH and ISIC, show that MSN achieves the best performance compared with the state-of-the-art methods.
Tasks	Meta-Learning, Semantic Segmentation
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08043v1
PDF	https://arxiv.org/pdf/2002.08043v1.pdf
PWC	https://paperswithcode.com/paper/meta-segmentation-network-for-ultra
Repo
Framework

Variational inference formulation for a model-free simulation of a dynamical system with unknown parameters by a recurrent neural network


Title	Variational inference formulation for a model-free simulation of a dynamical system with unknown parameters by a recurrent neural network
Authors	Kyongmin Yeo, Dylan E. C. Grullon, Fan-Keng Sun, Duane S. Boning, Jayant R. Kalagnanam
Abstract	We propose a recurrent neural network for a “model-free” simulation of a dynamical system with unknown parameters without prior knowledge. The deep learning model aims to jointly learn the nonlinear time marching operator and the effects of the unknown parameters from a time series dataset. We assume that the time series dataset consists of an ensemble of trajectories for a range of the parameters. The learning task is formulated as a statistical inference problem by considering the unknown parameters as random variables. A variational inference method is employed to train a recurrent neural network jointly with a feedforward neural network for an approximate posterior distribution. The approximate posterior distribution makes an inference on a trajectory to identify the effects of the unknown parameters, and a recurrent neural network makes a prediction by using the outcome of the inference. In the numerical experiments, it is shown that the proposed variational inference model makes a more accurate simulation compared to the standard recurrent neural networks. It is found that the proposed deep learning model is capable of correctly identifying the dimensions of the random parameters and learning a representation of complex time series data.
Tasks	Time Series
Published	2020-03-02
URL	https://arxiv.org/abs/2003.01184v1
PDF	https://arxiv.org/pdf/2003.01184v1.pdf
PWC	https://paperswithcode.com/paper/variational-inference-formulation-for-a-model
Repo
Framework

Smarter Parking: Using AI to Identify Parking Inefficiencies in Vancouver


Title	Smarter Parking: Using AI to Identify Parking Inefficiencies in Vancouver
Authors	Devon Graham, Satish Kumar Sarraf, Taylor Lundy, Ali MohammadMehr, Sara Uppal, Tae Yoon Lee, Hedayat Zarkoob, Scott Duke Kominers, Kevin Leyton-Brown
Abstract	On-street parking is convenient, but has many disadvantages: on-street spots come at the expense of other road uses such as traffic lanes, transit lanes, bike lanes, or parklets; drivers looking for parking contribute substantially to traffic congestion and hence to greenhouse gas emissions; safety is reduced both due to the fact that drivers looking for spots are more distracted than other road users and that people exiting parked cars pose a risk to cyclists. These social costs may not be worth paying when off-street parking lots are nearby and have surplus capacity. To see where this might be true in downtown Vancouver, we used artificial intelligence techniques to estimate the amount of time it would take drivers to both park on and off street for destinations throughout the city. For on-street parking, we developed (1) a deep-learning model of block-by-block parking availability based on data from parking meters and audits and (2) a computational simulation of drivers searching for an on-street spot. For off-street parking, we developed a computational simulation of the time it would take drivers to drive from their original destination to the nearest city-owned off-street lot and then to queue for a spot based on traffic and lot occupancy data. Finally, in both cases we also computed the time it would take the driver to walk from their parking spot to their original destination. We compared these time estimates for destinations in each block of Vancouver’s downtown core and each hour of the day. We found many areas where parking off street would actually save drivers time over searching the streets for a spot, and many more where the time cost for parking off street was small. The identification of such areas provides an opportunity for the city to repurpose valuable curbside space for community-friendly uses more in line with its transportation goals.
Tasks
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09761v1
PDF	https://arxiv.org/pdf/2003.09761v1.pdf
PWC	https://paperswithcode.com/paper/smarter-parking-using-ai-to-identify-parking
Repo
Framework

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting


Title	Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting
Authors	Liang Qiao, Sanli Tang, Zhanzhan Cheng, Yunlu Xu, Yi Niu, Shiliang Pu, Fei Wu
Abstract	Many approaches have recently been proposed to detect irregular scene text and achieved promising results. However, their localization results may not well satisfy the following text recognition part mainly because of two reasons: 1) recognizing arbitrary shaped text is still a challenging task, and 2) prevalent non-trainable pipeline strategies between text detection and text recognition will lead to suboptimal performances. To handle this incompatibility problem, in this paper we propose an end-to-end trainable text spotting approach named Text Perceptron. Concretely, Text Perceptron first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information. Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies without extra parameters. It unites text detection and the following recognition part into a whole framework, and helps the whole network achieve global optimization. Experiments show that our method achieves competitive performance on two standard text benchmarks, i.e., ICDAR 2013 and ICDAR 2015, and also obviously outperforms existing methods on irregular text benchmarks SCUT-CTW1500 and Total-Text.
Tasks	Text Spotting
Published	2020-02-17
URL	https://arxiv.org/abs/2002.06820v1
PDF	https://arxiv.org/pdf/2002.06820v1.pdf
PWC	https://paperswithcode.com/paper/text-perceptron-towards-end-to-end-arbitrary
Repo
Framework

Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks


Title	Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks
Authors	Joonatan Mänttäri, Sofia Broomé, John Folkesson, Hedvig Kjellström
Abstract	A number of techniques for interpretability have been presented for deep learning in computer vision, typically with the goal of understanding what it is that the networks have actually learned underneath a given classification decision. However, when it comes to deep video architectures, interpretability is still in its infancy and we do not yet have a clear concept of how we should decode spatiotemporal features. In this paper, we present a study comparing how 3D convolutional networks and convolutional LSTM networks learn features across temporally dependent frames. This is the first comparison of two video models that both convolve to learn spatial features but that have principally different methods of modeling time. Additionally, we extend the concept of meaningful perturbation introduced by Fong & Vedaldi (2017) to the temporal dimension to search for the most meaningful part of a sequence for a classification decision.
Tasks
Published	2020-02-02
URL	https://arxiv.org/abs/2002.00367v1
PDF	https://arxiv.org/pdf/2002.00367v1.pdf
PWC	https://paperswithcode.com/paper/interpreting-video-features-a-comparison-of-1
Repo
Framework

Double Trouble in Double Descent : Bias and Variance(s) in the Lazy Regime


Title	Double Trouble in Double Descent : Bias and Variance(s) in the Lazy Regime
Authors	Stéphane d’Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala
Abstract	Deep neural networks can achieve remarkable generalization performances while interpolating the training data perfectly. Rather than the U-curve emblematic of the bias-variance trade-off, their test error often follows a double descent - a mark of the beneficial role of overparametrization. In this work, we develop a quantitative theory for this phenomenon in the so-called lazy learning regime of neural networks, by considering the problem of learning a high-dimensional function with random features regression. We obtain a precise asymptotic expression for the bias-variance decomposition of the test error, and show that the bias displays a phase transition at the interpolation threshold, beyond which it remains constant. We disentangle the variances stemming from the sampling of the dataset, from the additive noise corrupting the labels, and from the initialization of the weights. Following Geiger et al., we first show that the latter two contributions are the crux of the double descent: they lead to the overfitting peak at the interpolation threshold and to the decay of the test error upon overparametrization. We then quantify how they are suppressed by ensembling the outputs of K independently initialized estimators. When K is sent to infinity, the test error remains constant beyond the interpolation threshold. We further compare the effects of overparametrizing, ensembling and regularizing. Finally, we present numerical experiments on classic deep learning setups to show that our results hold qualitatively in realistic lazy learning scenarios.
Tasks
Published	2020-03-02
URL	https://arxiv.org/abs/2003.01054v1
PDF	https://arxiv.org/pdf/2003.01054v1.pdf
PWC	https://paperswithcode.com/paper/double-trouble-in-double-descent-bias-and
Repo
Framework

Learning Deep Analysis Dictionaries – Part I: Unstructured Dictionaries


Title	Learning Deep Analysis Dictionaries – Part I: Unstructured Dictionaries
Authors	Jun-Jie Huang, Pier Luigi Dragotti
Abstract	Inspired by the recent success of Deep Neural Networks and the recent efforts to develop multi-layer dictionary models, we propose a Deep Analysis dictionary Model (DeepAM) which is optimized to address a specific regression task known as single image super-resolution. Contrary to other multi-layer dictionary models, our architecture contains L layers of analysis dictionary and soft-thresholding operators to gradually extract high-level features and a layer of synthesis dictionary which is designed to optimize the regression task at hand. In our approach, each analysis dictionary is partitioned into two sub-dictionaries: an Information Preserving Analysis Dictionary (IPAD) and a Clustering Analysis Dictionary (CAD). The IPAD together with the corresponding soft-thresholds is designed to pass the key information from the previous layer to the next layer, while the CAD together with the corresponding soft-thresholding operator is designed to produce a sparse feature representation of its input data that facilitates discrimination of key features. Simulation results show that the proposed deep analysis dictionary model achieves comparable performance with a Deep Neural Network which has the same structure and is optimized using back-propagation.
Tasks	Image Super-Resolution, Super-Resolution
Published	2020-01-31
URL	https://arxiv.org/abs/2001.12010v1
PDF	https://arxiv.org/pdf/2001.12010v1.pdf
PWC	https://paperswithcode.com/paper/learning-deep-analysis-dictionaries-part-i
Repo
Framework
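
The layer structure described in the abstract above (L layers of analysis dictionary followed by soft-thresholding, then one synthesis dictionary) can be sketched as a forward pass. This is a minimal illustration under stated assumptions: the dictionaries here are random placeholders, the names `analysis_dicts`, `thresholds`, and `synthesis_dict` are hypothetical, and the IPAD/CAD partitioning and the learning procedure from the paper are not reproduced.

```python
import numpy as np

def soft_threshold(v, lam):
    """Element-wise soft-thresholding: shrink magnitudes by lam, zeroing small entries."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def deep_analysis_forward(x, analysis_dicts, thresholds, synthesis_dict):
    """Apply each analysis dictionary and soft-threshold, then the synthesis dictionary."""
    z = x
    for omega, lam in zip(analysis_dicts, thresholds):
        z = soft_threshold(omega @ z, lam)
    return synthesis_dict @ z

# Placeholder dictionaries (random, for shape illustration only; not learned).
rng = np.random.default_rng(0)
analysis_dicts = [rng.normal(size=(32, 16)), rng.normal(size=(64, 32))]
thresholds = [0.5, 0.5]
synthesis_dict = rng.normal(size=(16, 64))

x = rng.normal(size=16)
y = deep_analysis_forward(x, analysis_dicts, thresholds, synthesis_dict)
print(y.shape)
```

Soft-thresholding is what makes the intermediate representation sparse: any coefficient with magnitude at most the threshold is set to zero.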

FSD-10: A Dataset for Competitive Sports Content Analysis


Title	FSD-10: A Dataset for Competitive Sports Content Analysis
Authors	Shenlan Liu, Xiang Liu, Gao Huang, Lin Feng, Lianyu Hu, Dong Jiang, Aibin Zhang, Yang Liu, Hong Qiao
Abstract	Action recognition is an important and challenging problem in video analysis. Although the past decade has witnessed progress in action recognition with the development of deep learning, such progress has been slow in competitive sports content analysis. To promote the research on action recognition from competitive sports video clips, we introduce a Figure Skating Dataset (FSD-10) for fine-grained sports content analysis. To this end, we collect 1484 clips from the worldwide figure skating championships in 2017-2018, which consist of 10 different actions in men/ladies programs. Each clip is at a rate of 30 frames per second with resolution 1080 × 720. These clips are then annotated by experts in type, grade of execution, skater info, etc. To build a baseline for action recognition in figure skating, we evaluate state-of-the-art action recognition methods on FSD-10. Motivated by the idea that domain knowledge is of great concern in the sports field, we propose a keyframe based temporal segment network (KTSN) for classification and achieve remarkable performance. Experimental results demonstrate that FSD-10 is an ideal dataset for benchmarking action recognition algorithms, as it requires accurately extracting action motions rather than action poses. We hope FSD-10, which is designed to have a large collection of fine-grained actions, can serve as a new challenge to develop more robust and advanced action recognition models.
Tasks
Published	2020-02-09
URL	https://arxiv.org/abs/2002.03312v1
PDF	https://arxiv.org/pdf/2002.03312v1.pdf
PWC	https://paperswithcode.com/paper/fsd-10-a-dataset-for-competitive-sports
Repo
Framework

Differentiable Molecular Simulations for Control and Learning


Title	Differentiable Molecular Simulations for Control and Learning
Authors	Wujie Wang, Simon Axelrod, Rafael Gómez-Bombarelli
Abstract	Molecular dynamics simulations use statistical mechanics at the atomistic scale to enable both the elucidation of fundamental mechanisms and the engineering of matter for desired tasks. The behavior of molecular systems at the microscale is typically simulated with differential equations parameterized by a Hamiltonian, or energy function. The Hamiltonian describes the state of the system and its interactions with the environment. In order to derive predictive microscopic models, one wishes to infer a molecular Hamiltonian that agrees with observed macroscopic quantities. From the perspective of engineering, one wishes to control the Hamiltonian to achieve desired simulation outcomes and structures, as in self-assembly and optical control, to then realize systems with the desired Hamiltonian in the lab. In both cases, the goal is to modify the Hamiltonian such that emergent properties of the simulated system match a given target. We demonstrate how this can be achieved using differentiable simulations where bulk target observables and simulation outcomes can be analytically differentiated with respect to Hamiltonians, opening up new routes for parameterizing Hamiltonians to infer macroscopic models and develop control protocols.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2003.00868v1
PDF	https://arxiv.org/pdf/2003.00868v1.pdf
PWC	https://paperswithcode.com/paper/differentiable-molecular-simulations-for
Repo
Framework

Temporal Convolutional Attention-based Network For Sequence Modeling


Title	Temporal Convolutional Attention-based Network For Sequence Modeling
Authors	Hongyan Hao, Yan Wang, Yudi Xia, Jian Zhao, Furao Shen
Abstract	With the development of feed-forward models, the default model for sequence modeling has gradually evolved to replace recurrent networks. Many powerful feed-forward models based on convolutional networks and attention mechanisms have been proposed and show more potential to handle sequence modeling tasks. We ask whether there is an architecture that can not only achieve an approximate substitution of recurrent networks, but also absorb the advantages of feed-forward models. We therefore propose an exploratory architecture referred to as the Temporal Convolutional Attention-based Network (TCAN), which combines a temporal convolutional network and an attention mechanism. TCAN includes two parts: Temporal Attention (TA), which captures relevant features inside the sequence, and Enhanced Residual (ER), which extracts important information from shallow layers and transfers it to deep layers. We improve the state-of-the-art results of bpc/perplexity to 26.92 on word-level PTB, 1.043 on character-level PTB, and 6.66 on WikiText-2.
Tasks
Published	2020-02-28
URL	https://arxiv.org/abs/2002.12530v2
PDF	https://arxiv.org/pdf/2002.12530v2.pdf
PWC	https://paperswithcode.com/paper/temporal-convolutional-attention-based
Repo
Framework

Energy-based Periodicity Mining with Deep Features for Action Repetition Counting in Unconstrained Videos


Title	Energy-based Periodicity Mining with Deep Features for Action Repetition Counting in Unconstrained Videos
Authors	Jianqin Yin, Yanchun Wu, Huaping Liu, Yonghao Dang, Zhiyi Liu, Jun Liu
Abstract	Action repetition counting estimates the number of times a repetitive motion occurs in one action, which is a relatively new, important, but challenging measurement problem. To solve this problem, we propose a new method that is superior to traditional approaches in two aspects: it requires no preprocessing and it is applicable to actions of arbitrary periodicity. Requiring no preprocessing makes our method convenient for real applications; handling actions of arbitrary periodicity makes our model more suitable for actual circumstances. In terms of methodology, firstly, we analyze the movement patterns of the repetitive actions based on the spatial and temporal features of actions extracted by deep ConvNets; secondly, the Principal Component Analysis algorithm is used to generate the intuitive periodic information from the chaotic high-dimensional deep features; thirdly, the periodicity is mined based on the high-energy rule using the Fourier transform; finally, the inverse Fourier transform with a multi-stage threshold filter is proposed to improve the quality of the mined periodicity, and peak detection is introduced to finish the repetition counting. Our contributions are two-fold: 1) We present an important insight that deep features extracted for action recognition can well model the self-similarity periodicity of the repetitive action. 2) We present a high-energy based periodicity mining rule using deep features, which can process arbitrary actions without preprocessing. Experimental results show that our method achieves comparable results on the public datasets YT Segments and QUVA.
Tasks
Published	2020-03-15
URL	https://arxiv.org/abs/2003.06838v1
PDF	https://arxiv.org/pdf/2003.06838v1.pdf
PWC	https://paperswithcode.com/paper/energy-based-periodicity-mining-with-deep
Repo
Framework
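
The four steps listed in the abstract above (deep features, PCA, Fourier-based high-energy selection, inverse transform plus peak detection) can be sketched on synthetic data. This is a simplified illustration, not the authors' implementation: random projections of a sinusoid stand in for ConvNet features, the "high-energy rule" is reduced to keeping the single strongest non-DC frequency bin, and the multi-stage threshold filter is omitted. The names `count_repetitions` and `synthetic_features` are hypothetical.

```python
import numpy as np

def count_repetitions(features):
    """Count repetitions in a (T, D) feature sequence via PCA, FFT filtering, and peak counting."""
    # PCA step: project centered features onto the first principal component.
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    signal = centered @ vt[0]

    # Frequency step (simplified assumption): keep only the highest-energy non-DC bin.
    spectrum = np.fft.rfft(signal)
    spectrum[0] = 0.0
    dominant = int(np.argmax(np.abs(spectrum) ** 2))
    filtered = np.zeros_like(spectrum)
    filtered[dominant] = spectrum[dominant]

    # Inverse transform gives a clean periodic signal; count its interior local maxima.
    smooth = np.fft.irfft(filtered, n=len(signal))
    inner = smooth[1:-1]
    peaks = (inner > smooth[:-2]) & (inner >= smooth[2:])
    return int(peaks.sum())

def synthetic_features(repetitions=8, T=240, D=16, seed=0):
    """Stand-in for deep features: a noisy D-dimensional embedding of a periodic signal."""
    rng = np.random.default_rng(seed)
    t = np.arange(T)
    latent = np.sin(2 * np.pi * repetitions * t / T)
    weights = rng.normal(size=D)
    return np.outer(latent, weights) + 0.05 * rng.normal(size=(T, D))

print(count_repetitions(synthetic_features(repetitions=8)))
```

Keeping a single frequency bin assumes one constant period across the clip; real videos with varying tempo would need the richer filtering the abstract describes.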

Domain Independent Unsupervised Learning to grasp the Novel Objects


Title	Domain Independent Unsupervised Learning to grasp the Novel Objects
Authors	Siddhartha Vibhu Pharswan, Mohit Vohra, Ashish Kumar, Laxmidhar Behera
Abstract	One of the main challenges in vision-based grasping is the selection of feasible grasp regions while interacting with novel objects. Recent approaches exploit the power of the convolutional neural network (CNN) to achieve accurate grasping at the cost of high computational power and time. In this paper, we present a novel unsupervised learning based algorithm for the selection of feasible grasp regions. Unsupervised learning infers the pattern in a dataset without any external labels. We apply k-means clustering on the image plane to identify the grasp regions, followed by an axis assignment method. We define a novel concept of Grasp Decide Index (GDI) to select the best grasp pose in the image plane. We have conducted several experiments in cluttered or isolated environments on standard objects of Amazon Robotics Challenge 2017 and Amazon Picking Challenge 2016. We compare the results with prior learning based approaches to validate the robustness and adaptive nature of our algorithm for a variety of novel objects in different domains.
Tasks
Published	2020-01-09
URL	https://arxiv.org/abs/2001.05856v1
PDF	https://arxiv.org/pdf/2001.05856v1.pdf
PWC	https://paperswithcode.com/paper/domain-independent-unsupervised-learning-to
Repo
Framework
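
The clustering step mentioned in the abstract above (k-means on the image plane to identify candidate regions) can be illustrated with a small sketch. It assumes a binary foreground mask as input and fixes k = 2 with a simple farthest-point initialization; the function name `kmeans_pixels` is hypothetical, and the paper's axis assignment method and Grasp Decide Index are not reproduced here because the abstract does not define them.

```python
import numpy as np

def kmeans_pixels(mask, iters=20):
    """Two-cluster k-means over the (row, col) coordinates of foreground pixels."""
    points = np.argwhere(mask).astype(float)

    # Deterministic initialization (an assumption): first pixel and the pixel farthest from it.
    first = points[0]
    far = points[np.argmax(np.linalg.norm(points - first, axis=1))]
    centroids = np.stack([first, far])

    for _ in range(iters):
        # Assign each pixel to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Recompute centroids, keeping the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Synthetic mask with two rectangular blobs standing in for two objects.
mask = np.zeros((60, 60), dtype=bool)
mask[5:15, 5:15] = True
mask[30:40, 40:55] = True
centroids, labels = kmeans_pixels(mask)
print(centroids)
```

Each resulting cluster gives a candidate region in pixel coordinates; choosing a grasp pose within a region would require the additional steps the abstract names.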