February 2, 2020


Paper Group AWR 36



Artificial Neural Network Modeling for Path Loss Prediction in Urban Environments

Title Artificial Neural Network Modeling for Path Loss Prediction in Urban Environments
Authors Chanshin Park, Daniel K. Tettey, Han-Shin Jo
Abstract Although various linear log-distance path loss models have been developed, advanced models are required to more accurately and flexibly represent the path loss for complex environments such as urban areas. This letter proposes an artificial neural network (ANN) based multi-dimensional regression framework for path loss modeling in urban environments in the 3 to 6 GHz frequency band. The ANN is used to learn the path loss structure from the measured path loss data, which is a function of distance and frequency. The effect of the network architecture parameters (activation function, the number of hidden layers and nodes) on the prediction accuracy is analyzed. We observe that the proposed model is more accurate and flexible compared to the conventional linear model.
Tasks
Published 2019-04-04
URL http://arxiv.org/abs/1904.02383v1
PDF http://arxiv.org/pdf/1904.02383v1.pdf
PWC https://paperswithcode.com/paper/artificial-neural-network-modeling-for-path
Repo https://github.com/chanship/pathloss
Framework none
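For orientation, the conventional baseline that the abstract contrasts with the ANN can be sketched as follows. This is a minimal sketch of the standard log-distance model PL(d) = PL0 + 10 n log10(d / d0) fitted by ordinary least squares; the function names, the reference distance d0 = 1 m, and the synthetic noise-free data are assumptions made for illustration and are not taken from the paper or its repository.

```python
import math


def fit_log_distance(distances_m, path_loss_db, d0=1.0):
    """Least-squares fit of PL(d) = pl0 + 10 * n * log10(d / d0).

    Returns (pl0, n): the loss at the reference distance and the
    path loss exponent.
    """
    xs = [10.0 * math.log10(d / d0) for d in distances_m]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(path_loss_db) / len(path_loss_db)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, path_loss_db))
    n = sxy / sxx                  # slope: path loss exponent
    pl0 = mean_y - n * mean_x      # intercept: loss at d0
    return pl0, n


def predict(pl0, n, d, d0=1.0):
    """Predicted path loss in dB at distance d."""
    return pl0 + 10.0 * n * math.log10(d / d0)


# Synthetic, noise-free data (hypothetical): pl0 = 40 dB, n = 3.
distances = [10.0, 20.0, 50.0, 100.0, 200.0]
losses = [40.0 + 30.0 * math.log10(d) for d in distances]
pl0, n = fit_log_distance(distances, losses)
```

The model is linear in log-distance and has only two parameters, which is the rigidity the abstract says an ANN regressor over distance and frequency is meant to relax.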

LambdaOpt: Learn to Regularize Recommender Models in Finer Levels

Title LambdaOpt: Learn to Regularize Recommender Models in Finer Levels
Authors Yihong Chen, Bei Chen, Xiangnan He, Chen Gao, Yong Li, Jian-Guang Lou, Yue Wang
Abstract Recommendation models mainly deal with categorical variables, such as user/item ID and attributes. Besides the high-cardinality issue, the interactions among such categorical variables are usually long-tailed, with the head made up of highly frequent values and a long tail of rare ones. This phenomenon results in the data sparsity issue, making it essential to regularize the models to ensure generalization. The common practice is to employ grid search to manually tune regularization hyperparameters based on the validation data. However, it requires non-trivial efforts and large computation resources to search the whole candidate space; even so, it may not lead to the optimal choice, for which different parameters should have different regularization strengths. In this paper, we propose a hyperparameter optimization method, LambdaOpt, which automatically and adaptively enforces regularization during training. Specifically, it updates the regularization coefficients based on the performance of validation data. With LambdaOpt, the notorious tuning of regularization hyperparameters can be avoided; more importantly, it allows fine-grained regularization (i.e. each parameter can have an individualized regularization coefficient), leading to better generalized models. We show how to employ LambdaOpt on matrix factorization, a classical model that is representative of a large family of recommender models. Extensive experiments on two public benchmarks demonstrate the superiority of our method in boosting the performance of top-K recommendation.
Tasks Hyperparameter Optimization, Recommendation Systems
Published 2019-05-28
URL https://arxiv.org/abs/1905.11596v1
PDF https://arxiv.org/pdf/1905.11596v1.pdf
PWC https://paperswithcode.com/paper/lambdaopt-learn-to-regularize-recommender
Repo https://github.com/LaceyChen17/lambda-opt
Framework pytorch
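The idea of updating per-parameter regularization coefficients from validation performance can be illustrated with a generic one-step lookahead hypergradient. This is a sketch under stated assumptions, not the paper's exact update: it assumes plain SGD, an L2 penalty of the form lambda_i * theta_i^2 on each parameter, and hypothetical function names and toy numbers.

```python
def sgd_step(theta, train_grad, lam, lr):
    """One SGD step on train loss plus per-parameter L2 penalty lam_i * theta_i**2."""
    return [t - lr * (g + 2.0 * l * t) for t, g, l in zip(theta, train_grad, lam)]


def lambda_hypergrad(theta, val_grad_at_new, lr):
    """Gradient of the validation loss w.r.t. each lam_i through one SGD step.

    Since theta_new_i = theta_i - lr * (g_i + 2 * lam_i * theta_i),
    d(theta_new_i) / d(lam_i) = -2 * lr * theta_i, and the chain rule
    multiplies this by the validation gradient at theta_new.
    """
    return [v * (-2.0 * lr * t) for t, v in zip(theta, val_grad_at_new)]


def update_lambda(lam, hypergrad, lr_lam):
    """Gradient step on lam, clipped so coefficients stay non-negative."""
    return [max(0.0, l - lr_lam * h) for l, h in zip(lam, hypergrad)]


# Toy numbers (hypothetical).
theta = [1.0, -2.0]
lam = [0.1, 0.1]
theta_new = sgd_step(theta, train_grad=[0.5, 0.5], lam=lam, lr=0.1)
hyper = lambda_hypergrad(theta, val_grad_at_new=[1.0, 1.0], lr=0.1)
lam_new = update_lambda(lam, hyper, lr_lam=0.5)
```

Each coordinate gets its own coefficient, so the two parameters end up with different regularization strengths after a single update, which is the fine-grained behaviour the abstract describes.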

Handling correlated and repeated measurements with the smoothed Multivariate square-root Lasso

Title Handling correlated and repeated measurements with the smoothed Multivariate square-root Lasso
Authors Quentin Bertrand, Mathurin Massias, Alexandre Gramfort, Joseph Salmon
Abstract Sparsity promoting norms are frequently used in high dimensional regression. A limitation of such Lasso-type estimators is that the optimal regularization parameter depends on the unknown noise level. Estimators such as the concomitant Lasso address this dependence by jointly estimating the noise level and the regression coefficients. Additionally, in many applications, the data is obtained by averaging multiple measurements: this reduces the noise variance, but it dramatically reduces sample sizes and prevents refined noise modeling. In this work, we propose a concomitant estimator that can cope with complex noise structure by using non-averaged measurements. The resulting optimization problem is convex and amenable, thanks to smoothing theory, to state-of-the-art optimization techniques that leverage the sparsity of the solutions. Practical benefits are demonstrated on toy datasets, realistic simulated data and real neuroimaging data.
Tasks
Published 2019-02-07
URL https://arxiv.org/abs/1902.02509v3
PDF https://arxiv.org/pdf/1902.02509v3.pdf
PWC https://paperswithcode.com/paper/concomitant-lasso-with-repetitions-clar
Repo https://github.com/QB3/CLaR
Framework none
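As background for the abstract's mention of jointly estimating the noise level and the coefficients, the single-measurement concomitant Lasso is commonly written as below. The notation is an assumption for illustration (observations $y \in \mathbb{R}^n$, design $X \in \mathbb{R}^{n \times p}$, regularization parameter $\lambda$); the paper's own estimator is multivariate and uses non-averaged repetitions, so this is only the simpler starting point.

```latex
\[
(\hat{\beta}, \hat{\sigma}) \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p,\ \sigma > 0}
\; \frac{\lVert y - X\beta \rVert_2^2}{2 n \sigma} + \frac{\sigma}{2} + \lambda \lVert \beta \rVert_1
\]
% For fixed beta, minimizing over sigma gives sigma = ||y - X beta||_2 / sqrt(n),
% and substituting back yields the square-root Lasso objective:
\[
\hat{\beta} \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p}
\; \frac{\lVert y - X\beta \rVert_2}{\sqrt{n}} + \lambda \lVert \beta \rVert_1
\]
```

Because the data-fit term is scaled by $\sigma$, a suitable $\lambda$ no longer has to be chosen in proportion to the unknown noise level, which is the dependence the abstract refers to.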

The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation

Title The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
Authors Junjie Huang, Zheng Zhu, Feng Guo, Guan Huang
Abstract Recently, the leading performance of human pose estimation is dominated by top-down methods. Being a fundamental component in training and inference, data processing has not been systematically considered in the pose estimation community, to the best of our knowledge. In this paper, we focus on this problem and find that the devil of top-down pose estimators is in the biased data processing. Specifically, by investigating the standard data processing in state-of-the-art approaches, mainly including data transformation and encoding-decoding, we find that the results obtained by the common flipping strategy are unaligned with the original ones in inference. Moreover, there is statistical error in standard encoding-decoding during both training and inference. The two problems couple together and significantly degrade the pose estimation performance. Based on quantitative analyses, we then formulate a principled way to tackle this dilemma. Data is processed based on unit length instead of pixel, and an offset-based strategy is adopted to perform encoding-decoding. The Unbiased Data Processing (UDP) for human pose estimation can be achieved by combining the two together. UDP not only boosts the performance of existing methods by a large margin but also plays an important role in result reproducing and future exploration. As a model-agnostic approach, UDP promotes SimpleBaseline-ResNet-50-256x192 by 1.5 AP (70.2 to 71.7) and HRNet-W32-256x192 by 1.7 AP (73.5 to 75.2) on the COCO test-dev set. The HRNet-W48-384x288 equipped with UDP achieves 76.5 AP and sets a new state-of-the-art for human pose estimation. The code will be released.
Tasks Pose Estimation
Published 2019-11-18
URL https://arxiv.org/abs/1911.07524v1
PDF https://arxiv.org/pdf/1911.07524v1.pdf
PWC https://paperswithcode.com/paper/the-devil-is-in-the-details-delving-into
Repo https://github.com/HuangJunJie2017/UDP-Pose
Framework mxnet

Evaluating Differentially Private Machine Learning in Practice

Title Evaluating Differentially Private Machine Learning in Practice
Authors Bargav Jayaraman, David Evans
Abstract Differential privacy is a strong notion for privacy that can be used to prove formal guarantees, in terms of a privacy budget, $\epsilon$, about how much information is leaked by a mechanism. However, implementations of privacy-preserving machine learning often select large values of $\epsilon$ in order to get acceptable utility of the model, with little understanding of the impact of such choices on meaningful privacy. Moreover, in scenarios where iterative learning procedures are used, differential privacy variants that offer tighter analyses are used which appear to reduce the needed privacy budget but present poorly understood trade-offs between privacy and utility. In this paper, we quantify the impact of these choices on privacy in experiments with logistic regression and neural network models. Our main finding is that there is a huge gap between the upper bounds on privacy loss that can be guaranteed, even with advanced mechanisms, and the effective privacy loss that can be measured using current inference attacks. Current mechanisms for differentially private machine learning rarely offer acceptable utility-privacy trade-offs with guarantees for complex learning tasks: settings that provide limited accuracy loss provide meaningless privacy guarantees, and settings that provide strong privacy guarantees result in useless models. Code for the experiments can be found here: https://github.com/bargavj/EvaluatingDPML
Tasks Calibration
Published 2019-02-24
URL https://arxiv.org/abs/1902.08874v4
PDF https://arxiv.org/pdf/1902.08874v4.pdf
PWC https://paperswithcode.com/paper/when-relaxations-go-bad-differentially
Repo https://github.com/bargavj/EvaluatingDPML
Framework tf
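To make the role of the privacy budget concrete, here is a minimal sketch of the classic Laplace mechanism applied to a counting query. It is not one of the mechanisms evaluated in the paper; the function names, the sensitivity of 1, and the example counts are assumptions for illustration. The noise scale is sensitivity / epsilon, so a larger epsilon means less noise and a weaker guarantee.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Scale b of the Laplace noise for a query with the given L1 sensitivity."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    return true_count + laplace_noise(laplace_scale(sensitivity, epsilon), rng)


rng = random.Random(0)
noisy_strict = private_count(100, epsilon=0.1, rng=rng)   # heavy noise, strong guarantee
noisy_loose = private_count(100, epsilon=10.0, rng=rng)   # light noise, weak guarantee
```

The trade-off the abstract discusses is visible in the scale alone: at epsilon = 0.1 the expected absolute noise on the count is 10, while at epsilon = 10 it is 0.1.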

Musical Instrument Playing Technique Detection Based on FCN: Using Chinese Bowed-Stringed Instrument as an Example

Title Musical Instrument Playing Technique Detection Based on FCN: Using Chinese Bowed-Stringed Instrument as an Example
Authors Zehao Wang, Jingru Li, Xiaoou Chen, Zijin Li, Shicheng Zhang, Baoqiang Han, Deshun Yang
Abstract Unlike melody extraction and other aspects of music transcription, research on playing technique detection is still in its early stages. Compared to existing work mostly focused on playing technique detection for individual single notes, we propose a general end-to-end method based on Sound Event Detection by FCN for musical instrument playing technique detection. In our case, we choose Erhu, a well-known Chinese bowed-stringed instrument, to experiment with our method. Because of the limitation of FCN, we present an algorithm to perform detection on variable-length audio. The effectiveness of the proposed framework is tested on a new dataset whose categorization of techniques is similar to that of our training dataset. The highest accuracy of our 3 experiments on the new test set is 87.31%. Furthermore, we also evaluate the performance of the proposed framework on 10 real-world studio music samples (produced by MIDI) and 7 real-world recording samples to assess the generalization ability of our model.
Tasks Melody Extraction, Sound Event Detection
Published 2019-10-20
URL https://arxiv.org/abs/1910.09021v1
PDF https://arxiv.org/pdf/1910.09021v1.pdf
PWC https://paperswithcode.com/paper/musical-instrument-playing-technique
Repo https://github.com/water45wzh/MIPTD_Erhu
Framework pytorch

Attribute-Guided Sketch Generation

Title Attribute-Guided Sketch Generation
Authors Hao Tang, Xinya Chen, Wei Wang, Dan Xu, Jason J. Corso, Nicu Sebe, Yan Yan
Abstract Facial attributes are important since they provide a detailed description and determine the visual appearance of human faces. In this paper, we aim at converting a face image to a sketch while simultaneously generating facial attributes. To this end, we propose a novel Attribute-Guided Sketch Generative Adversarial Network (ASGAN) which is an end-to-end framework and contains two pairs of generators and discriminators, one of which is used to generate faces with attributes while the other one is employed for image-to-sketch translation. The two generators form a W-shaped network (W-net) and they are trained jointly with a weight-sharing constraint. Additionally, we also propose two novel discriminators, the residual one focusing on attribute generation and the triplex one helping to generate realistic looking sketches. To validate our model, we have created a new large dataset with 8,804 images, named the Attribute Face Photo & Sketch (AFPS) dataset which is the first dataset containing attributes associated to face sketch images. The experimental results demonstrate that the proposed network (i) generates more photo-realistic faces with sharper facial attributes than baselines and (ii) has good generalization capability on different generative tasks.
Tasks
Published 2019-01-28
URL http://arxiv.org/abs/1901.09774v2
PDF http://arxiv.org/pdf/1901.09774v2.pdf
PWC https://paperswithcode.com/paper/attribute-guided-sketch-generation
Repo https://github.com/Ha0Tang/ASGAN
Framework pytorch

CE-Net: Context Encoder Network for 2D Medical Image Segmentation

Title CE-Net: Context Encoder Network for 2D Medical Image Segmentation
Authors Zaiwang Gu, Jun Cheng, Huazhu Fu, Kang Zhou, Huaying Hao, Yitian Zhao, Tianyang Zhang, Shenghua Gao, Jiang Liu
Abstract Medical image segmentation is an important step in medical image analysis. With the rapid development of convolutional neural networks in image processing, deep learning has been used for medical image segmentation, such as optic disc segmentation, blood vessel detection, lung segmentation, cell segmentation, etc. Previously, U-net based approaches have been proposed. However, the consecutive pooling and strided convolutional operations lead to the loss of some spatial information. In this paper, we propose a context encoder network (referred to as CE-Net) to capture more high-level information and preserve spatial information for 2D medical image segmentation. CE-Net contains three major components: a feature encoder module, a context extractor and a feature decoder module. We use a pretrained ResNet block as the fixed feature extractor. The context extractor module is formed by a newly proposed dense atrous convolution (DAC) block and residual multi-kernel pooling (RMP) block. We applied the proposed CE-Net to different 2D medical image segmentation tasks. Comprehensive results show that the proposed method outperforms the original U-Net method and other state-of-the-art methods for optic disc segmentation, vessel detection, lung segmentation, cell contour segmentation and retinal optical coherence tomography layer segmentation.
Tasks Cell Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2019-03-07
URL http://arxiv.org/abs/1903.02740v1
PDF http://arxiv.org/pdf/1903.02740v1.pdf
PWC https://paperswithcode.com/paper/ce-net-context-encoder-network-for-2d-medical
Repo https://github.com/HzFu/MNet_DeepCDR
Framework tf

Multi-View Reinforcement Learning

Title Multi-View Reinforcement Learning
Authors Minne Li, Lisheng Wu, Haitham Bou Ammar, Jun Wang
Abstract This paper is concerned with multi-view reinforcement learning (MVRL), which allows for decision making when agents share common dynamics but adhere to different observation models. We define the MVRL framework by extending partially observable Markov decision processes (POMDPs) to support more than one observation model and propose two solution methods through observation augmentation and cross-view policy transfer. We empirically evaluate our method and demonstrate its effectiveness in a variety of environments. Specifically, we show reductions in sample complexities and computational time for acquiring policies that handle multi-view environments.
Tasks Decision Making
Published 2019-10-18
URL https://arxiv.org/abs/1910.08285v1
PDF https://arxiv.org/pdf/1910.08285v1.pdf
PWC https://paperswithcode.com/paper/multi-view-reinforcement-learning
Repo https://github.com/mlii/mvrl
Framework tf

Physics-as-Inverse-Graphics: Joint Unsupervised Learning of Objects and Physics from Video

Title Physics-as-Inverse-Graphics: Joint Unsupervised Learning of Objects and Physics from Video
Authors Miguel Jaques, Michael Burke, Timothy Hospedales
Abstract We aim to perform unsupervised discovery of objects and their states such as location and velocity, as well as physical system parameters such as mass and gravity from video – given only the differential equations governing the scene dynamics. Existing physical scene understanding methods require either object state supervision, or do not integrate with differentiable physics to learn interpretable system parameters and states. We address this problem through a $\textit{physics-as-inverse-graphics}$ approach that brings together vision-as-inverse-graphics and differentiable physics engines. This framework allows us to perform long term extrapolative video prediction, as well as vision-based model-predictive control. Our approach significantly outperforms related unsupervised methods in long-term future frame prediction of systems with interacting objects (such as ball-spring or 3-body gravitational systems). We further show the value of this tight vision-physics integration by demonstrating data-efficient learning of vision-actuated model-based control for a pendulum system. The controller’s interpretability also provides unique capabilities in goal-driven control and physical reasoning for zero-data adaptation.
Tasks Scene Understanding, Video Prediction
Published 2019-05-27
URL https://arxiv.org/abs/1905.11169v1
PDF https://arxiv.org/pdf/1905.11169v1.pdf
PWC https://paperswithcode.com/paper/physics-as-inverse-graphics-joint
Repo https://github.com/seuqaj114/paig
Framework tf

Learning to Self-Train for Semi-Supervised Few-Shot Classification

Title Learning to Self-Train for Semi-Supervised Few-Shot Classification
Authors Xinzhe Li, Qianru Sun, Yaoyao Liu, Shibao Zheng, Qin Zhou, Tat-Seng Chua, Bernt Schiele
Abstract Few-shot classification (FSC) is challenging due to the scarcity of labeled training data (e.g. only one labeled data point per class). Meta-learning has been shown to achieve promising results by learning to initialize a classification model for FSC. In this paper we propose a novel semi-supervised meta-learning method called learning to self-train (LST) that leverages unlabeled data and specifically meta-learns how to cherry-pick and label such unsupervised data to further improve performance. To this end, we train the LST model through a large number of semi-supervised few-shot tasks. On each task, we train a few-shot model to predict pseudo labels for unlabeled data, and then iterate the self-training steps on labeled and pseudo-labeled data with each step followed by fine-tuning. We additionally learn a soft weighting network (SWN) to optimize the self-training weights of pseudo labels so that better ones can contribute more to gradient descent optimization. We evaluate our LST method on two ImageNet benchmarks for semi-supervised few-shot classification and achieve large improvements over the state-of-the-art method. Code is at https://github.com/xinzheli1217/learning-to-self-train.
Tasks Meta-Learning
Published 2019-06-03
URL https://arxiv.org/abs/1906.00562v2
PDF https://arxiv.org/pdf/1906.00562v2.pdf
PWC https://paperswithcode.com/paper/190600562
Repo https://github.com/xinzheli1217/learning-to-self-train
Framework tf
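The pseudo-labeling step described in the abstract can be sketched in its simplest form: label each unlabeled sample with its most probable class, then keep only the most confident samples per class. This is an illustrative hard-selection rule with a hypothetical function name and made-up probabilities; the learned soft weighting network and the meta-learning loop are not modeled here.

```python
def cherry_pick(probs, per_class):
    """Select confident pseudo-labeled samples.

    probs: list of per-sample class-probability lists.
    Returns (sample_index, pseudo_label, confidence) tuples, keeping at most
    `per_class` of the most confident samples for each predicted class.
    """
    by_label = {}
    for i, p in enumerate(probs):
        label = max(range(len(p)), key=lambda c: p[c])   # argmax class
        by_label.setdefault(label, []).append((i, label, p[label]))
    picked = []
    for label in sorted(by_label):
        ranked = sorted(by_label[label], key=lambda t: t[2], reverse=True)
        picked.extend(ranked[:per_class])
    return picked


# Hypothetical predictions of a few-shot model on four unlabeled samples.
probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.45, 0.55]]
selected = cherry_pick(probs, per_class=1)
```

With these numbers, samples 0 and 2 are kept and the two less confident predictions are dropped before the next self-training step.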

Adversarially Learned Abnormal Trajectory Classifier

Title Adversarially Learned Abnormal Trajectory Classifier
Authors Pankaj Raj Roy, Guillaume-Alexandre Bilodeau
Abstract We address the problem of abnormal event detection from trajectory data. In this paper, a new adversarial approach is proposed for building a deep neural network binary classifier, trained in an unsupervised fashion, that can distinguish normal from abnormal trajectory-based events without the need for setting a manual detection threshold. Inspired by the generative adversarial network (GAN) framework, our GAN version is a discriminative one in which the discriminator is trained to distinguish normal and abnormal trajectory reconstruction errors given by a deep autoencoder. With urban traffic videos and their associated trajectories, our proposed method gives the best accuracy for abnormal trajectory detection. In addition, our model can easily be generalized for abnormal trajectory-based event detection and can still yield the best behavioural detection results as demonstrated on the CAVIAR dataset.
Tasks
Published 2019-03-26
URL http://arxiv.org/abs/1903.11040v2
PDF http://arxiv.org/pdf/1903.11040v2.pdf
PWC https://paperswithcode.com/paper/adversarially-learned-abnormal-trajectory
Repo https://github.com/proy3/Abnormal_Trajectory_Classifier
Framework tf

An empirical comparison between stochastic and deterministic centroid initialisation for K-Means variations

Title An empirical comparison between stochastic and deterministic centroid initialisation for K-Means variations
Authors Avgoustinos Vouros, Stephen Langdell, Mike Croucher, Eleni Vasilaki
Abstract K-Means is one of the most used algorithms for data clustering and the usual clustering method for benchmarking. Despite its wide application, it is well known that it suffers from a series of disadvantages, such as its sensitivity to the positions of the initial clustering centres (centroids), which can greatly affect the clustering solution. Over the years many K-Means variations and initialisation techniques have been proposed with different degrees of complexity. In this study we focus on common K-Means variations and deterministic initialisation techniques and we first show that more sophisticated initialisation methods reduce or alleviate the need for complex K-Means clustering, and secondly, that deterministic methods can achieve equivalent or better performance than stochastic methods. These conclusions are obtained through extensive benchmarking using different model data sets from various studies as well as clustering data sets.
Tasks
Published 2019-08-26
URL https://arxiv.org/abs/1908.09946v4
PDF https://arxiv.org/pdf/1908.09946v4.pdf
PWC https://paperswithcode.com/paper/an-empirical-comparison-between-stochastic
Repo https://github.com/avouros/clustering-workplace
Framework none
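The contrast between stochastic and deterministic centroid initialisation can be sketched with two simple seeding rules. Both are illustrative stand-ins with hypothetical names: uniform random sampling for the stochastic side and a farthest-first (maximin) rule for the deterministic side. Neither is claimed to be one of the specific methods benchmarked in the study.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_init(points, k, seed):
    """Stochastic: pick k distinct data points uniformly at random."""
    return random.Random(seed).sample(points, k)


def farthest_first_init(points, k):
    """Deterministic: start from the point nearest the data mean, then
    repeatedly add the point farthest from all centroids chosen so far."""
    dim = len(points[0])
    mean = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    centroids = [min(points, key=lambda p: sq_dist(p, mean))]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


# Two well-separated toy clusters (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
deterministic = farthest_first_init(points, 2)
stochastic = random_init(points, 2, seed=0)
```

The deterministic rule returns the same centroids on every run and here places one in each cluster, whereas the random rule depends on the seed and may put both centroids in the same cluster.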

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

Title vGraph: A Generative Model for Joint Community Detection and Node Representation Learning
Authors Fan-Yun Sun, Meng Qu, Jordan Hoffmann, Chin-Wei Huang, Jian Tang
Abstract This paper focuses on two fundamental tasks of graph analysis: community detection and node representation learning, which capture the global and local structures of graphs, respectively. In the current literature, these two tasks are usually independently studied while they are actually highly correlated. We propose a probabilistic generative model called vGraph to learn community membership and node representation collaboratively. Specifically, we assume that each node can be represented as a mixture of communities, and each community is defined as a multinomial distribution over nodes. Both the mixing coefficients and the community distribution are parameterized by the low-dimensional representations of the nodes and communities. We designed an effective variational inference algorithm which regularizes the community membership of neighboring nodes to be similar in the latent space. Experimental results on multiple real-world graphs show that vGraph is very effective in both community detection and node representation learning, outperforming many competitive baselines in both tasks. We show that the framework of vGraph is quite flexible and can be easily extended to detect hierarchical communities.
Tasks Community Detection, Representation Learning
Published 2019-06-18
URL https://arxiv.org/abs/1906.07159v2
PDF https://arxiv.org/pdf/1906.07159v2.pdf
PWC https://paperswithcode.com/paper/vgraph-a-generative-model-for-joint-community
Repo https://github.com/fanyun-sun/vGraph
Framework pytorch
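The generative story in the abstract (each node is a mixture of communities, each community is a distribution over nodes, both parameterized by embeddings) can be sketched as follows. The softmax-of-dot-products parameterization, the function names, and the toy embeddings are assumptions for illustration; the variational inference procedure is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def community_given_node(node_vec, community_vecs):
    """Mixing coefficients p(community | node)."""
    return softmax([dot(node_vec, c) for c in community_vecs])


def node_given_community(community_vec, node_vecs):
    """Community distribution p(node | community) over all nodes."""
    return softmax([dot(community_vec, v) for v in node_vecs])


def neighbor_distribution(w, node_vecs, community_vecs):
    """p(v | w) = sum over communities of p(c | w) * p(v | c)."""
    mix = community_given_node(node_vecs[w], community_vecs)
    per_comm = [node_given_community(c, node_vecs) for c in community_vecs]
    return [
        sum(mix[k] * per_comm[k][v] for k in range(len(community_vecs)))
        for v in range(len(node_vecs))
    ]


# Toy embeddings (hypothetical): three nodes, two communities.
node_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
community_vecs = [[2.0, 0.0], [0.0, 2.0]]
mix0 = community_given_node(node_vecs[0], community_vecs)
p_neighbors = neighbor_distribution(0, node_vecs, community_vecs)
```

In this sketch, a node's community membership is read off from the mixing coefficients, while the node embeddings themselves serve as the learned representations.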

Bi-Directional ConvLSTM U-Net with Densley Connected Convolutions

Title Bi-Directional ConvLSTM U-Net with Densley Connected Convolutions
Authors Reza Azad, Maryam Asadi-Aghbolaghi, Mahmood Fathy, Sergio Escalera
Abstract In recent years, deep learning-based networks have achieved state-of-the-art performance in medical image segmentation. Among the existing networks, U-Net has been successfully applied to medical image segmentation. In this paper, we propose an extension of U-Net, Bi-directional ConvLSTM U-Net with Densely connected convolutions (BCDU-Net), for medical image segmentation, in which we take full advantage of U-Net, bi-directional ConvLSTM (BConvLSTM) and the mechanism of dense convolutions. Instead of a simple concatenation in the skip connection of U-Net, we employ BConvLSTM to combine the feature maps extracted from the corresponding encoding path and the previous decoding up-convolutional layer in a non-linear way. To strengthen feature propagation and encourage feature reuse, we use densely connected convolutions in the last convolutional layer of the encoding path. Finally, we can accelerate the convergence speed of the proposed network by employing batch normalization (BN). The proposed model is evaluated on three datasets covering retinal blood vessel segmentation, skin lesion segmentation, and lung nodule segmentation, achieving state-of-the-art performance.
Tasks Lesion Segmentation, Lung Nodule Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2019-08-31
URL https://arxiv.org/abs/1909.00166v1
PDF https://arxiv.org/pdf/1909.00166v1.pdf
PWC https://paperswithcode.com/paper/bi-directional-convlstm-u-net-with-densley
Repo https://github.com/rezazad68/BCDU-Net
Framework tf