February 1, 2020

2997 words 15 mins read

Paper Group AWR 237

Star-Transformer. Estimating Solar Irradiance Using Sky Imagers. Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search. Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach. Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation. Joint Learning of Saliency Detecti …

Star-Transformer

Title Star-Transformer
Authors Qipeng Guo, Xipeng Qiu, Pengfei Liu, Yunfan Shao, Xiangyang Xue, Zheng Zhang
Abstract Although the Transformer has achieved great success on many NLP tasks, its heavy structure with fully-connected attention leads to a dependency on large amounts of training data. In this paper, we present Star-Transformer, a lightweight alternative obtained by careful sparsification. To reduce model complexity, we replace the fully-connected structure with a star-shaped topology, in which every pair of non-adjacent nodes is connected through a shared relay node. Complexity is thus reduced from quadratic to linear, while the capacity to capture both local composition and long-range dependencies is preserved. Experiments on four tasks (22 datasets) show that Star-Transformer achieves significant improvements over the standard Transformer on modestly sized datasets.
Tasks Named Entity Recognition, Natural Language Inference, Sentiment Analysis, Text Classification
Published 2019-02-25
URL http://arxiv.org/abs/1902.09113v2
PDF http://arxiv.org/pdf/1902.09113v2.pdf
PWC https://paperswithcode.com/paper/star-transformer
Repo https://github.com/fastnlp/fastNLP
Framework pytorch
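
The core of the paper is the star-shaped sparsification of attention. Below is a minimal, self-contained PyTorch sketch of that idea, not the authors' fastNLP implementation: each token attends only to its ring neighbors plus a shared relay node, and the relay attends to all tokens, so cost grows linearly with sequence length. The window size and initialization are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def star_attention_step(h, relay, window=1):
    """h: (n, d) token states, relay: (d,) shared relay state."""
    n, d = h.shape
    new_h = torch.empty_like(h)
    for i in range(n):
        # context = local neighbors within the window + the shared relay node
        lo, hi = max(0, i - window), min(n, i + window + 1)
        ctx = torch.cat([h[lo:hi], relay.unsqueeze(0)], dim=0)   # (k, d)
        attn = F.softmax(h[i] @ ctx.t() / d ** 0.5, dim=-1)      # (k,)
        new_h[i] = attn @ ctx
    # the relay attends to every token, gathering long-range information
    relay_attn = F.softmax(relay @ h.t() / d ** 0.5, dim=-1)     # (n,)
    new_relay = relay_attn @ h
    return new_h, new_relay

h = torch.randn(6, 16)          # toy sequence of 6 tokens
relay = h.mean(dim=0)           # relay initialized as the mean token state
h, relay = star_attention_step(h, relay)
```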

Estimating Solar Irradiance Using Sky Imagers

Title Estimating Solar Irradiance Using Sky Imagers
Authors Soumyabrata Dev, Florian M. Savoy, Yee Hui Lee, Stefan Winkler
Abstract Ground-based whole sky cameras are now extensively used for localized monitoring of clouds. They capture hemispherical images of the sky at regular intervals using a fisheye lens. In this paper, we propose a framework for estimating solar irradiance from the pictures taken by those imagers. Unlike pyranometers, such sky images contain information about cloud coverage and can be used to derive cloud movement. An accurate estimation of solar irradiance using solely those images is thus a first step towards short-term forecasting of solar energy generation based on cloud movement. We derive and validate our model using pyranometers co-located with our whole sky imagers. We achieve better performance in estimating solar irradiance, and in particular its short-term variations, than other related methods using ground-based observations.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.04981v1
PDF https://arxiv.org/pdf/1910.04981v1.pdf
PWC https://paperswithcode.com/paper/estimating-solar-irradiance-using-sky-imagers
Repo https://github.com/Soumyabrata/estimate-solar-irradiance
Framework none

Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search

Title Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search
Authors Xiangxiang Chu, Bo Zhang, Ruijun Xu, Hailong Ma
Abstract Fabricating neural models for a wide range of mobile devices demands specific network designs due to highly constrained resources. Both evolutionary algorithms (EA) and reinforcement learning (RL) methods have been applied to neural architecture search. However, these approaches usually concentrate on a single objective, such as the error rate of image classification, and fail to harness the benefits of both sides. In this paper, we present a new multi-objective oriented algorithm called MoreMNAS (Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search) that leverages the virtues of both EA and RL. In particular, we incorporate a variant of the multi-objective genetic algorithm NSGA-II, in which the search space is composed of various cells so that crossovers and mutations can be performed at the cell level. Moreover, reinforced control is mixed with a natural mutating process to regulate arbitrary mutation, maintaining a delicate balance between exploration and exploitation. Therefore, not only does our method prevent the searched models from degrading during the evolution process, but it also makes better use of learned knowledge. Our experiments in the super-resolution (SR) domain deliver models that rival some state-of-the-art methods with fewer FLOPs.
Tasks Image Classification, Neural Architecture Search, Super-Resolution
Published 2019-01-04
URL http://arxiv.org/abs/1901.01074v3
PDF http://arxiv.org/pdf/1901.01074v3.pdf
PWC https://paperswithcode.com/paper/multi-objective-reinforced-evolution-in
Repo https://github.com/moremnas/MoreMNAS
Framework tf
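
To make the cell-level, multi-objective evolution concrete, here is a toy Python sketch in the spirit of the abstract, not the released MoreMNAS code: architectures are lists of cell choices, crossover and mutation act at cell granularity, and survivors are the Pareto-nondominated set under two objectives (error, FLOPs). The cell vocabulary and both objective functions are hypothetical stand-ins, and the RL-controlled mutation is omitted.

```python
import random

CELLS = ["conv3x3", "conv5x5", "sep3x3", "skip"]
COST = {"conv3x3": 3.0, "conv5x5": 5.0, "sep3x3": 2.0, "skip": 0.5}

def random_arch(n_cells=6):
    return [random.choice(CELLS) for _ in range(n_cells)]

def evaluate(arch):
    # stand-in objectives: a fake "error" and a FLOPs-like cost
    flops = sum(COST[c] for c in arch)
    error = 1.0 / (1.0 + flops) + 0.01 * random.random()
    return error, flops

def crossover(a, b):
    cut = random.randrange(1, len(a))        # cell-level crossover point
    return a[:cut] + b[cut:]

def mutate(arch, rate=0.2):
    return [random.choice(CELLS) if random.random() < rate else c for c in arch]

def pareto_front(pop):
    front = []
    for cand, (e1, f1) in pop:
        dominated = any(e2 <= e1 and f2 <= f1 and (e2, f2) != (e1, f1)
                        for _, (e2, f2) in pop)
        if not dominated:
            front.append((cand, (e1, f1)))
    return front

population = [(a, evaluate(a)) for a in (random_arch() for _ in range(16))]
for _ in range(10):
    parents = [a for a, _ in population]
    child = mutate(crossover(random.choice(parents), random.choice(parents)))
    population.append((child, evaluate(child)))
    population = pareto_front(population)     # keep only nondominated models
print(population)
```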

Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach

Title Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach
Authors Bingfeng Zhang, Jimin Xiao, Yunchao Wei, Mingjie Sun, Kaizhu Huang
Abstract Weakly supervised semantic segmentation is a challenging task: it only takes image-level information as supervision for training but produces pixel-level predictions for testing. To address this, most recent state-of-the-art approaches adopt two-step solutions, i.e., 1) learn to generate pseudo pixel-level masks, and 2) engage FCNs to train the semantic segmentation networks with the pseudo masks. However, two-step solutions usually employ many bells and whistles to produce high-quality pseudo masks, making such methods complicated and inelegant. In this work, we harness the image-level labels to produce reliable pixel-level annotations and design a fully end-to-end network that learns to predict segmentation maps. Concretely, we first leverage an image classification branch to generate class activation maps for the annotated categories, which are further pruned into confident yet tiny object/background regions. Such reliable regions then serve directly as ground-truth labels for a parallel segmentation branch, where a newly designed dense energy loss function is adopted for optimization. Despite its apparent simplicity, our one-step solution achieves competitive mIoU scores (val: 62.6, test: 62.9) on Pascal VOC compared with two-step state-of-the-art methods. By extending our one-step method to two steps, we set a new state of the art on Pascal VOC (val: 66.3, test: 66.5).
Tasks Image Classification, Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2019-11-19
URL https://arxiv.org/abs/1911.08039v1
PDF https://arxiv.org/pdf/1911.08039v1.pdf
PWC https://paperswithcode.com/paper/reliability-does-matter-an-end-to-end-weakly
Repo https://github.com/zbf1991/RRM
Framework pytorch
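
A minimal sketch of the "reliable region" idea from the abstract above, not the authors' RRM code: only confidently-activated CAM pixels become foreground labels, only confidently-inactive pixels become background, and everything else is ignored during segmentation training. The thresholds are illustrative assumptions.

```python
import numpy as np

IGNORE = 255  # common "ignore" label value in segmentation training

def reliable_pseudo_labels(cam, fg_thresh=0.7, bg_thresh=0.1, class_id=1):
    """cam: (H, W) class activation map in [0, 1] for one annotated class."""
    labels = np.full(cam.shape, IGNORE, dtype=np.uint8)  # default: ignore
    labels[cam >= fg_thresh] = class_id                  # confident object pixels
    labels[cam <= bg_thresh] = 0                         # confident background
    return labels

cam = np.random.rand(4, 4)
print(reliable_pseudo_labels(cam))
```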

Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation

Title Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation
Authors Yude Wang, Jie Zhang, Meina Kan, Shiguang Shan, Xilin Chen
Abstract Weakly supervised semantic segmentation has attracted much research interest in recent years, given its advantage of low labeling cost. Most advanced algorithms follow the design principle of expanding and constraining the seed regions from class activation maps (CAM). As is well known, conventional CAM tends to be incomplete or over-activated due to weak supervision. Fortunately, semantic segmentation has a characteristic of spatial transformation equivariance, which can provide self-supervision to help weakly supervised learning. This work mainly explores the advantages of scale equivariant constraints for CAM generation, formulated as a self-supervised scale equivariant network (SSENet). Specifically, a novel scale equivariant regularization is designed to ensure consistency of CAMs from the same input image at different resolutions, which guides the whole network to learn more accurate class activations. The regularized CAM can be embedded in most recent advanced weakly supervised semantic segmentation frameworks. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate that our method achieves state-of-the-art performance, both quantitatively and qualitatively, for weakly supervised semantic segmentation. Code has been made available.
Tasks Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2019-09-09
URL https://arxiv.org/abs/1909.03714v1
PDF https://arxiv.org/pdf/1909.03714v1.pdf
PWC https://paperswithcode.com/paper/self-supervised-scale-equivariant-network-for
Repo https://github.com/YudeWang/SSENet-pytorch
Framework pytorch
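
A hedged PyTorch sketch of the scale equivariant regularization described above, not the released SSENet-pytorch code: CAMs computed on a downscaled copy of the image should, after resizing back, agree with CAMs computed on the original image. The toy 1x1-conv "CAM head" is a hypothetical stand-in for a real classification backbone.

```python
import torch
import torch.nn.functional as F

def scale_equivariance_loss(cam_fn, image, scale=0.5):
    """cam_fn: any callable mapping a (B, 3, H, W) batch to (B, C, H, W) CAMs."""
    cam_full = cam_fn(image)
    small = F.interpolate(image, scale_factor=scale, mode="bilinear",
                          align_corners=False)
    cam_small = cam_fn(small)
    cam_small_up = F.interpolate(cam_small, size=cam_full.shape[-2:],
                                 mode="bilinear", align_corners=False)
    # penalize disagreement between CAMs of the two views of the same image
    return F.l1_loss(cam_full, cam_small_up)

head = torch.nn.Conv2d(3, 21, kernel_size=1)   # hypothetical CAM head
loss = scale_equivariance_loss(head, torch.randn(2, 3, 64, 64))
loss.backward()
```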

Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation

Title Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation
Authors Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang
Abstract Existing weakly supervised semantic segmentation (WSSS) methods usually utilize the results of pre-trained saliency detection (SD) models without explicitly modeling the connections between the two tasks, which is not the most efficient configuration. Here we propose a unified multi-task learning framework to jointly solve WSSS and SD using a single network, i.e., a saliency and segmentation network (SSNet). SSNet consists of a segmentation network (SN) and a saliency aggregation module (SAM). For an input image, SN generates the segmentation result, and SAM predicts the saliency of each category and aggregates the segmentation masks of all categories into a saliency map. The proposed network is trained end-to-end with image-level category labels and class-agnostic pixel-level saliency labels. Experiments on the PASCAL VOC 2012 segmentation dataset and four saliency benchmark datasets show that our method compares favorably against state-of-the-art weakly supervised segmentation methods and fully supervised saliency detection methods.
Tasks Multi-Task Learning, Saliency Detection, Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2019-09-09
URL https://arxiv.org/abs/1909.04161v1
PDF https://arxiv.org/pdf/1909.04161v1.pdf
PWC https://paperswithcode.com/paper/joint-learning-of-saliency-detection-and
Repo https://github.com/zengxianyu/jsws
Framework none
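
A minimal sketch of the saliency aggregation idea in the abstract above, not the released jsws code: per-category saliency scores weight the predicted segmentation masks, and the weighted masks are summed into a single saliency map. Shapes and the clipping step are illustrative assumptions.

```python
import numpy as np

def aggregate_saliency(masks, category_saliency):
    """masks: (C, H, W) soft segmentation masks; category_saliency: (C,) in [0, 1]."""
    weighted = masks * category_saliency[:, None, None]
    return weighted.sum(axis=0).clip(0.0, 1.0)

masks = np.random.rand(20, 8, 8)            # 20 foreground categories (toy)
category_saliency = np.random.rand(20)      # predicted saliency per category
saliency_map = aggregate_saliency(masks, category_saliency)
print(saliency_map.shape)                   # (8, 8)
```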

Sparse Reduced-Rank Regression for Simultaneous Rank and Variable Selection via Manifold Optimization

Title Sparse Reduced-Rank Regression for Simultaneous Rank and Variable Selection via Manifold Optimization
Authors Kohei Yoshikawa, Shuichi Kawano
Abstract We consider the problem of constructing a reduced-rank regression model whose coefficient parameter is represented as a singular value decomposition with sparse singular vectors. The traditional estimation procedure for the coefficient parameter often fails when the true rank of the parameter is high. To overcome this issue, we develop an estimation algorithm with rank and variable selection via sparse regularization and manifold optimization, which enables us to obtain an accurate estimate of the coefficient parameter even when its true rank is high. Sparse regularization also allows us to select an optimal value of the rank. We conduct Monte Carlo experiments and a real data analysis to illustrate the effectiveness of the proposed method.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.05083v2
PDF https://arxiv.org/pdf/1910.05083v2.pdf
PWC https://paperswithcode.com/paper/sparse-reduced-rank-regression-for
Repo https://github.com/yoshikawa-kohei/RVSManOpt
Framework none

Training Agents using Upside-Down Reinforcement Learning

Title Training Agents using Upside-Down Reinforcement Learning
Authors Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski, Jürgen Schmidhuber
Abstract Traditional Reinforcement Learning (RL) algorithms either predict rewards with value functions or maximize them using policy search. We study an alternative: Upside-Down Reinforcement Learning (Upside-Down RL or UDRL), which solves RL problems primarily using supervised learning techniques. Many of its main principles are outlined in a companion report [34]. Here we present the first concrete implementation of UDRL and demonstrate its feasibility on certain episodic learning problems. Experimental results show that its performance can be surprisingly competitive with, and even exceed, that of traditional baseline algorithms developed over decades of research.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02877v1
PDF https://arxiv.org/pdf/1912.02877v1.pdf
PWC https://paperswithcode.com/paper/training-agents-using-upside-down
Repo https://github.com/parthchadha/upsideDownRL
Framework pytorch
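
A compact sketch of the Upside-Down RL training idea, not the authors' implementation: a "behavior function" is trained with plain supervised learning to reproduce the action that was taken, conditioned on the state and on a command (desired return, desired horizon) computed from replayed episodes. The environment, network size, and replay data here are toy assumptions.

```python
import random
import torch
import torch.nn as nn

state_dim, n_actions = 4, 2
behavior = nn.Sequential(nn.Linear(state_dim + 2, 32), nn.ReLU(),
                         nn.Linear(32, n_actions))
opt = torch.optim.Adam(behavior.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# toy replay buffer of episodes: lists of (state, action, reward)
episodes = [[(torch.randn(state_dim), random.randrange(n_actions), random.random())
             for _ in range(10)] for _ in range(20)]

for _ in range(100):
    ep = random.choice(episodes)
    t = random.randrange(len(ep))
    state, action, _ = ep[t]
    desired_return = sum(r for _, _, r in ep[t:])   # return actually achieved
    desired_horizon = float(len(ep) - t)            # steps actually remaining
    command = torch.tensor([desired_return, desired_horizon])
    logits = behavior(torch.cat([state, command]))
    loss = loss_fn(logits.unsqueeze(0), torch.tensor([action]))  # supervised target
    opt.zero_grad()
    loss.backward()
    opt.step()
```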

Urban Sound Tagging using Convolutional Neural Networks

Title Urban Sound Tagging using Convolutional Neural Networks
Authors Sainath Adapa
Abstract In this paper, we propose a framework for environmental sound classification in a low-data context (less than 100 labeled examples per class). We show that using pre-trained image classification models along with data augmentation techniques results in higher performance over alternative approaches. We applied this system to the task of Urban Sound Tagging, part of the DCASE 2019 challenge, where the objective was to label different sources of noise from raw audio data. A modified form of MobileNetV2, a convolutional neural network (CNN) model, was trained to classify both coarse and fine tags jointly. The proposed model uses the log-scaled Mel-spectrogram as the representation format for the audio data. Mixup, random erasing, scaling, and shifting are used as data augmentation techniques. A second model that uses scaled labels was built to account for human errors in the annotations. The proposed model achieved first rank on the leaderboard, with Micro-AUPRC values of 0.751 and 0.860 on fine and coarse tags, respectively.
Tasks Data Augmentation, Environmental Sound Classification, Image Classification
Published 2019-09-27
URL https://arxiv.org/abs/1909.12699v1
PDF https://arxiv.org/pdf/1909.12699v1.pdf
PWC https://paperswithcode.com/paper/urban-sound-tagging-using-convolutional
Repo https://github.com/sainathadapa/urban-sound-tagging
Framework pytorch
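
A hedged sketch of two ingredients mentioned in the abstract, not the released repo: log-scaled Mel-spectrogram features and mixup augmentation. librosa is assumed to be available; the audio clips, tag vectors, and mixup alpha are toy values.

```python
import numpy as np
import librosa

def log_mel(y, sr, n_mels=128):
    spec = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(spec, ref=np.max)      # log-scaled Mel-spectrogram

def mixup(x1, y1, x2, y2, alpha=0.2):
    lam = np.random.beta(alpha, alpha)                 # mixing coefficient
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

sr = 22050
a, b = np.random.randn(sr), np.random.randn(sr)        # two 1-second toy clips
ya, yb = np.array([1.0, 0.0]), np.array([0.0, 1.0])    # one-hot coarse tags
x_mix, y_mix = mixup(log_mel(a, sr), ya, log_mel(b, sr), yb)
print(x_mix.shape, y_mix)
```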

Multi-scale Dynamic Graph Convolutional Network for Hyperspectral Image Classification

Title Multi-scale Dynamic Graph Convolutional Network for Hyperspectral Image Classification
Authors Sheng Wan, Chen Gong, Ping Zhong, Bo Du, Lefei Zhang, Jian Yang
Abstract Convolutional Neural Network (CNN) has demonstrated an impressive ability to represent hyperspectral images and to achieve promising results in hyperspectral image classification. However, traditional CNN models can only operate convolution on regular square image regions with fixed size and weights, so they cannot universally adapt to distinct local regions with various object distributions and geometric appearances. Therefore, their classification performance is still to be improved, especially at class boundaries. To alleviate this shortcoming, we consider employing the recently proposed Graph Convolutional Network (GCN) for hyperspectral image classification, as it can conduct convolution on arbitrarily structured non-Euclidean data and is applicable to irregular image regions represented by graph topological information. Different from the commonly used GCN models, which work on a fixed graph, we enable the graph to be dynamically updated along with the graph convolution process, so that these two steps can benefit from each other to gradually produce discriminative embedded features as well as a refined graph. Moreover, to comprehensively exploit the multi-scale information inherent in hyperspectral images, we establish multiple input graphs with different neighborhood scales to extensively exploit the diversified spectral-spatial correlations at multiple scales. Therefore, our method is termed 'Multi-scale Dynamic Graph Convolutional Network' (MDGCN). The experimental results on three typical benchmark datasets firmly demonstrate the superiority of the proposed MDGCN over other state-of-the-art methods in both qualitative and quantitative aspects.
Tasks Hyperspectral Image Classification, Image Classification
Published 2019-05-14
URL https://arxiv.org/abs/1905.06133v1
PDF https://arxiv.org/pdf/1905.06133v1.pdf
PWC https://paperswithcode.com/paper/multi-scale-dynamic-graph-convolutional
Repo https://github.com/LEAP-WS/MDGCN
Framework tf
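
A small NumPy sketch of the "dynamic graph" idea described above, not the authors' MDGCN code: a graph convolution step whose adjacency is recomputed from the current embeddings, so the graph is refined along with the features. Multi-scale behaviour would use several neighborhood sizes; a single k is shown here, and all sizes are toy assumptions.

```python
import numpy as np

def knn_adjacency(x, k=3):
    """Symmetric, row-normalized kNN graph built from current features x: (N, d)."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    adj = np.zeros_like(d2)
    for i in range(len(x)):
        nn = np.argsort(d2[i])[1:k + 1]               # skip the node itself
        adj[i, nn] = 1.0
    adj = np.maximum(adj, adj.T) + np.eye(len(x))     # symmetrize, add self-loops
    return adj / adj.sum(1, keepdims=True)

def dynamic_gcn_layer(x, w, k=3):
    adj = knn_adjacency(x, k)                         # graph rebuilt from embeddings
    return np.maximum(adj @ x @ w, 0.0)               # propagate + ReLU

x = np.random.randn(10, 8)                            # 10 superpixels, 8-dim features
w1, w2 = np.random.randn(8, 8), np.random.randn(8, 8)
x = dynamic_gcn_layer(x, w1)                          # graph from raw features
x = dynamic_gcn_layer(x, w2)                          # graph updated from new features
```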

Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization

Title Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Authors Eldad Meller, Alexander Finkelstein, Uri Almog, Mark Grobman
Abstract Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks: for a given layer, individual output channels can be scaled by any factor provided that the corresponding weights of the next layer are inversely scaled. Therefore, a given network has many factorizations that change the weights of the network without changing its function. We present a conceptually simple and easy-to-implement method that uses this property, and show that proper factorizations significantly decrease the degradation caused by quantization. We show improvement on a wide variety of networks and achieve state-of-the-art degradation results for MobileNets. While our focus is on quantization, this type of factorization is also applicable to other domains such as network pruning, neural network regularization, and network interpretability.
Tasks Network Pruning, Quantization
Published 2019-02-05
URL http://arxiv.org/abs/1902.01917v1
PDF http://arxiv.org/pdf/1902.01917v1.pdf
PWC https://paperswithcode.com/paper/same-same-but-different-recovering-neural
Repo https://github.com/Adamdad/Samesame
Framework tf
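
A minimal NumPy sketch of the factorization described in the abstract: scaling the output channels of one layer and inversely scaling the matching input channels of the next leaves the network function unchanged (for positively homogeneous activations such as ReLU), while changing the weight ranges seen by the quantizer. The particular choice of scale below is one common equalizing heuristic, not necessarily the paper's.

```python
import numpy as np

def equalize_pair(w1, b1, w2):
    """w1: (c_out, c_in), b1: (c_out,), w2: (c_next, c_out) for two linear layers."""
    r1 = np.abs(w1).max(axis=1)            # per-output-channel range of layer 1
    r2 = np.abs(w2).max(axis=0)            # per-input-channel range of layer 2
    s = np.sqrt(r2 / (r1 + 1e-12))         # one equalizing choice of channel scales
    return w1 * s[:, None], b1 * s, w2 / s[None, :]

w1, b1, w2 = np.random.randn(4, 3), np.random.randn(4), np.random.randn(2, 4)
x = np.random.randn(3)
w1e, b1e, w2e = equalize_pair(w1, b1, w2)
before = w2 @ np.maximum(w1 @ x + b1, 0.0)
after = w2e @ np.maximum(w1e @ x + b1e, 0.0)
print(np.allclose(before, after))          # True: the network function is preserved
```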

NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization

Title NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization
Authors Ali Ramezani-Kebrya, Fartash Faghri, Daniel M. Roy
Abstract As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed on clusters to perform model fitting in parallel. Alistarh et al. (2017) describe two variants of data-parallel SGD that quantize and encode gradients to lessen communication costs. For the first variant, QSGD, they provide strong theoretical guarantees. For the second variant, which we call QSGDinf, they demonstrate impressive empirical gains for distributed training of large neural networks. Building on their work, we propose an alternative scheme for quantizing gradients and show that it yields stronger theoretical guarantees than exist for QSGD while matching the empirical performance of QSGDinf.
Tasks Quantization
Published 2019-08-16
URL https://arxiv.org/abs/1908.06077v1
PDF https://arxiv.org/pdf/1908.06077v1.pdf
PWC https://paperswithcode.com/paper/nuqsgd-improved-communication-efficiency-for
Repo https://github.com/fartashf/nuqsgd
Framework pytorch
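
A hedged sketch of nonuniform stochastic gradient quantization in the spirit of the NUQSGD abstract above, not the released code: each coordinate's normalized magnitude is stochastically rounded to one of a set of exponentially spaced levels, so only the norm, signs, and level indices would need to be communicated. The level set is an illustrative assumption.

```python
import numpy as np

def quantize(grad, levels=np.array([0.0, 0.125, 0.25, 0.5, 1.0])):
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return grad
    p = np.abs(grad) / norm                      # normalized magnitudes in [0, 1]
    out = np.empty_like(grad)
    for i, v in enumerate(p):
        hi = np.searchsorted(levels, v)          # first level >= v
        if hi == 0:
            q = levels[0]
        else:
            lo = hi - 1
            frac = (v - levels[lo]) / (levels[hi] - levels[lo])
            q = levels[hi] if np.random.rand() < frac else levels[lo]
        out[i] = np.sign(grad[i]) * norm * q     # unbiased stochastic rounding
    return out

g = np.random.randn(8)
print(g, quantize(g), sep="\n")
```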

3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization

Title 3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization
Authors Tsun-Hsuan Wang, Hou-Ning Hu, Chieh Hubert Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun
Abstract The complementary characteristics of active and passive depth sensing techniques motivate the fusion of the LiDAR sensor and stereo camera for improved depth perception. Instead of directly fusing estimated depths across LiDAR and stereo modalities, we take advantage of the stereo matching network with two enhanced techniques: Input Fusion and Conditional Cost Volume Normalization (CCVNorm) on the LiDAR information. The proposed framework is generic and closely integrated with the cost volume component that is commonly utilized in stereo matching neural networks. We experimentally verify the efficacy and robustness of our method on the KITTI Stereo and Depth Completion datasets, obtaining favorable performance against various fusion strategies. Moreover, we demonstrate that, with a hierarchical extension of CCVNorm, the proposed method brings only slight overhead to the stereo matching network in terms of computation time and model size. For project page, see https://zswang666.github.io/Stereo-LiDAR-CCVNorm-Project-Page/
Tasks Depth Completion, Stereo Matching, Stereo Matching Hand
Published 2019-04-05
URL http://arxiv.org/abs/1904.02917v1
PDF http://arxiv.org/pdf/1904.02917v1.pdf
PWC https://paperswithcode.com/paper/3d-lidar-and-stereo-fusion-using-stereo
Repo https://github.com/zswang666/Stereo-LiDAR-CCVNorm
Framework pytorch
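
A rough PyTorch sketch of the conditional normalization idea behind CCVNorm, not the released implementation: normalization parameters for the stereo cost volume are predicted from a LiDAR-derived feature map, so the LiDAR input modulates the cost volume rather than being fused as extra input channels. The per-pixel gamma/beta prediction via 1x1 convolutions and the instance normalization are assumptions for illustration.

```python
import torch
import torch.nn as nn

class ConditionalCostVolumeNorm(nn.Module):
    def __init__(self, cost_channels, lidar_channels):
        super().__init__()
        self.norm = nn.InstanceNorm3d(cost_channels, affine=False)
        # per-pixel gamma/beta predicted from the LiDAR feature map (assumption)
        self.to_gamma = nn.Conv2d(lidar_channels, cost_channels, 1)
        self.to_beta = nn.Conv2d(lidar_channels, cost_channels, 1)

    def forward(self, cost, lidar_feat):
        """cost: (B, C, D, H, W) cost volume; lidar_feat: (B, L, H, W)."""
        gamma = self.to_gamma(lidar_feat).unsqueeze(2)   # broadcast over disparity D
        beta = self.to_beta(lidar_feat).unsqueeze(2)
        return self.norm(cost) * (1.0 + gamma) + beta

ccvn = ConditionalCostVolumeNorm(cost_channels=16, lidar_channels=8)
cost = torch.randn(1, 16, 24, 32, 32)
lidar_feat = torch.randn(1, 8, 32, 32)
print(ccvn(cost, lidar_feat).shape)   # torch.Size([1, 16, 24, 32, 32])
```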

Extending Monocular Visual Odometry to Stereo Camera Systems by Scale Optimization

Title Extending Monocular Visual Odometry to Stereo Camera Systems by Scale Optimization
Authors Jiawei Mo, Junaed Sattar
Abstract This paper proposes a novel approach for extending monocular visual odometry to a stereo camera system. The proposed method uses an additional camera to accurately estimate and optimize the scale of the monocular visual odometry, rather than triangulating 3D points from stereo matching. Specifically, the 3D points generated by the monocular visual odometry are projected onto the other camera of the stereo pair, and the scale is recovered and optimized by directly minimizing the photometric error. It is computationally efficient, adding minimal overhead to the stereo vision system compared to straightforward stereo matching, and is robust to repetitive texture. Additionally, direct scale optimization enables stereo visual odometry to be purely based on the direct method. Extensive evaluation on public datasets (e.g., KITTI), and outdoor environments (both terrestrial and underwater) demonstrates the accuracy and efficiency of a stereo visual odometry approach extended by scale optimization, and its robustness in environments with challenging textures.
Tasks Monocular Visual Odometry, Stereo Matching, Stereo Matching Hand, Visual Odometry
Published 2019-05-29
URL https://arxiv.org/abs/1905.12723v3
PDF https://arxiv.org/pdf/1905.12723v3.pdf
PWC https://paperswithcode.com/paper/extending-monocular-visual-odometry-to-stereo
Repo https://github.com/jiawei-mo/scale_optimization
Framework none
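
A simplified sketch of the scale-optimization idea in the abstract above, not the released code: 3D points from monocular VO are projected into the second camera at a candidate scale, and the scale minimizing photometric error against the second image is kept. A coarse grid search over a random toy scene stands in for the direct optimization used in the paper, and the intrinsics and baseline are made-up values.

```python
import numpy as np

def photometric_error(scale, pts_cam0, intensities, K, baseline, img1):
    """Sum of squared intensity differences after projecting scaled points into camera 1."""
    h, w = img1.shape
    err = 0.0
    for p, i0 in zip(pts_cam0 * scale, intensities):
        q = p - np.array([baseline, 0.0, 0.0])        # move to the second camera frame
        if q[2] <= 0.0:
            continue
        u, v, _ = K @ (q / q[2])                      # pinhole projection
        if 0 <= int(round(u)) < w and 0 <= int(round(v)) < h:
            err += (img1[int(round(v)), int(round(u))] - i0) ** 2
    return err

K = np.array([[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]])
pts = np.random.rand(50, 3) + np.array([0.0, 0.0, 2.0])  # toy points from mono VO
intensities = np.random.rand(50)                          # their observed intensities
img1 = np.random.rand(128, 128)                           # toy second-camera image
scales = np.linspace(0.5, 2.0, 31)
best = min(scales, key=lambda s: photometric_error(s, pts, intensities, K, 0.2, img1))
print("estimated scale:", best)
```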

Understanding Isomorphism Bias in Graph Data Sets

Title Understanding Isomorphism Bias in Graph Data Sets
Authors Sergei Ivanov, Sergei Sviridov, Evgeny Burnaev
Abstract In recent years there has been a rapid increase in classification methods for graph-structured data. In both graph kernels and graph neural networks, one of the implicit assumptions of successful state-of-the-art models has been that incorporating graph isomorphism features into the architecture leads to better empirical performance. However, as we discover in this work, commonly used data sets for graph classification contain repeated instances, which causes the problem of isomorphism bias, i.e., artificially increased accuracy of the models due to memorizing target information from the training set. This prevents fair competition among algorithms and raises the question of the validity of the obtained results. We analyze 54 data sets previously used extensively for graph-related tasks for the existence of isomorphism bias, give a set of recommendations to machine learning practitioners on how to properly set up their models, and open-source new data sets for future experiments.
Tasks Graph Classification
Published 2019-10-26
URL https://arxiv.org/abs/1910.12091v2
PDF https://arxiv.org/pdf/1910.12091v2.pdf
PWC https://paperswithcode.com/paper/understanding-isomorphism-bias-in-graph-data
Repo https://github.com/nd7141/graph_datasets
Framework pytorch
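
A short sketch of how repeated (isomorphic) graphs in a data set can be flagged, in the spirit of the analysis described above rather than the authors' pipeline: graphs sharing a Weisfeiler-Lehman hash are isomorphism candidates and can leak labels between train and test splits. networkx (>= 2.5) provides the WL hash; a hash match is only a candidate and would still need an exact isomorphism check.

```python
from collections import defaultdict
import networkx as nx

graphs = [nx.path_graph(4), nx.cycle_graph(4), nx.path_graph(4)]   # toy data set

buckets = defaultdict(list)
for idx, g in enumerate(graphs):
    buckets[nx.weisfeiler_lehman_graph_hash(g)].append(idx)

duplicates = [ids for ids in buckets.values() if len(ids) > 1]
print("candidate duplicate groups:", duplicates)   # [[0, 2]]
```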