Paper Group AWR 237
Contents:
Star-Transformer
Estimating Solar Irradiance Using Sky Imagers
Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search
Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach
Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation
Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation
Sparse Reduced-Rank Regression for Simultaneous Rank and Variable Selection via Manifold Optimization
Training Agents using Upside-Down Reinforcement Learning
Urban Sound Tagging using Convolutional Neural Networks
Multi-scale Dynamic Graph Convolutional Network for Hyperspectral Image Classification
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization
3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization
Extending Monocular Visual Odometry to Stereo Camera Systems by Scale Optimization
Understanding Isomorphism Bias in Graph Data Sets
Star-Transformer
Title | Star-Transformer |
Authors | Qipeng Guo, Xipeng Qiu, Pengfei Liu, Yunfan Shao, Xiangyang Xue, Zheng Zhang |
Abstract | Although the Transformer has achieved great success on many NLP tasks, its heavy structure with fully-connected attention leads to a dependence on large amounts of training data. In this paper, we present Star-Transformer, a lightweight alternative obtained by careful sparsification. To reduce model complexity, we replace the fully-connected structure with a star-shaped topology, in which every pair of non-adjacent nodes is connected through a shared relay node. Complexity is thus reduced from quadratic to linear, while the capacity to capture both local composition and long-range dependencies is preserved. Experiments on four tasks (22 datasets) show that Star-Transformer achieves significant improvements over the standard Transformer on modestly sized datasets. |
Tasks | Named Entity Recognition, Natural Language Inference, Sentiment Analysis, Text Classification |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09113v2 |
PDF | http://arxiv.org/pdf/1902.09113v2.pdf |
PWC | https://paperswithcode.com/paper/star-transformer |
Repo | https://github.com/fastnlp/fastNLP |
Framework | pytorch |
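To make the topology concrete, here is a minimal sketch (plain numpy, not code from the fastNLP repo) of one Star-Transformer update round: each token attends only to its ring neighbours plus the shared relay node, and the relay attends to all tokens, giving linear rather than quadratic cost. The function names and the single-head, unprojected attention are simplifying assumptions.

```python
import numpy as np

def attention(q, k, v):
    # q: (d,), k/v: (m, d) -> softmax-weighted sum over the m candidates
    scores = k @ q / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ v

def star_update(h, relay, radius=1):
    """One round of sparse updates: each token attends only to its ring
    neighbours and the shared relay; the relay attends to every token."""
    n = len(h)
    new_h = np.empty_like(h)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        ctx = np.vstack([h[lo:hi], relay[None, :]])   # local ring + relay
        new_h[i] = attention(h[i], ctx, ctx)          # O(radius) per token
    new_relay = attention(relay, new_h, new_h)        # relay sees all tokens
    return new_h, new_relay

h = np.random.randn(10, 16); relay = h.mean(0)
h, relay = star_update(h, relay)
```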
Estimating Solar Irradiance Using Sky Imagers
Title | Estimating Solar Irradiance Using Sky Imagers |
Authors | Soumyabrata Dev, Florian M. Savoy, Yee Hui Lee, Stefan Winkler |
Abstract | Ground-based whole-sky cameras are now extensively used for localized monitoring of clouds. They capture hemispherical images of the sky at regular intervals using a fisheye lens. In this paper, we propose a framework for estimating solar irradiance from the pictures taken by such imagers. Unlike pyranometer measurements, sky images contain information about cloud coverage and can be used to derive cloud movement. Accurately estimating solar irradiance from these images alone is thus a first step towards short-term forecasting of solar energy generation based on cloud movement. We derive and validate our model using pyranometers co-located with our whole-sky imagers. We achieve better performance in estimating solar irradiance, and in particular its short-term variations, than other related methods based on ground-based observations. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.04981v1 |
PDF | https://arxiv.org/pdf/1910.04981v1.pdf |
PWC | https://paperswithcode.com/paper/estimating-solar-irradiance-using-sky-imagers |
Repo | https://github.com/Soumyabrata/estimate-solar-irradiance |
Framework | none |
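As an illustration of the kind of pipeline such a framework involves (not the paper's fitted model), the sketch below segments clouds with the common red-blue ratio and attenuates a clear-sky irradiance value by the resulting coverage; the threshold and the linear attenuation factor are assumptions chosen for demonstration.

```python
import numpy as np

def cloud_coverage(rgb, threshold=0.05):
    """Fraction of pixels classified as cloud via the (R-B)/(R+B) ratio."""
    r = rgb[..., 0].astype(float)
    b = rgb[..., 2].astype(float)
    ratio = (r - b) / (r + b + 1e-6)   # clear sky is blue -> negative ratio;
    return float((ratio > threshold).mean())  # clouds are whiter -> near 0 or above

def estimate_irradiance(rgb, clear_sky_wm2):
    cov = cloud_coverage(rgb)
    return clear_sky_wm2 * (1.0 - 0.75 * cov)  # hypothetical attenuation factor

sky = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)   # stand-in image
print(estimate_irradiance(sky, clear_sky_wm2=900.0))
```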
Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search
Title | Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search |
Authors | Xiangxiang Chu, Bo Zhang, Ruijun Xu, Hailong Ma |
Abstract | Fabricating neural models for a wide range of mobile devices demands a specific network design due to highly constrained resources. Both evolutionary algorithms (EA) and reinforcement learning (RL) methods have been applied to neural architecture search. However, these approaches usually concentrate on a single objective, such as the error rate of image classification, and fail to harness the benefits of both paradigms. In this paper, we present a new multi-objective algorithm called MoreMNAS (Multi-Objective Reinforced Evolution in Mobile Neural Architecture Search) that leverages the virtues of both EA and RL. In particular, we incorporate a variant of the multi-objective genetic algorithm NSGA-II, in which the search space is composed of various cells so that crossover and mutation can be performed at the cell level. Moreover, reinforced control is mixed with the natural mutation process to regulate arbitrary mutations, maintaining a delicate balance between exploration and exploitation. Therefore, our method not only prevents the searched models from degrading during the evolution process but also makes better use of learned knowledge. Our experiments in the super-resolution (SR) domain deliver models that rival some state-of-the-art methods with fewer FLOPS. |
Tasks | Image Classification, Neural Architecture Search, Super-Resolution |
Published | 2019-01-04 |
URL | http://arxiv.org/abs/1901.01074v3 |
PDF | http://arxiv.org/pdf/1901.01074v3.pdf |
PWC | https://paperswithcode.com/paper/multi-objective-reinforced-evolution-in |
Repo | https://github.com/moremnas/MoreMNAS |
Framework | tf |
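The multi-objective core of NSGA-II-style search is non-dominated (Pareto) filtering over competing objectives such as error and FLOPS. The toy sketch below shows generic Pareto-front extraction, not the MoreMNAS implementation:

```python
def pareto_front(candidates):
    """candidates: list of (error, flops) tuples; keep the non-dominated ones."""
    front = []
    for a in candidates:
        # b dominates a if b is no worse in both objectives and better in one
        dominated = any(
            b[0] <= a[0] and b[1] <= a[1] and (b[0] < a[0] or b[1] < a[1])
            for b in candidates
        )
        if not dominated:
            front.append(a)
    return front

models = [(0.10, 600), (0.12, 300), (0.09, 900), (0.12, 500)]
print(pareto_front(models))   # drops (0.12, 500), dominated by (0.12, 300)
```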
Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach
Title | Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach |
Authors | Bingfeng Zhang, Jimin Xiao, Yunchao Wei, Mingjie Sun, Kaizhu Huang |
Abstract | Weakly supervised semantic segmentation is a challenging task, as it takes only image-level information as supervision for training but produces pixel-level predictions at test time. To address it, most recent state-of-the-art approaches adopt two-step solutions: 1) learn to generate pseudo pixel-level masks, and 2) train semantic segmentation networks (FCNs) on the pseudo masks. However, two-step solutions usually employ many bells and whistles to produce high-quality pseudo masks, making this kind of method complicated and inelegant. In this work, we harness image-level labels to produce reliable pixel-level annotations and design a fully end-to-end network that learns to predict segmentation maps. Concretely, we first leverage an image-classification branch to generate class activation maps for the annotated categories, which are further pruned into confident yet tiny object/background regions. These reliable regions then serve directly as ground-truth labels for a parallel segmentation branch, where a newly designed dense energy loss function is adopted for optimization. Despite its apparent simplicity, our one-step solution achieves competitive mIoU scores (val: 62.6, test: 62.9) on Pascal VOC compared with two-step state-of-the-art methods. By extending our one-step method to two steps, we obtain a new state-of-the-art performance on Pascal VOC (val: 66.3, test: 66.5). |
Tasks | Image Classification, Semantic Segmentation, Weakly-Supervised Semantic Segmentation |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08039v1 |
PDF | https://arxiv.org/pdf/1911.08039v1.pdf |
PWC | https://paperswithcode.com/paper/reliability-does-matter-an-end-to-end-weakly |
Repo | https://github.com/zbf1991/RRM |
Framework | pytorch |
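The "reliable region" idea can be illustrated in a few lines: threshold a class activation map into confident foreground, confident background, and an ignored uncertain band that receives no supervision. The thresholds below are illustrative assumptions, not the paper's values.

```python
import numpy as np

def reliable_pseudo_labels(cam, fg_thresh=0.7, bg_thresh=0.1, ignore=255):
    """cam: (H, W) class activation map scaled to [0, 1] for one annotated class."""
    labels = np.full(cam.shape, ignore, dtype=np.uint8)  # default: no supervision
    labels[cam >= fg_thresh] = 1   # confident object region
    labels[cam <= bg_thresh] = 0   # confident background region
    return labels                  # everything in between stays `ignore`

cam = np.random.rand(4, 4)
print(reliable_pseudo_labels(cam))
```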
Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation
Title | Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation |
Authors | Yude Wang, Jie Zhang, Meina Kan, Shiguang Shan, Xilin Chen |
Abstract | Weakly supervised semantic segmentation has attracted much research interest in recent years, given its advantage of low labeling cost. Most advanced algorithms follow the design principle of expanding and constraining the seed regions from class activation maps (CAM). As is well known, conventional CAM tends to be incomplete or over-activated due to weak supervision. Fortunately, semantic segmentation has a characteristic of spatial-transformation equivariance, which can provide self-supervision signals to aid weakly supervised learning. This work explores the advantages of scale-equivariant constraints for CAM generation, formulated as a self-supervised scale equivariant network (SSENet). Specifically, a novel scale-equivariant regularization is designed to ensure consistency of the CAMs produced from the same input image at different resolutions, guiding the whole network to learn more accurate class activations. The regularized CAM can be embedded in most recent advanced weakly supervised semantic segmentation frameworks. Extensive experiments on the PASCAL VOC 2012 dataset demonstrate that our method achieves state-of-the-art performance, both quantitatively and qualitatively, for weakly supervised semantic segmentation. Code has been made available. |
Tasks | Semantic Segmentation, Weakly-Supervised Semantic Segmentation |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03714v1 |
PDF | https://arxiv.org/pdf/1909.03714v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-scale-equivariant-network-for |
Repo | https://github.com/YudeWang/SSENet-pytorch |
Framework | pytorch |
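A minimal sketch of such a scale-equivariance consistency term is shown below: compute the CAM at full and reduced resolution, resize to a common size, and penalize the disagreement. The `model` interface (returning per-class CAMs) and the L1 penalty are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def scale_equivariant_loss(model, image, scale=0.5):
    """image: (B, 3, H, W); model(image) assumed to return (B, C, H, W) CAMs."""
    cam_full = model(image)
    small = F.interpolate(image, scale_factor=scale,
                          mode='bilinear', align_corners=False)
    cam_small = model(small)
    # bring the low-resolution CAM back to full resolution before comparing
    cam_up = F.interpolate(cam_small, size=cam_full.shape[-2:],
                           mode='bilinear', align_corners=False)
    return F.l1_loss(cam_up, cam_full)   # consistency across resolutions
```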
Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation
Title | Joint Learning of Saliency Detection and Weakly Supervised Semantic Segmentation |
Authors | Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang |
Abstract | Existing weakly supervised semantic segmentation (WSSS) methods usually utilize the results of pre-trained saliency detection (SD) models without explicitly modeling the connections between the two tasks, which is not the most efficient configuration. Here we propose a unified multi-task learning framework that jointly solves WSSS and SD using a single network, i.e., a saliency and segmentation network (SSNet). SSNet consists of a segmentation network (SN) and a saliency aggregation module (SAM). For an input image, SN generates the segmentation result, and SAM predicts the saliency of each category and aggregates the segmentation masks of all categories into a saliency map. The proposed network is trained end-to-end with image-level category labels and class-agnostic pixel-level saliency labels. Experiments on the PASCAL VOC 2012 segmentation dataset and four saliency benchmark datasets show that the performance of our method compares favorably against state-of-the-art weakly supervised segmentation methods and fully supervised saliency detection methods. |
Tasks | Multi-Task Learning, Saliency Detection, Semantic Segmentation, Weakly-Supervised Semantic Segmentation |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04161v1 |
PDF | https://arxiv.org/pdf/1909.04161v1.pdf |
PWC | https://paperswithcode.com/paper/joint-learning-of-saliency-detection-and |
Repo | https://github.com/zengxianyu/jsws |
Framework | none |
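The saliency aggregation module can be caricatured as a weighted sum of per-category segmentation masks, with the weights given by predicted category saliencies. The shapes and the sigmoid weighting in this sketch are illustrative assumptions:

```python
import numpy as np

def aggregate_saliency(masks, category_saliency):
    """masks: (C, H, W) soft segmentation masks; category_saliency: (C,) scores."""
    w = 1.0 / (1.0 + np.exp(-category_saliency))      # per-class saliency in (0, 1)
    sal = np.tensordot(w, masks, axes=1)              # weighted sum over classes
    return np.clip(sal, 0.0, 1.0)

masks = np.random.rand(3, 8, 8); scores = np.array([2.0, -1.0, 0.5])
print(aggregate_saliency(masks, scores).shape)        # (8, 8)
```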
Sparse Reduced-Rank Regression for Simultaneous Rank and Variable Selection via Manifold Optimization
Title | Sparse Reduced-Rank Regression for Simultaneous Rank and Variable Selection via Manifold Optimization |
Authors | Kohei Yoshikawa, Shuichi Kawano |
Abstract | We consider the problem of constructing a reduced-rank regression model whose coefficient parameter is represented as a singular value decomposition with sparse singular vectors. The traditional estimation procedure for the coefficient parameter often fails when the true rank of the parameter is high. To overcome this issue, we develop an estimation algorithm with rank and variable selection via sparse regularization and manifold optimization, which enables us to obtain an accurate estimation of the coefficient parameter even if the true rank of the coefficient parameter is high. Using sparse regularization, we can also select an optimal value of the rank. We conduct Monte Carlo experiments and real data analysis to illustrate the effectiveness of our proposed method. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05083v2 |
PDF | https://arxiv.org/pdf/1910.05083v2.pdf |
PWC | https://paperswithcode.com/paper/sparse-reduced-rank-regression-for |
Repo | https://github.com/yoshikawa-kohei/RVSManOpt |
Framework | none |
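For context, the sketch below shows the classical reduced-rank regression baseline (an OLS fit followed by an SVD-based rank projection). The paper's contribution, sparse regularization with manifold optimization on the singular vectors, sits on top of this model and is not reproduced here.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    C_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)   # full-rank OLS coefficients
    _, _, Vt = np.linalg.svd(X @ C_ols, full_matrices=False)
    P = Vt[:rank].T @ Vt[:rank]                     # projector onto top-rank subspace
    return C_ols @ P                                # rank-constrained coefficients

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
C_true = np.outer(rng.normal(size=5), rng.normal(size=4))   # rank-1 ground truth
Y = X @ C_true + 0.1 * rng.normal(size=(100, 4))
C_hat = reduced_rank_regression(X, Y, rank=1)
print(np.linalg.matrix_rank(C_hat))   # 1
```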
Training Agents using Upside-Down Reinforcement Learning
Title | Training Agents using Upside-Down Reinforcement Learning |
Authors | Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski, Jürgen Schmidhuber |
Abstract | Traditional Reinforcement Learning (RL) algorithms either predict rewards with value functions or maximize them using policy search. We study an alternative, Upside-Down Reinforcement Learning (Upside-Down RL or UDRL), which solves RL problems primarily using supervised learning techniques. Many of its main principles are outlined in a companion report [34]. Here we present the first concrete implementation of UDRL and demonstrate its feasibility on certain episodic learning problems. Experimental results show that its performance can be surprisingly competitive with, and even exceed, that of traditional baseline algorithms developed over decades of research. |
Tasks | |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02877v1 |
PDF | https://arxiv.org/pdf/1912.02877v1.pdf |
PWC | https://paperswithcode.com/paper/training-agents-using-upside-down |
Repo | https://github.com/parthchadha/upsideDownRL |
Framework | pytorch |
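The core UDRL recipe is to learn a behavior function mapping (state, desired return, desired horizon) to an action, trained by ordinary supervised learning on commands that actually occurred in past episodes. The tiny network and command encoding below are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class BehaviorFunction(nn.Module):
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 2, 64), nn.ReLU(),   # +2: return & horizon command
            nn.Linear(64, n_actions),
        )

    def forward(self, state, desired_return, desired_horizon):
        cmd = torch.stack([desired_return, desired_horizon], dim=-1)
        return self.net(torch.cat([state, cmd], dim=-1))   # action logits

# Supervised step on replayed transitions: the command is what actually happened.
policy = BehaviorFunction(state_dim=4, n_actions=2)
state = torch.randn(8, 4)
ret, hor = torch.rand(8) * 10, torch.rand(8) * 100
taken_actions = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(policy(state, ret, hor), taken_actions)
loss.backward()
```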
Urban Sound Tagging using Convolutional Neural Networks
Title | Urban Sound Tagging using Convolutional Neural Networks |
Authors | Sainath Adapa |
Abstract | In this paper, we propose a framework for environmental sound classification in a low-data context (fewer than 100 labeled examples per class). We show that using pre-trained image classification models together with data augmentation techniques yields higher performance than alternative approaches. We applied this system to the Urban Sound Tagging task, part of the DCASE 2019 challenge, whose objective was to label different sources of noise from raw audio data. A modified form of MobileNetV2, a convolutional neural network (CNN) model, was trained to classify coarse and fine tags jointly. The proposed model uses log-scaled Mel-spectrograms as the representation format for the audio data. Mixup, random erasing, scaling, and shifting are used as data augmentation techniques. A second model that uses scaled labels was built to account for human error in the annotations. The proposed model achieved first rank on the leaderboard, with Micro-AUPRC values of 0.751 and 0.860 on fine and coarse tags, respectively. |
Tasks | Data Augmentation, Environmental Sound Classification, Image Classification |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12699v1 |
PDF | https://arxiv.org/pdf/1909.12699v1.pdf |
PWC | https://paperswithcode.com/paper/urban-sound-tagging-using-convolutional |
Repo | https://github.com/sainathadapa/urban-sound-tagging |
Framework | pytorch |
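Of the augmentations listed, mixup is the least standard for audio tagging; a generic sketch (not the repo's exact code) is below, blending two log-mel spectrograms and their multi-label targets with a Beta-distributed coefficient:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Convexly combine two examples and their label vectors."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

spec1, spec2 = np.random.rand(64, 128), np.random.rand(64, 128)  # log-mel stand-ins
tags1, tags2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])        # multi-label targets
x, y = mixup(spec1, tags1, spec2, tags2)
```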
Multi-scale Dynamic Graph Convolutional Network for Hyperspectral Image Classification
Title | Multi-scale Dynamic Graph Convolutional Network for Hyperspectral Image Classification |
Authors | Sheng Wan, Chen Gong, Ping Zhong, Bo Du, Lefei Zhang, Jian Yang |
Abstract | The Convolutional Neural Network (CNN) has demonstrated an impressive ability to represent hyperspectral images and to achieve promising results in hyperspectral image classification. However, traditional CNN models can only convolve over regular square image regions with fixed size and weights, so they cannot universally adapt to distinct local regions with various object distributions and geometric appearances. Their classification performance therefore still leaves room for improvement, especially at class boundaries. To alleviate this shortcoming, we employ the recently proposed Graph Convolutional Network (GCN) for hyperspectral image classification, as it can perform convolution on arbitrarily structured non-Euclidean data and is applicable to irregular image regions represented by graph topological information. Unlike commonly used GCN models, which operate on a fixed graph, we allow the graph to be dynamically updated along with the graph convolution process, so that the two steps benefit from each other and gradually produce discriminative embedded features as well as a refined graph. Moreover, to comprehensively exploit the multi-scale information inherent in hyperspectral images, we establish multiple input graphs with different neighborhood scales, extensively exploiting the diversified spectral-spatial correlations at multiple scales. Our method is therefore termed the ‘Multi-scale Dynamic Graph Convolutional Network’ (MDGCN). Experimental results on three typical benchmark datasets firmly demonstrate the superiority of MDGCN over other state-of-the-art methods, both qualitatively and quantitatively. |
Tasks | Hyperspectral Image Classification, Image Classification |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.06133v1 |
PDF | https://arxiv.org/pdf/1905.06133v1.pdf |
PWC | https://paperswithcode.com/paper/multi-scale-dynamic-graph-convolutional |
Repo | https://github.com/LEAP-WS/MDGCN |
Framework | tf |
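The "dynamic graph" idea alternates two steps: a normalized graph-convolution update of the node embeddings, then a rebuild of the adjacency from the new embeddings' similarities. The Gaussian-kernel affinity and dense graph in this toy sketch are illustrative choices, not MDGCN's exact construction:

```python
import numpy as np

def gcn_step(A, H, W):
    A_hat = A + np.eye(len(A))                    # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(1))
    return np.maximum(D_inv @ A_hat @ H @ W, 0)   # ReLU(D^-1 A H W)

def update_graph(H, sigma=1.0):
    d2 = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))         # dense affinity graph

H = np.random.rand(6, 8); A = update_graph(H); W = np.random.rand(8, 8)
for _ in range(2):                                # alternate convolution and graph update
    H = gcn_step(A, H, W)
    A = update_graph(H)
```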
Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Title | Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization |
Authors | Eldad Meller, Alexander Finkelstein, Uri Almog, Mark Grobman |
Abstract | Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks: for a given layer, individual output channels can be scaled by any factor provided that the corresponding weights of the next layer are inversely scaled. A given network therefore has many factorizations that change its weights without changing its function. We present a conceptually simple and easy-to-implement method that uses this property, and we show that proper factorizations significantly decrease the degradation caused by quantization. We show improvements on a wide variety of networks and achieve state-of-the-art degradation results for MobileNets. While our focus is on quantization, this type of factorization is applicable to other domains such as network pruning, neural network regularization, and network interpretability. |
Tasks | Network Pruning, Quantization |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01917v1 |
PDF | http://arxiv.org/pdf/1902.01917v1.pdf |
PWC | https://paperswithcode.com/paper/same-same-but-different-recovering-neural |
Repo | https://github.com/Adamdad/Samesame |
Framework | tf |
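The degree of freedom the paper exploits is easy to verify numerically: with a positively homogeneous activation such as ReLU, scaling a layer's output channels and inversely scaling the next layer's matching input weights leaves the network function unchanged. A small sketch (generic, not the paper's equalization procedure):

```python
import numpy as np

def equalize(W1, b1, W2, scale):
    """W1: (out, in) of layer i; W2: (out2, out) of layer i+1; scale > 0 per channel."""
    W1s = W1 * scale[:, None]          # scale the output channels of layer i
    b1s = b1 * scale
    W2s = W2 / scale[None, :]          # inverse scale on the consumer side
    return W1s, b1s, W2s

W1, b1 = np.random.randn(4, 3), np.random.randn(4)
W2 = np.random.randn(2, 4)
x = np.random.randn(3)
s = np.array([0.5, 2.0, 1.0, 4.0])
W1s, b1s, W2s = equalize(W1, b1, W2, s)
y  = W2  @ np.maximum(W1  @ x + b1 , 0)
ys = W2s @ np.maximum(W1s @ x + b1s, 0)
print(np.allclose(y, ys))              # True: the network function is unchanged
```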
NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization
Title | NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization |
Authors | Ali Ramezani-Kebrya, Fartash Faghri, Daniel M. Roy |
Abstract | As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed on clusters to perform model fitting in parallel. Alistarh et al. (2017) describe two variants of data-parallel SGD that quantize and encode gradients to lessen communication costs. For the first variant, QSGD, they provide strong theoretical guarantees. For the second variant, which we call QSGDinf, they demonstrate impressive empirical gains for distributed training of large neural networks. Building on their work, we propose an alternative scheme for quantizing gradients and show that it yields stronger theoretical guarantees than exist for QSGD while matching the empirical performance of QSGDinf. |
Tasks | Quantization |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.06077v1 |
PDF | https://arxiv.org/pdf/1908.06077v1.pdf |
PWC | https://paperswithcode.com/paper/nuqsgd-improved-communication-efficiency-for |
Repo | https://github.com/fartashf/nuqsgd |
Framework | pytorch |
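The nonuniform-levels idea can be sketched as stochastic (unbiased) rounding of normalized gradient magnitudes onto exponentially spaced levels; the exact level set and encoding in the paper differ from this illustration:

```python
import numpy as np

def quantize(v, num_levels=4):
    norm = np.linalg.norm(v)
    if norm == 0:
        return v.copy()
    # exponentially spaced levels, e.g. [0, 1/8, 1/4, 1/2, 1] for num_levels=4
    levels = np.array([0.0] + [2.0 ** -(num_levels - 1 - i)
                               for i in range(num_levels)])
    r = np.abs(v) / norm                       # each magnitude lands in [0, 1]
    out = np.empty_like(v)
    for i, ri in enumerate(r):
        hi = min(np.searchsorted(levels, ri, side='right'), len(levels) - 1)
        lo = hi - 1
        p = (ri - levels[lo]) / (levels[hi] - levels[lo])  # unbiased rounding prob.
        q = levels[hi] if np.random.rand() < p else levels[lo]
        out[i] = np.sign(v[i]) * norm * q
    return out

g = np.random.randn(8)
print(quantize(g))
```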
3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization
Title | 3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization |
Authors | Tsun-Hsuan Wang, Hou-Ning Hu, Chieh Hubert Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun |
Abstract | The complementary characteristics of active and passive depth sensing motivate fusing a LiDAR sensor with a stereo camera for improved depth perception. Instead of directly fusing the depths estimated from the LiDAR and stereo modalities, we take advantage of a stereo matching network with two enhancements that exploit the LiDAR information: Input Fusion and Conditional Cost Volume Normalization (CCVNorm). The proposed framework is generic and integrates closely with the cost volume component commonly used in stereo matching neural networks. We experimentally verify the efficacy and robustness of our method on the KITTI Stereo and Depth Completion datasets, obtaining favorable performance against various fusion strategies. Moreover, we demonstrate that, with a hierarchical extension of CCVNorm, the proposed method adds only slight overhead to the stereo matching network in terms of computation time and model size. For the project page, see https://zswang666.github.io/Stereo-LiDAR-CCVNorm-Project-Page/ |
Tasks | Depth Completion, Stereo Matching, Stereo Matching Hand |
Published | 2019-04-05 |
URL | http://arxiv.org/abs/1904.02917v1 |
PDF | http://arxiv.org/pdf/1904.02917v1.pdf |
PWC | https://paperswithcode.com/paper/3d-lidar-and-stereo-fusion-using-stereo |
Repo | https://github.com/zswang666/Stereo-LiDAR-CCVNorm |
Framework | pytorch |
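Conditional normalization of a cost volume can be sketched as: normalize the volume's features, then apply per-pixel scale and shift predicted from the LiDAR disparity map. The tensor shapes and the 1x1-conv conditioning nets below are assumptions for illustration, not the paper's exact hierarchical design:

```python
import torch
import torch.nn as nn

class CCVNormSketch(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gamma = nn.Conv2d(1, channels, kernel_size=1)  # scale from LiDAR
        self.beta = nn.Conv2d(1, channels, kernel_size=1)   # shift from LiDAR

    def forward(self, cost, lidar):
        # cost: (B, C, D, H, W) cost volume; lidar: (B, 1, H, W) sparse disparity
        mean = cost.mean(dim=(2, 3, 4), keepdim=True)
        var = ((cost - mean) ** 2).mean(dim=(2, 3, 4), keepdim=True)
        normed = (cost - mean) / (var + 1e-5).sqrt()
        g = self.gamma(lidar).unsqueeze(2)   # (B, C, 1, H, W), broadcast over D
        b = self.beta(lidar).unsqueeze(2)
        return g * normed + b

cv = torch.randn(1, 8, 16, 32, 32); ld = torch.randn(1, 1, 32, 32)
print(CCVNormSketch(8)(cv, ld).shape)   # torch.Size([1, 8, 16, 32, 32])
```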
Extending Monocular Visual Odometry to Stereo Camera Systems by Scale Optimization
Title | Extending Monocular Visual Odometry to Stereo Camera Systems by Scale Optimization |
Authors | Jiawei Mo, Junaed Sattar |
Abstract | This paper proposes a novel approach for extending monocular visual odometry to a stereo camera system. The proposed method uses an additional camera to accurately estimate and optimize the scale of the monocular visual odometry, rather than triangulating 3D points from stereo matching. Specifically, the 3D points generated by the monocular visual odometry are projected onto the other camera of the stereo pair, and the scale is recovered and optimized by directly minimizing the photometric error. The method is computationally efficient, adding minimal overhead to the stereo vision system compared to straightforward stereo matching, and is robust to repetitive textures. Additionally, direct scale optimization enables stereo visual odometry to be based purely on the direct method. Extensive evaluation on public datasets (e.g., KITTI) and in outdoor environments (both terrestrial and underwater) demonstrates the accuracy and efficiency of the stereo visual odometry approach extended by scale optimization, as well as its robustness in environments with challenging textures. |
Tasks | Monocular Visual Odometry, Stereo Matching, Stereo Matching Hand, Visual Odometry |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12723v3 |
PDF | https://arxiv.org/pdf/1905.12723v3.pdf |
PWC | https://paperswithcode.com/paper/extending-monocular-visual-odometry-to-stereo |
Repo | https://github.com/jiawei-mo/scale_optimization |
Framework | none |
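Conceptually, scale recovery reduces to a one-dimensional problem: project the monocular 3D points into the second camera at candidate scales and keep the scale minimizing photometric error. The coarse grid search below is a didactic stand-in for the direct optimization used in practice; `K`, the extrinsics, and the point/intensity inputs are placeholders:

```python
import numpy as np

def photometric_error(points, intensities, K, R, t, image, scale):
    err = 0.0
    for p, i_ref in zip(points, intensities):
        pc = R @ (scale * p) + t              # scaled point in the other camera
        u, v, w = K @ pc                      # pinhole projection
        x, y = int(u / w), int(v / w)
        if 0 <= y < image.shape[0] and 0 <= x < image.shape[1]:
            err += (float(image[y, x]) - i_ref) ** 2
    return err

def recover_scale(points, intensities, K, R, t, image):
    scales = np.linspace(0.5, 2.0, 50)        # coarse 1-D search over scale
    errors = [photometric_error(points, intensities, K, R, t, image, s)
              for s in scales]
    return scales[int(np.argmin(errors))]

K = np.array([[500.0, 0, 64], [0, 500.0, 64], [0, 0, 1]])   # toy intrinsics
R, t = np.eye(3), np.array([0.5, 0.0, 0.0])                 # hypothetical baseline
image = np.random.rand(128, 128)
points = np.random.rand(20, 3) + np.array([0, 0, 5.0])      # points in front of camera
intens = np.random.rand(20)
print(recover_scale(points, intens, K, R, t, image))
```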
Understanding Isomorphism Bias in Graph Data Sets
Title | Understanding Isomorphism Bias in Graph Data Sets |
Authors | Sergei Ivanov, Sergei Sviridov, Evgeny Burnaev |
Abstract | Recent years have seen a rapid increase in classification methods for graph-structured data. In both graph kernels and graph neural networks, one implicit assumption of successful state-of-the-art models has been that incorporating graph isomorphism features into the architecture leads to better empirical performance. However, as we discover in this work, commonly used data sets for graph classification contain repeated instances, which cause the problem of isomorphism bias, i.e., artificially increased model accuracy from memorizing target information in the training set. This prevents fair comparison of algorithms and raises questions about the validity of the obtained results. We analyze 54 data sets previously used extensively for graph-related tasks, check them for isomorphism bias, give a set of recommendations for machine learning practitioners to properly set up their models, and open-source new data sets for future experiments. |
Tasks | Graph Classification |
Published | 2019-10-26 |
URL | https://arxiv.org/abs/1910.12091v2 |
PDF | https://arxiv.org/pdf/1910.12091v2.pdf |
PWC | https://paperswithcode.com/paper/understanding-isomorphism-bias-in-graph-data |
Repo | https://github.com/nd7141/graph_datasets |
Framework | pytorch |
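One practical way to hunt for the repeated instances the paper describes is to bucket graphs by a Weisfeiler-Lehman hash (available in networkx >= 2.5) and inspect collisions; note that equal WL hashes are necessary but not sufficient for isomorphism, so this only flags candidates:

```python
import networkx as nx
from collections import defaultdict

def find_candidate_duplicates(graphs):
    """Group indices of graphs sharing a WL hash; each bucket merits a closer check."""
    buckets = defaultdict(list)
    for idx, g in enumerate(graphs):
        buckets[nx.weisfeiler_lehman_graph_hash(g)].append(idx)
    return [ids for ids in buckets.values() if len(ids) > 1]

graphs = [nx.path_graph(4), nx.cycle_graph(4), nx.path_graph(4)]
print(find_candidate_duplicates(graphs))   # [[0, 2]]
```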