January 26, 2020

Paper Group ANR 1353

Robust Shape Regularity Criteria for Superpixel Evaluation. A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting. Beauty Learning and Counterfactual Inference. Real-Time Semantic Segmentation via Multiply Spatial Fusion Network. Causal Discovery with General Non-Linear Relationships Using …

Robust Shape Regularity Criteria for Superpixel Evaluation


Title	Robust Shape Regularity Criteria for Superpixel Evaluation
Authors	Rémi Giraud, Vinh-Thong Ta, Nicolas Papadakis
Abstract	Regular decompositions are necessary for most superpixel-based object recognition or tracking applications. So far in the literature, the regularity or compactness of a superpixel shape is mainly measured by its circularity. In this work, we first demonstrate that such a measure is not adapted for superpixel evaluation, since it does not directly express regularity but circular appearance. Then, we propose a new metric that considers several shape regularity aspects: convexity, balanced repartition, and contour smoothness. Finally, we demonstrate that our measure is robust to scale and noise and enables a more relevant comparison of superpixel methods.
Tasks	Object Recognition
Published	2019-03-17
URL	http://arxiv.org/abs/1903.07146v1
PDF	http://arxiv.org/pdf/1903.07146v1.pdf
PWC	https://paperswithcode.com/paper/robust-shape-regularity-criteria-for
Repo
Framework
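
The abstract above argues that circularity rewards circular appearance rather than regularity. A minimal sketch of that point, assuming circularity means the standard isoperimetric quotient 4πA/P² (the paper's exact definition may differ); the function name and the example shapes are illustrative only:

```python
import math

def circularity(area: float, perimeter: float) -> float:
    """Isoperimetric quotient 4*pi*A / P^2; equals 1 for a perfect disc."""
    return 4.0 * math.pi * area / perimeter ** 2

# Three ideal shapes: a unit-radius disc, a unit square, a 1x4 rectangle.
disc = circularity(math.pi, 2.0 * math.pi)
square = circularity(1.0, 4.0)
rectangle = circularity(4.0, 10.0)

# A square tiles the plane regularly, yet scores well below the disc.
print(f"disc={disc:.3f} square={square:.3f} rectangle={rectangle:.3f}")
```

Under this assumed definition a square grid of superpixels, which is arguably as regular as a decomposition can be, is penalized relative to discs, which is the kind of mismatch the abstract describes.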

A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting


Title	A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting
Authors	Zhijian Li, Xiyang Luo, Bao Wang, Andrea L. Bertozzi, Jack Xin
Abstract	We study epidemic forecasting on real-world health data by a graph-structured recurrent neural network (GSRNN). We achieve state-of-the-art forecasting accuracy on the benchmark CDC dataset. To improve model efficiency, we sparsify the network weights via transformed-$\ell_1$ penalty and maintain prediction accuracy at the same level with 70% of the network weights being zero.
Tasks
Published	2019-02-13
URL	http://arxiv.org/abs/1902.05113v1
PDF	http://arxiv.org/pdf/1902.05113v1.pdf
PWC	https://paperswithcode.com/paper/a-study-on-graph-structured-recurrent-neural
Repo
Framework
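
For readers unfamiliar with the transformed-$\ell_1$ penalty mentioned in the abstract above, here is a minimal sketch assuming the commonly used form ρ_a(x) = (a + 1)|x| / (a + |x|) with a > 0. The parameter value, the helper names, and the toy weights are assumptions for illustration, not the authors' training setup:

```python
def transformed_l1(x: float, a: float = 1.0) -> float:
    """Transformed-l1 value of one weight: (a + 1)|x| / (a + |x|), a > 0."""
    return (a + 1.0) * abs(x) / (a + abs(x))

def penalty(weights, a: float = 1.0) -> float:
    """Sum of the per-weight penalty, to be added to a training loss."""
    return sum(transformed_l1(w, a) for w in weights)

def sparsity(weights) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)

# Hypothetical weight vector with 70% zeros, echoing the reported sparsity.
toy_weights = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, -1.0, 2.0]
print(penalty(toy_weights), sparsity(toy_weights))
```

With this form, small values of `a` make the penalty behave like a count of nonzero weights, while large values make it approach the ordinary ℓ1 norm.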

Beauty Learning and Counterfactual Inference


Title	Beauty Learning and Counterfactual Inference
Authors	Tao Li
Abstract	This work showcases a new approach for causal discovery by leveraging user experiments and recent advances in photo-realistic image editing, demonstrating the potential of identifying causal factors and understanding complex systems counterfactually. We introduce the beauty learning problem as an example, which has been discussed metaphysically for centuries and has been proved to exist, to be quantifiable, and to be learnable by deep models in our recent paper, where we utilize a natural image generator coupled with user studies to infer causal effects from facial semantics to beauty outcomes, the results of which also align with existing empirical studies. We expect the proposed framework to find broader application in causal inference.
Tasks	Causal Discovery, Causal Inference, Counterfactual Inference
Published	2019-04-24
URL	http://arxiv.org/abs/1904.12629v1
PDF	http://arxiv.org/pdf/1904.12629v1.pdf
PWC	https://paperswithcode.com/paper/190412629
Repo
Framework

Real-Time Semantic Segmentation via Multiply Spatial Fusion Network


Title	Real-Time Semantic Segmentation via Multiply Spatial Fusion Network
Authors	Haiyang Si, Zhiqiang Zhang, Feifan Lv, Gang Yu, Feng Lu
Abstract	Real-time semantic segmentation plays a significant role in industry applications, such as autonomous driving, robotics and so on. It is a challenging task as both efficiency and performance need to be considered simultaneously. To address such a complex task, this paper proposes an efficient CNN called Multiply Spatial Fusion Network (MSFNet) to achieve fast and accurate perception. The proposed MSFNet uses Class Boundary Supervision to process the relevant boundary information based on our proposed Multi-features Fusion Module which can obtain spatial information and enlarge receptive field. Therefore, the final upsampling of the feature maps of 1/8 original image size can achieve impressive results while maintaining a high speed. Experiments on Cityscapes and Camvid datasets show an obvious advantage of the proposed approach compared with the existing approaches. Specifically, it achieves 77.1% Mean IOU on the Cityscapes test dataset with the speed of 41 FPS for a 1024*2048 input, and 75.4% Mean IOU with the speed of 91 FPS on the Camvid test dataset.
Tasks	Autonomous Driving, Real-Time Semantic Segmentation, Semantic Segmentation
Published	2019-11-17
URL	https://arxiv.org/abs/1911.07217v1
PDF	https://arxiv.org/pdf/1911.07217v1.pdf
PWC	https://paperswithcode.com/paper/real-time-semantic-segmentation-via-multiply
Repo
Framework
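
Several entries on this page report Mean IOU (mean intersection over union). A minimal sketch of how that metric is usually computed from per-pixel labels, assuming flat label lists and no ignored classes; benchmark-specific conventions such as ignore labels or dataset-level accumulation are not modeled, and the toy labels are made up:

```python
def mean_iou(y_true, y_pred, num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    # confusion[t][p] counts pixels with true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    if not ious:
        raise ValueError("no class occurs in the labels or predictions")
    return sum(ious) / len(ious)

# Toy example with three classes and six pixels.
score = mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(score)
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.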

Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA


Title	Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA
Authors	Ricardo Pio Monti, Kun Zhang, Aapo Hyvarinen
Abstract	We consider the problem of inferring causal relationships between two or more passively observed variables. While the problem of such causal discovery has been extensively studied especially in the bivariate setting, the majority of current methods assume a linear causal relationship, and the few methods which consider non-linear dependencies usually make the assumption of additive noise. Here, we propose a framework through which we can perform causal discovery in the presence of general non-linear relationships. The proposed method is based on recent progress in non-linear independent component analysis and exploits the non-stationarity of observations in order to recover the underlying sources or latent disturbances. We show rigorously that in the case of bivariate causal discovery, such non-linear ICA can be used to infer the causal direction via a series of independence tests. We further propose an alternative measure of causal direction based on asymptotic approximations to the likelihood ratio, as well as an extension to multivariate causal discovery. We demonstrate the capabilities of the proposed method via a series of simulation studies and conclude with an application to neuroimaging data.
Tasks	Causal Discovery
Published	2019-04-19
URL	http://arxiv.org/abs/1904.09096v1
PDF	http://arxiv.org/pdf/1904.09096v1.pdf
PWC	https://paperswithcode.com/paper/causal-discovery-with-general-non-linear
Repo
Framework

Approximating Human Judgment of Generated Image Quality


Title	Approximating Human Judgment of Generated Image Quality
Authors	Y. Alex Kolchinski, Sharon Zhou, Shengjia Zhao, Mitchell Gordon, Stefano Ermon
Abstract	Generative models have made immense progress in recent years, particularly in their ability to generate high quality images. However, that quality has been difficult to evaluate rigorously, with evaluation dominated by heuristic approaches that do not correlate well with human judgment, such as the Inception Score and Fréchet Inception Distance. Real human labels have also been used in evaluation, but are inefficient and expensive to collect for each image. Here, we present a novel method to automatically evaluate images based on their quality as perceived by humans. By not only generating image embeddings from Inception network activations and comparing them to the activations for real images, of which other methods perform a variant, but also regressing the activation statistics to match gold standard human labels, we demonstrate 66% accuracy in predicting human scores of image realism, matching the human inter-rater agreement rate. Our approach also generalizes across generative models, suggesting the potential for capturing a model-agnostic measure of image quality. We open source our dataset of human labels for the advancement of research and techniques in this area.
Tasks
Published	2019-11-30
URL	https://arxiv.org/abs/1912.12121v1
PDF	https://arxiv.org/pdf/1912.12121v1.pdf
PWC	https://paperswithcode.com/paper/approximating-human-judgment-of-generated
Repo
Framework

Orthogonal Structure Search for Efficient Causal Discovery from Observational Data


Title	Orthogonal Structure Search for Efficient Causal Discovery from Observational Data
Authors	Anant Raj, Luigi Gresele, Michel Besserve, Bernhard Schölkopf, Stefan Bauer
Abstract	The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. Recent work exploits stability of regression coefficients or invariance properties of models across different experimental conditions for reconstructing the full causal graph. These approaches generally do not scale well with the number of the explanatory variables and are difficult to extend to nonlinear relationships. Contrary to existing work, we propose an approach which even works for observational data alone, while still offering theoretical guarantees including the case of partially nonlinear relationships. Our algorithm requires only one estimation for each variable and in our experiments we apply our causal discovery algorithm even to large graphs, demonstrating significant improvements compared to well established approaches.
Tasks	Causal Discovery
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02456v1
PDF	http://arxiv.org/pdf/1903.02456v1.pdf
PWC	https://paperswithcode.com/paper/orthogonal-structure-search-for-efficient
Repo
Framework

Feature Pyramid Encoding Network for Real-time Semantic Segmentation


Title	Feature Pyramid Encoding Network for Real-time Semantic Segmentation
Authors	Mengyu Liu, Hujun Yin
Abstract	Although current deep learning methods have achieved impressive results for semantic segmentation, they incur high computational costs and have a huge number of parameters. For real-time applications, inference speed and memory usage are two important factors. To address the challenge, we propose a lightweight feature pyramid encoding network (FPENet) to make a good trade-off between accuracy and speed. Specifically, we use a feature pyramid encoding block to encode multi-scale contextual features with depthwise dilated convolutions in all stages of the encoder. A mutual embedding upsample module is introduced in the decoder to aggregate the high-level semantic features and low-level spatial details efficiently. The proposed network outperforms existing real-time methods with fewer parameters and improved inference speed on the Cityscapes and CamVid benchmark datasets. Specifically, FPENet achieves 68.0% mean IoU on the Cityscapes test set with only 0.4M parameters and 102 FPS speed on an NVIDIA TITAN V GPU.
Tasks	Panoptic Segmentation, Real-Time Semantic Segmentation, Semantic Segmentation
Published	2019-09-18
URL	https://arxiv.org/abs/1909.08599v1
PDF	https://arxiv.org/pdf/1909.08599v1.pdf
PWC	https://paperswithcode.com/paper/feature-pyramid-encoding-network-for-real
Repo
Framework

Emergent Linguistic Phenomena in Multi-Agent Communication Games


Title	Emergent Linguistic Phenomena in Multi-Agent Communication Games
Authors	Laura Graesser, Kyunghyun Cho, Douwe Kiela
Abstract	In this work, we propose a computational framework in which agents equipped with communication capabilities simultaneously play a series of referential games, where agents are trained using deep reinforcement learning. We demonstrate that the framework mirrors linguistic phenomena observed in natural language: i) the outcome of contact between communities is a function of inter- and intra-group connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic continuum emerges where neighboring languages are more mutually intelligible than farther removed languages. We conclude that intricate properties of language evolution need not depend on complex evolved linguistic capabilities, but can emerge from simple social exchanges between perceptually-enabled agents playing communication games.
Tasks
Published	2019-01-25
URL	https://arxiv.org/abs/1901.08706v2
PDF	https://arxiv.org/pdf/1901.08706v2.pdf
PWC	https://paperswithcode.com/paper/emergent-linguistic-phenomena-in-multi-agent
Repo
Framework

Context and Attribute Grounded Dense Captioning


Title	Context and Attribute Grounded Dense Captioning
Authors	Guojun Yin, Lu Sheng, Bin Liu, Nenghai Yu, Xiaogang Wang, Jing Shao
Abstract	Dense captioning aims at simultaneously localizing semantic regions and describing these regions-of-interest (ROIs) with short phrases or sentences in natural language. Previous studies have shown remarkable progress, but they are often vulnerable to the aperture problem that a caption generated by the features inside one ROI lacks contextual coherence with its surrounding context in the input image. In this work, we investigate contextual reasoning based on multi-scale message propagations from the neighboring contents to the target ROIs. To this end, we design a novel end-to-end context and attribute grounded dense captioning framework consisting of 1) a contextual visual mining module and 2) a multi-level attribute grounded description generation module. Knowing that captions often co-occur with the linguistic attributes (such as who, what and where), we also incorporate an auxiliary supervision from hierarchical linguistic attributes to augment the distinctiveness of the learned captions. Extensive experiments and ablation studies on the Visual Genome dataset demonstrate the superiority of the proposed model in comparison to state-of-the-art methods.
Tasks
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01410v1
PDF	http://arxiv.org/pdf/1904.01410v1.pdf
PWC	https://paperswithcode.com/paper/context-and-attribute-grounded-dense
Repo
Framework

RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers


Title	RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
Authors	Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson
Abstract	When translating natural language questions into SQL queries to answer questions from a database, contemporary semantic parsing models struggle to generalize to unseen database schemas. The generalization challenge lies in (a) encoding the database relations in an accessible way for the semantic parser, and (b) modeling alignment between database columns and their mentions in a given query. We present a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder. On the challenging Spider dataset this framework boosts the exact match accuracy to 53.7%, compared to 47.4% for the state-of-the-art model unaugmented with BERT embeddings. In addition, we observe qualitative improvements in the model’s understanding of schema linking and alignment.
Tasks	Semantic Parsing, Text-To-Sql
Published	2019-11-10
URL	https://arxiv.org/abs/1911.04942v1
PDF	https://arxiv.org/pdf/1911.04942v1.pdf
PWC	https://paperswithcode.com/paper/rat-sql-relation-aware-schema-encoding-and-1
Repo
Framework

Confidence Calibration for Convolutional Neural Networks Using Structured Dropout


Title	Confidence Calibration for Convolutional Neural Networks Using Structured Dropout
Authors	Zhilu Zhang, Adrian V. Dalca, Mert R. Sabuncu
Abstract	In classification applications, we often want probabilistic predictions to reflect confidence or uncertainty. Dropout, a commonly used training technique, has recently been linked to Bayesian inference, yielding an efficient way to quantify uncertainty in neural network models. However, as previously demonstrated, confidence estimates computed with a naive implementation of dropout can be poorly calibrated, particularly when using convolutional networks. In this paper, through the lens of ensemble learning, we associate calibration error with the correlation between the models sampled with dropout. Motivated by this, we explore the use of structured dropout to promote model diversity and improve confidence calibration. We use the SVHN, CIFAR-10 and CIFAR-100 datasets to empirically compare model diversity and confidence errors obtained using various dropout techniques. We also show the merit of structured dropout in a Bayesian active learning application.
Tasks	Active Learning, Bayesian Inference, Calibration
Published	2019-06-23
URL	https://arxiv.org/abs/1906.09551v1
PDF	https://arxiv.org/pdf/1906.09551v1.pdf
PWC	https://paperswithcode.com/paper/confidence-calibration-for-convolutional
Repo
Framework
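
The abstract above refers to calibration error without naming a specific metric. One widely used choice is the expected calibration error (ECE) with equal-width confidence bins; the sketch below assumes that metric and may not match the paper's exact evaluation. The bin count and the toy predictions are illustrative:

```python
def expected_calibration_error(confidences, correct, num_bins: int = 10) -> float:
    """Weighted average of |accuracy - mean confidence| over confidence bins."""
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Group predictions into equal-width bins over [0, 1].
    bins = [[] for _ in range(num_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * num_bins), num_bins - 1)  # conf == 1.0 goes to last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece

# Four predictions made at 90% confidence, three of which are right.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
print(ece)
```

Here the model claims 90% confidence but is right 75% of the time, so the gap is about 0.15. With dropout-based uncertainty, `confidences` would typically be the maximum class probability after averaging several stochastic forward passes.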

A Deep Learning Approach Towards Prediction of Faults in Wind Turbines


Title	A Deep Learning Approach Towards Prediction of Faults in Wind Turbines
Authors	Joyjit Chatterjee, Nina Dethlefs
Abstract	With the rising costs of conventional sources of energy, the world is moving towards sustainable energy sources including wind energy. Wind turbines consist of several electrical and mechanical components and experience an enormous amount of irregular loads, making their operational behaviour at times inconsistent. Operations and Maintenance (O&M) is a key factor in monitoring such inconsistent behaviour of the turbines in order to predict and prevent any incipient faults which may occur in the near future. Machine learning has been applied to the domain of wind energy over the last decade for analysing, diagnosing and predicting wind turbine faults. In particular, we follow the idea of modelling a turbine’s performance as a power curve where any power outputs that fall off the curve can be seen as performance errors. Existing work using this idea has used data from a turbine’s Supervisory Control & Acquisition (SCADA) system to filter and analyse fault & alarm data using regression techniques. In contrast to previous work, we explore how deep learning can be applied to fault prediction from open access meteorological data only.
Tasks
Published	2019-12-12
URL	https://arxiv.org/abs/2001.03612v1
PDF	https://arxiv.org/pdf/2001.03612v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-learning-approach-towards-prediction
Repo
Framework

ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks


Title	ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks
Authors	Xiaoqi Jia, Jianwei Tai, Hang Zhou, Yakai Li, Weijuan Zhang, Haichao Du, Qingjia Huang
Abstract	Despite the remarkable progress made in synthesizing emotional speech from text, it is still challenging to provide emotion information to existing speech segments. Previous methods mainly rely on parallel data, and few works have studied the generalization ability for one model to transfer emotion information across different languages. To cope with such problems, we propose an emotion transfer system named ET-GAN, for learning language-independent emotion transfer from one emotion to another without parallel training samples. Based on cycle-consistent generative adversarial network, our method ensures the transfer of only emotion information across speeches with simple loss designs. Besides, we introduce an approach for migrating emotion information across different languages by using transfer learning. The experiment results show that our method can efficiently generate high-quality emotional speech for any given emotion category, without aligned speech pairs.
Tasks	Domain Adaptation, Speech Synthesis, Transfer Learning
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11173v3
PDF	https://arxiv.org/pdf/1905.11173v3.pdf
PWC	https://paperswithcode.com/paper/eg-gan-cross-language-emotion-gain-synthesis
Repo
Framework

Graph-guided Architecture Search for Real-time Semantic Segmentation


Title	Graph-guided Architecture Search for Real-time Semantic Segmentation
Authors	Peiwen Lin, Peng Sun, Guangliang Cheng, Sirui Xie, Xi Li, Jianping Shi
Abstract	Designing a lightweight semantic segmentation network often requires researchers to find a trade-off between performance and speed, which is always empirical due to the limited interpretability of neural networks. In order to release researchers from these tedious mechanical trials, we propose a Graph-guided Architecture Search (GAS) pipeline to automatically search real-time semantic segmentation networks. Unlike previous works that use a simplified search space and stack a repeatable cell to form a network, we introduce a novel search mechanism with new search space where a lightweight model can be effectively explored through the cell-level diversity and latency-oriented constraint. Specifically, to produce the cell-level diversity, the cell-sharing constraint is eliminated through the cell-independent manner. Then a graph convolution network (GCN) is seamlessly integrated as a communication mechanism between cells. Finally, a latency-oriented constraint is endowed into the search process to balance the speed and performance. Extensive experiments on Cityscapes and CamVid datasets demonstrate that GAS achieves the new state-of-the-art trade-off between accuracy and speed. In particular, on Cityscapes dataset, GAS achieves the new best performance of 73.5% mIoU with speed of 108.4 FPS on Titan Xp.
Tasks	Real-Time Semantic Segmentation, Semantic Segmentation
Published	2019-09-15
URL	https://arxiv.org/abs/1909.06793v2
PDF	https://arxiv.org/pdf/1909.06793v2.pdf
PWC	https://paperswithcode.com/paper/graph-guided-architecture-search-for-real
Repo
Framework