Paper Group ANR 1353
Robust Shape Regularity Criteria for Superpixel Evaluation
Title | Robust Shape Regularity Criteria for Superpixel Evaluation |
Authors | Rémi Giraud, Vinh-Thong Ta, Nicolas Papadakis |
Abstract | Regular decompositions are necessary for most superpixel-based object recognition or tracking applications. So far in the literature, the regularity or compactness of a superpixel shape is mainly measured by its circularity. In this work, we first demonstrate that such measure is not adapted for superpixel evaluation, since it does not directly express regularity but circular appearance. Then, we propose a new metric that considers several shape regularity aspects: convexity, balanced repartition, and contour smoothness. Finally, we demonstrate that our measure is robust to scale and noise and enables to more relevantly compare superpixel methods. |
Tasks | Object Recognition |
Published | 2019-03-17 |
URL | http://arxiv.org/abs/1903.07146v1 |
http://arxiv.org/pdf/1903.07146v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-shape-regularity-criteria-for |
Repo | |
Framework | |
A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting
Title | A Study on Graph-Structured Recurrent Neural Networks and Sparsification with Application to Epidemic Forecasting |
Authors | Zhijian Li, Xiyang Luo, Bao Wang, Andrea L. Bertozzi, Jack Xin |
Abstract | We study epidemic forecasting on real-world health data by a graph-structured recurrent neural network (GSRNN). We achieve state-of-the-art forecasting accuracy on the benchmark CDC dataset. To improve model efficiency, we sparsify the network weights via transformed-$\ell_1$ penalty and maintain prediction accuracy at the same level with 70% of the network weights being zero. |
Tasks | |
Published | 2019-02-13 |
URL | http://arxiv.org/abs/1902.05113v1 |
http://arxiv.org/pdf/1902.05113v1.pdf | |
PWC | https://paperswithcode.com/paper/a-study-on-graph-structured-recurrent-neural |
Repo | |
Framework | |
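The abstract above names a transformed-$\ell_1$ penalty but does not give its formula. A minimal sketch follows, assuming the commonly cited form $\rho_a(w) = (a+1)|w| / (a + |w|)$ with $a > 0$. The function names, the default `a=1.0`, and the sparsity helper are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def transformed_l1(weights, a=1.0):
    """Sum of (a + 1)|w| / (a + |w|) over all weights.

    Assumed form of the penalty; `a` > 0 is a hypothetical default.
    Small `a` behaves like a count of nonzeros, large `a` like the l1 norm.
    """
    w = np.abs(np.asarray(weights, dtype=float))
    return float(np.sum((a + 1.0) * w / (a + w)))


def sparsity(weights, tol=0.0):
    """Fraction of weights whose magnitude is at most `tol`."""
    w = np.abs(np.asarray(weights, dtype=float))
    return float(np.mean(w <= tol))
```

In training, a term such as `lam * transformed_l1(w)` would be added to the forecasting loss, with `lam` a regularization strength chosen by validation; `sparsity` then reports the share of zeroed weights, the quantity the abstract quotes as 70%.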
Beauty Learning and Counterfactual Inference
Title | Beauty Learning and Counterfactual Inference |
Authors | Tao Li |
Abstract | This work showcases a new approach for causal discovery by leveraging user experiments and recent advances in photo-realistic image editing, demonstrating a potential of identifying causal factors and understanding complex systems counterfactually. We introduce the beauty learning problem as an example, which has been discussed metaphysically for centuries and been proved exists, is quantifiable, and can be learned by deep models in our recent paper, where we utilize a natural image generator coupled with user studies to infer causal effects from facial semantics to beauty outcomes, the results of which also align with existing empirical studies. We expect the proposed framework for a broader application in causal inference. |
Tasks | Causal Discovery, Causal Inference, Counterfactual Inference |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.12629v1 |
http://arxiv.org/pdf/1904.12629v1.pdf | |
PWC | https://paperswithcode.com/paper/190412629 |
Repo | |
Framework | |
Real-Time Semantic Segmentation via Multiply Spatial Fusion Network
Title | Real-Time Semantic Segmentation via Multiply Spatial Fusion Network |
Authors | Haiyang Si, Zhiqiang Zhang, Feifan Lv, Gang Yu, Feng Lu |
Abstract | Real-time semantic segmentation plays a significant role in industry applications, such as autonomous driving, robotics and so on. It is a challenging task as both efficiency and performance need to be considered simultaneously. To address such a complex task, this paper proposes an efficient CNN called Multiply Spatial Fusion Network (MSFNet) to achieve fast and accurate perception. The proposed MSFNet uses Class Boundary Supervision to process the relevant boundary information based on our proposed Multi-features Fusion Module which can obtain spatial information and enlarge receptive field. Therefore, the final upsampling of the feature maps of 1/8 original image size can achieve impressive results while maintaining a high speed. Experiments on Cityscapes and CamVid datasets show an obvious advantage of the proposed approach compared with the existing approaches. Specifically, it achieves 77.1% Mean IOU on the Cityscapes test dataset with the speed of 41 FPS for a 1024*2048 input, and 75.4% Mean IOU with the speed of 91 FPS on the CamVid test dataset. |
Tasks | Autonomous Driving, Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07217v1 |
https://arxiv.org/pdf/1911.07217v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-semantic-segmentation-via-multiply |
Repo | |
Framework | |
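The segmentation entries in this list report accuracy as mean IoU (intersection over union averaged across classes). A minimal sketch of the metric on a single pair of label maps follows. The function name and the choice to skip classes absent from both maps are assumptions; benchmark servers such as Cityscapes accumulate counts over the whole test set rather than averaging per image.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for two integer label maps.

    Classes that appear in neither map are skipped (an assumption of this
    sketch). Returns NaN if no class is present at all.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class c is absent from both prediction and target
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2)` averages a class-0 IoU of 1/2 and a class-1 IoU of 2/3.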
Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA
Title | Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA |
Authors | Ricardo Pio Monti, Kun Zhang, Aapo Hyvarinen |
Abstract | We consider the problem of inferring causal relationships between two or more passively observed variables. While the problem of such causal discovery has been extensively studied especially in the bivariate setting, the majority of current methods assume a linear causal relationship, and the few methods which consider non-linear dependencies usually make the assumption of additive noise. Here, we propose a framework through which we can perform causal discovery in the presence of general non-linear relationships. The proposed method is based on recent progress in non-linear independent component analysis and exploits the non-stationarity of observations in order to recover the underlying sources or latent disturbances. We show rigorously that in the case of bivariate causal discovery, such non-linear ICA can be used to infer the causal direction via a series of independence tests. We further propose an alternative measure of causal direction based on asymptotic approximations to the likelihood ratio, as well as an extension to multivariate causal discovery. We demonstrate the capabilities of the proposed method via a series of simulation studies and conclude with an application to neuroimaging data. |
Tasks | Causal Discovery |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09096v1 |
http://arxiv.org/pdf/1904.09096v1.pdf | |
PWC | https://paperswithcode.com/paper/causal-discovery-with-general-non-linear |
Repo | |
Framework | |
Approximating Human Judgment of Generated Image Quality
Title | Approximating Human Judgment of Generated Image Quality |
Authors | Y. Alex Kolchinski, Sharon Zhou, Shengjia Zhao, Mitchell Gordon, Stefano Ermon |
Abstract | Generative models have made immense progress in recent years, particularly in their ability to generate high quality images. However, that quality has been difficult to evaluate rigorously, with evaluation dominated by heuristic approaches that do not correlate well with human judgment, such as the Inception Score and Fréchet Inception Distance. Real human labels have also been used in evaluation, but are inefficient and expensive to collect for each image. Here, we present a novel method to automatically evaluate images based on their quality as perceived by humans. By not only generating image embeddings from Inception network activations and comparing them to the activations for real images, of which other methods perform a variant, but also regressing the activation statistics to match gold standard human labels, we demonstrate 66% accuracy in predicting human scores of image realism, matching the human inter-rater agreement rate. Our approach also generalizes across generative models, suggesting the potential for capturing a model-agnostic measure of image quality. We open source our dataset of human labels for the advancement of research and techniques in this area. |
Tasks | |
Published | 2019-11-30 |
URL | https://arxiv.org/abs/1912.12121v1 |
https://arxiv.org/pdf/1912.12121v1.pdf | |
PWC | https://paperswithcode.com/paper/approximating-human-judgment-of-generated |
Repo | |
Framework | |
Orthogonal Structure Search for Efficient Causal Discovery from Observational Data
Title | Orthogonal Structure Search for Efficient Causal Discovery from Observational Data |
Authors | Anant Raj, Luigi Gresele, Michel Besserve, Bernhard Schölkopf, Stefan Bauer |
Abstract | The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. Recent work exploits stability of regression coefficients or invariance properties of models across different experimental conditions for reconstructing the full causal graph. These approaches generally do not scale well with the number of the explanatory variables and are difficult to extend to nonlinear relationships. Contrary to existing work, we propose an approach which even works for observational data alone, while still offering theoretical guarantees including the case of partially nonlinear relationships. Our algorithm requires only one estimation for each variable and in our experiments we apply our causal discovery algorithm even to large graphs, demonstrating significant improvements compared to well established approaches. |
Tasks | Causal Discovery |
Published | 2019-03-06 |
URL | http://arxiv.org/abs/1903.02456v1 |
http://arxiv.org/pdf/1903.02456v1.pdf | |
PWC | https://paperswithcode.com/paper/orthogonal-structure-search-for-efficient |
Repo | |
Framework | |
Feature Pyramid Encoding Network for Real-time Semantic Segmentation
Title | Feature Pyramid Encoding Network for Real-time Semantic Segmentation |
Authors | Mengyu Liu, Hujun Yin |
Abstract | Although current deep learning methods have achieved impressive results for semantic segmentation, they incur high computational costs and have a huge number of parameters. For real-time applications, inference speed and memory usage are two important factors. To address the challenge, we propose a lightweight feature pyramid encoding network (FPENet) to make a good trade-off between accuracy and speed. Specifically, we use a feature pyramid encoding block to encode multi-scale contextual features with depthwise dilated convolutions in all stages of the encoder. A mutual embedding upsample module is introduced in the decoder to aggregate the high-level semantic features and low-level spatial details efficiently. The proposed network outperforms existing real-time methods with fewer parameters and improved inference speed on the Cityscapes and CamVid benchmark datasets. Specifically, FPENet achieves 68.0% mean IoU on the Cityscapes test set with only 0.4M parameters and 102 FPS speed on an NVIDIA TITAN V GPU. |
Tasks | Panoptic Segmentation, Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08599v1 |
https://arxiv.org/pdf/1909.08599v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-pyramid-encoding-network-for-real |
Repo | |
Framework | |
Emergent Linguistic Phenomena in Multi-Agent Communication Games
Title | Emergent Linguistic Phenomena in Multi-Agent Communication Games |
Authors | Laura Graesser, Kyunghyun Cho, Douwe Kiela |
Abstract | In this work, we propose a computational framework in which agents equipped with communication capabilities simultaneously play a series of referential games, where agents are trained using deep reinforcement learning. We demonstrate that the framework mirrors linguistic phenomena observed in natural language: i) the outcome of contact between communities is a function of inter- and intra-group connectivity; ii) linguistic contact either converges to the majority protocol, or in balanced cases leads to novel creole languages of lower complexity; and iii) a linguistic continuum emerges where neighboring languages are more mutually intelligible than farther removed languages. We conclude that intricate properties of language evolution need not depend on complex evolved linguistic capabilities, but can emerge from simple social exchanges between perceptually-enabled agents playing communication games. |
Tasks | |
Published | 2019-01-25 |
URL | https://arxiv.org/abs/1901.08706v2 |
https://arxiv.org/pdf/1901.08706v2.pdf | |
PWC | https://paperswithcode.com/paper/emergent-linguistic-phenomena-in-multi-agent |
Repo | |
Framework | |
Context and Attribute Grounded Dense Captioning
Title | Context and Attribute Grounded Dense Captioning |
Authors | Guojun Yin, Lu Sheng, Bin Liu, Nenghai Yu, Xiaogang Wang, Jing Shao |
Abstract | Dense captioning aims at simultaneously localizing semantic regions and describing these regions-of-interest (ROIs) with short phrases or sentences in natural language. Previous studies have shown remarkable progresses, but they are often vulnerable to the aperture problem that a caption generated by the features inside one ROI lacks contextual coherence with its surrounding context in the input image. In this work, we investigate contextual reasoning based on multi-scale message propagations from the neighboring contents to the target ROIs. To this end, we design a novel end-to-end context and attribute grounded dense captioning framework consisting of 1) a contextual visual mining module and 2) a multi-level attribute grounded description generation module. Knowing that captions often co-occur with the linguistic attributes (such as who, what and where), we also incorporate an auxiliary supervision from hierarchical linguistic attributes to augment the distinctiveness of the learned captions. Extensive experiments and ablation studies on Visual Genome dataset demonstrate the superiority of the proposed model in comparison to state-of-the-art methods. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01410v1 |
http://arxiv.org/pdf/1904.01410v1.pdf | |
PWC | https://paperswithcode.com/paper/context-and-attribute-grounded-dense |
Repo | |
Framework | |
RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
Title | RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers |
Authors | Bailin Wang, Richard Shin, Xiaodong Liu, Oleksandr Polozov, Matthew Richardson |
Abstract | When translating natural language questions into SQL queries to answer questions from a database, contemporary semantic parsing models struggle to generalize to unseen database schemas. The generalization challenge lies in (a) encoding the database relations in an accessible way for the semantic parser, and (b) modeling alignment between database columns and their mentions in a given query. We present a unified framework, based on the relation-aware self-attention mechanism, to address schema encoding, schema linking, and feature representation within a text-to-SQL encoder. On the challenging Spider dataset this framework boosts the exact match accuracy to 53.7%, compared to 47.4% for the state-of-the-art model unaugmented with BERT embeddings. In addition, we observe qualitative improvements in the model’s understanding of schema linking and alignment. |
Tasks | Semantic Parsing, Text-To-Sql |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.04942v1 |
https://arxiv.org/pdf/1911.04942v1.pdf | |
PWC | https://paperswithcode.com/paper/rat-sql-relation-aware-schema-encoding-and-1 |
Repo | |
Framework | |
Confidence Calibration for Convolutional Neural Networks Using Structured Dropout
Title | Confidence Calibration for Convolutional Neural Networks Using Structured Dropout |
Authors | Zhilu Zhang, Adrian V. Dalca, Mert R. Sabuncu |
Abstract | In classification applications, we often want probabilistic predictions to reflect confidence or uncertainty. Dropout, a commonly used training technique, has recently been linked to Bayesian inference, yielding an efficient way to quantify uncertainty in neural network models. However, as previously demonstrated, confidence estimates computed with a naive implementation of dropout can be poorly calibrated, particularly when using convolutional networks. In this paper, through the lens of ensemble learning, we associate calibration error with the correlation between the models sampled with dropout. Motivated by this, we explore the use of structured dropout to promote model diversity and improve confidence calibration. We use the SVHN, CIFAR-10 and CIFAR-100 datasets to empirically compare model diversity and confidence errors obtained using various dropout techniques. We also show the merit of structured dropout in a Bayesian active learning application. |
Tasks | Active Learning, Bayesian Inference, Calibration |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.09551v1 |
https://arxiv.org/pdf/1906.09551v1.pdf | |
PWC | https://paperswithcode.com/paper/confidence-calibration-for-convolutional |
Repo | |
Framework | |
A Deep Learning Approach Towards Prediction of Faults in Wind Turbines
Title | A Deep Learning Approach Towards Prediction of Faults in Wind Turbines |
Authors | Joyjit Chatterjee, Nina Dethlefs |
Abstract | With the rising costs of conventional sources of energy, the world is moving towards sustainable energy sources including wind energy. Wind turbines consist of several electrical and mechanical components and experience an enormous amount of irregular loads, making their operational behaviour at times inconsistent. Operations and Maintenance (O&M) is a key factor in monitoring such inconsistent behaviour of the turbines in order to predict and prevent any incipient faults which may occur in the near future. Machine learning has been applied to the domain of wind energy over the last decade for analysing, diagnosing and predicting wind turbine faults. In particular, we follow the idea of modelling a turbine’s performance as a power curve where any power outputs that fall off the curve can be seen as performance errors. Existing work using this idea has used data from a turbine’s Supervisory Control & Acquisition (SCADA) system to filter and analyse fault & alarm data using regression techniques. In contrast to previous work, we explore how deep learning can be applied to fault prediction from open access meteorological data only. |
Tasks | |
Published | 2019-12-12 |
URL | https://arxiv.org/abs/2001.03612v1 |
https://arxiv.org/pdf/2001.03612v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-towards-prediction |
Repo | |
Framework | |
ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks
Title | ET-GAN: Cross-Language Emotion Transfer Based on Cycle-Consistent Generative Adversarial Networks |
Authors | Xiaoqi Jia, Jianwei Tai, Hang Zhou, Yakai Li, Weijuan Zhang, Haichao Du, Qingjia Huang |
Abstract | Despite the remarkable progress made in synthesizing emotional speech from text, it is still challenging to provide emotion information to existing speech segments. Previous methods mainly rely on parallel data, and few works have studied the generalization ability for one model to transfer emotion information across different languages. To cope with such problems, we propose an emotion transfer system named ET-GAN, for learning language-independent emotion transfer from one emotion to another without parallel training samples. Based on cycle-consistent generative adversarial network, our method ensures the transfer of only emotion information across speeches with simple loss designs. Besides, we introduce an approach for migrating emotion information across different languages by using transfer learning. The experiment results show that our method can efficiently generate high-quality emotional speech for any given emotion category, without aligned speech pairs. |
Tasks | Domain Adaptation, Speech Synthesis, Transfer Learning |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11173v3 |
https://arxiv.org/pdf/1905.11173v3.pdf | |
PWC | https://paperswithcode.com/paper/eg-gan-cross-language-emotion-gain-synthesis |
Repo | |
Framework | |
Graph-guided Architecture Search for Real-time Semantic Segmentation
Title | Graph-guided Architecture Search for Real-time Semantic Segmentation |
Authors | Peiwen Lin, Peng Sun, Guangliang Cheng, Sirui Xie, Xi Li, Jianping Shi |
Abstract | Designing a lightweight semantic segmentation network often requires researchers to find a trade-off between performance and speed, which is always empirical due to the limited interpretability of neural networks. In order to release researchers from these tedious mechanical trials, we propose a Graph-guided Architecture Search (GAS) pipeline to automatically search real-time semantic segmentation networks. Unlike previous works that use a simplified search space and stack a repeatable cell to form a network, we introduce a novel search mechanism with new search space where a lightweight model can be effectively explored through the cell-level diversity and latency-oriented constraint. Specifically, to produce the cell-level diversity, the cell-sharing constraint is eliminated through the cell-independent manner. Then a graph convolution network (GCN) is seamlessly integrated as a communication mechanism between cells. Finally, a latency-oriented constraint is endowed into the search process to balance the speed and performance. Extensive experiments on Cityscapes and CamVid datasets demonstrate that GAS achieves the new state-of-the-art trade-off between accuracy and speed. In particular, on Cityscapes dataset, GAS achieves the new best performance of 73.5% mIoU with speed of 108.4 FPS on Titan Xp. |
Tasks | Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-09-15 |
URL | https://arxiv.org/abs/1909.06793v2 |
https://arxiv.org/pdf/1909.06793v2.pdf | |
PWC | https://paperswithcode.com/paper/graph-guided-architecture-search-for-real |
Repo | |
Framework | |