January 31, 2020

Paper Group AWR 450

Factor Graph Attention

Title Factor Graph Attention
Authors Idan Schwartz, Seunghak Yu, Tamir Hazan, Alexander Schwing
Abstract Dialog is an effective way to exchange information, but subtle details and nuances are extremely important. While significant progress has paved a path to address visual dialog with algorithms, details and nuances remain a challenge. Attention mechanisms have demonstrated compelling results to extract details in visual question answering and also provide a convincing framework for visual dialog due to their interpretability and effectiveness. However, the many data utilities that accompany visual dialog challenge existing attention techniques. We address this issue and develop a general attention mechanism for visual dialog which operates on any number of data utilities. To this end, we design a factor graph based attention mechanism which combines any number of utility representations. We illustrate the applicability of the proposed approach on the challenging and recently introduced VisDial datasets, outperforming recent state-of-the-art methods by 1.1% for VisDial0.9 and by 2% for VisDial1.0 on MRR. Our ensemble model improved the MRR score on VisDial1.0 by more than 6%.
Tasks Question Answering, Visual Dialog, Visual Question Answering
Published 2019-04-11
URL https://arxiv.org/abs/1904.05880v3
PDF https://arxiv.org/pdf/1904.05880v3.pdf
PWC https://paperswithcode.com/paper/factor-graph-attention
Repo https://github.com/idansc/fga
Framework none
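
The MRR figures quoted in the abstract above are mean reciprocal ranks: for each question, the model ranks a list of candidate answers, and the score is the average of 1/rank of the ground-truth answer. A minimal sketch of that metric follows. The function names are hypothetical, and the tie-handling rule (rank is one plus the number of strictly higher-scored candidates) is an assumption, not taken from the paper or the VisDial evaluation code.

```python
def rank_of_ground_truth(scores, gt_index):
    """Return the 1-based rank of the ground-truth candidate.

    Assumption: ties are resolved optimistically, so the rank is
    1 + the number of candidates scored strictly higher.
    """
    gt_score = scores[gt_index]
    return 1 + sum(1 for s in scores if s > gt_score)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all questions (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


if __name__ == "__main__":
    # Three questions, each with its own candidate scores.
    examples = [
        ([0.9, 0.1, 0.3], 0),  # ground truth ranked 1st
        ([0.2, 0.8, 0.5], 2),  # ground truth ranked 2nd
        ([0.4, 0.6, 0.7, 0.1], 3),  # ground truth ranked 4th
    ]
    ranks = [rank_of_ground_truth(s, i) for s, i in examples]
    print(ranks, mean_reciprocal_rank(ranks))
```

Because the metric only depends on the rank of the correct answer, an improvement of a couple of MRR points means the ground truth moved up the candidate list on average, not that more answers were exactly right.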

Parameterized quantum circuits as machine learning models

Title Parameterized quantum circuits as machine learning models
Authors Marcello Benedetti, Erika Lloyd, Stefan Sack, Mattia Fiorentini
Abstract Hybrid quantum-classical systems make it possible to utilize existing quantum computers to their fullest extent. Within this framework, parameterized quantum circuits can be regarded as machine learning models with remarkable expressive power. This Review presents the components of these models and discusses their application to a variety of data-driven tasks, such as supervised learning and generative modeling. With an increasing number of experimental demonstrations carried out on actual quantum hardware and with software being actively developed, this rapidly growing field is poised to have a broad spectrum of real-world applications.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07682v2
PDF https://arxiv.org/pdf/1906.07682v2.pdf
PWC https://paperswithcode.com/paper/parameterized-quantum-circuits-as-machine
Repo https://github.com/UnofficialJuliaMirror/Yao.jl-5872b779-8223-5990-8dd0-5abbb0748c8c
Framework none
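
To make the idea of a parameterized circuit as a trainable model concrete, here is a toy sketch simulated classically: one qubit, one RY(theta) rotation applied to |0>, with the expectation value of Z as the model output. The gradient uses the parameter-shift rule, a standard technique for such circuits that the abstract itself does not mention. All function names, the learning rate, and the step count are illustrative assumptions, and nothing here uses the API of the linked repository.

```python
import math


def expectation_z(theta):
    """<Z> after applying RY(theta) to |0>.

    RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>,
    so <Z> = cos^2(theta/2) - sin^2(theta/2) = cos(theta).
    """
    a0 = math.cos(theta / 2.0)
    a1 = math.sin(theta / 2.0)
    return a0 * a0 - a1 * a1


def parameter_shift_gradient(f, theta):
    """Exact gradient for a rotation gate via two shifted evaluations."""
    return 0.5 * (f(theta + math.pi / 2.0) - f(theta - math.pi / 2.0))


def train(theta=0.1, learning_rate=0.4, steps=100):
    """Classical outer loop: gradient descent that minimizes <Z>."""
    for _ in range(steps):
        theta -= learning_rate * parameter_shift_gradient(expectation_z, theta)
    return theta


if __name__ == "__main__":
    theta = train()
    print(theta, expectation_z(theta))
```

The split mirrors the hybrid setup described in the abstract: the circuit evaluation would run on quantum hardware, while the parameter update stays on a classical computer.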

Deep Audio Prior

Title Deep Audio Prior
Authors Yapeng Tian, Chenliang Xu, Dingzeyu Li
Abstract Deep convolutional neural networks are known to specialize in distilling compact and robust priors from large amounts of data. We are interested in applying deep networks in the absence of a training dataset. In this paper, we introduce the deep audio prior (DAP), which leverages the structure of a network and the temporal information in a single audio file. Specifically, we demonstrate that a randomly initialized neural network can be used with a carefully designed audio prior to tackle challenging audio problems such as universal blind source separation, interactive audio editing, audio texture synthesis, and audio co-separation. To understand the robustness of the deep audio prior, we construct a benchmark dataset, Universal-150, for universal sound source separation with a diverse set of sources. We show superior audio results compared with previous work in both qualitative and quantitative evaluations. We also perform a thorough ablation study to validate our design choices.
Tasks Texture Synthesis
Published 2019-12-21
URL https://arxiv.org/abs/1912.10292v1
PDF https://arxiv.org/pdf/1912.10292v1.pdf
PWC https://paperswithcode.com/paper/deep-audio-prior-1
Repo https://github.com/adobe/Deep-Audio-Prior
Framework pytorch

NAMF: A Non-local Adaptive Mean Filter for Salt-and-Pepper Noise Removal

Title NAMF: A Non-local Adaptive Mean Filter for Salt-and-Pepper Noise Removal
Authors Houwang Zhang, Chong Wu, Hanying Zheng, Le Zhang
Abstract In this paper, a non-local adaptive mean filter (NAMF) is proposed, which can eliminate all levels of salt-and-pepper (SAP) noise. NAMF can be divided into two stages: (1) SAP noise detection; (2) SAP noise elimination. First, for a given pixel, we compare it with the maximum and minimum gray values of the noisy image; if it equals either, we use a window of adaptive size to further determine whether it is noisy, and noiseless pixels are left unchanged. Second, each noisy pixel is replaced by a combination of its neighboring pixels. Finally, we use a SAP-noise-based non-local mean filter to further restore it. Our experimental results show that NAMF outperforms state-of-the-art methods in terms of restoration quality at all SAP noise levels.
Tasks Salt-And-Pepper Noise Removal
Published 2019-10-17
URL https://arxiv.org/abs/1910.07787v1
PDF https://arxiv.org/pdf/1910.07787v1.pdf
PWC https://paperswithcode.com/paper/namf-a-non-local-adaptive-mean-filter-for
Repo https://github.com/ProfHubert/NAMF
Framework none
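
The detect-then-replace structure described in the abstract above can be illustrated with a heavily simplified sketch. It is not the authors' NAMF: it flags every pixel equal to the image minimum or maximum as noisy (skipping the paper's adaptive-window confirmation step), replaces each flagged pixel with the mean of unflagged neighbors in a growing window, and omits the final non-local mean stage entirely. The function names and the `max_radius` parameter are hypothetical.

```python
def detect_candidates(img):
    """Flag pixels equal to the minimum or maximum gray value of the image."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    return [[p == lo or p == hi for p in row] for row in img]


def restore(img, max_radius=3):
    """Replace flagged pixels with the mean of unflagged neighbors.

    The window grows from radius 1 up to max_radius until it contains
    at least one unflagged pixel (a simple stand-in for an adaptive window).
    """
    h, w = len(img), len(img[0])
    noisy = detect_candidates(img)
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if not noisy[y][x]:
                continue  # pixels judged noiseless are left unchanged
            for r in range(1, max_radius + 1):
                clean = [
                    img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))
                    if not noisy[j][i]
                ]
                if clean:
                    out[y][x] = sum(clean) / len(clean)
                    break
    return out


if __name__ == "__main__":
    image = [
        [100, 100, 100],
        [100, 255, 100],
        [100, 0, 100],
    ]
    for row in restore(image):
        print(row)
```

One limitation of this simplification: in a noise-free image, the genuinely darkest and brightest pixels would also be flagged, which is the kind of false positive the confirmation step in the abstract is meant to rule out.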

PoMo: Generating Entity-Specific Post-Modifiers in Context

Title PoMo: Generating Entity-Specific Post-Modifiers in Context
Authors Jun Seok Kang, Robert L. Logan IV, Zewei Chu, Yang Chen, Dheeru Dua, Kevin Gimpel, Sameer Singh, Niranjan Balasubramanian
Abstract We introduce entity post-modifier generation as an instance of a collaborative writing task. Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity. For example, for the sentence, “Barack Obama, _______, supported the #MeToo movement.”, the phrase “a father of two girls” is a contextually relevant post-modifier. To this end, we build PoMo, a post-modifier dataset created automatically from news articles reflecting a journalistic need for incorporating entity information that is relevant to a particular news event. PoMo consists of more than 231K sentences with post-modifiers and associated facts extracted from Wikidata for around 57K unique entities. We use crowdsourcing to show that modeling contextual relevance is necessary for accurate post-modifier generation. We adapt a number of existing generation approaches as baselines for this dataset. Our results show there is large room for improvement in terms of both identifying relevant facts to include (knowing which claims are relevant gives a >20% improvement in BLEU score), and generating appropriate post-modifier text for the context (providing relevant claims is not sufficient for accurate generation). We conduct an error analysis that suggests promising directions for future research.
Tasks
Published 2019-04-05
URL http://arxiv.org/abs/1904.03111v2
PDF http://arxiv.org/pdf/1904.03111v2.pdf
PWC https://paperswithcode.com/paper/pomo-generating-entity-specific-post
Repo https://github.com/StonyBrookNLP/PoMo
Framework none

Densely Residual Laplacian Super-Resolution

Title Densely Residual Laplacian Super-Resolution
Authors Saeed Anwar, Nick Barnes
Abstract Super-resolution convolutional neural networks have recently demonstrated high-quality restoration for single images. However, existing algorithms often require very deep architectures and long training times. Furthermore, current convolutional neural networks for super-resolution are unable to exploit features at multiple scales and weigh them equally, limiting their learning capability. In this exposition, we present a compact and accurate super-resolution algorithm, namely the Densely Residual Laplacian Network (DRLN). The proposed network employs cascading residual on the residual structure to allow the flow of low-frequency information to focus on learning high and mid-level features. In addition, deep supervision is achieved via the densely concatenated residual blocks settings, which also helps in learning from high-level complex features. Moreover, we propose Laplacian attention to model the crucial features to learn the inter- and intra-level dependencies between the feature maps. Furthermore, comprehensive quantitative and qualitative evaluations on low-resolution, noisy low-resolution, and real historical image benchmark datasets illustrate that our DRLN algorithm performs favorably against the state-of-the-art methods both visually and in accuracy.
Tasks Super-Resolution
Published 2019-06-28
URL https://arxiv.org/abs/1906.12021v2
PDF https://arxiv.org/pdf/1906.12021v2.pdf
PWC https://paperswithcode.com/paper/densely-residual-laplacian-super-resolution
Repo https://github.com/saeed-anwar/DRLN
Framework pytorch

DPSNet: End-to-end Deep Plane Sweep Stereo

Title DPSNet: End-to-end Deep Plane Sweep Stereo
Authors Sunghoon Im, Hae-Gon Jeon, Stephen Lin, In So Kweon
Abstract Multiview stereo aims to reconstruct scene depth from images acquired by a camera under arbitrary motion. Recent methods address this problem through deep learning, which can utilize semantic cues to deal with challenges such as textureless and reflective regions. In this paper, we present a convolutional neural network called DPSNet (Deep Plane Sweep Network) whose design is inspired by best practices of traditional geometry-based approaches for dense depth reconstruction. Rather than directly estimating depth and/or optical flow correspondence from image pairs as done in many previous deep learning methods, DPSNet takes a plane sweep approach that involves building a cost volume from deep features using the plane sweep algorithm, regularizing the cost volume via a context-aware cost aggregation, and regressing the dense depth map from the cost volume. The cost volume is constructed using a differentiable warping process that allows for end-to-end training of the network. Through the effective incorporation of conventional multiview stereo concepts within a deep learning framework, DPSNet achieves state-of-the-art reconstruction results on a variety of challenging datasets.
Tasks Optical Flow Estimation
Published 2019-05-02
URL http://arxiv.org/abs/1905.00538v1
PDF http://arxiv.org/pdf/1905.00538v1.pdf
PWC https://paperswithcode.com/paper/dpsnet-end-to-end-deep-plane-sweep-stereo-1
Repo https://github.com/sunghoonim/DPSNet
Framework pytorch

Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems

Title Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems
Authors Qitian Wu, Hengrui Zhang, Xiaofeng Gao, Peng He, Paul Weng, Han Gao, Guihai Chen
Abstract Social recommendation leverages social information to solve data sparsity and cold-start problems in traditional collaborative filtering methods. However, most existing models assume that social effects from friend users are static and under the forms of constant weights or fixed constraints. To relax this strong assumption, in this paper, we propose dual graph attention networks to collaboratively learn representations for two-fold social effects, where one is modeled by a user-specific attention weight and the other is modeled by a dynamic and context-aware attention weight. We also extend the social effects in user domain to item domain, so that information from related items can be leveraged to further alleviate the data sparsity problem. Furthermore, considering that different social effects in two domains could interact with each other and jointly influence user preferences for items, we propose a new policy-based fusion strategy based on contextual multi-armed bandit to weigh interactions of various social effects. Experiments on one benchmark dataset and a commercial dataset verify the efficacy of the key components in our model. The results show that our model achieves great improvement for recommendation accuracy compared with other state-of-the-art social recommendation methods.
Tasks Recommendation Systems
Published 2019-03-25
URL http://arxiv.org/abs/1903.10433v1
PDF http://arxiv.org/pdf/1903.10433v1.pdf
PWC https://paperswithcode.com/paper/dual-graph-attention-networks-for-deep-latent
Repo https://github.com/echo740/DANSER-WWW-19
Framework tf

Convolutional Neural Networks for Classification of Alzheimer’s Disease: Overview and Reproducible Evaluation

Title Convolutional Neural Networks for Classification of Alzheimer’s Disease: Overview and Reproducible Evaluation
Authors Junhao Wen, Elina Thibeau-Sutre, Mauricio Diaz-Melo, Jorge Samper-Gonzalez, Alexandre Routier, Simona Bottani, Didier Dormont, Stanley Durrleman, Ninon Burgos, Olivier Colliot
Abstract Over 30 papers have proposed to use convolutional neural networks (CNNs) for AD classification from anatomical MRI. However, the classification performance is difficult to compare across studies due to variations in components such as participant selection, image preprocessing or validation procedure. Moreover, these studies are hardly reproducible because their frameworks are not publicly accessible and because implementation details are lacking. Lastly, some of these papers may report a biased performance due to inadequate or unclear validation or model selection procedures. In the present work, we aim to address these limitations through three main contributions. First, we performed a systematic literature review and found that more than half of the surveyed papers may have suffered from data leakage. Our second contribution is the extension of our open-source framework for classification of AD using CNNs and T1-weighted MRI. Finally, we used this framework to rigorously compare different CNN architectures. The data was split into training/validation/test sets at the very beginning and only the training/validation sets were used for model selection. To avoid any overfitting, the test sets were left untouched until the end of the peer-review process. Overall, the different 3D approaches (3D-subject, 3D-ROI, 3D-patch) achieved similar performances while that of the 2D slice approach was lower. Of note, the different CNN approaches did not perform better than an SVM with voxel-based features. The different approaches generalized well to similar populations but not to datasets with different inclusion criteria or demographic characteristics.
Tasks Model Selection, Transfer Learning
Published 2019-04-16
URL https://arxiv.org/abs/1904.07773v4
PDF https://arxiv.org/pdf/1904.07773v4.pdf
PWC https://paperswithcode.com/paper/convolutional-neural-networks-for-2
Repo https://github.com/SSinyu/p
Framework tf
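
The abstract above reports that many surveyed papers may have suffered from data leakage, without listing the causes. One common cause, assumed here for illustration, is splitting at the image or slice level so that scans from the same participant land in both the training and the test set. A minimal sketch of a subject-level split follows; the function name, the `(subject_id, item)` sample format, and the default fraction are hypothetical and not taken from the paper's framework.

```python
import random


def split_by_subject(samples, test_fraction=0.2, seed=0):
    """Split (subject_id, item) pairs so no subject appears in both sets.

    The split is decided over subjects, not over individual items,
    and is reproducible for a given seed.
    """
    subjects = sorted({subject_id for subject_id, _ in samples})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [s for s in samples if s[0] not in test_subjects]
    test = [s for s in samples if s[0] in test_subjects]
    return train, test


if __name__ == "__main__":
    # Ten hypothetical subjects with three scans each.
    data = [(f"sub-{i:02d}", f"scan-{j}") for i in range(10) for j in range(3)]
    train, test = split_by_subject(data)
    print(len(train), len(test))
```

Fixing the split once, before any model selection, matches the procedure the abstract describes: only the training and validation portions are touched until the final evaluation.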

ICface: Interpretable and Controllable Face Reenactment Using GANs

Title ICface: Interpretable and Controllable Face Reenactment Using GANs
Authors Soumya Tripathy, Juho Kannala, Esa Rahtu
Abstract This paper presents a generic face animator that is able to control the pose and expressions of a given face image. The animation is driven by human interpretable control signals consisting of head pose angles and the Action Unit (AU) values. The control information can be obtained from multiple sources including external driving videos and manual controls. Due to the interpretable nature of the driving signal, one can easily mix the information between multiple sources (e.g. pose from one image and expression from another) and apply selective post-production editing. The proposed face animator is implemented as a two-stage neural network model that is learned in a self-supervised manner using a large video collection. The proposed Interpretable and Controllable face reenactment network (ICface) is compared to the state-of-the-art neural network-based face animation techniques in multiple tasks. The results indicate that ICface produces better visual quality while being more versatile than most of the comparison methods. The introduced model could provide a lightweight and easy to use tool for a multitude of advanced image and video editing tasks.
Tasks Face Reenactment
Published 2019-04-03
URL https://arxiv.org/abs/1904.01909v2
PDF https://arxiv.org/pdf/1904.01909v2.pdf
PWC https://paperswithcode.com/paper/icface-interpretable-and-controllable-face
Repo https://github.com/Blade6570/icface
Framework pytorch

FAHT: An Adaptive Fairness-aware Decision Tree Classifier

Title FAHT: An Adaptive Fairness-aware Decision Tree Classifier
Authors Wenbin Zhang, Eirini Ntoutsi
Abstract Automated data-driven decision-making systems are ubiquitous across a wide range of online and offline services. These systems depend on sophisticated learning algorithms and available data to optimize the service function for decision support assistance. However, there is growing concern about the accountability and fairness of the employed models because the available historical data is often intrinsically discriminatory, i.e., the proportion of members sharing one or more sensitive attributes is higher than the proportion in the population as a whole when receiving positive classification, which leads to a lack of fairness in decision support systems. A number of fairness-aware learning methods have been proposed to handle this concern. However, these methods tackle fairness as a static problem and do not take the evolution of the underlying stream population into consideration. In this paper, we introduce a learning mechanism to design a fair classifier for online stream-based decision-making. Our learning model, FAHT (Fairness-Aware Hoeffding Tree), is an extension of the well-known Hoeffding Tree algorithm for decision tree induction over streams that also accounts for fairness. Our experiments show that our algorithm is able to deal with discrimination in streaming environments while maintaining moderate predictive performance over the stream.
Tasks Decision Making
Published 2019-07-16
URL https://arxiv.org/abs/1907.07237v1
PDF https://arxiv.org/pdf/1907.07237v1.pdf
PWC https://paperswithcode.com/paper/faht-an-adaptive-fairness-aware-decision-tree
Repo https://github.com/vanbanTruong/FAHT
Framework none
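
FAHT builds on the Hoeffding Tree, which decides when a leaf has seen enough stream examples to split by comparing the gap between the two best split candidates against the Hoeffding bound. The sketch below shows only that standard test; the fairness-aware part of FAHT's splitting criterion is not described in the abstract and is not reproduced here. The function names and example numbers are illustrative.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range: range R of the split metric (e.g. 1.0 for a metric in [0, 1])
    delta: allowed probability that the true mean deviates by more than epsilon
    n: number of examples observed at the leaf
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split when the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    eps = hoeffding_bound(1.0, 1e-7, 1000)
    print(round(eps, 4))
    print(should_split(0.30, 0.10, 1.0, 1e-7, 1000))
    print(should_split(0.30, 0.25, 1.0, 1e-7, 1000))
```

Because epsilon shrinks as n grows, a leaf with two close candidates simply waits for more stream data before committing to a split.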

A Simple Yet Effective Approach to Robust Optimization Over Time

Title A Simple Yet Effective Approach to Robust Optimization Over Time
Authors Lukáš Adam, Xin Yao
Abstract Robust optimization over time (ROOT) refers to an optimization problem whose performance is evaluated over a period of future time. Most existing algorithms use particle swarm optimization combined with another method that predicts future solutions to the optimization problem. We argue that this approach may perform poorly and instead suggest a method based on random sampling of the search space. We prove theoretical guarantees for it and show that it significantly outperforms the state-of-the-art methods for ROOT.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.09248v3
PDF https://arxiv.org/pdf/1907.09248v3.pdf
PWC https://paperswithcode.com/paper/a-simple-yet-effective-approach-to-robust
Repo https://github.com/sadda/ROOT-Benchmark
Framework none

Human-grounded Evaluations of Explanation Methods for Text Classification

Title Human-grounded Evaluations of Explanation Methods for Text Classification
Authors Piyawat Lertvittayakumjorn, Francesca Toni
Abstract Due to the black-box nature of deep learning models, methods for explaining the models’ results are crucial to gain trust from humans and support collaboration between AIs and humans. In this paper, we consider several model-agnostic and model-specific explanation methods for CNNs for text classification and conduct three human-grounded evaluations, focusing on different purposes of explanations: (1) revealing model behavior, (2) justifying model predictions, and (3) helping humans investigate uncertain predictions. The results highlight dissimilar qualities of the various explanation methods we consider and show the degree to which these methods could serve for each purpose.
Tasks Text Classification
Published 2019-08-29
URL https://arxiv.org/abs/1908.11355v1
PDF https://arxiv.org/pdf/1908.11355v1.pdf
PWC https://paperswithcode.com/paper/human-grounded-evaluations-of-explanation
Repo https://github.com/plkumjorn/CNNAnalysis
Framework none

Pathology GAN: Learning deep representations of cancer tissue

Title Pathology GAN: Learning deep representations of cancer tissue
Authors Adalberto Claudio Quiros, Roderick Murray-Smith, Ke Yuan
Abstract We apply Generative Adversarial Networks (GANs) to the domain of digital pathology. Current machine learning research for digital pathology focuses on diagnosis, but we suggest a different approach and advocate that generative models could drive forward the understanding of morphological characteristics of cancer tissue. In this paper, we develop a framework which allows GANs to capture key tissue features and uses these characteristics to give structure to its latent space. To this end, we trained our model on 249K H&E breast cancer tissue images. We show that our model generates high quality images, with a Fréchet Inception Distance (FID) of 16.65. We additionally assess the quality of the images with cancer tissue characteristics (e.g. count of cancer, lymphocytes, or stromal cells), using quantitative information to calculate the FID and showing consistent performance of 9.86. Additionally, the latent space of our model shows an interpretable structure and allows semantic vector operations that translate into tissue feature transformations. Furthermore, ratings from two expert pathologists found no significant difference between our generated tissue images and real ones.
Tasks
Published 2019-07-04
URL https://arxiv.org/abs/1907.02644v2
PDF https://arxiv.org/pdf/1907.02644v2.pdf
PWC https://paperswithcode.com/paper/pathology-gan-learning-deep-representations
Repo https://github.com/AdalbertoCq/Pathology-GAN
Framework tf
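
The FID values in the abstract above are Fréchet distances between two Gaussians fitted to feature vectors of real and generated images: the squared distance between the means plus Tr(S1 + S2 - 2 (S1 S2)^(1/2)) for covariances S1 and S2. The sketch below covers only the special case of diagonal covariances, where the trace term reduces to a sum of squared differences of standard deviations. That restriction is an assumption made to keep the example dependency-free; the full metric uses full covariance matrices of Inception features, and the function name is hypothetical.

```python
import math


def frechet_distance_diag(mu1, var1, mu2, var2):
    """Fréchet distance between two Gaussians with diagonal covariances.

    mu1, mu2: per-dimension means
    var1, var2: per-dimension variances (the diagonals of the covariances)
    With diagonal covariances, Tr(S1 + S2 - 2 (S1 S2)^(1/2)) equals
    the sum over dimensions of (sqrt(var1) - sqrt(var2))^2.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum(
        (math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var1, var2)
    )
    return mean_term + cov_term


if __name__ == "__main__":
    real_mu, real_var = [0.0, 0.0], [1.0, 1.0]
    fake_mu, fake_var = [3.0, 4.0], [1.0, 1.0]
    print(frechet_distance_diag(real_mu, real_var, fake_mu, fake_var))
```

Lower is better: the distance is zero only when the two fitted Gaussians coincide, which is why a drop from 16.65 to 9.86 indicates closer feature statistics.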

Orientation-aware Semantic Segmentation on Icosahedron Spheres

Title Orientation-aware Semantic Segmentation on Icosahedron Spheres
Authors Chao Zhang, Stephan Liwicki, William Smith, Roberto Cipolla
Abstract We address semantic segmentation on omnidirectional images, to leverage a holistic understanding of the surrounding scene for applications like autonomous driving systems. For the spherical domain, several methods have recently adopted an icosahedron mesh, but systems are typically rotation invariant or require significant memory and parameters, thus enabling execution only at very low resolutions. In our work, we propose an orientation-aware CNN framework for the icosahedron mesh. Our representation allows for fast network operations, as our design simplifies to standard network operations of classical CNNs, but under consideration of north-aligned kernel convolutions for features on the sphere. We implement our representation and demonstrate its memory efficiency up to a level-8 resolution mesh (equivalent to 640 x 1024 equirectangular images). Finally, since our kernels operate on the tangent of the sphere, standard feature weights, pretrained on perspective data, can be directly transferred with only a small need for weight refinement. In our evaluation, our orientation-aware CNN becomes a new state of the art for the recent 2D3DS dataset and for our Omni-SYNTHIA version of SYNTHIA. Rotation-invariant classification and segmentation tasks are additionally presented for comparison to prior art.
Tasks Autonomous Driving, Semantic Segmentation
Published 2019-07-30
URL https://arxiv.org/abs/1907.12849v1
PDF https://arxiv.org/pdf/1907.12849v1.pdf
PWC https://paperswithcode.com/paper/orientation-aware-semantic-segmentation-on
Repo https://github.com/matsuren/HexRUNet_pytorch
Framework pytorch