January 31, 2020

Paper Group AWR 450

Factor Graph Attention. Parameterized quantum circuits as machine learning models. Deep Audio Prior. NAMF: A Non-local Adaptive Mean Filter for Salt-and-Pepper Noise Removal. PoMo: Generating Entity-Specific Post-Modifiers in Context. Densely Residual Laplacian Super-Resolution. DPSNet: End-to-end Deep Plane Sweep Stereo. Dual Graph Attention Netwo …

Factor Graph Attention


Title	Factor Graph Attention
Authors	Idan Schwartz, Seunghak Yu, Tamir Hazan, Alexander Schwing
Abstract	Dialog is an effective way to exchange information, but subtle details and nuances are extremely important. While significant progress has paved a path to address visual dialog with algorithms, details and nuances remain a challenge. Attention mechanisms have demonstrated compelling results to extract details in visual question answering and also provide a convincing framework for visual dialog due to their interpretability and effectiveness. However, the many data utilities that accompany visual dialog challenge existing attention techniques. We address this issue and develop a general attention mechanism for visual dialog which operates on any number of data utilities. To this end, we design a factor graph based attention mechanism which combines any number of utility representations. We illustrate the applicability of the proposed approach on the challenging and recently introduced VisDial datasets, outperforming recent state-of-the-art methods by 1.1% for VisDial0.9 and by 2% for VisDial1.0 on MRR. Our ensemble model improved the MRR score on VisDial1.0 by more than 6%.
Tasks	Question Answering, Visual Dialog, Visual Question Answering
Published	2019-04-11
URL	https://arxiv.org/abs/1904.05880v3
PDF	https://arxiv.org/pdf/1904.05880v3.pdf
PWC	https://paperswithcode.com/paper/factor-graph-attention
Repo	https://github.com/idansc/fga
Framework	none
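
The abstract reports its gains in MRR (mean reciprocal rank). As a minimal sketch of how that metric is usually computed, assuming each question comes with the 1-based rank the model assigned to the ground-truth answer (the function name and the example ranks are illustrative, not taken from the paper or its repository):

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all questions.

    ranks: 1-based rank of the ground-truth answer for each question
    (hypothetical input format, chosen for this sketch).
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Ground-truth answer ranked 1st, 2nd and 4th among the candidates.
example_mrr = mean_reciprocal_rank([1, 2, 4])
```

A model that always ranks the correct answer first scores 1.0; lower values mean the correct answer tends to sit further down the candidate list.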

Parameterized quantum circuits as machine learning models


Title	Parameterized quantum circuits as machine learning models
Authors	Marcello Benedetti, Erika Lloyd, Stefan Sack, Mattia Fiorentini
Abstract	Hybrid quantum-classical systems make it possible to utilize existing quantum computers to their fullest extent. Within this framework, parameterized quantum circuits can be regarded as machine learning models with remarkable expressive power. This Review presents the components of these models and discusses their application to a variety of data-driven tasks, such as supervised learning and generative modeling. With an increasing number of experimental demonstrations carried out on actual quantum hardware and with software being actively developed, this rapidly growing field is poised to have a broad spectrum of real-world applications.
Tasks
Published	2019-06-18
URL	https://arxiv.org/abs/1906.07682v2
PDF	https://arxiv.org/pdf/1906.07682v2.pdf
PWC	https://paperswithcode.com/paper/parameterized-quantum-circuits-as-machine
Repo	https://github.com/UnofficialJuliaMirror/Yao.jl-5872b779-8223-5990-8dd0-5abbb0748c8c
Framework	none
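
To make "parameterized quantum circuits as machine learning models" concrete, here is a toy sketch, assuming a one-qubit circuit RY(theta) applied to |0> and simulated classically. The circuit choice, the parameter-shift gradient, and all function names are assumptions made for illustration; they are not taken from the review or from Yao.jl.

```python
import math


def expectation_z(theta):
    """<Z> after applying RY(theta) to |0>, via the two state amplitudes."""
    amp0 = math.cos(theta / 2.0)  # amplitude of |0>
    amp1 = math.sin(theta / 2.0)  # amplitude of |1>
    return amp0 ** 2 - amp1 ** 2  # P(0) - P(1), which equals cos(theta)


def parameter_shift_gradient(f, theta):
    """Gradient from two shifted circuit evaluations (exact for this gate)."""
    return 0.5 * (f(theta + math.pi / 2) - f(theta - math.pi / 2))


def train(theta=0.1, lr=0.2, steps=200):
    """Gradient descent that minimizes <Z>, driving the qubit toward |1>."""
    for _ in range(steps):
        theta -= lr * parameter_shift_gradient(expectation_z, theta)
    return theta
```

In the hybrid setting the abstract describes, `expectation_z` would be estimated on quantum hardware while the update loop runs on a classical computer; here both parts are simulated classically.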

Deep Audio Prior


Title	Deep Audio Prior
Authors	Yapeng Tian, Chenliang Xu, Dingzeyu Li
Abstract	Deep convolutional neural networks are known to specialize in distilling a compact and robust prior from a large amount of data. We are interested in applying deep networks in the absence of a training dataset. In this paper, we introduce deep audio prior (DAP), which leverages the structure of a network and the temporal information in a single audio file. Specifically, we demonstrate that a randomly-initialized neural network can be used with a carefully designed audio prior to tackle challenging audio problems such as universal blind source separation, interactive audio editing, audio texture synthesis, and audio co-separation. To understand the robustness of the deep audio prior, we construct a benchmark dataset, Universal-150, for universal sound source separation with a diverse set of sources. We show audio results superior to previous work in both qualitative and quantitative evaluations. We also perform a thorough ablation study to validate our design choices.
Tasks	Texture Synthesis
Published	2019-12-21
URL	https://arxiv.org/abs/1912.10292v1
PDF	https://arxiv.org/pdf/1912.10292v1.pdf
PWC	https://paperswithcode.com/paper/deep-audio-prior-1
Repo	https://github.com/adobe/Deep-Audio-Prior
Framework	pytorch

NAMF: A Non-local Adaptive Mean Filter for Salt-and-Pepper Noise Removal


Title	NAMF: A Non-local Adaptive Mean Filter for Salt-and-Pepper Noise Removal
Authors	Houwang Zhang, Chong Wu, Hanying Zheng, Le Zhang
Abstract	In this paper, a non-local adaptive mean filter (NAMF) is proposed, which can eliminate all levels of salt-and-pepper (SAP) noise. NAMF can be divided into two stages: (1) SAP noise detection; (2) SAP noise elimination. For a given pixel, we first compare it with the maximum or minimum gray value of the noisy image; if it equals either, we use a window with adaptive size to further determine whether it is noisy, and noiseless pixels are left unchanged. Second, each noisy pixel is replaced by a combination of its neighboring pixels. Finally, we use a SAP-noise-based non-local mean filter to further restore it. Our experimental results show that NAMF outperforms state-of-the-art methods in terms of restoration quality at all SAP noise levels.
Tasks	Salt-And-Pepper Noise Removal
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07787v1
PDF	https://arxiv.org/pdf/1910.07787v1.pdf
PWC	https://paperswithcode.com/paper/namf-a-non-local-adaptive-mean-filter-for
Repo	https://github.com/ProfHubert/NAMF
Framework	none
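
The detect-then-replace idea in the abstract can be sketched in a heavily simplified form. The code below is not the authors' NAMF. It assumes that every pixel equal to the global minimum or maximum is noisy (skipping the paper's second, window-based check), replaces it with the plain mean of the clean pixels in the smallest window that contains any, and omits the non-local mean stage. All names and the `max_radius` parameter are hypothetical.

```python
def detect_candidates(img):
    """Flag pixels equal to the global min or max gray value."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    return [[p == lo or p == hi for p in row] for row in img]


def adaptive_mean_restore(img, max_radius=3):
    """Replace each flagged pixel by the mean of unflagged neighbors.

    The window grows from radius 1 up to max_radius and stops at the
    first size that contains at least one unflagged pixel.
    """
    h, w = len(img), len(img[0])
    noisy = detect_candidates(img)
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if not noisy[y][x]:
                continue
            for r in range(1, max_radius + 1):
                clean = [img[j][i]
                         for j in range(max(0, y - r), min(h, y + r + 1))
                         for i in range(max(0, x - r), min(w, x + r + 1))
                         if not noisy[j][i]]
                if clean:
                    out[y][x] = sum(clean) / len(clean)
                    break
    return out


# A flat gray patch with one "salt" (255) and one "pepper" (0) pixel.
patch = [[100, 100, 100],
         [100, 255, 100],
         [100,   0, 100]]
restored = adaptive_mean_restore(patch)
```

Because this sketch flags every extreme-valued pixel, it would also alter genuine black or white pixels in a clean image; the abstract's second, adaptive-window check exists to avoid exactly that.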

PoMo: Generating Entity-Specific Post-Modifiers in Context


Title	PoMo: Generating Entity-Specific Post-Modifiers in Context
Authors	Jun Seok Kang, Robert L. Logan IV, Zewei Chu, Yang Chen, Dheeru Dua, Kevin Gimpel, Sameer Singh, Niranjan Balasubramanian
Abstract	We introduce entity post-modifier generation as an instance of a collaborative writing task. Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity. For example, for the sentence, “Barack Obama, _______, supported the #MeToo movement.”, the phrase “a father of two girls” is a contextually relevant post-modifier. To this end, we build PoMo, a post-modifier dataset created automatically from news articles reflecting a journalistic need for incorporating entity information that is relevant to a particular news event. PoMo consists of more than 231K sentences with post-modifiers and associated facts extracted from Wikidata for around 57K unique entities. We use crowdsourcing to show that modeling contextual relevance is necessary for accurate post-modifier generation. We adapt a number of existing generation approaches as baselines for this dataset. Our results show there is large room for improvement in terms of both identifying relevant facts to include (knowing which claims are relevant gives a >20% improvement in BLEU score), and generating appropriate post-modifier text for the context (providing relevant claims is not sufficient for accurate generation). We conduct an error analysis that suggests promising directions for future research.
Tasks
Published	2019-04-05
URL	http://arxiv.org/abs/1904.03111v2
PDF	http://arxiv.org/pdf/1904.03111v2.pdf
PWC	https://paperswithcode.com/paper/pomo-generating-entity-specific-post
Repo	https://github.com/StonyBrookNLP/PoMo
Framework	none

Densely Residual Laplacian Super-Resolution


Title	Densely Residual Laplacian Super-Resolution
Authors	Saeed Anwar, Nick Barnes
Abstract	Super-Resolution convolutional neural networks have recently demonstrated high-quality restoration for single images. However, existing algorithms often require very deep architectures and long training times. Furthermore, current convolutional neural networks for super-resolution are unable to exploit features at multiple scales and weigh them equally, limiting their learning capability. In this exposition, we present a compact and accurate super-resolution algorithm namely, Densely Residual Laplacian Network (DRLN). The proposed network employs cascading residual on the residual structure to allow the flow of low-frequency information to focus on learning high and mid-level features. In addition, deep supervision is achieved via the densely concatenated residual blocks settings, which also helps in learning from high-level complex features. Moreover, we propose Laplacian attention to model the crucial features to learn the inter and intra-level dependencies between the feature maps. Furthermore, comprehensive quantitative and qualitative evaluations on low-resolution, noisy low-resolution, and real historical image benchmark datasets illustrate that our DRLN algorithm performs favorably against the state-of-the-art methods visually and accurately.
Tasks	Super-Resolution
Published	2019-06-28
URL	https://arxiv.org/abs/1906.12021v2
PDF	https://arxiv.org/pdf/1906.12021v2.pdf
PWC	https://paperswithcode.com/paper/densely-residual-laplacian-super-resolution
Repo	https://github.com/saeed-anwar/DRLN
Framework	pytorch

DPSNet: End-to-end Deep Plane Sweep Stereo


Title	DPSNet: End-to-end Deep Plane Sweep Stereo
Authors	Sunghoon Im, Hae-Gon Jeon, Stephen Lin, In So Kweon
Abstract	Multiview stereo aims to reconstruct scene depth from images acquired by a camera under arbitrary motion. Recent methods address this problem through deep learning, which can utilize semantic cues to deal with challenges such as textureless and reflective regions. In this paper, we present a convolutional neural network called DPSNet (Deep Plane Sweep Network) whose design is inspired by best practices of traditional geometry-based approaches for dense depth reconstruction. Rather than directly estimating depth and/or optical flow correspondence from image pairs as done in many previous deep learning methods, DPSNet takes a plane sweep approach that involves building a cost volume from deep features using the plane sweep algorithm, regularizing the cost volume via a context-aware cost aggregation, and regressing the dense depth map from the cost volume. The cost volume is constructed using a differentiable warping process that allows for end-to-end training of the network. Through the effective incorporation of conventional multiview stereo concepts within a deep learning framework, DPSNet achieves state-of-the-art reconstruction results on a variety of challenging datasets.
Tasks	Optical Flow Estimation
Published	2019-05-02
URL	http://arxiv.org/abs/1905.00538v1
PDF	http://arxiv.org/pdf/1905.00538v1.pdf
PWC	https://paperswithcode.com/paper/dpsnet-end-to-end-deep-plane-sweep-stereo-1
Repo	https://github.com/sunghoonim/DPSNet
Framework	pytorch

Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems


Title	Dual Graph Attention Networks for Deep Latent Representation of Multifaceted Social Effects in Recommender Systems
Authors	Qitian Wu, Hengrui Zhang, Xiaofeng Gao, Peng He, Paul Weng, Han Gao, Guihai Chen
Abstract	Social recommendation leverages social information to solve data sparsity and cold-start problems in traditional collaborative filtering methods. However, most existing models assume that social effects from friend users are static and under the forms of constant weights or fixed constraints. To relax this strong assumption, in this paper, we propose dual graph attention networks to collaboratively learn representations for two-fold social effects, where one is modeled by a user-specific attention weight and the other is modeled by a dynamic and context-aware attention weight. We also extend the social effects in user domain to item domain, so that information from related items can be leveraged to further alleviate the data sparsity problem. Furthermore, considering that different social effects in two domains could interact with each other and jointly influence user preferences for items, we propose a new policy-based fusion strategy based on contextual multi-armed bandit to weigh interactions of various social effects. Experiments on one benchmark dataset and a commercial dataset verify the efficacy of the key components in our model. The results show that our model achieves great improvement for recommendation accuracy compared with other state-of-the-art social recommendation methods.
Tasks	Recommendation Systems
Published	2019-03-25
URL	http://arxiv.org/abs/1903.10433v1
PDF	http://arxiv.org/pdf/1903.10433v1.pdf
PWC	https://paperswithcode.com/paper/dual-graph-attention-networks-for-deep-latent
Repo	https://github.com/echo740/DANSER-WWW-19
Framework	tf

Convolutional Neural Networks for Classification of Alzheimer’s Disease: Overview and Reproducible Evaluation


Title	Convolutional Neural Networks for Classification of Alzheimer’s Disease: Overview and Reproducible Evaluation
Authors	Junhao Wen, Elina Thibeau-Sutre, Mauricio Diaz-Melo, Jorge Samper-Gonzalez, Alexandre Routier, Simona Bottani, Didier Dormont, Stanley Durrleman, Ninon Burgos, Olivier Colliot
Abstract	Over 30 papers have proposed to use convolutional neural network (CNN) for AD classification from anatomical MRI. However, the classification performance is difficult to compare across studies due to variations in components such as participant selection, image preprocessing or validation procedure. Moreover, these studies are hardly reproducible because their frameworks are not publicly accessible and because implementation details are lacking. Lastly, some of these papers may report a biased performance due to inadequate or unclear validation or model selection procedures. In the present work, we aim to address these limitations through three main contributions. First, we performed a systematic literature review and found that more than half of the surveyed papers may have suffered from data leakage. Our second contribution is the extension of our open-source framework for classification of AD using CNN and T1-weighted MRI. Finally, we used this framework to rigorously compare different CNN architectures. The data was split into training/validation/test sets at the very beginning and only the training/validation sets were used for model selection. To avoid any overfitting, the test sets were left untouched until the end of the peer-review process. Overall, the different 3D approaches (3D-subject, 3D-ROI, 3D-patch) achieved similar performances while that of the 2D slice approach was lower. Of note, the different CNN approaches did not perform better than a SVM with voxel-based features. The different approaches generalized well to similar populations but not to datasets with different inclusion criteria or demographical characteristics.
Tasks	Model Selection, Transfer Learning
Published	2019-04-16
URL	https://arxiv.org/abs/1904.07773v4
PDF	https://arxiv.org/pdf/1904.07773v4.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-for-2
Repo	https://github.com/SSinyu/p
Framework	tf

ICface: Interpretable and Controllable Face Reenactment Using GANs


Title	ICface: Interpretable and Controllable Face Reenactment Using GANs
Authors	Soumya Tripathy, Juho Kannala, Esa Rahtu
Abstract	This paper presents a generic face animator that is able to control the pose and expressions of a given face image. The animation is driven by human interpretable control signals consisting of head pose angles and the Action Unit (AU) values. The control information can be obtained from multiple sources including external driving videos and manual controls. Due to the interpretable nature of the driving signal, one can easily mix the information between multiple sources (e.g. pose from one image and expression from another) and apply selective post-production editing. The proposed face animator is implemented as a two-stage neural network model that is learned in a self-supervised manner using a large video collection. The proposed Interpretable and Controllable face reenactment network (ICface) is compared to the state-of-the-art neural network-based face animation techniques in multiple tasks. The results indicate that ICface produces better visual quality while being more versatile than most of the comparison methods. The introduced model could provide a lightweight and easy to use tool for a multitude of advanced image and video editing tasks.
Tasks	Face Reenactment
Published	2019-04-03
URL	https://arxiv.org/abs/1904.01909v2
PDF	https://arxiv.org/pdf/1904.01909v2.pdf
PWC	https://paperswithcode.com/paper/icface-interpretable-and-controllable-face
Repo	https://github.com/Blade6570/icface
Framework	pytorch

FAHT: An Adaptive Fairness-aware Decision Tree Classifier


Title	FAHT: An Adaptive Fairness-aware Decision Tree Classifier
Authors	Wenbin Zhang, Eirini Ntoutsi
Abstract	Automated data-driven decision-making systems are ubiquitous across a wide spread of online as well as offline services. These systems depend on sophisticated learning algorithms and available data to optimize the service function for decision support assistance. However, there is a growing concern about the accountability and fairness of the employed models because the available historical data is often intrinsically discriminatory, i.e., the proportion of members sharing one or more sensitive attributes is higher than the proportion in the population as a whole when receiving positive classification, which leads to a lack of fairness in decision support systems. A number of fairness-aware learning methods have been proposed to handle this concern. However, these methods tackle fairness as a static problem and do not take the evolution of the underlying stream population into consideration. In this paper, we introduce a learning mechanism to design a fair classifier for online stream-based decision-making. Our learning model, FAHT (Fairness-Aware Hoeffding Tree), is an extension of the well-known Hoeffding Tree algorithm for decision tree induction over streams that also accounts for fairness. Our experiments show that our algorithm is able to deal with discrimination in streaming environments, while maintaining a moderate predictive performance over the stream.
Tasks	Decision Making
Published	2019-07-16
URL	https://arxiv.org/abs/1907.07237v1
PDF	https://arxiv.org/pdf/1907.07237v1.pdf
PWC	https://paperswithcode.com/paper/faht-an-adaptive-fairness-aware-decision-tree
Repo	https://github.com/vanbanTruong/FAHT
Framework	none
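
Two quantities underlie the abstract's description: the Hoeffding bound that a Hoeffding Tree uses to decide when it has seen enough stream instances to split, and a group-level discrimination measure. The sketch below shows the standard Hoeffding bound together with statistical parity difference as one common discrimination measure. Treating statistical parity as the measure, the record format, and the function names are assumptions for illustration and are not taken from the FAHT repository.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability 1 - delta, the true mean lies
    within epsilon of the mean observed over n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def statistical_parity_difference(records):
    """P(positive | not protected) - P(positive | protected).

    records: iterable of (is_protected, predicted_positive) boolean pairs
    (hypothetical format chosen for this sketch).
    """
    pos = {True: 0, False: 0}
    tot = {True: 0, False: 0}
    for is_protected, predicted_positive in records:
        tot[is_protected] += 1
        pos[is_protected] += int(predicted_positive)
    if tot[True] == 0 or tot[False] == 0:
        raise ValueError("both groups need at least one record")
    return pos[False] / tot[False] - pos[True] / tot[True]


# Two protected and two unprotected individuals.
sample = [(True, True), (True, False), (False, True), (False, True)]
gap = statistical_parity_difference(sample)
```

A value of 0 means both groups receive positive predictions at the same rate. The bound shrinks as more instances arrive, which is what lets a stream learner commit to a split after a finite number of examples.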

A Simple Yet Effective Approach to Robust Optimization Over Time


Title	A Simple Yet Effective Approach to Robust Optimization Over Time
Authors	Lukáš Adam, Xin Yao
Abstract	Robust optimization over time (ROOT) refers to an optimization problem where its performance is evaluated over a period of future time. Most of the existing algorithms use particle swarm optimization combined with another method which predicts future solutions to the optimization problem. We argue that this approach may perform subpar and suggest instead a method based on a random sampling of the search space. We prove its theoretical guarantees and show that it significantly outperforms the state-of-the-art methods for ROOT.
Tasks
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09248v3
PDF	https://arxiv.org/pdf/1907.09248v3.pdf
PWC	https://paperswithcode.com/paper/a-simple-yet-effective-approach-to-robust
Repo	https://github.com/sadda/ROOT-Benchmark
Framework	none

Human-grounded Evaluations of Explanation Methods for Text Classification


Title	Human-grounded Evaluations of Explanation Methods for Text Classification
Authors	Piyawat Lertvittayakumjorn, Francesca Toni
Abstract	Due to the black-box nature of deep learning models, methods for explaining the models’ results are crucial to gain trust from humans and support collaboration between AIs and humans. In this paper, we consider several model-agnostic and model-specific explanation methods for CNNs for text classification and conduct three human-grounded evaluations, focusing on different purposes of explanations: (1) revealing model behavior, (2) justifying model predictions, and (3) helping humans investigate uncertain predictions. The results highlight dissimilar qualities of the various explanation methods we consider and show the degree to which these methods could serve for each purpose.
Tasks	Text Classification
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11355v1
PDF	https://arxiv.org/pdf/1908.11355v1.pdf
PWC	https://paperswithcode.com/paper/human-grounded-evaluations-of-explanation
Repo	https://github.com/plkumjorn/CNNAnalysis
Framework	none

Pathology GAN: Learning deep representations of cancer tissue


Title	Pathology GAN: Learning deep representations of cancer tissue
Authors	Adalberto Claudio Quiros, Roderick Murray-Smith, Ke Yuan
Abstract	We apply Generative Adversarial Networks (GANs) to the domain of digital pathology. Current machine learning research for digital pathology focuses on diagnosis, but we suggest a different approach and advocate that generative models could drive forward the understanding of morphological characteristics of cancer tissue. In this paper, we develop a framework which allows GANs to capture key tissue features and uses these characteristics to give structure to its latent space. To this end, we trained our model on 249K H&E breast cancer tissue images. We show that our model generates high quality images, with a Frechet Inception Distance (FID) of 16.65. We additionally assess the quality of the images with cancer tissue characteristics (e.g. count of cancer, lymphocytes, or stromal cells), using quantitative information to calculate the FID and showing consistent performance of 9.86. Additionally, the latent space of our model shows an interpretable structure and allows semantic vector operations that translate into tissue feature transformations. Furthermore, ratings from two expert pathologists found no significant difference between our generated tissue images and real ones.
Tasks
Published	2019-07-04
URL	https://arxiv.org/abs/1907.02644v2
PDF	https://arxiv.org/pdf/1907.02644v2.pdf
PWC	https://paperswithcode.com/paper/pathology-gan-learning-deep-representations
Repo	https://github.com/AdalbertoCq/Pathology-GAN
Framework	tf

Orientation-aware Semantic Segmentation on Icosahedron Spheres


Title	Orientation-aware Semantic Segmentation on Icosahedron Spheres
Authors	Chao Zhang, Stephan Liwicki, William Smith, Roberto Cipolla
Abstract	We address semantic segmentation on omnidirectional images, to leverage a holistic understanding of the surrounding scene for applications like autonomous driving systems. For the spherical domain, several methods recently adopt an icosahedron mesh, but systems are typically rotation invariant or require significant memory and parameters, thus enabling execution only at very low resolutions. In our work, we propose an orientation-aware CNN framework for the icosahedron mesh. Our representation allows for fast network operations, as our design simplifies to standard network operations of classical CNNs, but under consideration of north-aligned kernel convolutions for features on the sphere. We implement our representation and demonstrate its memory efficiency up to a level-8 resolution mesh (equivalent to 640 x 1024 equirectangular images). Finally, since our kernels operate on the tangent of the sphere, standard feature weights, pretrained on perspective data, can be directly transferred with only a small need for weight refinement. In our evaluation, our orientation-aware CNN becomes a new state of the art for the recent 2D3DS dataset and our Omni-SYNTHIA version of SYNTHIA. Rotation invariant classification and segmentation tasks are additionally presented for comparison to prior art.
Tasks	Autonomous Driving, Semantic Segmentation
Published	2019-07-30
URL	https://arxiv.org/abs/1907.12849v1
PDF	https://arxiv.org/pdf/1907.12849v1.pdf
PWC	https://paperswithcode.com/paper/orientation-aware-semantic-segmentation-on
Repo	https://github.com/matsuren/HexRUNet_pytorch
Framework	pytorch