October 19, 2019

Paper Group ANR 174

The Conversation: Deep Audio-Visual Speech Enhancement. Artistic Instance-Aware Image Filtering by Convolutional Neural Networks. Induction of Non-Monotonic Logic Programs to Explain Boosted Tree Models Using LIME. Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization. An Iterative Boundary Random Walks Algorithm for Interactive Ima …

The Conversation: Deep Audio-Visual Speech Enhancement


Title	The Conversation: Deep Audio-Visual Speech Enhancement
Authors	Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman
Abstract	Our goal is to isolate individual speakers from multi-talker simultaneous speech in videos. Existing works in this area have focussed on trying to separate utterances from known speakers in controlled environments. In this paper, we propose a deep audio-visual speech enhancement network that is able to separate a speaker’s voice given lip regions in the corresponding video, by predicting both the magnitude and the phase of the target signal. The method is applicable to speakers unheard and unseen during training, and for unconstrained environments. We demonstrate strong quantitative and qualitative results, isolating extremely challenging real-world examples.
Tasks	Speech Enhancement
Published	2018-04-11
URL	http://arxiv.org/abs/1804.04121v2
PDF	http://arxiv.org/pdf/1804.04121v2.pdf
PWC	https://paperswithcode.com/paper/the-conversation-deep-audio-visual-speech
Repo
Framework

Artistic Instance-Aware Image Filtering by Convolutional Neural Networks


Title	Artistic Instance-Aware Image Filtering by Convolutional Neural Networks
Authors	Milad Tehrani, Mahnoosh Bagheri, Mahdi Ahmadi, Alireza Norouzi, Nader Karimi, Shadrokh Samavi
Abstract	In recent years, public use of artistic effects for editing and beautifying images has encouraged researchers to look for new approaches to this task. Most existing methods apply artistic effects to the whole image. Exploiting neural network vision technologies such as object detection and semantic segmentation could offer a new viewpoint in this area. In this paper, we utilize an instance segmentation neural network to obtain a class mask for separately filtering the background and foreground of an image. We implement a top prior-mask selection that lets us select an object class for filtering. Different artistic effects are used in the filtering process to meet the requirements of a wide variety of users. Our method is also flexible enough to allow the addition of new filters. We use Mask R-CNN instance segmentation, pre-trained on the COCO dataset, as the segmentation network. Experiments with different filters are reported. The system’s output shows that this novel approach can create satisfying artistic images with fast operation and a simple interface.
Tasks	Instance Segmentation, Object Detection, Semantic Segmentation
Published	2018-09-22
URL	http://arxiv.org/abs/1809.08448v1
PDF	http://arxiv.org/pdf/1809.08448v1.pdf
PWC	https://paperswithcode.com/paper/artistic-instance-aware-image-filtering-by
Repo
Framework
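
The mask-based filtering described in the abstract above can be illustrated with a minimal sketch. It assumes a grayscale image stored as nested lists and a boolean instance mask that is already available (in the paper it would come from Mask R-CNN). The names `posterize`, `darken`, and `filter_by_mask` are hypothetical and are not taken from the paper.

```python
from typing import Callable, List

Image = List[List[int]]   # grayscale pixels in 0..255
Mask = List[List[bool]]   # True where the selected object instance is


def posterize(pixel: int, levels: int = 4) -> int:
    """Quantize a pixel to a small number of gray levels (a toy 'artistic' filter)."""
    step = 256 // levels
    return (pixel // step) * step


def darken(pixel: int) -> int:
    """Halve the brightness of a pixel (a toy background filter)."""
    return pixel // 2


def filter_by_mask(
    image: Image,
    mask: Mask,
    fg_filter: Callable[[int], int],
    bg_filter: Callable[[int], int],
) -> Image:
    """Apply fg_filter where the mask is True and bg_filter everywhere else."""
    return [
        [fg_filter(p) if m else bg_filter(p) for p, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


image = [[200, 100], [50, 255]]
mask = [[True, False], [False, True]]
result = filter_by_mask(image, mask, posterize, darken)
```

Because the filters are passed in as plain functions, adding a new effect only requires writing another per-pixel function, which mirrors the flexibility the abstract claims.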

Induction of Non-Monotonic Logic Programs to Explain Boosted Tree Models Using LIME


Title	Induction of Non-Monotonic Logic Programs to Explain Boosted Tree Models Using LIME
Authors	Farhad Shakerin, Gopal Gupta
Abstract	We present a heuristic-based algorithm to induce nonmonotonic logic programs that explain the behavior of XGBoost-trained classifiers. We use a technique based on the LIME approach to locally select the most important features contributing to the classification decision. Then, in order to explain the model’s global behavior, we propose the LIME-FOLD algorithm, a heuristic-based inductive logic programming (ILP) algorithm capable of learning non-monotonic logic programs, which we apply to a transformed dataset produced by LIME. Our proposed approach is agnostic to the choice of the ILP algorithm. Our experiments with UCI standard benchmarks suggest a significant improvement in terms of classification evaluation metrics. Meanwhile, the number of induced rules dramatically decreases compared to ALEPH, a state-of-the-art ILP system.
Tasks
Published	2018-08-02
URL	http://arxiv.org/abs/1808.00629v2
PDF	http://arxiv.org/pdf/1808.00629v2.pdf
PWC	https://paperswithcode.com/paper/induction-of-non-monotonic-logic-programs-to
Repo
Framework

Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization


Title	Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization
Authors	Zhihui Zhu, Xiao Li, Kai Liu, Qiuwei Li
Abstract	Symmetric nonnegative matrix factorization (NMF), a special but important class of the general NMF, has been shown to be useful for data analysis and in particular for various clustering tasks. Unfortunately, designing fast algorithms for symmetric NMF is not as easy as for its nonsymmetric counterpart, which admits a splitting property that allows efficient alternating-type algorithms. To overcome this issue, we transform symmetric NMF into a nonsymmetric problem, so that we can adopt ideas from state-of-the-art algorithms for nonsymmetric NMF to design fast algorithms for symmetric NMF. We rigorously establish that solving the nonsymmetric reformulation returns a solution for symmetric NMF, and then apply fast alternating-based algorithms to the reformulated problem. Furthermore, we show that these fast algorithms admit a strong convergence guarantee, in the sense that the generated sequence converges at least at a sublinear rate and converges globally to a critical point of the symmetric NMF. We conduct experiments on both synthetic data and image clustering to support our result.
Tasks	Image Clustering
Published	2018-11-14
URL	http://arxiv.org/abs/1811.05642v1
PDF	http://arxiv.org/pdf/1811.05642v1.pdf
PWC	https://paperswithcode.com/paper/dropping-symmetry-for-fast-symmetric
Repo
Framework
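
To make the idea of a nonsymmetric reformulation concrete, here is a minimal sketch that evaluates two objectives: the symmetric one, 0.5 * ||X - U U^T||_F^2, and a two-factor one, 0.5 * ||X - U V^T||_F^2 + 0.5 * lam * ||U - V||_F^2. The abstract does not state the exact form of the reformulation, so the quadratic penalty with weight `lam` is an assumption, and all function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def frob_sq(a: Matrix, b: Matrix) -> float:
    """Squared Frobenius norm of a - b."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def times_transpose(u: Matrix, v: Matrix) -> Matrix:
    """Return u @ v.T for two matrices with the same number of columns."""
    return [[sum(a * b for a, b in zip(ru, rv)) for rv in v] for ru in u]


def symmetric_objective(x: Matrix, u: Matrix) -> float:
    """0.5 * ||X - U U^T||_F^2, with a single factor U."""
    return 0.5 * frob_sq(x, times_transpose(u, u))


def nonsymmetric_objective(x: Matrix, u: Matrix, v: Matrix, lam: float) -> float:
    """0.5 * ||X - U V^T||_F^2 + 0.5 * lam * ||U - V||_F^2 (assumed penalty form)."""
    return 0.5 * frob_sq(x, times_transpose(u, v)) + 0.5 * lam * frob_sq(u, v)


x = [[1.0, 0.0], [0.0, 1.0]]
u = [[1.0, 0.5], [0.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0]]

sym = symmetric_objective(x, u)
same_factor = nonsymmetric_objective(x, u, u, lam=2.0)   # penalty vanishes when U == V
split_factor = nonsymmetric_objective(x, u, v, lam=2.0)
```

With U and V treated as separate variables, each subproblem is an ordinary nonnegative least-squares problem in one factor, which is what makes alternating updates applicable; the penalty term pulls the two factors back together.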

An Iterative Boundary Random Walks Algorithm for Interactive Image Segmentation


Title	An Iterative Boundary Random Walks Algorithm for Interactive Image Segmentation
Authors	Xiaofeng Xie, ZhuLiang Yu, Zhenghui Gu, Yuanqing Li
Abstract	An interactive image segmentation algorithm can provide an intelligent way to understand the intention of user input. Many interactive methods require a large amount of user input. Efficiently producing intuitive segmentations under limited user input is important for industrial applications. In this paper, we reveal a positive feedback system on image segmentation to show the pixels of self-learning. Two approaches, iterative random walks and boundary random walks, are proposed for segmentation potential, which is the key step in the feedback system. Experimental results on image segmentation indicate that the proposed algorithms can obtain more efficient input to random walks, and that higher segmentation performance can be obtained by applying the iterative boundary random walks algorithm.
Tasks	Semantic Segmentation
Published	2018-08-09
URL	http://arxiv.org/abs/1808.03002v1
PDF	http://arxiv.org/pdf/1808.03002v1.pdf
PWC	https://paperswithcode.com/paper/an-iterative-boundary-random-walks-algorithm
Repo
Framework

A Deep Information Sharing Network for Multi-contrast Compressed Sensing MRI Reconstruction


Title	A Deep Information Sharing Network for Multi-contrast Compressed Sensing MRI Reconstruction
Authors	Liyan Sun, Zhiwen Fan, Yue Huang, Xinghao Ding, John Paisley
Abstract	In multi-contrast magnetic resonance imaging (MRI), compressed sensing theory can accelerate imaging by sampling fewer measurements within each contrast. Conventional optimization-based models suffer from several limitations: a strict assumption of shared sparse support, time-consuming optimization, and “shallow” models that have difficulty encoding the rich patterns hidden in massive MRI data. In this paper, we propose the first deep learning model for multi-contrast MRI reconstruction. We achieve information sharing through feature sharing units, which significantly reduces the number of parameters. The feature sharing unit is combined with a data fidelity unit to form an inference block. These inference blocks are cascaded with dense connections, which allows information to be transmitted efficiently across different depths of the network. Our extensive experiments on various multi-contrast MRI datasets show that the proposed model outperforms both state-of-the-art single-contrast and multi-contrast MRI methods in accuracy and efficiency. We show that the improved reconstruction quality can bring great benefits for the later medical image analysis stage. Furthermore, the robustness of the proposed model to the non-registration environment shows its potential in real MRI applications.
Tasks
Published	2018-04-10
URL	http://arxiv.org/abs/1804.03596v1
PDF	http://arxiv.org/pdf/1804.03596v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-information-sharing-network-for-multi
Repo
Framework

Frustrated with Replicating Claims of a Shared Model? A Solution


Title	Frustrated with Replicating Claims of a Shared Model? A Solution
Authors	Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-Mei Hwu
Abstract	Machine Learning (ML) and Deep Learning (DL) innovations are being introduced at such a rapid pace that model owners and evaluators are hard-pressed to analyze and study them. This is exacerbated by the complicated procedures for evaluation. The lack of standard systems and efficient techniques for specifying and provisioning ML/DL evaluation is the main cause of this “pain point”. This work discusses common pitfalls in replicating DL model evaluation, and shows that these subtle pitfalls can affect both accuracy and performance. It then proposes a solution to remedy these pitfalls called MLModelScope, a specification for repeatable model evaluation and a runtime to provision and measure experiments. We show that by easing the model specification and evaluation process, MLModelScope facilitates rapid adoption of ML/DL innovations.
Tasks
Published	2018-11-24
URL	https://arxiv.org/abs/1811.09737v2
PDF	https://arxiv.org/pdf/1811.09737v2.pdf
PWC	https://paperswithcode.com/paper/mlmodelscope-evaluate-and-measure-ml-models
Repo
Framework

Universality of the stochastic block model


Title	Universality of the stochastic block model
Authors	Jean-Gabriel Young, Guillaume St-Onge, Patrick Desrosiers, Louis J. Dubé
Abstract	Mesoscopic pattern extraction (MPE) is the problem of finding a partition of the nodes of a complex network that maximizes some objective function. Many well-known network inference problems fall in this category, including, for instance, community detection, core-periphery identification, and imperfect graph coloring. In this paper, we show that the most popular algorithms designed to solve MPE problems can in fact be understood as special cases of the maximum likelihood formulation of the stochastic block model (SBM), or one of its direct generalizations. These equivalence relations show that the SBM is nearly universal with respect to MPE problems.
Tasks	Community Detection
Published	2018-06-11
URL	http://arxiv.org/abs/1806.04214v2
PDF	http://arxiv.org/pdf/1806.04214v2.pdf
PWC	https://paperswithcode.com/paper/universality-of-the-stochastic-block-model
Repo
Framework
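
As an illustration of treating the SBM likelihood as an objective over node partitions, the sketch below evaluates the log-likelihood of a simple Bernoulli SBM and finds the best partition of a tiny graph by exhaustive search. The Bernoulli (unweighted, undirected) variant, the fixed connection-probability matrix `p`, and the function names are assumptions made for the example. It does not reproduce the paper's equivalence proofs.

```python
import math
from itertools import combinations, product
from typing import List, Sequence, Tuple


def sbm_log_likelihood(
    adj: List[List[int]], groups: Sequence[int], p: List[List[float]]
) -> float:
    """Log-likelihood of an undirected graph under a Bernoulli SBM.

    adj[i][j] is 1 if nodes i and j are connected, groups[i] is the block of
    node i, and p[r][s] is the connection probability between blocks r and s.
    """
    total = 0.0
    for i, j in combinations(range(len(adj)), 2):
        prob = p[groups[i]][groups[j]]
        total += math.log(prob) if adj[i][j] else math.log(1.0 - prob)
    return total


def best_partition(adj: List[List[int]], p: List[List[float]]) -> Tuple[int, ...]:
    """Exhaustively search all block assignments (only feasible for tiny graphs)."""
    n_blocks = len(p)
    return max(
        product(range(n_blocks), repeat=len(adj)),
        key=lambda g: sbm_log_likelihood(adj, g, p),
    )


# Two disjoint edges: 0-1 and 2-3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
# Assortative block structure: dense within blocks, sparse between them.
p = [[0.9, 0.1], [0.1, 0.9]]

ll_communities = sbm_log_likelihood(adj, (0, 0, 1, 1), p)
ll_mixed = sbm_log_likelihood(adj, (0, 1, 0, 1), p)
best = best_partition(adj, p)
```

With an assortative `p` the maximizer groups connected nodes together, which is community detection; other choices of `p` would favor other mesoscopic patterns, such as core-periphery structure.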

TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays


Title	TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Authors	Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Ronald M. Summers
Abstract	Chest X-rays are one of the most common radiological examinations in daily clinical routines. Reporting thorax diseases using chest X-rays is often an entry-level task for radiologist trainees. Yet, reading a chest X-ray image remains a challenging job for learning-oriented machine intelligence, due to (1) a shortage of large-scale machine-learnable medical image datasets, and (2) a lack of techniques that can mimic the high-level reasoning of human radiologists, which requires years of knowledge accumulation and professional training. In this paper, we show that clinical free-text radiological reports can be utilized as a priori knowledge for tackling these two key problems. We propose a novel Text-Image Embedding network (TieNet) for extracting distinctive image and text representations. Multi-level attention models are integrated into an end-to-end trainable CNN-RNN architecture for highlighting the meaningful text words and image regions. We first apply TieNet to classify chest X-rays using both image features and text embeddings extracted from associated reports. The proposed auto-annotation framework achieves high accuracy (over 0.9 on average in AUCs) in assigning disease labels for our hand-labeled evaluation dataset. Furthermore, we transform TieNet into a chest X-ray reporting system. It simulates the reporting process and can output disease classification and a preliminary report together. The classification results are significantly improved (6% increase on average in AUCs) compared to the state-of-the-art baseline on an unseen and hand-labeled dataset (OpenI).
Tasks
Published	2018-01-12
URL	http://arxiv.org/abs/1801.04334v1
PDF	http://arxiv.org/pdf/1801.04334v1.pdf
PWC	https://paperswithcode.com/paper/tienet-text-image-embedding-network-for
Repo
Framework

Hierarchical VampPrior Variational Fair Auto-Encoder


Title	Hierarchical VampPrior Variational Fair Auto-Encoder
Authors	Philip Botros, Jakub M. Tomczak
Abstract	Decision making is a process that is extremely prone to different biases. In this paper we consider learning fair representations that aim at removing nuisance (sensitive) information from the decision process. For this purpose, we propose to use deep generative modeling and adapt a hierarchical Variational Auto-Encoder to learn these fair representations. Moreover, we utilize the mutual information as a useful regularizer for enforcing fairness of a representation. In experiments on two benchmark datasets and two scenarios where the sensitive variables are fully and partially observable, we show that the proposed approach either outperforms or performs on par with the current best model.
Tasks	Decision Making
Published	2018-06-26
URL	http://arxiv.org/abs/1806.09918v2
PDF	http://arxiv.org/pdf/1806.09918v2.pdf
PWC	https://paperswithcode.com/paper/hierarchical-vampprior-variational-fair-auto
Repo
Framework

Closed Form Variational Objectives For Bayesian Neural Networks with a Single Hidden Layer


Title	Closed Form Variational Objectives For Bayesian Neural Networks with a Single Hidden Layer
Authors	Martin Jankowiak
Abstract	In this note we consider setups in which variational objectives for Bayesian neural networks can be computed in closed form. In particular we focus on single-layer networks in which the activation function is piecewise polynomial (e.g. ReLU). In this case we show that for a Normal likelihood and structured Normal variational distributions one can compute a variational lower bound in closed form. In addition we compute the predictive mean and variance in closed form. Finally, we also show how to compute approximate lower bounds for other likelihoods (e.g. softmax classification). In experiments we show how the resulting variational objectives can help improve training and provide fast test time predictions.
Tasks
Published	2018-11-02
URL	http://arxiv.org/abs/1811.00686v2
PDF	http://arxiv.org/pdf/1811.00686v2.pdf
PWC	https://paperswithcode.com/paper/closed-form-variational-objectives-for
Repo
Framework
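
To see why a piecewise polynomial activation such as ReLU permits closed-form expressions, consider the standard Gaussian identities for x ~ N(mu, sigma^2): E[max(0, x)] = mu * Phi(mu/sigma) + sigma * phi(mu/sigma) and E[max(0, x)^2] = (mu^2 + sigma^2) * Phi(mu/sigma) + mu * sigma * phi(mu/sigma), where phi and Phi are the standard normal density and distribution function. The sketch below evaluates them. It illustrates the kind of building block involved and is not the paper's derivation; the function names are hypothetical.

```python
import math
from typing import Tuple


def normal_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def relu_gaussian_moments(mu: float, sigma: float) -> Tuple[float, float]:
    """Closed-form mean and variance of max(0, x) for x ~ N(mu, sigma^2), sigma > 0."""
    z = mu / sigma
    mean = mu * normal_cdf(z) + sigma * normal_pdf(z)
    second_moment = (mu * mu + sigma * sigma) * normal_cdf(z) + mu * sigma * normal_pdf(z)
    return mean, second_moment - mean * mean


mean_std, var_std = relu_gaussian_moments(0.0, 1.0)
mean_far, var_far = relu_gaussian_moments(10.0, 1.0)
```

For a pre-activation far above zero the ReLU is effectively the identity, so the second call returns approximately the input mean and variance, which is a quick sanity check on the formulas.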

Automatic classification of trees using a UAV onboard camera and deep learning


Title	Automatic classification of trees using a UAV onboard camera and deep learning
Authors	Masanori Onishi, Takeshi Ise
Abstract	Automatic classification of trees using remotely sensed data has been a dream of many scientists and land use managers. Recently, unmanned aerial vehicles (UAVs) have come to be expected to serve as an easy-to-use, cost-effective tool for remote sensing of forests, and deep learning has attracted attention for its capabilities in machine vision. In this study, using a commercially available UAV and a publicly available package for deep learning, we constructed a machine vision system for the automatic classification of trees. In our method, we segmented a UAV photograph of a forest into individual tree crowns and carried out object-based deep learning. As a result, the system was able to classify 7 tree types at 89.0% accuracy. This performance is notable because we only used basic RGB images from a standard UAV. In contrast, most previous studies used expensive hardware such as multispectral imagers to improve performance. This result means that our method has the potential to classify individual trees in a cost-effective manner. This can be a usable tool for many forest researchers and managers.
Tasks
Published	2018-04-27
URL	http://arxiv.org/abs/1804.10390v1
PDF	http://arxiv.org/pdf/1804.10390v1.pdf
PWC	https://paperswithcode.com/paper/automatic-classification-of-trees-using-a-uav
Repo
Framework

Validating WordNet Meronymy Relations using Adimen-SUMO


Title	Validating WordNet Meronymy Relations using Adimen-SUMO
Authors	Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Abstract	In this paper, we report on the practical application of a novel approach for validating the knowledge of WordNet using Adimen-SUMO. In particular, this paper focuses on cross-checking the WordNet meronymy relations against the knowledge encoded in Adimen-SUMO. Our validation approach tests a large set of competency questions (CQs), which are derived (semi)-automatically from the knowledge encoded in WordNet, SUMO and their mapping, by applying efficient first-order logic automated theorem provers. Unfortunately, despite being created manually, these knowledge resources are not free of errors and discrepancies. Consequently, some of the resulting CQs are not plausible according to the knowledge included in Adimen-SUMO. Thus, first we focus on (semi)-automatically improving the alignment between these knowledge resources, and second, we perform a minimal set of corrections in the ontology. Our aim is to minimize the manual effort required for an extensive validation process. We report on the strategies followed, the changes made, the effort needed and its impact when validating the WordNet meronymy relations using improved versions of the mapping and the ontology. Based on the new results, we discuss the implications of the appropriate corrections and the need for future enhancements.
Tasks
Published	2018-05-20
URL	http://arxiv.org/abs/1805.07824v1
PDF	http://arxiv.org/pdf/1805.07824v1.pdf
PWC	https://paperswithcode.com/paper/validating-wordnet-meronymy-relations-using
Repo
Framework

Efficient inference in stochastic block models with vertex labels


Title	Efficient inference in stochastic block models with vertex labels
Authors	Clara Stegehuis, Laurent Massoulié
Abstract	We study the stochastic block model with two communities where vertices contain side information in the form of a vertex label. These vertex labels may have arbitrary label distributions, depending on the community memberships. We analyze a linearized version of the popular belief propagation algorithm. We show that this algorithm achieves the highest accuracy possible whenever a certain function of the network parameters has a unique fixed point. Whenever this function has multiple fixed points, the belief propagation algorithm may not perform optimally. We show that increasing the information in the vertex labels may reduce the number of fixed points and hence lead to optimality of belief propagation.
Tasks
Published	2018-06-20
URL	http://arxiv.org/abs/1806.07562v2
PDF	http://arxiv.org/pdf/1806.07562v2.pdf
PWC	https://paperswithcode.com/paper/efficient-inference-in-stochastic-block
Repo
Framework

Paranom: A Parallel Anomaly Dataset Generator


Title	Paranom: A Parallel Anomaly Dataset Generator
Authors	Justin Gottschlich
Abstract	In this paper, we present Paranom, a parallel anomaly dataset generator. We discuss its design and provide brief experimental results demonstrating its usefulness in improving the classification correctness of LSTM-AD, a state-of-the-art anomaly detection model.
Tasks	Anomaly Detection
Published	2018-01-09
URL	http://arxiv.org/abs/1801.03164v1
PDF	http://arxiv.org/pdf/1801.03164v1.pdf
PWC	https://paperswithcode.com/paper/paranom-a-parallel-anomaly-dataset-generator
Repo
Framework