Paper Group ANR 174
The Conversation: Deep Audio-Visual Speech Enhancement
Title | The Conversation: Deep Audio-Visual Speech Enhancement |
Authors | Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman |
Abstract | Our goal is to isolate individual speakers from multi-talker simultaneous speech in videos. Existing works in this area have focussed on trying to separate utterances from known speakers in controlled environments. In this paper, we propose a deep audio-visual speech enhancement network that is able to separate a speaker’s voice given lip regions in the corresponding video, by predicting both the magnitude and the phase of the target signal. The method is applicable to speakers unheard and unseen during training, and for unconstrained environments. We demonstrate strong quantitative and qualitative results, isolating extremely challenging real-world examples. |
Tasks | Speech Enhancement |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.04121v2 |
http://arxiv.org/pdf/1804.04121v2.pdf | |
PWC | https://paperswithcode.com/paper/the-conversation-deep-audio-visual-speech |
Repo | |
Framework | |
Artistic Instance-Aware Image Filtering by Convolutional Neural Networks
Title | Artistic Instance-Aware Image Filtering by Convolutional Neural Networks |
Authors | Milad Tehrani, Mahnoosh Bagheri, Mahdi Ahmadi, Alireza Norouzi, Nader Karimi, Shadrokh Samavi |
Abstract | In recent years, public use of artistic effects for editing and beautifying images has encouraged researchers to look for new approaches to this task. Most of the existing methods apply artistic effects to the whole image. Exploitation of neural network vision technologies like object detection and semantic segmentation could be a new viewpoint in this area. In this paper, we utilize an instance segmentation neural network to obtain a class mask for separately filtering the background and foreground of an image. We implement a top prior-mask selection to let us select an object class for filtering purposes. Different artistic effects are used in the filtering process to meet the requirements of a vast variety of users. Also, our method is flexible enough to allow the addition of new filters. We use Mask R-CNN instance segmentation pre-trained on the COCO dataset as the segmentation network. Experiments with different filters are reported. The system’s output shows that this novel approach can create satisfying artistic images with fast operation and a simple interface. |
Tasks | Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2018-09-22 |
URL | http://arxiv.org/abs/1809.08448v1 |
http://arxiv.org/pdf/1809.08448v1.pdf | |
PWC | https://paperswithcode.com/paper/artistic-instance-aware-image-filtering-by |
Repo | |
Framework | |
Induction of Non-Monotonic Logic Programs to Explain Boosted Tree Models Using LIME
Title | Induction of Non-Monotonic Logic Programs to Explain Boosted Tree Models Using LIME |
Authors | Farhad Shakerin, Gopal Gupta |
Abstract | We present a heuristic-based algorithm to induce nonmonotonic logic programs that will explain the behavior of XGBoost-trained classifiers. We use a technique based on the LIME approach to locally select the most important features contributing to the classification decision. Then, in order to explain the model’s global behavior, we propose the LIME-FOLD algorithm, a heuristic-based inductive logic programming (ILP) algorithm capable of learning non-monotonic logic programs, which we apply to a transformed dataset produced by LIME. Our proposed approach is agnostic to the choice of the ILP algorithm. Our experiments with UCI standard benchmarks suggest a significant improvement in terms of classification evaluation metrics. Meanwhile, the number of induced rules dramatically decreases compared to ALEPH, a state-of-the-art ILP system. |
Tasks | |
Published | 2018-08-02 |
URL | http://arxiv.org/abs/1808.00629v2 |
http://arxiv.org/pdf/1808.00629v2.pdf | |
PWC | https://paperswithcode.com/paper/induction-of-non-monotonic-logic-programs-to |
Repo | |
Framework | |
Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization
Title | Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization |
Authors | Zhihui Zhu, Xiao Li, Kai Liu, Qiuwei Li |
Abstract | Symmetric nonnegative matrix factorization (NMF), a special but important class of the general NMF, is demonstrated to be useful for data analysis and in particular for various clustering tasks. Unfortunately, designing fast algorithms for symmetric NMF is not as easy as for the nonsymmetric counterpart, the latter admitting the splitting property that allows efficient alternating-type algorithms. To overcome this issue, we transform symmetric NMF into a nonsymmetric one, so that we can adopt ideas from the state-of-the-art algorithms for nonsymmetric NMF to design fast algorithms for solving symmetric NMF. We rigorously establish that solving the nonsymmetric reformulation returns a solution for symmetric NMF and then apply fast alternating-based algorithms to the corresponding reformulated problem. Furthermore, we show these fast algorithms admit a strong convergence guarantee in the sense that the generated sequence is convergent at least at a sublinear rate and it converges globally to a critical point of the symmetric NMF. We conduct experiments on both synthetic data and image clustering to support our result. |
Tasks | Image Clustering |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.05642v1 |
http://arxiv.org/pdf/1811.05642v1.pdf | |
PWC | https://paperswithcode.com/paper/dropping-symmetry-for-fast-symmetric |
Repo | |
Framework | |
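The abstract above does not spell out the nonsymmetric reformulation. One way to picture the idea, assuming a quadratic penalty that ties the two factors together (the penalty form and the symbols X, U, V and λ are assumptions for illustration, not quoted from the abstract):

```latex
\min_{U \ge 0} \; \tfrac{1}{2}\,\lVert X - U U^{\top} \rVert_F^2
\quad\longrightarrow\quad
\min_{U \ge 0,\; V \ge 0} \; \tfrac{1}{2}\,\lVert X - U V^{\top} \rVert_F^2
  + \tfrac{\lambda}{2}\,\lVert U - V \rVert_F^2
```

With V held fixed, the right-hand problem is a convex nonnegative least-squares problem in U (and vice versa), which is the kind of splitting that alternating-type algorithms rely on. The left-hand objective is quartic in U and offers no such split.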
An Iterative Boundary Random Walks Algorithm for Interactive Image Segmentation
Title | An Iterative Boundary Random Walks Algorithm for Interactive Image Segmentation |
Authors | Xiaofeng Xie, ZhuLiang Yu, Zhenghui Gu, Yuanqing Li |
Abstract | Interactive image segmentation algorithms can provide an intelligent way to understand the intention of user input. Many interactive methods have the problem of requiring a large amount of user input. Efficiently producing an intuitive segmentation under limited user input is important for industrial applications. In this paper, we reveal a positive feedback system on image segmentation to show the pixels of self-learning. Two approaches, iterative random walks and boundary random walks, are proposed for segmentation potential, which is the key step in the feedback system. Experimental results on image segmentation indicate that the proposed algorithms can obtain more efficient input to random walks, and that higher segmentation performance can be obtained by applying the iterative boundary random walks algorithm. |
Tasks | Semantic Segmentation |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03002v1 |
http://arxiv.org/pdf/1808.03002v1.pdf | |
PWC | https://paperswithcode.com/paper/an-iterative-boundary-random-walks-algorithm |
Repo | |
Framework | |
A Deep Information Sharing Network for Multi-contrast Compressed Sensing MRI Reconstruction
Title | A Deep Information Sharing Network for Multi-contrast Compressed Sensing MRI Reconstruction |
Authors | Liyan Sun, Zhiwen Fan, Yue Huang, Xinghao Ding, John Paisley |
Abstract | In multi-contrast magnetic resonance imaging (MRI), compressed sensing theory can accelerate imaging by sampling fewer measurements within each contrast. The conventional optimization-based models suffer from several limitations: a strict assumption of shared sparse support, time-consuming optimization, and “shallow” models with difficulties in encoding the rich patterns hiding in massive MRI data. In this paper, we propose the first deep learning model for multi-contrast MRI reconstruction. We achieve information sharing through feature sharing units, which significantly reduces the number of parameters. The feature sharing unit is combined with a data fidelity unit to comprise an inference block. These inference blocks are cascaded with dense connections, which allows for efficient information transmission across different depths of the network. Our extensive experiments on various multi-contrast MRI datasets show that the proposed model outperforms both state-of-the-art single-contrast and multi-contrast MRI methods in accuracy and efficiency. We show the improved reconstruction quality can bring great benefits for the later medical image analysis stage. Furthermore, the robustness of the proposed model to the non-registration environment shows its potential in real MRI applications. |
Tasks | |
Published | 2018-04-10 |
URL | http://arxiv.org/abs/1804.03596v1 |
http://arxiv.org/pdf/1804.03596v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-information-sharing-network-for-multi |
Repo | |
Framework | |
Frustrated with Replicating Claims of a Shared Model? A Solution
Title | Frustrated with Replicating Claims of a Shared Model? A Solution |
Authors | Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-Mei Hwu |
Abstract | Machine Learning (ML) and Deep Learning (DL) innovations are being introduced at such a rapid pace that model owners and evaluators are hard-pressed to analyze and study them. This is exacerbated by the complicated procedures for evaluation. The lack of standard systems and efficient techniques for specifying and provisioning ML/DL evaluation is the main cause of this “pain point”. This work discusses common pitfalls for replicating DL model evaluation, and shows that these subtle pitfalls can affect both accuracy and performance. It then proposes a solution to remedy these pitfalls called MLModelScope, a specification for repeatable model evaluation and a runtime to provision and measure experiments. We show that by easing the model specification and evaluation process, MLModelScope facilitates rapid adoption of ML/DL innovations. |
Tasks | |
Published | 2018-11-24 |
URL | https://arxiv.org/abs/1811.09737v2 |
https://arxiv.org/pdf/1811.09737v2.pdf | |
PWC | https://paperswithcode.com/paper/mlmodelscope-evaluate-and-measure-ml-models |
Repo | |
Framework | |
Universality of the stochastic block model
Title | Universality of the stochastic block model |
Authors | Jean-Gabriel Young, Guillaume St-Onge, Patrick Desrosiers, Louis J. Dubé |
Abstract | Mesoscopic pattern extraction (MPE) is the problem of finding a partition of the nodes of a complex network that maximizes some objective function. Many well-known network inference problems fall in this category, including, for instance, community detection, core-periphery identification, and imperfect graph coloring. In this paper, we show that the most popular algorithms designed to solve MPE problems can in fact be understood as special cases of the maximum likelihood formulation of the stochastic block model (SBM), or one of its direct generalizations. These equivalence relations show that the SBM is nearly universal with respect to MPE problems. |
Tasks | Community Detection |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.04214v2 |
http://arxiv.org/pdf/1806.04214v2.pdf | |
PWC | https://paperswithcode.com/paper/universality-of-the-stochastic-block-model |
Repo | |
Framework | |
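As a rough illustration of what "maximum likelihood formulation of the SBM" means in the abstract above, the sketch below scores a node partition by its log-likelihood under a Bernoulli SBM for an undirected simple graph. The function name, the toy graph, and the block probabilities are all hypothetical choices made for this example and are not taken from the paper.

```python
import math
from itertools import combinations

def sbm_log_likelihood(n, edges, groups, p):
    """Log-likelihood of an undirected simple graph under a Bernoulli SBM.

    n      -- number of nodes, labelled 0..n-1
    edges  -- set of frozenset({i, j}) node pairs that are connected
    groups -- groups[i] is the block index of node i (the partition)
    p      -- p[r][s] is the edge probability between blocks r and s,
              assumed symmetric with entries strictly between 0 and 1
    """
    total = 0.0
    for i, j in combinations(range(n), 2):
        prob = p[groups[i]][groups[j]]
        if frozenset((i, j)) in edges:
            total += math.log(prob)        # observed edge
        else:
            total += math.log(1.0 - prob)  # observed non-edge
    return total

# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
edges = {frozenset(e) for e in
         [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]}
p = [[0.9, 0.1], [0.1, 0.9]]    # assortative blocks: dense inside, sparse between
planted = [0, 0, 0, 1, 1, 1]    # one block per triangle
mixed = [0, 1, 0, 1, 0, 1]      # a partition that cuts through both triangles

ll_planted = sbm_log_likelihood(6, edges, planted, p)
ll_mixed = sbm_log_likelihood(6, edges, mixed, p)
print(ll_planted > ll_mixed)
```

Under these assortative probabilities, maximizing the likelihood over partitions behaves like community detection: the partition that keeps each triangle together scores higher than the one that splits them. Other choices of `p` would favor other mesoscopic patterns.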
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Title | TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays |
Authors | Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Ronald M. Summers |
Abstract | Chest X-rays are one of the most common radiological examinations in daily clinical routines. Reporting thorax diseases using chest X-rays is often an entry-level task for radiologist trainees. Yet, reading a chest X-ray image remains a challenging job for learning-oriented machine intelligence, due to (1) shortage of large-scale machine-learnable medical image datasets, and (2) lack of techniques that can mimic the high-level reasoning of human radiologists that requires years of knowledge accumulation and professional training. In this paper, we show the clinical free-text radiological reports can be utilized as a priori knowledge for tackling these two key problems. We propose a novel Text-Image Embedding network (TieNet) for extracting the distinctive image and text representations. Multi-level attention models are integrated into an end-to-end trainable CNN-RNN architecture for highlighting the meaningful text words and image regions. We first apply TieNet to classify the chest X-rays by using both image features and text embeddings extracted from associated reports. The proposed auto-annotation framework achieves high accuracy (over 0.9 on average in AUCs) in assigning disease labels for our hand-label evaluation dataset. Furthermore, we transform the TieNet into a chest X-ray reporting system. It simulates the reporting process and can output disease classification and a preliminary report together. The classification results are significantly improved (6% increase on average in AUCs) compared to the state-of-the-art baseline on an unseen and hand-labeled dataset (OpenI). |
Tasks | |
Published | 2018-01-12 |
URL | http://arxiv.org/abs/1801.04334v1 |
http://arxiv.org/pdf/1801.04334v1.pdf | |
PWC | https://paperswithcode.com/paper/tienet-text-image-embedding-network-for |
Repo | |
Framework | |
Hierarchical VampPrior Variational Fair Auto-Encoder
Title | Hierarchical VampPrior Variational Fair Auto-Encoder |
Authors | Philip Botros, Jakub M. Tomczak |
Abstract | Decision making is a process that is extremely prone to different biases. In this paper we consider learning fair representations that aim at removing nuisance (sensitive) information from the decision process. For this purpose, we propose to use deep generative modeling and adapt a hierarchical Variational Auto-Encoder to learn these fair representations. Moreover, we utilize the mutual information as a useful regularizer for enforcing fairness of a representation. In experiments on two benchmark datasets and two scenarios where the sensitive variables are fully and partially observable, we show that the proposed approach either outperforms or performs on par with the current best model. |
Tasks | Decision Making |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09918v2 |
http://arxiv.org/pdf/1806.09918v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-vampprior-variational-fair-auto |
Repo | |
Framework | |
Closed Form Variational Objectives For Bayesian Neural Networks with a Single Hidden Layer
Title | Closed Form Variational Objectives For Bayesian Neural Networks with a Single Hidden Layer |
Authors | Martin Jankowiak |
Abstract | In this note we consider setups in which variational objectives for Bayesian neural networks can be computed in closed form. In particular we focus on single-layer networks in which the activation function is piecewise polynomial (e.g. ReLU). In this case we show that for a Normal likelihood and structured Normal variational distributions one can compute a variational lower bound in closed form. In addition we compute the predictive mean and variance in closed form. Finally, we also show how to compute approximate lower bounds for other likelihoods (e.g. softmax classification). In experiments we show how the resulting variational objectives can help improve training and provide fast test time predictions. |
Tasks | |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00686v2 |
http://arxiv.org/pdf/1811.00686v2.pdf | |
PWC | https://paperswithcode.com/paper/closed-form-variational-objectives-for |
Repo | |
Framework | |
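To give a feel for why a piecewise polynomial activation such as ReLU admits closed-form expressions under Normal distributions, the sketch below computes the first two moments of relu(x) for a scalar x ~ Normal(mu, sigma^2) from the standard Normal pdf and cdf. This is a textbook identity used for illustration only, not the paper's variational objective, and the function names are hypothetical.

```python
import math

def normal_pdf(z):
    """Standard Normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def relu_moments(mu, sigma):
    """Return (E[relu(x)], E[relu(x)^2]) for x ~ Normal(mu, sigma^2), sigma > 0."""
    z = mu / sigma
    first = mu * normal_cdf(z) + sigma * normal_pdf(z)
    second = (mu * mu + sigma * sigma) * normal_cdf(z) + mu * sigma * normal_pdf(z)
    return first, second

def relu_mean_variance(mu, sigma):
    """Mean and variance of relu(x), derived from the two closed-form moments."""
    first, second = relu_moments(mu, sigma)
    return first, second - first * first

mean0, second0 = relu_moments(0.0, 1.0)
print(mean0, second0)
```

For mu = 0 and sigma = 1 the mean equals 1/sqrt(2*pi) and the second moment equals 1/2, since relu keeps exactly the positive half of a symmetric distribution.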
Automatic classification of trees using a UAV onboard camera and deep learning
Title | Automatic classification of trees using a UAV onboard camera and deep learning |
Authors | Masanori Onishi, Takeshi Ise |
Abstract | Automatic classification of trees using remotely sensed data has been a dream of many scientists and land use managers. Recently, unmanned aerial vehicles (UAVs) have been expected to be an easy-to-use, cost-effective tool for remote sensing of forests, and deep learning has attracted attention for its machine vision capabilities. In this study, using a commercially available UAV and a publicly available package for deep learning, we constructed a machine vision system for the automatic classification of trees. In our method, we segmented a UAV photograph of a forest into individual tree crowns and carried out object-based deep learning. As a result, the system was able to classify 7 tree types at 89.0% accuracy. This performance is notable because we only used basic RGB images from a standard UAV. In contrast, most previous studies used expensive hardware such as multispectral imagers to improve the performance. This result means that our method has the potential to classify individual trees in a cost-effective manner. This can be a usable tool for many forest researchers and managers. |
Tasks | |
Published | 2018-04-27 |
URL | http://arxiv.org/abs/1804.10390v1 |
http://arxiv.org/pdf/1804.10390v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-classification-of-trees-using-a-uav |
Repo | |
Framework | |
Validating WordNet Meronymy Relations using Adimen-SUMO
Title | Validating WordNet Meronymy Relations using Adimen-SUMO |
Authors | Javier Álvez, Itziar Gonzalez-Dios, German Rigau |
Abstract | In this paper, we report on the practical application of a novel approach for validating the knowledge of WordNet using Adimen-SUMO. In particular, this paper focuses on cross-checking the WordNet meronymy relations against the knowledge encoded in Adimen-SUMO. Our validation approach tests a large set of competency questions (CQs), which are derived (semi)-automatically from the knowledge encoded in WordNet, SUMO and their mapping, by applying efficient first-order logic automated theorem provers. Unfortunately, despite being created manually, these knowledge resources are not free of errors and discrepancies. As a consequence, some of the resulting CQs are not plausible according to the knowledge included in Adimen-SUMO. Thus, first we focus on (semi)-automatically improving the alignment between these knowledge resources, and second, we perform a minimal set of corrections in the ontology. Our aim is to minimize the manual effort required for an extensive validation process. We report on the strategies followed, the changes made, the effort needed and its impact when validating the WordNet meronymy relations using improved versions of the mapping and the ontology. Based on the new results, we discuss the implications of the appropriate corrections and the need for future enhancements. |
Tasks | |
Published | 2018-05-20 |
URL | http://arxiv.org/abs/1805.07824v1 |
http://arxiv.org/pdf/1805.07824v1.pdf | |
PWC | https://paperswithcode.com/paper/validating-wordnet-meronymy-relations-using |
Repo | |
Framework | |
Efficient inference in stochastic block models with vertex labels
Title | Efficient inference in stochastic block models with vertex labels |
Authors | Clara Stegehuis, Laurent Massoulié |
Abstract | We study the stochastic block model with two communities where vertices contain side information in the form of a vertex label. These vertex labels may have arbitrary label distributions, depending on the community memberships. We analyze a linearized version of the popular belief propagation algorithm. We show that this algorithm achieves the highest accuracy possible whenever a certain function of the network parameters has a unique fixed point. Whenever this function has multiple fixed points, the belief propagation algorithm may not perform optimally. We show that increasing the information in the vertex labels may reduce the number of fixed points and hence lead to optimality of belief propagation. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07562v2 |
http://arxiv.org/pdf/1806.07562v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-inference-in-stochastic-block |
Repo | |
Framework | |
Paranom: A Parallel Anomaly Dataset Generator
Title | Paranom: A Parallel Anomaly Dataset Generator |
Authors | Justin Gottschlich |
Abstract | In this paper, we present Paranom, a parallel anomaly dataset generator. We discuss its design and provide brief experimental results demonstrating its usefulness in improving the classification correctness of LSTM-AD, a state-of-the-art anomaly detection model. |
Tasks | Anomaly Detection |
Published | 2018-01-09 |
URL | http://arxiv.org/abs/1801.03164v1 |
http://arxiv.org/pdf/1801.03164v1.pdf | |
PWC | https://paperswithcode.com/paper/paranom-a-parallel-anomaly-dataset-generator |
Repo | |
Framework | |