October 20, 2019

Paper Group AWR 201

Learning Heuristics for Quantified Boolean Formulas through Deep Reinforcement Learning

Title Learning Heuristics for Quantified Boolean Formulas through Deep Reinforcement Learning
Authors Gil Lederman, Markus N. Rabe, Edward A. Lee, Sanjit A. Seshia
Abstract We demonstrate how to learn efficient heuristics for automated reasoning algorithms for quantified Boolean formulas through deep reinforcement learning. We focus on a backtracking search algorithm, which can already solve formulas of impressive size - up to hundreds of thousands of variables. The main challenge is to find a representation of these formulas that lends itself to making predictions in a scalable way. For a family of challenging problems, we learned a heuristic that solves significantly more formulas compared to the existing handwritten heuristics.
Published 2018-07-20
URL https://arxiv.org/abs/1807.08058v3
PDF https://arxiv.org/pdf/1807.08058v3.pdf
PWC https://paperswithcode.com/paper/learning-heuristics-for-automated-reasoning
Repo https://github.com/MarkusRabe/cadet
Framework none

Unsupervised Attention-guided Image to Image Translation

Title Unsupervised Attention-guided Image to Image Translation
Authors Youssef A. Mejjati, Christian Richardt, James Tompkin, Darren Cosker, Kwang In Kim
Abstract Current unsupervised image-to-image translation techniques struggle to focus their attention on individual objects without altering the background or the way multiple objects interact within a scene. Motivated by the important role of attention in human perception, we tackle this limitation by introducing unsupervised attention mechanisms that are jointly adversarially trained with the generators and discriminators. We demonstrate qualitatively and quantitatively that our approach is able to attend to relevant regions in the image without requiring supervision, and that by doing so it achieves more realistic mappings compared to recent approaches.
Tasks Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published 2018-06-06
URL http://arxiv.org/abs/1806.02311v3
PDF http://arxiv.org/pdf/1806.02311v3.pdf
PWC https://paperswithcode.com/paper/unsupervised-attention-guided-image-to-image
Repo https://github.com/AlamiMejjati/Unsupervised-Attention-guided-Image-to-Image-Translation
Framework tf

Semantic Edge Detection with Diverse Deep Supervision

Title Semantic Edge Detection with Diverse Deep Supervision
Authors Yun Liu, Ming-Ming Cheng, Deng-Ping Fan, Le Zhang, JiaWang Bian, Dacheng Tao
Abstract Semantic edge detection (SED), which aims at jointly extracting edges as well as their category information, has far-reaching applications in domains such as semantic segmentation, object proposal generation, and object recognition. SED naturally requires achieving two distinct supervision targets: locating fine detailed edges and identifying high-level semantics. We shed light on how such distracted supervision targets prevent state-of-the-art SED methods from effectively using deep supervision to improve results. In this paper, we propose a novel fully convolutional neural network using diverse deep supervision (DDS) within a multi-task framework where lower layers aim at generating category-agnostic edges, while higher layers are responsible for the detection of category-aware semantic edges. To overcome the distracted supervision challenge, a novel information converter unit is introduced, whose effectiveness has been extensively evaluated in several popular benchmark datasets, including SBD, Cityscapes, and PASCAL VOC2012. Source code will be released upon paper acceptance.
Tasks Edge Detection, Object Proposal Generation, Object Recognition, Semantic Segmentation
Published 2018-04-09
URL https://arxiv.org/abs/1804.02864v3
PDF https://arxiv.org/pdf/1804.02864v3.pdf
PWC https://paperswithcode.com/paper/semantic-edge-detection-with-diverse-deep
Repo https://github.com/arsenal9971/shearlet_semantic_edge
Framework pytorch

Zero-Shot Sketch-Image Hashing

Title Zero-Shot Sketch-Image Hashing
Authors Yuming Shen, Li Liu, Fumin Shen, Ling Shao
Abstract Recent studies show that large-scale sketch-based image retrieval (SBIR) can be efficiently tackled by cross-modal binary representation learning methods, where Hamming distance matching significantly speeds up the process of similarity search. Given training and test data restricted to a fixed set of pre-defined categories, cutting-edge SBIR and cross-modal hashing works obtain acceptable retrieval performance. However, most of the existing methods fail when the categories of query sketches have never been seen during training. In this paper, the above problem is framed as a novel but realistic zero-shot SBIR hashing task. We elaborate on the challenges of this special task and accordingly propose a zero-shot sketch-image hashing (ZSIH) model. An end-to-end three-network architecture is built, two networks of which are treated as the binary encoders. The third network mitigates the sketch-image heterogeneity and enhances the semantic relations among data by utilizing the Kronecker fusion layer and graph convolution, respectively. As an important part of ZSIH, we formulate a generative hashing scheme for reconstructing semantic knowledge representations for zero-shot retrieval. To the best of our knowledge, ZSIH is the first zero-shot hashing work suitable for SBIR and cross-modal search. Comprehensive experiments are conducted on two extended datasets, i.e., Sketchy and TU-Berlin with a novel zero-shot train-test split. The proposed model remarkably outperforms related works.
Tasks Image Retrieval, Representation Learning, Sketch-Based Image Retrieval
Published 2018-03-06
URL http://arxiv.org/abs/1803.02284v1
PDF http://arxiv.org/pdf/1803.02284v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-sketch-image-hashing
Repo https://github.com/ymcidence/Zero-Shot-Sketch-Image-Hashing
Framework tf
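
The Hamming distance matching mentioned in the abstract above can be illustrated with a minimal sketch. It is not the ZSIH model: the 8-bit codes, the item names, and the helper functions `hamming_distance` and `rank_by_hamming` are all hypothetical, and a real system would obtain the codes from learned binary encoders.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, gallery_codes: Dict[str, int]) -> List[Tuple[str, int]]:
    """Sort gallery items by Hamming distance to the query, closest first."""
    scored = [(name, hamming_distance(query_code, code))
              for name, code in gallery_codes.items()]
    # Ties are broken by name so the ordering is deterministic.
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Hypothetical 8-bit codes for three gallery images and one query sketch.
gallery = {
    "cat_photo": 0b10110010,
    "dog_photo": 0b10010111,
    "car_photo": 0b01001100,
}
sketch_code = 0b10110011

ranking = rank_by_hamming(sketch_code, gallery)
print(ranking)
```

Because the distance reduces to an XOR followed by a bit count, comparing a query against a large gallery is cheap, which is the speed-up the abstract refers to.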

Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction

Title Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction
Authors Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H. S. Torr
Abstract In this paper, we aim at solving pixel-wise binary problems, including salient object segmentation, skeleton extraction, and edge detection, by introducing a unified architecture. Previous works have proposed tailored methods for solving each of the three tasks independently. Here, we show that these tasks share some similarities that can be exploited for developing a unified framework. In particular, we introduce a horizontal cascade, each component of which is densely connected to the outputs of the previous component. Stringing these components together allows us to exploit features across different levels hierarchically and to address the multiple pixel-wise binary regression tasks effectively. To assess the performance of our proposed network on these tasks, we carry out exhaustive evaluations on multiple representative datasets. Although these tasks are inherently very different, we show that our unified approach performs very well on all of them and works far better than current single-purpose state-of-the-art methods. All the code in this paper will be publicly available.
Tasks Edge Detection, Semantic Segmentation
Published 2018-03-27
URL http://arxiv.org/abs/1803.09860v2
PDF http://arxiv.org/pdf/1803.09860v2.pdf
PWC https://paperswithcode.com/paper/three-birds-one-stone-a-unified-framework-for
Repo https://github.com/shawnyuen/ContourDetectPaperCollection
Framework none

Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss

Title Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss
Authors Qi Dou, Cheng Ouyang, Cheng Chen, Hao Chen, Pheng-Ann Heng
Abstract Convolutional networks (ConvNets) have achieved great successes in various challenging vision tasks. However, the performance of ConvNets degrades under domain shift. Domain adaptation is both more important and more challenging in the field of biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not quite optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features which are aligned with the source domain feature space. A domain critic module (DCM) is set up for discriminating the feature space of both domains. We optimize the DAM and DCM via an adversarial loss without using any target domain label. Our proposed method is validated by adapting a ConvNet trained with MRI images to unpaired CT data for cardiac structure segmentation, and achieves very promising results.
Tasks Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation
Published 2018-04-29
URL http://arxiv.org/abs/1804.10916v2
PDF http://arxiv.org/pdf/1804.10916v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-cross-modality-domain-adaptation
Repo https://github.com/carrenD/Med-CMDA
Framework tf

Semi-unsupervised Learning of Human Activity using Deep Generative Models

Title Semi-unsupervised Learning of Human Activity using Deep Generative Models
Authors Matthew Willetts, Aiden Doherty, Stephen Roberts, Chris Holmes
Abstract We introduce ‘semi-unsupervised learning’, a problem regime related to transfer learning and zero-shot learning where, in the training data, some classes are sparsely labelled and others entirely unlabelled. Models able to learn from training data of this type are potentially of great use as many real-world datasets are like this. Here we demonstrate a new deep generative model for classification in this regime. Our model, a Gaussian mixture deep generative model, demonstrates superior semi-unsupervised classification performance on MNIST to model M2 from Kingma and Welling (2014). We apply the model to human accelerometer data, performing activity classification and structure discovery on windows of time series data.
Tasks Time Series, Transfer Learning, Zero-Shot Learning
Published 2018-10-29
URL http://arxiv.org/abs/1810.12176v2
PDF http://arxiv.org/pdf/1810.12176v2.pdf
PWC https://paperswithcode.com/paper/semi-unsupervised-learning-of-human-activity
Repo https://github.com/MatthewWilletts/GM-DGM
Framework tf

iW-Net: an automatic and minimalistic interactive lung nodule segmentation deep network

Title iW-Net: an automatic and minimalistic interactive lung nodule segmentation deep network
Authors Guilherme Aresta, Colin Jacobs, Teresa Araújo, António Cunha, Isabel Ramos, Bram van Ginneken, Aurélio Campilho
Abstract We propose iW-Net, a deep learning model that allows for both automatic and interactive segmentation of lung nodules in computed tomography images. iW-Net is composed of two blocks: the first one provides an automatic segmentation and the second one allows the user to correct it by analyzing 2 points that the user places on the nodule’s boundary. For this purpose, a physics-inspired weight map that takes the user input into account is proposed, which is used both as a feature map and in the system’s loss function. Our approach is extensively evaluated on the public LIDC-IDRI dataset, where we achieve a state-of-the-art performance of 0.55 intersection over union vs the 0.59 inter-observer agreement. We also show that iW-Net allows the segmentation of small nodules to be corrected, which is essential for proper patient referral decisions, and improves the segmentation of the challenging non-solid nodules, and thus may be an important tool for increasing the early diagnosis of lung cancer.
Tasks Interactive Segmentation, Lung Nodule Segmentation
Published 2018-11-30
URL http://arxiv.org/abs/1811.12789v1
PDF http://arxiv.org/pdf/1811.12789v1.pdf
PWC https://paperswithcode.com/paper/iw-net-an-automatic-and-minimalistic
Repo https://github.com/gmaresta/iW-Net
Framework none
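
The intersection over union (IoU) figure quoted in the abstract above is the ratio between the overlap of two masks and their combined area. The following is a minimal sketch of that metric on small 2D binary masks; the function name `intersection_over_union`, the toy masks, and the choice to return 1.0 for two empty masks are assumptions made for illustration, not the evaluation code used by the authors.

```python
def intersection_over_union(pred, target):
    """IoU between two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks agree perfectly; this convention is an assumption.
    return intersection / union if union else 1.0


# Toy 3x3 masks: a predicted segmentation and a reference annotation.
predicted = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
reference = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]

iou = intersection_over_union(predicted, reference)
print(iou)
```

Here the masks share 3 pixels and together cover 5, so the score is 3/5.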

Modeling Multi-turn Conversation with Deep Utterance Aggregation

Title Modeling Multi-turn Conversation with Deep Utterance Aggregation
Authors Zhuosheng Zhang, Jiangtong Li, Pengfei Zhu, Hai Zhao, Gongshen Liu
Abstract Multi-turn conversation understanding is a major challenge for building intelligent dialogue systems. This work focuses on retrieval-based response matching for multi-turn conversation whose related work simply concatenates the conversation utterances, ignoring the interactions among previous utterances for context modeling. In this paper, we formulate previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained context representation. In detail, a self-matching attention is first introduced to route the vital information in each utterance. Then the model matches a response with each refined utterance and the final matching score is obtained after attentive turns aggregation. Experimental results show our model outperforms the state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus.
Tasks Conversational Response Selection
Published 2018-06-24
URL http://arxiv.org/abs/1806.09102v2
PDF http://arxiv.org/pdf/1806.09102v2.pdf
PWC https://paperswithcode.com/paper/modeling-multi-turn-conversation-with-deep
Repo https://github.com/cooelf/DeepUtteranceAggregation
Framework none

Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation

Title Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation
Authors Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz
Abstract We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing. PWC-Net is 17 times smaller in size, 2 times faster in inference, and 11% more accurate on Sintel final than the recent FlowNet2 model. It is the winning entry in the optical flow competition of the robust vision challenge. Next, we experimentally analyze the sources of our performance gains. In particular, we use the same training procedure of PWC-Net to retrain FlowNetC, a sub-network of FlowNet2. The retrained FlowNetC is 56% more accurate on Sintel final than the previously trained one and even 5% more accurate than the FlowNet2 model. We further improve the training procedure and increase the accuracy of PWC-Net on Sintel by 10% and on KITTI 2012 and 2015 by 20%. Our newly trained model parameters and training protocols will be available on https://github.com/NVlabs/PWC-Net
Tasks Optical Flow Estimation
Published 2018-09-14
URL http://arxiv.org/abs/1809.05571v1
PDF http://arxiv.org/pdf/1809.05571v1.pdf
PWC https://paperswithcode.com/paper/models-matter-so-does-training-an-empirical
Repo https://github.com/NVlabs/PWC-Net
Framework pytorch
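
To make the "cost volume processing" principle named in the abstract above more concrete, here is a heavily simplified sketch of a correlation cost volume in one spatial dimension. It is not the PWC-Net implementation: the function `cost_volume_1d`, the toy feature vectors, and the zero cost for out-of-range positions are assumptions for illustration only.

```python
def cost_volume_1d(feat1, feat2, max_disp):
    """Correlate feat1[x] with feat2[x + d] for every d in [-max_disp, max_disp].

    feat1 and feat2 are lists of equally long feature vectors, one per position.
    The result has one row per position and one column per displacement.
    """
    n = len(feat1)
    volume = []
    for x in range(n):
        row = []
        for d in range(-max_disp, max_disp + 1):
            j = x + d
            if 0 <= j < n:
                dim = len(feat1[x])
                # Mean of the element-wise product, i.e. a normalized dot product.
                row.append(sum(a * b for a, b in zip(feat1[x], feat2[j])) / dim)
            else:
                row.append(0.0)  # positions outside the signal get zero cost
        volume.append(row)
    return volume


# Toy features: the second signal is the first one shifted right by one position.
features_a = [[1, 0], [0, 1], [1, 1], [0, 0]]
features_b = [[0, 0], [1, 0], [0, 1], [1, 1]]

volume = cost_volume_1d(features_a, features_b, max_disp=1)
# The column with the highest correlation suggests the displacement at position 1.
best_disp_at_1 = volume[1].index(max(volume[1])) - 1
print(volume[1], best_disp_at_1)
```

In this toy case the strongest response at position 1 sits in the column for displacement +1, matching the shift used to build the second signal.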

mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion

Title mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion
Authors Emmi Jokinen, Markus Heinonen, Harri Lähdesmäki
Abstract Proteins are commonly used by the biochemical industry for numerous processes. Refining these proteins’ properties via mutations causes stability effects as well. Accurate computational methods to predict how mutations affect protein stability are necessary to facilitate efficient protein design. However, the accuracy of predictive models is ultimately constrained by the limited availability of experimental data. We have developed mGPfusion, a novel Gaussian process (GP) method for predicting a protein’s stability changes upon single and multiple mutations. This method complements the limited experimental data with large amounts of molecular simulation data. We introduce a Bayesian data fusion model that re-calibrates the experimental and in silico data sources and then learns a predictive GP model from the combined data. Our protein-specific model requires experimental data only regarding the protein of interest and performs well even with few experimental measurements. mGPfusion models proteins by contact maps and infers the stability effects caused by mutations with a mixture of graph kernels. Our results show that mGPfusion outperforms state-of-the-art methods in predicting protein stability on a dataset of 15 different proteins and that incorporating molecular simulation data improves the model learning and prediction accuracy.
Published 2018-02-08
URL http://arxiv.org/abs/1802.02852v2
PDF http://arxiv.org/pdf/1802.02852v2.pdf
PWC https://paperswithcode.com/paper/mgpfusion-predicting-protein-stability
Repo https://github.com/emmijokinen/mgpfusion
Framework none

Visual Domain Adaptation with Manifold Embedded Distribution Alignment

Title Visual Domain Adaptation with Manifold Embedded Distribution Alignment
Authors Jindong Wang, Wenjie Feng, Yiqiang Chen, Han Yu, Meiyu Huang, Philip S. Yu
Abstract Visual domain adaptation aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain. Existing methods either attempt to align the cross-domain distributions, or perform manifold subspace learning. However, there are two significant challenges: (1) degenerated feature transformation, which means that distribution alignment is often performed in the original feature space, where feature distortions are hard to overcome. On the other hand, subspace learning is not sufficient to reduce the distribution divergence. (2) unevaluated distribution alignment, which means that existing distribution alignment methods only align the marginal and conditional distributions with equal importance, while they fail to evaluate the different importance of these two distributions in real applications. In this paper, we propose a Manifold Embedded Distribution Alignment (MEDA) approach to address these challenges. MEDA learns a domain-invariant classifier in Grassmann manifold with structural risk minimization, while performing dynamic distribution alignment to quantitatively account for the relative importance of marginal and conditional distributions. To the best of our knowledge, MEDA is the first attempt to perform dynamic distribution alignment for manifold domain adaptation. Extensive experiments demonstrate that MEDA shows significant improvements in classification accuracy compared to state-of-the-art traditional and deep methods.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2018-07-19
URL http://arxiv.org/abs/1807.07258v2
PDF http://arxiv.org/pdf/1807.07258v2.pdf
PWC https://paperswithcode.com/paper/visual-domain-adaptation-with-manifold
Repo https://github.com/jindongwang/transferlearning
Framework pytorch

A Multi-task Deep Learning Architecture for Maritime Surveillance using AIS Data Streams

Title A Multi-task Deep Learning Architecture for Maritime Surveillance using AIS Data Streams
Authors Duong Nguyen, Rodolphe Vadaine, Guillaume Hajduch, René Garello, Ronan Fablet
Abstract In a world of global trading, maritime safety, security and efficiency are crucial issues. We propose a multi-task deep learning framework for vessel monitoring using Automatic Identification System (AIS) data streams. We combine recurrent neural networks with latent variable modeling and an embedding of AIS messages to a new representation space to jointly address key issues to be dealt with when considering AIS data streams: massive amounts of streaming data, noisy data and irregular time sampling. We demonstrate the relevance of the proposed deep learning framework on real AIS datasets for a three-task setting, namely trajectory reconstruction, anomaly detection and vessel type identification.
Tasks Anomaly Detection
Published 2018-06-06
URL http://arxiv.org/abs/1806.03972v3
PDF http://arxiv.org/pdf/1806.03972v3.pdf
PWC https://paperswithcode.com/paper/a-multi-task-deep-learning-architecture-for
Repo https://github.com/dnguyengithub/MultitaskAIS
Framework tf

Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation

Title Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation
Authors Panagiotis Meletis, Gijs Dubbelman
Abstract We propose a convolutional network with hierarchical classifiers for per-pixel semantic segmentation, which can be trained on multiple, heterogeneous datasets and exploit their semantic hierarchy. Our network is the first to be simultaneously trained on three different datasets from the intelligent vehicles domain, i.e. Cityscapes, GTSDB and Mapillary Vistas, and is able to handle different semantic levels of detail, class imbalances, and different annotation types, i.e. dense per-pixel and sparse bounding-box labels. We assess our hierarchical approach by comparing against flat, non-hierarchical classifiers, and we show improvements in mean pixel accuracy of 13.0% for Cityscapes classes, 2.4% for Vistas classes and 32.3% for GTSDB classes. Our implementation achieves inference rates of 17 fps at a resolution of 520x706 for 108 classes running on a GPU.
Tasks Semantic Segmentation
Published 2018-03-15
URL http://arxiv.org/abs/1803.05675v2
PDF http://arxiv.org/pdf/1803.05675v2.pdf
PWC https://paperswithcode.com/paper/training-of-convolutional-networks-on
Repo https://github.com/pmeletis/IV2018-hierarchical-semantic-segmentation-for-heterogeneous-datasets
Framework tf
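
One common way to read "hierarchical classifiers" as opposed to a flat classifier is that a root classifier scores coarse classes and a sub-classifier refines one of them, with the two combined by the chain rule P(leaf) = P(parent) * P(leaf | parent). The sketch below illustrates only that general idea for a single pixel; the abstract does not state that the authors combine scores this way, and the class names, probabilities, and the function `leaf_probabilities` are hypothetical.

```python
def leaf_probabilities(root_probs, child_probs):
    """Combine P(parent) with P(child | parent) into probabilities over leaf classes.

    root_probs maps coarse class names to probabilities.
    child_probs maps a coarse class name to the conditional probabilities of its
    subclasses; coarse classes without an entry are treated as leaves themselves.
    """
    leaves = {}
    for parent, p_parent in root_probs.items():
        children = child_probs.get(parent)
        if not children:
            leaves[parent] = p_parent
            continue
        for child, p_child in children.items():
            leaves[child] = p_parent * p_child
    return leaves


# Hypothetical per-pixel outputs of a root classifier and one sub-classifier.
root = {"road": 0.2, "traffic_sign": 0.8}
children = {"traffic_sign": {"stop_sign": 0.75, "speed_limit_sign": 0.25}}

pixel_probs = leaf_probabilities(root, children)
predicted_class = max(pixel_probs, key=pixel_probs.get)
print(pixel_probs, predicted_class)
```

A structure like this lets a dataset that only labels the coarse class (for example "traffic_sign") supervise the root classifier, while a dataset with finer labels supervises the sub-classifier.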

Jump to better conclusions: SCAN both left and right

Title Jump to better conclusions: SCAN both left and right
Authors Joost Bastings, Marco Baroni, Jason Weston, Kyunghyun Cho, Douwe Kiela
Abstract Lake and Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the strong generalization abilities of recurrent sequence-to-sequence models. Their initial experiments suggested that such models may fail because they lack the ability to extract systematic rules. Here, we take a closer look at SCAN and show that it does not always capture the kind of generalization that it was designed for. To mitigate this we propose a complementary dataset, which requires mapping actions back to the original commands, called NACS. We show that models that do well on SCAN do not necessarily do well on NACS, and that NACS exhibits properties more closely aligned with realistic use-cases for sequence-to-sequence models.
Published 2018-09-12
URL http://arxiv.org/abs/1809.04640v1
PDF http://arxiv.org/pdf/1809.04640v1.pdf
PWC https://paperswithcode.com/paper/jump-to-better-conclusions-scan-both-left-and
Repo https://github.com/facebookresearch/NACS
Framework pytorch