Paper Group AWR 201
Learning Heuristics for Quantified Boolean Formulas through Deep Reinforcement Learning
Title | Learning Heuristics for Quantified Boolean Formulas through Deep Reinforcement Learning |
Authors | Gil Lederman, Markus N. Rabe, Edward A. Lee, Sanjit A. Seshia |
Abstract | We demonstrate how to learn efficient heuristics for automated reasoning algorithms for quantified Boolean formulas through deep reinforcement learning. We focus on a backtracking search algorithm, which can already solve formulas of impressive size, with up to hundreds of thousands of variables. The main challenge is to find a representation of these formulas that lends itself to making predictions in a scalable way. For a family of challenging problems, we learned a heuristic that solves significantly more formulas than the existing handwritten heuristics. |
Tasks | |
Published | 2018-07-20 |
URL | https://arxiv.org/abs/1807.08058v3 |
https://arxiv.org/pdf/1807.08058v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-heuristics-for-automated-reasoning |
Repo | https://github.com/MarkusRabe/cadet |
Framework | none |
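The key representation question the abstract raises is how to turn a formula into something a network can score. A minimal PyTorch sketch of that idea follows: one round of message passing on a literal-clause incidence graph, then a linear head that scores variables for branching. The embedding size, random weights, literal ordering, and scoring head are all illustrative assumptions, not the authors' architecture.

```python
import torch

torch.manual_seed(0)
n_lits, n_clauses, d = 6, 4, 8           # 3 variables -> 6 literals
inc = torch.zeros(n_lits, n_clauses)     # incidence: literal i in clause j
inc[0, 0] = inc[3, 0] = inc[2, 1] = inc[5, 1] = inc[1, 2] = inc[4, 3] = 1.0

lit_emb = torch.randn(n_lits, d)         # initial literal embeddings
W_c, W_l = torch.randn(d, d), torch.randn(d, d)

clause_msg = inc.t() @ lit_emb @ W_c     # clauses aggregate their literals
lit_msg = inc @ clause_msg @ W_l         # literals aggregate their clauses
lit_emb = torch.relu(lit_emb + lit_msg)  # one message-passing round

# A variable's features combine its positive (even index) and negative
# (odd index) literal; the highest-scoring variable is branched on.
score_head = torch.randn(2 * d)
var_emb = torch.cat([lit_emb[0::2], lit_emb[1::2]], dim=1)
print("branch on variable", int(torch.argmax(var_emb @ score_head)))
```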
Unsupervised Attention-guided Image to Image Translation
Title | Unsupervised Attention-guided Image to Image Translation |
Authors | Youssef A. Mejjati, Christian Richardt, James Tompkin, Darren Cosker, Kwang In Kim |
Abstract | Current unsupervised image-to-image translation techniques struggle to focus their attention on individual objects without altering the background or the way multiple objects interact within a scene. Motivated by the important role of attention in human perception, we tackle this limitation by introducing unsupervised attention mechanisms that are jointly adversarially trained with the generators and discriminators. We demonstrate qualitatively and quantitatively that our approach is able to attend to relevant regions in the image without requiring supervision, and that by doing so it achieves more realistic mappings compared to recent approaches. |
Tasks | Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02311v3 |
http://arxiv.org/pdf/1806.02311v3.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-attention-guided-image-to-image |
Repo | https://github.com/AlamiMejjati/Unsupervised-Attention-guided-Image-to-Image-Translation |
Framework | tf |
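The composition step the abstract implies is simple to state: a learned attention mask gates which pixels the generator may change, so the output is a per-pixel blend of the generator's translation and the untouched input. A minimal sketch under assumed shapes; the attention and generator networks are trained adversarially, which this omits entirely.

```python
import torch

def attention_blend(x, gen_out, attn_logits):
    """Attended regions come from the generator; the rest is copied from
    the input, leaving the background unaltered (sketch of the composition)."""
    a = torch.sigmoid(attn_logits)        # (B,1,H,W) soft mask in [0,1]
    return a * gen_out + (1.0 - a) * x    # broadcast over the channel dim

x = torch.rand(2, 3, 64, 64)              # input images
gen_out = torch.rand(2, 3, 64, 64)        # generator output (assumed)
attn_logits = torch.randn(2, 1, 64, 64)   # attention-network output (assumed)
print(attention_blend(x, gen_out, attn_logits).shape)  # (2, 3, 64, 64)
```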
Semantic Edge Detection with Diverse Deep Supervision
Title | Semantic Edge Detection with Diverse Deep Supervision |
Authors | Yun Liu, Ming-Ming Cheng, Deng-Ping Fan, Le Zhang, JiaWang Bian, Dacheng Tao |
Abstract | Semantic edge detection (SED), which aims at jointly extracting edges as well as their category information, has far-reaching applications in domains such as semantic segmentation, object proposal generation, and object recognition. SED naturally requires achieving two distinct supervision targets: locating fine detailed edges and identifying high-level semantics. We shed light on how such distracted supervision targets prevent state-of-the-art SED methods from effectively using deep supervision to improve results. In this paper, we propose a novel fully convolutional neural network using diverse deep supervision (DDS) within a multi-task framework, where lower layers aim at generating category-agnostic edges while higher layers are responsible for the detection of category-aware semantic edges. To overcome the distracted supervision challenge, a novel information converter unit is introduced, whose effectiveness has been extensively evaluated on several popular benchmark datasets, including SBD, Cityscapes, and PASCAL VOC2012. Source code will be released upon paper acceptance. |
Tasks | Edge Detection, Object Proposal Generation, Object Recognition, Semantic Segmentation |
Published | 2018-04-09 |
URL | https://arxiv.org/abs/1804.02864v3 |
https://arxiv.org/pdf/1804.02864v3.pdf | |
PWC | https://paperswithcode.com/paper/semantic-edge-detection-with-diverse-deep |
Repo | https://github.com/arsenal9971/shearlet_semantic_edge |
Framework | pytorch |
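"Diverse" deep supervision amounts to attaching different targets at different depths: side outputs from lower layers are trained against binary, category-agnostic edges, while the top output is trained against per-category semantic edges. A hedged sketch of such a coupled loss; the information converter unit and backbone are omitted, and the shapes, class count, and plain BCE losses are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

B, K, H, W = 2, 20, 32, 32
binary_side = torch.randn(B, 1, H, W, requires_grad=True)   # low-layer logits
semantic_out = torch.randn(B, K, H, W, requires_grad=True)  # top-layer logits
binary_gt = torch.randint(0, 2, (B, 1, H, W)).float()       # agnostic edges
semantic_gt = torch.randint(0, 2, (B, K, H, W)).float()     # per-class edges

# Lower layers chase category-agnostic edges, the top layer category-aware
# ones; summing the two terms couples the distinct supervision targets.
loss = (F.binary_cross_entropy_with_logits(binary_side, binary_gt)
        + F.binary_cross_entropy_with_logits(semantic_out, semantic_gt))
loss.backward()
print(float(loss))
```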
Zero-Shot Sketch-Image Hashing
Title | Zero-Shot Sketch-Image Hashing |
Authors | Yuming Shen, Li Liu, Fumin Shen, Ling Shao |
Abstract | Recent studies show that large-scale sketch-based image retrieval (SBIR) can be efficiently tackled by cross-modal binary representation learning methods, where Hamming distance matching significantly speeds up the process of similarity search. Given training and test data restricted to a fixed set of pre-defined categories, cutting-edge SBIR and cross-modal hashing works obtain acceptable retrieval performance. However, most of the existing methods fail when the categories of query sketches have never been seen during training. In this paper, the above problem is framed as a novel but realistic zero-shot SBIR hashing task. We elaborate on the challenges of this special task and accordingly propose a zero-shot sketch-image hashing (ZSIH) model. An end-to-end three-network architecture is built, two of which are treated as the binary encoders. The third network mitigates the sketch-image heterogeneity and enhances the semantic relations among data by utilizing the Kronecker fusion layer and graph convolution, respectively. As an important part of ZSIH, we formulate a generative hashing scheme in reconstructing semantic knowledge representations for zero-shot retrieval. To the best of our knowledge, ZSIH is the first zero-shot hashing work suitable for SBIR and cross-modal search. Comprehensive experiments are conducted on two extended datasets, i.e., Sketchy and TU-Berlin, with a novel zero-shot train-test split. The proposed model remarkably outperforms related works. |
Tasks | Image Retrieval, Representation Learning, Sketch-Based Image Retrieval |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02284v1 |
http://arxiv.org/pdf/1803.02284v1.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-sketch-image-hashing |
Repo | https://github.com/ymcidence/Zero-Shot-Sketch-Image-Hashing |
Framework | tf |
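The Kronecker fusion the abstract names combines two modalities through an outer product of their feature vectors, giving a multiplicative interaction between every pair of dimensions. A minimal batched sketch; the feature sizes and the projection back to a compact code are assumptions.

```python
import torch

torch.manual_seed(0)
B, d1, d2, d_out = 4, 16, 16, 32
f_sketch = torch.randn(B, d1)            # sketch features (assumed encoder)
f_image = torch.randn(B, d2)             # image features (assumed encoder)

# Batched Kronecker fusion: per-sample outer product, flattened into one
# long interaction vector, then projected down for further processing.
fused = torch.einsum('bi,bj->bij', f_sketch, f_image).reshape(B, d1 * d2)
proj = torch.nn.Linear(d1 * d2, d_out)
h = torch.tanh(proj(fused))
print(h.shape)  # torch.Size([4, 32])
```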
Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction
Title | Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction |
Authors | Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H. S. Torr |
Abstract | In this paper, we aim at solving pixel-wise binary problems, including salient object segmentation, skeleton extraction, and edge detection, by introducing a unified architecture. Previous works have proposed tailored methods for solving each of the three tasks independently. Here, we show that these tasks share some similarities that can be exploited for developing a unified framework. In particular, we introduce a horizontal cascade, each component of which is densely connected to the outputs of the previous components. Stringing these components together allows us to exploit features across different levels hierarchically and effectively address multiple pixel-wise binary regression tasks. To assess the performance of our proposed network on these tasks, we carry out exhaustive evaluations on multiple representative datasets. Although these tasks are inherently very different, we show that our unified approach performs very well on all of them and works far better than current single-purpose state-of-the-art methods. All the code in this paper will be publicly available. |
Tasks | Edge Detection, Semantic Segmentation |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.09860v2 |
http://arxiv.org/pdf/1803.09860v2.pdf | |
PWC | https://paperswithcode.com/paper/three-birds-one-stone-a-unified-framework-for |
Repo | https://github.com/shawnyuen/ContourDetectPaperCollection |
Framework | none |
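The horizontal cascade with dense connectivity means each stage consumes the concatenated outputs of every stage before it. A toy sketch of that wiring, with 1x1 convolutions standing in for the real components; channel counts and the number of stages are arbitrary.

```python
import torch
import torch.nn as nn

class HorizontalCascade(nn.Module):
    """Each stage sees the concatenation of the input and all previous
    stage outputs (dense connections along the cascade)."""
    def __init__(self, ch=8, stages=3):
        super().__init__()
        self.stages = nn.ModuleList(
            [nn.Conv2d(ch * (i + 1), ch, kernel_size=1) for i in range(stages)])

    def forward(self, x):
        outs = [x]
        for stage in self.stages:
            outs.append(torch.relu(stage(torch.cat(outs, dim=1))))
        return outs[-1]

net = HorizontalCascade()
print(net(torch.rand(1, 8, 32, 32)).shape)  # torch.Size([1, 8, 32, 32])
```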
Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss
Title | Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss |
Authors | Qi Dou, Cheng Ouyang, Cheng Chen, Hao Chen, Pheng-Ann Heng |
Abstract | Convolutional networks (ConvNets) have achieved great successes in various challenging vision tasks. However, the performance of ConvNets degrades when encountering domain shift. Domain adaptation is especially significant, yet challenging, in the field of biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not quite optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features which are aligned with the source-domain feature space. A domain critic module (DCM) is set up for discriminating the feature space of both domains. We optimize the DAM and DCM via an adversarial loss without using any target domain label. Our proposed method is validated by adapting a ConvNet trained with MRI images to unpaired CT data for cardiac structure segmentation, and achieves very promising results. |
Tasks | Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2018-04-29 |
URL | http://arxiv.org/abs/1804.10916v2 |
http://arxiv.org/pdf/1804.10916v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-cross-modality-domain-adaptation |
Repo | https://github.com/carrenD/Med-CMDA |
Framework | tf |
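The DAM/DCM game described above is a standard adversarial loop: the critic learns to tell source features from adapted target features, and the adaptation module learns to fool it, with no target labels involved. A hedged sketch; the real model operates on dilated-FCN feature maps, whereas this uses plain vectors, small MLPs, and a BCE critic, all of which are assumptions.

```python
import torch
import torch.nn as nn

d = 32
dam = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))   # adapter
dcm = nn.Sequential(nn.Linear(d, 16), nn.ReLU(), nn.Linear(16, 1))  # critic
bce = nn.BCEWithLogitsLoss()
opt_dam = torch.optim.Adam(dam.parameters(), lr=1e-3)
opt_dcm = torch.optim.Adam(dcm.parameters(), lr=1e-3)

src_feat = torch.randn(8, d)             # frozen source-domain (MRI) features
tgt_in = torch.randn(8, d)               # target-domain (CT) inputs

for _ in range(100):
    # 1) critic: separate source features from adapted target features
    tgt_feat = dam(tgt_in).detach()
    loss_dcm = (bce(dcm(src_feat), torch.ones(8, 1))
                + bce(dcm(tgt_feat), torch.zeros(8, 1)))
    opt_dcm.zero_grad(); loss_dcm.backward(); opt_dcm.step()
    # 2) adapter: make target features indistinguishable from source ones
    loss_dam = bce(dcm(dam(tgt_in)), torch.ones(8, 1))
    opt_dam.zero_grad(); loss_dam.backward(); opt_dam.step()
```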
Semi-unsupervised Learning of Human Activity using Deep Generative Models
Title | Semi-unsupervised Learning of Human Activity using Deep Generative Models |
Authors | Matthew Willetts, Aiden Doherty, Stephen Roberts, Chris Holmes |
Abstract | We introduce ‘semi-unsupervised learning’, a problem regime related to transfer learning and zero-shot learning where, in the training data, some classes are sparsely labelled and others entirely unlabelled. Models able to learn from training data of this type are potentially of great use as many real-world datasets are like this. Here we demonstrate a new deep generative model for classification in this regime. Our model, a Gaussian mixture deep generative model, demonstrates superior semi-unsupervised classification performance on MNIST to model M2 from Kingma and Welling (2014). We apply the model to human accelerometer data, performing activity classification and structure discovery on windows of time series data. |
Tasks | Time Series, Transfer Learning, Zero-Shot Learning |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12176v2 |
http://arxiv.org/pdf/1810.12176v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-unsupervised-learning-of-human-activity |
Repo | https://github.com/MatthewWilletts/GM-DGM |
Framework | tf |
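"Semi-unsupervised" here means the labels are missing in a structured way: some classes carry a few labels, others carry none at all. A small sketch of constructing such a split on toy data; which classes are sparsely labelled and the label fraction are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 10, size=1000)        # true classes 0..9 (toy data)
sparsely_labelled = {0, 1, 2, 3, 4}       # classes that keep a few labels
label_frac = 0.05                         # fraction of those kept labelled

observed = np.full(1000, -1)              # -1 marks "unlabelled"
for c in sparsely_labelled:
    idx = np.flatnonzero(y == c)
    keep = rng.choice(idx, size=max(1, int(label_frac * idx.size)),
                      replace=False)
    observed[keep] = c                    # classes 5..9 stay fully unlabelled
print((observed >= 0).mean())             # ~2.5% of points carry a label
```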
iW-Net: an automatic and minimalistic interactive lung nodule segmentation deep network
Title | iW-Net: an automatic and minimalistic interactive lung nodule segmentation deep network |
Authors | Guilherme Aresta, Colin Jacobs, Teresa Araújo, António Cunha, Isabel Ramos, Bram van Ginneken, Aurélio Campilho |
Abstract | We propose iW-Net, a deep learning model that allows for both automatic and interactive segmentation of lung nodules in computed tomography images. iW-Net is composed of two blocks: the first one provides an automatic segmentation, and the second one allows correcting it by analyzing two points introduced by the user on the nodule’s boundary. For this purpose, a physics-inspired weight map that takes the user input into account is proposed, which is used both as a feature map and in the system’s loss function. Our approach is extensively evaluated on the public LIDC-IDRI dataset, where we achieve a state-of-the-art performance of 0.55 intersection over union vs the 0.59 inter-observer agreement. Also, we show that iW-Net allows correcting the segmentation of small nodules, which is essential for proper patient referral decisions, as well as improving the segmentation of challenging non-solid nodules, and thus may be an important tool for improving the early diagnosis of lung cancer. |
Tasks | Interactive Segmentation, Lung Nodule Segmentation |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12789v1 |
http://arxiv.org/pdf/1811.12789v1.pdf | |
PWC | https://paperswithcode.com/paper/iw-net-an-automatic-and-minimalistic |
Repo | https://github.com/gmaresta/iW-Net |
Framework | none |
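The interactive block turns the user's two boundary clicks into a weight map that is fed in as an extra channel and reused in the loss. The paper derives a physics-inspired map; the sketch below substitutes simple Gaussian bumps at the two clicked points, purely to illustrate the interface, and the bump form and sigma are assumptions.

```python
import torch

def click_weight_map(h, w, p1, p2, sigma=4.0):
    """Illustrative stand-in for iW-Net's weight map: Gaussian bumps
    centred on the user's two boundary points (not the paper's formula)."""
    ys = torch.arange(h).float().unsqueeze(1)   # (h, 1)
    xs = torch.arange(w).float().unsqueeze(0)   # (1, w)
    def bump(p):
        return torch.exp(-((ys - p[0]) ** 2 + (xs - p[1]) ** 2)
                         / (2 * sigma ** 2))
    return (bump(p1) + bump(p2)).clamp(max=1.0)  # (h, w) extra input channel

wmap = click_weight_map(64, 64, (20, 22), (44, 40))
print(wmap.shape, float(wmap.max()))
```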
Modeling Multi-turn Conversation with Deep Utterance Aggregation
Title | Modeling Multi-turn Conversation with Deep Utterance Aggregation |
Authors | Zhuosheng Zhang, Jiangtong Li, Pengfei Zhu, Hai Zhao, Gongshen Liu |
Abstract | Multi-turn conversation understanding is a major challenge for building intelligent dialogue systems. This work focuses on retrieval-based response matching for multi-turn conversation; prior work simply concatenates the conversation utterances, ignoring the interactions among previous utterances for context modeling. In this paper, we formulate previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained context representation. In detail, a self-matching attention is first introduced to route the vital information in each utterance. Then the model matches a response with each refined utterance, and the final matching score is obtained after attentively aggregating the turns. Experimental results show our model outperforms the state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus. |
Tasks | Conversational Response Selection |
Published | 2018-06-24 |
URL | http://arxiv.org/abs/1806.09102v2 |
http://arxiv.org/pdf/1806.09102v2.pdf | |
PWC | https://paperswithcode.com/paper/modeling-multi-turn-conversation-with-deep |
Repo | https://github.com/cooelf/DeepUtteranceAggregation |
Framework | none |
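Self-matching attention lets an utterance attend over its own tokens to route the vital information before matching against the response. A bare-bones sketch; the single-head dot-product form, dimensions, and the crude matching score are assumptions, and the full model adds turn-level aggregation on top.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
T, d = 7, 16                               # tokens per utterance, hidden size
U = torch.randn(T, d)                      # one utterance's token vectors

# Self-matching: the utterance attends over itself; a residual connection
# keeps the original content alongside the routed information.
scores = U @ U.t() / d ** 0.5              # (T, T) dot-product affinities
refined = U + F.softmax(scores, dim=-1) @ U

response = torch.randn(d)                  # encoded response (assumed)
match_score = float((refined @ response).mean())  # toy utterance-response match
print(match_score)
```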
Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation
Title | Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation |
Authors | Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz |
Abstract | We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing. PWC-Net is 17 times smaller in size, 2 times faster in inference, and 11% more accurate on Sintel final than the recent FlowNet2 model. It is the winning entry in the optical flow competition of the Robust Vision Challenge. Next, we experimentally analyze the sources of our performance gains. In particular, we use the same training procedure of PWC-Net to retrain FlowNetC, a sub-network of FlowNet2. The retrained FlowNetC is 56% more accurate on Sintel final than the previously trained one and even 5% more accurate than the FlowNet2 model. We further improve the training procedure and increase the accuracy of PWC-Net on Sintel by 10% and on KITTI 2012 and 2015 by 20%. Our newly trained model parameters and training protocols will be available at https://github.com/NVlabs/PWC-Net |
Tasks | Optical Flow Estimation |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05571v1 |
http://arxiv.org/pdf/1809.05571v1.pdf | |
PWC | https://paperswithcode.com/paper/models-matter-so-does-training-an-empirical |
Repo | https://github.com/NVlabs/PWC-Net |
Framework | pytorch |
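Of the three principles named above, the cost volume is the most self-contained: for each pixel, correlate its feature vector with features within a small search window of the (warped) second image. A compact sketch with a tiny displacement range; the feature maps and window size here are illustrative, not PWC-Net's actual settings.

```python
import torch
import torch.nn.functional as F

def cost_volume(f1, f2, max_disp=2):
    """Correlation cost volume: for every displacement (dy, dx) within
    max_disp, the per-pixel feature correlation between f1 and shifted f2."""
    B, C, H, W = f1.shape
    f2p = F.pad(f2, [max_disp] * 4)          # pad left/right/top/bottom
    vols = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = f2p[:, :, dy:dy + H, dx:dx + W]
            vols.append((f1 * shifted).mean(dim=1, keepdim=True))
    return torch.cat(vols, dim=1)            # (B, (2*max_disp+1)^2, H, W)

f1, f2 = torch.rand(1, 8, 16, 16), torch.rand(1, 8, 16, 16)
print(cost_volume(f1, f2).shape)             # torch.Size([1, 25, 16, 16])
```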
mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion
Title | mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion |
Authors | Emmi Jokinen, Markus Heinonen, Harri Lähdesmäki |
Abstract | Proteins are commonly used in the biochemical industry for numerous processes. Refining these proteins’ properties via mutations also affects their stability. Accurate computational methods to predict how mutations affect protein stability are necessary to facilitate efficient protein design. However, the accuracy of predictive models is ultimately constrained by the limited availability of experimental data. We have developed mGPfusion, a novel Gaussian process (GP) method for predicting a protein’s stability changes upon single and multiple mutations. This method complements the limited experimental data with large amounts of molecular simulation data. We introduce a Bayesian data fusion model that re-calibrates the experimental and in silico data sources and then learns a predictive GP model from the combined data. Our protein-specific model requires experimental data only regarding the protein of interest and performs well even with few experimental measurements. The mGPfusion models proteins by contact maps and infers the stability effects caused by mutations with a mixture of graph kernels. Our results show that mGPfusion outperforms state-of-the-art methods in predicting protein stability on a dataset of 15 different proteins and that incorporating molecular simulation data improves the model learning and prediction accuracy. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02852v2 |
http://arxiv.org/pdf/1802.02852v2.pdf | |
PWC | https://paperswithcode.com/paper/mgpfusion-predicting-protein-stability |
Repo | https://github.com/emmijokinen/mgpfusion |
Framework | none |
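The fusion idea can be shown in a few lines of GP regression: stack experimental and simulated measurements into one training set, but give each source its own noise variance so the cheap simulated data is down-weighted. A toy sketch with an RBF kernel; the paper's graph-kernel mixtures over contact maps and the Bayesian recalibration are omitted, and all numbers here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, ls=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

x_exp = rng.uniform(0, 5, 5)               # few, precise experimental points
x_sim = rng.uniform(0, 5, 40)              # many, noisy simulated points
f = lambda x: np.sin(x)                    # stand-in "stability" function
y = np.concatenate([f(x_exp) + 0.05 * rng.standard_normal(5),
                    f(x_sim) + 0.50 * rng.standard_normal(40)])
x = np.concatenate([x_exp, x_sim])

# Per-source noise on the diagonal: experiments trusted, simulations less so.
noise = np.concatenate([np.full(5, 0.05 ** 2), np.full(40, 0.50 ** 2)])
K = rbf(x, x) + np.diag(noise)
x_test = np.linspace(0, 5, 7)
mean = rbf(x_test, x) @ np.linalg.solve(K, y)   # GP posterior mean
print(np.round(mean, 2))
```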
Visual Domain Adaptation with Manifold Embedded Distribution Alignment
Title | Visual Domain Adaptation with Manifold Embedded Distribution Alignment |
Authors | Jindong Wang, Wenjie Feng, Yiqiang Chen, Han Yu, Meiyu Huang, Philip S. Yu |
Abstract | Visual domain adaptation aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain. Existing methods either attempt to align the cross-domain distributions, or perform manifold subspace learning. However, there are two significant challenges: (1) degenerated feature transformation, which means that distribution alignment is often performed in the original feature space, where feature distortions are hard to overcome; on the other hand, subspace learning is not sufficient to reduce the distribution divergence. (2) unevaluated distribution alignment, which means that existing distribution alignment methods align the marginal and conditional distributions with equal importance, failing to account for their different importance in real applications. In this paper, we propose a Manifold Embedded Distribution Alignment (MEDA) approach to address these challenges. MEDA learns a domain-invariant classifier in a Grassmann manifold with structural risk minimization, while performing dynamic distribution alignment to quantitatively account for the relative importance of marginal and conditional distributions. To the best of our knowledge, MEDA is the first attempt to perform dynamic distribution alignment for manifold domain adaptation. Extensive experiments demonstrate that MEDA shows significant improvements in classification accuracy compared to state-of-the-art traditional and deep methods. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07258v2 |
http://arxiv.org/pdf/1807.07258v2.pdf | |
PWC | https://paperswithcode.com/paper/visual-domain-adaptation-with-manifold |
Repo | https://github.com/jindongwang/transferlearning |
Framework | pytorch |
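The dynamic distribution alignment reduces to a weighted sum of a marginal divergence and per-class conditional divergences, with an adaptive factor mu trading them off. A sketch using linear-kernel MMD as the divergence; the real method estimates mu from data, works in the Grassmann manifold, and uses classifier pseudo-labels for the target, whereas here mu is fixed and the labels are random.

```python
import numpy as np

def mmd(a, b):
    """Linear-kernel MMD: squared distance between feature means."""
    return float(np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2))

rng = np.random.default_rng(0)
Xs, ys = rng.normal(0, 1, (100, 5)), rng.integers(0, 3, 100)
Xt, yt = rng.normal(0.5, 1, (100, 5)), rng.integers(0, 3, 100)  # pseudo-labels

mu = 0.7                                   # adaptive in MEDA; fixed here
marginal = mmd(Xs, Xt)
conditional = sum(mmd(Xs[ys == c], Xt[yt == c]) for c in range(3))
alignment = (1 - mu) * marginal + mu * conditional
print(round(alignment, 3))
```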
A Multi-task Deep Learning Architecture for Maritime Surveillance using AIS Data Streams
Title | A Multi-task Deep Learning Architecture for Maritime Surveillance using AIS Data Streams |
Authors | Duong Nguyen, Rodolphe Vadaine, Guillaume Hajduch, René Garello, Ronan Fablet |
Abstract | In a world of global trading, maritime safety, security and efficiency are crucial issues. We propose a multi-task deep learning framework for vessel monitoring using Automatic Identification System (AIS) data streams. We combine recurrent neural networks with latent variable modeling and an embedding of AIS messages to a new representation space to jointly address key issues to be dealt with when considering AIS data streams: massive amounts of streaming data, noisy data, and irregular time sampling. We demonstrate the relevance of the proposed deep learning framework on real AIS datasets for a three-task setting, namely trajectory reconstruction, anomaly detection and vessel type identification. |
Tasks | Anomaly Detection |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.03972v3 |
http://arxiv.org/pdf/1806.03972v3.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-task-deep-learning-architecture-for |
Repo | https://github.com/dnguyengithub/MultitaskAIS |
Framework | tf |
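The embedding of AIS messages mentioned above maps each message into a new representation space; this line of work is associated with a "four-hot" binning of latitude, longitude, speed and course, and the sketch below shows that style of encoding under assumed bin counts and value ranges. Treat the specifics as illustrative rather than the paper's exact scheme.

```python
import numpy as np

def four_hot(lat, lon, sog, cog, bins=(50, 50, 30, 36)):
    """Encode one AIS message as concatenated one-hot bins for latitude,
    longitude, speed-over-ground and course-over-ground (bins assumed)."""
    ranges = [(-90, 90), (-180, 180), (0, 30), (0, 360)]
    out = []
    for v, n, (lo, hi) in zip((lat, lon, sog, cog), bins, ranges):
        onehot = np.zeros(n)
        onehot[min(int((v - lo) / (hi - lo) * n), n - 1)] = 1.0
        out.append(onehot)
    return np.concatenate(out)             # length 50+50+30+36 = 166

msg = four_hot(lat=48.1, lon=-4.5, sog=12.3, cog=255.0)
print(msg.shape, msg.sum())                # (166,) 4.0
```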
Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation
Title | Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation |
Authors | Panagiotis Meletis, Gijs Dubbelman |
Abstract | We propose a convolutional network with hierarchical classifiers for per-pixel semantic segmentation, which can be trained on multiple, heterogeneous datasets and exploit their semantic hierarchy. Our network is the first to be simultaneously trained on three different datasets from the intelligent vehicles domain, i.e. Cityscapes, GTSDB and Mapillary Vistas, and is able to handle different semantic levels of detail, class imbalances, and different annotation types, i.e. dense per-pixel and sparse bounding-box labels. We assess our hierarchical approach by comparing against flat, non-hierarchical classifiers, and show improvements in mean pixel accuracy of 13.0% for Cityscapes classes, 2.4% for Vistas classes, and 32.3% for GTSDB classes. Our implementation achieves inference rates of 17 fps at a resolution of 520x706 for 108 classes running on a GPU. |
Tasks | Semantic Segmentation |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.05675v2 |
http://arxiv.org/pdf/1803.05675v2.pdf | |
PWC | https://paperswithcode.com/paper/training-of-convolutional-networks-on |
Repo | https://github.com/pmeletis/IV2018-hierarchical-semantic-segmentation-for-heterogeneous-datasets |
Framework | tf |
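Hierarchical classifiers here mean a root classifier over high-level classes plus per-branch subclassifiers, so datasets annotated at different levels of detail each supervise the level they define. A toy sketch of the two-level decision rule; the class structure, shapes, and random weights are all assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 16
feat = torch.randn(d)                      # per-pixel feature (assumed)

W_root = torch.randn(3, d)                 # e.g. {road, vehicle, sign}
W_sub = {1: torch.randn(4, d),             # vehicle -> {car, bus, truck, bike}
         2: torch.randn(5, d)}             # sign -> 5 traffic-sign subclasses

p_root = F.softmax(W_root @ feat, dim=0)
super_cls = int(torch.argmax(p_root))
if super_cls in W_sub:                     # refine only where a subtree exists
    p_sub = F.softmax(W_sub[super_cls] @ feat, dim=0)
    # joint leaf probability = p(superclass) * p(subclass | superclass)
    print(super_cls, int(torch.argmax(p_sub)),
          float(p_root[super_cls] * p_sub.max()))
else:
    print(super_cls, float(p_root[super_cls]))
```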
Jump to better conclusions: SCAN both left and right
Title | Jump to better conclusions: SCAN both left and right |
Authors | Joost Bastings, Marco Baroni, Jason Weston, Kyunghyun Cho, Douwe Kiela |
Abstract | Lake and Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the strong generalization abilities of recurrent sequence-to-sequence models. Their initial experiments suggested that such models may fail because they lack the ability to extract systematic rules. Here, we take a closer look at SCAN and show that it does not always capture the kind of generalization that it was designed for. To mitigate this, we propose NACS, a complementary dataset that requires mapping actions back to the original commands. We show that models that do well on SCAN do not necessarily do well on NACS, and that NACS exhibits properties more closely aligned with realistic use cases for sequence-to-sequence models. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04640v1 |
http://arxiv.org/pdf/1809.04640v1.pdf | |
PWC | https://paperswithcode.com/paper/jump-to-better-conclusions-scan-both-left-and |
Repo | https://github.com/facebookresearch/NACS |
Framework | pytorch |
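NACS simply reverses the mapping direction of SCAN: instead of predicting action sequences from commands, models must map action sequences back to commands, which puts the compositional structure on the target side. A sketch of building such reversed pairs from SCAN-style examples; the pairs follow SCAN's published grammar but should be treated as illustrative.

```python
# Build NACS-style examples by reversing SCAN's (command -> actions) pairs
# so the model must map actions back to commands.
scan_pairs = [
    ("jump", "JUMP"),
    ("jump twice", "JUMP JUMP"),
    ("walk left", "LTURN WALK"),
]
nacs_pairs = [(actions, command) for command, actions in scan_pairs]
for src, tgt in nacs_pairs:
    print(f"{src!r} -> {tgt!r}")
```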