Paper Group AWR 201
Learning Heuristics for Quantified Boolean Formulas through Deep Reinforcement Learning
Title | Learning Heuristics for Quantified Boolean Formulas through Deep Reinforcement Learning |
Authors | Gil Lederman, Markus N. Rabe, Edward A. Lee, Sanjit A. Seshia |
Abstract | We demonstrate how to learn efficient heuristics for automated reasoning algorithms for quantified Boolean formulas through deep reinforcement learning. We focus on a backtracking search algorithm, which can already solve formulas of impressive size, with up to hundreds of thousands of variables. The main challenge is to find a representation of these formulas that lends itself to making predictions in a scalable way. For a family of challenging problems, we learned a heuristic that solves significantly more formulas than the existing handwritten heuristics. |
Tasks | |
Published | 2018-07-20 |
URL | https://arxiv.org/abs/1807.08058v3 |
https://arxiv.org/pdf/1807.08058v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-heuristics-for-automated-reasoning |
Repo | https://github.com/MarkusRabe/cadet |
Framework | none |
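The key representation question the abstract raises is how to turn a formula into something a network can score. A minimal PyTorch sketch of that idea follows: one round of message passing on a literal-clause incidence graph, then a linear head that scores variables for branching. The embedding size, random weights, literal ordering, and scoring head are all illustrative assumptions, not the authors' architecture.

```python
import torch

torch.manual_seed(0)
n_lits, n_clauses, d = 6, 4, 8           # 3 variables -> 6 literals
inc = torch.zeros(n_lits, n_clauses)     # incidence: literal i in clause j
inc[0, 0] = inc[3, 0] = inc[2, 1] = inc[5, 1] = inc[1, 2] = inc[4, 3] = 1.0

lit_emb = torch.randn(n_lits, d)         # initial literal embeddings
W_c, W_l = torch.randn(d, d), torch.randn(d, d)

clause_msg = inc.t() @ lit_emb @ W_c     # clauses aggregate their literals
lit_msg = inc @ clause_msg @ W_l         # literals aggregate their clauses
lit_emb = torch.relu(lit_emb + lit_msg)  # one message-passing round

# A variable's features combine its positive (even index) and negative
# (odd index) literal; the highest-scoring variable is branched on.
score_head = torch.randn(2 * d)
var_emb = torch.cat([lit_emb[0::2], lit_emb[1::2]], dim=1)
print("branch on variable", int(torch.argmax(var_emb @ score_head)))
```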
Unsupervised Attention-guided Image to Image Translation
Title | Unsupervised Attention-guided Image to Image Translation |
Authors | Youssef A. Mejjati, Christian Richardt, James Tompkin, Darren Cosker, Kwang In Kim |
Abstract | Current unsupervised image-to-image translation techniques struggle to focus their attention on individual objects without altering the background or the way multiple objects interact within a scene. Motivated by the important role of attention in human perception, we tackle this limitation by introducing unsupervised attention mechanisms that are jointly adversarially trained with the generators and discriminators. We demonstrate qualitatively and quantitatively that our approach is able to attend to relevant regions in the image without requiring supervision, and that by doing so it achieves more realistic mappings compared to recent approaches. |
Tasks | Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02311v3 |
http://arxiv.org/pdf/1806.02311v3.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-attention-guided-image-to-image |
Repo | https://github.com/AlamiMejjati/Unsupervised-Attention-guided-Image-to-Image-Translation |
Framework | tf |
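The composition step the abstract implies is simple to state: a learned attention mask gates which pixels the generator may change, so the output is a per-pixel blend of the generator's translation and the untouched input. A minimal sketch under assumed shapes; the attention and generator networks are trained adversarially, which this omits entirely.

```python
import torch

def attention_blend(x, gen_out, attn_logits):
    """Attended regions come from the generator; the rest is copied from
    the input, leaving the background unaltered (sketch of the composition)."""
    a = torch.sigmoid(attn_logits)        # (B,1,H,W) soft mask in [0,1]
    return a * gen_out + (1.0 - a) * x    # broadcast over the channel dim

x = torch.rand(2, 3, 64, 64)              # input images
gen_out = torch.rand(2, 3, 64, 64)        # generator output (assumed)
attn_logits = torch.randn(2, 1, 64, 64)   # attention-network output (assumed)
print(attention_blend(x, gen_out, attn_logits).shape)  # (2, 3, 64, 64)
```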
Semantic Edge Detection with Diverse Deep Supervision
Title | Semantic Edge Detection with Diverse Deep Supervision |
Authors | Yun Liu, Ming-Ming Cheng, Deng-Ping Fan, Le Zhang, JiaWang Bian, Dacheng Tao |
Abstract | Semantic edge detection (SED), which aims at jointly extracting edges as well as their category information, has far-reaching applications in domains such as semantic segmentation, object proposal generation, and object recognition. SED naturally requires achieving two distinct supervision targets: locating fine detailed edges and identifying high-level semantics. We shed light on how such distracted supervision targets prevent state-of-the-art SED methods from effectively using deep supervision to improve results. In this paper, we propose a novel fully convolutional neural network using diverse deep supervision (DDS) within a multi-task framework, where lower layers aim at generating category-agnostic edges while higher layers are responsible for the detection of category-aware semantic edges. To overcome the distracted supervision challenge, a novel information converter unit is introduced, whose effectiveness has been extensively evaluated on several popular benchmark datasets, including SBD, Cityscapes, and PASCAL VOC2012. Source code will be released upon paper acceptance. |
Tasks | Edge Detection, Object Proposal Generation, Object Recognition, Semantic Segmentation |
Published | 2018-04-09 |
URL | https://arxiv.org/abs/1804.02864v3 |
https://arxiv.org/pdf/1804.02864v3.pdf | |
PWC | https://paperswithcode.com/paper/semantic-edge-detection-with-diverse-deep |
Repo | https://github.com/arsenal9971/shearlet_semantic_edge |
Framework | pytorch |
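"Diverse" deep supervision amounts to attaching different targets at different depths: side outputs from lower layers are trained against binary, category-agnostic edges, while the top output is trained against per-category semantic edges. A hedged sketch of such a coupled loss; the information converter unit and backbone are omitted, and the shapes, class count, and plain BCE losses are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

B, K, H, W = 2, 20, 32, 32
binary_side = torch.randn(B, 1, H, W, requires_grad=True)   # low-layer logits
semantic_out = torch.randn(B, K, H, W, requires_grad=True)  # top-layer logits
binary_gt = torch.randint(0, 2, (B, 1, H, W)).float()       # agnostic edges
semantic_gt = torch.randint(0, 2, (B, K, H, W)).float()     # per-class edges

# Lower layers chase category-agnostic edges, the top layer category-aware
# ones; summing the two terms couples the distinct supervision targets.
loss = (F.binary_cross_entropy_with_logits(binary_side, binary_gt)
        + F.binary_cross_entropy_with_logits(semantic_out, semantic_gt))
loss.backward()
print(float(loss))
```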
Zero-Shot Sketch-Image Hashing
Title | Zero-Shot Sketch-Image Hashing |
Authors | Yuming Shen, Li Liu, Fumin Shen, Ling Shao |
Abstract | Recent studies show that large-scale sketch-based image retrieval (SBIR) can be efficiently tackled by cross-modal binary representation learning methods, where Hamming distance matching significantly speeds up the process of similarity search. Given training and test data restricted to a fixed set of pre-defined categories, cutting-edge SBIR and cross-modal hashing works obtain acceptable retrieval performance. However, most of the existing methods fail when the categories of query sketches have never been seen during training. In this paper, the above problem is framed as a novel but realistic zero-shot SBIR hashing task. We elaborate on the challenges of this special task and accordingly propose a zero-shot sketch-image hashing (ZSIH) model. An end-to-end three-network architecture is built, two of which are treated as the binary encoders. The third network mitigates the sketch-image heterogeneity and enhances the semantic relations among data by utilizing the Kronecker fusion layer and graph convolution, respectively. As an important part of ZSIH, we formulate a generative hashing scheme in reconstructing semantic knowledge representations for zero-shot retrieval. To the best of our knowledge, ZSIH is the first zero-shot hashing work suitable for SBIR and cross-modal search. Comprehensive experiments are conducted on two extended datasets, i.e., Sketchy and TU-Berlin, with a novel zero-shot train-test split. The proposed model remarkably outperforms related works. |
Tasks | Image Retrieval, Representation Learning, Sketch-Based Image Retrieval |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02284v1 |
http://arxiv.org/pdf/1803.02284v1.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-sketch-image-hashing |
Repo | https://github.com/ymcidence/Zero-Shot-Sketch-Image-Hashing |
Framework | tf |
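The Kronecker fusion the abstract names combines two modalities through an outer product of their feature vectors, giving a multiplicative interaction between every pair of dimensions. A minimal batched sketch; the feature sizes and the projection back to a compact code are assumptions.

```python
import torch

torch.manual_seed(0)
B, d1, d2, d_out = 4, 16, 16, 32
f_sketch = torch.randn(B, d1)            # sketch features (assumed encoder)
f_image = torch.randn(B, d2)             # image features (assumed encoder)

# Batched Kronecker fusion: per-sample outer product, flattened into one
# long interaction vector, then projected down for further processing.
fused = torch.einsum('bi,bj->bij', f_sketch, f_image).reshape(B, d1 * d2)
proj = torch.nn.Linear(d1 * d2, d_out)
h = torch.tanh(proj(fused))
print(h.shape)  # torch.Size([4, 32])
```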
Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction
Title | Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction |
Authors | Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H. S. Torr |
Abstract | In this paper, we aim at solving pixel-wise binary problems, including salient object segmentation, skeleton extraction, and edge detection, by introducing a unified architecture. Previous works have proposed tailored methods for solving each of the three tasks independently. Here, we show that these tasks share some similarities that can be exploited for developing a unified framework. In particular, we introduce a horizontal cascade, each component of which is densely connected to the outputs of the previous components. Stringing these components together allows us to exploit features across different levels hierarchically and effectively address multiple pixel-wise binary regression tasks. To assess the performance of our proposed network on these tasks, we carry out exhaustive evaluations on multiple representative datasets. Although these tasks are inherently very different, we show that our unified approach performs very well on all of them and works far better than current single-purpose state-of-the-art methods. All the code in this paper will be publicly available. |
Tasks | Edge Detection, Semantic Segmentation |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.09860v2 |
http://arxiv.org/pdf/1803.09860v2.pdf | |
PWC | https://paperswithcode.com/paper/three-birds-one-stone-a-unified-framework-for |
Repo | https://github.com/shawnyuen/ContourDetectPaperCollection |
Framework | none |
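The horizontal cascade with dense connectivity means each stage consumes the concatenated outputs of every stage before it. A toy sketch of that wiring, with 1x1 convolutions standing in for the real components; channel counts and the number of stages are arbitrary.

```python
import torch
import torch.nn as nn

class HorizontalCascade(nn.Module):
    """Each stage sees the concatenation of the input and all previous
    stage outputs (dense connections along the cascade)."""
    def __init__(self, ch=8, stages=3):
        super().__init__()
        self.stages = nn.ModuleList(
            [nn.Conv2d(ch * (i + 1), ch, kernel_size=1) for i in range(stages)])

    def forward(self, x):
        outs = [x]
        for stage in self.stages:
            outs.append(torch.relu(stage(torch.cat(outs, dim=1))))
        return outs[-1]

net = HorizontalCascade()
print(net(torch.rand(1, 8, 32, 32)).shape)  # torch.Size([1, 8, 32, 32])
```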
Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss
Title | Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss |
Authors | Qi Dou, Cheng Ouyang, Cheng Chen, Hao Chen, Pheng-Ann Heng |
Abstract | Convolutional networks (ConvNets) have achieved great successes in various challenging vision tasks. However, the performance of ConvNets degrades when encountering domain shift. Domain adaptation is especially significant, yet challenging, in the field of biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not quite optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features which are aligned with the source-domain feature space. A domain critic module (DCM) is set up for discriminating the feature space of both domains. We optimize the DAM and DCM via an adversarial loss without using any target domain label. Our proposed method is validated by adapting a ConvNet trained with MRI images to unpaired CT data for cardiac structure segmentation, and achieves very promising results. |
Tasks | Domain Adaptation, Transfer Learning, Unsupervised Domain Adaptation |
Published | 2018-04-29 |
URL | http://arxiv.org/abs/1804.10916v2 |
http://arxiv.org/pdf/1804.10916v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-cross-modality-domain-adaptation |
Repo | https://github.com/carrenD/Med-CMDA |
Framework | tf |
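The DAM/DCM game described above is a standard adversarial loop: the critic learns to tell source features from adapted target features, and the adaptation module learns to fool it, with no target labels involved. A hedged sketch; the real model operates on dilated-FCN feature maps, whereas this uses plain vectors, small MLPs, and a BCE critic, all of which are assumptions.

```python
import torch
import torch.nn as nn

d = 32
dam = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))   # adapter
dcm = nn.Sequential(nn.Linear(d, 16), nn.ReLU(), nn.Linear(16, 1))  # critic
bce = nn.BCEWithLogitsLoss()
opt_dam = torch.optim.Adam(dam.parameters(), lr=1e-3)
opt_dcm = torch.optim.Adam(dcm.parameters(), lr=1e-3)

src_feat = torch.randn(8, d)             # frozen source-domain (MRI) features
tgt_in = torch.randn(8, d)               # target-domain (CT) inputs

for _ in range(100):
    # 1) critic: separate source features from adapted target features
    tgt_feat = dam(tgt_in).detach()
    loss_dcm = (bce(dcm(src_feat), torch.ones(8, 1))
                + bce(dcm(tgt_feat), torch.zeros(8, 1)))
    opt_dcm.zero_grad(); loss_dcm.backward(); opt_dcm.step()
    # 2) adapter: make target features indistinguishable from source ones
    loss_dam = bce(dcm(dam(tgt_in)), torch.ones(8, 1))
    opt_dam.zero_grad(); loss_dam.backward(); opt_dam.step()
```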
Semi-unsupervised Learning of Human Activity using Deep Generative Models
Title | Semi-unsupervised Learning of Human Activity using Deep Generative Models |
Authors | Matthew Willetts, Aiden Doherty, Stephen Roberts, Chris Holmes |
Abstract | We introduce ‘semi-unsupervised learning’, a problem regime related to transfer learning and zero-shot learning where, in the training data, some classes are sparsely labelled and others entirely unlabelled. Models able to learn from training data of this type are potentially of great use as many real-world datasets are like this. Here we demonstrate a new deep generative model for classification in this regime. Our model, a Gaussian mixture deep generative model, demonstrates superior semi-unsupervised classification performance on MNIST to model M2 from Kingma and Welling (2014). We apply the model to human accelerometer data, performing activity classification and structure discovery on windows of time series data. |
Tasks | Time Series, Transfer Learning, Zero-Shot Learning |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12176v2 |
http://arxiv.org/pdf/1810.12176v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-unsupervised-learning-of-human-activity |
Repo | https://github.com/MatthewWilletts/GM-DGM |
Framework | tf |
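"Semi-unsupervised" here means the labels are missing in a structured way: some classes carry a few labels, others carry none at all. A small sketch of constructing such a split on toy data; which classes are sparsely labelled and the label fraction are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.integers(0, 10, size=1000)        # true classes 0..9 (toy data)
sparsely_labelled = {0, 1, 2, 3, 4}       # classes that keep a few labels
label_frac = 0.05                         # fraction of those kept labelled

observed = np.full(1000, -1)              # -1 marks "unlabelled"
for c in sparsely_labelled:
    idx = np.flatnonzero(y == c)
    keep = rng.choice(idx, size=max(1, int(label_frac * idx.size)),
                      replace=False)
    observed[keep] = c                    # classes 5..9 stay fully unlabelled
print((observed >= 0).mean())             # ~2.5% of points carry a label
```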
iW-Net: an automatic and minimalistic interactive lung nodule segmentation deep network
Title | iW-Net: an automatic and minimalistic interactive lung nodule segmentation deep network |
Authors | Guilherme Aresta, Colin Jacobs, Teresa Araújo, António Cunha, Isabel Ramos, Bram van Ginneken, Aurélio Campilho |
Abstract | We propose iW-Net, a deep learning model that allows for both automatic and interactive segmentation of lung nodules in computed tomography images. iW-Net is composed of two blocks: the first one provides an automatic segmentation, and the second one allows correcting it by analyzing two points introduced by the user on the nodule’s boundary. For this purpose, a physics-inspired weight map that takes the user input into account is proposed, which is used both as a feature map and in the system’s loss function. Our approach is extensively evaluated on the public LIDC-IDRI dataset, where we achieve a state-of-the-art performance of 0.55 intersection over union vs the 0.59 inter-observer agreement. Also, we show that iW-Net allows correcting the segmentation of small nodules, which is essential for proper patient referral decisions, as well as improving the segmentation of challenging non-solid nodules, and thus may be an important tool for improving the early diagnosis of lung cancer. |
Tasks | Interactive Segmentation, Lung Nodule Segmentation |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12789v1 |
http://arxiv.org/pdf/1811.12789v1.pdf | |
PWC | https://paperswithcode.com/paper/iw-net-an-automatic-and-minimalistic |
Repo | https://github.com/gmaresta/iW-Net |
Framework | none |
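The interactive block turns the user's two boundary clicks into a weight map that is fed in as an extra channel and reused in the loss. The paper derives a physics-inspired map; the sketch below substitutes simple Gaussian bumps at the two clicked points, purely to illustrate the interface, and the bump form and sigma are assumptions.

```python
import torch

def click_weight_map(h, w, p1, p2, sigma=4.0):
    """Illustrative stand-in for iW-Net's weight map: Gaussian bumps
    centred on the user's two boundary points (not the paper's formula)."""
    ys = torch.arange(h).float().unsqueeze(1)   # (h, 1)
    xs = torch.arange(w).float().unsqueeze(0)   # (1, w)
    def bump(p):
        return torch.exp(-((ys - p[0]) ** 2 + (xs - p[1]) ** 2)
                         / (2 * sigma ** 2))
    return (bump(p1) + bump(p2)).clamp(max=1.0)  # (h, w) extra input channel

wmap = click_weight_map(64, 64, (20, 22), (44, 40))
print(wmap.shape, float(wmap.max()))
```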
Modeling Multi-turn Conversation with Deep Utterance Aggregation
Title | Modeling Multi-turn Conversation with Deep Utterance Aggregation |
Authors | Zhuosheng Zhang, Jiangtong Li, Pengfei Zhu, Hai Zhao, Gongshen Liu |
Abstract | Multi-turn conversation understanding is a major challenge for building intelligent dialogue systems. This work focuses on retrieval-based response matching for multi-turn conversation; prior work simply concatenates the conversation utterances, ignoring the interactions among previous utterances for context modeling. In this paper, we formulate previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained context representation. In detail, a self-matching attention is first introduced to route the vital information in each utterance. Then the model matches a response with each refined utterance, and the final matching score is obtained after attentively aggregating the turns. Experimental results show our model outperforms the state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus. |
Tasks | Conversational Response Selection |
Published | 2018-06-24 |
URL | http://arxiv.org/abs/1806.09102v2 |
http://arxiv.org/pdf/1806.09102v2.pdf | |
PWC | https://paperswithcode.com/paper/modeling-multi-turn-conversation-with-deep |
Repo | https://github.com/cooelf/DeepUtteranceAggregation |
Framework | none |
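Self-matching attention lets an utterance attend over its own tokens to route the vital information before matching against the response. A bare-bones sketch; the single-head dot-product form, dimensions, and the crude matching score are assumptions, and the full model adds turn-level aggregation on top.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
T, d = 7, 16                               # tokens per utterance, hidden size
U = torch.randn(T, d)                      # one utterance's token vectors

# Self-matching: the utterance attends over itself; a residual connection
# keeps the original content alongside the routed information.
scores = U @ U.t() / d ** 0.5              # (T, T) dot-product affinities
refined = U + F.softmax(scores, dim=-1) @ U

response = torch.randn(d)                  # encoded response (assumed)
match_score = float((refined @ response).mean())  # toy utterance-response match
print(match_score)
```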
Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation
Title | Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation |
Authors | Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz |
Abstract | We investigate two crucial and closely related aspects of CNNs for optical flow estimation: models and training. First, we design a compact but effective CNN model, called PWC-Net, according to simple and well-established principles: pyramidal processing, warping, and cost volume processing. PWC-Net is 17 times smaller in size, 2 times faster in inference, and 11% more accurate on Sintel final than the recent FlowNet2 model. It is the winning entry in the optical flow competition of the Robust Vision Challenge. Next, we experimentally analyze the sources of our performance gains. In particular, we use the same training procedure of PWC-Net to retrain FlowNetC, a sub-network of FlowNet2. The retrained FlowNetC is 56% more accurate on Sintel final than the previously trained one and even 5% more accurate than the FlowNet2 model. We further improve the training procedure and increase the accuracy of PWC-Net on Sintel by 10% and on KITTI 2012 and 2015 by 20%. Our newly trained model parameters and training protocols will be available at https://github.com/NVlabs/PWC-Net |
Tasks | Optical Flow Estimation |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05571v1 |
http://arxiv.org/pdf/1809.05571v1.pdf | |
PWC | https://paperswithcode.com/paper/models-matter-so-does-training-an-empirical |
Repo | https://github.com/NVlabs/PWC-Net |
Framework | pytorch |
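Of the three principles named above, the cost volume is the most self-contained: for each pixel, correlate its feature vector with features within a small search window of the (warped) second image. A compact sketch with a tiny displacement range; the feature maps and window size here are illustrative, not PWC-Net's actual settings.

```python
import torch
import torch.nn.functional as F

def cost_volume(f1, f2, max_disp=2):
    """Correlation cost volume: for every displacement (dy, dx) within
    max_disp, the per-pixel feature correlation between f1 and shifted f2."""
    B, C, H, W = f1.shape
    f2p = F.pad(f2, [max_disp] * 4)          # pad left/right/top/bottom
    vols = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = f2p[:, :, dy:dy + H, dx:dx + W]
            vols.append((f1 * shifted).mean(dim=1, keepdim=True))
    return torch.cat(vols, dim=1)            # (B, (2*max_disp+1)^2, H, W)

f1, f2 = torch.rand(1, 8, 16, 16), torch.rand(1, 8, 16, 16)
print(cost_volume(f1, f2).shape)             # torch.Size([1, 25, 16, 16])
```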
mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion
Title | mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion |
Authors | Emmi Jokinen, Markus Heinonen, Harri Lähdesmäki |
Abstract | Proteins are commonly used in the biochemical industry for numerous processes. Refining these proteins’ properties via mutations also affects their stability. Accurate computational methods to predict how mutations affect protein stability are necessary to facilitate efficient protein design. However, the accuracy of predictive models is ultimately constrained by the limited availability of experimental data. We have developed mGPfusion, a novel Gaussian process (GP) method for predicting a protein’s stability changes upon single and multiple mutations. This method complements the limited experimental data with large amounts of molecular simulation data. We introduce a Bayesian data fusion model that re-calibrates the experimental and in silico data sources and then learns a predictive GP model from the combined data. Our protein-specific model requires experimental data only regarding the protein of interest and performs well even with few experimental measurements. The mGPfusion models proteins by contact maps and infers the stability effects caused by mutations with a mixture of graph kernels. Our results show that mGPfusion outperforms state-of-the-art methods in predicting protein stability on a dataset of 15 different proteins and that incorporating molecular simulation data improves the model learning and prediction accuracy. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02852v2 |
http://arxiv.org/pdf/1802.02852v2.pdf | |
PWC | https://paperswithcode.com/paper/mgpfusion-predicting-protein-stability |
Repo | https://github.com/emmijokinen/mgpfusion |
Framework | none |
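The fusion idea can be shown in a few lines of GP regression: stack experimental and simulated measurements into one training set, but give each source its own noise variance so the cheap simulated data is down-weighted. A toy sketch with an RBF kernel; the paper's graph-kernel mixtures over contact maps and the Bayesian recalibration are omitted, and all numbers here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, ls=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

x_exp = rng.uniform(0, 5, 5)               # few, precise experimental points
x_sim = rng.uniform(0, 5, 40)              # many, noisy simulated points
f = lambda x: np.sin(x)                    # stand-in "stability" function
y = np.concatenate([f(x_exp) + 0.05 * rng.standard_normal(5),
                    f(x_sim) + 0.50 * rng.standard_normal(40)])
x = np.concatenate([x_exp, x_sim])

# Per-source noise on the diagonal: experiments trusted, simulations less so.
noise = np.concatenate([np.full(5, 0.05 ** 2), np.full(40, 0.50 ** 2)])
K = rbf(x, x) + np.diag(noise)
x_test = np.linspace(0, 5, 7)
mean = rbf(x_test, x) @ np.linalg.solve(K, y)   # GP posterior mean
print(np.round(mean, 2))
```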
Visual Domain Adaptation with Manifold Embedded Distribution Alignment
Title | Visual Domain Adaptation with Manifold Embedded Distribution Alignment |
Authors | Jindong Wang, Wenjie Feng, Yiqiang Chen, Han Yu, Meiyu Huang, Philip S. Yu |
Abstract | Visual domain adaptation aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain. Existing methods either attempt to align the cross-domain distributions, or perform manifold subspace learning. However, there are two significant challenges: (1) degenerated feature transformation, which means that distribution alignment is often performed in the original feature space, where feature distortions are hard to overcome; on the other hand, subspace learning is not sufficient to reduce the distribution divergence. (2) unevaluated distribution alignment, which means that existing distribution alignment methods align the marginal and conditional distributions with equal importance, failing to account for their different importance in real applications. In this paper, we propose a Manifold Embedded Distribution Alignment (MEDA) approach to address these challenges. MEDA learns a domain-invariant classifier in a Grassmann manifold with structural risk minimization, while performing dynamic distribution alignment to quantitatively account for the relative importance of marginal and conditional distributions. To the best of our knowledge, MEDA is the first attempt to perform dynamic distribution alignment for manifold domain adaptation. Extensive experiments demonstrate that MEDA shows significant improvements in classification accuracy compared to state-of-the-art traditional and deep methods. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07258v2 |
http://arxiv.org/pdf/1807.07258v2.pdf | |
PWC | https://paperswithcode.com/paper/visual-domain-adaptation-with-manifold |
Repo | https://github.com/jindongwang/transferlearning |
Framework | pytorch |
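The dynamic distribution alignment reduces to a weighted sum of a marginal divergence and per-class conditional divergences, with an adaptive factor mu trading them off. A sketch using linear-kernel MMD as the divergence; the real method estimates mu from data, works in the Grassmann manifold, and uses classifier pseudo-labels for the target, whereas here mu is fixed and the labels are random.

```python
import numpy as np

def mmd(a, b):
    """Linear-kernel MMD: squared distance between feature means."""
    return float(np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2))

rng = np.random.default_rng(0)
Xs, ys = rng.normal(0, 1, (100, 5)), rng.integers(0, 3, 100)
Xt, yt = rng.normal(0.5, 1, (100, 5)), rng.integers(0, 3, 100)  # pseudo-labels

mu = 0.7                                   # adaptive in MEDA; fixed here
marginal = mmd(Xs, Xt)
conditional = sum(mmd(Xs[ys == c], Xt[yt == c]) for c in range(3))
alignment = (1 - mu) * marginal + mu * conditional
print(round(alignment, 3))
```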
A Multi-task Deep Learning Architecture for Maritime Surveillance using AIS Data Streams
Title | A Multi-task Deep Learning Architecture for Maritime Surveillance using AIS Data Streams |
Authors | Duong Nguyen, Rodolphe Vadaine, Guillaume Hajduch, René Garello, Ronan Fablet |
Abstract | In a world of global trading, maritime safety, security and efficiency are crucial issues. We propose a multi-task deep learning framework for vessel monitoring using Automatic Identification System (AIS) data streams. We combine recurrent neural networks with latent variable modeling and an embedding of AIS messages to a new representation space to jointly address key issues to be dealt with when considering AIS data streams: massive amounts of streaming data, noisy data, and irregular time sampling. We demonstrate the relevance of the proposed deep learning framework on real AIS datasets for a three-task setting, namely trajectory reconstruction, anomaly detection and vessel type identification. |
Tasks | Anomaly Detection |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.03972v3 |
http://arxiv.org/pdf/1806.03972v3.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-task-deep-learning-architecture-for |
Repo | https://github.com/dnguyengithub/MultitaskAIS |
Framework | tf |
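The embedding of AIS messages mentioned above maps each message into a new representation space; this line of work is associated with a "four-hot" binning of latitude, longitude, speed and course, and the sketch below shows that style of encoding under assumed bin counts and value ranges. Treat the specifics as illustrative rather than the paper's exact scheme.

```python
import numpy as np

def four_hot(lat, lon, sog, cog, bins=(50, 50, 30, 36)):
    """Encode one AIS message as concatenated one-hot bins for latitude,
    longitude, speed-over-ground and course-over-ground (bins assumed)."""
    ranges = [(-90, 90), (-180, 180), (0, 30), (0, 360)]
    out = []
    for v, n, (lo, hi) in zip((lat, lon, sog, cog), bins, ranges):
        onehot = np.zeros(n)
        onehot[min(int((v - lo) / (hi - lo) * n), n - 1)] = 1.0
        out.append(onehot)
    return np.concatenate(out)             # length 50+50+30+36 = 166

msg = four_hot(lat=48.1, lon=-4.5, sog=12.3, cog=255.0)
print(msg.shape, msg.sum())                # (166,) 4.0
```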
Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation
Title | Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation |
Authors | Panagiotis Meletis, Gijs Dubbelman |
Abstract | We propose a convolutional network with hierarchical classifiers for per-pixel semantic segmentation, which can be trained on multiple, heterogeneous datasets and exploit their semantic hierarchy. Our network is the first to be simultaneously trained on three different datasets from the intelligent vehicles domain, i.e. Cityscapes, GTSDB and Mapillary Vistas, and is able to handle different semantic levels of detail, class imbalances, and different annotation types, i.e. dense per-pixel and sparse bounding-box labels. We assess our hierarchical approach by comparing against flat, non-hierarchical classifiers, and show improvements in mean pixel accuracy of 13.0% for Cityscapes classes, 2.4% for Vistas classes, and 32.3% for GTSDB classes. Our implementation achieves inference rates of 17 fps at a resolution of 520x706 for 108 classes running on a GPU. |
Tasks | Semantic Segmentation |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.05675v2 |
http://arxiv.org/pdf/1803.05675v2.pdf | |
PWC | https://paperswithcode.com/paper/training-of-convolutional-networks-on |
Repo | https://github.com/pmeletis/IV2018-hierarchical-semantic-segmentation-for-heterogeneous-datasets |
Framework | tf |
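Hierarchical classifiers here mean a root classifier over high-level classes plus per-branch subclassifiers, so datasets annotated at different levels of detail each supervise the level they define. A toy sketch of the two-level decision rule; the class structure, shapes, and random weights are all assumptions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 16
feat = torch.randn(d)                      # per-pixel feature (assumed)

W_root = torch.randn(3, d)                 # e.g. {road, vehicle, sign}
W_sub = {1: torch.randn(4, d),             # vehicle -> {car, bus, truck, bike}
         2: torch.randn(5, d)}             # sign -> 5 traffic-sign subclasses

p_root = F.softmax(W_root @ feat, dim=0)
super_cls = int(torch.argmax(p_root))
if super_cls in W_sub:                     # refine only where a subtree exists
    p_sub = F.softmax(W_sub[super_cls] @ feat, dim=0)
    # joint leaf probability = p(superclass) * p(subclass | superclass)
    print(super_cls, int(torch.argmax(p_sub)),
          float(p_root[super_cls] * p_sub.max()))
else:
    print(super_cls, float(p_root[super_cls]))
```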
Jump to better conclusions: SCAN both left and right
Title | Jump to better conclusions: SCAN both left and right |
Authors | Joost Bastings, Marco Baroni, Jason Weston, Kyunghyun Cho, Douwe Kiela |
Abstract | Lake and Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the strong generalization abilities of recurrent sequence-to-sequence models. Their initial experiments suggested that such models may fail because they lack the ability to extract systematic rules. Here, we take a closer look at SCAN and show that it does not always capture the kind of generalization that it was designed for. To mitigate this, we propose NACS, a complementary dataset that requires mapping actions back to the original commands. We show that models that do well on SCAN do not necessarily do well on NACS, and that NACS exhibits properties more closely aligned with realistic use cases for sequence-to-sequence models. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04640v1 |
http://arxiv.org/pdf/1809.04640v1.pdf | |
PWC | https://paperswithcode.com/paper/jump-to-better-conclusions-scan-both-left-and |
Repo | https://github.com/facebookresearch/NACS |
Framework | pytorch |
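NACS simply reverses the mapping direction of SCAN: instead of predicting action sequences from commands, models must map action sequences back to commands, which puts the compositional structure on the target side. A sketch of building such reversed pairs from SCAN-style examples; the pairs follow SCAN's published grammar but should be treated as illustrative.

```python
# Build NACS-style examples by reversing SCAN's (command -> actions) pairs
# so the model must map actions back to commands.
scan_pairs = [
    ("jump", "JUMP"),
    ("jump twice", "JUMP JUMP"),
    ("walk left", "LTURN WALK"),
]
nacs_pairs = [(actions, command) for command, actions in scan_pairs]
for src, tgt in nacs_pairs:
    print(f"{src!r} -> {tgt!r}")
```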