Paper Group ANR 913
Incorporating Structured Commonsense Knowledge in Story Completion
Title | Incorporating Structured Commonsense Knowledge in Story Completion |
Authors | Jiaao Chen, Jianshu Chen, Zhou Yu |
Abstract | The ability to select an appropriate story ending is the first step towards perfect narrative comprehension. Story ending prediction requires not only the explicit clues within the context, but also implicit knowledge (such as commonsense) to construct a reasonable and consistent story. However, most previous approaches do not explicitly use background commonsense knowledge. We present a neural story ending selection model that integrates three types of information: narrative sequence, sentiment evolution and commonsense knowledge. Experiments show that our model outperforms state-of-the-art approaches on a public dataset, the ROCStory Cloze Task, and the performance gain from adding the additional commonsense knowledge is significant. |
Tasks | Story Completion |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00625v1 |
PDF | http://arxiv.org/pdf/1811.00625v1.pdf |
PWC | https://paperswithcode.com/paper/incorporating-structured-commonsense |
Repo | |
Framework | |
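
The abstract above describes fusing three evidence streams (narrative sequence, sentiment evolution, commonsense) into a single ending score. A minimal, hypothetical PyTorch sketch of that fusion step follows; the `EndingScorer` module, its dimensions, and the per-stream heads are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class EndingScorer(nn.Module):
    """Toy sketch: fuse narrative, sentiment and commonsense features of a
    (context, ending) pair into one plausibility score; the correct ending
    is the argmax over candidate endings."""
    def __init__(self, dim=128):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(dim, 1) for _ in range(3)])
        self.mix = nn.Linear(3, 1)  # learned weighting of the three streams

    def forward(self, narrative, sentiment, commonsense):
        streams = [narrative, sentiment, commonsense]  # each (batch, dim)
        scores = torch.cat([h(x) for h, x in zip(self.heads, streams)], dim=-1)
        return self.mix(scores).squeeze(-1)

# usage: score both candidate endings, pick the higher one
scorer = EndingScorer()
feats = [torch.randn(2, 128) for _ in range(3)]  # features for 2 candidates
print(scorer(*feats).argmax().item())            # index of the chosen ending
```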
Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input
Title | Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input |
Authors | Youmna Farag, Helen Yannakoudakis, Ted Briscoe |
Abstract | We demonstrate that current state-of-the-art approaches to Automated Essay Scoring (AES) are not well-suited to capturing adversarially crafted input of grammatical but incoherent sequences of sentences. We develop a neural model of local coherence that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the local coherence model with a state-of-the-art AES model. We evaluate our approach against a number of baselines and experimentally demonstrate its effectiveness on both the AES task and the task of flagging adversarial input, further contributing to the development of an approach that strengthens the validity of neural essay scoring models. |
Tasks | |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.06898v2 |
PDF | http://arxiv.org/pdf/1804.06898v2.pdf |
PWC | https://paperswithcode.com/paper/neural-automated-essay-scoring-and-coherence |
Repo | |
Framework | |
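
A hedged sketch of the joint-training framework described above: one encoder feeds both an essay-scoring head and a local-coherence head, and the two losses are combined. All module names, sizes, and the 0.5 loss weighting are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class JointAES(nn.Module):
    """One shared encoder, two heads: essay score (regression) and a
    coherence classifier that flags adversarially shuffled input."""
    def __init__(self, dim=64):
        super().__init__()
        self.encoder = nn.GRU(300, dim, batch_first=True)
        self.score_head = nn.Linear(dim, 1)      # essay score
        self.coherence_head = nn.Linear(dim, 2)  # coherent vs shuffled

    def forward(self, x):
        _, h = self.encoder(x)                   # h: (1, batch, dim)
        h = h.squeeze(0)
        return self.score_head(h).squeeze(-1), self.coherence_head(h)

model = JointAES()
essays = torch.randn(4, 20, 300)                 # 4 essays, 20 tokens each
gold_scores = torch.rand(4)
coherent = torch.tensor([1, 1, 0, 0])            # 0 = adversarially shuffled
pred_score, pred_coh = model(essays)
loss = nn.functional.mse_loss(pred_score, gold_scores) \
     + 0.5 * nn.functional.cross_entropy(pred_coh, coherent)
loss.backward()
```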
Accurate 3-D Reconstruction with RGB-D Cameras using Depth Map Fusion and Pose Refinement
Title | Accurate 3-D Reconstruction with RGB-D Cameras using Depth Map Fusion and Pose Refinement |
Authors | Markus Ylimäki, Juho Kannala, Janne Heikkilä |
Abstract | Depth map fusion is an essential part of both stereo and RGB-D based 3-D reconstruction pipelines. Whether produced with a passive stereo reconstruction or using an active depth sensor, such as Microsoft Kinect, the depth maps have noise and may have poor initial registration. In this paper, we introduce a method that is capable of handling outliers and, in particular, significant registration errors. The proposed method first fuses a sequence of depth maps into a single non-redundant point cloud, so that redundant points are merged together by giving more weight to more certain measurements. Then, the original depth maps are re-registered to the fused point cloud to refine the original camera extrinsic parameters. The fusion is then performed again with the refined extrinsic parameters. This procedure is repeated until the result is satisfying or no significant changes happen between iterations. The method is robust to outliers and erroneous depth measurements as well as significant depth map registration errors caused by inaccurate initial camera poses. |
Tasks | |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.08912v1 |
PDF | http://arxiv.org/pdf/1804.08912v1.pdf |
PWC | https://paperswithcode.com/paper/accurate-3-d-reconstruction-with-rgb-d |
Repo | |
Framework | |
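
The abstract describes an iterate-until-convergence loop (fuse, re-register, re-fuse). A minimal sketch of that control flow, with placeholder `fuse` and `register` functions standing in for the paper's weighted merging and pose refinement:

```python
import numpy as np

def transform(points, pose):
    R, t = pose
    return points @ R.T + t

def fuse(depth_maps, poses):
    """Merge the per-view point clouds (placeholder: a real implementation
    would merge redundant points, weighting certain measurements higher)."""
    return np.concatenate([transform(d, p) for d, p in zip(depth_maps, poses)])

def register(points, pose, cloud):
    """Refine one camera pose against the fused cloud (placeholder: a real
    implementation would run e.g. point-to-plane ICP)."""
    return pose

def reconstruct(depth_maps, poses, max_iters=10, tol=1e-3):
    cloud = fuse(depth_maps, poses)
    for _ in range(max_iters):
        poses = [register(d, p, cloud) for d, p in zip(depth_maps, poses)]
        new_cloud = fuse(depth_maps, poses)
        if np.abs(new_cloud.mean(0) - cloud.mean(0)).max() < tol:
            break  # no significant change between iterations
        cloud = new_cloud
    return cloud

points = [np.random.rand(100, 3) for _ in range(3)]
poses = [(np.eye(3), np.zeros(3))] * 3
print(reconstruct(points, poses).shape)
```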
Adversarial Example Decomposition
Title | Adversarial Example Decomposition |
Authors | Horace He, Aaron Lou, Qingxuan Jiang, Isay Katsman, Serge Belongie, Ser-Nam Lim |
Abstract | Research has shown that widely used deep neural networks are vulnerable to carefully crafted adversarial perturbations. Moreover, these adversarial perturbations often transfer across models. We hypothesize that adversarial weakness is composed of three sources of bias: architecture, dataset, and random initialization. We show that one can decompose adversarial examples into an architecture-dependent component, data-dependent component, and noise-dependent component and that these components behave intuitively. For example, noise-dependent components transfer poorly to all other models, while architecture-dependent components transfer better to retrained models with the same architecture. In addition, we demonstrate that these components can be recombined to improve transferability without sacrificing efficacy on the original model. |
Tasks | |
Published | 2018-12-04 |
URL | https://arxiv.org/abs/1812.01198v2 |
PDF | https://arxiv.org/pdf/1812.01198v2.pdf |
PWC | https://paperswithcode.com/paper/adversarial-example-decomposition |
Repo | |
Framework | |
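
A toy numerical sketch of the decomposition idea, under the assumption that we hold perturbations from several retrained copies of one architecture: the component shared across copies is attributed to architecture and data, and each per-model residual to random initialization. The paper's actual estimation procedure may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
# perturbations from 5 independently initialized copies of one architecture
perturbations = [rng.normal(size=(32, 32, 3)) for _ in range(5)]

shared = np.mean(perturbations, axis=0)      # arch/data-dependent estimate
noise = [p - shared for p in perturbations]  # init-dependent residuals

# recombining the shared component with another model's noise residual
# changes the perturbation without touching its transferable part
recombined = shared + noise[1]
print(np.allclose(shared + noise[0], perturbations[0]))  # True: exact split
```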
Uncertainty Quantification in CNN-Based Surface Prediction Using Shape Priors
Title | Uncertainty Quantification in CNN-Based Surface Prediction Using Shape Priors |
Authors | Katarína Tóthová, Sarah Parisot, Matthew C. H. Lee, Esther Puyol-Antón, Lisa M. Koch, Andrew P. King, Ender Konukoglu, Marc Pollefeys |
Abstract | Surface reconstruction is a vital tool in a wide range of areas of medical image analysis and clinical research. Despite the fact that many methods have proposed solutions to the reconstruction problem, most, due to their deterministic nature, do not directly address the issue of quantifying uncertainty associated with their predictions. We remedy this by proposing a novel probabilistic deep learning approach capable of simultaneous surface reconstruction and associated uncertainty prediction. The method incorporates prior shape information in the form of a principal component analysis (PCA) model. Experiments using the UK Biobank data show that our probabilistic approach outperforms an analogous deterministic PCA-based method in the task of 2D organ delineation and quantifies uncertainty by formulating distributions over predicted surface vertex positions. |
Tasks | |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11272v1 |
PDF | http://arxiv.org/pdf/1807.11272v1.pdf |
PWC | https://paperswithcode.com/paper/uncertainty-quantification-in-cnn-based |
Repo | |
Framework | |
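
A small NumPy sketch of how a PCA shape prior turns a predicted distribution over coefficients into per-vertex uncertainty, the mechanism the abstract describes. The Gaussian coefficient parameters and the basis here are synthetic stand-ins for what a trained network would output.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vertices, n_modes = 50, 4
mean_shape = rng.normal(size=(n_vertices * 2,))      # 2D contour, flattened
basis = rng.normal(size=(n_vertices * 2, n_modes))   # PCA shape modes

# hypothetical network output: a Gaussian over PCA coefficients
coef_mean = np.array([1.0, -0.5, 0.2, 0.0])
coef_std = np.array([0.3, 0.2, 0.1, 0.1])

# sample coefficients, map each sample through the shape model
samples = coef_mean + coef_std * rng.normal(size=(1000, n_modes))
shapes = mean_shape + samples @ basis.T              # (1000, n_vertices*2)

vertex_mean = shapes.mean(0).reshape(n_vertices, 2)
vertex_std = shapes.std(0).reshape(n_vertices, 2)    # per-vertex uncertainty
print(vertex_std.max())
```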
Paraphrasing Complex Network: Network Compression via Factor Transfer
Title | Paraphrasing Complex Network: Network Compression via Factor Transfer |
Authors | Jangho Kim, SeoungUK Park, Nojun Kwak |
Abstract | Deep neural networks (DNNs) have recently shown promising performance in various areas. Although DNNs are very powerful, their large number of network parameters requires substantial storage and memory bandwidth, which hinders them from being applied to actual embedded systems. Many researchers have sought ways of model compression to reduce the size of a network with minimal performance degradation. Among them, knowledge transfer trains a student network with the help of a stronger teacher network. In this paper, we propose a method to overcome the limitations of conventional knowledge transfer methods and improve the performance of a student network. An auto-encoder is used in an unsupervised manner to extract compact factors, which are defined as compressed feature maps of the teacher network. When using the factors to train the student network, we observed that the student network performs better than with other conventional knowledge transfer methods, because the factors contain paraphrased compact information of the teacher network that is easy for the student network to understand. |
Tasks | Model Compression, Transfer Learning |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04977v2 |
PDF | http://arxiv.org/pdf/1802.04977v2.pdf |
PWC | https://paperswithcode.com/paper/paraphrasing-complex-network-network |
Repo | |
Framework | |
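
A minimal PyTorch sketch of the factor-transfer idea under stated assumptions: a "paraphraser" auto-encoder compresses teacher feature maps into factors, and a "translator" maps student features into the same factor space so the student can be trained to match the teacher's factors. Layer shapes and the L1 matching loss are illustrative choices, not the paper's exact setup.

```python
import torch
import torch.nn as nn

paraphraser = nn.Sequential(  # teacher features -> compact factors -> back
    nn.Conv2d(256, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 256, 3, padding=1),
)
translator = nn.Conv2d(128, 64, 3, padding=1)  # student features -> factors

teacher_feat = torch.randn(8, 256, 8, 8)
student_feat = torch.randn(8, 128, 8, 8, requires_grad=True)

# 1) pretrain the paraphraser unsupervised (reconstruction of teacher maps)
recon_loss = nn.functional.mse_loss(paraphraser(teacher_feat), teacher_feat)

# 2) train the student to match the teacher's factors
encoder = paraphraser[:2]                      # the compressing half
with torch.no_grad():
    teacher_factor = encoder(teacher_feat)
student_factor = translator(student_feat)
ft_loss = nn.functional.l1_loss(student_factor, teacher_factor)
(recon_loss + ft_loss).backward()
```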
Sequential Attention GAN for Interactive Image Editing via Dialogue
Title | Sequential Attention GAN for Interactive Image Editing via Dialogue |
Authors | Yu Cheng, Zhe Gan, Yitong Li, Jingjing Liu, Jianfeng Gao |
Abstract | We introduce a new task, Interactive Image Editing via conversational language, where users can guide an agent to edit images via multi-turn dialogue. In each dialogue turn, the agent takes a source image and a natural language description as the input, and generates a modified image following the textual description. Two new datasets are introduced for this task (Zap-Seq and DeepFashion-Seq), which contain multi-turn dialogue sessions with crowdsourced image-description sequences. The main challenges in this sequential and interactive image generation task are two-fold: 1) contextual consistency between a generated image and the given textual description; 2) step-by-step region-level modification to maintain visual consistency across the image sequence. To address these challenges, we propose a novel Sequential Attention Generative Adversarial Network (SeqAttnGAN) framework, which applies a neural state tracker to encode the previous image and the textual description in each dialogue turn, and uses a GAN framework to generate a modified version of the image that is consistent with the dialogue context and preceding images. To achieve better region-specific refinement, we also introduce a sequential attention mechanism into the model. Experiments on the Zap-Seq and DeepFashion-Seq datasets show that the proposed SeqAttnGAN model outperforms state-of-the-art approaches on the interactive image editing task across all evaluation metrics: visual quality, image sequence coherence, and text-image consistency. |
Tasks | Image Generation, Text-to-Image Generation |
Published | 2018-12-20 |
URL | https://arxiv.org/abs/1812.08352v3 |
PDF | https://arxiv.org/pdf/1812.08352v3.pdf |
PWC | https://paperswithcode.com/paper/sequential-attention-gan-for-interactive |
Repo | |
Framework | |
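
A minimal sketch of the neural-state-tracker idea described above: a GRU consumes the (previous image, instruction) encoding each dialogue turn, and its state conditions a toy generator that emits the edited image. Attention and adversarial training are omitted; every module and size here is an illustrative assumption.

```python
import torch
import torch.nn as nn

class EditState(nn.Module):
    """Per-turn state tracking for interactive editing (toy version)."""
    def __init__(self, img_dim=256, txt_dim=128, state=256):
        super().__init__()
        self.tracker = nn.GRUCell(img_dim + txt_dim, state)
        self.generator = nn.Sequential(nn.Linear(state, 3 * 32 * 32), nn.Tanh())

    def forward(self, img_feats, txt_feats):
        # img_feats, txt_feats: lists with one encoding per dialogue turn
        h = torch.zeros(img_feats[0].shape[0], self.tracker.hidden_size)
        outputs = []
        for img, txt in zip(img_feats, txt_feats):
            h = self.tracker(torch.cat([img, txt], dim=-1), h)
            outputs.append(self.generator(h).view(-1, 3, 32, 32))
        return outputs  # one edited image per turn

model = EditState()
turns = 3
imgs = [torch.randn(2, 256) for _ in range(turns)]
txts = [torch.randn(2, 128) for _ in range(turns)]
print(model(imgs, txts)[-1].shape)  # torch.Size([2, 3, 32, 32])
```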
Dissimilarity Coefficient based Weakly Supervised Object Detection
Title | Dissimilarity Coefficient based Weakly Supervised Object Detection |
Authors | Aditya Arun, C. V. Jawahar, M. Pawan Kumar |
Abstract | We consider the problem of weakly supervised object detection, where the training samples are annotated using only image-level labels that indicate the presence or absence of an object category. In order to model the uncertainty in the location of the objects, we employ a dissimilarity coefficient based probabilistic learning objective. The learning objective minimizes the difference between an annotation agnostic prediction distribution and an annotation aware conditional distribution. The main computational challenge is the complex nature of the conditional distribution, which consists of terms over hundreds or thousands of variables. The complexity of the conditional distribution rules out the possibility of explicitly modeling it. Instead, we exploit the fact that deep learning frameworks rely on stochastic optimization. This allows us to use a state-of-the-art discrete generative model that can provide annotation consistent samples from the conditional distribution. Extensive experiments on the PASCAL VOC 2007 and 2012 datasets demonstrate the efficacy of our proposed approach. |
Tasks | Object Detection, Stochastic Optimization, Weakly Supervised Object Detection |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.10016v1 |
PDF | http://arxiv.org/pdf/1811.10016v1.pdf |
PWC | https://paperswithcode.com/paper/dissimilarity-coefficient-based-weakly |
Repo | |
Framework | |
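
For concreteness, a sketch of Rao's dissimilarity (diversity) coefficient between two discrete distributions, the kind of quantity the objective above minimizes. The 0/1 difference function is a toy choice; under it the coefficient reduces to half the squared Euclidean distance between the two distributions, so it is non-negative and zero exactly when they match.

```python
import numpy as np

def expected_delta(p, q, delta):
    """E_{y1~p, y2~q}[delta(y1, y2)] for discrete distributions."""
    return sum(p[i] * q[j] * delta(i, j)
               for i in range(len(p)) for j in range(len(q)))

def dissimilarity(p, q, delta):
    """Rao's dissimilarity coefficient with difference function delta."""
    return (expected_delta(p, q, delta)
            - 0.5 * expected_delta(p, p, delta)
            - 0.5 * expected_delta(q, q, delta))

delta = lambda i, j: float(i != j)       # toy 0/1 difference over labelings
prediction = np.array([0.7, 0.2, 0.1])   # annotation-agnostic distribution
conditional = np.array([0.9, 0.1, 0.0])  # annotation-consistent distribution
print(dissimilarity(prediction, conditional, delta))  # >= 0, 0 iff equal
```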
On Computation and Generalization of GANs with Spectrum Control
Title | On Computation and Generalization of GANs with Spectrum Control |
Authors | Haoming Jiang, Zhehui Chen, Minshuo Chen, Feng Liu, Dingding Wang, Tuo Zhao |
Abstract | Generative Adversarial Networks (GANs), though powerful, are hard to train. Several recent works (Brock et al., 2016; Miyato et al., 2018) suggest that controlling the spectra of weight matrices in the discriminator can significantly improve the training of GANs. Motivated by their discovery, we propose a new framework for training GANs, which allows more flexible spectrum control (e.g., making the weight matrices of the discriminator have slow singular value decays). Specifically, we propose a new reparameterization approach for the weight matrices of the discriminator in GANs, which allows us to directly manipulate the spectra of the weight matrices through various regularizers and constraints, without intensively computing singular value decompositions. Theoretically, we further show that the spectrum control improves the generalization ability of GANs. Our experiments on the CIFAR-10, STL-10, and ImageNet datasets confirm that, compared to other methods, our proposed method is capable of generating images of competitive quality by utilizing spectral normalization and encouraging slow singular value decay. |
Tasks | |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.10912v2 |
PDF | http://arxiv.org/pdf/1812.10912v2.pdf |
PWC | https://paperswithcode.com/paper/on-computation-and-generalization-of-gans |
Repo | |
Framework | |
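
A hedged PyTorch sketch of the reparameterization idea: store a weight matrix as W = U diag(s) V^T so that the spectrum s can be regularized directly (here pushed toward a slowly decaying geometric reference), with a soft orthogonality penalty keeping U and V near-orthogonal. The specific penalties are assumptions for illustration, not the paper's regularizers.

```python
import torch
import torch.nn as nn

class SpectralLinear(nn.Module):
    """Linear layer with explicitly parameterized singular values."""
    def __init__(self, dim=64):
        super().__init__()
        self.U = nn.Parameter(torch.linalg.qr(torch.randn(dim, dim))[0])
        self.V = nn.Parameter(torch.linalg.qr(torch.randn(dim, dim))[0])
        self.log_s = nn.Parameter(torch.zeros(dim))  # spectrum, log-domain

    def forward(self, x):
        W = self.U @ torch.diag(self.log_s.exp()) @ self.V.t()
        return x @ W.t()

    def penalty(self, decay=0.9):
        eye = torch.eye(self.U.shape[0])
        ortho = ((self.U.t() @ self.U - eye) ** 2).sum() \
              + ((self.V.t() @ self.V - eye) ** 2).sum()
        # pull the sorted spectrum toward a slow geometric decay
        target = decay ** torch.arange(self.log_s.numel(), dtype=torch.float)
        s = self.log_s.exp().sort(descending=True).values
        return ortho + ((s - target) ** 2).sum()

layer = SpectralLinear()
out = layer(torch.randn(8, 64))
loss = out.pow(2).mean() + 0.1 * layer.penalty()
loss.backward()
```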
Handwriting Trajectory Recovery using End-to-End Deep Encoder-Decoder Network
Title | Handwriting Trajectory Recovery using End-to-End Deep Encoder-Decoder Network |
Authors | Ayan Kumar Bhunia, Abir Bhowmick, Ankan Kumar Bhunia, Aishik Konwer, Prithaj Banerjee, Partha Pratim Roy, Umapada Pal |
Abstract | In this paper, we introduce a novel technique to recover the pen trajectory of offline characters, which is a crucial step for handwritten character recognition. Generally, the online acquisition approach has an advantage over its offline counterpart, as the online technique keeps track of the pen movement. Hence, pen tip trajectory retrieval from offline text can bridge the gap between online and offline methods. Our proposed framework employs a sequence-to-sequence model which consists of an encoder-decoder LSTM module. Our encoder module consists of a Convolutional LSTM network, which takes an offline character image as the input and encodes the feature sequence to a hidden representation. The output of the encoder is fed to a decoder LSTM, from which we obtain the successive coordinate points at every time step. Although the sequence-to-sequence model is a popular paradigm in various computer vision and language translation tasks, the main contribution of our work lies in designing an end-to-end network for a decades-old popular problem in the Document Image Analysis community. Tamil, Telugu and Devanagari characters of the LIPI Toolkit dataset are used for our experiments. Our proposed method has achieved superior performance compared to other conventional approaches. |
Tasks | |
Published | 2018-01-22 |
URL | http://arxiv.org/abs/1801.07211v4 |
PDF | http://arxiv.org/pdf/1801.07211v4.pdf |
PWC | https://paperswithcode.com/paper/handwriting-trajectory-recovery-using-end-to |
Repo | |
Framework | |
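
A toy sketch of the encoder-decoder design: an image encoder initializes an LSTM decoder that emits one pen coordinate per time step. A plain CNN is substituted here for the paper's Convolutional LSTM encoder, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryDecoder(nn.Module):
    """Offline character image in, sequence of (x, y) pen points out."""
    def __init__(self, hidden=128, steps=30):
        super().__init__()
        self.steps = steps
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, hidden))

        self.cell = nn.LSTMCell(2, hidden)  # input: previous (x, y)
        self.to_xy = nn.Linear(hidden, 2)

    def forward(self, image):
        h = self.encoder(image)              # image summary as initial state
        c = torch.zeros_like(h)
        xy = torch.zeros(image.shape[0], 2)  # pen starts at the origin
        points = []
        for _ in range(self.steps):
            h, c = self.cell(xy, (h, c))
            xy = self.to_xy(h)
            points.append(xy)
        return torch.stack(points, dim=1)    # (batch, steps, 2)

model = TrajectoryDecoder()
print(model(torch.randn(4, 1, 64, 64)).shape)  # torch.Size([4, 30, 2])
```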
Gated Feedback Refinement Network for Coarse-to-Fine Dense Semantic Image Labeling
Title | Gated Feedback Refinement Network for Coarse-to-Fine Dense Semantic Image Labeling |
Authors | Md Amirul Islam, Mrigank Rochan, Shujon Naha, Neil D. B. Bruce, Yang Wang |
Abstract | Effective integration of local and global contextual information is crucial for semantic segmentation and dense image labeling. We develop two encoder-decoder based deep learning architectures to address this problem. We first propose a network architecture called Label Refinement Network (LRN) that predicts segmentation labels in a coarse-to-fine fashion at several spatial resolutions. In this network, we also define loss functions at several stages to provide supervision at different stages of training. However, there are limits to the quality of refinement possible if ambiguous information is passed forward. To address this issue, we also propose the Gated Feedback Refinement Network (G-FRNet). Initially, G-FRNet makes a coarse-grained prediction which it progressively refines to recover details by effectively integrating local and global contextual information during the refinement stages. This is achieved by the gate units proposed in this work, which control the information passed forward in order to resolve ambiguity. Experiments were conducted on five challenging dense labeling datasets (CamVid, PASCAL VOC 2012, Horse-Cow Parsing, PASCAL-Person-Part, and SUN-RGBD). G-FRNet achieves state-of-the-art semantic segmentation results on the CamVid and Horse-Cow Parsing datasets and produces results competitive with the best performing approaches in the literature on the other three datasets. |
Tasks | Semantic Segmentation |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1806.11266v1 |
PDF | http://arxiv.org/pdf/1806.11266v1.pdf |
PWC | https://paperswithcode.com/paper/gated-feedback-refinement-network-for-coarse |
Repo | |
Framework | |
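
A hedged sketch of a gate unit in the spirit described above: a deeper, more semantic feature map gates a shallower one via an element-wise product, suppressing ambiguous detail before it is passed to the refinement stage. Channel sizes, upsampling, and the exact convolutions are assumptions.

```python
import torch
import torch.nn as nn

class GateUnit(nn.Module):
    """Gate a shallow encoder feature map with a deeper, coarser one."""
    def __init__(self, shallow_ch, deep_ch, out_ch):
        super().__init__()
        self.shallow = nn.Sequential(nn.Conv2d(shallow_ch, out_ch, 3, padding=1),
                                     nn.BatchNorm2d(out_ch), nn.ReLU())
        self.deep = nn.Sequential(nn.Conv2d(deep_ch, out_ch, 3, padding=1),
                                  nn.BatchNorm2d(out_ch), nn.ReLU())

    def forward(self, shallow_feat, deep_feat):
        # upsample the deep map to the shallow map's resolution, then gate
        deep = nn.functional.interpolate(self.deep(deep_feat),
                                         size=shallow_feat.shape[2:])
        return self.shallow(shallow_feat) * deep  # gated message forward

gate = GateUnit(64, 128, 64)
out = gate(torch.randn(1, 64, 56, 56), torch.randn(1, 128, 28, 28))
print(out.shape)  # torch.Size([1, 64, 56, 56])
```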
On Deep Ensemble Learning from a Function Approximation Perspective
Title | On Deep Ensemble Learning from a Function Approximation Perspective |
Authors | Jiawei Zhang, Limeng Cui, Fisher B. Gouza |
Abstract | In this paper, we propose a general ensemble learning framework based on deep learning models. Given a group of unit models, the proposed deep ensemble learning framework effectively combines their learning results via a multilayered ensemble model. In the case where the unit model mathematical mappings are bounded, sigmoidal and discriminatory, we demonstrate that the deep ensemble learning framework can achieve a universal approximation of any function from the input space to the output space. Meanwhile, to achieve such performance, the deep ensemble learning framework also imposes a strict constraint on the number of involved unit models. According to the theoretical proof provided in this paper, given an input feature space of dimension d, the required number of unit models will be 2d if the ensemble model involves one single layer. Furthermore, as the ensemble component goes deeper, the number of required unit models is proved to decrease exponentially. |
Tasks | |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07502v1 |
PDF | http://arxiv.org/pdf/1805.07502v1.pdf |
PWC | https://paperswithcode.com/paper/on-deep-ensemble-learning-from-a-function |
Repo | |
Framework | |
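
A toy NumPy sketch of the multilayered wiring the abstract describes: each layer holds several sigmoidal unit models whose outputs become the next layer's inputs. Fixed random units stand in for trained learners; this only illustrates the structure, not the approximation result.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unit_model(in_dim):
    """One bounded, sigmoidal unit model (random stand-in for a learner)."""
    W, b = rng.normal(size=(in_dim, 1)), rng.normal(size=1)
    return lambda x: sigmoid(x @ W + b)

def ensemble(layer_sizes, in_dim):
    layers, dim = [], in_dim
    for size in layer_sizes:
        layers.append([unit_model(dim) for _ in range(size)])
        dim = size  # each unit contributes one output to the next layer
    return layers

def forward(layers, x):
    for layer in layers:
        x = np.concatenate([unit(x) for unit in layer], axis=1)
    return x

x = rng.normal(size=(5, 4))                      # d = 4 input features
print(forward(ensemble([8, 4, 1], 4), x).shape)  # (5, 1)
```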
CanvasGAN: A simple baseline for text to image generation by incrementally patching a canvas
Title | CanvasGAN: A simple baseline for text to image generation by incrementally patching a canvas |
Authors | Amanpreet Singh, Sharan Agrawal |
Abstract | We propose a new recurrent generative model for generating images from text captions while attending on specific parts of the text captions. Our model creates images by incrementally adding patches on a “canvas” while attending on words from the text caption at each timestep. Finally, the canvas is passed through an upscaling network to generate images. We also introduce a new method for generating visual-semantic sentence embeddings based on self-attention over text. We compare our model’s generated images with those generated by Reed et al.’s model and show that our model is a stronger baseline for text-to-image generation tasks. |
Tasks | Image Generation, Sentence Embeddings, Text-to-Image Generation |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.02833v1 |
PDF | http://arxiv.org/pdf/1810.02833v1.pdf |
PWC | https://paperswithcode.com/paper/canvasgan-a-simple-baseline-for-text-to-image |
Repo | |
Framework | |
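
A minimal sketch of incremental canvas patching: at each timestep, attention over caption word embeddings selects what to draw next, a recurrent cell updates the drawing state, and a gated patch is blended onto the running canvas. The blending rule and all sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CanvasPainter(nn.Module):
    """Paint an image in several steps, attending on caption words."""
    def __init__(self, word_dim=128, state=256, steps=4, size=32):
        super().__init__()
        self.steps, self.size = steps, size
        self.attn = nn.Linear(state, word_dim)
        self.cell = nn.GRUCell(word_dim, state)
        self.patch = nn.Linear(state, 3 * size * size)
        self.gate = nn.Linear(state, 1)

    def forward(self, words):                    # words: (batch, T, word_dim)
        b = words.shape[0]
        h = torch.zeros(b, self.cell.hidden_size)
        canvas = torch.zeros(b, 3, self.size, self.size)
        for _ in range(self.steps):
            # soft attention over caption words given the current state
            scores = torch.einsum('bd,btd->bt', self.attn(h), words)
            context = (scores.softmax(-1).unsqueeze(-1) * words).sum(1)
            h = self.cell(context, h)
            patch = torch.tanh(self.patch(h)).view(b, 3, self.size, self.size)
            alpha = torch.sigmoid(self.gate(h)).view(b, 1, 1, 1)
            canvas = (1 - alpha) * canvas + alpha * patch  # blend the patch
        return canvas

print(CanvasPainter()(torch.randn(2, 7, 128)).shape)  # torch.Size([2, 3, 32, 32])
```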
Semantically Invariant Text-to-Image Generation
Title | Semantically Invariant Text-to-Image Generation |
Authors | Shagan Sah, Dheeraj Peri, Ameya Shringi, Chi Zhang, Miguel Dominguez, Andreas Savakis, Ray Ptucha |
Abstract | Image captioning has demonstrated models that are capable of generating plausible text given input images or videos. Further, recent work in image generation has shown significant improvements in image quality when text is used as a prior. Our work ties these concepts together by creating an architecture that enables bidirectional generation of images and text. We call this network Multi-Modal Vector Representation (MMVR). Along with MMVR, we propose two improvements to text-conditioned image generation. First, an n-gram-metric-based cost function is introduced that generalizes the caption with respect to the image. Second, multiple semantically similar sentences are shown to help in generating better images. Qualitative and quantitative evaluations demonstrate that MMVR improves upon existing text-conditioned image generation results by over 20%, while integrating visual and text modalities. |
Tasks | Image Captioning, Image Generation, Text-to-Image Generation |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10274v1 |
PDF | http://arxiv.org/pdf/1809.10274v1.pdf |
PWC | https://paperswithcode.com/paper/semantically-invariant-text-to-image |
Repo | |
Framework | |
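
A pure-Python toy for the first improvement above, an n-gram based cost: it scores a candidate caption by average n-gram overlap with a reference, so generation is penalized less for paraphrases that share most n-grams. This is a hypothetical stand-in, not the paper's exact metric.

```python
def ngrams(tokens, n):
    """All n-grams of a token list, as a set of tuples."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_cost(candidate, reference, max_n=2):
    """1 - average Jaccard n-gram overlap; lower = closer to the reference."""
    overlaps = []
    for n in range(1, max_n + 1):
        c, r = ngrams(candidate, n), ngrams(reference, n)
        overlaps.append(len(c & r) / max(len(c | r), 1))
    return 1.0 - sum(overlaps) / max_n

print(ngram_cost("a red bird on a branch".split(),
                 "a small red bird on a branch".split()))  # ~0.30
```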
Formal Ways for Measuring Relations between Concepts in Conceptual Spaces
Title | Formal Ways for Measuring Relations between Concepts in Conceptual Spaces |
Authors | Lucas Bechberger, Kai-Uwe Kühnberger |
Abstract | The highly influential framework of conceptual spaces provides a geometric way of representing knowledge. Instances are represented by points in a high-dimensional space and concepts are represented by regions in this space. In this article, we extend our recent mathematical formalization of this framework by providing quantitative mathematical definitions for measuring relations between concepts: We develop formal ways for computing concept size, subsethood, implication, similarity, and betweenness. This considerably increases the representational capabilities of our formalization and makes it the most thorough and comprehensive formalization of conceptual spaces developed so far. |
Tasks | |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02393v1 |
PDF | http://arxiv.org/pdf/1804.02393v1.pdf |
PWC | https://paperswithcode.com/paper/formal-ways-for-measuring-relations-between |
Repo | |
Framework | |
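
An illustrative sketch (not the authors' formal definitions) of two of the measures named above, computed on points of a conceptual space: an exponentially decaying distance-based similarity, and a ratio-based betweenness score that equals 1 exactly when the middle point lies on the segment between the other two.

```python
import numpy as np

def similarity(x, y, sensitivity=1.0):
    """Similarity decays exponentially with distance in the space."""
    return np.exp(-sensitivity * np.linalg.norm(np.asarray(x) - np.asarray(y)))

def betweenness(x, y, z):
    """Soft betweenness of y relative to x and z: the ratio of the direct
    distance x-z to the detour through y; 1 iff y is on the segment."""
    x, y, z = map(np.asarray, (x, y, z))
    detour = np.linalg.norm(x - y) + np.linalg.norm(y - z)
    direct = np.linalg.norm(x - z)
    return direct / detour if detour > 0 else 1.0

print(similarity([0, 0], [1, 0]))           # ~0.368
print(betweenness([0, 0], [1, 0], [2, 0]))  # 1.0: collinear and between
print(betweenness([0, 0], [1, 1], [2, 0]))  # < 1: off the segment
```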