Paper Group AWR 192
Robust Registration of Gaussian Mixtures for Colour Transfer
Title | Robust Registration of Gaussian Mixtures for Colour Transfer |
Authors | Mairéad Grogan, Rozenn Dahyot |
Abstract | We present a flexible approach to colour transfer inspired by techniques recently proposed for shape registration. Colour distributions of the palette and target images are modelled with Gaussian Mixture Models (GMMs) that are robustly registered to infer a non-linear parametric transfer function. We show experimentally that our approach compares well to current techniques both quantitatively and qualitatively. Moreover, our technique is computationally the fastest and can take efficient advantage of parallel processing architectures for recolouring images and videos. Our transfer function is parametric and hence can be stored in memory for later usage and also combined with other computed transfer functions to create interesting visual effects. Overall this paper provides a fast, user-friendly approach to recolouring of image and video materials. |
Tasks | |
Published | 2017-05-17 |
URL | http://arxiv.org/abs/1705.06091v1, http://arxiv.org/pdf/1705.06091v1.pdf |
PWC | https://paperswithcode.com/paper/robust-registration-of-gaussian-mixtures-for |
Repo | https://github.com/V-Sense/LFToolbox_Recolouring_HPR |
Framework | none |
Transfer Learning for Speech Recognition on a Budget
Title | Transfer Learning for Speech Recognition on a Budget |
Authors | Julius Kunze, Louis Kirsch, Ilia Kurenkov, Andreas Krug, Jens Johannsmeier, Sebastian Stober |
Abstract | End-to-end training of automated speech recognition (ASR) systems requires massive data and compute resources. We explore transfer learning based on model adaptation as an approach for training ASR models under constrained GPU memory, throughput and training data. We conduct several systematic experiments adapting a Wav2Letter convolutional neural network originally trained for English ASR to the German language. We show that this technique allows faster training on consumer-grade resources while requiring less training data in order to achieve the same accuracy, thereby lowering the cost of training ASR models in other languages. Model introspection revealed that small adaptations to the network’s weights were sufficient for good performance, especially for inner layers. |
Tasks | Speech Recognition, Transfer Learning |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00290v1, http://arxiv.org/pdf/1706.00290v1.pdf |
PWC | https://paperswithcode.com/paper/transfer-learning-for-speech-recognition-on-a |
Repo | https://github.com/transfer-learning-asr/transfer-learning-asr |
Framework | tf |
Get To The Point: Summarization with Pointer-Generator Networks
Title | Get To The Point: Summarization with Pointer-Generator Networks |
Authors | Abigail See, Peter J. Liu, Christopher D. Manning |
Abstract | Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two shortcomings: they are liable to reproduce factual details inaccurately, and they tend to repeat themselves. In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. First, we use a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator. Second, we use coverage to keep track of what has been summarized, which discourages repetition. We apply our model to the CNN / Daily Mail summarization task, outperforming the current abstractive state-of-the-art by at least 2 ROUGE points. |
Tasks | Abstractive Text Summarization, Text Summarization |
Published | 2017-04-14 |
URL | http://arxiv.org/abs/1704.04368v2, http://arxiv.org/pdf/1704.04368v2.pdf |
PWC | https://paperswithcode.com/paper/get-to-the-point-summarization-with-pointer |
Repo | https://github.com/sblayush/summarization |
Framework | tf |
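The copy mechanism described in this abstract can be pictured as a mixture of two distributions: one over the vocabulary (generating) and one over source positions (pointing). The sketch below is a minimal illustration under stated assumptions, not the authors' code: the generation probability `p_gen`, the vocabulary distribution and the attention weights are taken as already computed, all names and toy numbers are hypothetical, and every source token is assumed to be in the vocabulary.

```python
import numpy as np

def final_distribution(p_gen, vocab_dist, attention, source_ids):
    """Mix generating from the vocabulary with copying from the source.

    p_gen      -- scalar in [0, 1], probability of generating (assumed given)
    vocab_dist -- shape (V,), sums to 1
    attention  -- shape (T,), sums to 1, one weight per source token
    source_ids -- shape (T,), vocabulary id of each source token
    """
    vocab_dist = np.asarray(vocab_dist, dtype=float)
    attention = np.asarray(attention, dtype=float)
    copy_dist = np.zeros_like(vocab_dist)
    # Accumulate attention mass per word id; repeated source words add up.
    np.add.at(copy_dist, np.asarray(source_ids), attention)
    return p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

# Toy example: word id 3 has zero generation probability but can still be copied.
dist = final_distribution(
    p_gen=0.6,
    vocab_dist=[0.5, 0.3, 0.2, 0.0],
    attention=[0.7, 0.2, 0.1],
    source_ids=[3, 1, 3],
)
```

Because both inputs are probability distributions and the mixing weights sum to one, the result is again a distribution, and a word the generator assigns no mass to can still be produced by pointing at the source.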
Eye In-Painting with Exemplar Generative Adversarial Networks
Title | Eye In-Painting with Exemplar Generative Adversarial Networks |
Authors | Brian Dolhansky, Cristian Canton Ferrer |
Abstract | This paper introduces a novel approach to in-painting where the identity of the object to remove or change is preserved and accounted for at inference time: Exemplar GANs (ExGANs). ExGANs are a type of conditional GAN that utilize exemplar information to produce high-quality, personalized in-painting results. We propose using exemplar information in the form of a reference image of the region to in-paint, or a perceptual code describing that object. Unlike previous conditional GAN formulations, this extra information can be inserted at multiple points within the adversarial network, thus increasing its descriptive power. We show that ExGANs can produce photo-realistic personalized in-painting results that are both perceptually and semantically plausible by applying them to the task of closed-to-open eye in-painting in natural pictures. A new benchmark dataset is also introduced for the task of eye in-painting for future comparisons. |
Tasks | |
Published | 2017-12-11 |
URL | http://arxiv.org/abs/1712.03999v1, http://arxiv.org/pdf/1712.03999v1.pdf |
PWC | https://paperswithcode.com/paper/eye-in-painting-with-exemplar-generative |
Repo | https://github.com/bdol/exemplar_gans |
Framework | none |
Progressive Learning for Systematic Design of Large Neural Networks
Title | Progressive Learning for Systematic Design of Large Neural Networks |
Authors | Saikat Chatterjee, Alireza M. Javid, Mostafa Sadeghi, Partha P. Mitra, Mikael Skoglund |
Abstract | We develop an algorithm for systematic design of a large artificial neural network using a progression property. We find that some non-linear functions, such as the rectified linear unit and its derivatives, hold the property. The systematic design addresses the choice of network size and regularization of parameters. The number of nodes and layers in the network increases in progression with the objective of consistently reducing an appropriate cost. One layer is optimized at a time, with its parameters learned using convex optimization. Regularization parameters for convex optimization do not need significant manual tuning effort. We also use random instances for some weight matrices, which helps to reduce the number of parameters we learn. The developed network is expected to show good generalization power due to appropriate regularization and use of random weights in the layers. This expectation is verified by extensive experiments for classification and regression problems, using standard databases. |
Tasks | |
Published | 2017-10-23 |
URL | http://arxiv.org/abs/1710.08177v1, http://arxiv.org/pdf/1710.08177v1.pdf |
PWC | https://paperswithcode.com/paper/progressive-learning-for-systematic-design-of |
Repo | https://github.com/viebboy/POPmem |
Framework | none |
Learning to Find Good Correspondences
Title | Learning to Find Good Correspondences |
Authors | Kwang Moo Yi, Eduard Trulls, Yuki Ono, Vincent Lepetit, Mathieu Salzmann, Pascal Fua |
Abstract | We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Given a set of putative sparse matches and the camera intrinsics, we train our network in an end-to-end fashion to label the correspondences as inliers or outliers, while simultaneously using them to recover the relative pose, as encoded by the essential matrix. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while imbuing it with global information, and also makes the network invariant to the order of the correspondences. Our experiments on multiple challenging datasets demonstrate that our method is able to drastically improve the state of the art with little training data. |
Tasks | |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.05971v2, http://arxiv.org/pdf/1711.05971v2.pdf |
PWC | https://paperswithcode.com/paper/learning-to-find-good-correspondences |
Repo | https://github.com/vcg-uvic/image-matching-benchmark |
Framework | none |
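One plausible reading of the Context Normalization idea in this abstract is that each feature channel is normalized by the mean and standard deviation computed over all correspondences of one image pair, so every point is processed separately yet sees global statistics. The sketch below illustrates that reading only; the function name, the `eps` constant and the toy data are assumptions, and the authors' implementation may differ in detail.

```python
import numpy as np

def context_normalization(features, eps=1e-8):
    """Normalize each feature channel across the N correspondences.

    features -- shape (N, C): one C-dimensional feature vector per putative match.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0, keepdims=True)  # statistics shared by all points
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 3))      # 5 hypothetical correspondences, 3 channels
normed = context_normalization(feats)
perm = rng.permutation(5)            # reorder the correspondences
normed_perm = context_normalization(feats[perm])
```

Since the statistics are computed over the whole set, shuffling the correspondences only shuffles the rows of the output, which matches the order invariance claimed in the abstract.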
Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme
Title | Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme |
Authors | Suncong Zheng, Feng Wang, Hongyun Bao, Yuexing Hao, Peng Zhou, Bo Xu |
Abstract | Joint extraction of entities and relations is an important task in information extraction. To tackle this problem, we first propose a novel tagging scheme that converts the joint extraction task into a tagging problem. Then, based on our tagging scheme, we study different end-to-end models to extract entities and their relations directly, without identifying entities and relations separately. We conduct experiments on a public dataset produced by a distant supervision method, and the experimental results show that the tagging-based methods are better than most of the existing pipelined and joint learning methods. Moreover, the end-to-end model proposed in this paper achieves the best results on the public dataset. |
Tasks | Joint Entity and Relation Extraction, Relation Extraction |
Published | 2017-06-07 |
URL | http://arxiv.org/abs/1706.05075v1, http://arxiv.org/pdf/1706.05075v1.pdf |
PWC | https://paperswithcode.com/paper/joint-extraction-of-entities-and-relations |
Repo | https://github.com/kyzhouhzau/CCLNER |
Framework | tf |
A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction
Title | A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction |
Authors | Hui Zeng, Lei Zhang, Alan C. Bovik |
Abstract | Blind image quality assessment (BIQA) remains a very challenging problem due to the unavailability of a reference image. Deep learning based BIQA methods have been attracting increasing attention in recent years, yet it remains a difficult task to train a robust deep BIQA model because of the very limited number of training samples with human subjective scores. Most existing methods learn a regression network to minimize the prediction error of a scalar image quality score. However, such a scheme ignores the fact that an image will receive divergent subjective scores from different subjects, which cannot be adequately represented by a single scalar number. This is particularly true on complex, real-world distorted images. Moreover, images may broadly differ in their distributions of assigned subjective scores. Recognizing this, we propose a new representation of perceptual image quality, called probabilistic quality representation (PQR), to describe the image subjective score distribution, whereby a more robust loss function can be employed to train a deep BIQA model. The proposed PQR method is shown to not only speed up the convergence of deep model training, but to also greatly improve the achievable level of quality prediction accuracy relative to scalar quality score regression methods. The source code is available at https://github.com/HuiZeng/BIQA_Toolbox. |
Tasks | Blind Image Quality Assessment, Image Quality Assessment |
Published | 2017-08-28 |
URL | http://arxiv.org/abs/1708.08190v2, http://arxiv.org/pdf/1708.08190v2.pdf |
PWC | https://paperswithcode.com/paper/a-probabilistic-quality-representation |
Repo | https://github.com/HuiZeng/BIQA_Toolbox |
Framework | none |
Pose Guided Structured Region Ensemble Network for Cascaded Hand Pose Estimation
Title | Pose Guided Structured Region Ensemble Network for Cascaded Hand Pose Estimation |
Authors | Xinghao Chen, Guijin Wang, Hengkai Guo, Cairong Zhang |
Abstract | Hand pose estimation from a single depth image is an essential topic in computer vision and human-computer interaction. Despite recent advancements in this area driven by convolutional neural networks, accurate hand pose estimation is still a challenging problem. In this paper we propose a Pose guided structured Region Ensemble Network (Pose-REN) to boost the performance of hand pose estimation. The proposed method extracts regions from the feature maps of a convolutional neural network under the guidance of an initially estimated pose, generating more representative features for hand pose estimation. The extracted feature regions are then integrated hierarchically according to the topology of hand joints by employing tree-structured fully connected layers. A refined estimation of hand pose is directly regressed by the proposed network and the final hand pose is obtained by utilizing an iterative cascaded method. Comprehensive experiments on public hand pose datasets demonstrate that our proposed method outperforms state-of-the-art algorithms. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2017-08-11 |
URL | http://arxiv.org/abs/1708.03416v2, http://arxiv.org/pdf/1708.03416v2.pdf |
PWC | https://paperswithcode.com/paper/pose-guided-structured-region-ensemble |
Repo | https://github.com/xinghaochen/Pose-REN |
Framework | caffe2 |
Sparse Representation-based Open Set Recognition
Title | Sparse Representation-based Open Set Recognition |
Authors | He Zhang, Vishal M. Patel |
Abstract | We propose a generalized Sparse Representation-based Classification (SRC) algorithm for open set recognition where not all classes presented during testing are known during training. The SRC algorithm uses class reconstruction errors for classification. As most of the discriminative information for open set recognition is hidden in the tail part of the matched and sum of non-matched reconstruction error distributions, we model the tail of those two error distributions using the statistical Extreme Value Theory (EVT). Then we simplify the open set recognition problem into a set of hypothesis testing problems. The confidence scores corresponding to the tail distributions of a novel test sample are then fused to determine its identity. The effectiveness of the proposed method is demonstrated using four publicly available image and object classification datasets and it is shown that this method can perform significantly better than many competitive open set recognition algorithms. Code is publicly available: https://github.com/hezhangsprinter/SROSR |
Tasks | Object Classification, Open Set Learning, Sparse Representation-based Classification |
Published | 2017-05-06 |
URL | http://arxiv.org/abs/1705.02431v1, http://arxiv.org/pdf/1705.02431v1.pdf |
PWC | https://paperswithcode.com/paper/sparse-representation-based-open-set |
Repo | https://github.com/hezhangsprinter/SROSR |
Framework | none |
Improved Regularization of Convolutional Neural Networks with Cutout
Title | Improved Regularization of Convolutional Neural Networks with Cutout |
Authors | Terrance DeVries, Graham W. Taylor |
Abstract | Convolutional neural networks are capable of learning powerful representational spaces, which are necessary for tackling complex learning tasks. However, due to the model capacity required to capture such representations, they are often susceptible to overfitting and therefore require proper regularization in order to generalize well. In this paper, we show that the simple regularization technique of randomly masking out square regions of input during training, which we call cutout, can be used to improve the robustness and overall performance of convolutional neural networks. Not only is this method extremely easy to implement, but we also demonstrate that it can be used in conjunction with existing forms of data augmentation and other regularizers to further improve model performance. We evaluate this method by applying it to current state-of-the-art architectures on the CIFAR-10, CIFAR-100, and SVHN datasets, yielding new state-of-the-art results of 2.56%, 15.20%, and 1.30% test error respectively. Code is available at https://github.com/uoguelph-mlrg/Cutout |
Tasks | Data Augmentation, Image Augmentation, Image Classification, Semi-Supervised Image Classification |
Published | 2017-08-15 |
URL | http://arxiv.org/abs/1708.04552v2, http://arxiv.org/pdf/1708.04552v2.pdf |
PWC | https://paperswithcode.com/paper/improved-regularization-of-convolutional |
Repo | https://github.com/uoguelph-mlrg/Cutout |
Framework | pytorch |
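The regularizer described in this abstract, masking out a random square region of each training input, is simple enough to sketch in a few lines. The version below is a minimal NumPy illustration, not the code from the linked repository: the function name, the zero fill value, the 16-pixel patch and the choice to let the square be clipped at the image border are assumptions made for the example.

```python
import numpy as np

def cutout(image, size, rng):
    """Return a copy of `image` (H, W, C) with one random square set to zero."""
    h, w = image.shape[:2]
    # Pick the centre of the square anywhere in the image; the square may
    # therefore be partly cut off at the border.
    cy = rng.integers(0, h)
    cx = rng.integers(0, w)
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = image.copy()
    out[y0:y1, x0:x1] = 0
    return out

rng = np.random.default_rng(0)
image = np.ones((32, 32, 3), dtype=np.float32)  # stand-in for a CIFAR-sized input
masked = cutout(image, size=16, rng=rng)
```

In a training loop this would be applied independently to each input, typically after other augmentations such as flips or crops; the input array itself is left unmodified.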
Does William Shakespeare REALLY Write Hamlet? Knowledge Representation Learning with Confidence
Title | Does William Shakespeare REALLY Write Hamlet? Knowledge Representation Learning with Confidence |
Authors | Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin |
Abstract | Knowledge graphs (KGs), which provide essential relational information between entities, have been widely utilized in various knowledge-driven applications. Since human knowledge is vast, grows explosively and changes frequently, knowledge construction and update inevitably involve automatic mechanisms with less human supervision, which usually bring plenty of noise and conflicts into KGs. However, most conventional knowledge representation learning methods assume that all triple facts in existing KGs share the same significance without any noise. To address this problem, we propose a novel confidence-aware knowledge representation learning framework (CKRL), which detects possible noise in KGs while simultaneously learning knowledge representations with confidence. Specifically, we introduce the triple confidence to conventional translation-based methods for knowledge representation learning. To make triple confidence more flexible and universal, we only utilize the internal structural information in KGs, and propose three kinds of triple confidences considering both local and global structural information. In experiments, we evaluate our models on knowledge graph noise detection, knowledge graph completion and triple classification. Experimental results demonstrate that our confidence-aware models achieve significant and consistent improvements on all tasks, which confirms the capability of CKRL in modeling confidence with structural information in both KG noise detection and knowledge representation learning. |
Tasks | Knowledge Graph Completion, Knowledge Graphs, Representation Learning |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.03202v2, http://arxiv.org/pdf/1705.03202v2.pdf |
PWC | https://paperswithcode.com/paper/does-william-shakespeare-really-write-hamlet |
Repo | https://github.com/thunlp/CKRL |
Framework | none |
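To make the idea of adding a triple confidence to a translation-based method concrete, the sketch below scales a margin ranking loss by a per-triple confidence score. It is an illustration under assumptions rather than the CKRL implementation: the L1 energy, the margin of 1.0, the toy embeddings and the confidence values are all hypothetical, and how the confidence itself is estimated from graph structure is not shown.

```python
import numpy as np

def energy(h, r, t):
    """Translation-based energy: small when h + r is close to t (L1 norm here)."""
    return np.abs(h + r - t).sum()

def confidence_weighted_loss(pos, neg, confidence, margin=1.0):
    """Margin ranking loss for one (positive, negative) pair, scaled by confidence."""
    hinge = max(0.0, margin + energy(*pos) - energy(*neg))
    return confidence * hinge

# Toy 2-d embeddings (hypothetical values).
h = np.array([0.1, 0.2])
r = np.array([0.3, 0.0])
t = np.array([0.4, 0.2])        # tail of the observed triple
t_neg = np.array([0.6, 0.5])    # corrupted tail for the negative triple

loss_trusted = confidence_weighted_loss((h, r, t), (h, r, t_neg), confidence=1.0)
loss_doubted = confidence_weighted_loss((h, r, t), (h, r, t_neg), confidence=0.2)
```

A triple judged likely to be noise receives a low confidence and therefore contributes less to the gradient, which is the intuition behind learning embeddings and detecting noise at the same time.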
Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs
Title | Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs |
Authors | Martin Simonovsky, Nikos Komodakis |
Abstract | A number of problems can be formulated as prediction on graph-structured data. In this work, we generalize the convolution operator from regular grids to arbitrary graphs while avoiding the spectral domain, which allows us to handle graphs of varying size and connectivity. To move beyond a simple diffusion, filter weights are conditioned on the specific edge labels in the neighborhood of a vertex. Together with the proper choice of graph coarsening, we explore constructing deep neural networks for graph classification. In particular, we demonstrate the generality of our formulation in point cloud classification, where we set the new state of the art, and on a graph classification dataset, where we outperform other deep learning approaches. The source code is available at https://github.com/mys007/ecc |
Tasks | Graph Classification |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.02901v3, http://arxiv.org/pdf/1704.02901v3.pdf |
PWC | https://paperswithcode.com/paper/dynamic-edge-conditioned-filters-in |
Repo | https://github.com/mys007/ecc |
Framework | pytorch |
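The phrase "filter weights are conditioned on the specific edge labels" can be illustrated with a small spatial graph convolution in which a function maps each edge label to a weight matrix, and each vertex averages the transformed features of its neighbours. The sketch below is a simplified reading, not the code from the linked repository: `filter_net` stands in for the learned filter-generating network, the averaging and the handling of isolated vertices are assumptions, and self-loops are omitted.

```python
import numpy as np

def edge_conditioned_conv(x, edges, edge_labels, filter_net, bias):
    """One edge-conditioned convolution step.

    x           -- (N, d_in) vertex features
    edges       -- list of (j, i) pairs, meaning vertex j is a neighbour of i
    edge_labels -- one label vector per edge
    filter_net  -- callable mapping a label to a (d_out, d_in) weight matrix
    bias        -- (d_out,) bias vector
    """
    n = x.shape[0]
    out = np.zeros((n, bias.shape[0]))
    counts = np.zeros(n)
    for (j, i), label in zip(edges, edge_labels):
        out[i] += filter_net(label) @ x[j]  # weights depend on this edge's label
        counts[i] += 1
    counts = np.maximum(counts, 1)  # isolated vertices keep a zero aggregate
    return out / counts[:, None] + bias

def scaled_identity(label):
    """Hypothetical stand-in for a learned filter-generating network."""
    return label[0] * np.eye(2)

x = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
edges = [(0, 2), (1, 2), (2, 0)]
labels = [np.array([1.0]), np.array([1.0]), np.array([0.5])]
y = edge_conditioned_conv(x, edges, labels, scaled_identity, np.zeros(2))
```

Nothing here depends on a fixed grid or a graph Laplacian, so graphs of different sizes and connectivity can pass through the same layer.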
Single Image Super-Resolution with Dilated Convolution based Multi-Scale Information Learning Inception Module
Title | Single Image Super-Resolution with Dilated Convolution based Multi-Scale Information Learning Inception Module |
Authors | Wuzhen Shi, Feng Jiang, Debin Zhao |
Abstract | Traditional works have shown that patches in a natural image tend to redundantly recur many times inside the image, both within the same scale and across different scales. Making full use of this multi-scale information can improve image restoration performance. However, currently proposed deep learning based restoration methods do not take multi-scale information into account. In this paper, we propose a dilated convolution based inception module to learn multi-scale information and design a deep network for single image super-resolution. Different dilated convolutions learn features at different scales, and the inception module then concatenates all these features to fuse multi-scale information. In order to increase the receptive field of our network and capture more contextual information, we cascade multiple inception modules to constitute a deep network for single image super-resolution. With the novel dilated convolution based inception module, the proposed end-to-end single image super-resolution network can take advantage of multi-scale information to improve image super-resolution performance. Experimental results show that our proposed method outperforms many state-of-the-art single image super-resolution methods. |
Tasks | Image Restoration, Image Super-Resolution, Super-Resolution |
Published | 2017-07-22 |
URL | http://arxiv.org/abs/1707.07128v1, http://arxiv.org/pdf/1707.07128v1.pdf |
PWC | https://paperswithcode.com/paper/single-image-super-resolution-with-dilated |
Repo | https://github.com/wzhshi/MSSRNet |
Framework | none |
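The way parallel dilated convolutions capture several scales and are then concatenated can be shown on a one-dimensional signal. The sketch below is a simplified illustration, not the paper's two-dimensional network: the function names, the dilation rates (1, 2, 3), the odd kernel length and the zero padding are assumptions chosen to keep the example short.

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation):
    """'Same'-size 1D correlation with zero padding and `dilation` spacing
    between kernel taps. Assumes an odd kernel length."""
    k = len(kernel)
    pad = dilation * (k - 1) // 2
    padded = np.pad(signal, pad)
    return np.array([
        sum(kernel[m] * padded[n + m * dilation] for m in range(k))
        for n in range(len(signal))
    ])

def inception_block(signal, kernel, dilations=(1, 2, 3)):
    """Stack the responses of several dilation rates as separate feature channels."""
    return np.stack([dilated_conv1d(signal, kernel, d) for d in dilations])

signal = np.zeros(9)
signal[4] = 1.0                                   # unit impulse in the middle
features = inception_block(signal, [1.0, 2.0, 3.0])
```

With an impulse as input, each output channel shows the same three kernel taps spread further apart as the dilation grows, which is how a fixed number of weights covers a wider context.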
Label-driven weakly-supervised learning for multimodal deformable image registration
Title | Label-driven weakly-supervised learning for multimodal deformable image registration |
Authors | Yipeng Hu, Marc Modat, Eli Gibson, Nooshin Ghavami, Ester Bonmati, Caroline M. Moore, Mark Emberton, J. Alison Noble, Dean C. Barratt, Tom Vercauteren |
Abstract | Spatially aligning medical images from different modalities remains a challenging task, especially for intraoperative applications that require fast and robust algorithms. We propose a weakly-supervised, label-driven formulation for learning 3D voxel correspondence from higher-level label correspondence, thereby bypassing classical intensity-based image similarity measures. During training, a convolutional neural network is optimised by outputting a dense displacement field (DDF) that warps a set of available anatomical labels from the moving image to match their corresponding counterparts in the fixed image. These label pairs, including solid organs, ducts, vessels, point landmarks and other ad hoc structures, are only required at training time and can be spatially aligned by minimising a cross-entropy function of the warped moving label and the fixed label. During inference, the trained network takes a new image pair to predict an optimal DDF, resulting in a fully-automatic, label-free, real-time and deformable registration. For interventional applications where large global transformation prevails, we also propose a neural network architecture to jointly optimise the global and local displacements. Experimental results are presented based on cross-validating registrations of 111 pairs of T2-weighted magnetic resonance images and 3D transrectal ultrasound images from prostate cancer patients with a total of over 4000 anatomical labels, yielding a median target registration error of 4.2 mm on landmark centroids and a median Dice of 0.88 on prostate glands. |
Tasks | Image Registration |
Published | 2017-11-05 |
URL | http://arxiv.org/abs/1711.01666v2, http://arxiv.org/pdf/1711.01666v2.pdf |
PWC | https://paperswithcode.com/paper/label-driven-weakly-supervised-learning-for |
Repo | https://github.com/Duoduo-Qian/Medical-image-registration-Resources |
Framework | pytorch |
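The Dice score reported in this abstract measures the overlap between a warped moving label and the corresponding fixed label. The sketch below shows the standard definition on binary masks; the function name, the toy 4x4 masks and the convention that two empty masks score 1.0 are assumptions for illustration, and this is not the authors' evaluation code.

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention assumed here: two empty masks match perfectly
    return 2.0 * np.logical_and(a, b).sum() / total

fixed = np.zeros((4, 4), dtype=int)
fixed[1:3, 1:3] = 1           # 4-pixel label in the fixed image
warped = np.zeros((4, 4), dtype=int)
warped[1:3, 2:4] = 1          # same shape, shifted one pixel to the right
score = dice(fixed, warped)
```

A score of 1 means the two labels coincide exactly and 0 means they do not overlap at all; the same function applies unchanged to 3D volumes.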