July 27, 2019

3385 words 16 mins read

Paper Group ANR 481


Top-down Transformation Choice. MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features. Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks. Dynamics Transfer GAN: Generating Video by Transferring Arbitrary Temporal Dynamics from a Source Video to a Single Target Image. …

Top-down Transformation Choice

Title Top-down Transformation Choice
Authors Torsten Hothorn
Abstract Simple models are preferred over complex models, but over-simplistic models could lead to erroneous interpretations. The classical approach is to start with a simple model, whose shortcomings are assessed in residual-based model diagnostics. Eventually, one increases the complexity of this initial overly simple model and obtains a better-fitting model. I illustrate how transformation analysis can be used as an alternative approach to model choice. Instead of adding complexity to simple models, step-wise complexity reduction is used to help identify simpler and more interpretable models. As an example, body mass index distributions in Switzerland are modelled by means of transformation models to understand the impact of sex, age, smoking and other lifestyle factors on a person’s body mass index. In this process, I searched for a compromise between model fit and model interpretability. Special emphasis is given to the understanding of the connections between transformation models of increasing complexity. The models used in this analysis ranged from evergreens, such as the normal linear regression model with constant variance, to novel models with extremely flexible conditional distribution functions, such as transformation trees and transformation forests.
Tasks
Published 2017-06-26
URL http://arxiv.org/abs/1706.08269v2
PDF http://arxiv.org/pdf/1706.08269v2.pdf
PWC https://paperswithcode.com/paper/top-down-transformation-choice
Repo
Framework
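
To make the transformation-model idea concrete, here is a minimal sketch (in Python with scipy, not the paper's R implementation): the normal linear regression model, the simplest member of the hierarchy discussed above, rewritten as a transformation model P(Y <= y | x) = Phi(a*y + b - x'beta) and fitted by maximum likelihood on synthetic data.

```python
# Minimal sketch, assuming synthetic data: normal linear regression viewed
# as a transformation model P(Y <= y | x) = Phi(a*y + b - x @ beta), a > 0.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))                     # two hypothetical covariates
y = 1.5 + x @ np.array([2.0, -1.0]) + rng.normal(scale=0.7, size=n)

def neg_log_lik(theta):
    log_a, b, beta = theta[0], theta[1], theta[2:]
    a = np.exp(log_a)                           # enforce a monotone transformation
    z = a * y + b - x @ beta
    # density of y: phi(z) * dz/dy = phi(z) * a
    return -np.sum(norm.logpdf(z) + log_a)

res = minimize(neg_log_lik, np.zeros(2 + x.shape[1]), method="BFGS")
a_hat = np.exp(res.x[0])
print("sigma ~", 1.0 / a_hat)                   # recovers the residual sd (~0.7)
print("beta  ~", res.x[2:] / a_hat)             # recovers the slopes (~2, -1)
```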

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

Title MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
Authors Liang-Chieh Chen, Alexander Hermans, George Papandreou, Florian Schroff, Peng Wang, Hartwig Adam
Abstract In this work, we tackle the problem of instance segmentation, the task of simultaneously solving object detection and semantic segmentation. Towards this goal, we present a model, called MaskLab, which produces three outputs: box detection, semantic segmentation, and direction prediction. Building on top of the Faster-RCNN object detector, the predicted boxes provide accurate localization of object instances. Within each region of interest, MaskLab performs foreground/background segmentation by combining semantic and direction prediction. Semantic segmentation assists the model in distinguishing between objects of different semantic classes including background, while the direction prediction, estimating each pixel’s direction towards its corresponding center, allows separating instances of the same semantic class. Moreover, we explore the effect of incorporating recent successful methods from both segmentation and detection (i.e., atrous convolution and hypercolumn). Our proposed model is evaluated on the COCO instance segmentation benchmark and shows comparable performance with other state-of-the-art models.
Tasks Instance Segmentation, Object Detection, Semantic Segmentation
Published 2017-12-13
URL http://arxiv.org/abs/1712.04837v1
PDF http://arxiv.org/pdf/1712.04837v1.pdf
PWC https://paperswithcode.com/paper/masklab-instance-segmentation-by-refining
Repo
Framework
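
A hedged reconstruction of the direction target described above (shapes and the binning scheme are assumptions, not the authors' code): each foreground pixel is labelled with the quantised angle pointing towards its instance's centre of mass, which is what lets MaskLab split touching instances of the same class.

```python
# Sketch of a direction-prediction ground truth: every foreground pixel gets
# the angle towards its instance's mass centre, quantised into K bins.
import numpy as np

def direction_targets(instance_map, num_bins=8):
    """instance_map: (H, W) int array, 0 = background, k > 0 = instance id."""
    h, w = instance_map.shape
    bins = np.full((h, w), -1, dtype=np.int32)         # -1 = background/ignore
    ys, xs = np.mgrid[0:h, 0:w]
    for inst_id in np.unique(instance_map):
        if inst_id == 0:
            continue
        mask = instance_map == inst_id
        cy, cx = ys[mask].mean(), xs[mask].mean()       # instance centre
        angle = np.arctan2(cy - ys[mask], cx - xs[mask])  # pixel -> centre
        # map [-pi, pi] to bin indices 0..num_bins-1
        bins[mask] = ((angle + np.pi) / (2 * np.pi) * num_bins).astype(int) % num_bins
    return bins

# Two touching instances of the same class become separable via their bins.
toy = np.zeros((6, 8), dtype=int)
toy[1:5, 1:4] = 1
toy[1:5, 4:7] = 2
print(direction_targets(toy))
```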

Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks

Title Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks
Authors Stephan Baier, Sigurd Spieckermann, Volker Tresp
Abstract With the rising number of interconnected devices and sensors, modeling distributed sensor networks is of increasing interest. Recurrent neural networks (RNN) are considered particularly well suited for modeling sensory and streaming data. When predicting future behavior, incorporating information from neighboring sensor stations is often beneficial. We propose a new RNN-based architecture for context-specific information fusion across multiple spatially distributed sensor stations. Hereby, latent representations of multiple local models, each modeling one sensor station, are joined and weighted according to their importance for the prediction. The particular importance is assessed depending on the current context using a separate attention function. We demonstrate the effectiveness of our model on three different real-world sensor network datasets.
Tasks
Published 2017-11-13
URL http://arxiv.org/abs/1711.04679v1
PDF http://arxiv.org/pdf/1711.04679v1.pdf
PWC https://paperswithcode.com/paper/attention-based-information-fusion-using
Repo
Framework
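
A minimal PyTorch sketch of the described architecture (layer sizes, the GRU choice, and the scoring function are assumptions, not the authors' code): per-station encoders produce latent states, a separate attention function scores them against the target station's context, and the decoder consumes the weighted fusion.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, n_stations, in_dim, hid_dim):
        super().__init__()
        # one encoder RNN per sensor station (local models)
        self.encoders = nn.ModuleList(
            [nn.GRU(in_dim, hid_dim, batch_first=True) for _ in range(n_stations)]
        )
        self.att = nn.Linear(2 * hid_dim, 1)     # context-dependent scoring
        self.decoder = nn.GRU(hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, 1)

    def forward(self, xs, target_idx, horizon):
        # xs: list of (batch, time, in_dim) tensors, one per station
        lat = [enc(x)[1][-1] for enc, x in zip(self.encoders, xs)]  # (B, H) each
        lat = torch.stack(lat, dim=1)                               # (B, S, H)
        ctx = lat[:, target_idx:target_idx + 1].expand_as(lat)      # current context
        scores = self.att(torch.cat([lat, ctx], dim=-1))            # (B, S, 1)
        weights = torch.softmax(scores, dim=1)                      # importance
        fused = (weights * lat).sum(dim=1, keepdim=True)            # (B, 1, H)
        h, _ = self.decoder(fused.expand(-1, horizon, -1))
        return self.out(h), weights.squeeze(-1)

model = AttentionFusion(n_stations=3, in_dim=4, hid_dim=16)
xs = [torch.randn(2, 20, 4) for _ in range(3)]
pred, w = model(xs, target_idx=0, horizon=5)
print(pred.shape, w.shape)   # torch.Size([2, 5, 1]) torch.Size([2, 3])
```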

Dynamics Transfer GAN: Generating Video by Transferring Arbitrary Temporal Dynamics from a Source Video to a Single Target Image

Title Dynamics Transfer GAN: Generating Video by Transferring Arbitrary Temporal Dynamics from a Source Video to a Single Target Image
Authors Wissam J. Baddar, Geonmo Gu, Sangmin Lee, Yong Man Ro
Abstract In this paper, we propose Dynamics Transfer GAN, a new method for generating video sequences based on generative adversarial learning. The spatial constructs of a generated video sequence are acquired from the target image. The dynamics of the generated video sequence are imported from a source video sequence, with arbitrary motion, and imposed onto the target image. To preserve the spatial construct of the target image, the appearance of the source video sequence is suppressed and only the dynamics are obtained before being imposed onto the target image. That is achieved using the proposed appearance suppressed dynamics feature. Moreover, the spatial and temporal consistencies of the generated video sequence are verified via two discriminator networks. One discriminator validates the fidelity of the generated frames’ appearance, while the other validates the dynamic consistency of the generated video sequence. Experiments have been conducted to verify the quality of the video sequences generated by the proposed method. The results verified that Dynamics Transfer GAN successfully transferred arbitrary dynamics of the source video sequence onto a target image when generating the output video sequence. The experimental results also showed that Dynamics Transfer GAN maintained the spatial constructs (appearance) of the target image while generating spatially and temporally consistent video sequences.
Tasks
Published 2017-12-10
URL http://arxiv.org/abs/1712.03534v1
PDF http://arxiv.org/pdf/1712.03534v1.pdf
PWC https://paperswithcode.com/paper/dynamics-transfer-gan-generating-video-by
Repo
Framework
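
A toy numpy illustration of separating a static appearance component from temporal dynamics (the paper's appearance-suppressed dynamics feature is learned inside the GAN; this stand-in only conveys the intuition):

```python
import numpy as np

def split_appearance_dynamics(video):
    """video: (T, H, W) frames. Returns (static appearance, dynamics)."""
    appearance = video.mean(axis=0)               # static, time-averaged component
    dynamics = np.diff(video, axis=0)             # temporal changes only
    return appearance, dynamics

rng = np.random.default_rng(1)
scene = rng.random((32, 32))
video = np.stack([np.roll(scene, t, axis=1) for t in range(8)])  # moving scene

app, dyn = split_appearance_dynamics(video)
_, static_dyn = split_appearance_dynamics(np.stack([scene] * 8))
print(float(np.abs(static_dyn).max()))   # 0.0: a static video carries no dynamics
print(float(np.abs(dyn).mean()) > 0)     # True: the moving one does
```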

Acquisition of Translation Lexicons for Historically Unwritten Languages via Bridging Loanwords

Title Acquisition of Translation Lexicons for Historically Unwritten Languages via Bridging Loanwords
Authors Michael Bloodgood, Benjamin Strauss
Abstract With the advent of informal electronic communications such as social media, colloquial languages that were historically unwritten are being written for the first time in heavily code-switched environments. We present a method for inducing portions of translation lexicons through the use of expert knowledge in these settings where there are approximately zero resources available other than a language informant, potentially not even large amounts of monolingual data. We investigate inducing a Moroccan Darija-English translation lexicon via French loanwords bridging into English and find that a useful lexicon is induced for human-assisted translation and statistical machine translation.
Tasks Machine Translation
Published 2017-06-06
URL http://arxiv.org/abs/1706.01570v2
PDF http://arxiv.org/pdf/1706.01570v2.pdf
PWC https://paperswithcode.com/paper/acquisition-of-translation-lexicons-for
Repo
Framework
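
Once the loanword correspondences are known, the bridging step reduces to a dictionary composition; a toy sketch (the hard part, inducing the Darija-to-French loanword pairs from expert knowledge, is only stubbed here with hypothetical entries):

```python
# Compose a Darija->French loanword table with a French->English dictionary
# to induce Darija->English pairs. All entries below are hypothetical.
darija_to_french = {            # romanised loanword pairs (stubbed)
    "tomobil": "automobile",
    "telefun": "téléphone",
}
french_to_english = {
    "automobile": "car",
    "téléphone": "telephone",
}

induced = {
    dar: french_to_english[fr]
    for dar, fr in darija_to_french.items()
    if fr in french_to_english
}
print(induced)   # {'tomobil': 'car', 'telefun': 'telephone'}
```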

Fast Barcode Retrieval for Consensus Contouring

Title Fast Barcode Retrieval for Consensus Contouring
Authors H. R. Tizhoosh, G. J. Czarnota
Abstract Marking tumors and organs is a challenging task suffering from both inter- and intra-observer variability. The literature quantifies observer variability by generating consensus among multiple experts when they mark the same image. Automatically building consensus contours to establish quality assurance for image segmentation is presently absent in clinical practice. As big data becomes more and more available, techniques to access a large number of existing segments from multiple experts become possible. Fast algorithms are hence required to facilitate the search for similar cases. The present work puts forward a framework that, tested on small datasets (both synthetic and real images), demonstrates reliable retrieval of similar images. In this paper, the idea of content-based barcodes is used to retrieve similar cases in order to build consensus contours in medical image segmentation. This approach may be regarded as an extension of conventional atlas-based segmentation, which generally works with rather small atlases due to the required computational expense. The fast segment-retrieval process via barcodes makes it possible to create and use large atlases, something that directly contributes to the quality of the consensus building. Because the accuracy of experts’ contours must be measured, we first used 500 synthetic prostate images with their gold markers and delineations by 20 simulated users. The fast barcode-guided computed consensus delivered an average error of 8% ± 5% compared against the gold-standard segments. Furthermore, we used magnetic resonance images of prostates from 15 patients delineated by 5 oncologists and selected the best delineations to serve as the gold-standard segments. The proposed barcode atlas achieved a Jaccard overlap of 87% ± 9% with the contours of the gold-standard segments.
Tasks Medical Image Segmentation, Semantic Segmentation
Published 2017-09-28
URL http://arxiv.org/abs/1709.10197v1
PDF http://arxiv.org/pdf/1709.10197v1.pdf
PWC https://paperswithcode.com/paper/fast-barcode-retrieval-for-consensus
Repo
Framework
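
A sketch in the spirit of content-based barcodes (hedged: the paper's barcodes are projection-based, and plain axis-aligned projections are used here to keep the example dependency-free): binarise image projections into a bit string and retrieve the most similar atlas case by Hamming distance.

```python
import numpy as np

def barcode(img):
    rows, cols = img.mean(axis=1), img.mean(axis=0)
    proj = np.concatenate([rows, cols])
    return (proj >= np.median(proj)).astype(np.uint8)   # binary barcode

def retrieve(query, atlas):
    codes = np.stack([barcode(im) for im in atlas])
    dists = (codes != barcode(query)).sum(axis=1)       # Hamming distance
    return int(dists.argmin())

rng = np.random.default_rng(2)
atlas = [rng.random((64, 64)) for _ in range(100)]
noisy_query = atlas[37] + 0.05 * rng.standard_normal((64, 64))
print(retrieve(noisy_query, atlas))                     # -> 37
```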

Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels

Title Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels
Authors Caglar Aytekin, Xingyang Ni, Francesco Cricri, Lixin Fan, Emre Aksu
Abstract Computer vision algorithms with pixel-wise labeling tasks, such as semantic segmentation and salient object detection, have gone through a significant accuracy increase with the incorporation of deep learning. Deep segmentation methods slightly modify and fine-tune pre-trained networks that have hundreds of millions of parameters. In this work, we question the need to have such memory demanding networks for the specific task of salient object segmentation. To this end, we propose a way to learn a memory-efficient network from scratch by training it only on salient object detection datasets. Our method encodes images to gridized superpixels that preserve both the object boundaries and the connectivity rules of regular pixels. This representation allows us to use convolutional neural networks that operate on regular grids. By using these encoded images, we train a memory-efficient network using only 0.048% of the number of parameters that other deep salient object detection networks have. Our method shows comparable accuracy with the state-of-the-art deep salient object detection methods and provides a faster and much more memory-efficient alternative to them. Due to its easy deployment, such a network is preferable for applications in memory-limited devices such as mobile phones and IoT devices.
Tasks Object Detection, Salient Object Detection, Semantic Segmentation
Published 2017-12-27
URL http://arxiv.org/abs/1712.09558v2
PDF http://arxiv.org/pdf/1712.09558v2.pdf
PWC https://paperswithcode.com/paper/memory-efficient-deep-salient-object
Repo
Framework
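
A hedged stand-in for the gridized-superpixel encoding (the authors snap boundary-preserving superpixels onto a regular lattice; plain block averaging is used here so the example stays dependency-free): the image becomes a small regular grid that an ordinary CNN can consume.

```python
import numpy as np

def gridize(img, grid=32):
    """img: (H, W, C) array; returns (grid, grid, C) cell means."""
    h, w, c = img.shape
    ys = np.linspace(0, h, grid + 1).astype(int)
    xs = np.linspace(0, w, grid + 1).astype(int)
    out = np.empty((grid, grid, c))
    for i in range(grid):
        for j in range(grid):
            # each cell of the regular grid summarises one image region
            out[i, j] = img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean(axis=(0, 1))
    return out

img = np.random.default_rng(3).random((240, 320, 3))
print(gridize(img).shape)    # (32, 32, 3): ~75x fewer inputs than 240x320
```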

Indexing the Event Calculus with Kd-trees to Monitor Diabetes

Title Indexing the Event Calculus with Kd-trees to Monitor Diabetes
Authors Stefano Bromuri, Albert Brugues de la Torre, Fabien Duboisson, Michael Schumacher
Abstract Personal Health Systems (PHS) are mobile solutions tailored to monitoring patients affected by chronic non-communicable diseases. A patient affected by a chronic disease can generate large amounts of events. Type 1 diabetic patients generate several glucose events per day, ranging from at least 6 events per day (under normal monitoring) to 288 per day when wearing a continuous glucose monitor (CGM) that samples the blood every 5 minutes for several days. This is a large number of events for medical doctors to monitor, in particular considering that they may have to take decisions about adjusting the treatment, which may impact the life of the patients for a long time. Given the need to analyse such a large stream of data, doctors need a simple approach to physiological time series that allows them to promptly transfer their knowledge into queries that identify interesting patterns in the data. Achieving this with current technology is not an easy task: on the one hand, medical doctors cannot be expected to have the technical knowledge to query databases; on the other hand, these time series include thousands of events, which requires re-thinking the way data is indexed. In order to tackle the knowledge representation and efficiency problems, this contribution presents the kd-tree cached event calculus (CECKD), an event calculus extension for knowledge engineering of temporal rules capable of handling the many thousands of events produced by a diabetic patient. CECKD is built as a support to a graphical interface for representing monitoring rules for type 1 diabetes. In addition, the paper evaluates CECKD against the cached event calculus (CEC) to show how indexing events using kd-trees improves scalability with respect to the current state of the art.
Tasks Time Series
Published 2017-10-03
URL http://arxiv.org/abs/1710.01275v1
PDF http://arxiv.org/pdf/1710.01275v1.pdf
PWC https://paperswithcode.com/paper/indexing-the-event-calculus-with-kd-trees-to
Repo
Framework
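
A minimal sketch of the indexing idea (not the authors' event-calculus implementation): glucose events become points in a (time, value) space indexed by a kd-tree, so a monitoring rule such as "glucose between 180 and 250 mg/dl within the last 6 hours" turns into a cheap spatial range query instead of a scan over all events.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 72, 864))            # ~3 days of CGM samples (hours)
glucose = rng.normal(140, 40, t.size)           # mg/dl, synthetic
events = np.column_stack([t, glucose])

# Box query expressed as a Chebyshev ball after per-dimension scaling:
halfwidths = np.array([3.0, 35.0])              # 6 h time window, 180-250 mg/dl band
tree = cKDTree(events / halfwidths)
centre = np.array([72.0 - 3.0, 215.0]) / halfwidths
hits = tree.query_ball_point(centre, r=1.0, p=np.inf)
print(len(hits), "hyperglycaemic events in the last 6 hours")
```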

Kernelized Hashcode Representations for Relation Extraction

Title Kernelized Hashcode Representations for Relation Extraction
Authors Sahil Garg, Aram Galstyan, Greg Ver Steeg, Irina Rish, Guillermo Cecchi, Shuyang Gao
Abstract Kernel methods have produced state-of-the-art results for a number of NLP tasks such as relation extraction, but suffer from poor scalability due to the high cost of computing kernel similarities between natural language structures. A recently proposed technique, kernelized locality-sensitive hashing (KLSH), can significantly reduce the computational cost, but is only applicable to classifiers operating on kNN graphs. Here we propose to use random subspaces of KLSH codes for efficiently constructing an explicit representation of NLP structures suitable for general classification methods. Further, we propose an approach for optimizing the KLSH model for classification problems by maximizing an approximation of mutual information between the KLSH codes (feature vectors) and the class labels. We evaluate the proposed approach on biomedical relation extraction datasets, and observe significant and robust improvements in accuracy w.r.t. state-of-the-art classifiers, along with drastic (orders-of-magnitude) speedup compared to conventional kernel methods.
Tasks Relation Extraction
Published 2017-11-10
URL https://arxiv.org/abs/1711.04044v7
PDF https://arxiv.org/pdf/1711.04044v7.pdf
PWC https://paperswithcode.com/paper/kernelized-hashcode-representations-for
Repo
Framework
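
A simplified numpy sketch of the pipeline (hedged: real KLSH constructs its hash hyperplanes to be approximately Gaussian in the kernel-induced feature space; plain Gaussian weights over kernel similarities to random landmarks are used here): build binary codes, then expose random subspaces of the bits as explicit features for any classifier.

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 10))                   # stand-in for kernelised NLP structures
landmarks = X[rng.choice(len(X), 20, replace=False)]
W = rng.normal(size=(20, 64))                    # 64 hash functions
codes = (rbf(X, landmarks) @ W > 0).astype(np.uint8)    # (200, 64) KLSH-style codes

# Random subspaces: each picks a few bits; the concatenation is the feature map.
subspaces = [rng.choice(64, size=8, replace=False) for _ in range(10)]
features = np.concatenate([codes[:, s] for s in subspaces], axis=1)
print(features.shape)                            # (200, 80), ready for any classifier
```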

Adversarial Structured Prediction for Multivariate Measures

Title Adversarial Structured Prediction for Multivariate Measures
Authors Hong Wang, Ashkan Rezaei, Brian D. Ziebart
Abstract Many predicted structured objects (e.g., sequences, matchings, trees) are evaluated using the F-score, alignment error rate (AER), or other multivariate performance measures. Since inductively optimizing these measures using training data is typically computationally difficult, empirical risk minimization of surrogate losses is employed, using, e.g., the hinge loss for (structured) support vector machines. These approximations often introduce a mismatch between the learner’s objective and the desired application performance, leading to inconsistency. We take a different approach: adversarially approximate training data while optimizing the exact F-score or AER. Structured predictions under this formulation result from solving zero-sum games between a predictor seeking the best performance and an adversary seeking the worst while required to (approximately) match certain structured properties of the training data. We explore this approach for word alignment (AER evaluation) and named entity recognition (F-score evaluation) with linear-chain constraints.
Tasks Named Entity Recognition, Structured Prediction, Word Alignment
Published 2017-12-20
URL http://arxiv.org/abs/1712.07374v2
PDF http://arxiv.org/pdf/1712.07374v2.pdf
PWC https://paperswithcode.com/paper/adversarial-structured-prediction-for
Repo
Framework
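
A toy sketch of the game-theoretic formulation (hedged: the paper solves these games efficiently and constrains the adversary to approximately match statistics of the training data, both of which are dropped here): the zero-sum game over all labelings of three binary positions, with F1 as the payoff, solved exactly by linear programming.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def f1(a, b):
    tp = sum(x & y for x, y in zip(a, b))
    denom = sum(a) + sum(b)
    return 1.0 if denom == 0 else 2.0 * tp / denom

labelings = list(itertools.product([0, 1], repeat=3))   # 8 pure strategies
A = np.array([[f1(p, q) for q in labelings] for p in labelings])

# maximise v subject to sum_i p_i * A[i, j] >= v for every adversary column j
n = len(labelings)
c = np.zeros(n + 1); c[-1] = -1.0                       # linprog minimises -v
A_ub = np.hstack([-A.T, np.ones((n, 1))])               # v - p^T A[:, j] <= 0
res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
              A_eq=np.hstack([np.ones((1, n)), [[0.0]]]), b_eq=[1.0],
              bounds=[(0, 1)] * n + [(None, None)])
print("game value (worst-case F1):", -res.fun)
```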

A Convex Similarity Index for Sparse Recovery of Missing Image Samples

Title A Convex Similarity Index for Sparse Recovery of Missing Image Samples
Authors Amirhossein Javaheri, Hadi Zayyani, Farokh Marvasti
Abstract This paper investigates the problem of recovering missing samples using methods based on sparse representation adapted especially for image signals. Instead of the $l_2$-norm or Mean Square Error (MSE), a new perceptual quality measure is used as the similarity criterion between the original and the reconstructed images. The proposed criterion, called the Convex SIMilarity (CSIM) index, is a modified version of the Structural SIMilarity (SSIM) index which, unlike its predecessor, is convex and uni-modal. We derive mathematical properties for the proposed index and show how to optimally choose the parameters of the proposed criterion, investigating its Restricted Isometry Property (RIP) and error-sensitivity properties. We also propose an iterative sparse recovery method based on a constrained $l_1$-norm minimization problem, incorporating CSIM as the fidelity criterion. The resulting convex optimization problem is solved via an algorithm based on the Alternating Direction Method of Multipliers (ADMM). Taking advantage of the convexity of the CSIM index, we also prove the convergence of the algorithm to the globally optimal solution of the proposed optimization problem, starting from any arbitrary point. Simulation results confirm the performance of the new similarity index as well as the proposed algorithm for missing sample recovery of image patch signals.
Tasks
Published 2017-01-25
URL http://arxiv.org/abs/1701.07422v3
PDF http://arxiv.org/pdf/1701.07422v3.pdf
PWC https://paperswithcode.com/paper/a-convex-similarity-index-for-sparse-recovery
Repo
Framework
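
A hedged stand-in for the recovery step (the paper's fidelity term is its convex CSIM index and its solver is ADMM; plain squared error minimised with ISTA is used here to keep the sketch short): recover missing samples of an image patch by promoting sparsity in an orthonormal DCT basis.

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(6)
n = 32
u, v = np.meshgrid(np.arange(n), np.arange(n))
patch = np.cos(2 * np.pi * u / 8) + np.cos(2 * np.pi * v / 16)  # DCT-sparse signal
mask = rng.random((n, n)) < 0.5                  # keep 50% of the samples
y = np.where(mask, patch, 0.0)

alpha = np.zeros((n, n))                          # DCT coefficients
lam, step = 0.01, 1.0                             # step=1 is valid since ||M|| <= 1
for _ in range(300):
    resid = mask * idctn(alpha, norm="ortho") - y
    grad = dctn(mask * resid, norm="ortho")       # gradient of the data term
    z = alpha - step * grad
    alpha = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold

recovered = idctn(alpha, norm="ortho")
print("masked-region MAE:", float(np.abs((recovered - patch)[~mask]).mean()))
```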

Stretching Domain Adaptation: How far is too far?

Title Stretching Domain Adaptation: How far is too far?
Authors Yunhan Zhao, Haider Ali, Rene Vidal
Abstract While deep learning has led to significant advances in visual recognition over the past few years, such advances often require a lot of annotated data. Unsupervised domain adaptation has emerged as an alternative approach that does not require as much annotated data; however, prior evaluations of domain adaptation approaches have been limited to relatively similar datasets, e.g., source and target domains that are samples captured by different cameras. A new data suite is proposed that comprehensively evaluates cross-modality domain adaptation problems. This work pushes the limit of unsupervised domain adaptation through an in-depth evaluation of several state-of-the-art methods on benchmark datasets and the new dataset suite. We also propose a new domain adaptation network called “Deep MagNet” that effectively transfers knowledge for cross-modality domain adaptation problems. Deep MagNet achieves state-of-the-art performance on two benchmark datasets. More importantly, the proposed method shows consistent improvements in performance on the newly proposed dataset suite.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2017-12-06
URL http://arxiv.org/abs/1712.02286v2
PDF http://arxiv.org/pdf/1712.02286v2.pdf
PWC https://paperswithcode.com/paper/stretching-domain-adaptation-how-far-is-too
Repo
Framework

Learning Markov Chain in Unordered Dataset

Title Learning Markov Chain in Unordered Dataset
Authors Yao-Hung Hubert Tsai, Han Zhao, Ruslan Salakhutdinov, Nebojsa Jojic
Abstract The assumption that data samples are independently and identically distributed is the backbone of many learning algorithms. Nevertheless, datasets often exhibit rich structure in practice, and we argue that there exists some unknown order within the data instances. In this technical report, we introduce OrderNet, which can be used to extract the order of data instances in an unsupervised way. By assuming that the instances are sampled from a Markov chain, our goal is to learn the transition operator of the underlying Markov chain, as well as the order, by maximizing the generation probability under all possible data permutations. Specifically, we use a neural network as a compact and soft lookup table to approximate the possibly huge, but discrete, transition matrix. This strategy allows us to amortize the space complexity with a single model. Furthermore, this simple and compact representation also provides a short description of the dataset and generalizes to unseen instances as well. To ensure that the learned Markov chain is ergodic, we propose a greedy batch-wise permutation scheme that allows fast training. Empirically, we show that OrderNet is able to discover an order among data instances. We also extend the proposed OrderNet to the one-shot recognition task and demonstrate favorable results.
Tasks
Published 2017-11-08
URL http://arxiv.org/abs/1711.03167v3
PDF http://arxiv.org/pdf/1711.03167v3.pdf
PWC https://paperswithcode.com/paper/learning-markov-chain-in-unordered-dataset
Repo
Framework
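
A toy sketch of recovering an order (hedged: OrderNet learns the transition operator with a neural network and trains with a batch-wise permutation scheme; a fixed RBF similarity stands in for the learned operator here): greedily chain instances so that each step takes the highest-scoring successor.

```python
import numpy as np

rng = np.random.default_rng(7)
order = rng.permutation(20)
X = np.cumsum(rng.normal(size=(20, 5)), axis=0)[order]   # shuffled Markov walk

d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
score = np.exp(-d2)                                       # stand-in transition operator
np.fill_diagonal(score, -np.inf)

def greedy_chain(start):
    chain, used, total = [start], {start}, 0.0
    while len(chain) < len(X):
        s = score[chain[-1]].copy()
        s[list(used)] = -np.inf                           # no instance visited twice
        nxt = int(s.argmax())
        total += score[chain[-1], nxt]
        chain.append(nxt); used.add(nxt)
    return total, chain

best = max(greedy_chain(s) for s in range(len(X)))[1]     # best-scoring chain
print(order[best])   # temporal positions along the chain: ~monotone if recovered
```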

Face Detection, Bounding Box Aggregation and Pose Estimation for Robust Facial Landmark Localisation in the Wild

Title Face Detection, Bounding Box Aggregation and Pose Estimation for Robust Facial Landmark Localisation in the Wild
Authors Zhen-Hua Feng, Josef Kittler, Muhammad Awais, Patrik Huber, Xiao-Jun Wu
Abstract We present a framework for robust face detection and landmark localisation of faces in the wild, which has been evaluated as part of ‘the 2nd Facial Landmark Localisation Competition’. The framework has four stages: face detection, bounding box aggregation, pose estimation and landmark localisation. To achieve a high detection rate, we use two publicly available CNN-based face detectors and two proprietary detectors. We aggregate the detected face bounding boxes of each input image to reduce false positives and improve face detection accuracy. A cascaded shape regressor, trained using faces with a variety of pose variations, is then employed for pose estimation and image pre-processing. Last, we train the final cascaded shape regressor for fine-grained landmark localisation, using a large number of training samples with limited pose variations. The experimental results obtained on the 300W and Menpo benchmarks demonstrate the superiority of our framework over state-of-the-art methods.
Tasks Face Alignment, Face Detection, Pose Estimation
Published 2017-05-05
URL http://arxiv.org/abs/1705.02402v2
PDF http://arxiv.org/pdf/1705.02402v2.pdf
PWC https://paperswithcode.com/paper/face-detection-bounding-box-aggregation-and
Repo
Framework
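
A plausible sketch of the aggregation stage (hedged assumptions: the clustering rule, IoU threshold and minimum support are illustrative, not the paper's exact procedure): detections from several detectors are clustered by IoU, and each cluster supported by at least two detectors is replaced by the average of its boxes, suppressing single-detector false positives.

```python
import numpy as np

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def aggregate(boxes, detector_ids, thr=0.5, min_support=2):
    clusters = []                                 # list of (box list, detector set)
    for box, det in zip(boxes, detector_ids):
        for members, dets in clusters:
            if iou(box, np.mean(members, axis=0)) >= thr:
                members.append(box); dets.add(det)
                break
        else:
            clusters.append(([box], {det}))
    # keep clusters confirmed by at least two distinct detectors
    return [np.mean(m, axis=0) for m, d in clusters if len(d) >= min_support]

boxes = [[10, 10, 50, 50], [12, 11, 52, 49], [200, 200, 220, 230]]
print(aggregate(boxes, detector_ids=[0, 1, 0]))   # the lone box is dropped
```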

Deep Face Deblurring

Title Deep Face Deblurring
Authors Grigorios G. Chrysos, Stefanos Zafeiriou
Abstract Blind deblurring is a long-studied task; however, the outcomes of generic methods are not effective on real-world blurred images. Domain-specific methods for deblurring targeted object categories, e.g. text or faces, frequently outperform their generic counterparts, hence they are attracting an increasing amount of attention. In this work, we develop such a domain-specific method to tackle the deblurring of human faces, henceforth referred to as face deblurring. Studying faces is of tremendous significance in computer vision, yet face deblurring has not yet demonstrated convincing results. This can be partly attributed to the combination of i) poor texture and ii) highly structured shape, which render the typically used contour/gradient priors sub-optimal. In our work, instead of making assumptions about the prior, we adopt a learning approach by inserting weak supervision that exploits the well-documented structure of the face. Namely, we utilise a deep network to perform the deblurring and employ a face alignment technique to pre-process each face. We additionally sidestep the deep network’s requirement for thousands of training samples by introducing an efficient framework that allows the generation of a large dataset. We utilised this framework to create 2MF2, a dataset of over two million frames. We conducted experiments with real-world blurred facial images and report that our method returns a result close to the sharp natural latent image.
Tasks Deblurring, Face Alignment
Published 2017-04-27
URL http://arxiv.org/abs/1704.08772v2
PDF http://arxiv.org/pdf/1704.08772v2.pdf
PWC https://paperswithcode.com/paper/deep-face-deblurring
Repo
Framework
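
A hedged sketch of the synthetic-pair idea behind datasets such as 2MF2 (the authors' generation pipeline is more involved and starts from video frames): convolve a sharp face crop with a random linear motion kernel to obtain aligned (blurred, sharp) training pairs for a deblurring network.

```python
import numpy as np
from scipy.ndimage import convolve, rotate

def motion_kernel(length=9, angle_deg=30.0):
    k = np.zeros((length, length))
    k[length // 2, :] = 1.0                       # horizontal streak
    k = rotate(k, angle_deg, reshape=False, order=1)
    return k / k.sum()                            # normalise to preserve brightness

def make_pair(sharp, rng):
    k = motion_kernel(length=rng.integers(5, 12),
                      angle_deg=rng.uniform(0, 180))
    return convolve(sharp, k, mode="nearest"), sharp

rng = np.random.default_rng(8)
sharp = rng.random((128, 128))                    # stand-in for a face crop
blurred, target = make_pair(sharp, rng)
print(blurred.shape, float(blurred.std()) < float(target.std()))  # blur smooths
```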