May 7, 2019

2926 words 14 mins read

Paper Group AWR 39

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Title Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
Authors Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi
Abstract Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
Tasks Image Super-Resolution, Super-Resolution
Published 2016-09-15
URL http://arxiv.org/abs/1609.04802v5
PDF http://arxiv.org/pdf/1609.04802v5.pdf
PWC https://paperswithcode.com/paper/photo-realistic-single-image-super-resolution
Repo https://github.com/idearibosome/tf-perceptual-eusr
Framework tf
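
The perceptual loss described in the abstract combines a content term computed in feature space with an adversarial term. A minimal NumPy sketch of that combination (the 10^-3 adversarial weight follows the paper; the feature maps and discriminator outputs below are stand-ins for VGG activations and a trained discriminator):

```python
import numpy as np

def content_loss(feat_sr, feat_hr):
    # MSE between feature-map activations (e.g., from a pretrained VGG)
    # rather than between raw pixels
    return np.mean((feat_sr - feat_hr) ** 2)

def adversarial_loss(d_sr):
    # generator term: push discriminator outputs on SR images towards 1
    return -np.mean(np.log(d_sr + 1e-12))

def perceptual_loss(feat_sr, feat_hr, d_sr, adv_weight=1e-3):
    # weighted sum of the content and adversarial terms
    return content_loss(feat_sr, feat_hr) + adv_weight * adversarial_loss(d_sr)
```

Measuring the content term on feature maps instead of pixels is what lets the network trade exact pixel fidelity for plausible texture.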

CogALex-V Shared Task: LexNET - Integrated Path-based and Distributional Method for the Identification of Semantic Relations

Title CogALex-V Shared Task: LexNET - Integrated Path-based and Distributional Method for the Identification of Semantic Relations
Authors Vered Shwartz, Ido Dagan
Abstract We present a submission to the CogALex 2016 shared task on the corpus-based identification of semantic relations, using LexNET (Shwartz and Dagan, 2016), an integrated path-based and distributional method for semantic relation classification. The reported results in the shared task bring this submission to third place on subtask 1 (word relatedness), and first place on subtask 2 (semantic relation classification), demonstrating the utility of integrating the complementary path-based and distributional information sources in recognizing concrete semantic relations. Combined with a common similarity measure, LexNET performs fairly well on the word relatedness task (subtask 1). The relatively low performance of LexNET and all other systems on subtask 2, however, confirms the difficulty of the semantic relation classification task, and stresses the need to develop additional methods for this task.
Tasks Relation Classification
Published 2016-10-27
URL http://arxiv.org/abs/1610.08694v3
PDF http://arxiv.org/pdf/1610.08694v3.pdf
PWC https://paperswithcode.com/paper/cogalex-v-shared-task-lexnet-integrated-path
Repo https://github.com/vered1986/LexNET
Framework none
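
LexNET's integrated representation concatenates the distributional signal (the two word embeddings) with the path-based signal (an encoding of the dependency paths connecting the word pair); a classifier then operates on the joint vector. A minimal sketch of that concatenation step (the vector contents here are placeholders):

```python
import numpy as np

def lexnet_representation(w1_emb, paths_vec, w2_emb):
    # concatenate: first word's embedding, averaged dependency-path
    # encoding, second word's embedding; a classifier consumes the result
    return np.concatenate([w1_emb, paths_vec, w2_emb])
```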

Learning a Deep Embedding Model for Zero-Shot Learning

Title Learning a Deep Embedding Model for Zero-Shot Learning
Authors Li Zhang, Tao Xiang, Shaogang Gong
Abstract Zero-shot learning (ZSL) models rely on learning a joint embedding space where both textual/semantic description of object classes and visual representation of object images can be projected to for nearest neighbour search. Despite the success of deep neural networks that learn an end-to-end model between text and images in other vision problems such as image captioning, very few deep ZSL models exist, and they show little advantage over ZSL models that utilise deep feature representations but do not learn an end-to-end embedding. In this paper we argue that the key to making deep ZSL models succeed is choosing the right embedding space. Instead of embedding into a semantic space or an intermediate space, we propose to use the visual space as the embedding space. This is because, in this space, the subsequent nearest neighbour search suffers much less from the hubness problem and thus becomes more effective. This model design also provides a natural mechanism for multiple semantic modalities (e.g., attributes and sentence descriptions) to be fused and optimised jointly in an end-to-end manner. Extensive experiments on four benchmarks show that our model significantly outperforms the existing models. Code is available at https://github.com/lzrobots/DeepEmbeddingModel_ZSL
Tasks Image Captioning, Zero-Shot Learning
Published 2016-11-15
URL https://arxiv.org/abs/1611.05088v4
PDF https://arxiv.org/pdf/1611.05088v4.pdf
PWC https://paperswithcode.com/paper/learning-a-deep-embedding-model-for-zero-shot
Repo https://github.com/lzrobots/DeepEmbeddingModel_ZSL
Framework tf
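
The core design choice, projecting class semantics into the visual space and classifying by nearest neighbour there, can be sketched as follows; `project` stands in for the learned embedding network:

```python
import numpy as np

def zsl_classify(image_feat, class_semantics, project):
    # project each class's semantic vector into visual feature space,
    # then return the index of the nearest projected prototype
    prototypes = np.stack([project(s) for s in class_semantics])
    dists = np.linalg.norm(prototypes - image_feat, axis=1)
    return int(np.argmin(dists))
```

Because the prototypes live in the visual space, the search direction (many images, few class prototypes) is the one where hubs are least harmful.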

An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild

Title An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild
Authors Wei-Lun Chao, Soravit Changpinyo, Boqing Gong, Fei Sha
Abstract Zero-shot learning (ZSL) methods have been studied in the unrealistic setting where test data are assumed to come from unseen classes only. In this paper, we advocate studying the problem of generalized zero-shot learning (GZSL) where the test data’s class memberships are unconstrained. We show empirically that naively using the classifiers constructed by ZSL approaches does not perform well in the generalized setting. Motivated by this, we propose a simple but effective calibration method that can be used to balance two conflicting forces: recognizing data from seen classes versus those from unseen ones. We develop a performance metric to characterize such a trade-off and examine the utility of this metric in evaluating various ZSL approaches. Our analysis further shows that there is a large gap between the performance of existing approaches and an upper bound established via idealized semantic embeddings, suggesting that improving class semantic embeddings is vital to GZSL.
Tasks Calibration, Few-Shot Learning, Object Recognition, Zero-Shot Learning
Published 2016-05-13
URL http://arxiv.org/abs/1605.04253v2
PDF http://arxiv.org/pdf/1605.04253v2.pdf
PWC https://paperswithcode.com/paper/an-empirical-study-and-analysis-of
Repo https://github.com/omallo/kaggle-whale
Framework pytorch
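
The calibration the paper proposes (often called calibrated stacking) subtracts a constant from seen-class scores before taking the argmax, which directly trades seen-class accuracy for unseen-class accuracy. A minimal sketch, with illustrative scores and gamma:

```python
import numpy as np

def calibrated_stacking(scores, seen_mask, gamma):
    # reduce seen-class scores by gamma so unseen classes can win
    # the argmax; sweeping gamma traces out the seen/unseen trade-off
    adjusted = np.asarray(scores, dtype=float) - gamma * np.asarray(seen_mask)
    return int(np.argmax(adjusted))
```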

Ultradense Word Embeddings by Orthogonal Transformation

Title Ultradense Word Embeddings by Orthogonal Transformation
Authors Sascha Rothe, Sebastian Ebert, Hinrich Schütze
Abstract Embeddings are generic representations that are useful for many NLP tasks. In this paper, we introduce DENSIFIER, a method that learns an orthogonal transformation of the embedding space that focuses the information relevant for a task in an ultradense subspace of a dimensionality that is smaller by a factor of 100 than the original space. We show that ultradense embeddings generated by DENSIFIER reach state of the art on a lexicon creation task in which words are annotated with three types of lexical information - sentiment, concreteness and frequency. On the SemEval2015 10B sentiment analysis task we show that no information is lost when the ultradense subspace is used, but training is an order of magnitude more efficient due to the compactness of the ultradense space.
Tasks Sentiment Analysis, Word Embeddings
Published 2016-02-24
URL http://arxiv.org/abs/1602.07572v2
PDF http://arxiv.org/pdf/1602.07572v2.pdf
PWC https://paperswithcode.com/paper/ultradense-word-embeddings-by-orthogonal
Repo https://github.com/pdufter/densray
Framework none
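
The mechanics of DENSIFIER, applying an orthogonal transform and keeping only a small leading subspace, can be illustrated with a random orthogonal matrix standing in for the learned one:

```python
import numpy as np

rng = np.random.default_rng(42)
# random orthogonal matrix via QR decomposition: a stand-in for the
# transform DENSIFIER learns from task supervision
q, _ = np.linalg.qr(rng.standard_normal((50, 50)))
embeddings = rng.standard_normal((10, 50))
# keep only the leading column(s): the task-specific ultradense subspace
ultradense = embeddings @ q[:, :1]
```

Since q is orthogonal, the full transform preserves norms and distances; the densification comes entirely from discarding the task-irrelevant directions.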

Learning Distributed Representations of Sentences from Unlabelled Data

Title Learning Distributed Representations of Sentences from Unlabelled Data
Authors Felix Hill, Kyunghyun Cho, Anna Korhonen
Abstract Unsupervised methods for learning distributed representations of words are ubiquitous in today’s NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics. We also propose two new unsupervised representation-learning objectives designed to optimise the trade-off between training time, domain portability and performance.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2016-02-10
URL http://arxiv.org/abs/1602.03483v1
PDF http://arxiv.org/pdf/1602.03483v1.pdf
PWC https://paperswithcode.com/paper/learning-distributed-representations-of
Repo https://github.com/jihunchoi/sequential-denoising-autoencoder-tf
Framework tf

Bidirectional Tree-Structured LSTM with Head Lexicalization

Title Bidirectional Tree-Structured LSTM with Head Lexicalization
Authors Zhiyang Teng, Yue Zhang
Abstract Sequential LSTM has been extended to model tree structures, giving competitive results for a number of tasks. Existing methods model constituent trees by bottom-up combinations of constituent nodes, making direct use of input word information only for leaf nodes. This is different from sequential LSTMs, which contain reference to input words for each node. In this paper, we propose a method for automatic head-lexicalization for tree-structured LSTMs, propagating head words from leaf nodes to every constituent node. In addition, enabled by head lexicalization, we build a tree LSTM in the top-down direction, which structurally corresponds to a bidirectional sequential LSTM. Experiments show that both extensions give better representations of tree structures. Our final model gives the best results on the Stanford Sentiment Treebank and highly competitive results on the TREC question type classification task.
Tasks
Published 2016-11-21
URL http://arxiv.org/abs/1611.06788v1
PDF http://arxiv.org/pdf/1611.06788v1.pdf
PWC https://paperswithcode.com/paper/bidirectional-tree-structured-lstm-with-head
Repo https://github.com/zeeeyang/lexicalized_bitreelstm
Framework none

Gossip training for deep learning

Title Gossip training for deep learning
Authors Michael Blot, David Picard, Matthieu Cord, Nicolas Thome
Abstract We address the issue of speeding up the training of convolutional networks. Here we study a distributed method adapted to stochastic gradient descent (SGD). The parallel optimization setup uses several threads, each applying individual gradient descents on a local variable. We propose a new way to share information between different threads, inspired by gossip algorithms and showing good consensus convergence properties. Our method, called GoSGD, has the advantage of being fully asynchronous and decentralized. Comparisons with the recent EASGD \cite{elastic} on CIFAR-10 show encouraging results.
Tasks
Published 2016-11-29
URL http://arxiv.org/abs/1611.09726v1
PDF http://arxiv.org/pdf/1611.09726v1.pdf
PWC https://paperswithcode.com/paper/gossip-training-for-deep-learning
Repo https://github.com/uoguelph-mlrg/Theano-MPI
Framework none
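
The gossip exchange at the heart of GoSGD can be caricatured as pairwise averaging of worker parameters; the fixed 0.5 mixing weight below is a simplification of the paper's weighted exchange, but it shows the consensus behaviour:

```python
import numpy as np

def gossip_push(sender_w, receiver_w, alpha=0.5):
    # receiver blends the pushed parameters with its own local variable
    return alpha * sender_w + (1.0 - alpha) * receiver_w

# two workers repeatedly gossiping converge to a consensus value
w0, w1 = np.array([0.0]), np.array([10.0])
for _ in range(20):
    w1 = gossip_push(w0, w1)
    w0 = gossip_push(w1, w0)
```

In the full method each thread also runs its own SGD steps between exchanges, so the consensus tracks a moving target rather than a fixed one.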

Rényi Divergence Variational Inference

Title Rényi Divergence Variational Inference
Authors Yingzhen Li, Richard E. Turner
Abstract This paper introduces the variational R'enyi bound (VR) that extends traditional variational inference to R'enyi’s alpha-divergences. This new family of variational methods unifies a number of existing approaches, and enables a smooth interpolation from the evidence lower-bound to the log (marginal) likelihood that is controlled by the value of alpha that parametrises the divergence. The reparameterization trick, Monte Carlo approximation and stochastic optimisation methods are deployed to obtain a tractable and unified framework for optimisation. We further consider negative alpha values and propose a novel variational inference method as a new special case in the proposed framework. Experiments on Bayesian neural networks and variational auto-encoders demonstrate the wide applicability of the VR bound.
Tasks
Published 2016-02-06
URL http://arxiv.org/abs/1602.02311v3
PDF http://arxiv.org/pdf/1602.02311v3.pdf
PWC https://paperswithcode.com/paper/renyi-divergence-variational-inference
Repo https://github.com/YingzhenLi/vae_renyi_divergence
Framework tf
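
The VR bound has a simple Monte Carlo estimator: with importance weights w = p(x,z)/q(z), it is (1/(1-alpha)) log E_q[w^(1-alpha)], computed stably with log-sum-exp. A sketch for alpha != 1, taking per-sample log densities as input:

```python
import numpy as np

def vr_bound(log_p, log_q, alpha):
    # Monte Carlo estimate of the variational Renyi bound (alpha != 1):
    # (1/(1-alpha)) * log mean_k exp((1-alpha) * (log_p_k - log_q_k)),
    # evaluated with the log-sum-exp trick for numerical stability
    log_w = np.asarray(log_p) - np.asarray(log_q)
    scaled = (1.0 - alpha) * log_w
    m = scaled.max()
    return (m + np.log(np.mean(np.exp(scaled - m)))) / (1.0 - alpha)
```

As alpha approaches 1 the bound recovers the standard ELBO, while alpha = 0 gives the importance-weighted estimate of the log marginal likelihood.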

Multi-view Kernel Completion

Title Multi-view Kernel Completion
Authors Sahely Bhadra, Samuel Kaski, Juho Rousu
Abstract In this paper, we introduce the first method that (1) can complete kernel matrices with completely missing rows and columns as opposed to individual missing kernel values, (2) does not require any of the kernels to be complete a priori, and (3) can tackle non-linear kernels. These aspects are necessary in practical applications such as integrating legacy data sets, learning under sensor failures and learning when measurements are costly for some of the views. The proposed approach predicts missing rows by modelling both within-view and between-view relationships among kernel values. We show, both on simulated data and real world data, that the proposed method outperforms existing techniques in the restricted settings where they are available, and extends applicability to new settings.
Tasks
Published 2016-02-08
URL http://arxiv.org/abs/1602.02518v1
PDF http://arxiv.org/pdf/1602.02518v1.pdf
PWC https://paperswithcode.com/paper/multi-view-kernel-completion
Repo https://github.com/aalto-ics-kepaco/MKC_software
Framework none

Finding Statistically Significant Attribute Interactions

Title Finding Statistically Significant Attribute Interactions
Authors Andreas Henelius, Antti Ukkonen, Kai Puolamäki
Abstract In many data exploration tasks it is meaningful to identify groups of attribute interactions that are specific to a variable of interest. For instance, in a dataset where the attributes are medical markers and the variable of interest (class variable) is binary indicating presence/absence of disease, we would like to know which medical markers interact with respect to the binary class label. These interactions are useful in several practical applications, for example, to gain insight into the structure of the data, in feature selection, and in data anonymisation. We present a novel method, based on statistical significance testing, that can be used to verify if the data set has been created by a given factorised class-conditional joint distribution, where the distribution is parametrised by a partition of its attributes. Furthermore, we provide a method, named ASTRID, for automatically finding a partition of attributes describing the distribution that has generated the data. State-of-the-art classifiers are utilised to capture the interactions present in the data by systematically breaking attribute interactions and observing the effect of this breaking on classifier performance. We empirically demonstrate the utility of the proposed method with examples using real and synthetic data.
Tasks Feature Selection
Published 2016-12-22
URL http://arxiv.org/abs/1612.07597v2
PDF http://arxiv.org/pdf/1612.07597v2.pdf
PWC https://paperswithcode.com/paper/finding-statistically-significant-attribute
Repo https://github.com/bwrc/astrid-r
Framework none
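
The "systematically breaking attribute interactions" step can be imitated by permuting each attribute group with an independent row permutation, which destroys between-group interactions while preserving within-group structure and every column's marginal distribution. This is a simplified sketch: ASTRID applies such permutations within each class rather than over the whole dataset.

```python
import numpy as np

def break_interactions(X, groups, rng):
    # permute each attribute group with its own row permutation:
    # within-group interactions survive, between-group ones break,
    # and each column keeps its marginal distribution
    Xb = X.copy()
    for g in groups:
        perm = rng.permutation(X.shape[0])
        Xb[:, g] = X[perm][:, g]
    return Xb
```

Comparing classifier performance on the original and broken data then reveals whether the grouped attributes interact with respect to the class label.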

Attributing Hacks

Title Attributing Hacks
Authors Ziqi Liu, Alexander J. Smola, Kyle Soska, Yu-Xiang Wang, Qinghua Zheng, Jun Zhou
Abstract In this paper we describe an algorithm for estimating the provenance of hacks on websites. That is, given properties of sites and the temporal occurrence of attacks, we are able to attribute individual attacks to joint causes and vulnerabilities, as well as estimating the evolution of these vulnerabilities over time. Specifically, we use hazard regression with a time-varying additive hazard function parameterized in a generalized linear form. The activation coefficients on each feature are continuous-time functions over time. We formulate the problem of learning these functions as a constrained variational maximum likelihood estimation problem with total variation penalty and show that the optimal solution is a 0th order spline (a piecewise constant function) with a finite number of known knots. This allows the inference problem to be solved efficiently and at scale by solving a finite dimensional optimization problem. Extensive experiments on real data sets show that our method significantly outperforms Cox’s proportional hazard model. We also conduct a case study and verify that the fitted functions are indeed recovering vulnerable features and real-life events such as the release of code to exploit these features in hacker blogs.
Tasks
Published 2016-11-07
URL http://arxiv.org/abs/1611.03021v2
PDF http://arxiv.org/pdf/1611.03021v2.pdf
PWC https://paperswithcode.com/paper/attributing-hacks
Repo https://github.com/ziqilau/Experimental-HazardRegression
Framework none

The World of Fast Moving Objects

Title The World of Fast Moving Objects
Authors Denys Rozumnyi, Jan Kotera, Filip Sroubek, Lukas Novotny, Jiri Matas
Abstract The notion of a Fast Moving Object (FMO), i.e. an object that moves over a distance exceeding its size within the exposure time, is introduced. FMOs may, and typically do, rotate with high angular speed. FMOs are very common in sports videos, but are not rare elsewhere. In a single frame, such objects are often barely visible and appear as semi-transparent streaks. A method for the detection and tracking of FMOs is proposed. The method consists of three distinct algorithms, which form an efficient localization pipeline that operates successfully in a broad range of conditions. We show that it is possible to recover the appearance of the object and its axis of rotation, despite its blurred appearance. The proposed method is evaluated on a new annotated dataset. The results show that existing trackers are inadequate for the problem of FMO localization and a new approach is required. Two applications of localization, temporal super-resolution and highlighting, are presented.
Tasks Super-Resolution
Published 2016-11-23
URL http://arxiv.org/abs/1611.07889v1
PDF http://arxiv.org/pdf/1611.07889v1.pdf
PWC https://paperswithcode.com/paper/the-world-of-fast-moving-objects
Repo https://github.com/rozumden/tbd
Framework none

Reducing Drift in Visual Odometry by Inferring Sun Direction Using a Bayesian Convolutional Neural Network

Title Reducing Drift in Visual Odometry by Inferring Sun Direction Using a Bayesian Convolutional Neural Network
Authors Valentin Peretroukhin, Lee Clement, Jonathan Kelly
Abstract We present a method to incorporate global orientation information from the sun into a visual odometry pipeline using only the existing image stream, where the sun is typically not visible. We leverage recent advances in Bayesian Convolutional Neural Networks to train and implement a sun detection model that infers a three-dimensional sun direction vector from a single RGB image. Crucially, our method also computes a principled uncertainty associated with each prediction, using a Monte Carlo dropout scheme. We incorporate this uncertainty into a sliding window stereo visual odometry pipeline where accurate uncertainty estimates are critical for optimal data fusion. Our Bayesian sun detection model achieves a median error of approximately 12 degrees on the KITTI odometry benchmark training set, and yields improvements of up to 42% in translational ARMSE and 32% in rotational ARMSE compared to standard VO. An open source implementation of our Bayesian CNN sun estimator (Sun-BCNN) using Caffe is available at https://github.com/utiasSTARS/sun-bcnn-vo
Tasks Visual Odometry
Published 2016-09-20
URL https://arxiv.org/abs/1609.05993v5
PDF https://arxiv.org/pdf/1609.05993v5.pdf
PWC https://paperswithcode.com/paper/reducing-drift-in-visual-odometry-by
Repo https://github.com/utiasSTARS/sun-bcnn-vo
Framework caffe2
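
The Monte Carlo dropout scheme mentioned in the abstract amounts to keeping dropout active at test time and summarizing repeated stochastic forward passes: the sample mean is the prediction and the sample variance its uncertainty. A generic sketch (`stochastic_forward` stands in for the Bayesian CNN):

```python
import numpy as np

def mc_dropout_estimate(stochastic_forward, x, n_samples=50):
    # repeated forward passes with dropout left on; the spread of the
    # samples is the model's predictive uncertainty
    preds = np.array([stochastic_forward(x) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.var(axis=0)
```

High-variance sun direction estimates can then be down-weighted (or dropped) in the odometry pipeline's data fusion.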

Nested Kriging predictions for datasets with large number of observations

Title Nested Kriging predictions for datasets with large number of observations
Authors Didier Rullière, Nicolas Durrande, François Bachoc, Clément Chevalier
Abstract This work falls within the context of predicting the value of a real function at some input locations given a limited number of observations of this function. The Kriging interpolation technique (or Gaussian process regression) is often considered to tackle such a problem but the method suffers from its computational burden when the number of observation points is large. We introduce in this article nested Kriging predictors which are constructed by aggregating sub-models based on subsets of observation points. This approach is proven to have better theoretical properties than other aggregation methods that can be found in the literature. Contrary to some other methods, the proposed aggregation method can be shown to be consistent. Finally, the practical interest of the proposed method is illustrated on simulated datasets and on an industrial test case with $10^4$ observations in a 6-dimensional space.
Tasks
Published 2016-07-19
URL http://arxiv.org/abs/1607.05432v3
PDF http://arxiv.org/pdf/1607.05432v3.pdf
PWC https://paperswithcode.com/paper/nested-kriging-predictions-for-datasets-with
Repo https://github.com/drulliere/nestedKriging
Framework none
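
As a point of comparison for the aggregation idea, here is a simple precision-weighted combination of sub-model predictions. Note this is a generic product-of-experts-style baseline, not the paper's nested aggregation, which additionally accounts for covariances between sub-models:

```python
import numpy as np

def aggregate_submodels(means, variances):
    # inverse-variance (precision) weighting: confident sub-models
    # contribute more to the combined prediction
    means = np.asarray(means, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)
    return float(np.sum(w * means) / np.sum(w))
```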