October 21, 2019

2840 words 14 mins read

Paper Group AWR 118

Universal Dependency Parsing with a General Transition-Based DAG Parser. Modeling Composite Labels for Neural Morphological Tagging. Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme. Hashing with Mutual Information. GAN Q-learning. 3DSRnet: Video Super-resolution using 3D Convolutional Neural Networks. Cross-lingual Argumentati …

Universal Dependency Parsing with a General Transition-Based DAG Parser

Title Universal Dependency Parsing with a General Transition-Based DAG Parser
Authors Daniel Hershcovich, Omri Abend, Ari Rappoport
Abstract This paper presents our experiments with applying TUPA to the CoNLL 2018 UD shared task. TUPA is a general neural transition-based DAG parser, which we use to present the first experiments on recovering enhanced dependencies as part of the general parsing task. TUPA was designed for parsing UCCA, a cross-linguistic semantic annotation scheme, exhibiting reentrancy, discontinuity and non-terminal nodes. By converting UD trees and graphs to a UCCA-like DAG format, we train TUPA almost without modification on the UD parsing task. The generic nature of our approach lends itself naturally to multitask learning. Our code is available at https://github.com/CoNLL-UD-2018/HUJI
Tasks Dependency Parsing
Published 2018-08-28
URL http://arxiv.org/abs/1808.09354v1
PDF http://arxiv.org/pdf/1808.09354v1.pdf
PWC https://paperswithcode.com/paper/universal-dependency-parsing-with-a-general
Repo https://github.com/CoNLL-UD-2018/HUJI
Framework none
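
To make the conversion idea concrete, here is a minimal sketch (not TUPA's actual code) of representing a UD sentence, together with its enhanced dependencies, as a generic DAG with an artificial root; the `ud_to_dag` helper and the example sentence are purely illustrative.

```python
# Minimal sketch (not TUPA's conversion code): represent a UD sentence,
# including enhanced dependencies, as a generic DAG so that a single DAG
# parser can be trained on it. Node 0 is the artificial root.

def ud_to_dag(tokens, basic_heads, basic_deprels, enhanced_edges=()):
    """tokens: list of word forms (1-indexed in UD terms).
    basic_heads[i]: head of token i+1 in the basic tree (0 = root).
    enhanced_edges: extra (head, dependent, relation) triples, which may
    introduce reentrancy -- exactly what a DAG (rather than a tree) allows."""
    nodes = {0: "<ROOT>"}
    nodes.update({i + 1: tok for i, tok in enumerate(tokens)})
    edges = set()
    for i, (head, rel) in enumerate(zip(basic_heads, basic_deprels), start=1):
        edges.add((head, i, rel))
    for head, dep, rel in enhanced_edges:
        edges.add((head, dep, rel))          # reentrant edges are fine in a DAG
    return nodes, sorted(edges)

# Coordination example with a shared subject in the enhanced graph
# (token indices are illustrative).
nodes, edges = ud_to_dag(
    tokens=["She", "bought", "and", "ate", "apples"],
    basic_heads=[2, 0, 4, 2, 2],
    basic_deprels=["nsubj", "root", "cc", "conj", "obj"],
    enhanced_edges=[(4, 1, "nsubj"), (4, 5, "obj")],
)
print(edges)
```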

Modeling Composite Labels for Neural Morphological Tagging

Title Modeling Composite Labels for Neural Morphological Tagging
Authors Alexander Tkachenko, Kairit Sirts
Abstract Neural morphological tagging has been regarded as an extension of the POS tagging task, treating each morphological tag as a monolithic label and ignoring its internal structure. We propose to view morphological tags as composite labels and explicitly model their internal structure in a neural sequence tagger. For this, we explore three different neural architectures and compare their performance with both CRF and simple neural multiclass baselines. We evaluate our models on 49 languages and show that the neural architecture that models the morphological labels as sequences of morphological category values performs significantly better than both baselines, establishing state-of-the-art results in morphological tagging for most languages.
Tasks Morphological Tagging
Published 2018-10-20
URL http://arxiv.org/abs/1810.08815v1
PDF http://arxiv.org/pdf/1810.08815v1.pdf
PWC https://paperswithcode.com/paper/modeling-composite-labels-for-neural
Repo https://github.com/AleksTk/seq-morph-tagger
Framework tf
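
The composite-label view can be illustrated without any neural machinery: a morphological tag such as `POS=NOUN|Case=Nom|Number=Sing` is split into a sequence of category-value pairs, which is what the sequence-based architecture in the paper predicts. The `decompose` helper below is a hypothetical illustration, not code from the authors' tagger.

```python
# Sketch: decompose a UD-style morphological tag into its component
# category=value pairs, so a tagger can predict the sequence of values
# instead of one monolithic label.

def decompose(tag):
    """'POS=NOUN|Case=Nom|Number=Sing' -> [('POS', 'NOUN'), ('Case', 'Nom'), ...]"""
    return [tuple(part.split("=", 1)) for part in tag.split("|")]

monolithic = "POS=NOUN|Case=Nom|Number=Sing"
print(decompose(monolithic))
# [('POS', 'NOUN'), ('Case', 'Nom'), ('Number', 'Sing')]

# A monolithic classifier needs one softmax over every observed combination;
# the compositional view predicts each category value in turn, so unseen
# combinations of seen values can still be produced.
```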

Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme

Title Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme
Authors Jey Han Lau, Trevor Cohn, Timothy Baldwin, Julian Brooke, Adam Hammond
Abstract In this paper, we propose a joint architecture that captures language, rhyme and meter for sonnet modelling. We assess the quality of generated poems using crowd and expert judgements. The stress and rhyme models perform very well, as generated poems are largely indistinguishable from human-written poems. Expert evaluation, however, reveals that a vanilla language model captures meter implicitly, and that machine-generated poems still underperform in terms of readability and emotion. Our research shows the importance of expert evaluation for poetry generation, and that future research should look beyond rhyme/meter and focus on poetic language.
Tasks Language Modelling
Published 2018-07-10
URL http://arxiv.org/abs/1807.03491v1
PDF http://arxiv.org/pdf/1807.03491v1.pdf
PWC https://paperswithcode.com/paper/deep-speare-a-joint-neural-model-of-poetic
Repo https://github.com/jhlau/deepspeare
Framework tf

Hashing with Mutual Information

Title Hashing with Mutual Information
Authors Fatih Cakir, Kun He, Sarah Adel Bargal, Stan Sclaroff
Abstract Binary vector embeddings enable fast nearest neighbor retrieval in large databases of high-dimensional objects, and play an important role in many practical applications, such as image and video retrieval. We study the problem of learning binary vector embeddings under a supervised setting, also known as hashing. We propose a novel supervised hashing method based on optimizing an information-theoretic quantity: mutual information. We show that optimizing mutual information can reduce ambiguity in the induced neighborhood structure in the learned Hamming space, which is essential in obtaining high retrieval performance. To this end, we optimize mutual information in deep neural networks with minibatch stochastic gradient descent, with a formulation that maximally and efficiently utilizes available supervision. Experiments on four image retrieval benchmarks, including ImageNet, confirm the effectiveness of our method in learning high-quality binary embeddings for nearest neighbor retrieval.
Tasks Image Retrieval, Video Retrieval
Published 2018-03-02
URL http://arxiv.org/abs/1803.00974v2
PDF http://arxiv.org/pdf/1803.00974v2.pdf
PWC https://paperswithcode.com/paper/hashing-with-mutual-information
Repo https://github.com/fcakir/deep-mihash
Framework none
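
As a rough illustration of the quantity being optimized, the sketch below estimates the mutual information between pair relevance and Hamming distance for a fixed set of random binary codes using NumPy. The paper instead optimizes a differentiable relaxation of this quantity inside a deep network with minibatch SGD, so everything here (code length, label setup) is an assumption for demonstration only.

```python
import numpy as np

# Estimate I(C; D) between pair relevance C (same class or not) and Hamming
# distance D for fixed binary codes; higher MI = less ambiguous neighborhoods.

rng = np.random.default_rng(0)
n, bits = 200, 16
codes = rng.integers(0, 2, size=(n, bits))        # binary embeddings
labels = rng.integers(0, 5, size=n)                # class labels

i, j = np.triu_indices(n, k=1)                     # all pairs
relevant = (labels[i] == labels[j]).astype(int)
hamming = (codes[i] != codes[j]).sum(axis=1)

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

# Empirical joint distribution over (relevance, distance)
joint = np.zeros((2, bits + 1))
for c, d in zip(relevant, hamming):
    joint[c, d] += 1
joint /= joint.sum()

mi = entropy(joint.sum(axis=1)) + entropy(joint.sum(axis=0)) - entropy(joint.ravel())
print(f"I(C; D) = {mi:.4f} bits")
```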

GAN Q-learning

Title GAN Q-learning
Authors Thang Doan, Bogdan Mazoure, Clare Lyle
Abstract Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can leverage the distributional approach to reinforcement learning. In this paper, we propose GAN Q-learning, a novel distributional RL method based on generative adversarial networks (GANs) and analyze its performance in simple tabular environments, as well as OpenAI Gym. We empirically show that our algorithm leverages the flexibility and blackbox approach of deep learning models while providing a viable alternative to traditional methods.
Tasks Distributional Reinforcement Learning, Q-Learning
Published 2018-05-13
URL http://arxiv.org/abs/1805.04874v3
PDF http://arxiv.org/pdf/1805.04874v3.pdf
PWC https://paperswithcode.com/paper/gan-q-learning
Repo https://github.com/daggertye/GAN-Q-Learning
Framework tf
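
The core construction can be sketched compactly: a generator proposes return samples G(s, a, z) and a discriminator tries to distinguish them from one-step Bellman targets built from the generator itself. The PyTorch snippet below is a heavily simplified, hypothetical rendition (random toy transitions, arbitrary network sizes), not the authors' implementation, which is in TensorFlow.

```python
import torch
import torch.nn as nn

# Simplified sketch of the GAN Q-learning idea: G samples returns for (s, a),
# D separates them from Bellman targets r + gamma * G(s', a', z').

n_states, n_actions, z_dim, gamma, batch = 5, 2, 4, 0.99, 64

def one_hot(idx, n):
    return torch.eye(n)[idx]

G = nn.Sequential(nn.Linear(n_states + n_actions + z_dim, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(n_states + n_actions + 1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def sample_return(s, a):
    z = torch.randn(s.shape[0], z_dim)
    return G(torch.cat([s, a, z], dim=1))

# One update on a batch of random toy transitions (s, a, r, s')
s = one_hot(torch.randint(0, n_states, (batch,)), n_states)
a = one_hot(torch.randint(0, n_actions, (batch,)), n_actions)
r = torch.rand(batch, 1)
s_next = one_hot(torch.randint(0, n_states, (batch,)), n_states)
a_next = one_hot(torch.randint(0, n_actions, (batch,)), n_actions)  # greedy in the real algorithm

with torch.no_grad():
    target = r + gamma * sample_return(s_next, a_next)   # "real" samples for D
fake = sample_return(s, a)

# Discriminator step: Bellman targets vs generated returns
d_loss = bce(D(torch.cat([s, a, target], 1)), torch.ones(batch, 1)) + \
         bce(D(torch.cat([s, a, fake.detach()], 1)), torch.zeros(batch, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: fool the discriminator
g_loss = bce(D(torch.cat([s, a, sample_return(s, a)], 1)), torch.ones(batch, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
print(float(d_loss), float(g_loss))
```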

3DSRnet: Video Super-resolution using 3D Convolutional Neural Networks

Title 3DSRnet: Video Super-resolution using 3D Convolutional Neural Networks
Authors Soo Ye Kim, Jeongyeon Lim, Taeyoung Na, Munchurl Kim
Abstract In video super-resolution, the spatio-temporal coherence between and among the frames must be exploited appropriately for accurate prediction of the high-resolution frames. Although 2D convolutional neural networks (CNNs) are powerful in modelling images, 3D-CNNs are more suitable for spatio-temporal feature extraction as they can preserve temporal information. To this end, we propose an effective 3D-CNN for video super-resolution, called 3DSRnet, that does not require motion alignment as preprocessing. Our 3DSRnet maintains the temporal depth of spatio-temporal feature maps to maximally capture the temporally nonlinear characteristics between low- and high-resolution frames, and adopts residual learning in conjunction with the sub-pixel outputs. It outperforms the best state-of-the-art method by an average of 0.45 and 0.36 dB in PSNR for scales 3 and 4, respectively, on the Vidset4 benchmark. Our 3DSRnet is also the first to deal with the performance drop due to scene change, which is important in practice but has not been previously considered.
Tasks Super-Resolution, Video Super-Resolution
Published 2018-12-21
URL https://arxiv.org/abs/1812.09079v2
PDF https://arxiv.org/pdf/1812.09079v2.pdf
PWC https://paperswithcode.com/paper/3dsrnet-video-super-resolution-using-3d
Repo https://github.com/sooyekim/3DSRnet
Framework none
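
A rough PyTorch sketch of the ingredients named in the abstract follows: 3D convolutions that preserve temporal depth, a sub-pixel (pixel-shuffle) output, and a residual connection to the bicubically upscaled centre frame. Layer sizes, the `Tiny3DSR` name and the 5-frame input are illustrative assumptions, not the actual 3DSRnet configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tiny3DSR(nn.Module):
    def __init__(self, scale=3, frames=5, feats=32):
        super().__init__()
        self.scale = scale
        # padding=1 preserves the temporal depth across layers
        self.body = nn.Sequential(
            nn.Conv3d(1, feats, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(feats, feats, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        # collapse the temporal axis, then predict scale^2 sub-pixel channels
        self.fuse = nn.Conv2d(feats * frames, scale * scale, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, lr_frames):                      # (B, 1, T, H, W)
        b, _, t, h, w = lr_frames.shape
        center = lr_frames[:, :, t // 2]                # (B, 1, H, W)
        feat = self.body(lr_frames)                     # (B, C, T, H, W)
        feat = feat.reshape(b, -1, h, w)                # merge channel and time axes
        residual = self.shuffle(self.fuse(feat))        # (B, 1, sH, sW)
        upscaled = F.interpolate(center, scale_factor=self.scale,
                                 mode="bicubic", align_corners=False)
        return upscaled + residual                      # residual learning

net = Tiny3DSR()
print(net(torch.rand(2, 1, 5, 24, 24)).shape)           # torch.Size([2, 1, 72, 72])
```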

Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!

Title Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need!
Authors Steffen Eger, Johannes Daxenberger, Christian Stab, Iryna Gurevych
Abstract Argumentation mining (AM) requires the identification of complex discourse structures and has lately been applied with success monolingually. In this work, we show that the existing resources are, however, not adequate for assessing cross-lingual AM, due to their heterogeneity or lack of complexity. We therefore create suitable parallel corpora by (human and machine) translating a popular AM dataset consisting of persuasive student essays into German, French, Spanish, and Chinese. We then compare (i) annotation projection and (ii) bilingual word embeddings based direct transfer strategies for cross-lingual AM, finding that the former performs considerably better and almost eliminates the loss from cross-lingual transfer. Moreover, we find that annotation projection works equally well when using either costly human or cheap machine translations. Our code and data are available at \url{http://github.com/UKPLab/coling2018-xling_argument_mining}.
Tasks Cross-Lingual Transfer, Machine Translation, Word Embeddings
Published 2018-07-24
URL http://arxiv.org/abs/1807.08998v1
PDF http://arxiv.org/pdf/1807.08998v1.pdf
PWC https://paperswithcode.com/paper/cross-lingual-argumentation-mining-machine
Repo https://github.com/UKPLab/coling2018-xling_argument_mining
Framework none
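
Annotation projection itself is simple to state: given word alignments between a source sentence and its (human or machine) translation, copy the BIO-style argument-component labels across the alignment links. The `project_labels` helper below is a minimal hypothetical sketch of that step, not the paper's pipeline.

```python
# Project BIO labels from source tokens to aligned target tokens.
# Unaligned target tokens fall back to 'O'.

def project_labels(src_labels, alignments, tgt_len):
    """alignments: iterable of (src_index, tgt_index) pairs, 0-indexed."""
    tgt_labels = ["O"] * tgt_len
    for s, t in alignments:
        if tgt_labels[t] == "O":            # keep the first projected label
            tgt_labels[t] = src_labels[s]
    return tgt_labels

src_labels = ["O", "B-Claim", "I-Claim", "I-Claim", "O"]
alignments = [(0, 0), (1, 2), (2, 3), (3, 4), (4, 1)]   # e.g. produced by a word aligner
print(project_labels(src_labels, alignments, tgt_len=5))
# ['O', 'O', 'B-Claim', 'I-Claim', 'I-Claim']
```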

PennyLane: Automatic differentiation of hybrid quantum-classical computations

Title PennyLane: Automatic differentiation of hybrid quantum-classical computations
Authors Ville Bergholm, Josh Izaac, Maria Schuld, Christian Gogolin, M. Sohaib Alam, Shahnawaz Ahmed, Juan Miguel Arrazola, Carsten Blank, Alain Delgado, Soran Jahangiri, Keri McKiernan, Johannes Jakob Meyer, Zeyue Niu, Antal Száva, Nathan Killoran
Abstract PennyLane is a Python 3 software framework for optimization and machine learning of quantum and hybrid quantum-classical computations. The library provides a unified architecture for near-term quantum computing devices, supporting both qubit and continuous-variable paradigms. PennyLane’s core feature is the ability to compute gradients of variational quantum circuits in a way that is compatible with classical techniques such as backpropagation. PennyLane thus extends the automatic differentiation algorithms common in optimization and machine learning to include quantum and hybrid computations. A plugin system makes the framework compatible with any gate-based quantum simulator or hardware. We provide plugins for Strawberry Fields, Rigetti Forest, Qiskit, Cirq, and ProjectQ, allowing PennyLane optimizations to be run on publicly accessible quantum devices provided by Rigetti and IBM Q. On the classical front, PennyLane interfaces with accelerated machine learning libraries such as TensorFlow, PyTorch, and autograd. PennyLane can be used for the optimization of variational quantum eigensolvers, quantum approximate optimization, quantum machine learning models, and many other applications.
Tasks Quantum Machine Learning
Published 2018-11-12
URL https://arxiv.org/abs/1811.04968v3
PDF https://arxiv.org/pdf/1811.04968v3.pdf
PWC https://paperswithcode.com/paper/pennylane-automatic-differentiation-of-hybrid
Repo https://github.com/XanaduAI/pennylane
Framework tf
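
A minimal PennyLane example of the core feature described above, namely that a variational circuit behaves like any other differentiable function; the device choice and parameter values are arbitrary.

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(params):
    qml.RX(params[0], wires=0)
    qml.RY(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(1))

params = np.array([0.1, 0.2], requires_grad=True)
print(circuit(params))                 # expectation value
print(qml.grad(circuit)(params))       # gradient w.r.t. the circuit parameters
```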

Continuous-variable quantum neural networks

Title Continuous-variable quantum neural networks
Authors Nathan Killoran, Thomas R. Bromley, Juan Miguel Arrazola, Maria Schuld, Nicolás Quesada, Seth Lloyd
Abstract We introduce a general method for building neural networks on quantum computers. The quantum neural network is a variational quantum circuit built in the continuous-variable (CV) architecture, which encodes quantum information in continuous degrees of freedom such as the amplitudes of the electromagnetic field. This circuit contains a layered structure of continuously parameterized gates which is universal for CV quantum computation. Affine transformations and nonlinear activation functions, two key elements in neural networks, are enacted in the quantum network using Gaussian and non-Gaussian gates, respectively. The non-Gaussian gates provide both the nonlinearity and the universality of the model. Due to the structure of the CV model, the CV quantum neural network can encode highly nonlinear transformations while remaining completely unitary. We show how a classical network can be embedded into the quantum formalism and propose quantum versions of various specialized models such as convolutional, recurrent, and residual networks. Finally, we present numerous modeling experiments built with the Strawberry Fields software library. These experiments, including a classifier for fraud detection, a network which generates Tetris images, and a hybrid classical-quantum autoencoder, demonstrate the capability and adaptability of CV quantum neural networks.
Tasks Fraud Detection, Quantum Machine Learning
Published 2018-06-18
URL http://arxiv.org/abs/1806.06871v1
PDF http://arxiv.org/pdf/1806.06871v1.pdf
PWC https://paperswithcode.com/paper/continuous-variable-quantum-neural-networks
Repo https://github.com/XanaduAI/quantum-learning
Framework tf
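
A single CV layer of the kind described above can be sketched in Strawberry Fields: Gaussian gates (interferometer, squeezing, displacement) play the role of the affine transformation and non-Gaussian Kerr gates supply the nonlinearity. All parameter values below are arbitrary placeholders, and the layer is not taken from the paper's experiments.

```python
import strawberryfields as sf
from strawberryfields.ops import BSgate, Dgate, Kgate, Rgate, Sgate

prog = sf.Program(2)
with prog.context as q:
    # "affine" part: interferometer + squeezing + interferometer + displacement
    BSgate(0.4, 0.0) | (q[0], q[1])
    Rgate(0.3) | q[0]
    Sgate(0.1) | q[0]
    Sgate(0.2) | q[1]
    BSgate(0.5, 0.1) | (q[0], q[1])
    Dgate(0.2) | q[0]
    Dgate(0.1) | q[1]
    # non-Gaussian "activation function"
    Kgate(0.05) | q[0]
    Kgate(0.05) | q[1]

eng = sf.Engine("fock", backend_options={"cutoff_dim": 5})
state = eng.run(prog).state
print(state.mean_photon(0))   # (mean, variance) of photon number in mode 0
```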

Albumentations: fast and flexible image augmentations

Title Albumentations: fast and flexible image augmentations
Authors Alexander Buslaev, Alex Parinov, Eugene Khvedchenya, Vladimir I. Iglovikov, Alexandr A. Kalinin
Abstract Data augmentation is a commonly used technique for increasing both the size and the diversity of labeled training sets by leveraging input transformations that preserve output labels. In the computer vision domain, image augmentations have become a common implicit regularization technique to combat overfitting in deep convolutional neural networks and are ubiquitously used to improve performance. While most deep learning frameworks implement basic image transformations, the list is typically limited to some variations and combinations of flipping, rotating, scaling, and cropping. Moreover, the image processing speed varies among existing tools for image augmentation. We present Albumentations, a fast and flexible library for image augmentations with a large variety of image transform operations available, which is also an easy-to-use wrapper around other augmentation libraries. We provide examples of image augmentations for different computer vision tasks and show that Albumentations is faster than other commonly used image augmentation tools on most of the commonly used image transformations. The source code for Albumentations is made publicly available online at https://github.com/albu/albumentations
Tasks Data Augmentation, Image Augmentation
Published 2018-09-18
URL http://arxiv.org/abs/1809.06839v1
PDF http://arxiv.org/pdf/1809.06839v1.pdf
PWC https://paperswithcode.com/paper/albumentations-fast-and-flexible-image
Repo https://github.com/albu/albumentations
Framework pytorch
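
Typical Albumentations usage looks like the following: declare a pipeline once, then apply it to images (and, if needed, masks or bounding boxes). The particular transforms chosen here are arbitrary.

```python
import albumentations as A
import numpy as np

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.3),
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1, rotate_limit=15, p=0.5),
])

image = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]      # augmented numpy array
print(augmented.shape)
```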

Variance Networks: When Expectation Does Not Meet Your Expectations

Title Variance Networks: When Expectation Does Not Meet Your Expectations
Authors Kirill Neklyudov, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
Abstract Ordinary stochastic neural networks mostly rely on the expected values of their weights to make predictions, whereas the induced noise is mostly used to capture the uncertainty, prevent overfitting and slightly boost the performance through test-time averaging. In this paper, we introduce variance layers, a different kind of stochastic layer. Each weight of a variance layer follows a zero-mean distribution and is only parameterized by its variance. We show that such layers can learn surprisingly well, can serve as an efficient exploration tool in reinforcement learning tasks and provide a decent defense against adversarial attacks. We also show that a number of conventional Bayesian neural networks naturally converge to such zero-mean posteriors. We observe that in these cases such zero-mean parameterization leads to a much better training objective than conventional parameterizations where the mean is being learned.
Tasks Efficient Exploration
Published 2018-03-10
URL http://arxiv.org/abs/1803.03764v5
PDF http://arxiv.org/pdf/1803.03764v5.pdf
PWC https://paperswithcode.com/paper/variance-networks-when-expectation-does-not
Repo https://github.com/jondaa/CS236605FinalProject
Framework pytorch
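
The idea of a variance layer is easy to sketch: each weight has zero mean and is parameterized only by its (log) standard deviation, so predictions must come from test-time averaging over weight samples. The `VarianceLinear` module below is a hypothetical illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class VarianceLinear(nn.Module):
    """Linear layer whose weights are zero-mean and parameterized only by variance."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.log_sigma = nn.Parameter(torch.full((out_features, in_features), -3.0))

    def forward(self, x):
        sigma = self.log_sigma.exp()
        eps = torch.randn_like(sigma)
        weight = sigma * eps                 # zero-mean weight sample
        return nn.functional.linear(x, weight)

layer = VarianceLinear(10, 4)
x = torch.randn(8, 10)
# Predictions rely on test-time averaging over weight samples
samples = torch.stack([layer(x) for _ in range(32)])
print(samples.mean(0).shape, samples.var(0).mean())
```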

A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice

Title A Theory-Based Evaluation of Nearest Neighbor Models Put Into Practice
Authors Hendrik Fichtenberger, Dennis Rohde
Abstract In the $k$-nearest neighborhood model ($k$-NN), we are given a set of points $P$, and we shall answer queries $q$ by returning the $k$ nearest neighbors of $q$ in $P$ according to some metric. This concept is crucial in many areas of data analysis and data processing, e.g., computer vision, document retrieval and machine learning. Many $k$-NN algorithms have been published and implemented, but often the relation between parameters and accuracy of the computed $k$-NN is not explicit. We study property testing of $k$-NN graphs in theory and evaluate it empirically: given a point set $P \subset \mathbb{R}^\delta$ and a directed graph $G=(P,E)$, is $G$ a $k$-NN graph, i.e., every point $p \in P$ has outgoing edges to its $k$ nearest neighbors, or is it $\epsilon$-far from being a $k$-NN graph? Here, $\epsilon$-far means that one has to change more than an $\epsilon$-fraction of the edges in order to make $G$ a $k$-NN graph. We develop a randomized algorithm with one-sided error that decides this question, i.e., a property tester for the $k$-NN property, with complexity $O(\sqrt{n} k^2 / \epsilon^2)$ measured in terms of the number of vertices and edges it inspects, and we prove a lower bound of $\Omega(\sqrt{n / \epsilon k})$. We evaluate our tester empirically on the $k$-NN models computed by various algorithms and show that it can be used to detect $k$-NN models with bad accuracy in significantly less time than the building time of the $k$-NN model.
Tasks
Published 2018-10-11
URL http://arxiv.org/abs/1810.05064v3
PDF http://arxiv.org/pdf/1810.05064v3.pdf
PWC https://paperswithcode.com/paper/a-theory-based-evaluation-of-nearest-neighbor
Repo https://github.com/derohde/knn_test
Framework none
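
In spirit, the tester samples vertices and checks whether their outgoing edges really point to their k nearest neighbors. The NumPy sketch below performs such a naive spot check; the paper's tester uses a more careful sampling scheme to obtain its one-sided error guarantee and stated complexity, so this is illustration only.

```python
import numpy as np

def spot_check_knn_graph(points, out_edges, k, n_samples=10, rng=None):
    """Reject if a sampled vertex's edges are not its k nearest neighbors."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(points)
    for v in rng.choice(n, size=min(n_samples, n), replace=False):
        dists = np.linalg.norm(points - points[v], axis=1)
        dists[v] = np.inf
        true_knn = set(np.argsort(dists)[:k])
        if set(out_edges[v]) != true_knn:
            return False            # found a violating vertex: reject
    return True                     # no violation found: accept

rng = np.random.default_rng(1)
pts = rng.normal(size=(100, 2))
k = 3
# Build a correct k-NN graph, then corrupt one vertex's edges
edges = {v: list(np.argsort(np.where(np.arange(100) == v, np.inf,
                                     np.linalg.norm(pts - pts[v], axis=1)))[:k])
         for v in range(100)}
print(spot_check_knn_graph(pts, edges, k))                   # True
edges[0] = [97, 98, 99]                                       # almost certainly wrong neighbors
print(spot_check_knn_graph(pts, edges, k, n_samples=100))     # False once vertex 0 is sampled
```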

Model selection with lasso-zero: adding straw to the haystack to better find needles

Title Model selection with lasso-zero: adding straw to the haystack to better find needles
Authors Pascaline Descloux, Sylvain Sardy
Abstract The high-dimensional linear model $y = X \beta^0 + \epsilon$ is considered and the focus is put on the problem of recovering the support $S^0$ of the sparse vector $\beta^0.$ We introduce Lasso-Zero, a new $\ell_1$-based estimator whose novelty resides in an “overfit, then threshold” paradigm and the use of noise dictionaries concatenated to $X$ for overfitting the response. To select the threshold, we employ the quantile universal threshold based on a pivotal statistic that requires neither knowledge nor preliminary estimation of the noise level. Numerical simulations show that Lasso-Zero performs well in terms of support recovery and provides an excellent trade-off between high true positive rate and low false discovery rate compared to competitors. Our methodology is supported by theoretical results showing that when no noise dictionary is used, Lasso-Zero recovers the signs of $\beta^0$ under weaker conditions on $X$ and $S^0$ than the Lasso and achieves sign consistency for correlated Gaussian designs. The use of noise dictionary improves the procedure for low signals.
Tasks Model Selection
Published 2018-05-14
URL http://arxiv.org/abs/1805.05133v2
PDF http://arxiv.org/pdf/1805.05133v2.pdf
PWC https://paperswithcode.com/paper/model-selection-with-lasso-zero-adding-straw
Repo https://github.com/pascalinedescloux/lasso-zero
Framework none
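
The "overfit, then threshold" paradigm can be sketched as: concatenate a Gaussian noise dictionary to X, fit an (almost) interpolating l1 estimator, and keep only the coefficients of X whose magnitude exceeds a threshold. In the sketch below, a tiny-penalty scikit-learn Lasso stands in for basis pursuit and a fixed threshold stands in for the quantile universal threshold, so it approximates the idea rather than reproducing the authors' procedure.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, q = 50, 200, 50                                # samples, features, noise columns
X = rng.normal(size=(n, p))
beta0 = np.zeros(p)
beta0[:3] = [3.0, -2.0, 1.5]                          # sparse true signal
y = X @ beta0 + 0.5 * rng.normal(size=n)

G = rng.normal(size=(n, q))                           # noise dictionary ("straw")
XG = np.hstack([X, G])

fit = Lasso(alpha=1e-4, max_iter=100000).fit(XG, y)   # overfit the response
coef_X = fit.coef_[:p]

tau = 0.5                                             # threshold (chosen by QUT in the paper)
support = np.flatnonzero(np.abs(coef_X) > tau)
print(support)                                        # ideally recovers {0, 1, 2}
```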

Universal and Succinct Source Coding of Deep Neural Networks

Title Universal and Succinct Source Coding of Deep Neural Networks
Authors Sourya Basu, Lav R. Varshney
Abstract Deep neural networks have shown incredible performance for inference tasks in a variety of domains. Unfortunately, most current deep networks are enormous cloud-based structures that require significant storage space, which limits scaling of deep learning as a service (DLaaS) and use for on-device intelligence. This paper is concerned with finding universal lossless compressed representations of deep feedforward networks with synaptic weights drawn from discrete sets, and directly performing inference without full decompression. The basic insight that allows less rate than naive approaches is recognizing that the bipartite graph layers of feedforward networks have a kind of permutation invariance to the labeling of nodes, in terms of inferential operation. We provide efficient algorithms to dissipate this irrelevant uncertainty and then use arithmetic coding to nearly achieve the entropy bound in a universal manner. We also provide experimental results of our approach on several standard datasets.
Tasks
Published 2018-04-09
URL https://arxiv.org/abs/1804.02800v2
PDF https://arxiv.org/pdf/1804.02800v2.pdf
PWC https://paperswithcode.com/paper/universal-and-succinct-source-coding-of-deep
Repo https://github.com/basusourya/DNN
Framework none
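
The permutation-invariance insight is easy to demonstrate: relabeling the hidden units of a feedforward layer leaves the function unchanged as long as the same permutation is applied to the next layer's weights, so a compressor only needs to store one canonical ordering, saving on the order of log2(m!) bits for m hidden units. The toy NumPy example below illustrates the equivalence; the paper's actual coder uses arithmetic coding over these equivalence classes.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.integers(-2, 3, size=(4, 3))    # input(3) -> hidden(4), discrete weights
W2 = rng.integers(-2, 3, size=(2, 4))    # hidden(4) -> output(2)

def forward(x, W1, W2):
    h = np.maximum(W1 @ x, 0)            # ReLU hidden layer
    return W2 @ h

def canonicalize(W1, W2):
    # sort hidden units by their incoming-weight rows (lexicographically)
    order = np.lexsort(W1.T[::-1])
    return W1[order], W2[:, order]

x = rng.normal(size=3)
C1, C2 = canonicalize(W1, W2)
print(np.allclose(forward(x, W1, W2), forward(x, C1, C2)))   # True: same function
print(math.log2(math.factorial(4)), "bits of labeling freedom for a width-4 layer")
```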

Effects of sampling skewness of the importance-weighted risk estimator on model selection

Title Effects of sampling skewness of the importance-weighted risk estimator on model selection
Authors Wouter M. Kouw, Marco Loog
Abstract Importance-weighting is a popular and well-researched technique for dealing with sample selection bias and covariate shift. It has desirable characteristics such as unbiasedness, consistency and low computational complexity. However, weighting can have a detrimental effect on an estimator as well. In this work, we empirically show that the sampling distribution of an importance-weighted estimator can be skewed. For sample selection bias settings, and for small sample sizes, the importance-weighted risk estimator produces overestimates for datasets in the body of the sampling distribution, i.e. the majority of cases, and large underestimates for datasets in the tail of the sampling distribution. These over- and underestimates of the risk lead to suboptimal regularization parameters when used for importance-weighted validation.
Tasks Model Selection
Published 2018-04-19
URL http://arxiv.org/abs/1804.07344v1
PDF http://arxiv.org/pdf/1804.07344v1.pdf
PWC https://paperswithcode.com/paper/effects-of-sampling-skewness-of-the
Repo https://github.com/wmkouw/covshift-skewness
Framework none
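
The estimator in question is simply the weighted empirical risk (1/n) Σ w(x_i) loss(x_i) with w = p_target / p_source. The small simulation below (arbitrary Gaussian source/target distributions and a stand-in squared loss) illustrates how its sampling distribution can be noticeably skewed at small sample sizes; it is not the paper's experimental setup.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu_src, mu_tgt, sigma = 0.0, 0.5, 1.0

def weighted_risk(n):
    x = rng.normal(mu_src, sigma, size=n)                      # sample from the source
    w = stats.norm.pdf(x, mu_tgt, sigma) / stats.norm.pdf(x, mu_src, sigma)
    loss = (x - 1.0) ** 2                                      # stand-in loss
    return np.mean(w * loss)

estimates = np.array([weighted_risk(20) for _ in range(5000)])
print("mean estimate:  ", round(float(estimates.mean()), 3))
print("median estimate:", round(float(np.median(estimates)), 3))
print("skewness:       ", round(float(stats.skew(estimates)), 3))
# A clearly nonzero skewness means the typical (median) estimate differs from the
# estimator's expected value, which is what makes importance-weighted validation
# unreliable at small sample sizes.
```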