July 30, 2019

Paper Group AWR 49

Bidirectional Attention for SQL Generation

Title Bidirectional Attention for SQL Generation
Authors Tong Guo, Huilin Gao
Abstract Generating structured query language (SQL) queries from natural language is a long-standing open problem. Answering a natural language question about a database table requires modeling complex interactions between the columns of the table and the question. In this paper, we apply the synthesizing approach to solve this problem. Based on the structure of SQL queries, we break the model down into three sub-modules and design specific deep neural networks for each of them. Taking inspiration from the similar machine reading task, we employ bidirectional attention mechanisms and character-level embeddings with convolutional neural networks (CNNs) to improve the results. Experimental evaluations show that our model achieves state-of-the-art results on the WikiSQL dataset.
Tasks Reading Comprehension
Published 2017-12-30
URL http://arxiv.org/abs/1801.00076v6
PDF http://arxiv.org/pdf/1801.00076v6.pdf
PWC https://paperswithcode.com/paper/bidirectional-attention-for-sql-generation
Repo https://github.com/guotong1988/NL2SQL
Framework pytorch
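
To make the mechanism concrete, here is a minimal sketch (not the authors' code; dimensions and names are illustrative) of bidirectional attention between question tokens and table-column encodings:

```python
import torch
import torch.nn.functional as F

def bidirectional_attention(q, c):
    """q: (n_q, d) question token encodings; c: (n_c, d) column encodings."""
    s = q @ c.t()                      # (n_q, n_c) similarity matrix
    q2c = F.softmax(s, dim=1) @ c      # each question token attends to columns
    c2q = F.softmax(s, dim=0).t() @ q  # each column attends to question tokens
    return q2c, c2q

q2c, c2q = bidirectional_attention(torch.randn(8, 64), torch.randn(5, 64))
print(q2c.shape, c2q.shape)  # torch.Size([8, 64]) torch.Size([5, 64])
```

The two attended views are what per-clause sub-modules would consume alongside the raw encodings.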

Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net

Title Analytic solution and stationary phase approximation for the Bayesian lasso and elastic net
Authors Tom Michoel
Abstract The lasso and elastic net linear regression models impose a double-exponential prior distribution on the model parameters to achieve regression shrinkage and variable selection, allowing the inference of robust models from large data sets. However, there has been limited success in deriving estimates for the full posterior distribution of regression coefficients in these models, due to a need to evaluate analytically intractable partition function integrals. Here, the Fourier transform is used to express these integrals as complex-valued oscillatory integrals over “regression frequencies”. This results in an analytic expansion and stationary phase approximation for the partition functions of the Bayesian lasso and elastic net, where the non-differentiability of the double-exponential prior has so far eluded such an approach. Use of this approximation leads to highly accurate numerical estimates for the expectation values and marginal posterior distributions of the regression coefficients, and allows for Bayesian inference of much higher dimensional models than previously possible.
Tasks Bayesian Inference
Published 2017-09-25
URL http://arxiv.org/abs/1709.08535v3
PDF http://arxiv.org/pdf/1709.08535v3.pdf
PWC https://paperswithcode.com/paper/analytic-solution-and-stationary-phase
Repo https://github.com/tmichoel/bayonet
Framework none
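
As a sketch of the setup (notation mine), the double-exponential prior makes the Bayesian lasso posterior and its partition function

```latex
\begin{align}
p(\beta \mid y, X) &\propto
  \exp\!\Big(-\tfrac{1}{2\sigma^2}\lVert y - X\beta\rVert_2^2
             - \lambda \lVert \beta \rVert_1\Big), \\
Z &= \int_{\mathbb{R}^p}
  \exp\!\Big(-\tfrac{1}{2\sigma^2}\lVert y - X\beta\rVert_2^2
             - \lambda \lVert \beta \rVert_1\Big)\, d\beta,
\end{align}
```

with the elastic net adding a ridge term $\lambda_2 \lVert \beta \rVert_2^2$ to the exponent. The $\ell_1$ term is non-differentiable at zero, which is what blocks a textbook stationary phase expansion and motivates the paper's move to Fourier-space "regression frequencies".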

A Generic Deep Architecture for Single Image Reflection Removal and Image Smoothing

Title A Generic Deep Architecture for Single Image Reflection Removal and Image Smoothing
Authors Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf
Abstract This paper proposes a deep neural network structure that exploits edge information in addressing representative low-level vision tasks such as layer separation and image filtering. Unlike most other deep learning strategies applied in this context, our approach tackles these challenging problems by estimating edges and reconstructing images using only cascaded convolutional layers arranged such that no handcrafted or application-specific image-processing components are required. We apply the resulting transferable pipeline to two different problem domains that are both sensitive to edges, namely, single image reflection removal and image smoothing. For the former, using a mild reflection smoothness assumption and a novel synthetic data generation method that acts as a type of weak supervision, our network is able to solve much more difficult reflection cases that cannot be handled by previous methods. For the latter, we likewise exceed state-of-the-art quantitative and qualitative results by wide margins. In all cases, the proposed framework is simple, fast, and easy to transfer across disparate domains.
Tasks Synthetic Data Generation
Published 2017-08-11
URL http://arxiv.org/abs/1708.03474v2
PDF http://arxiv.org/pdf/1708.03474v2.pdf
PWC https://paperswithcode.com/paper/a-generic-deep-architecture-for-single-image
Repo https://github.com/fqnchina/CEILNet
Framework torch
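
A rough sketch of the cascaded idea (the real architecture differs; this toy is an assumption): one sub-network estimates an edge map of the target layer, and a second reconstructs the image conditioned on the input plus the predicted edges.

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

class EdgeThenImage(nn.Module):
    def __init__(self):
        super().__init__()
        self.edge_net = nn.Sequential(conv_block(3, 32), conv_block(32, 32),
                                      nn.Conv2d(32, 1, 3, padding=1))   # edge map
        self.image_net = nn.Sequential(conv_block(4, 32), conv_block(32, 32),
                                       nn.Conv2d(32, 3, 3, padding=1))  # output layer

    def forward(self, x):
        e = self.edge_net(x)                      # estimate target-layer edges
        y = self.image_net(torch.cat([x, e], 1))  # reconstruct guided by edges
        return y, e

y, e = EdgeThenImage()(torch.randn(1, 3, 64, 64))
print(y.shape, e.shape)  # (1, 3, 64, 64) (1, 1, 64, 64)
```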

Multi-Task Learning by Deep Collaboration and Application in Facial Landmark Detection

Title Multi-Task Learning by Deep Collaboration and Application in Facial Landmark Detection
Authors Ludovic Trottier, Philippe Giguère, Brahim Chaib-draa
Abstract Convolutional neural networks (CNNs) have become the most successful approach in many vision-related domains. However, they are limited to domains where data is abundant. Recent works have looked at multi-task learning (MTL) to mitigate data scarcity by leveraging domain-specific information from related tasks. In this paper, we present a novel soft parameter-sharing mechanism for CNNs in an MTL setting, which we refer to as Deep Collaboration. We propose taking into account the notion that task relevance depends on depth by using lateral transformation blocks with skip connections. This allows extracting task-specific features at various depths without sacrificing features relevant to all tasks. We show that CNNs connected with our Deep Collaboration mechanism obtain better accuracy on facial landmark detection with related tasks. We finally verify that our approach effectively allows knowledge sharing by showing the depth-specific influence of tasks that we know are related.
Tasks Facial Landmark Detection, Multi-Task Learning
Published 2017-10-28
URL http://arxiv.org/abs/1711.00111v2
PDF http://arxiv.org/pdf/1711.00111v2.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-by-deep-collaboration-and
Repo https://github.com/ltrottier/deep-collaboration-network
Framework pytorch
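
A minimal sketch of the soft parameter-sharing idea (the block's exact form here is an assumption): at a given depth, task B's column receives task A's features through a learned lateral transformation with a skip path, so B keeps its own features while borrowing what is useful.

```python
import torch
import torch.nn as nn

class LateralBlock(nn.Module):
    """Feeds features from task A's column into task B's column at one depth."""
    def __init__(self, channels):
        super().__init__()
        self.transform = nn.Conv2d(channels, channels, 1)  # learned lateral map

    def forward(self, feat_b, feat_a):
        # skip connection: B's own features pass through unchanged, and a
        # learned transformation of A's same-depth features is added
        return feat_b + self.transform(feat_a)

out = LateralBlock(32)(torch.randn(1, 32, 16, 16), torch.randn(1, 32, 16, 16))
```

Placing one such block at every depth is what lets the sharing pattern vary with depth.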

Contextualized Word Representations for Reading Comprehension

Title Contextualized Word Representations for Reading Comprehension
Authors Shimi Salant, Jonathan Berant
Abstract Reading a document and extracting an answer to a question about its content has attracted substantial attention recently. While most work has focused on the interaction between the question and the document, in this work we evaluate the importance of context when the question and document are processed independently. We take a standard neural architecture for this task, and show that by providing rich contextualized word representations from a large pre-trained language model as well as allowing the model to choose between context-dependent and context-independent word representations, we can obtain dramatic improvements and reach performance comparable to state-of-the-art on the competitive SQuAD dataset.
Tasks Language Modelling, Question Answering, Reading Comprehension
Published 2017-12-10
URL http://arxiv.org/abs/1712.03609v4
PDF http://arxiv.org/pdf/1712.03609v4.pdf
PWC https://paperswithcode.com/paper/contextualized-word-representations-for
Repo https://github.com/shimisalant/CWR
Framework tf
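
One way to let a model "choose between context-dependent and context-independent word representations" is a learned per-dimension gate; this sketch assumes that formulation and invents the names.

```python
import torch
import torch.nn as nn

class GatedWordRep(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, static_emb, lm_emb):
        g = torch.sigmoid(self.gate(torch.cat([static_emb, lm_emb], -1)))
        return g * lm_emb + (1 - g) * static_emb  # per-dimension mixture

rep = GatedWordRep(300)(torch.randn(4, 10, 300), torch.randn(4, 10, 300))
```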

A Mixture of Matrix Variate Bilinear Factor Analyzers

Title A Mixture of Matrix Variate Bilinear Factor Analyzers
Authors Michael P. B. Gallaugher, Paul D. McNicholas
Abstract Over the years, data has become increasingly high dimensional, which has prompted an increased need for dimension reduction techniques. This is perhaps especially true for clustering (unsupervised classification) as well as semi-supervised and supervised classification. Although dimension reduction in the area of clustering for multivariate data has been quite thoroughly discussed in the literature, there is relatively little work in the area of three-way, or matrix variate, data. Herein, we develop a mixture of matrix variate bilinear factor analyzers (MMVBFA) model for use in clustering high-dimensional matrix variate data. This work can be considered both the first matrix variate bilinear factor analysis model and the first MMVBFA model. Parameter estimation is discussed, and the MMVBFA model is illustrated using simulated and real data.
Tasks Dimensionality Reduction
Published 2017-12-22
URL http://arxiv.org/abs/1712.08664v3
PDF http://arxiv.org/pdf/1712.08664v3.pdf
PWC https://paperswithcode.com/paper/a-mixture-of-matrix-variate-bilinear-factor
Repo https://github.com/nikpocuca/MatrixVariate.jl
Framework none
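
For orientation, one common form of a matrix variate bilinear factor model looks as follows (notation mine; see the paper for the exact MMVBFA specification):

```latex
% X: r x c observed matrix, M: mean matrix,
% Lambda (r x q) and Delta (c x p): row and column factor loadings,
% U (q x p): latent factor matrix, E: matrix variate normal error.
\begin{equation}
\mathbf{X} = \mathbf{M} + \mathbf{\Lambda}\,\mathbf{U}\,\mathbf{\Delta}^{\top} + \mathbf{E}
\end{equation}
```

In the mixture version, each cluster $g$ carries its own $\mathbf{M}_g$, $\mathbf{\Lambda}_g$, $\mathbf{\Delta}_g$, and dimension reduction comes from choosing $q \ll r$ and $p \ll c$.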

Simple and Effective Multi-Paragraph Reading Comprehension

Title Simple and Effective Multi-Paragraph Reading Comprehension
Authors Christopher Clark, Matt Gardner
Abstract We consider the problem of adapting neural paragraph-level question answering models to the case where entire documents are given as input. Our proposed solution trains models to produce well calibrated confidence scores for their results on individual paragraphs. We sample multiple paragraphs from the documents during training, and use a shared-normalization training objective that encourages the model to produce globally correct output. We combine this method with a state-of-the-art pipeline for training models on document QA data. Experiments demonstrate strong performance on several document QA datasets. Overall, we are able to achieve a score of 71.3 F1 on the web portion of TriviaQA, a large improvement from the 56.7 F1 of the previous best system.
Tasks Question Answering, Reading Comprehension
Published 2017-10-29
URL http://arxiv.org/abs/1710.10723v2
PDF http://arxiv.org/pdf/1710.10723v2.pdf
PWC https://paperswithcode.com/paper/simple-and-effective-multi-paragraph-reading
Repo https://github.com/allenai/document-qa
Framework tf
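
The shared-normalization objective is easy to state in code: answer scores from all sampled paragraphs of one document go into a single softmax, so confidence scores stay comparable across paragraphs. This is a simplified sketch (the single-score-per-token setup and shapes are illustrative).

```python
import torch

def shared_norm_loss(scores_per_paragraph, gold_mask_per_paragraph):
    """scores: list of (n_tokens,) tensors; gold masks flag correct answer starts."""
    scores = torch.cat(scores_per_paragraph)        # one pool over all paragraphs
    gold = torch.cat(gold_mask_per_paragraph).bool()
    log_z = torch.logsumexp(scores, dim=0)          # shared normalizer
    # negative log of the summed probability of all correct answer positions
    return log_z - torch.logsumexp(scores[gold], dim=0)

loss = shared_norm_loss(
    [torch.randn(30), torch.randn(25)],
    [torch.zeros(30).index_fill_(0, torch.tensor([4]), 1), torch.zeros(25)])
```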

Learned in Translation: Contextualized Word Vectors

Title Learned in Translation: Contextualized Word Vectors
Authors Bryan McCann, James Bradbury, Caiming Xiong, Richard Socher
Abstract Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees initialization of only the lowest layer of deep models with pretrained word vectors. In this paper, we use a deep LSTM encoder from an attentional sequence-to-sequence model trained for machine translation (MT) to contextualize word vectors. We show that adding these context vectors (CoVe) improves performance over using only unsupervised word and character vectors on a wide variety of common NLP tasks: sentiment analysis (SST, IMDb), question classification (TREC), entailment (SNLI), and question answering (SQuAD). For fine-grained sentiment analysis and entailment, CoVe improves performance of our baseline models to the state of the art.
Tasks Machine Translation, Question Answering, Sentiment Analysis, Text Classification
Published 2017-08-01
URL http://arxiv.org/abs/1708.00107v2
PDF http://arxiv.org/pdf/1708.00107v2.pdf
PWC https://paperswithcode.com/paper/learned-in-translation-contextualized-word
Repo https://github.com/menajosep/AleatoricSent
Framework tf
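
The CoVe recipe itself is compact: run pretrained word vectors through the MT-trained BiLSTM encoder and concatenate its outputs with the original vectors. In this sketch the encoder is a randomly initialized stand-in, not the released weights.

```python
import torch
import torch.nn as nn

glove_dim, hidden = 300, 300
mt_encoder = nn.LSTM(glove_dim, hidden, num_layers=2, bidirectional=True,
                     batch_first=True)  # stand-in for the pretrained MT encoder

def cove(glove_vectors):
    context, _ = mt_encoder(glove_vectors)          # (batch, seq, 2 * hidden)
    return torch.cat([glove_vectors, context], -1)  # [GloVe; CoVe] per token

features = cove(torch.randn(2, 12, glove_dim))      # (2, 12, 900)
```

Downstream task models then consume the concatenated features in place of plain word vectors.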

Watset: Automatic Induction of Synsets from a Graph of Synonyms

Title Watset: Automatic Induction of Synsets from a Graph of Synonyms
Authors Dmitry Ustalov, Alexander Panchenko, Chris Biemann
Abstract This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings. First, we build a weighted graph of synonyms extracted from commonly available resources, such as Wiktionary. Second, we apply word sense induction to deal with ambiguous words. Finally, we cluster the disambiguated version of the ambiguous input graph into synsets. Our meta-clustering approach lets us use an efficient hard clustering algorithm to perform a fuzzy clustering of the graph. Despite its simplicity, our approach shows excellent results, outperforming five competitive state-of-the-art methods in terms of F-score on three gold standard datasets for English and Russian derived from large-scale manually constructed lexical resources.
Tasks Word Embeddings, Word Sense Induction
Published 2017-04-24
URL http://arxiv.org/abs/1704.07157v1
PDF http://arxiv.org/pdf/1704.07157v1.pdf
PWC https://paperswithcode.com/paper/watset-automatic-induction-of-synsets-from-a
Repo https://github.com/dustalov/watset
Framework none
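
A condensed sketch of the pipeline (heavily simplified: the paper uses dedicated graph clustering algorithms where this stand-in uses connected components):

```python
import networkx as nx

def induce_senses(g, word):
    """Word sense induction: cluster the neighborhood of `word`
    (stand-in: connected components among its neighbors)."""
    ego = g.subgraph(g[word]).copy()
    return [frozenset(c) for c in nx.connected_components(ego)] or [frozenset()]

def watset(g):
    senses = {w: induce_senses(g, w) for w in g}
    # 1) split each ambiguous word into one node per induced sense
    sense_graph = nx.Graph()
    for w in g:
        for i, ctx in enumerate(senses[w]):
            for v in ctx:
                # link this sense of w to the sense of v whose context holds w
                j = next(k for k, c in enumerate(senses[v]) if w in c)
                sense_graph.add_edge((w, i), (v, j))
    # 2) hard-cluster the disambiguated graph; clusters become synsets
    return [{w for w, _ in c} for c in nx.connected_components(sense_graph)]

g = nx.Graph([("bank", "shore"), ("bank", "lender"), ("shore", "coast")])
print(watset(g))  # a word can land in several synsets: the clustering is fuzzy
```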

PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications

Title PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
Authors Tim Salimans, Andrej Karpathy, Xi Chen, Diederik P. Kingma
Abstract PixelCNNs are a recently proposed class of powerful generative models with tractable likelihood. Here we discuss our implementation of PixelCNNs which we make available at https://github.com/openai/pixel-cnn. Our implementation contains a number of modifications to the original model that both simplify its structure and improve its performance. 1) We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find to speed up training. 2) We condition on whole pixels, rather than R/G/B sub-pixels, simplifying the model structure. 3) We use downsampling to efficiently capture structure at multiple resolutions. 4) We introduce additional short-cut connections to further speed up optimization. 5) We regularize the model using dropout. Finally, we present state-of-the-art log likelihood results on CIFAR-10 to demonstrate the usefulness of these modifications.
Tasks Image Generation
Published 2017-01-19
URL http://arxiv.org/abs/1701.05517v1
PDF http://arxiv.org/pdf/1701.05517v1.pdf
PWC https://paperswithcode.com/paper/pixelcnn-improving-the-pixelcnn-with
Repo https://github.com/openai/pixel-cnn
Framework tf
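
Modification (1) is the most reusable piece: the likelihood of a pixel value is the logistic CDF mass on its discretization bin. A simplified single-component sketch (the released code additionally handles mixtures, edge bins, and RGB conditioning):

```python
import torch

def discretized_logistic_logprob(x, mean, log_scale):
    """x: pixel values rescaled to [-1, 1]; half bin width is 1/255."""
    inv_s = torch.exp(-log_scale)
    cdf_plus = torch.sigmoid(inv_s * (x - mean + 1.0 / 255))
    cdf_minus = torch.sigmoid(inv_s * (x - mean - 1.0 / 255))
    return torch.log((cdf_plus - cdf_minus).clamp(min=1e-12))

x = torch.randint(0, 256, (4,)).float() / 127.5 - 1.0   # rescale to [-1, 1]
lp = discretized_logistic_logprob(x, torch.zeros(4), torch.zeros(4))
```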

Learning Deep CNN Denoiser Prior for Image Restoration

Title Learning Deep CNN Denoiser Prior for Image Restoration
Authors Kai Zhang, Wangmeng Zuo, Shuhang Gu, Lei Zhang
Abstract Model-based optimization methods and discriminative learning methods have been the two dominant strategies for solving various inverse problems in low-level vision. Typically, these two kinds of methods have their respective merits and drawbacks: model-based optimization methods are flexible for handling different inverse problems but are usually time-consuming, relying on sophisticated priors for good performance; meanwhile, discriminative learning methods have fast testing speed, but their application range is greatly restricted by the specialized task they are trained for. Recent works have revealed that, with the aid of variable splitting techniques, a denoiser prior can be plugged in as a modular part of model-based optimization methods to solve other inverse problems (e.g., deblurring). Such an integration induces considerable advantage when the denoiser is obtained via discriminative learning. However, the study of integration with a fast discriminative denoiser prior is still lacking. To this end, this paper aims to train a set of fast and effective CNN (convolutional neural network) denoisers and integrate them into model-based optimization methods to solve other inverse problems. Experimental results demonstrate that the learned set of denoisers not only achieves promising Gaussian denoising results but can also be used as a prior to deliver good performance for various low-level vision applications.
Tasks Deblurring, Denoising, Image Denoising, Image Restoration
Published 2017-04-11
URL http://arxiv.org/abs/1704.03264v1
PDF http://arxiv.org/pdf/1704.03264v1.pdf
PWC https://paperswithcode.com/paper/learning-deep-cnn-denoiser-prior-for-image
Repo https://github.com/cszn/ircnn
Framework none
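
Schematically, the plug-and-play loop alternates a data-fidelity step for the inverse problem y = A(x) + noise with a call to the learned denoiser standing in for the prior's proximal step. This sketch uses a plain gradient step and an untrained placeholder network, so it only illustrates the control flow.

```python
import torch
import torch.nn as nn

cnn_denoiser = nn.Sequential(  # placeholder for a trained CNN denoiser
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1))

def plug_and_play(y, A, steps=30, lr=0.1):
    x = y.clone()
    for _ in range(steps):
        x = x.detach().requires_grad_(True)
        loss = ((A(x) - y) ** 2).sum()   # data-fidelity term ||A(x) - y||^2
        loss.backward()
        x = (x - lr * x.grad).detach()   # gradient step on the data term
        with torch.no_grad():
            x = cnn_denoiser(x)          # prior step: denoiser as regularizer
    return x

x_hat = plug_and_play(torch.randn(1, 1, 32, 32), A=lambda x: x)  # identity A
```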

Dual Path Networks

Title Dual Path Networks
Authors Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng
Abstract In this work, we present a simple, highly efficient and modularized Dual Path Network (DPN) for image classification which introduces a new internal topology of connection paths. By revealing the equivalence of the state-of-the-art Residual Network (ResNet) and Densely Connected Convolutional Network (DenseNet) within the HORNN framework, we find that ResNet enables feature re-use while DenseNet enables new-feature exploration, both of which are important for learning good representations. To enjoy the benefits of both path topologies, our proposed Dual Path Network shares common features while maintaining the flexibility to explore new features through dual path architectures. Extensive experiments on three benchmark datasets, ImageNet-1k, Places365 and PASCAL VOC, clearly demonstrate superior performance of the proposed DPN over state-of-the-art models. In particular, on the ImageNet-1k dataset, a shallow DPN surpasses the best ResNeXt-101(64x4d) with 26% smaller model size, 25% less computational cost and 8% lower memory consumption, and a deeper DPN (DPN-131) further pushes the state-of-the-art single-model performance with about 2 times faster training speed. Experiments on the Places365 large-scale scene dataset, PASCAL VOC detection dataset, and PASCAL VOC segmentation dataset also demonstrate its consistently better performance than DenseNet, ResNet and the latest ResNeXt model over various applications.
Tasks Image Classification
Published 2017-07-06
URL http://arxiv.org/abs/1707.01629v2
PDF http://arxiv.org/pdf/1707.01629v2.pdf
PWC https://paperswithcode.com/paper/dual-path-networks
Repo https://github.com/crowdAI/crowdai-musical-genre-recognition-starter-kit
Framework none
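
The dual path idea reduces to how a block routes its output: part is added back (residual path, feature re-use) and part is concatenated (dense path, new-feature exploration). A minimal sketch with illustrative channel sizes:

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    def __init__(self, res_channels, dense_in, dense_growth):
        super().__init__()
        self.res_channels = res_channels
        self.conv = nn.Conv2d(res_channels + dense_in,
                              res_channels + dense_growth, 3, padding=1)

    def forward(self, res, dense):
        out = self.conv(torch.cat([res, dense], 1))
        res_out, dense_out = out.split(
            [self.res_channels, out.size(1) - self.res_channels], 1)
        return res + res_out, torch.cat([dense, dense_out], 1)

res, dense = DualPathBlock(64, 16, 16)(torch.randn(1, 64, 8, 8),
                                       torch.randn(1, 16, 8, 8))
# the residual path keeps 64 channels; the dense path grows from 16 to 32
```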

To prune, or not to prune: exploring the efficacy of pruning for model compression

Title To prune, or not to prune: exploring the efficacy of pruning for model compression
Authors Michael Zhu, Suyog Gupta
Abstract Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks at the cost of only a marginal loss in accuracy and achieve a sizable reduction in model size. This hints at the possibility that the baseline models in these experiments are perhaps severely over-parameterized at the outset, and that a viable alternative for model compression might be to simply reduce the number of hidden units while maintaining the model's dense connection structure, exposing a similar trade-off between model size and accuracy. We investigate these two distinct paths for model compression within the context of energy-efficient inference in resource-constrained environments and propose a new gradual pruning technique that is simple and straightforward to apply across a variety of models/datasets with minimal tuning and can be seamlessly incorporated within the training process. We compare the accuracy of large but pruned models (large-sparse) and their smaller but dense (small-dense) counterparts with an identical memory footprint. Across a broad range of neural network architectures (deep CNNs, stacked LSTM, and seq2seq LSTM models), we find large-sparse models to consistently outperform small-dense models and achieve up to a 10x reduction in the number of nonzero parameters with minimal loss in accuracy.
Tasks Model Compression
Published 2017-10-05
URL http://arxiv.org/abs/1710.01878v2
PDF http://arxiv.org/pdf/1710.01878v2.pdf
PWC https://paperswithcode.com/paper/to-prune-or-not-to-prune-exploring-the
Repo https://github.com/dorlivne/simple_net_pruning
Framework tf
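
The paper's gradual schedule ramps sparsity from an initial value s_i to a final value s_f along a cubic curve over n pruning steps, pruning the smallest-magnitude weights to hit each intermediate target. A sketch:

```python
import torch

def target_sparsity(t, s_i=0.0, s_f=0.9, t0=0, n=100, dt=1):
    """s_t = s_f + (s_i - s_f) * (1 - (t - t0) / (n * dt))^3, clipped to the ramp."""
    frac = min(max((t - t0) / (n * dt), 0.0), 1.0)
    return s_f + (s_i - s_f) * (1.0 - frac) ** 3

def magnitude_mask(weight, sparsity):
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()  # apply as weight * mask

w = torch.randn(256, 256)
for t in range(0, 101, 25):
    s = target_sparsity(t)
    print(t, round(s, 3), magnitude_mask(w, s).mean().item())
```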

DeepGO: Predicting protein functions from sequence and interactions using a deep ontology-aware classifier

Title DeepGO: Predicting protein functions from sequence and interactions using a deep ontology-aware classifier
Authors Maxat Kulmanov, Mohammed Asif Khan, Robert Hoehndorf
Abstract A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40,000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, particularly for predicting cellular locations.
Tasks
Published 2017-05-15
URL http://arxiv.org/abs/1705.05919v1
PDF http://arxiv.org/pdf/1705.05919v1.pdf
PWC https://paperswithcode.com/paper/deepgo-predicting-protein-functions-from
Repo https://github.com/bio-ontology-research-group/deepgo
Framework none
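
Because a protein annotated with a GO class is implicitly annotated with every ancestor, predictions should be hierarchy-consistent. One simple way (a sketch, not the paper's method; the toy scores are made up, though the IDs are real GO classes) is to lift each parent's score to at least the maximum of its descendants:

```python
def propagate(scores, children):
    """scores: {go_id: prob}; children: {go_id: [child go_ids]} over a DAG."""
    def lifted(term):
        return max([scores.get(term, 0.0)] +
                   [lifted(c) for c in children.get(term, [])])
    return {term: lifted(term) for term in scores}

scores = {"GO:0008150": 0.2, "GO:0009987": 0.7, "GO:0008152": 0.4}
children = {"GO:0008150": ["GO:0009987", "GO:0008152"]}  # biological_process
print(propagate(scores, children))  # the root is lifted from 0.2 to 0.7
```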

Massively Multilingual Neural Grapheme-to-Phoneme Conversion

Title Massively Multilingual Neural Grapheme-to-Phoneme Conversion
Authors Ben Peters, Jon Dehdari, Josef van Genabith
Abstract Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems. Most g2p systems are monolingual: they require language-specific data or handcrafted rules. Such systems are difficult to extend to low-resource languages, for which data and handcrafted rules are not available. As an alternative, we present a neural sequence-to-sequence approach to g2p that is trained on spelling–pronunciation pairs in hundreds of languages. The system shares a single encoder and decoder across all languages, allowing it to utilize the intrinsic similarities between different writing systems. We show an 11% improvement in phoneme error rate over an approach based on adapting high-resource monolingual g2p models to low-resource languages. Our model is also much more compact than previous approaches.
Tasks Speech Recognition
Published 2017-08-04
URL http://arxiv.org/abs/1708.01464v1
PDF http://arxiv.org/pdf/1708.01464v1.pdf
PWC https://paperswithcode.com/paper/massively-multilingual-neural-grapheme-to
Repo https://github.com/bpopeters/mg2p
Framework none
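
A standard way to share a single encoder and decoder across hundreds of languages is to mark each source sequence with a language ID token; this sketch assumes that trick (the paper's exact input format may differ) and the token names are invented.

```python
def encode_example(lang, graphemes):
    """Prepend a language tag so one shared seq2seq model can condition on it."""
    return ["<{}>".format(lang)] + list(graphemes)

print(encode_example("eng", "cough"))   # ['<eng>', 'c', 'o', 'u', 'g', 'h']
print(encode_example("deu", "schach"))  # ['<deu>', 's', 'c', 'h', 'a', 'c', 'h']
```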