Paper Group NANR 239
Robust image stitching with multiple registrations
Title | Robust image stitching with multiple registrations |
Authors | Charles Herrmann, Chen Wang, Richard Strong Bowen, Emil Keyder, Michael Krainin, Ce Liu, Ramin Zabih |
Abstract | Panorama creation is one of the most widely deployed techniques in computer vision. In addition to industry applications such as Google Street View, it is also used by millions of consumers in smartphones and other cameras. Traditionally, the problem is decomposed into three phases: registration, which picks a single transformation of each source image to align it to the other inputs, seam finding, which selects a source image for each pixel in the final result, and blending, which fixes minor visual artifacts. Here, we observe that the use of a single registration often leads to errors, especially in scenes with significant depth variation or object motion. We propose instead the use of multiple registrations, permitting regions of the image at different depths to be captured with greater accuracy. MRF inference techniques naturally extend to seam finding over multiple registrations, and we show here that their energy functions can be readily modified with new terms that discourage duplication and tearing, common problems that are exacerbated by the use of multiple registrations. Our techniques are closely related to layer-based stereo, and move image stitching closer to explicit scene modeling. Experimental evidence demonstrates that our techniques often generate significantly better panoramas when there is substantial motion or parallax. |
Tasks | Image Stitching |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Charles_Herrmann_Robust_image_stitching_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Charles_Herrmann_Robust_image_stitching_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/robust-image-stitching-with-multiple |
Repo | |
Framework | |
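The seam-finding step described in the abstract can be illustrated with a toy MRF on a one-dimensional strip of pixels: each pixel picks one of several registered source candidates, a data term scores agreement with a reference image, and a pairwise term charges for colour mismatch wherever the label switches. This is a minimal sketch of the general idea, not the paper's energy (which adds further terms discouraging duplication and tearing); all names and costs below are illustrative.

```python
import numpy as np

def seam_labels(candidates, reference, smooth_weight=1.0):
    """Choose, for each pixel of a 1-D strip, which candidate registration
    to copy from, by minimizing a data + smoothness energy with dynamic
    programming (exact for a chain-structured MRF).

    candidates: (k, n) array, k registered source rows of n pixels
    reference:  (n,) row the composite should agree with
    """
    k, n = candidates.shape
    data = np.abs(candidates - reference)          # (k, n) data term
    cost = np.zeros((n, k))
    back = np.zeros((n, k), dtype=int)
    cost[0] = data[:, 0]
    for i in range(1, n):
        for l in range(k):
            # switching labels pays for the colour mismatch at the seam
            trans = [cost[i - 1, m] +
                     (0.0 if m == l
                      else smooth_weight * abs(candidates[m, i] - candidates[l, i]))
                     for m in range(k)]
            back[i, l] = int(np.argmin(trans))
            cost[i, l] = data[l, i] + trans[back[i, l]]
    labels = np.empty(n, dtype=int)
    labels[-1] = int(np.argmin(cost[-1]))
    for i in range(n - 1, 0, -1):
        labels[i - 1] = back[i, labels[i]]
    return labels.tolist()

# registration 0 is accurate on the right, registration 1 on the left;
# the optimal seam sits where the two candidates agree
a = np.array([9., 9., 1., 5., 5., 5.])
b = np.array([1., 1., 1., 9., 9., 9.])
ref = np.array([1., 1., 1., 5., 5., 5.])
print(seam_labels(np.stack([a, b]), ref))   # → [1, 1, 0, 0, 0, 0]
```

The seam lands at pixel 2, the only place both registrations agree with the reference, which is exactly the behaviour a seam-finding energy is designed to produce.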
Improving Slot Filling in Spoken Language Understanding with Joint Pointer and Attention
Title | Improving Slot Filling in Spoken Language Understanding with Joint Pointer and Attention |
Authors | Lin Zhao, Zhe Feng |
Abstract | We present a generative neural network model for slot filling based on a sequence-to-sequence (Seq2Seq) model together with a pointer network, in the situation where only sentence-level slot annotations are available in the spoken dialogue data. This model predicts slot values by jointly learning to copy a word which may be out-of-vocabulary (OOV) from an input utterance through a pointer network, or generate a word within the vocabulary through an attentional Seq2Seq model. Experimental results show the effectiveness of our slot filling model, especially at addressing the OOV problem. Additionally, we integrate the proposed model into a spoken language understanding system and achieve the state-of-the-art performance on the benchmark data. |
Tasks | Slot Filling, Speech Recognition, Spoken Language Understanding |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2068/ |
https://www.aclweb.org/anthology/P18-2068 | |
PWC | https://paperswithcode.com/paper/improving-slot-filling-in-spoken-language |
Repo | |
Framework | |
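The copy-or-generate decision the abstract describes can be sketched as a pointer-generator mixture: with probability p_gen the decoder generates from the fixed vocabulary, and with probability 1 − p_gen it copies a source token weighted by attention, which is how OOV slot values become reachable. The names, shapes, and numbers below are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def joint_pointer_distribution(p_vocab, attn, src_tokens, vocab, p_gen):
    """Mix a Seq2Seq vocabulary distribution with a pointer (copy)
    distribution over the source utterance.  OOV source words get extra
    slots appended after the fixed vocabulary, so they stay predictable."""
    oovs = sorted(set(w for w in src_tokens if w not in vocab))
    ext = list(vocab) + oovs
    p = np.zeros(len(ext))
    p[:len(vocab)] = p_gen * np.asarray(p_vocab)   # generate path
    for a, w in zip(attn, src_tokens):             # copy path via attention
        p[ext.index(w)] += (1.0 - p_gen) * a
    return ext, p

vocab = ["book", "a", "table", "for"]
src = ["book", "sushi", "for", "two"]          # "sushi", "two" are OOV
ext, p = joint_pointer_distribution(
    p_vocab=[0.25, 0.25, 0.25, 0.25],          # uniform generator
    attn=[0.1, 0.7, 0.1, 0.1],                 # attention peaks on "sushi"
    src_tokens=src, vocab=vocab, p_gen=0.5)
print(ext[int(np.argmax(p))])                  # → sushi
```

Because both input distributions sum to one, the mixture is itself a valid distribution, and the OOV word "sushi" wins despite having no vocabulary slot.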
結合卷積神經網路與遞迴神經網路於推文極性分類 (Combining Convolutional Neural Network and Recurrent Neural Network for Tweet Polarity Classification) [In Chinese]
Title | 結合卷積神經網路與遞迴神經網路於推文極性分類 (Combining Convolutional Neural Network and Recurrent Neural Network for Tweet Polarity Classification) [In Chinese] |
Authors | Chih-Ting Yeh, Chia-Ping Chen |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1023/ |
https://www.aclweb.org/anthology/O18-1023 | |
PWC | https://paperswithcode.com/paper/caacccc2e-eee-ccc2e-14-ae-combining |
Repo | |
Framework | |
$D^2$: Decentralized Training over Decentralized Data
Title | $D^2$: Decentralized Training over Decentralized Data |
Authors | Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu |
Abstract | While training a machine learning model using multiple workers, each of which collects data from its own data source, it would be useful if the data collected from different workers were unique and different. Ironically, recent analysis of decentralized parallel stochastic gradient descent (D-PSGD) relies on the assumption that the data hosted on different workers are not too different. In this paper, we ask the question: Can we design a decentralized parallel stochastic gradient descent algorithm that is less sensitive to the data variance across workers? We present D$^2$, a novel decentralized parallel stochastic gradient descent algorithm designed for large data variance among workers (imprecisely, “decentralized” data). The core of D$^2$ is a variance reduction extension of D-PSGD. It improves the convergence rate from $O\left({\sigma \over \sqrt{nT}} + {(n\zeta^2)^{\frac{1}{3}} \over T^{2/3}}\right)$ to $O\left({\sigma \over \sqrt{nT}}\right)$, where $\zeta^{2}$ denotes the variance among data on different workers. As a result, D$^2$ is robust to data variance among workers. We empirically evaluate D$^2$ on image classification tasks, where each worker has access to only the data of a limited set of labels, and find that D$^2$ significantly outperforms D-PSGD. |
Tasks | Image Classification, Multi-view Subspace Clustering |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2485 |
http://proceedings.mlr.press/v80/tang18a/tang18a.pdf | |
PWC | https://paperswithcode.com/paper/d2-decentralized-training-over-decentralized |
Repo | |
Framework | |
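The baseline that D$^2$ extends, D-PSGD, alternates gossip averaging with a local gradient step. Simulating it on a toy quadratic where each worker's data pulls toward a different optimum shows the data-variance sensitivity the abstract describes: the network mean converges, but the workers keep a residual disagreement driven by the spread of the local optima. The topology, step size, and objective below are illustrative choices of ours, and this sketches the baseline only, not the D$^2$ correction itself.

```python
import numpy as np

# ring of 4 workers; W is a symmetric doubly stochastic gossip matrix
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
c = np.array([0.0, 2.0, 4.0, 6.0])   # worker i minimizes (x - c[i])^2 / 2
gamma = 0.1                           # step size

x = c.copy()                          # start each worker at its own optimum
for _ in range(500):
    # D-PSGD step: average with neighbours, then take a local gradient step
    x = W @ x - gamma * (x - c)

print(round(float(x.mean()), 3))      # → 3.0 (= c.mean(): consensus on average)
print(float(x.max() - x.min()) > 0.1) # → True: residual disagreement persists
```

The residual spread scales with the heterogeneity of the local optima (the $\zeta^2$ term in the abstract's rate), which is precisely the sensitivity D$^2$'s variance-reduction extension is designed to remove.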
An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression
Title | An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression |
Authors | Sheng Chen, Arindam Banerjee |
Abstract | Multi-response linear models aggregate a set of vanilla linear models by assuming correlated noise across them, with an unknown covariance structure. To find the coefficient vector, estimators that jointly approximate the noise covariance, generally solved by alternating-minimization-type procedures, are often preferred over simple linear regression in view of their superior empirical performance. Due to the non-convex nature of such joint estimators, the theoretical justification of their efficiency is typically challenging. The existing analyses fail to fully explain the empirical observations due to the assumption of resampling in the alternating procedures, which requires access to fresh samples in each iteration. In this work, we present a resampling-free analysis for the alternating minimization algorithm applied to multi-response regression. In particular, we focus on the high-dimensional setting of multi-response linear models with a structured coefficient parameter, where the statistical error of the parameter can be expressed in terms of the Gaussian width, a complexity measure related to the assumed structure. More importantly, to the best of our knowledge, our result reveals for the first time that alternating minimization with random initialization can achieve the same performance as a well-initialized one when solving this multi-response regression problem. Experimental results support our theoretical developments. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7896-an-improved-analysis-of-alternating-minimization-for-structured-multi-response-regression |
http://papers.nips.cc/paper/7896-an-improved-analysis-of-alternating-minimization-for-structured-multi-response-regression.pdf | |
PWC | https://paperswithcode.com/paper/an-improved-analysis-of-alternating |
Repo | |
Framework | |
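A common alternating-minimization scheme for multi-response models of the form $y_i = X_i\theta + \varepsilon_i$ with $\varepsilon_i \sim N(0, \Sigma)$ alternates a generalized-least-squares update of the coefficient vector with a residual-covariance update of $\Sigma$. The sketch below follows that generic recipe under our own assumed dimensions and noise; it is not the paper's exact estimator or analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 400, 3, 4
theta_true = np.array([1.0, -2.0, 0.5, 3.0])
L = np.array([[0.5, 0.0, 0.0],
              [0.3, 0.4, 0.0],
              [0.2, 0.1, 0.3]])
Sigma_true = L @ L.T                       # correlated noise across responses

X = rng.standard_normal((n, m, p))         # per-sample design X_i (m x p)
Y = np.einsum("imp,p->im", X, theta_true) + rng.standard_normal((n, m)) @ L.T

Sigma = np.eye(m)                          # uninformed init: identity covariance
for _ in range(5):
    # (1) GLS step for theta given the current Sigma
    P = np.linalg.inv(Sigma)
    A = np.einsum("imp,mk,ikq->pq", X, P, X)
    b = np.einsum("imp,mk,ik->p", X, P, Y)
    theta = np.linalg.solve(A, b)
    # (2) covariance step given theta: residual second moment
    R = Y - np.einsum("imp,p->im", X, theta)
    Sigma = R.T @ R / n

print(np.abs(theta - theta_true).max() < 0.2)   # → True: theta near the truth
```

Each half-step minimizes the joint Gaussian negative log-likelihood in one block of variables with the other held fixed, which is the alternating structure whose convergence the paper analyzes without the usual resampling assumption.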
Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks
Title | Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks |
Authors | Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken |
Abstract | The past few years have witnessed growth in the computational requirements for training deep convolutional neural networks. Current approaches parallelize training onto multiple devices by applying a single parallelization strategy (e.g., data or model parallelism) to all layers in a network. Although easy to reason about, these approaches result in suboptimal runtime performance in large-scale distributed training, since different layers in a network may prefer different parallelization strategies. In this paper, we propose layer-wise parallelism, which allows each layer in a network to use an individual parallelization strategy. We jointly optimize how each layer is parallelized by solving a graph search problem. Our evaluation shows that layer-wise parallelism outperforms state-of-the-art approaches by increasing training throughput, reducing communication costs, and achieving better scalability to multiple GPUs, all while maintaining the original network accuracy. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1891 |
http://proceedings.mlr.press/v80/jia18a/jia18a.pdf | |
PWC | https://paperswithcode.com/paper/exploring-hidden-dimensions-in-accelerating |
Repo | |
Framework | |
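For a simple chain of layers, the graph-search formulation reduces to a shortest-path problem: each node is a (layer, strategy) pair, and edge weights add the next layer's compute time plus the communication cost of changing strategies. The costs below are made-up numbers for illustration, and the paper's actual search handles richer parallelization dimensions than the two named here.

```python
# toy cost model: per-layer execution time under each strategy, plus a
# fixed communication cost whenever adjacent layers use different strategies
layer_cost = [
    {"data": 1.0, "model": 3.0},   # conv-like layer: prefers data parallelism
    {"data": 1.0, "model": 3.0},
    {"data": 4.0, "model": 1.0},   # dense-like layer: prefers model parallelism
]
switch_cost = 1.0

def best_assignment(layer_cost, switch_cost):
    """Pick a parallelization strategy per layer by dynamic programming
    over the chain (a Viterbi-style shortest path)."""
    strategies = list(layer_cost[0])
    best = {s: (layer_cost[0][s], [s]) for s in strategies}
    for costs in layer_cost[1:]:
        nxt = {}
        for s in strategies:
            prev = min(
                (best[t][0] + (0.0 if t == s else switch_cost), best[t][1])
                for t in strategies)
            nxt[s] = (prev[0] + costs[s], prev[1] + [s])
        best = nxt
    return min(best.values())

total, plan = best_assignment(layer_cost, switch_cost)
print(total, plan)   # → 4.0 ['data', 'data', 'model']
```

The mixed plan beats either uniform strategy (all-data costs 6.0, all-model 7.0) even after paying the switch, which is the core argument for per-layer choices.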
Dense Recurrent Neural Network with Attention Gate
Title | Dense Recurrent Neural Network with Attention Gate |
Authors | Yong-Ho Yoo, Kook Han, Sanghyun Cho, Kyoung-Chul Koh, Jong-Hwan Kim |
Abstract | We propose the dense RNN, which has full connections from each hidden state directly to multiple preceding hidden states of all layers. As the density of the connections increases, the number of paths through which the gradient flows increases; this raises the magnitude of the gradients, which helps prevent the vanishing gradient problem over time. Larger gradients, however, can also cause the exploding gradient problem. To balance this trade-off, we propose an attention gate, which controls the amount of gradient flow. We describe the relation between the attention gate and the gradient flow by approximation. Experiments on language modeling with the Penn Treebank corpus show that dense connections with the attention gate improve the model’s performance. |
Tasks | Language Modelling |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJVruWZRW |
https://openreview.net/pdf?id=rJVruWZRW | |
PWC | https://paperswithcode.com/paper/dense-recurrent-neural-network-with-attention |
Repo | |
Framework | |
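The gated dense connections can be sketched as follows: the new hidden state sums the input projection with contributions from the K most recent hidden states, each scaled by a sigmoid attention gate that can throttle gradient flow through that path. The gate form and parameter shapes here are our simplifying assumptions, not the paper's exact formulation.

```python
import numpy as np

def dense_rnn_step(x, history, Wx, Us, wg):
    """One step of a dense RNN cell: combine the input with the K most
    recent hidden states, gating each connection with a scalar sigmoid."""
    acc = Wx @ x
    for U, w, h_prev in zip(Us, wg, history):
        gate = 1.0 / (1.0 + np.exp(-(w @ h_prev)))  # scalar attention gate
        acc = acc + gate * (U @ h_prev)
    return np.tanh(acc)

rng = np.random.default_rng(1)
d_in, d_h, K = 3, 4, 2
Wx = rng.standard_normal((d_h, d_in)) * 0.5
Us = [rng.standard_normal((d_h, d_h)) * 0.5 for _ in range(K)]
wg = [rng.standard_normal(d_h) for _ in range(K)]

history = [np.zeros(d_h)] * K                    # h_{t-1}, h_{t-2}
for t in range(5):
    h = dense_rnn_step(rng.standard_normal(d_in), history, Wx, Us, wg)
    history = [h] + history[:-1]
print(h.shape)                                   # → (4,)
```

A gate near zero cuts a connection (and its gradient path) off, while a gate near one passes it through at full strength, which is how the cell trades off vanishing against exploding gradients.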
SEx BiST: A Multi-Source Trainable Parser with Deep Contextualized Lexical Representations
Title | SEx BiST: A Multi-Source Trainable Parser with Deep Contextualized Lexical Representations |
Authors | KyungTae Lim, Cheoneum Park, Changki Lee, Thierry Poibeau |
Abstract | We describe the SEx BiST parser (Semantically EXtended Bi-LSTM parser) developed at Lattice for the CoNLL 2018 Shared Task (Multilingual Parsing from Raw Text to Universal Dependencies). The main characteristic of our work is the encoding of three different modes of contextual information for parsing: (i) Treebank feature representations, (ii) Multilingual word representations, (iii) ELMo representations obtained via unsupervised learning from external resources. Our parser performed well in the official end-to-end evaluation (73.02 LAS – 4th/26 teams, and 78.72 UAS – 2nd/26); remarkably, we achieved the best UAS scores on all the English corpora by applying the three suggested feature representations. Finally, we were also ranked 1st at the optional event extraction task, part of the 2018 Extrinsic Parser Evaluation campaign. |
Tasks | Dependency Parsing |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2014/ |
https://www.aclweb.org/anthology/K18-2014 | |
PWC | https://paperswithcode.com/paper/sex-bist-a-multi-source-trainable-parser-with |
Repo | |
Framework | |
The SLT-Interactions Parsing System at the CoNLL 2018 Shared Task
Title | The SLT-Interactions Parsing System at the CoNLL 2018 Shared Task |
Authors | Riyaz A. Bhat, Irshad Bhat, Srinivas Bangalore |
Abstract | This paper describes our system (SLT-Interactions) for the CoNLL 2018 shared task: Multilingual Parsing from Raw Text to Universal Dependencies. Our system performs three main tasks: word segmentation (only for a few treebanks), POS tagging and parsing. While segmentation is learned separately, we use neural stacking for joint learning of the POS tagging and parsing tasks. For all the tasks, we employ simple neural network architectures that rely on long short-term memory (LSTM) networks for learning task-dependent features. At the basis of our parser, we use an arc-standard algorithm with the Swap action for general non-projective parsing. Additionally, we use neural stacking as a knowledge transfer mechanism for cross-domain parsing of low-resource domains. Our system shows substantial gains over the UDPipe baseline, with an average improvement of 4.18% in LAS across all languages. Overall, we are placed at the 12th position on the official test sets. |
Tasks | Dependency Parsing, Part-Of-Speech Tagging, Tokenization, Transfer Learning, Transliteration |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2015/ |
https://www.aclweb.org/anthology/K18-2015 | |
PWC | https://paperswithcode.com/paper/the-slt-interactions-parsing-system-at-the |
Repo | |
Framework | |
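The arc-standard transition system at the base of the parser can be sketched with its three core actions (SHIFT, LEFT-ARC, RIGHT-ARC) plus the SWAP action used for non-projective trees. This toy executor just replays a given transition sequence; in the actual system, the actions are predicted by the LSTM-based model.

```python
def parse(n_words, transitions):
    """Replay arc-standard transitions (with SWAP) over words 1..n_words;
    node 0 is the artificial root.  Returns the set of (head, dependent) arcs."""
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), set()
    for t in transitions:
        if t == "SHIFT":
            stack.append(buffer.pop(0))
        elif t == "LEFT-ARC":                 # s0 heads s1; s1 is removed
            s0, s1 = stack.pop(), stack.pop()
            arcs.add((s0, s1))
            stack.append(s0)
        elif t == "RIGHT-ARC":                # s1 heads s0; s0 is removed
            s0 = stack.pop()
            arcs.add((stack[-1], s0))
        elif t == "SWAP":                     # move s1 back to the buffer
            s0, s1 = stack.pop(), stack.pop()
            buffer.insert(0, s1)
            stack.append(s0)
    return arcs

# "He eats fish": eats(2) heads he(1) and fish(3); root(0) heads eats
seq = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
print(parse(3, seq))   # arcs (2, 1), (2, 3), (0, 2)
```

SWAP reorders the partially processed words, which is what lets this otherwise projective-only system produce crossing (non-projective) arcs.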
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Papers)
Title | Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Papers) |
Authors | |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1800/ |
https://www.aclweb.org/anthology/W18-1800 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-13th-conference-of-the-1 |
Repo | |
Framework | |
A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation
Title | A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation |
Authors | Benjamin Marie, Atsushi Fujita |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1811/ |
https://www.aclweb.org/anthology/W18-1811 | |
PWC | https://paperswithcode.com/paper/a-smorgasbord-of-features-to-combine-phrase |
Repo | |
Framework | |
Balancing Translation Quality and Sentiment Preservation (Non-archival Extended Abstract)
Title | Balancing Translation Quality and Sentiment Preservation (Non-archival Extended Abstract) |
Authors | Pintu Lohar, Haithem Afli, Andy Way |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1808/ |
https://www.aclweb.org/anthology/W18-1808 | |
PWC | https://paperswithcode.com/paper/balancing-translation-quality-and-sentiment |
Repo | |
Framework | |
Keynote: Machine Translation Beyond the Sentence
Title | Keynote: Machine Translation Beyond the Sentence |
Authors | Macduff Hughes |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1901/ |
https://www.aclweb.org/anthology/W18-1901 | |
PWC | https://paperswithcode.com/paper/keynote-machine-translation-beyond-the |
Repo | |
Framework | |
CUNI x-ling: Parsing Under-Resourced Languages in CoNLL 2018 UD Shared Task
Title | CUNI x-ling: Parsing Under-Resourced Languages in CoNLL 2018 UD Shared Task |
Authors | Rudolf Rosa, David Mareček |
Abstract | This is a system description paper for the CUNI x-ling submission to the CoNLL 2018 UD Shared Task. We focused on parsing under-resourced languages, with no or little training data available. We employed a wide range of approaches, including simple word-based treebank translation, combination of delexicalized parsers, and exploitation of available morphological dictionaries, with a dedicated setup tailored to each of the languages. In the official evaluation, our submission was identified as the clear winner of the Low-resource languages category. |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2019/ |
https://www.aclweb.org/anthology/K18-2019 | |
PWC | https://paperswithcode.com/paper/cuni-x-ling-parsing-under-resourced-languages |
Repo | |
Framework | |
UMDuluth-CS8761 at SemEval-2018 Task9: Hypernym Discovery using Hearst Patterns, Co-occurrence frequencies and Word Embeddings
Title | UMDuluth-CS8761 at SemEval-2018 Task9: Hypernym Discovery using Hearst Patterns, Co-occurrence frequencies and Word Embeddings |
Authors | Arshia Zernab Hassan, Manikya Swathi Vallabhajosyula, Ted Pedersen |
Abstract | Hypernym Discovery is the task of identifying potential hypernyms for a given term. A hypernym is a more generalized word that is super-ordinate to more specific words. This paper explores several approaches that rely on co-occurrence frequencies of word pairs, Hearst Patterns based on regular expressions, and word embeddings created from the UMBC corpus. Our system Babbage participated in Subtask 1A for English and placed 6th of 19 systems when identifying concept hypernyms, and 12th of 18 systems for entity hypernyms. |
Tasks | Hypernym Discovery, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1149/ |
https://www.aclweb.org/anthology/S18-1149 | |
PWC | https://paperswithcode.com/paper/umduluth-cs8761-at-semeval-2018-task9 |
Repo | |
Framework | |
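One of the regex-based Hearst patterns the system relies on, "NP such as NP, NP and NP", can be sketched as below. The real system combines several such patterns with co-occurrence frequencies and embedding similarity; restricting noun phrases to single words is our simplification.

```python
import re

# hypernym, the cue phrase, then a comma/and/or-separated hyponym list
PATTERN = re.compile(
    r"(\w+)\s+such as\s+"
    r"(\w+(?:,\s*\w+)*(?:,?\s*(?:and|or)\s+\w+)?)"
)

def hearst_such_as(sentence):
    """Extract (hyponym, hypernym) pairs with the classic Hearst pattern
    'NP such as NP, NP and NP' (single-word NPs only)."""
    pairs = []
    for m in PATTERN.finditer(sentence):
        hypernym = m.group(1)
        for hyponym in re.split(r",\s*|\s+(?:and|or)\s+", m.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs

print(hearst_such_as("He likes fruits such as apples, bananas and oranges."))
# → [('apples', 'fruits'), ('bananas', 'fruits'), ('oranges', 'fruits')]
```

Each extracted pair is a candidate (hyponym, hypernym) vote; systems like the one above then aggregate such votes with corpus-level statistics before ranking hypernym candidates.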