Paper Group NANR 239
Robust image stitching with multiple registrations
Title | Robust image stitching with multiple registrations |
Authors | Charles Herrmann, Chen Wang, Richard Strong Bowen, Emil Keyder, Michael Krainin, Ce Liu, Ramin Zabih |
Abstract | Panorama creation is one of the most widely deployed techniques in computer vision. In addition to industry applications such as Google Street View, it is also used by millions of consumers in smartphones and other cameras. Traditionally, the problem is decomposed into three phases: registration, which picks a single transformation of each source image to align it to the other inputs, seam finding, which selects a source image for each pixel in the final result, and blending, which fixes minor visual artifacts. Here, we observe that the use of a single registration often leads to errors, especially in scenes with significant depth variation or object motion. We propose instead the use of multiple registrations, permitting regions of the image at different depths to be captured with greater accuracy. MRF inference techniques naturally extend to seam finding over multiple registrations, and we show here that their energy functions can be readily modified with new terms that discourage duplication and tearing, common problems that are exacerbated by the use of multiple registrations. Our techniques are closely related to layer-based stereo, and move image stitching closer to explicit scene modeling. Experimental evidence demonstrates that our techniques often generate significantly better panoramas when there is substantial motion or parallax. |
Tasks | Image Stitching |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Charles_Herrmann_Robust_image_stitching_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Charles_Herrmann_Robust_image_stitching_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/robust-image-stitching-with-multiple |
Repo | |
Framework | |
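The seam-finding step described in the abstract can be illustrated with a toy MRF on a one-dimensional strip of pixels: each pixel picks one of several registered source candidates, a data term scores agreement with a reference image, and a pairwise term charges for colour mismatch wherever the label switches. This is a minimal sketch of the general idea, not the paper's energy (which adds further terms discouraging duplication and tearing); all names and costs below are illustrative.

```python
import numpy as np

def seam_labels(candidates, reference, smooth_weight=1.0):
    """Choose, for each pixel of a 1-D strip, which candidate registration
    to copy from, by minimizing a data + smoothness energy with dynamic
    programming (exact for a chain-structured MRF).

    candidates: (k, n) array, k registered source rows of n pixels
    reference:  (n,) row the composite should agree with
    """
    k, n = candidates.shape
    data = np.abs(candidates - reference)          # (k, n) data term
    cost = np.zeros((n, k))
    back = np.zeros((n, k), dtype=int)
    cost[0] = data[:, 0]
    for i in range(1, n):
        for l in range(k):
            # switching labels pays for the colour mismatch at the seam
            trans = [cost[i - 1, m] +
                     (0.0 if m == l
                      else smooth_weight * abs(candidates[m, i] - candidates[l, i]))
                     for m in range(k)]
            back[i, l] = int(np.argmin(trans))
            cost[i, l] = data[l, i] + trans[back[i, l]]
    labels = np.empty(n, dtype=int)
    labels[-1] = int(np.argmin(cost[-1]))
    for i in range(n - 1, 0, -1):
        labels[i - 1] = back[i, labels[i]]
    return labels.tolist()

# registration 0 is accurate on the right, registration 1 on the left;
# the optimal seam sits where the two candidates agree
a = np.array([9., 9., 1., 5., 5., 5.])
b = np.array([1., 1., 1., 9., 9., 9.])
ref = np.array([1., 1., 1., 5., 5., 5.])
print(seam_labels(np.stack([a, b]), ref))   # → [1, 1, 0, 0, 0, 0]
```

The seam lands at pixel 2, the only place both registrations agree with the reference, which is exactly the behaviour a seam-finding energy is designed to produce.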
Improving Slot Filling in Spoken Language Understanding with Joint Pointer and Attention
Title | Improving Slot Filling in Spoken Language Understanding with Joint Pointer and Attention |
Authors | Lin Zhao, Zhe Feng |
Abstract | We present a generative neural network model for slot filling based on a sequence-to-sequence (Seq2Seq) model together with a pointer network, in the situation where only sentence-level slot annotations are available in the spoken dialogue data. This model predicts slot values by jointly learning to copy a word which may be out-of-vocabulary (OOV) from an input utterance through a pointer network, or generate a word within the vocabulary through an attentional Seq2Seq model. Experimental results show the effectiveness of our slot filling model, especially at addressing the OOV problem. Additionally, we integrate the proposed model into a spoken language understanding system and achieve the state-of-the-art performance on the benchmark data. |
Tasks | Slot Filling, Speech Recognition, Spoken Language Understanding |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2068/ |
https://www.aclweb.org/anthology/P18-2068 | |
PWC | https://paperswithcode.com/paper/improving-slot-filling-in-spoken-language |
Repo | |
Framework | |
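The copy-or-generate decision the abstract describes can be sketched as a pointer-generator mixture: with probability p_gen the decoder generates from the fixed vocabulary, and with probability 1 − p_gen it copies a source token weighted by attention, which is how OOV slot values become reachable. The names, shapes, and numbers below are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def joint_pointer_distribution(p_vocab, attn, src_tokens, vocab, p_gen):
    """Mix a Seq2Seq vocabulary distribution with a pointer (copy)
    distribution over the source utterance.  OOV source words get extra
    slots appended after the fixed vocabulary, so they stay predictable."""
    oovs = sorted(set(w for w in src_tokens if w not in vocab))
    ext = list(vocab) + oovs
    p = np.zeros(len(ext))
    p[:len(vocab)] = p_gen * np.asarray(p_vocab)   # generate path
    for a, w in zip(attn, src_tokens):             # copy path via attention
        p[ext.index(w)] += (1.0 - p_gen) * a
    return ext, p

vocab = ["book", "a", "table", "for"]
src = ["book", "sushi", "for", "two"]          # "sushi", "two" are OOV
ext, p = joint_pointer_distribution(
    p_vocab=[0.25, 0.25, 0.25, 0.25],          # uniform generator
    attn=[0.1, 0.7, 0.1, 0.1],                 # attention peaks on "sushi"
    src_tokens=src, vocab=vocab, p_gen=0.5)
print(ext[int(np.argmax(p))])                  # → sushi
```

Because both input distributions sum to one, the mixture is itself a valid distribution, and the OOV word "sushi" wins despite having no vocabulary slot.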
結合卷積神經網路與遞迴神經網路於推文極性分類 (Combining Convolutional Neural Network and Recurrent Neural Network for Tweet Polarity Classification) [In Chinese]
Title | 結合卷積神經網路與遞迴神經網路於推文極性分類 (Combining Convolutional Neural Network and Recurrent Neural Network for Tweet Polarity Classification) [In Chinese] |
Authors | Chih-Ting Yeh, Chia-Ping Chen |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1023/ |
https://www.aclweb.org/anthology/O18-1023 | |
PWC | https://paperswithcode.com/paper/caacccc2e-eee-ccc2e-14-ae-combining |
Repo | |
Framework | |
$D^2$: Decentralized Training over Decentralized Data
Title | $D^2$: Decentralized Training over Decentralized Data |
Authors | Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu |
Abstract | While training a machine learning model using multiple workers, each of which collects data from its own data source, it would be useful if the data collected from different workers were unique and different. Ironically, recent analysis of decentralized parallel stochastic gradient descent (D-PSGD) relies on the assumption that the data hosted on different workers are not too different. In this paper, we ask the question: Can we design a decentralized parallel stochastic gradient descent algorithm that is less sensitive to the data variance across workers? We present D$^2$, a novel decentralized parallel stochastic gradient descent algorithm designed for large data variance among workers (imprecisely, “decentralized” data). The core of D$^2$ is a variance reduction extension of D-PSGD. It improves the convergence rate from $O\left({\sigma \over \sqrt{nT}} + {(n\zeta^2)^{\frac{1}{3}} \over T^{2/3}}\right)$ to $O\left({\sigma \over \sqrt{nT}}\right)$, where $\zeta^{2}$ denotes the variance among data on different workers. As a result, D$^2$ is robust to data variance among workers. We empirically evaluate D$^2$ on image classification tasks, where each worker has access to only the data of a limited set of labels, and find that D$^2$ significantly outperforms D-PSGD. |
Tasks | Image Classification, Multi-view Subspace Clustering |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2485 |
http://proceedings.mlr.press/v80/tang18a/tang18a.pdf | |
PWC | https://paperswithcode.com/paper/d2-decentralized-training-over-decentralized |
Repo | |
Framework | |
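The baseline that D$^2$ extends, D-PSGD, alternates gossip averaging with a local gradient step. Simulating it on a toy quadratic where each worker's data pulls toward a different optimum shows the data-variance sensitivity the abstract describes: the network mean converges, but the workers keep a residual disagreement driven by the spread of the local optima. The topology, step size, and objective below are illustrative choices of ours, and this sketches the baseline only, not the D$^2$ correction itself.

```python
import numpy as np

# ring of 4 workers; W is a symmetric doubly stochastic gossip matrix
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
c = np.array([0.0, 2.0, 4.0, 6.0])   # worker i minimizes (x - c[i])^2 / 2
gamma = 0.1                           # step size

x = c.copy()                          # start each worker at its own optimum
for _ in range(500):
    # D-PSGD step: average with neighbours, then take a local gradient step
    x = W @ x - gamma * (x - c)

print(round(float(x.mean()), 3))      # → 3.0 (= c.mean(): consensus on average)
print(float(x.max() - x.min()) > 0.1) # → True: residual disagreement persists
```

The residual spread scales with the heterogeneity of the local optima (the $\zeta^2$ term in the abstract's rate), which is precisely the sensitivity D$^2$'s variance-reduction extension is designed to remove.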
An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression
Title | An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression |
Authors | Sheng Chen, Arindam Banerjee |
Abstract | Multi-response linear models aggregate a set of vanilla linear models by assuming correlated noise across them, with an unknown covariance structure. To find the coefficient vector, estimators that jointly approximate the noise covariance, generally solved by alternating-minimization-type procedures, are often preferred over simple linear regression in view of their superior empirical performance. Due to the non-convex nature of such joint estimators, the theoretical justification of their efficiency is typically challenging. The existing analyses fail to fully explain the empirical observations due to the assumption of resampling in the alternating procedures, which requires access to fresh samples in each iteration. In this work, we present a resampling-free analysis for the alternating minimization algorithm applied to multi-response regression. In particular, we focus on the high-dimensional setting of multi-response linear models with a structured coefficient parameter, where the statistical error of the parameter can be expressed in terms of the Gaussian width, a complexity measure related to the assumed structure. More importantly, to the best of our knowledge, our result reveals for the first time that alternating minimization with random initialization can achieve the same performance as a well-initialized one when solving this multi-response regression problem. Experimental results support our theoretical developments. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7896-an-improved-analysis-of-alternating-minimization-for-structured-multi-response-regression |
http://papers.nips.cc/paper/7896-an-improved-analysis-of-alternating-minimization-for-structured-multi-response-regression.pdf | |
PWC | https://paperswithcode.com/paper/an-improved-analysis-of-alternating |
Repo | |
Framework | |
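A common alternating-minimization scheme for multi-response models of the form $y_i = X_i\theta + \varepsilon_i$ with $\varepsilon_i \sim N(0, \Sigma)$ alternates a generalized-least-squares update of the coefficient vector with a residual-covariance update of $\Sigma$. The sketch below follows that generic recipe under our own assumed dimensions and noise; it is not the paper's exact estimator or analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 400, 3, 4
theta_true = np.array([1.0, -2.0, 0.5, 3.0])
L = np.array([[0.5, 0.0, 0.0],
              [0.3, 0.4, 0.0],
              [0.2, 0.1, 0.3]])
Sigma_true = L @ L.T                       # correlated noise across responses

X = rng.standard_normal((n, m, p))         # per-sample design X_i (m x p)
Y = np.einsum("imp,p->im", X, theta_true) + rng.standard_normal((n, m)) @ L.T

Sigma = np.eye(m)                          # uninformed init: identity covariance
for _ in range(5):
    # (1) GLS step for theta given the current Sigma
    P = np.linalg.inv(Sigma)
    A = np.einsum("imp,mk,ikq->pq", X, P, X)
    b = np.einsum("imp,mk,ik->p", X, P, Y)
    theta = np.linalg.solve(A, b)
    # (2) covariance step given theta: residual second moment
    R = Y - np.einsum("imp,p->im", X, theta)
    Sigma = R.T @ R / n

print(np.abs(theta - theta_true).max() < 0.2)   # → True: theta near the truth
```

Each half-step minimizes the joint Gaussian negative log-likelihood in one block of variables with the other held fixed, which is the alternating structure whose convergence the paper analyzes without the usual resampling assumption.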
Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks
Title | Exploring Hidden Dimensions in Accelerating Convolutional Neural Networks |
Authors | Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken |
Abstract | The past few years have witnessed growth in the computational requirements for training deep convolutional neural networks. Current approaches parallelize training onto multiple devices by applying a single parallelization strategy (e.g., data or model parallelism) to all layers in a network. Although easy to reason about, these approaches result in suboptimal runtime performance in large-scale distributed training, since different layers in a network may prefer different parallelization strategies. In this paper, we propose layer-wise parallelism, which allows each layer in a network to use an individual parallelization strategy. We jointly optimize how each layer is parallelized by solving a graph search problem. Our evaluation shows that layer-wise parallelism outperforms state-of-the-art approaches by increasing training throughput, reducing communication costs, and achieving better scalability to multiple GPUs, all while maintaining the original network accuracy. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1891 |
http://proceedings.mlr.press/v80/jia18a/jia18a.pdf | |
PWC | https://paperswithcode.com/paper/exploring-hidden-dimensions-in-accelerating |
Repo | |
Framework | |
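For a simple chain of layers, the graph-search formulation reduces to a shortest-path problem: each node is a (layer, strategy) pair, and edge weights add the next layer's compute time plus the communication cost of changing strategies. The costs below are made-up numbers for illustration, and the paper's actual search handles richer parallelization dimensions than the two named here.

```python
# toy cost model: per-layer execution time under each strategy, plus a
# fixed communication cost whenever adjacent layers use different strategies
layer_cost = [
    {"data": 1.0, "model": 3.0},   # conv-like layer: prefers data parallelism
    {"data": 1.0, "model": 3.0},
    {"data": 4.0, "model": 1.0},   # dense-like layer: prefers model parallelism
]
switch_cost = 1.0

def best_assignment(layer_cost, switch_cost):
    """Pick a parallelization strategy per layer by dynamic programming
    over the chain (a Viterbi-style shortest path)."""
    strategies = list(layer_cost[0])
    best = {s: (layer_cost[0][s], [s]) for s in strategies}
    for costs in layer_cost[1:]:
        nxt = {}
        for s in strategies:
            prev = min(
                (best[t][0] + (0.0 if t == s else switch_cost), best[t][1])
                for t in strategies)
            nxt[s] = (prev[0] + costs[s], prev[1] + [s])
        best = nxt
    return min(best.values())

total, plan = best_assignment(layer_cost, switch_cost)
print(total, plan)   # → 4.0 ['data', 'data', 'model']
```

The mixed plan beats either uniform strategy (all-data costs 6.0, all-model 7.0) even after paying the switch, which is the core argument for per-layer choices.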
Dense Recurrent Neural Network with Attention Gate
Title | Dense Recurrent Neural Network with Attention Gate |
Authors | Yong-Ho Yoo, Kook Han, Sanghyun Cho, Kyoung-Chul Koh, Jong-Hwan Kim |
Abstract | We propose the dense RNN, which has full connections from each hidden state directly to multiple preceding hidden states of all layers. As the density of the connections increases, the number of paths through which the gradient flows increases; this raises the magnitude of the gradients, which helps prevent the vanishing gradient problem over time. Larger gradients, however, can also cause the exploding gradient problem. To balance this trade-off, we propose an attention gate, which controls the amount of gradient flow. We describe the relation between the attention gate and the gradient flow by approximation. Experiments on language modeling with the Penn Treebank corpus show that dense connections with the attention gate improve the model’s performance. |
Tasks | Language Modelling |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJVruWZRW |
https://openreview.net/pdf?id=rJVruWZRW | |
PWC | https://paperswithcode.com/paper/dense-recurrent-neural-network-with-attention |
Repo | |
Framework | |
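The gated dense connections can be sketched as follows: the new hidden state sums the input projection with contributions from the K most recent hidden states, each scaled by a sigmoid attention gate that can throttle gradient flow through that path. The gate form and parameter shapes here are our simplifying assumptions, not the paper's exact formulation.

```python
import numpy as np

def dense_rnn_step(x, history, Wx, Us, wg):
    """One step of a dense RNN cell: combine the input with the K most
    recent hidden states, gating each connection with a scalar sigmoid."""
    acc = Wx @ x
    for U, w, h_prev in zip(Us, wg, history):
        gate = 1.0 / (1.0 + np.exp(-(w @ h_prev)))  # scalar attention gate
        acc = acc + gate * (U @ h_prev)
    return np.tanh(acc)

rng = np.random.default_rng(1)
d_in, d_h, K = 3, 4, 2
Wx = rng.standard_normal((d_h, d_in)) * 0.5
Us = [rng.standard_normal((d_h, d_h)) * 0.5 for _ in range(K)]
wg = [rng.standard_normal(d_h) for _ in range(K)]

history = [np.zeros(d_h)] * K                    # h_{t-1}, h_{t-2}
for t in range(5):
    h = dense_rnn_step(rng.standard_normal(d_in), history, Wx, Us, wg)
    history = [h] + history[:-1]
print(h.shape)                                   # → (4,)
```

A gate near zero cuts a connection (and its gradient path) off, while a gate near one passes it through at full strength, which is how the cell trades off vanishing against exploding gradients.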
SEx BiST: A Multi-Source Trainable Parser with Deep Contextualized Lexical Representations
Title | SEx BiST: A Multi-Source Trainable Parser with Deep Contextualized Lexical Representations |
Authors | KyungTae Lim, Cheoneum Park, Changki Lee, Thierry Poibeau |
Abstract | We describe the SEx BiST parser (Semantically EXtended Bi-LSTM parser) developed at Lattice for the CoNLL 2018 Shared Task (Multilingual Parsing from Raw Text to Universal Dependencies). The main characteristic of our work is the encoding of three different modes of contextual information for parsing: (i) Treebank feature representations, (ii) Multilingual word representations, (iii) ELMo representations obtained via unsupervised learning from external resources. Our parser performed well in the official end-to-end evaluation (73.02 LAS – 4th/26 teams, and 78.72 UAS – 2nd/26); remarkably, we achieved the best UAS scores on all the English corpora by applying the three suggested feature representations. Finally, we were also ranked 1st at the optional event extraction task, part of the 2018 Extrinsic Parser Evaluation campaign. |
Tasks | Dependency Parsing |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2014/ |
https://www.aclweb.org/anthology/K18-2014 | |
PWC | https://paperswithcode.com/paper/sex-bist-a-multi-source-trainable-parser-with |
Repo | |
Framework | |
The SLT-Interactions Parsing System at the CoNLL 2018 Shared Task
Title | The SLT-Interactions Parsing System at the CoNLL 2018 Shared Task |
Authors | Riyaz A. Bhat, Irshad Bhat, Srinivas Bangalore |
Abstract | This paper describes our system (SLT-Interactions) for the CoNLL 2018 shared task: Multilingual Parsing from Raw Text to Universal Dependencies. Our system performs three main tasks: word segmentation (only for a few treebanks), POS tagging and parsing. While segmentation is learned separately, we use neural stacking for joint learning of the POS tagging and parsing tasks. For all the tasks, we employ simple neural network architectures that rely on long short-term memory (LSTM) networks for learning task-dependent features. At the basis of our parser, we use an arc-standard algorithm with the Swap action for general non-projective parsing. Additionally, we use neural stacking as a knowledge transfer mechanism for cross-domain parsing of low-resource domains. Our system shows substantial gains over the UDPipe baseline, with an average improvement of 4.18% in LAS across all languages. Overall, we are placed at the 12th position on the official test sets. |
Tasks | Dependency Parsing, Part-Of-Speech Tagging, Tokenization, Transfer Learning, Transliteration |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2015/ |
https://www.aclweb.org/anthology/K18-2015 | |
PWC | https://paperswithcode.com/paper/the-slt-interactions-parsing-system-at-the |
Repo | |
Framework | |
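The arc-standard transition system at the base of the parser can be sketched with its three core actions (SHIFT, LEFT-ARC, RIGHT-ARC) plus the SWAP action used for non-projective trees. This toy executor just replays a given transition sequence; in the actual system, the actions are predicted by the LSTM-based model.

```python
def parse(n_words, transitions):
    """Replay arc-standard transitions (with SWAP) over words 1..n_words;
    node 0 is the artificial root.  Returns the set of (head, dependent) arcs."""
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), set()
    for t in transitions:
        if t == "SHIFT":
            stack.append(buffer.pop(0))
        elif t == "LEFT-ARC":                 # s0 heads s1; s1 is removed
            s0, s1 = stack.pop(), stack.pop()
            arcs.add((s0, s1))
            stack.append(s0)
        elif t == "RIGHT-ARC":                # s1 heads s0; s0 is removed
            s0 = stack.pop()
            arcs.add((stack[-1], s0))
        elif t == "SWAP":                     # move s1 back to the buffer
            s0, s1 = stack.pop(), stack.pop()
            buffer.insert(0, s1)
            stack.append(s0)
    return arcs

# "He eats fish": eats(2) heads he(1) and fish(3); root(0) heads eats
seq = ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC", "RIGHT-ARC"]
print(parse(3, seq))   # arcs (2, 1), (2, 3), (0, 2)
```

SWAP reorders the partially processed words, which is what lets this otherwise projective-only system produce crossing (non-projective) arcs.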
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Papers)
Title | Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Papers) |
Authors | |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1800/ |
https://www.aclweb.org/anthology/W18-1800 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-13th-conference-of-the-1 |
Repo | |
Framework | |
A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation
Title | A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation |
Authors | Benjamin Marie, Atsushi Fujita |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1811/ |
https://www.aclweb.org/anthology/W18-1811 | |
PWC | https://paperswithcode.com/paper/a-smorgasbord-of-features-to-combine-phrase |
Repo | |
Framework | |
Balancing Translation Quality and Sentiment Preservation (Non-archival Extended Abstract)
Title | Balancing Translation Quality and Sentiment Preservation (Non-archival Extended Abstract) |
Authors | Pintu Lohar, Haithem Afli, Andy Way |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1808/ |
https://www.aclweb.org/anthology/W18-1808 | |
PWC | https://paperswithcode.com/paper/balancing-translation-quality-and-sentiment |
Repo | |
Framework | |
Keynote: Machine Translation Beyond the Sentence
Title | Keynote: Machine Translation Beyond the Sentence |
Authors | Macduff Hughes |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1901/ |
https://www.aclweb.org/anthology/W18-1901 | |
PWC | https://paperswithcode.com/paper/keynote-machine-translation-beyond-the |
Repo | |
Framework | |
CUNI x-ling: Parsing Under-Resourced Languages in CoNLL 2018 UD Shared Task
Title | CUNI x-ling: Parsing Under-Resourced Languages in CoNLL 2018 UD Shared Task |
Authors | Rudolf Rosa, David Mareček |
Abstract | This is a system description paper for the CUNI x-ling submission to the CoNLL 2018 UD Shared Task. We focused on parsing under-resourced languages, with no or little training data available. We employed a wide range of approaches, including simple word-based treebank translation, combination of delexicalized parsers, and exploitation of available morphological dictionaries, with a dedicated setup tailored to each of the languages. In the official evaluation, our submission was identified as the clear winner of the Low-resource languages category. |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2019/ |
https://www.aclweb.org/anthology/K18-2019 | |
PWC | https://paperswithcode.com/paper/cuni-x-ling-parsing-under-resourced-languages |
Repo | |
Framework | |
UMDuluth-CS8761 at SemEval-2018 Task9: Hypernym Discovery using Hearst Patterns, Co-occurrence frequencies and Word Embeddings
Title | UMDuluth-CS8761 at SemEval-2018 Task9: Hypernym Discovery using Hearst Patterns, Co-occurrence frequencies and Word Embeddings |
Authors | Arshia Zernab Hassan, Manikya Swathi Vallabhajosyula, Ted Pedersen |
Abstract | Hypernym Discovery is the task of identifying potential hypernyms for a given term. A hypernym is a more generalized word that is super-ordinate to more specific words. This paper explores several approaches that rely on co-occurrence frequencies of word pairs, Hearst Patterns based on regular expressions, and word embeddings created from the UMBC corpus. Our system Babbage participated in Subtask 1A for English and placed 6th of 19 systems when identifying concept hypernyms, and 12th of 18 systems for entity hypernyms. |
Tasks | Hypernym Discovery, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1149/ |
https://www.aclweb.org/anthology/S18-1149 | |
PWC | https://paperswithcode.com/paper/umduluth-cs8761-at-semeval-2018-task9 |
Repo | |
Framework | |
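One of the regex-based Hearst patterns the system relies on, "NP such as NP, NP and NP", can be sketched as below. The real system combines several such patterns with co-occurrence frequencies and embedding similarity; restricting noun phrases to single words is our simplification.

```python
import re

# hypernym, the cue phrase, then a comma/and/or-separated hyponym list
PATTERN = re.compile(
    r"(\w+)\s+such as\s+"
    r"(\w+(?:,\s*\w+)*(?:,?\s*(?:and|or)\s+\w+)?)"
)

def hearst_such_as(sentence):
    """Extract (hyponym, hypernym) pairs with the classic Hearst pattern
    'NP such as NP, NP and NP' (single-word NPs only)."""
    pairs = []
    for m in PATTERN.finditer(sentence):
        hypernym = m.group(1)
        for hyponym in re.split(r",\s*|\s+(?:and|or)\s+", m.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs

print(hearst_such_as("He likes fruits such as apples, bananas and oranges."))
# → [('apples', 'fruits'), ('bananas', 'fruits'), ('oranges', 'fruits')]
```

Each extracted pair is a candidate (hyponym, hypernym) vote; systems like the one above then aggregate such votes with corpus-level statistics before ranking hypernym candidates.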