Paper Group NANR 104
Going From Image to Video Saliency: Augmenting Image Salience With Dynamic Attentional Push
Title | Going From Image to Video Saliency: Augmenting Image Salience With Dynamic Attentional Push |
Authors | Siavash Gorji, James J. Clark |
Abstract | We present a novel method to incorporate recent advances in static saliency models to predict saliency in videos. Our model augments static saliency models with the Attentional Push effect of the photographer and the scene actors in a shared attention setting. We demonstrate that not only is it imperative to use static Attentional Push cues, but noticeable performance improvement is also achievable by learning the time-varying nature of Attentional Push. We propose a multi-stream Convolutional Long Short-Term Memory network (ConvLSTM) structure which augments state-of-the-art static saliency models with dynamic Attentional Push. Our network contains four pathways: a saliency pathway and three Attentional Push pathways. The multi-pathway structure is followed by an augmenting convnet that learns to combine the complementary and time-varying outputs of the ConvLSTMs by minimizing the relative entropy between the augmented saliency and viewers' fixation patterns on videos. We evaluate our model by comparing the performance of several augmented static saliency models with the state of the art in spatiotemporal saliency on the three largest dynamic eye tracking datasets: HOLLYWOOD2, UCF-Sport and DIEM. Experimental results illustrate that a solid performance gain is achievable using the proposed methodology. |
Tasks | Eye Tracking |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Gorji_Going_From_Image_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Gorji_Going_From_Image_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/going-from-image-to-video-saliency-augmenting |
Repo | |
Framework | |
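
The abstract above describes training by minimizing the relative entropy between the augmented saliency and viewers' fixation patterns. As a rough illustration of that quantity only (not the authors' implementation), the sketch below computes a KL divergence between two non-negative maps given as flat lists. The direction KL(fixation || saliency), the normalization to sum to 1, the `eps` guard, and the function name `kl_divergence` are all assumptions made for this example.

```python
import math


def kl_divergence(fixation, saliency, eps=1e-12):
    """Relative entropy KL(P || Q) between two non-negative maps.

    Both maps are flat lists of equal length. Each is normalized to sum
    to 1 so it can be read as a probability distribution over pixels.
    The direction (fixation as P, saliency as Q) is an assumption.
    """
    p_total = sum(fixation)
    q_total = sum(saliency)
    kl = 0.0
    for p_raw, q_raw in zip(fixation, saliency):
        p = p_raw / p_total
        q = q_raw / q_total
        if p > 0:  # terms with p == 0 contribute nothing
            kl += p * math.log(p / (q + eps))  # eps guards against q == 0
    return kl


# Identical maps give (numerically) zero divergence.
same = kl_divergence([1, 1, 1, 1], [1, 1, 1, 1])

# All fixations on one pixel versus a uniform two-pixel saliency map.
peaked = kl_divergence([1, 0], [1, 1])
```

A lower value means the predicted saliency distribution is closer to the observed fixation distribution, which is why it can serve as a training loss.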
Prediction Improves Simultaneous Neural Machine Translation
Title | Prediction Improves Simultaneous Neural Machine Translation |
Authors | Ashkan Alinejad, Maryam Siahbani, Anoop Sarkar |
Abstract | Simultaneous speech translation aims to maintain translation quality while minimizing the delay between reading input and incrementally producing the output. We propose a new general-purpose prediction action which predicts future words in the input to improve quality and minimize delay in simultaneous translation. We train this agent using reinforcement learning with a novel reward function. Our agent with prediction has better translation quality and less delay compared to an agent-based simultaneous translation system without prediction. |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1337/ |
https://www.aclweb.org/anthology/D18-1337 | |
PWC | https://paperswithcode.com/paper/prediction-improves-simultaneous-neural |
Repo | |
Framework | |
Simultaneous Translation using Optimized Segmentation
Title | Simultaneous Translation using Optimized Segmentation |
Authors | Maryam Siahbani, Hassan Shavarani, Ashkan Alinejad, Anoop Sarkar |
Abstract | |
Tasks | Machine Translation |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-1815/ |
https://www.aclweb.org/anthology/W18-1815 | |
PWC | https://paperswithcode.com/paper/simultaneous-translation-using-optimized |
Repo | |
Framework | |
Character Level Convolutional Neural Network for Indo-Aryan Language Identification
Title | Character Level Convolutional Neural Network for Indo-Aryan Language Identification |
Authors | Mohamed Ali |
Abstract | This submission is a description paper for our system in the ILI shared task. |
Tasks | Language Identification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3932/ |
https://www.aclweb.org/anthology/W18-3932 | |
PWC | https://paperswithcode.com/paper/character-level-convolutional-neural-network |
Repo | |
Framework | |
Representing and Learning High Dimensional Data With the Optimal Transport Map From a Probabilistic Viewpoint
Title | Representing and Learning High Dimensional Data With the Optimal Transport Map From a Probabilistic Viewpoint |
Authors | Serim Park, Matthew Thorpe |
Abstract | In this paper, we propose a generative model in the space of diffeomorphic deformation maps. More precisely, we utilize the Kantorovich-Wasserstein metric and accompanying geometry to represent an image as a deformation from templates. Moreover, we incorporate a probabilistic viewpoint by assuming that each image is locally generated from a reference image. We capture the local structure by modelling the tangent planes at reference images. Once basis vectors for each tangent plane are learned via probabilistic PCA, we can sample a local coordinate that can be inverted back to image space exactly. With experiments using 4 different datasets, we show that the generative tangent plane model in the optimal transport (OT) manifold can be learned with small numbers of images and can be used to create infinitely many ‘unseen’ images. In addition, the Bayesian classification accompanied with the probabilistic modeling of the tangent planes shows improved accuracy over that done in the image space. Combining the results of our experiments supports our claim that certain datasets can be better represented with the Kantorovich-Wasserstein metric. We envision that the proposed method could be a practical solution to learning and representing data that is generated with templates in situations where only limited numbers of data points are available. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Park_Representing_and_Learning_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Park_Representing_and_Learning_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/representing-and-learning-high-dimensional |
Repo | |
Framework | |
A Morphologically Annotated Corpus of Emirati Arabic
Title | A Morphologically Annotated Corpus of Emirati Arabic |
Authors | Salam Khalifa, Nizar Habash, Fadhl Eryani, Ossama Obeid, Dana Abdulrahim, Meera Al Kaabi |
Abstract | |
Tasks | Lemmatization, Machine Translation, Morphological Analysis, Part-Of-Speech Tagging, Tokenization |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1607/ |
https://www.aclweb.org/anthology/L18-1607 | |
PWC | https://paperswithcode.com/paper/a-morphologically-annotated-corpus-of-emirati |
Repo | |
Framework | |
Learning to Detect Features in Texture Images
Title | Learning to Detect Features in Texture Images |
Authors | Linguang Zhang, Szymon Rusinkiewicz |
Abstract | Local feature detection is a fundamental task in computer vision, and hand-crafted feature detectors such as SIFT have shown success in applications including image-based localization and registration. Recent work has used features detected in texture images for precise global localization, but is limited by the performance of existing feature detectors on textures, as opposed to natural images. We propose an effective and scalable method for learning feature detectors for textures, which combines an existing “ranking” loss with an efficient fully-convolutional architecture as well as a new training-loss term that maximizes the “peakedness” of the response map. We demonstrate that our detector is more repeatable than existing methods, leading to improvements in a real-world texture-based localization application. |
Tasks | Image-Based Localization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Learning_to_Detect_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Learning_to_Detect_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-detect-features-in-texture-images |
Repo | |
Framework | |
InferLite: Simple Universal Sentence Representations from Natural Language Inference Data
Title | InferLite: Simple Universal Sentence Representations from Natural Language Inference Data |
Authors | Jamie Kiros, William Chan |
Abstract | Natural language inference has been shown to be an effective supervised task for learning generic sentence embeddings. In order to better understand the components that lead to effective representations, we propose a lightweight version of InferSent, called InferLite, that does not use any recurrent layers and operates on a collection of pre-trained word embeddings. We show that a simple instance of our model that makes no use of context, word ordering or position can still obtain competitive performance on the majority of downstream prediction tasks, with most performance gaps being filled by adding local contextual information through temporal convolutions. Our models can be trained in under 1 hour on a single GPU and allow for fast inference of new representations. Finally, we describe a semantic hashing layer that allows our model to learn generic binary codes for sentences. |
Tasks | Natural Language Inference, Sentence Embeddings, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1524/ |
https://www.aclweb.org/anthology/D18-1524 | |
PWC | https://paperswithcode.com/paper/inferlite-simple-universal-sentence |
Repo | |
Framework | |
Generating Market Comments Referring to External Resources
Title | Generating Market Comments Referring to External Resources |
Authors | Tatsuya Aoki, Akira Miyazawa, Tatsuya Ishigaki, Keiichi Goshima, Kasumi Aoki, Ichiro Kobayashi, Hiroya Takamura, Yusuke Miyao |
Abstract | Comments on a stock market often include the reason or cause of changes in stock prices, such as “Nikkei turns lower as yen's rise hits exporters.” Generating such informative sentences requires capturing the relationship between different resources, including a target stock price. In this paper, we propose a model for automatically generating such informative market comments that refer to external resources. We evaluated our model through an automatic metric in terms of BLEU and human evaluation done by an expert in finance. The results show that our model outperforms the existing model both in BLEU scores and human judgment. |
Tasks | Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6515/ |
https://www.aclweb.org/anthology/W18-6515 | |
PWC | https://paperswithcode.com/paper/generating-market-comments-referring-to |
Repo | |
Framework | |
Phrase-level Self-Attention Networks for Universal Sentence Encoding
Title | Phrase-level Self-Attention Networks for Universal Sentence Encoding |
Authors | Wei Wu, Houfeng Wang, Tianyu Liu, Shuming Ma |
Abstract | Universal sentence encoding is a hot topic in recent NLP research. The attention mechanism has been an integral part of many sentence encoding models, allowing the models to capture context dependencies regardless of the distance between the elements in the sequence. Fully attention-based models have recently attracted enormous interest due to their highly parallelizable computation and significantly shorter training time. However, the memory consumption of these models grows quadratically with the sentence length, and the syntactic information is neglected. To this end, we propose Phrase-level Self-Attention Networks (PSAN) that perform self-attention across words inside a phrase to capture context dependencies at the phrase level, and use the gated memory updating mechanism to refine each word's representation hierarchically with longer-term context dependencies captured in a larger phrase. As a result, the memory consumption can be reduced because the self-attention is performed at the phrase level instead of the sentence level. At the same time, syntactic information can be easily integrated in the model. Experimental results show that PSAN can achieve state-of-the-art performance across a plethora of NLP tasks including binary and multi-class classification, natural language inference and sentence similarity. |
Tasks | Natural Language Inference, Sentence Classification, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1408/ |
https://www.aclweb.org/anthology/D18-1408 | |
PWC | https://paperswithcode.com/paper/phrase-level-self-attention-networks-for |
Repo | |
Framework | |
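
The memory argument in the abstract above (quadratic growth with sentence length, reduced by attending within phrases) can be illustrated with simple arithmetic. The sketch below is not the PSAN model: it only counts pairwise attention scores for a single level of attention, and the sentence length, the phrase split, and the function name `attention_entries` are hypothetical values chosen for this example.

```python
def attention_entries(lengths):
    """Count pairwise attention scores when self-attention is run
    separately inside each span; a span of n words needs n * n scores."""
    return sum(n * n for n in lengths)


sentence_length = 24      # hypothetical sentence of 24 words
phrases = [6, 6, 6, 6]    # hypothetical split into four 6-word phrases

# Sentence-level self-attention: one span covering the whole sentence.
full = attention_entries([sentence_length])

# Phrase-level self-attention: one span per phrase.
phrase_level = attention_entries(phrases)
```

With these assumed numbers, sentence-level attention needs 576 scores and phrase-level attention needs 144. The hierarchical refinement over larger phrases that the abstract mentions would add further cost that this count does not model.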
Automatic Labeling of Problem-Solving Dialogues for Computational Microgenetic Learning Analytics
Title | Automatic Labeling of Problem-Solving Dialogues for Computational Microgenetic Learning Analytics |
Authors | Yuanliang Meng, Anna Rumshisky, Florence Sullivan |
Abstract | |
Tasks | Sentence Embeddings, Word Embeddings |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1639/ |
https://www.aclweb.org/anthology/L18-1639 | |
PWC | https://paperswithcode.com/paper/automatic-labeling-of-problem-solving |
Repo | |
Framework | |
Depth separation and weight-width trade-offs for sigmoidal neural networks
Title | Depth separation and weight-width trade-offs for sigmoidal neural networks |
Authors | Amit Deshpande, Navin Goyal, Sushrut Karmalkar |
Abstract | Some recent work has shown separation between the expressive power of depth-2 and depth-3 neural networks. These separation results are shown by constructing functions and input distributions, so that the function is well-approximable by a depth-3 neural network of polynomial size but it cannot be well-approximated under the chosen input distribution by any depth-2 neural network of polynomial size. These results are not robust and require carefully chosen functions as well as input distributions. We show a similar separation between the expressive power of depth-2 and depth-3 sigmoidal neural networks over a large class of input distributions, as long as the weights are polynomially bounded. While doing so, we also show that depth-2 sigmoidal neural networks with small width and small weights can be well-approximated by low-degree multivariate polynomials. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJICXeWAb |
https://openreview.net/pdf?id=SJICXeWAb | |
PWC | https://paperswithcode.com/paper/depth-separation-and-weight-width-trade-offs |
Repo | |
Framework | |
Using Classifier Features to Determine Language Transfer on Morphemes
Title | Using Classifier Features to Determine Language Transfer on Morphemes |
Authors | Alexandra Lavrentovich |
Abstract | The aim of this thesis is to perform a Native Language Identification (NLI) task where we identify an English learner's native language background based only on the learner's English writing samples. We focus on the use of English grammatical morphemes across four proficiency levels. The outcome of the computational task is connected to a position in second language acquisition research that holds all learners acquire English grammatical morphemes in the same order, regardless of native language background. We use the NLI task as a tool to uncover cross-linguistic influence on the developmental trajectory of morphemes. We perform a cross-corpus evaluation across proficiency levels to increase the reliability and validity of the linguistic features that predict the native language background. We include native English data to determine the different morpheme patterns used by native versus non-native English speakers. Furthermore, we conduct a human NLI task to determine the type and magnitude of language transfer cues used by human raters versus the classifier. |
Tasks | Language Acquisition, Language Identification, Native Language Identification, Text Classification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-4009/ |
https://www.aclweb.org/anthology/N18-4009 | |
PWC | https://paperswithcode.com/paper/using-classifier-features-to-determine |
Repo | |
Framework | |
Surface Statistics of an Unknown Language Indicate How to Parse It
Title | Surface Statistics of an Unknown Language Indicate How to Parse It |
Authors | Dingquan Wang, Jason Eisner |
Abstract | We introduce a novel framework for delexicalized dependency parsing in a new language. We show that useful features of the target language can be extracted automatically from an unparsed corpus, which consists only of gold part-of-speech (POS) sequences. Providing these features to our neural parser enables it to parse sequences like those in the corpus. Strikingly, our system has no supervision in the target language. Rather, it is a multilingual system that is trained end-to-end on a variety of other languages, so it learns a feature extractor that works well. We show experimentally across multiple languages: (1) Features computed from the unparsed corpus improve parsing accuracy. (2) Including thousands of synthetic languages in the training yields further improvement. (3) Despite being computed from unparsed corpora, our learned task-specific features beat previous work's interpretable typological features that require parsed corpora or expert categorization of the language. Our best method improved attachment scores on held-out test languages by an average of 5.6 percentage points over past work that does not inspect the unparsed data (McDonald et al., 2011), and by 20.7 points over past “grammar induction” work that does not use training languages (Naseem et al., 2010). |
Tasks | Dependency Parsing |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/Q18-1046/ |
https://www.aclweb.org/anthology/Q18-1046 | |
PWC | https://paperswithcode.com/paper/surface-statistics-of-an-unknown-language |
Repo | |
Framework | |
Sparse, Smart Contours to Represent and Edit Images
Title | Sparse, Smart Contours to Represent and Edit Images |
Authors | Tali Dekel, Chuang Gan, Dilip Krishnan, Ce Liu, William T. Freeman |
Abstract | We study the problem of reconstructing an image from information stored at contour locations. We show that high-quality reconstructions with high fidelity to the source image can be obtained from sparse input, e.g., comprising less than 6% of image pixels. This is a significant improvement over existing contour-based reconstruction methods that require much denser input to capture subtle texture information and to ensure image quality. Our model, based on generative adversarial networks, synthesizes texture and details in regions where no input information is provided. The semantic knowledge encoded into our model and the sparsity of the input allow contours to be used as an intuitive interface for semantically-aware image manipulation: local edits in the contour domain translate to long-range and coherent changes in pixel space. We can perform complex structural changes, such as changing facial expression, by simple edits of contours. Our experiments demonstrate that humans as well as a face recognition system mostly cannot distinguish between our reconstructions and the source images. |
Tasks | Face Recognition |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Dekel_Sparse_Smart_Contours_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Dekel_Sparse_Smart_Contours_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/sparse-smart-contours-to-represent-and-edit |
Repo | |
Framework | |