Paper Group NANR 77
What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets. Unsupervised Bilingual Lexicon Induction via Latent Variable Models. Complex Word Identification: Convolutional Neural Network vs. Feature Engineering. CNNs for NLP in the Browser: Client-Side Deployment and Visualization Opportunities. Noise-Base …
What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets
Title | What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets |
Authors | De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Lorenzo Torresani, Manohar Paluri, Li Fei-Fei, Juan Carlos Niebles |
Abstract | The ability to capture temporal information has been critical to the development of video understanding models. While there have been numerous attempts at modeling motion in videos, an explicit analysis of the effect of temporal information for video understanding is still missing. In this work, we aim to bridge this gap and ask the following question: How important is the motion in the video for recognizing the action? To this end, we propose two novel frameworks: (i) class-agnostic temporal generator and (ii) motion-invariant frame selector to reduce/remove motion for an ablation analysis without introducing other artifacts. This isolates the analysis of motion from other aspects of the video. The proposed frameworks provide a much tighter estimate of the effect of motion (from 25% to 6% on UCF101 and 15% to 5% on Kinetics) compared to baselines in our analysis. Our analysis provides critical insights about existing models like C3D, and how it could be made to achieve comparable results with a sparser set of frames. |
Tasks | Video Understanding |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Huang_What_Makes_a_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Huang_What_Makes_a_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/what-makes-a-video-a-video-analyzing-temporal |
Repo | |
Framework | |
Unsupervised Bilingual Lexicon Induction via Latent Variable Models
Title | Unsupervised Bilingual Lexicon Induction via Latent Variable Models |
Authors | Zi-Yi Dou, Zhi-Hao Zhou, Shujian Huang |
Abstract | Bilingual lexicon extraction has been studied for decades and most previous methods have relied on parallel corpora or bilingual dictionaries. Recent studies have shown that it is possible to build a bilingual dictionary by aligning monolingual word embedding spaces in an unsupervised way. With the recent advances in generative models, we propose a novel approach which builds cross-lingual dictionaries via latent variable models and adversarial training with no parallel corpora. To demonstrate the effectiveness of our approach, we evaluate our approach on several language pairs and the experimental results show that our model could achieve competitive and even superior performance compared with several state-of-the-art models. |
Tasks | Latent Variable Models, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1062/ |
https://www.aclweb.org/anthology/D18-1062 | |
PWC | https://paperswithcode.com/paper/unsupervised-bilingual-lexicon-induction-via |
Repo | |
Framework | |
Complex Word Identification: Convolutional Neural Network vs. Feature Engineering
Title | Complex Word Identification: Convolutional Neural Network vs. Feature Engineering |
Authors | Segun Taofeek Aroyehun, Jason Angel, Daniel Alejandro Pérez Alvarez, Alexander Gelbukh |
Abstract | We describe the systems of NLP-CIC team that participated in the Complex Word Identification (CWI) 2018 shared task. The shared task aimed to benchmark approaches for identifying complex words in English and other languages from the perspective of non-native speakers. Our goal is to compare two approaches: feature engineering and a deep neural network. Both approaches achieved comparable performance on the English test set. We demonstrated the flexibility of the deep-learning approach by using the same deep neural network setup in the Spanish track. Our systems achieved competitive results: all our systems were within 0.01 of the system with the best macro-F1 score on the test sets except on Wikipedia test set, on which our best system is 0.04 below the best macro-F1 score. |
Tasks | Complex Word Identification, Feature Engineering, Text Simplification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0538/ |
https://www.aclweb.org/anthology/W18-0538 | |
PWC | https://paperswithcode.com/paper/complex-word-identification-convolutional |
Repo | |
Framework | |
CNNs for NLP in the Browser: Client-Side Deployment and Visualization Opportunities
Title | CNNs for NLP in the Browser: Client-Side Deployment and Visualization Opportunities |
Authors | Yiyun Liang, Zhucheng Tu, Laetitia Huang, Jimmy Lin |
Abstract | We demonstrate a JavaScript implementation of a convolutional neural network that performs feedforward inference completely in the browser. Such a deployment means that models can run completely on the client, on a wide range of devices, without making backend server requests. This design is useful for applications with stringent latency requirements or low connectivity. Our evaluations show the feasibility of JavaScript as a deployment target. Furthermore, an in-browser implementation enables seamless integration with the JavaScript ecosystem for information visualization, providing opportunities to visually inspect neural networks and better understand their inner workings. |
Tasks | Interpretable Machine Learning, Sentence Classification, Sentiment Analysis |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-5013/ |
https://www.aclweb.org/anthology/N18-5013 | |
PWC | https://paperswithcode.com/paper/cnns-for-nlp-in-the-browser-client-side |
Repo | |
Framework | |
Noise-Based Regularizers for Recurrent Neural Networks
Title | Noise-Based Regularizers for Recurrent Neural Networks |
Authors | Adji B. Dieng, Jaan Altosaar, Rajesh Ranganath, David M. Blei |
Abstract | Recurrent neural networks (RNNs) are powerful models for sequential data. They can approximate arbitrary computations, and have been used successfully in domains such as text and speech. However, the flexibility of RNNs makes them susceptible to overfitting and regularization is important. We develop a noise-based regularization method for RNNs. The idea is simple and easy to implement: we inject noise in the hidden units of the RNN and then maximize the original RNN’s likelihood averaged over the injected noise. On a language modeling benchmark, our method achieves better performance than the deterministic RNN and the variational dropout. |
Tasks | Language Modelling |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=ryk77mbRZ |
https://openreview.net/pdf?id=ryk77mbRZ | |
PWC | https://paperswithcode.com/paper/noise-based-regularizers-for-recurrent-neural |
Repo | |
Framework | |
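The noise-injection idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the additive Gaussian noise model, the plain tanh RNN, the Monte Carlo averaging, and every name (`rnn_forward`, `noisy_nll`, `noise_std`, `n_samples`) are assumptions made for illustration.

```python
import numpy as np

def rnn_forward(x, W_xh, W_hh, b_h, noise_std=0.0, rng=None):
    """Run a tanh RNN over x of shape (T, d_in); optionally perturb each hidden state."""
    rng = rng if rng is not None else np.random.default_rng()
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in x:
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        if noise_std > 0.0:
            # Assumed noise model: additive zero-mean Gaussian on the hidden units.
            h = h + rng.normal(0.0, noise_std, size=h.shape)
        states.append(h)
    return np.stack(states)  # shape (T, d_h)

def noisy_nll(x, targets, params, noise_std, n_samples, rng):
    """Monte Carlo estimate of the negative log-likelihood averaged over injected noise."""
    W_xh, W_hh, b_h, W_hy, b_y = params
    total = 0.0
    for _ in range(n_samples):
        hs = rnn_forward(x, W_xh, W_hh, b_h, noise_std, rng)
        logits = hs @ W_hy.T + b_y
        # Numerically stable log-softmax over the output vocabulary.
        logits = logits - logits.max(axis=1, keepdims=True)
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        total += -log_probs[np.arange(len(targets)), targets].mean()
    return total / n_samples
```

Minimizing `noisy_nll` with respect to the parameters corresponds, under these assumptions, to maximizing the likelihood averaged over the noise; setting `noise_std=0.0` recovers the deterministic RNN objective.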
PubSE: A Hierarchical Model for Publication Extraction from Academic Homepages
Title | PubSE: A Hierarchical Model for Publication Extraction from Academic Homepages |
Authors | Yiqing Zhang, Jianzhong Qi, Rui Zhang, Chuandong Yin |
Abstract | Publication information in a researcher's academic homepage provides insights about the researcher's expertise, research interests, and collaboration networks. We aim to extract all the publication strings from a given academic homepage. This is a challenging task because the publication strings in different academic homepages may be located at different positions with different structures. To capture the positional and structural diversity, we propose an end-to-end hierarchical model named PubSE based on Bi-LSTM-CRF. We further propose an alternating training method for training the model. Experiments on real data show that PubSE outperforms the state-of-the-art models by up to 11.8% in F1-score. |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1123/ |
https://www.aclweb.org/anthology/D18-1123 | |
PWC | https://paperswithcode.com/paper/pubse-a-hierarchical-model-for-publication |
Repo | |
Framework | |
Mumpitz at PARSEME Shared Task 2018: A Bidirectional LSTM for the Identification of Verbal Multiword Expressions
Title | Mumpitz at PARSEME Shared Task 2018: A Bidirectional LSTM for the Identification of Verbal Multiword Expressions |
Authors | Rafael Ehren, Timm Lichte, Younes Samih |
Abstract | In this paper, we describe Mumpitz, the system we submitted to the PARSEME Shared task on automatic identification of verbal multiword expressions (VMWEs). Mumpitz consists of a Bidirectional Recurrent Neural Network (BRNN) with Long Short-Term Memory (LSTM) units and a heuristic that leverages the dependency information provided in the PARSEME corpus data to differentiate VMWEs in a sentence. We submitted results for seven languages in the closed track of the task and for one language in the open track. For the open track we used the same system, but with pretrained instead of randomly initialized word embeddings to improve the system performance. |
Tasks | Machine Translation, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4929/ |
https://www.aclweb.org/anthology/W18-4929 | |
PWC | https://paperswithcode.com/paper/mumpitz-at-parseme-shared-task-2018-a |
Repo | |
Framework | |
Recognizing Human Actions as the Evolution of Pose Estimation Maps
Title | Recognizing Human Actions as the Evolution of Pose Estimation Maps |
Authors | Mengyuan Liu, Junsong Yuan |
Abstract | Most video-based action recognition approaches choose to extract features from the whole video to recognize actions. The cluttered background and non-action motions limit the performances of these methods, since they lack the explicit modeling of human body movements. With recent advances of human pose estimation, this work presents a novel method to recognize human action as the evolution of pose estimation maps. Instead of relying on the inaccurate human poses estimated from videos, we observe that pose estimation maps, the byproduct of pose estimation, preserve richer cues of human body to benefit action recognition. Specifically, the evolution of pose estimation maps can be decomposed as an evolution of heatmaps, e.g., probabilistic maps, and an evolution of estimated 2D human poses, which denote the changes of body shape and body pose, respectively. Considering the sparse property of heatmap, we develop spatial rank pooling to aggregate the evolution of heatmaps as a body shape evolution image. As body shape evolution image does not differentiate body parts, we design body guided sampling to aggregate the evolution of poses as a body pose evolution image. The complementary properties between both types of images are explored by deep convolutional neural networks to predict action label. Experiments on NTU RGB+D, UTD-MHAD and PennAction datasets verify the effectiveness of our method, which outperforms most state-of-the-art methods. |
Tasks | Action Recognition In Videos, Multimodal Activity Recognition, Pose Estimation, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Liu_Recognizing_Human_Actions_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Liu_Recognizing_Human_Actions_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/recognizing-human-actions-as-the-evolution-of |
Repo | |
Framework | |
Learning clip representations for skeleton-based 3d action recognition
Title | Learning clip representations for skeleton-based 3d action recognition |
Authors | Qiuhong Ke, Mohammed Bennamoun, Senjian An, Ferdous Sohel, Farid Boussaid |
Abstract | This paper presents a new representation of skeleton sequences for 3D action recognition. Existing methods based on hand-crafted features or recurrent neural networks cannot adequately capture the complex spatial structures and the long-term temporal dynamics of the skeleton sequences, which are very important to recognize the actions. In this paper, we propose to transform each channel of the 3D coordinates of a skeleton sequence into a clip. Each frame of the generated clip represents the temporal information of the entire skeleton sequence and one particular spatial relationship between the skeleton joints. The entire clip incorporates multiple frames with different spatial relationships, which provide useful spatial structural information of the human skeleton. We also propose a multitask convolutional neural network (MTCNN) to learn the generated clips for action recognition. The proposed MTCNN processes all the frames of the generated clips in parallel to explore the spatial and temporal information of the skeleton sequences. The proposed method has been extensively tested on six challenging benchmark datasets. Experimental results consistently demonstrate the superiority of the proposed clip representation and the feature learning method for 3D action recognition compared to the existing techniques. |
Tasks | 3D Human Action Recognition, Skeleton Based Action Recognition |
Published | 2018-03-05 |
URL | https://doi.org/10.1109/TIP.2018.2812099 |
https://www.semanticscholar.org/paper/Learning-Clip-Representations-for-Skeleton-Based-3D-Ke-Bennamoun/ef761435c1af2b3e5caba5e8bbbf5aeab69d934e | |
PWC | https://paperswithcode.com/paper/learning-clip-representations-for-skeleton |
Repo | |
Framework | |
A Unified Syntax-aware Framework for Semantic Role Labeling
Title | A Unified Syntax-aware Framework for Semantic Role Labeling |
Authors | Zuchao Li, Shexia He, Jiaxun Cai, Zhuosheng Zhang, Hai Zhao, Gongshen Liu, Linlin Li, Luo Si |
Abstract | Semantic role labeling (SRL) aims to recognize the predicate-argument structure of a sentence. Syntactic information has been paid a great attention over the role of enhancing SRL. However, the latest advance shows that syntax would not be so important for SRL with the emerging much smaller gap between syntax-aware and syntax-agnostic SRL. To comprehensively explore the role of syntax for SRL task, we extend existing models and propose a unified framework to investigate more effective and more diverse ways of incorporating syntax into sequential neural networks. Exploring the effect of syntactic input quality on SRL performance, we confirm that high-quality syntactic parse could still effectively enhance syntactically-driven SRL. Using empirically optimized integration strategy, we even enlarge the gap between syntax-aware and syntax-agnostic SRL. Our framework achieves state-of-the-art results on CoNLL-2009 benchmarks both for English and Chinese, substantially outperforming all previous models. |
Tasks | Machine Translation, Question Answering, Semantic Role Labeling |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1262/ |
https://www.aclweb.org/anthology/D18-1262 | |
PWC | https://paperswithcode.com/paper/a-unified-syntax-aware-framework-for-semantic |
Repo | |
Framework | |
Unbabel: How to combine AI with the crowd to scale professional-quality translation
Title | Unbabel: How to combine AI with the crowd to scale professional-quality translation |
Authors | João Graça |
Abstract | |
Tasks | Automatic Post-Editing |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-2103/ |
https://www.aclweb.org/anthology/W18-2103 | |
PWC | https://paperswithcode.com/paper/unbabel-how-to-combine-ai-with-the-crowd-to |
Repo | |
Framework | |
Investigating the Challenges of Temporal Relation Extraction from Clinical Text
Title | Investigating the Challenges of Temporal Relation Extraction from Clinical Text |
Authors | Diana Galvan, Naoaki Okazaki, Koji Matsuda, Kentaro Inui |
Abstract | Temporal reasoning remains as an unsolved task for Natural Language Processing (NLP), particularly demonstrated in the clinical domain. The complexity of temporal representation in language is evident as results of the 2016 Clinical TempEval challenge indicate: the current state-of-the-art systems perform well in solving mention-identification tasks of event and time expressions but poorly in temporal relation extraction, showing a gap of around 0.25 point below human performance. We explore to adapt the tree-based LSTM-RNN model proposed by Miwa and Bansal (2016) to temporal relation extraction from clinical text, obtaining a five point improvement over the best 2016 Clinical TempEval system and two points over the state-of-the-art. We deliver a deep analysis of the results and discuss the next step towards human-like temporal reasoning. |
Tasks | Named Entity Recognition, Question Answering, Relation Extraction, Temporal Information Extraction, Text Summarization |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5607/ |
https://www.aclweb.org/anthology/W18-5607 | |
PWC | https://paperswithcode.com/paper/investigating-the-challenges-of-temporal |
Repo | |
Framework | |
Gaussian Process Neurons
Title | Gaussian Process Neurons |
Authors | Sebastian Urban, Patrick van der Smagt |
Abstract | We propose a method to learn stochastic activation functions for use in probabilistic neural networks. First, we develop a framework to embed stochastic activation functions based on Gaussian processes in probabilistic neural networks. Second, we analytically derive expressions for the propagation of means and covariances in such a network, thus allowing for an efficient implementation and training without the need for sampling. Third, we show how to apply variational Bayesian inference to regularize and efficiently train this model. The resulting model can deal with uncertain inputs and implicitly provides an estimate of the confidence of its predictions. Like a conventional neural network it can scale to datasets of arbitrary size and be extended with convolutional and recurrent connections, if desired. |
Tasks | Bayesian Inference, Gaussian Processes |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=By-IifZRW |
https://openreview.net/pdf?id=By-IifZRW | |
PWC | https://paperswithcode.com/paper/gaussian-process-neurons |
Repo | |
Framework | |
Exploring Lexical-Semantic Knowledge in the Generation of Novel Riddles in Portuguese
Title | Exploring Lexical-Semantic Knowledge in the Generation of Novel Riddles in Portuguese |
Authors | Hugo Gonçalo Oliveira, Ricardo Rodrigues |
Abstract | |
Tasks | Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6604/ |
https://www.aclweb.org/anthology/W18-6604 | |
PWC | https://paperswithcode.com/paper/exploring-lexical-semantic-knowledge-in-the |
Repo | |
Framework | |
Early action prediction by soft regression
Title | Early action prediction by soft regression |
Authors | Jian-Fang Hu, Wei-Shi Zheng, Lianyang Ma, Gang Wang, Jian-Huang Lai, Jianguo Zhang |
Abstract | We propose a novel approach for predicting on-going action with the assistance of a low-cost depth camera. Our approach introduces a soft regression-based early prediction framework. In this framework, we estimate soft labels for the subsequences at different progress levels, jointly learned with an action predictor. Our formulation of soft regression framework 1) overcomes a usual assumption in existing early action prediction systems that the progress level of on-going sequence is given in the testing stage; and 2) presents a theoretical framework to better resolve the ambiguity and uncertainty of subsequences at early performing stage. The proposed soft regression framework is further enhanced in order to take the relationships among subsequences and the discrepancy of soft labels over different classes into consideration, so that a Multiple Soft labels Recurrent Neural Network (MSRNN) is finally developed. For real-time performance, we also introduce “local accumulative frame feature (LAFF)”, which can be computed efficiently by constructing an integral feature map. Our experiments on three RGB-D benchmark datasets and an unconstrained RGB action set demonstrate that the proposed regression-based early action prediction model outperforms existing models and the early action prediction on RGB-D sequence is more accurate than that on RGB channel. |
Tasks | Skeleton Based Action Recognition |
Published | 2018-08-06 |
URL | https://doi.org/10.1109/TPAMI.2018.2863279 |
https://discovery.dundee.ac.uk/ws/portalfiles/portal/28028712/Early_Action_Prediction_by_Soft_Regression.pdf | |
PWC | https://paperswithcode.com/paper/early-action-prediction-by-soft-regression |
Repo | |
Framework | |
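The abstract above says LAFF can be computed efficiently from an integral feature map but does not define either term. The sketch below therefore only illustrates the general integral (prefix-sum) trick over per-frame features; the function names and the assumption that the accumulated feature is a plain sum over a frame window are hypothetical and not taken from the paper.

```python
import numpy as np

def integral_feature_map(frame_features):
    """Prefix sums over time: row t holds the sum of the first t frame features."""
    num_frames, dim = frame_features.shape
    integral = np.zeros((num_frames + 1, dim))
    integral[1:] = np.cumsum(frame_features, axis=0)
    return integral

def accumulated_feature(integral, start, end):
    """Sum of the frame features in the half-open window [start, end), via one subtraction."""
    return integral[end] - integral[start]
```

Once the integral map is built in a single pass, the accumulated feature of any subsequence costs one vector subtraction regardless of its length, which is what makes this kind of feature attractive for real-time prediction on partial sequences.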