January 24, 2020

Paper Group NANR 234

Speech Recognition for Tigrinya language Using Deep Neural Network Approach

Title Speech Recognition for Tigrinya language Using Deep Neural Network Approach
Authors Hafte Abera, Sebsibe H/mariam
Abstract This work presents a speech recognition model for the Tigrinya language. A deep neural network is used to build the recognition model. The Long Short-Term Memory (LSTM) network, a special kind of recurrent neural network composed of Long Short-Term Memory blocks, is the primary layer of our neural network model. We use 40-dimensional MFCC-LDA-MLLT-fMLLR features with CMN. The acoustic models are trained on features that are obtained by projecting down to 40 dimensions using linear discriminant analysis (LDA). Moreover, speaker adaptive training (SAT) is done using a single feature-space maximum likelihood linear regression (fMLLR) transform estimated per speaker. We train and compare LSTM and DNN models at various numbers of parameters and configurations. We show that LSTM models converge quickly and give state-of-the-art speech recognition performance for relatively small models. Finally, the accuracy of the model is evaluated based on the recognition rate.
Tasks Speech Recognition
Published 2019-08-01
URL https://www.aclweb.org/anthology/papers/W/W19/W19-3603/
PDF https://www.aclweb.org/anthology/W19-3603
PWC https://paperswithcode.com/paper/speech-recognition-for-tigrinya-language
Repo
Framework
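
The abstract above mentions CMN (cepstral mean normalization) as part of the feature pipeline. As a rough illustration only, and not the authors' actual setup, the following sketch subtracts the per-utterance mean of each coefficient from every frame. The function name and the toy two-coefficient frames are assumptions made for this example; real MFCC frames would have many more coefficients.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-utterance mean of each coefficient from every frame."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient over all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Toy utterance: 2 frames with 2 coefficients each (hypothetical values).
utterance = [[1.0, 4.0], [3.0, 8.0]]
normalized = cepstral_mean_normalize(utterance)
```

After normalization each coefficient has zero mean over the utterance, which removes constant channel offsets before later transforms such as LDA are applied.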

Neural Conversation Recommendation with Online Interaction Modeling

Title Neural Conversation Recommendation with Online Interaction Modeling
Authors Xingshan Zeng, Jing Li, Lu Wang, Kam-Fai Wong
Abstract The prevalent use of social media leads to a vast amount of online conversations being produced on a daily basis. It presents a concrete challenge for individuals to better discover and engage in social media discussions. In this paper, we present a novel framework to automatically recommend conversations to users based on their prior conversation behaviors. Built on neural collaborative filtering, our model explores deep semantic features that measure how a user's preferences match an ongoing conversation's context. Furthermore, to identify salient characteristics from interleaving user interactions, our model incorporates graph-structured networks, where both replying relations and temporal features are encoded as conversation context. Experimental results on two large-scale datasets collected from Twitter and Reddit show that our model yields better performance than previous state-of-the-art models, which only utilize lexical features and ignore past user interactions in the conversations.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1470/
PDF https://www.aclweb.org/anthology/D19-1470
PWC https://paperswithcode.com/paper/neural-conversation-recommendation-with
Repo
Framework

Generative Multi-View Human Action Recognition

Title Generative Multi-View Human Action Recognition
Authors Lichen Wang, Zhengming Ding, Zhiqiang Tao, Yunyu Liu, Yun Fu
Abstract Multi-view action recognition aims to integrate complementary information from different views to improve classification performance. It is a challenging task due to the distinct gap between heterogeneous feature domains. Moreover, most existing methods neglect incomplete multi-view data, which limits their potential compatibility in real-world applications. In this work, we propose a Generative Multi-View Action Recognition (GMVAR) framework to address the challenges above. The adversarial generative network is leveraged to generate one view conditioned on the other view, which fully explores the latent connections in both intra-view and cross-view aspects. Our approach enhances model robustness by employing adversarial training, and naturally handles the incomplete view case by imputing the missing data. Moreover, an effective View Correlation Discovery Network (VCDN) is proposed to further fuse the multi-view information in a higher-level label space. Extensive experiments demonstrate the effectiveness of our proposed approach in comparison with state-of-the-art algorithms.
Tasks Temporal Action Localization
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Wang_Generative_Multi-View_Human_Action_Recognition_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_Generative_Multi-View_Human_Action_Recognition_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/generative-multi-view-human-action
Repo
Framework

Purchase as Reward : Session-based Recommendation by Imagination Reconstruction

Title Purchase as Reward : Session-based Recommendation by Imagination Reconstruction
Authors Qibing Li, Xiaolin Zheng
Abstract One of the key challenges of session-based recommender systems is to enhance users’ purchase intentions. In this paper, we formulate the sequential interactions between user sessions and a recommender agent as a Markov Decision Process (MDP). In practice, the purchase reward is delayed and sparse, and may be buried by clicks, making it an impoverished signal for policy learning. Inspired by the prediction error minimization (PEM) and embodied cognition, we propose a simple architecture to augment reward, namely Imagination Reconstruction Network (IRN). Specifically, IRN enables the agent to explore its environment and learn predictive representations via three key components. The imagination core generates predicted trajectories, i.e., imagined items that users may purchase. The trajectory manager controls the granularity of imagined trajectories using the planning strategies, which balances the long-term rewards and short-term rewards. To optimize the action policy, the imagination-augmented executor minimizes the intrinsic imagination error of simulated trajectories by self-supervised reconstruction, while maximizing the extrinsic reward using model-free algorithms. Empirically, IRN promotes quicker adaptation to user interest, and shows improved robustness to the cold-start scenario and ultimately higher purchase performance compared to several baselines. Somewhat surprisingly, IRN using only the purchase reward achieves excellent next-click prediction performance, demonstrating that the agent can “guess what you like” via internal planning.
Tasks Recommendation Systems, Session-Based Recommendations
Published 2019-05-01
URL https://openreview.net/forum?id=SkfTIj0cKX
PDF https://openreview.net/pdf?id=SkfTIj0cKX
PWC https://paperswithcode.com/paper/purchase-as-reward-session-based
Repo
Framework

Combining Unsupervised Pre-training and Annotator Rationales to Improve Low-shot Text Classification

Title Combining Unsupervised Pre-training and Annotator Rationales to Improve Low-shot Text Classification
Authors Oren Melamud, Mihaela Bornea, Ken Barker
Abstract Supervised learning models often perform poorly at low-shot tasks, i.e. tasks for which little labeled data is available for training. One prominent approach for improving low-shot learning is to use unsupervised pre-trained neural models. Another approach is to obtain richer supervision by collecting annotator rationales (explanations supporting label annotations). In this work, we combine these two approaches to improve low-shot text classification with two novel methods: a simple bag-of-words embedding approach; and a more complex context-aware method, based on the BERT model. In experiments with two English text classification datasets, we demonstrate substantial performance gains from combining pre-training with rationales. Furthermore, our investigation of a range of train-set sizes reveals that the simple bag-of-words approach is the clear top performer when there are only a few dozen training instances or fewer, while more complex models, such as BERT or CNN, require more training data to shine.
Tasks Text Classification
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1401/
PDF https://www.aclweb.org/anthology/D19-1401
PWC https://paperswithcode.com/paper/combining-unsupervised-pre-training-and
Repo
Framework

Robustness Certification with Refinement

Title Robustness Certification with Refinement
Authors Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev
Abstract We present a novel approach for verification of neural networks which combines scalable over-approximation methods with precise (mixed integer) linear programming. This results in significantly better precision than state of the art verifiers on feed forward neural networks with piecewise linear activation functions.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=HJgeEh09KQ
PDF https://openreview.net/pdf?id=HJgeEh09KQ
PWC https://paperswithcode.com/paper/robustness-certification-with-refinement
Repo
Framework
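
The abstract above refers to scalable over-approximation methods for verifying networks with piecewise linear activations. As a loose illustration of what an over-approximation looks like, the sketch below uses plain interval bound propagation through a tiny ReLU network. This is a simpler, swapped-in technique and not the paper's method, which combines more precise relaxations with (mixed integer) linear programming. The network weights, the input point, and the radius are all hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine_bounds(W: Matrix, b: Vector, lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """Propagate elementwise input bounds [lo, hi] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        low, high = bias, bias
        for w, x_lo, x_hi in zip(row, lo, hi):
            # A positive weight maps the lower input bound to the lower
            # output bound; a negative weight swaps the roles.
            if w >= 0:
                low += w * x_lo
                high += w * x_hi
            else:
                low += w * x_hi
                high += w * x_lo
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def relu_bounds(lo: Vector, hi: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


# Hypothetical toy network: 2 inputs -> 2 hidden ReLU units -> 1 output.
W1, b1 = [[1.0, -1.0], [1.0, 1.0]], [0.0, 0.0]
W2, b2 = [[1.0, 1.0]], [0.0]

x, eps = [1.0, 0.5], 0.25  # input point and L-infinity radius
lo = [v - eps for v in x]
hi = [v + eps for v in x]

lo, hi = relu_bounds(*affine_bounds(W1, b1, lo, hi))
out_lo, out_hi = affine_bounds(W2, b2, lo, hi)

# If the property is "output > 0", a positive lower bound certifies it
# for every input in the perturbation box.
certified = out_lo[0] > 0.0
```

Interval bounds are cheap but loose; when they fail to prove a property, a more precise (and more expensive) analysis can be tried on the same region.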

A Preliminary Plains Cree Speech Synthesizer

Title A Preliminary Plains Cree Speech Synthesizer
Authors Atticus Harrigan, Antti Arppe, Timothy Mills
Abstract
Tasks
Published 2019-02-01
URL https://www.aclweb.org/anthology/W19-6009/
PDF https://www.aclweb.org/anthology/W19-6009
PWC https://paperswithcode.com/paper/a-preliminary-plains-cree-speech-synthesizer
Repo
Framework

Proceedings of the Second Workshop on Storytelling

Title Proceedings of the Second Workshop on Storytelling
Authors
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3400/
PDF https://www.aclweb.org/anthology/W19-3400
PWC https://paperswithcode.com/paper/proceedings-of-the-second-workshop-on-16
Repo
Framework

Learning Latent Global Network for Skeleton-based Action Prediction

Title Learning Latent Global Network for Skeleton-based Action Prediction
Authors Qiuhong Ke, Mohammed Bennamoun, Hossein Rahmani, Senjian An, Ferdous Sohel, Farid Boussaid
Abstract Human actions represented with 3D skeleton sequences are robust to clustered backgrounds and illumination changes. In this paper, we investigate skeleton-based action prediction, which aims to recognize an action from a partial skeleton sequence that contains incomplete action information. We propose a new Latent Global Network based on adversarial learning for action prediction. We demonstrate that the proposed network provides latent long-term global information that is complementary to the local action information of the partial sequences and helps improve action prediction. We show that action prediction can be improved by combining the latent global information with the local action information. We test the proposed method on three challenging skeleton datasets and report state-of-the-art performance.
Tasks Skeleton Based Action Recognition
Published 2019-09-02
URL https://doi.org/10.1109/TIP.2019.2937757
PDF https://eprints.lancs.ac.uk/id/eprint/136503/1/bare_jrnl7.pdf
PWC https://paperswithcode.com/paper/learning-latent-global-network-for-skeleton
Repo
Framework

Continual Learning via Explicit Structure Learning

Title Continual Learning via Explicit Structure Learning
Authors Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong
Abstract Despite recent advances in deep learning, neural networks suffer catastrophic forgetting when tasks are learned sequentially. We propose a conceptually simple and general framework for continual learning, where structure optimization is considered explicitly during learning. We implement this idea by separating the structure and parameter learning. During structure learning, the model optimizes for the best structure for the current task. The model learns when to reuse or modify structure from previous tasks, or create new ones when necessary. The model parameters are then estimated with the optimal structure. Empirically, we found that our approach leads to sensible structures when learning multiple tasks continuously. Additionally, catastrophic forgetting is also largely alleviated by explicit learning of structures. Our method also outperforms all other baselines on the permuted MNIST and split CIFAR datasets in the continual learning setting.
Tasks Continual Learning
Published 2019-05-01
URL https://openreview.net/forum?id=ryxsS3A5Km
PDF https://openreview.net/pdf?id=ryxsS3A5Km
PWC https://paperswithcode.com/paper/continual-learning-via-explicit-structure
Repo
Framework

Cross-Task Knowledge Transfer for Query-Based Text Summarization

Title Cross-Task Knowledge Transfer for Query-Based Text Summarization
Authors Elozino Egonmwan, Vittorio Castelli, Md Arafat Sultan
Abstract We demonstrate the viability of knowledge transfer between two related tasks: machine reading comprehension (MRC) and query-based text summarization. Using an MRC model trained on the SQuAD1.1 dataset as a core system component, we first build an extractive query-based summarizer. For better precision, this summarizer also compresses the output of the MRC model using a novel sentence compression technique. We further leverage pre-trained machine translation systems to abstract our extracted summaries. Our models achieve state-of-the-art results on the publicly available CNN/Daily Mail and Debatepedia datasets, and can serve as simple yet powerful baselines for future systems. We also hope that these results will encourage research on transfer learning from large MRC corpora to query-based summarization.
Tasks Machine Reading Comprehension, Machine Translation, Reading Comprehension, Sentence Compression, Text Summarization, Transfer Learning
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5810/
PDF https://www.aclweb.org/anthology/D19-5810
PWC https://paperswithcode.com/paper/cross-task-knowledge-transfer-for-query-based
Repo
Framework

Evaluating Conjunction Disambiguation on English-to-German and French-to-German WMT 2019 Translation Hypotheses

Title Evaluating Conjunction Disambiguation on English-to-German and French-to-German WMT 2019 Translation Hypotheses
Authors Maja Popović
Abstract We present a test set for evaluating an MT system's capability to translate ambiguous conjunctions depending on the sentence structure. We concentrate on the English conjunction "but" and its French equivalent "mais", which can be translated into two different German conjunctions. We evaluate all English-to-German and French-to-German submissions to the WMT 2019 shared translation task. The evaluation is done mainly automatically, with additional fast manual inspection of unclear cases. All systems almost perfectly recognise the target conjunction "aber", whereas accuracies for the other target conjunction "sondern" range from 78% to 97%, and the errors are mostly caused by replacing it with the alternative conjunction "aber". The best performing system for both language pairs is a multilingual Transformer "TartuNLP" system trained on all WMT 2019 language pairs which use the Latin script, indicating that the multilingual approach is beneficial for conjunction disambiguation. As for other system features, such as using synthetic back-translated data, context-aware, hybrid, etc., no particular (dis)advantages can be observed. Qualitative manual inspection of translation hypotheses shows that highly ranked systems generally produce translations with high adequacy and fluency, meaning that these systems do not merely capture the right conjunction while the rest of the translation hypothesis is poor. On the other hand, the low ranked systems generally exhibit lower fluency and poor adequacy.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5353/
PDF https://www.aclweb.org/anthology/W19-5353
PWC https://paperswithcode.com/paper/evaluating-conjunction-disambiguation-on
Repo
Framework
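
The abstract above describes a mainly automatic evaluation of whether a hypothesis uses the expected German conjunction. A minimal sketch of such a check, assuming the test set can be reduced to (expected conjunction, hypothesis) pairs, might look like the following. The function name, the token-matching rule, and the example sentences are hypothetical and are not taken from the paper's test set or scripts.

```python
import re
from typing import List, Tuple


def conjunction_accuracy(pairs: List[Tuple[str, str]]) -> float:
    """Fraction of hypotheses that contain the expected conjunction as a token."""
    if not pairs:
        return 0.0
    correct = 0
    for expected, hypothesis in pairs:
        # Lowercase and split into word tokens so punctuation is ignored.
        tokens = re.findall(r"\w+", hypothesis.lower())
        if expected in tokens:
            correct += 1
    return correct / len(pairs)


# Hypothetical (expected conjunction, system hypothesis) pairs.
examples = [
    ("sondern", "Er ist nicht reich, sondern arm."),
    ("sondern", "Er ist nicht reich, aber arm."),  # wrong conjunction
    ("aber", "Er ist reich, aber unglücklich."),
    ("aber", "Sie kam, aber zu spät."),
]
accuracy = conjunction_accuracy(examples)
```

A purely token-based check like this cannot judge unclear cases, which is presumably why the abstract mentions additional manual inspection.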

Generative Question Answering: Learning to Answer the Whole Question

Title Generative Question Answering: Learning to Answer the Whole Question
Authors Mike Lewis, Angela Fan
Abstract Discriminative question answering models can overfit to superficial biases in datasets, because their loss function saturates when any clue makes the answer likely. We introduce generative models of the joint distribution of questions and answers, which are trained to explain the whole question, not just to answer it. Our question answering (QA) model is implemented by learning a prior over answers, and a conditional language model to generate the question given the answer, allowing scalable and interpretable many-hop reasoning as the question is generated word-by-word. Our model achieves competitive performance with specialised discriminative models on the SQUAD and CLEVR benchmarks, indicating that it is a more general architecture for language understanding and reasoning than previous work. The model greatly improves generalisation both from biased training data and to adversarial testing data, achieving a new state-of-the-art on ADVERSARIAL SQUAD. We will release our code.
Tasks Language Modelling, Question Answering
Published 2019-05-01
URL https://openreview.net/forum?id=Bkx0RjA9tX
PDF https://openreview.net/pdf?id=Bkx0RjA9tX
PWC https://paperswithcode.com/paper/generative-question-answering-learning-to
Repo
Framework
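
The abstract above describes answering by combining a prior over answers with a conditional model of the question given the answer. One way to read this is as picking the answer that maximizes the joint probability p(a) * p(q | a). The sketch below illustrates only that selection rule with made-up probability tables; the function name and all numbers are assumptions, and the paper's actual models are neural networks rather than lookup tables.

```python
import math
from typing import Dict


def pick_answer(prior: Dict[str, float], question_likelihood: Dict[str, float]) -> str:
    """Return the answer a maximizing log p(a) + log p(q | a)."""
    scores = {
        answer: math.log(prior[answer]) + math.log(question_likelihood[answer])
        for answer in prior
    }
    return max(scores, key=scores.get)


# Hypothetical values: "Paris" is more likely a priori, but the question
# is explained much better by "Lyon".
prior = {"Paris": 0.6, "Lyon": 0.4}
question_likelihood = {"Paris": 0.01, "Lyon": 0.05}

best = pick_answer(prior, question_likelihood)
```

Because the question likelihood must account for every word of the question, an answer that only matches a superficial clue scores poorly under this rule.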

Not Using the Car to See the Sidewalk – Quantifying and Controlling the Effects of Context in Classification and Segmentation

Title Not Using the Car to See the Sidewalk – Quantifying and Controlling the Effects of Context in Classification and Segmentation
Authors Rakshith Shetty, Bernt Schiele, Mario Fritz
Abstract The importance of visual context in scene understanding tasks is well recognized in the computer vision community. However, to what extent computer vision models depend on context to make their predictions is unclear. A model overly relying on context will fail when encountering objects in different contexts than in training data and hence it is important to identify these dependencies before we can deploy the models in the real world. We propose a method to quantify the sensitivity of black-box vision models to visual context by editing images to remove selected objects and measuring the response of the target models. We apply this methodology to two tasks, image classification and semantic segmentation, and discover undesirable dependency between objects and context, for example that “sidewalk” segmentation is very sensitive to the presence of “cars” in the image. We propose an object removal based data augmentation solution to mitigate this dependency and increase the robustness of classification and segmentation models to contextual variations. Our experiments show that the proposed data augmentation helps these models improve the performance in out-of-context scenarios, while preserving the performance on regular data.
Tasks Data Augmentation, Image Classification, Scene Understanding, Semantic Segmentation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Shetty_Not_Using_the_Car_to_See_the_Sidewalk_--_Quantifying_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Shetty_Not_Using_the_Car_to_See_the_Sidewalk_--_Quantifying_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/not-using-the-car-to-see-the-sidewalk-1
Repo
Framework

Proceedings of the Third Workshop on Abusive Language Online

Title Proceedings of the Third Workshop on Abusive Language Online
Authors
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3500/
PDF https://www.aclweb.org/anthology/W19-3500
PWC https://paperswithcode.com/paper/proceedings-of-the-third-workshop-on-abusive
Repo
Framework