Paper Group NANR 234
Speech Recognition for Tigrinya language Using Deep Neural Network Approach
Title | Speech Recognition for Tigrinya language Using Deep Neural Network Approach |
Authors | Hafte Abera, Sebsibe H/mariam |
Abstract | This work presents a speech recognition model for the Tigrinya language. A Deep Neural Network is used to build the recognition model. The Long Short-Term Memory network (LSTM), a special kind of Recurrent Neural Network composed of Long Short-Term Memory blocks, is the primary layer of our neural network model. The 40-dimensional MFCC-LDA-MLLT-fMLLR features with CMN were used. The acoustic models are trained on features obtained by projecting down to 40 dimensions using linear discriminant analysis (LDA). Moreover, speaker adaptive training (SAT) is done using a single feature-space maximum likelihood linear regression (fMLLR) transform estimated per speaker. We train and compare LSTM and DNN models at various numbers of parameters and configurations. We show that LSTM models converge quickly and give state-of-the-art speech recognition performance for relatively small models. Finally, the accuracy of the model is evaluated based on the recognition rate. |
Tasks | Speech Recognition |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/papers/W/W19/W19-3603/ |
https://www.aclweb.org/anthology/W19-3603 | |
PWC | https://paperswithcode.com/paper/speech-recognition-for-tigrinya-language |
Repo | |
Framework | |
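No repository is listed for this paper. As a point of reference only, here is a minimal PyTorch sketch of the kind of LSTM acoustic model the abstract describes: a stacked LSTM over 40-dimensional fMLLR-style features with a per-frame classifier over acoustic states. All sizes (hidden width, layer count, state inventory) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class LSTMAcousticModel(nn.Module):
    """Maps frames of 40-dim fMLLR-style features to per-frame
    logits over acoustic states (all sizes are illustrative)."""
    def __init__(self, feat_dim=40, hidden=512, layers=3, n_states=2000):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=layers, batch_first=True)
        self.proj = nn.Linear(hidden, n_states)

    def forward(self, feats):            # feats: (batch, frames, 40)
        out, _ = self.lstm(feats)
        return self.proj(out)            # (batch, frames, n_states)

# One training step on dummy data, with per-frame cross-entropy
model = LSTMAcousticModel()
feats = torch.randn(8, 200, 40)          # 8 utterances, 200 frames each
targets = torch.randint(0, 2000, (8, 200))
loss = nn.CrossEntropyLoss()(model(feats).transpose(1, 2), targets)
loss.backward()
```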
Neural Conversation Recommendation with Online Interaction Modeling
Title | Neural Conversation Recommendation with Online Interaction Modeling |
Authors | Xingshan Zeng, Jing Li, Lu Wang, Kam-Fai Wong |
Abstract | The prevalent use of social media leads to a vast amount of online conversations being produced on a daily basis. It presents a concrete challenge for individuals to better discover and engage in social media discussions. In this paper, we present a novel framework to automatically recommend conversations to users based on their prior conversation behaviors. Built on neural collaborative filtering, our model explores deep semantic features that measure how a user's preferences match an ongoing conversation's context. Furthermore, to identify salient characteristics from interleaving user interactions, our model incorporates graph-structured networks, where both replying relations and temporal features are encoded as conversation context. Experimental results on two large-scale datasets collected from Twitter and Reddit show that our model yields better performance than previous state-of-the-art models, which only utilize lexical features and ignore past user interactions in the conversations. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1470/ |
https://www.aclweb.org/anthology/D19-1470 | |
PWC | https://paperswithcode.com/paper/neural-conversation-recommendation-with |
Repo | |
Framework | |
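No repository is listed here either. The sketch below, assuming PyTorch, illustrates the neural-collaborative-filtering core of the idea: score how well a user embedding matches an ongoing conversation's context. The paper's graph-structured encoder over replying relations and temporal features is abbreviated to a mean over turn embeddings; every module name and size is an illustrative stand-in.

```python
import torch
import torch.nn as nn

class ConversationRecommender(nn.Module):
    """NCF-style scorer: does user u's embedding match the context of an
    ongoing conversation? The graph encoder over replying relations is
    abbreviated here to a mean over turn embeddings (illustrative only)."""
    def __init__(self, n_users, turn_dim=64, hidden=128):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, turn_dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * turn_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, user_ids, turn_vecs):   # turn_vecs: (batch, turns, dim)
        conv_ctx = turn_vecs.mean(dim=1)       # stand-in for the graph encoder
        pair = torch.cat([self.user_emb(user_ids), conv_ctx], dim=-1)
        return self.mlp(pair).squeeze(-1)      # higher score = recommend

scores = ConversationRecommender(n_users=1000)(
    torch.tensor([3, 7]), torch.randn(2, 5, 64))
```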
Generative Multi-View Human Action Recognition
Title | Generative Multi-View Human Action Recognition |
Authors | Lichen Wang, Zhengming Ding, Zhiqiang Tao, Yunyu Liu, Yun Fu |
Abstract | Multi-view action recognition aims to integrate complementary information from different views to improve classification performance. It is a challenging task due to the distinct gap between heterogeneous feature domains. Moreover, most existing methods neglect to consider incomplete multi-view data, which limits their applicability in real-world scenarios. In this work, we propose a Generative Multi-View Action Recognition (GMVAR) framework to address the challenges above. An adversarial generative network is leveraged to generate one view conditioned on the other view, which fully explores the latent connections in both intra-view and cross-view aspects. Our approach enhances model robustness by employing adversarial training, and naturally handles the incomplete-view case by imputing the missing data. Moreover, an effective View Correlation Discovery Network (VCDN) is proposed to further fuse the multi-view information in a higher-level label space. Extensive experiments demonstrate the effectiveness of our proposed approach by comparison with state-of-the-art algorithms. |
Tasks | Temporal Action Localization |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Wang_Generative_Multi-View_Human_Action_Recognition_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_Generative_Multi-View_Human_Action_Recognition_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/generative-multi-view-human-action |
Repo | |
Framework | |
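A minimal PyTorch sketch of the cross-view generation idea follows: a generator imputes the missing view's feature from the observed one, a discriminator supplies the adversarial signal, and a classifier fuses both views. The adversarial training loop, the VCDN fusion in label space, and all dimensions are omitted or invented for illustration.

```python
import torch
import torch.nn as nn

# Illustrative feature sizes for two heterogeneous views (not from the paper)
feat_a, feat_b = 128, 96

gen_a2b = nn.Sequential(nn.Linear(feat_a, 256), nn.ReLU(), nn.Linear(256, feat_b))
disc_b  = nn.Sequential(nn.Linear(feat_b, 64), nn.ReLU(), nn.Linear(64, 1))
classifier = nn.Linear(feat_a + feat_b, 10)   # late fusion of both views

x_a = torch.randn(4, feat_a)                  # observed view
x_b_hat = gen_a2b(x_a)                        # imputed missing view
logits = classifier(torch.cat([x_a, x_b_hat], dim=-1))
adv_score = disc_b(x_b_hat)                   # adversarial signal on the fake view
```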
Purchase as Reward: Session-based Recommendation by Imagination Reconstruction
Title | Purchase as Reward: Session-based Recommendation by Imagination Reconstruction |
Authors | Qibing Li, Xiaolin Zheng |
Abstract | One of the key challenges of session-based recommender systems is to enhance users’ purchase intentions. In this paper, we formulate the sequential interactions between user sessions and a recommender agent as a Markov Decision Process (MDP). In practice, the purchase reward is delayed and sparse, and may be buried by clicks, making it an impoverished signal for policy learning. Inspired by prediction error minimization (PEM) and embodied cognition, we propose a simple architecture to augment the reward, namely the Imagination Reconstruction Network (IRN). Specifically, IRN enables the agent to explore its environment and learn predictive representations via three key components. The imagination core generates predicted trajectories, i.e., imagined items that users may purchase. The trajectory manager controls the granularity of imagined trajectories using planning strategies, which balance long-term and short-term rewards. To optimize the action policy, the imagination-augmented executor minimizes the intrinsic imagination error of simulated trajectories by self-supervised reconstruction, while maximizing the extrinsic reward using model-free algorithms. Empirically, IRN promotes quicker adaptation to user interest, and shows improved robustness to the cold-start scenario and ultimately higher purchase performance compared to several baselines. Somewhat surprisingly, IRN using only the purchase reward achieves excellent next-click prediction performance, demonstrating that the agent can “guess what you like” via internal planning. |
Tasks | Recommendation Systems, Session-Based Recommendations |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SkfTIj0cKX |
https://openreview.net/pdf?id=SkfTIj0cKX | |
PWC | https://paperswithcode.com/paper/purchase-as-reward-session-based |
Repo | |
Framework | |
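The reward-augmentation idea can be sketched in a few lines. Below, the sparse extrinsic purchase reward is complemented by an intrinsic signal, here a negative reconstruction error between an imagined trajectory embedding and what the user actually did. The squared-error form, the mixing weight, and the function name are all illustrative guesses, not the paper's exact formulation.

```python
import numpy as np

def augmented_reward(extrinsic, imagined, observed, beta=0.1):
    """Sketch: augment the sparse purchase reward with an intrinsic term,
    the negative reconstruction error between an imagined trajectory
    embedding and the embedding of the user's actual behavior.
    beta and the squared-error form are illustrative choices."""
    intrinsic = -np.mean((imagined - observed) ** 2)
    return extrinsic + beta * intrinsic

r = augmented_reward(extrinsic=0.0,                 # no purchase this step
                     imagined=np.random.rand(16),   # predicted item embedding
                     observed=np.random.rand(16))   # embedding of actual click
```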
Combining Unsupervised Pre-training and Annotator Rationales to Improve Low-shot Text Classification
Title | Combining Unsupervised Pre-training and Annotator Rationales to Improve Low-shot Text Classification |
Authors | Oren Melamud, Mihaela Bornea, Ken Barker |
Abstract | Supervised learning models often perform poorly at low-shot tasks, i.e., tasks for which little labeled data is available for training. One prominent approach for improving low-shot learning is to use unsupervised pre-trained neural models. Another approach is to obtain richer supervision by collecting annotator rationales (explanations supporting label annotations). In this work, we combine these two approaches to improve low-shot text classification with two novel methods: a simple bag-of-words embedding approach; and a more complex context-aware method, based on the BERT model. In experiments with two English text classification datasets, we demonstrate substantial performance gains from combining pre-training with rationales. Furthermore, our investigation of a range of train-set sizes reveals that the simple bag-of-words approach is the clear top performer when there are only a few dozen training instances or fewer, while more complex models, such as BERT or CNN, require more training data to shine. |
Tasks | Text Classification |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1401/ |
https://www.aclweb.org/anthology/D19-1401 | |
PWC | https://paperswithcode.com/paper/combining-unsupervised-pre-training-and |
Repo | |
Framework | |
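A minimal sketch of the bag-of-words variant: average pre-trained word vectors for a document, upweighting tokens that annotators marked as rationales. The specific weighting scheme is an illustrative guess at "combining pre-training with rationales"; the paper's actual method may differ.

```python
import numpy as np

def rationale_bow(tokens, rationale_mask, vectors, weight=3.0):
    """Bag-of-words document embedding from pre-trained word vectors,
    with annotator-rationale tokens upweighted (illustrative scheme)."""
    vecs = np.stack([vectors[t] for t in tokens if t in vectors])
    w = np.array([weight if m else 1.0
                  for t, m in zip(tokens, rationale_mask) if t in vectors])
    return (w[:, None] * vecs).sum(0) / w.sum()

# Toy pre-trained vectors; "great" is the annotator's rationale token
vectors = {w: np.random.rand(50) for w in ["great", "plot", "boring", "acting"]}
emb = rationale_bow(["great", "plot", "boring"], [True, False, False], vectors)
```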
Robustness Certification with Refinement
Title | Robustness Certification with Refinement |
Authors | Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev |
Abstract | We present a novel approach for verification of neural networks which combines scalable over-approximation methods with precise (mixed integer) linear programming. This results in significantly better precision than state-of-the-art verifiers on feedforward neural networks with piecewise linear activation functions. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=HJgeEh09KQ |
https://openreview.net/pdf?id=HJgeEh09KQ | |
PWC | https://paperswithcode.com/paper/robustness-certification-with-refinement |
Repo | |
Framework | |
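To make the "scalable over-approximation" half concrete, here is a minimal NumPy sketch of interval bound propagation through affine + ReLU layers. The MILP refinement that the paper layers on top, which tightens bounds for unstable ReLUs, is deliberately omitted; this is only the cheap outer approximation.

```python
import numpy as np

def interval_bounds(lower, upper, weights, biases):
    """Propagate box bounds through affine + ReLU layers: the cheap
    over-approximation half of the approach. The MILP refinement the
    paper adds on top is omitted. Bounds are taken after each ReLU."""
    for W, b in zip(weights, biases):
        center, radius = (lower + upper) / 2, (upper - lower) / 2
        mid = W @ center + b
        dev = np.abs(W) @ radius
        lower, upper = np.maximum(mid - dev, 0), np.maximum(mid + dev, 0)
    return lower, upper

# Bounds for a tiny one-layer net over a small input box around the origin
W1, b1 = np.random.randn(4, 2), np.zeros(4)
lo, hi = interval_bounds(np.array([-0.1, -0.1]), np.array([0.1, 0.1]),
                         [W1], [b1])
```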
A Preliminary Plains Cree Speech Synthesizer
Title | A Preliminary Plains Cree Speech Synthesizer |
Authors | Atticus Harrigan, Antti Arppe, Timothy Mills |
Abstract | |
Tasks | |
Published | 2019-02-01 |
URL | https://www.aclweb.org/anthology/W19-6009/ |
https://www.aclweb.org/anthology/W19-6009 | |
PWC | https://paperswithcode.com/paper/a-preliminary-plains-cree-speech-synthesizer |
Repo | |
Framework | |
Proceedings of the Second Workshop on Storytelling
Title | Proceedings of the Second Workshop on Storytelling |
Authors | |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3400/ |
https://www.aclweb.org/anthology/W19-3400 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-second-workshop-on-16 |
Repo | |
Framework | |
Learning Latent Global Network for Skeleton-based Action Prediction
Title | Learning Latent Global Network for Skeleton-based Action Prediction |
Authors | Qiuhong Ke, Mohammed Bennamoun, Hossein Rahmani, Senjian An, Ferdous Sohel, Farid Boussaid |
Abstract | Human actions represented with 3D skeleton sequences are robust to cluttered backgrounds and illumination changes. In this paper, we investigate skeleton-based action prediction, which aims to recognize an action from a partial skeleton sequence that contains incomplete action information. We propose a new Latent Global Network based on adversarial learning for action prediction. We demonstrate that the proposed network provides latent long-term global information that is complementary to the local action information of the partial sequences and helps improve action prediction. We show that action prediction can be improved by combining the latent global information with the local action information. We test the proposed method on three challenging skeleton datasets and report state-of-the-art performance. |
Tasks | Skeleton Based Action Recognition |
Published | 2019-09-02 |
URL | https://doi.org/10.1109/TIP.2019.2937757 |
https://eprints.lancs.ac.uk/id/eprint/136503/1/bare_jrnl7.pdf | |
PWC | https://paperswithcode.com/paper/learning-latent-global-network-for-skeleton |
Repo | |
Framework | |
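A rough PyTorch sketch of the local-plus-latent-global idea: a local encoder summarizes the observed partial skeleton sequence, a generator infers a latent global feature for the full action, and the classifier sees both. The adversarial training that shapes the global feature is not shown; every module choice and size here is an illustrative stand-in.

```python
import torch
import torch.nn as nn

# Local encoder over a partial sequence of 25 joints x 3D coordinates
local_enc = nn.GRU(input_size=75, hidden_size=128, batch_first=True)
# Generator inferring a latent "global" feature from the local summary
global_gen = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
classifier = nn.Linear(256, 60)           # e.g. 60 action classes (assumed)

partial = torch.randn(2, 30, 75)          # first 30 frames of each sequence
_, h = local_enc(partial)
local_feat = h[-1]                        # (batch, 128) local summary
global_feat = global_gen(local_feat)      # latent long-term information
logits = classifier(torch.cat([local_feat, global_feat], dim=-1))
```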
Continual Learning via Explicit Structure Learning
Title | Continual Learning via Explicit Structure Learning |
Authors | Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong |
Abstract | Despite recent advances in deep learning, neural networks suffer catastrophic forgetting when tasks are learned sequentially. We propose a conceptually simple and general framework for continual learning, where structure optimization is considered explicitly during learning. We implement this idea by separating the structure and parameter learning. During structure learning, the model optimizes for the best structure for the current task. The model learns when to reuse or modify structure from previous tasks, or create new ones when necessary. The model parameters are then estimated with the optimal structure. Empirically, we found that our approach leads to sensible structures when learning multiple tasks continuously. Additionally, catastrophic forgetting is also largely alleviated by the explicit learning of structures. Our method also outperforms all other baselines on the permuted MNIST and split CIFAR datasets in the continual learning setting. |
Tasks | Continual Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=ryxsS3A5Km |
https://openreview.net/pdf?id=ryxsS3A5Km | |
PWC | https://paperswithcode.com/paper/continual-learning-via-explicit-structure |
Repo | |
Framework | |
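The structure-learning step can be illustrated as a per-layer choice among reuse / adapt / new, scored on the current task. The exhaustive enumeration below is only for clarity; the paper optimizes structure far more efficiently, and `evaluate` is a hypothetical stand-in for training with a fixed structure and measuring validation accuracy.

```python
import itertools

def search_structure(layers, options=("reuse", "adapt", "new"), evaluate=None):
    """Sketch: for each layer, choose whether to reuse a previous task's
    weights, adapt them, or create a fresh layer, keeping the configuration
    that scores best on the current task."""
    best, best_score = None, float("-inf")
    for config in itertools.product(options, repeat=len(layers)):
        score = evaluate(dict(zip(layers, config)))
        if score > best_score:
            best, best_score = config, score
    return dict(zip(layers, best))

# Toy evaluator that prefers reusing conv layers and replacing the head
toy = lambda cfg: sum(1 for l, c in cfg.items()
                      if (c == "reuse") == l.startswith("conv"))
print(search_structure(["conv1", "conv2", "fc"], evaluate=toy))
```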
Cross-Task Knowledge Transfer for Query-Based Text Summarization
Title | Cross-Task Knowledge Transfer for Query-Based Text Summarization |
Authors | Elozino Egonmwan, Vittorio Castelli, Md Arafat Sultan |
Abstract | We demonstrate the viability of knowledge transfer between two related tasks: machine reading comprehension (MRC) and query-based text summarization. Using an MRC model trained on the SQuAD1.1 dataset as a core system component, we first build an extractive query-based summarizer. For better precision, this summarizer also compresses the output of the MRC model using a novel sentence compression technique. We further leverage pre-trained machine translation systems to abstract our extracted summaries. Our models achieve state-of-the-art results on the publicly available CNN/Daily Mail and Debatepedia datasets, and can serve as simple yet powerful baselines for future systems. We also hope that these results will encourage research on transfer learning from large MRC corpora to query-based summarization. |
Tasks | Machine Reading Comprehension, Machine Translation, Reading Comprehension, Sentence Compression, Text Summarization, Transfer Learning |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5810/ |
https://www.aclweb.org/anthology/D19-5810 | |
PWC | https://paperswithcode.com/paper/cross-task-knowledge-transfer-for-query-based |
Repo | |
Framework | |
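The extract-by-MRC idea can be sketched with an off-the-shelf SQuAD-trained QA model: score each sentence by how confidently the model finds an answer to the query in it, and keep the top ones. This assumes the Hugging Face `transformers` question-answering pipeline; the paper's own MRC model, sentence-compression step, and MT-based abstraction step are not reproduced here.

```python
from transformers import pipeline

# A SQuAD-trained extractive MRC model (default pipeline checkpoint)
qa = pipeline("question-answering")

def extractive_summary(query, sentences, k=2):
    """Rank sentences by the MRC model's answer confidence for the query
    and return the top-k as an extractive query-based summary (sketch)."""
    scored = [(qa(question=query, context=s)["score"], s) for s in sentences]
    return [s for _, s in sorted(scored, reverse=True)[:k]]

summary = extractive_summary(
    "What caused the outage?",
    ["The outage began at noon.",
     "A failed router update caused the outage.",
     "Service was restored by evening."])
```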
Evaluating Conjunction Disambiguation on English-to-German and French-to-German WMT 2019 Translation Hypotheses
Title | Evaluating Conjunction Disambiguation on English-to-German and French-to-German WMT 2019 Translation Hypotheses |
Authors | Maja Popović |
Abstract | We present a test set for evaluating an MT system's capability to translate ambiguous conjunctions depending on the sentence structure. We concentrate on the English conjunction "but" and its French equivalent "mais", which can be translated into two different German conjunctions. We evaluate all English-to-German and French-to-German submissions to the WMT 2019 shared translation task. The evaluation is done mainly automatically, with additional fast manual inspection of unclear cases. All systems almost perfectly recognise the target conjunction "aber", whereas accuracies for the other target conjunction "sondern" range from 78% to 97%, and the errors are mostly caused by replacing it with the alternative conjunction "aber". The best performing system for both language pairs is the multilingual Transformer "TartuNLP" system trained on all WMT 2019 language pairs which use the Latin script, indicating that the multilingual approach is beneficial for conjunction disambiguation. As for other system features, such as using synthetic back-translated data, context-aware architectures, hybrid systems, etc., no particular (dis)advantages can be observed. Qualitative manual inspection of translation hypotheses showed that highly ranked systems generally produce translations with high adequacy and fluency, meaning that these systems do not merely capture the right conjunction while the rest of the translation hypothesis is poor. On the other hand, the low-ranked systems generally exhibit lower fluency and poor adequacy. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5353/ |
https://www.aclweb.org/anthology/W19-5353 | |
PWC | https://paperswithcode.com/paper/evaluating-conjunction-disambiguation-on |
Repo | |
Framework | |
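The "mainly automatic" evaluation the abstract describes reduces to a simple check: for each source sentence with a known correct German conjunction, does the system's hypothesis contain it? A minimal sketch, with toy tokenization (a real evaluation would tokenize properly and route unclear cases to manual inspection):

```python
def conjunction_accuracy(hypotheses, targets):
    """Fraction of hypotheses containing the gold German conjunction
    ('aber' or 'sondern') for their source sentence (sketch)."""
    hits = sum(1 for hyp, gold in zip(hypotheses, targets)
               if gold in hyp.lower().split())
    return hits / len(targets)

acc = conjunction_accuracy(
    ["er ist nicht klein , sondern groß", "er ist klein , aber stark"],
    ["sondern", "aber"])
print(f"accuracy: {acc:.0%}")   # 100%
```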
Generative Question Answering: Learning to Answer the Whole Question
Title | Generative Question Answering: Learning to Answer the Whole Question |
Authors | Mike Lewis, Angela Fan |
Abstract | Discriminative question answering models can overfit to superficial biases in datasets, because their loss function saturates when any clue makes the answer likely. We introduce generative models of the joint distribution of questions and answers, which are trained to explain the whole question, not just to answer it. Our question answering (QA) model is implemented by learning a prior over answers, and a conditional language model to generate the question given the answer, allowing scalable and interpretable many-hop reasoning as the question is generated word by word. Our model achieves competitive performance with specialised discriminative models on the SQuAD and CLEVR benchmarks, indicating that it is a more general architecture for language understanding and reasoning than previous work. The model greatly improves generalisation both from biased training data and to adversarial testing data, achieving a new state of the art on Adversarial SQuAD. We will release our code. |
Tasks | Language Modelling, Question Answering |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Bkx0RjA9tX |
https://openreview.net/pdf?id=Bkx0RjA9tX | |
PWC | https://paperswithcode.com/paper/generative-question-answering-learning-to |
Repo | |
Framework | |
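Inference in such a generative QA model amounts to picking the answer that best explains the whole question: maximize log p(a) + log p(q | a) over candidate answers. A toy sketch, where `prior` and `cond_lm` are hypothetical stand-ins for the learned answer prior and the answer-conditioned question language model:

```python
import math

def answer(question_tokens, candidates, prior, cond_lm):
    """Pick the answer a maximizing log p(a) + log p(q | a), i.e. the
    answer that best explains the whole question (sketch)."""
    def joint(a):
        return math.log(prior[a]) + sum(
            math.log(cond_lm(tok, a)) for tok in question_tokens)
    return max(candidates, key=joint)

# Toy prior and conditional token model
prior = {"paris": 0.6, "lyon": 0.4}
cond_lm = lambda tok, a: 0.9 if (a == "paris" and tok == "capital") else 0.1
print(answer(["capital", "of", "france"], ["paris", "lyon"], prior, cond_lm))
```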
Not Using the Car to See the Sidewalk – Quantifying and Controlling the Effects of Context in Classification and Segmentation
Title | Not Using the Car to See the Sidewalk – Quantifying and Controlling the Effects of Context in Classification and Segmentation |
Authors | Rakshith Shetty, Bernt Schiele, Mario Fritz |
Abstract | The importance of visual context in scene understanding tasks is well recognized in the computer vision community. However, to what extent computer vision models depend on context to make their predictions is unclear. A model that relies too heavily on context will fail when it encounters objects in contexts different from those in the training data, and hence it is important to identify these dependencies before we can deploy the models in the real world. We propose a method to quantify the sensitivity of black-box vision models to visual context by editing images to remove selected objects and measuring the response of the target models. We apply this methodology to two tasks, image classification and semantic segmentation, and discover undesirable dependencies between objects and context, for example that “sidewalk” segmentation is very sensitive to the presence of “cars” in the image. We propose an object removal based data augmentation solution to mitigate this dependency and increase the robustness of classification and segmentation models to contextual variations. Our experiments show that the proposed data augmentation helps these models improve their performance in out-of-context scenarios, while preserving their performance on regular data. |
Tasks | Data Augmentation, Image Classification, Scene Understanding, Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Shetty_Not_Using_the_Car_to_See_the_Sidewalk_--_Quantifying_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Shetty_Not_Using_the_Car_to_See_the_Sidewalk_--_Quantifying_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/not-using-the-car-to-see-the-sidewalk-1 |
Repo | |
Framework | |
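The object-removal augmentation can be sketched as blanking out all pixels of one object class so the model cannot lean on that context. The paper uses a learned inpainter to fill the hole realistically; the constant fill below is a deliberately crude stand-in, and the class id is invented for the example.

```python
import numpy as np

def remove_object(image, seg_mask, class_id, fill=0):
    """Blank out all pixels of one object class (sketch). The paper fills
    the hole with a learned inpainter; constant fill is a crude stand-in."""
    out = image.copy()
    out[seg_mask == class_id] = fill
    return out

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
mask = np.zeros((64, 64), dtype=np.int64)
mask[10:30, 10:30] = 13                 # pretend class 13 is "car"
aug = remove_object(img, mask, class_id=13)
```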
Proceedings of the Third Workshop on Abusive Language Online
Title | Proceedings of the Third Workshop on Abusive Language Online |
Authors | |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3500/ |
https://www.aclweb.org/anthology/W19-3500 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-third-workshop-on-abusive |
Repo | |
Framework | |