Paper Group NANR 234
Speech Recognition for Tigrinya language Using Deep Neural Network Approach
Title | Speech Recognition for Tigrinya language Using Deep Neural Network Approach |
Authors | Hafte Abera, Sebsibe H/mariam |
Abstract | This work presents a speech recognition model for the Tigrinya language. A Deep Neural Network is used to build the recognition model. The Long Short-Term Memory network (LSTM), a special kind of Recurrent Neural Network composed of Long Short-Term Memory blocks, is the primary layer of our neural network model. The 40-dimensional MFCC-LDA-MLLT-fMLLR features with CMN were used. The acoustic models are trained on features obtained by projecting down to 40 dimensions using linear discriminant analysis (LDA). Moreover, speaker adaptive training (SAT) is done using a single feature-space maximum likelihood linear regression (fMLLR) transform estimated per speaker. We train and compare LSTM and DNN models at various numbers of parameters and configurations. We show that LSTM models converge quickly and give state-of-the-art speech recognition performance for relatively small models. Finally, the accuracy of the model is evaluated based on the recognition rate. |
Tasks | Speech Recognition |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/papers/W/W19/W19-3603/ |
https://www.aclweb.org/anthology/W19-3603 | |
PWC | https://paperswithcode.com/paper/speech-recognition-for-tigrinya-language |
Repo | |
Framework | |
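No repository is listed for this paper. As a point of reference only, here is a minimal PyTorch sketch of the kind of LSTM acoustic model the abstract describes: a stacked LSTM over 40-dimensional fMLLR-style features with a per-frame classifier over acoustic states. All sizes (hidden width, layer count, state inventory) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class LSTMAcousticModel(nn.Module):
    """Maps frames of 40-dim fMLLR-style features to per-frame
    logits over acoustic states (all sizes are illustrative)."""
    def __init__(self, feat_dim=40, hidden=512, layers=3, n_states=2000):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=layers, batch_first=True)
        self.proj = nn.Linear(hidden, n_states)

    def forward(self, feats):            # feats: (batch, frames, 40)
        out, _ = self.lstm(feats)
        return self.proj(out)            # (batch, frames, n_states)

# One training step on dummy data, with per-frame cross-entropy
model = LSTMAcousticModel()
feats = torch.randn(8, 200, 40)          # 8 utterances, 200 frames each
targets = torch.randint(0, 2000, (8, 200))
loss = nn.CrossEntropyLoss()(model(feats).transpose(1, 2), targets)
loss.backward()
```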
Neural Conversation Recommendation with Online Interaction Modeling
Title | Neural Conversation Recommendation with Online Interaction Modeling |
Authors | Xingshan Zeng, Jing Li, Lu Wang, Kam-Fai Wong |
Abstract | The prevalent use of social media leads to a vast amount of online conversations being produced on a daily basis. It presents a concrete challenge for individuals to better discover and engage in social media discussions. In this paper, we present a novel framework to automatically recommend conversations to users based on their prior conversation behaviors. Built on neural collaborative filtering, our model explores deep semantic features that measure how a user's preferences match an ongoing conversation's context. Furthermore, to identify salient characteristics from interleaving user interactions, our model incorporates graph-structured networks, where both replying relations and temporal features are encoded as conversation context. Experimental results on two large-scale datasets collected from Twitter and Reddit show that our model yields better performance than previous state-of-the-art models, which only utilize lexical features and ignore past user interactions in the conversations. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1470/ |
https://www.aclweb.org/anthology/D19-1470 | |
PWC | https://paperswithcode.com/paper/neural-conversation-recommendation-with |
Repo | |
Framework | |
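No repository is listed here either. The sketch below, assuming PyTorch, illustrates the neural-collaborative-filtering core of the idea: score how well a user embedding matches an ongoing conversation's context. The paper's graph-structured encoder over replying relations and temporal features is abbreviated to a mean over turn embeddings; every module name and size is an illustrative stand-in.

```python
import torch
import torch.nn as nn

class ConversationRecommender(nn.Module):
    """NCF-style scorer: does user u's embedding match the context of an
    ongoing conversation? The graph encoder over replying relations is
    abbreviated here to a mean over turn embeddings (illustrative only)."""
    def __init__(self, n_users, turn_dim=64, hidden=128):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, turn_dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * turn_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, user_ids, turn_vecs):   # turn_vecs: (batch, turns, dim)
        conv_ctx = turn_vecs.mean(dim=1)       # stand-in for the graph encoder
        pair = torch.cat([self.user_emb(user_ids), conv_ctx], dim=-1)
        return self.mlp(pair).squeeze(-1)      # higher score = recommend

scores = ConversationRecommender(n_users=1000)(
    torch.tensor([3, 7]), torch.randn(2, 5, 64))
```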
Generative Multi-View Human Action Recognition
Title | Generative Multi-View Human Action Recognition |
Authors | Lichen Wang, Zhengming Ding, Zhiqiang Tao, Yunyu Liu, Yun Fu |
Abstract | Multi-view action recognition aims to integrate complementary information from different views to improve classification performance. It is a challenging task due to the distinct gap between heterogeneous feature domains. Moreover, most existing methods neglect to consider incomplete multi-view data, which limits their applicability in real-world scenarios. In this work, we propose a Generative Multi-View Action Recognition (GMVAR) framework to address the challenges above. An adversarial generative network is leveraged to generate one view conditioned on the other view, which fully explores the latent connections in both intra-view and cross-view aspects. Our approach enhances model robustness by employing adversarial training, and naturally handles the incomplete-view case by imputing the missing data. Moreover, an effective View Correlation Discovery Network (VCDN) is proposed to further fuse the multi-view information in a higher-level label space. Extensive experiments demonstrate the effectiveness of our proposed approach by comparison with state-of-the-art algorithms. |
Tasks | Temporal Action Localization |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Wang_Generative_Multi-View_Human_Action_Recognition_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Wang_Generative_Multi-View_Human_Action_Recognition_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/generative-multi-view-human-action |
Repo | |
Framework | |
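A minimal PyTorch sketch of the cross-view generation idea follows: a generator imputes the missing view's feature from the observed one, a discriminator supplies the adversarial signal, and a classifier fuses both views. The adversarial training loop, the VCDN fusion in label space, and all dimensions are omitted or invented for illustration.

```python
import torch
import torch.nn as nn

# Illustrative feature sizes for two heterogeneous views (not from the paper)
feat_a, feat_b = 128, 96

gen_a2b = nn.Sequential(nn.Linear(feat_a, 256), nn.ReLU(), nn.Linear(256, feat_b))
disc_b  = nn.Sequential(nn.Linear(feat_b, 64), nn.ReLU(), nn.Linear(64, 1))
classifier = nn.Linear(feat_a + feat_b, 10)   # late fusion of both views

x_a = torch.randn(4, feat_a)                  # observed view
x_b_hat = gen_a2b(x_a)                        # imputed missing view
logits = classifier(torch.cat([x_a, x_b_hat], dim=-1))
adv_score = disc_b(x_b_hat)                   # adversarial signal on the fake view
```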
Purchase as Reward: Session-based Recommendation by Imagination Reconstruction
Title | Purchase as Reward: Session-based Recommendation by Imagination Reconstruction |
Authors | Qibing Li, Xiaolin Zheng |
Abstract | One of the key challenges of session-based recommender systems is to enhance users’ purchase intentions. In this paper, we formulate the sequential interactions between user sessions and a recommender agent as a Markov Decision Process (MDP). In practice, the purchase reward is delayed and sparse, and may be buried by clicks, making it an impoverished signal for policy learning. Inspired by prediction error minimization (PEM) and embodied cognition, we propose a simple architecture to augment the reward, namely the Imagination Reconstruction Network (IRN). Specifically, IRN enables the agent to explore its environment and learn predictive representations via three key components. The imagination core generates predicted trajectories, i.e., imagined items that users may purchase. The trajectory manager controls the granularity of imagined trajectories using planning strategies, which balance long-term and short-term rewards. To optimize the action policy, the imagination-augmented executor minimizes the intrinsic imagination error of simulated trajectories by self-supervised reconstruction, while maximizing the extrinsic reward using model-free algorithms. Empirically, IRN promotes quicker adaptation to user interest, and shows improved robustness to the cold-start scenario and ultimately higher purchase performance compared to several baselines. Somewhat surprisingly, IRN using only the purchase reward achieves excellent next-click prediction performance, demonstrating that the agent can “guess what you like” via internal planning. |
Tasks | Recommendation Systems, Session-Based Recommendations |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SkfTIj0cKX |
https://openreview.net/pdf?id=SkfTIj0cKX | |
PWC | https://paperswithcode.com/paper/purchase-as-reward-session-based |
Repo | |
Framework | |
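The reward-augmentation idea can be sketched in a few lines. Below, the sparse extrinsic purchase reward is complemented by an intrinsic signal, here a negative reconstruction error between an imagined trajectory embedding and what the user actually did. The squared-error form, the mixing weight, and the function name are all illustrative guesses, not the paper's exact formulation.

```python
import numpy as np

def augmented_reward(extrinsic, imagined, observed, beta=0.1):
    """Sketch: augment the sparse purchase reward with an intrinsic term,
    the negative reconstruction error between an imagined trajectory
    embedding and the embedding of the user's actual behavior.
    beta and the squared-error form are illustrative choices."""
    intrinsic = -np.mean((imagined - observed) ** 2)
    return extrinsic + beta * intrinsic

r = augmented_reward(extrinsic=0.0,                 # no purchase this step
                     imagined=np.random.rand(16),   # predicted item embedding
                     observed=np.random.rand(16))   # embedding of actual click
```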
Combining Unsupervised Pre-training and Annotator Rationales to Improve Low-shot Text Classification
Title | Combining Unsupervised Pre-training and Annotator Rationales to Improve Low-shot Text Classification |
Authors | Oren Melamud, Mihaela Bornea, Ken Barker |
Abstract | Supervised learning models often perform poorly at low-shot tasks, i.e., tasks for which little labeled data is available for training. One prominent approach for improving low-shot learning is to use unsupervised pre-trained neural models. Another approach is to obtain richer supervision by collecting annotator rationales (explanations supporting label annotations). In this work, we combine these two approaches to improve low-shot text classification with two novel methods: a simple bag-of-words embedding approach; and a more complex context-aware method, based on the BERT model. In experiments with two English text classification datasets, we demonstrate substantial performance gains from combining pre-training with rationales. Furthermore, our investigation of a range of train-set sizes reveals that the simple bag-of-words approach is the clear top performer when there are only a few dozen training instances or fewer, while more complex models, such as BERT or CNN, require more training data to shine. |
Tasks | Text Classification |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1401/ |
https://www.aclweb.org/anthology/D19-1401 | |
PWC | https://paperswithcode.com/paper/combining-unsupervised-pre-training-and |
Repo | |
Framework | |
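A minimal sketch of the bag-of-words variant: average pre-trained word vectors for a document, upweighting tokens that annotators marked as rationales. The specific weighting scheme is an illustrative guess at "combining pre-training with rationales"; the paper's actual method may differ.

```python
import numpy as np

def rationale_bow(tokens, rationale_mask, vectors, weight=3.0):
    """Bag-of-words document embedding from pre-trained word vectors,
    with annotator-rationale tokens upweighted (illustrative scheme)."""
    vecs = np.stack([vectors[t] for t in tokens if t in vectors])
    w = np.array([weight if m else 1.0
                  for t, m in zip(tokens, rationale_mask) if t in vectors])
    return (w[:, None] * vecs).sum(0) / w.sum()

# Toy pre-trained vectors; "great" is the annotator's rationale token
vectors = {w: np.random.rand(50) for w in ["great", "plot", "boring", "acting"]}
emb = rationale_bow(["great", "plot", "boring"], [True, False, False], vectors)
```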
Robustness Certification with Refinement
Title | Robustness Certification with Refinement |
Authors | Gagandeep Singh, Timon Gehr, Markus Püschel, Martin Vechev |
Abstract | We present a novel approach for verification of neural networks which combines scalable over-approximation methods with precise (mixed integer) linear programming. This results in significantly better precision than state-of-the-art verifiers on feedforward neural networks with piecewise linear activation functions. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=HJgeEh09KQ |
https://openreview.net/pdf?id=HJgeEh09KQ | |
PWC | https://paperswithcode.com/paper/robustness-certification-with-refinement |
Repo | |
Framework | |
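To make the "scalable over-approximation" half concrete, here is a minimal NumPy sketch of interval bound propagation through affine + ReLU layers. The MILP refinement that the paper layers on top, which tightens bounds for unstable ReLUs, is deliberately omitted; this is only the cheap outer approximation.

```python
import numpy as np

def interval_bounds(lower, upper, weights, biases):
    """Propagate box bounds through affine + ReLU layers: the cheap
    over-approximation half of the approach. The MILP refinement the
    paper adds on top is omitted. Bounds are taken after each ReLU."""
    for W, b in zip(weights, biases):
        center, radius = (lower + upper) / 2, (upper - lower) / 2
        mid = W @ center + b
        dev = np.abs(W) @ radius
        lower, upper = np.maximum(mid - dev, 0), np.maximum(mid + dev, 0)
    return lower, upper

# Bounds for a tiny one-layer net over a small input box around the origin
W1, b1 = np.random.randn(4, 2), np.zeros(4)
lo, hi = interval_bounds(np.array([-0.1, -0.1]), np.array([0.1, 0.1]),
                         [W1], [b1])
```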
A Preliminary Plains Cree Speech Synthesizer
Title | A Preliminary Plains Cree Speech Synthesizer |
Authors | Atticus Harrigan, Antti Arppe, Timothy Mills |
Abstract | |
Tasks | |
Published | 2019-02-01 |
URL | https://www.aclweb.org/anthology/W19-6009/ |
https://www.aclweb.org/anthology/W19-6009 | |
PWC | https://paperswithcode.com/paper/a-preliminary-plains-cree-speech-synthesizer |
Repo | |
Framework | |
Proceedings of the Second Workshop on Storytelling
Title | Proceedings of the Second Workshop on Storytelling |
Authors | |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3400/ |
https://www.aclweb.org/anthology/W19-3400 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-second-workshop-on-16 |
Repo | |
Framework | |
Learning Latent Global Network for Skeleton-based Action Prediction
Title | Learning Latent Global Network for Skeleton-based Action Prediction |
Authors | Qiuhong Ke, Mohammed Bennamoun, Hossein Rahmani, Senjian An, Ferdous Sohel, Farid Boussaid |
Abstract | Human actions represented with 3D skeleton sequences are robust to cluttered backgrounds and illumination changes. In this paper, we investigate skeleton-based action prediction, which aims to recognize an action from a partial skeleton sequence that contains incomplete action information. We propose a new Latent Global Network based on adversarial learning for action prediction. We demonstrate that the proposed network provides latent long-term global information that is complementary to the local action information of the partial sequences and helps improve action prediction. We show that action prediction can be improved by combining the latent global information with the local action information. We test the proposed method on three challenging skeleton datasets and report state-of-the-art performance. |
Tasks | Skeleton Based Action Recognition |
Published | 2019-09-02 |
URL | https://doi.org/10.1109/TIP.2019.2937757 |
https://eprints.lancs.ac.uk/id/eprint/136503/1/bare_jrnl7.pdf | |
PWC | https://paperswithcode.com/paper/learning-latent-global-network-for-skeleton |
Repo | |
Framework | |
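A rough PyTorch sketch of the local-plus-latent-global idea: a local encoder summarizes the observed partial skeleton sequence, a generator infers a latent global feature for the full action, and the classifier sees both. The adversarial training that shapes the global feature is not shown; every module choice and size here is an illustrative stand-in.

```python
import torch
import torch.nn as nn

# Local encoder over a partial sequence of 25 joints x 3D coordinates
local_enc = nn.GRU(input_size=75, hidden_size=128, batch_first=True)
# Generator inferring a latent "global" feature from the local summary
global_gen = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
classifier = nn.Linear(256, 60)           # e.g. 60 action classes (assumed)

partial = torch.randn(2, 30, 75)          # first 30 frames of each sequence
_, h = local_enc(partial)
local_feat = h[-1]                        # (batch, 128) local summary
global_feat = global_gen(local_feat)      # latent long-term information
logits = classifier(torch.cat([local_feat, global_feat], dim=-1))
```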
Continual Learning via Explicit Structure Learning
Title | Continual Learning via Explicit Structure Learning |
Authors | Xilai Li, Yingbo Zhou, Tianfu Wu, Richard Socher, Caiming Xiong |
Abstract | Despite recent advances in deep learning, neural networks suffer catastrophic forgetting when tasks are learned sequentially. We propose a conceptually simple and general framework for continual learning, where structure optimization is considered explicitly during learning. We implement this idea by separating the structure and parameter learning. During structure learning, the model optimizes for the best structure for the current task. The model learns when to reuse or modify structure from previous tasks, or create new ones when necessary. The model parameters are then estimated with the optimal structure. Empirically, we found that our approach leads to sensible structures when learning multiple tasks continuously. Additionally, catastrophic forgetting is also largely alleviated by the explicit learning of structures. Our method also outperforms all other baselines on the permuted MNIST and split CIFAR datasets in the continual learning setting. |
Tasks | Continual Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=ryxsS3A5Km |
https://openreview.net/pdf?id=ryxsS3A5Km | |
PWC | https://paperswithcode.com/paper/continual-learning-via-explicit-structure |
Repo | |
Framework | |
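The structure-learning step can be illustrated as a per-layer choice among reuse / adapt / new, scored on the current task. The exhaustive enumeration below is only for clarity; the paper optimizes structure far more efficiently, and `evaluate` is a hypothetical stand-in for training with a fixed structure and measuring validation accuracy.

```python
import itertools

def search_structure(layers, options=("reuse", "adapt", "new"), evaluate=None):
    """Sketch: for each layer, choose whether to reuse a previous task's
    weights, adapt them, or create a fresh layer, keeping the configuration
    that scores best on the current task."""
    best, best_score = None, float("-inf")
    for config in itertools.product(options, repeat=len(layers)):
        score = evaluate(dict(zip(layers, config)))
        if score > best_score:
            best, best_score = config, score
    return dict(zip(layers, best))

# Toy evaluator that prefers reusing conv layers and replacing the head
toy = lambda cfg: sum(1 for l, c in cfg.items()
                      if (c == "reuse") == l.startswith("conv"))
print(search_structure(["conv1", "conv2", "fc"], evaluate=toy))
```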
Cross-Task Knowledge Transfer for Query-Based Text Summarization
Title | Cross-Task Knowledge Transfer for Query-Based Text Summarization |
Authors | Elozino Egonmwan, Vittorio Castelli, Md Arafat Sultan |
Abstract | We demonstrate the viability of knowledge transfer between two related tasks: machine reading comprehension (MRC) and query-based text summarization. Using an MRC model trained on the SQuAD1.1 dataset as a core system component, we first build an extractive query-based summarizer. For better precision, this summarizer also compresses the output of the MRC model using a novel sentence compression technique. We further leverage pre-trained machine translation systems to abstract our extracted summaries. Our models achieve state-of-the-art results on the publicly available CNN/Daily Mail and Debatepedia datasets, and can serve as simple yet powerful baselines for future systems. We also hope that these results will encourage research on transfer learning from large MRC corpora to query-based summarization. |
Tasks | Machine Reading Comprehension, Machine Translation, Reading Comprehension, Sentence Compression, Text Summarization, Transfer Learning |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5810/ |
https://www.aclweb.org/anthology/D19-5810 | |
PWC | https://paperswithcode.com/paper/cross-task-knowledge-transfer-for-query-based |
Repo | |
Framework | |
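The extract-by-MRC idea can be sketched with an off-the-shelf SQuAD-trained QA model: score each sentence by how confidently the model finds an answer to the query in it, and keep the top ones. This assumes the Hugging Face `transformers` question-answering pipeline; the paper's own MRC model, sentence-compression step, and MT-based abstraction step are not reproduced here.

```python
from transformers import pipeline

# A SQuAD-trained extractive MRC model (default pipeline checkpoint)
qa = pipeline("question-answering")

def extractive_summary(query, sentences, k=2):
    """Rank sentences by the MRC model's answer confidence for the query
    and return the top-k as an extractive query-based summary (sketch)."""
    scored = [(qa(question=query, context=s)["score"], s) for s in sentences]
    return [s for _, s in sorted(scored, reverse=True)[:k]]

summary = extractive_summary(
    "What caused the outage?",
    ["The outage began at noon.",
     "A failed router update caused the outage.",
     "Service was restored by evening."])
```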
Evaluating Conjunction Disambiguation on English-to-German and French-to-German WMT 2019 Translation Hypotheses
Title | Evaluating Conjunction Disambiguation on English-to-German and French-to-German WMT 2019 Translation Hypotheses |
Authors | Maja Popović |
Abstract | We present a test set for evaluating an MT system's capability to translate ambiguous conjunctions depending on the sentence structure. We concentrate on the English conjunction "but" and its French equivalent "mais", which can be translated into two different German conjunctions. We evaluate all English-to-German and French-to-German submissions to the WMT 2019 shared translation task. The evaluation is done mainly automatically, with additional fast manual inspection of unclear cases. All systems almost perfectly recognise the target conjunction "aber", whereas accuracies for the other target conjunction "sondern" range from 78% to 97%, and the errors are mostly caused by replacing it with the alternative conjunction "aber". The best performing system for both language pairs is the multilingual Transformer "TartuNLP" system trained on all WMT 2019 language pairs which use the Latin script, indicating that the multilingual approach is beneficial for conjunction disambiguation. As for other system features, such as using synthetic back-translated data, context-aware architectures, hybrid systems, etc., no particular (dis)advantages can be observed. Qualitative manual inspection of translation hypotheses showed that highly ranked systems generally produce translations with high adequacy and fluency, meaning that these systems do not merely capture the right conjunction while the rest of the translation hypothesis is poor. On the other hand, the low-ranked systems generally exhibit lower fluency and poor adequacy. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5353/ |
https://www.aclweb.org/anthology/W19-5353 | |
PWC | https://paperswithcode.com/paper/evaluating-conjunction-disambiguation-on |
Repo | |
Framework | |
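The "mainly automatic" evaluation the abstract describes reduces to a simple check: for each source sentence with a known correct German conjunction, does the system's hypothesis contain it? A minimal sketch, with toy tokenization (a real evaluation would tokenize properly and route unclear cases to manual inspection):

```python
def conjunction_accuracy(hypotheses, targets):
    """Fraction of hypotheses containing the gold German conjunction
    ('aber' or 'sondern') for their source sentence (sketch)."""
    hits = sum(1 for hyp, gold in zip(hypotheses, targets)
               if gold in hyp.lower().split())
    return hits / len(targets)

acc = conjunction_accuracy(
    ["er ist nicht klein , sondern groß", "er ist klein , aber stark"],
    ["sondern", "aber"])
print(f"accuracy: {acc:.0%}")   # 100%
```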
Generative Question Answering: Learning to Answer the Whole Question
Title | Generative Question Answering: Learning to Answer the Whole Question |
Authors | Mike Lewis, Angela Fan |
Abstract | Discriminative question answering models can overfit to superficial biases in datasets, because their loss function saturates when any clue makes the answer likely. We introduce generative models of the joint distribution of questions and answers, which are trained to explain the whole question, not just to answer it. Our question answering (QA) model is implemented by learning a prior over answers, and a conditional language model to generate the question given the answer, allowing scalable and interpretable many-hop reasoning as the question is generated word by word. Our model achieves competitive performance with specialised discriminative models on the SQuAD and CLEVR benchmarks, indicating that it is a more general architecture for language understanding and reasoning than previous work. The model greatly improves generalisation both from biased training data and to adversarial testing data, achieving a new state of the art on Adversarial SQuAD. We will release our code. |
Tasks | Language Modelling, Question Answering |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Bkx0RjA9tX |
https://openreview.net/pdf?id=Bkx0RjA9tX | |
PWC | https://paperswithcode.com/paper/generative-question-answering-learning-to |
Repo | |
Framework | |
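Inference in such a generative QA model amounts to picking the answer that best explains the whole question: maximize log p(a) + log p(q | a) over candidate answers. A toy sketch, where `prior` and `cond_lm` are hypothetical stand-ins for the learned answer prior and the answer-conditioned question language model:

```python
import math

def answer(question_tokens, candidates, prior, cond_lm):
    """Pick the answer a maximizing log p(a) + log p(q | a), i.e. the
    answer that best explains the whole question (sketch)."""
    def joint(a):
        return math.log(prior[a]) + sum(
            math.log(cond_lm(tok, a)) for tok in question_tokens)
    return max(candidates, key=joint)

# Toy prior and conditional token model
prior = {"paris": 0.6, "lyon": 0.4}
cond_lm = lambda tok, a: 0.9 if (a == "paris" and tok == "capital") else 0.1
print(answer(["capital", "of", "france"], ["paris", "lyon"], prior, cond_lm))
```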
Not Using the Car to See the Sidewalk – Quantifying and Controlling the Effects of Context in Classification and Segmentation
Title | Not Using the Car to See the Sidewalk – Quantifying and Controlling the Effects of Context in Classification and Segmentation |
Authors | Rakshith Shetty, Bernt Schiele, Mario Fritz |
Abstract | The importance of visual context in scene understanding tasks is well recognized in the computer vision community. However, to what extent computer vision models depend on context to make their predictions is unclear. A model that relies too heavily on context will fail when it encounters objects in contexts different from those in the training data, and hence it is important to identify these dependencies before we can deploy the models in the real world. We propose a method to quantify the sensitivity of black-box vision models to visual context by editing images to remove selected objects and measuring the response of the target models. We apply this methodology to two tasks, image classification and semantic segmentation, and discover undesirable dependencies between objects and context, for example that “sidewalk” segmentation is very sensitive to the presence of “cars” in the image. We propose an object removal based data augmentation solution to mitigate this dependency and increase the robustness of classification and segmentation models to contextual variations. Our experiments show that the proposed data augmentation helps these models improve their performance in out-of-context scenarios, while preserving their performance on regular data. |
Tasks | Data Augmentation, Image Classification, Scene Understanding, Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Shetty_Not_Using_the_Car_to_See_the_Sidewalk_--_Quantifying_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Shetty_Not_Using_the_Car_to_See_the_Sidewalk_--_Quantifying_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/not-using-the-car-to-see-the-sidewalk-1 |
Repo | |
Framework | |
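The object-removal augmentation can be sketched as blanking out all pixels of one object class so the model cannot lean on that context. The paper uses a learned inpainter to fill the hole realistically; the constant fill below is a deliberately crude stand-in, and the class id is invented for the example.

```python
import numpy as np

def remove_object(image, seg_mask, class_id, fill=0):
    """Blank out all pixels of one object class (sketch). The paper fills
    the hole with a learned inpainter; constant fill is a crude stand-in."""
    out = image.copy()
    out[seg_mask == class_id] = fill
    return out

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
mask = np.zeros((64, 64), dtype=np.int64)
mask[10:30, 10:30] = 13                 # pretend class 13 is "car"
aug = remove_object(img, mask, class_id=13)
```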
Proceedings of the Third Workshop on Abusive Language Online
Title | Proceedings of the Third Workshop on Abusive Language Online |
Authors | |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3500/ |
https://www.aclweb.org/anthology/W19-3500 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-third-workshop-on-abusive |
Repo | |
Framework | |