October 16, 2019

2753 words 13 mins read

Paper Group NAWR 9


Faster Neural Networks Straight from JPEG. Textbook Question Answering Under Instructor Guidance With Memory Networks. Distractor Generation for Multiple Choice Questions Using Learning to Rank. Neural Quality Estimation of Grammatical Error Correction. ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection. Learning Word Meta-Embeddings by Autoencoding, and more.

Faster Neural Networks Straight from JPEG

Title Faster Neural Networks Straight from JPEG
Authors Lionel Gueguen, Alex Sergeev, Ben Kadlec, Rosanne Liu, Jason Yosinski
Abstract The simple, elegant approach of training convolutional neural networks (CNNs) directly from RGB pixels has enjoyed overwhelming empirical success. But can more performance be squeezed out of networks by using different input representations? In this paper we propose and explore a simple idea: train CNNs directly on the blockwise discrete cosine transform (DCT) coefficients computed and available in the middle of the JPEG codec. Intuitively, when processing JPEG images using CNNs, it seems unnecessary to decompress a blockwise frequency representation to an expanded pixel representation, shuffle it from CPU to GPU, and then process it with a CNN that will learn something similar to a transform back to frequency representation in its first layers. Why not skip both steps and feed the frequency domain into the network directly? In this paper we modify libjpeg to produce DCT coefficients directly, modify a ResNet-50 network to accommodate the differently sized and strided input, and evaluate performance on ImageNet. We find networks that are both faster and more accurate, as well as networks with about the same accuracy but 1.77x faster than ResNet-50.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7649-faster-neural-networks-straight-from-jpeg
PDF http://papers.nips.cc/paper/7649-faster-neural-networks-straight-from-jpeg.pdf
PWC https://paperswithcode.com/paper/faster-neural-networks-straight-from-jpeg
Repo https://github.com/uber-research/jpeg2dct
Framework tf
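
The released code is TensorFlow, but the core idea is framework-agnostic. Below is a minimal PyTorch sketch, not the authors' exact architecture (the paper evaluates several variants): a JPEG stores one 8x8 DCT block per position, so a 224x224 luma plane arrives as a 28x28 map with 64 frequency channels, which already matches the spatial resolution at the input of ResNet-50's layer3.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class DCTResNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        base = resnet50(num_classes=num_classes)
        # Project 64 DCT frequency channels to the 512 channels layer3
        # expects, keeping the /8 resolution the JPEG blocks already have.
        self.stem = nn.Sequential(
            nn.Conv2d(64, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
        )
        self.layer3, self.layer4 = base.layer3, base.layer4
        self.avgpool, self.fc = base.avgpool, base.fc

    def forward(self, dct_y):                  # (N, 64, H/8, W/8)
        x = self.stem(dct_y)
        x = self.layer4(self.layer3(x))
        return self.fc(self.avgpool(x).flatten(1))

model = DCTResNet()
blocks = torch.randn(2, 64, 28, 28)            # DCT of a 224x224 Y plane
print(model(blocks).shape)                     # torch.Size([2, 1000])
```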

Textbook Question Answering Under Instructor Guidance With Memory Networks

Title Textbook Question Answering Under Instructor Guidance With Memory Networks
Authors Juzheng Li, Hang Su, Jun Zhu, Siyu Wang, Bo Zhang
Abstract Textbook Question Answering (TQA) is the task of choosing the most appropriate answer by reading a multi-modal context of abundant essays and images. TQA serves as a favorable test bed for visual and textual reasoning. However, most current methods are incapable of reasoning over long contexts and images. To address this issue, we propose a novel approach of Instructor Guidance with Memory Networks (IGMN) which conducts the TQA task by finding contradictions between the candidate answers and their corresponding context. We build the Contradiction Entity-Relationship Graph (CERG) to extend the passage-level multi-modal contradictions to an essay level. The machine thus performs as an instructor to extract the essay-level contradictions as the Guidance. Afterwards, we exploit the memory networks to capture the information in the Guidance, and use the attention mechanisms to jointly reason over the global features of the multi-modal input. Extensive experiments demonstrate that our method outperforms the state of the art on the TQA dataset. The source code is available at https://github.com/freerailway/igmn.
Tasks Question Answering
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Textbook_Question_Answering_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Textbook_Question_Answering_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/textbook-question-answering-under-instructor
Repo https://github.com/freerailway/igmn
Framework tf
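
IGMN reads its Guidance with memory networks. The single-hop attention read at the core of any memory network (a generic building block, not the full IGMN pipeline with its contradiction graph) is a few lines of PyTorch:

```python
import torch
import torch.nn.functional as F

def memory_read(query, memory):
    """One memory-network hop: score each memory slot against the
    query, softmax the scores, and return the weighted summary."""
    # query: (B, d), memory: (B, M, d)
    scores = torch.bmm(memory, query.unsqueeze(2)).squeeze(2)   # (B, M)
    weights = F.softmax(scores, dim=1)
    return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)   # (B, d)

q = torch.randn(4, 64)             # question representation
mem = torch.randn(4, 10, 64)       # encoded guidance entries
print(memory_read(q, mem).shape)   # torch.Size([4, 64])
```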

Distractor Generation for Multiple Choice Questions Using Learning to Rank

Title Distractor Generation for Multiple Choice Questions Using Learning to Rank
Authors Chen Liang, Xiao Yang, Neisarg Dave, Drew Wham, Bart Pursel, C. Lee Giles
Abstract We investigate how machine learning models, specifically ranking models, can be used to select useful distractors for multiple choice questions. Our proposed models can learn to select distractors that resemble those in actual exam questions, which is different from most existing unsupervised ontology-based and similarity-based methods. We empirically study feature-based and neural net (NN) based ranking models with experiments on the recently released SciQ dataset and our MCQL dataset. Experimental results show that feature-based ensemble learning methods (random forest and LambdaMART) outperform both the NN-based method and unsupervised baselines. These two datasets can also be used as benchmarks for distractor generation.
Tasks Learning-To-Rank
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-0533/
PDF https://www.aclweb.org/anthology/W18-0533
PWC https://paperswithcode.com/paper/distractor-generation-for-multiple-choice
Repo https://github.com/harrylclc/LTR-DG
Framework tf
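
As a toy illustration of the feature-based pointwise setup (the features and labels below are random stand-ins; the paper trains random forests and LambdaMART on real question-candidate features such as embedding similarity), a random-forest ranker in scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.random((200, 5))            # (question, candidate) features
y_train = rng.integers(0, 2, 200)         # 1 = was a real exam distractor

ranker = RandomForestClassifier(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)

candidates = rng.random((10, 5))          # 10 candidates for one question
scores = ranker.predict_proba(candidates)[:, 1]
print(np.argsort(-scores)[:3])            # top-3 distractors by score
```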

Neural Quality Estimation of Grammatical Error Correction

Title Neural Quality Estimation of Grammatical Error Correction
Authors Shamil Chollampatt, Hwee Tou Ng
Abstract Grammatical error correction (GEC) systems deployed in language learning environments are expected to accurately correct errors in learners' writing. However, in practice, they often produce spurious corrections and fail to correct many errors, thereby misleading learners. This necessitates the estimation of the quality of output sentences produced by GEC systems so that instructors can selectively intervene and re-correct the sentences which are poorly corrected by the system and ensure that learners get accurate feedback. We propose the first neural approach to automatic quality estimation of GEC output sentences that does not employ any hand-crafted features. Our system is trained in a supervised manner on learner sentences and corresponding GEC system outputs with quality score labels computed using human-annotated references. Our neural quality estimation models for GEC show significant improvements over a strong feature-based baseline. We also show that a state-of-the-art GEC system can be improved when quality scores are used as features for re-ranking the N-best candidates.
Tasks Grammatical Error Correction, Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1274/
PDF https://www.aclweb.org/anthology/D18-1274
PWC https://paperswithcode.com/paper/neural-quality-estimation-of-grammatical
Repo https://github.com/nusnlp/neuqe
Framework pytorch
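
A hedged sketch of the supervised setup the abstract describes: encode the learner sentence and the GEC output, then regress a quality score in [0, 1]. The GRU encoders and layer sizes are illustrative choices, not the released NeuQE architecture.

```python
import torch
import torch.nn as nn

class QualityEstimator(nn.Module):
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.src_enc = nn.GRU(dim, dim, batch_first=True)  # learner text
        self.hyp_enc = nn.GRU(dim, dim, batch_first=True)  # GEC output
        self.score = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh(),
                                   nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, src, hyp):               # token-id tensors (B, T)
        _, hs = self.src_enc(self.emb(src))
        _, hh = self.hyp_enc(self.emb(hyp))
        return self.score(torch.cat([hs[-1], hh[-1]], dim=1)).squeeze(1)

model = QualityEstimator(vocab=1000)
src = torch.randint(0, 1000, (4, 12))
hyp = torch.randint(0, 1000, (4, 12))
print(model(src, hyp))                         # quality scores in [0, 1]
```

In training, the targets would be sentence-level quality scores computed against human-annotated references, as the abstract describes.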

ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection

Title ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection
Authors Devamanyu Hazarika, Soujanya Poria, Rada Mihalcea, Erik Cambria, Roger Zimmermann
Abstract Emotion recognition in conversations is crucial for building empathetic machines. Present works in this domain do not explicitly consider the inter-personal influences that thrive in the emotional dynamics of dialogues. To this end, we propose Interactive COnversational memory Network (ICON), a multimodal emotion detection framework that extracts multimodal features from conversational videos and hierarchically models the self- and inter-speaker emotional influences into global memories. Such memories generate contextual summaries which aid in predicting the emotional orientation of utterance-videos. Our model outperforms state-of-the-art networks on multiple classification and regression tasks in two benchmark datasets.
Tasks Emotion Recognition, Emotion Recognition in Context, Emotion Recognition in Conversation, Multimodal Emotion Recognition, Multimodal Sentiment Analysis, Sentiment Analysis
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1280/
PDF https://www.aclweb.org/anthology/D18-1280
PWC https://paperswithcode.com/paper/icon-interactive-conversational-memory
Repo https://github.com/SenticNet/conv-emotion
Framework pytorch
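
A simplified PyTorch sketch of ICON's central idea, speaker-specific memories plus an attentive read; the published model adds multimodal feature extraction and multi-hop memory refinement.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyICON(nn.Module):
    def __init__(self, dim=100, classes=6):
        super().__init__()
        self.self_gru = nn.GRU(dim, dim, batch_first=True)   # own history
        self.other_gru = nn.GRU(dim, dim, batch_first=True)  # partner's
        self.cls = nn.Linear(2 * dim, classes)

    def forward(self, own_hist, other_hist, utt):
        mem_a, _ = self.self_gru(own_hist)      # (B, Ta, d)
        mem_b, _ = self.other_gru(other_hist)   # (B, Tb, d)
        memory = torch.cat([mem_a, mem_b], dim=1)
        # Current utterance attends over the global memory.
        attn = F.softmax(torch.bmm(memory, utt.unsqueeze(2)), dim=1)
        summary = (attn * memory).sum(dim=1)    # contextual summary
        return self.cls(torch.cat([summary, utt], dim=1))

model = TinyICON()
own, other = torch.randn(2, 5, 100), torch.randn(2, 5, 100)
utt = torch.randn(2, 100)                       # utterance to classify
print(model(own, other, utt).shape)             # torch.Size([2, 6])
```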

Learning Word Meta-Embeddings by Autoencoding

Title Learning Word Meta-Embeddings by Autoencoding
Authors Danushka Bollegala, Cong Bao
Abstract Distributed word embeddings have shown superior performance in numerous Natural Language Processing (NLP) tasks. However, their performance varies significantly across different tasks, implying that the word embeddings learnt by those methods capture complementary aspects of lexical semantics. Therefore, we believe that it is important to combine the existing word embeddings to produce more accurate and complete meta-embeddings of words. We model the meta-embedding learning problem as an autoencoding problem, where we would like to learn a meta-embedding space that can accurately reconstruct all source embeddings simultaneously. The meta-embedding space is thereby forced to capture complementary information in different source embeddings via a coherent common embedding space. We propose three flavours of autoencoded meta-embeddings motivated by different requirements that must be satisfied by a meta-embedding. Our experimental results on a series of benchmark evaluations show that the proposed autoencoded meta-embeddings outperform the existing state-of-the-art meta-embeddings in multiple tasks.
Tasks Dependency Parsing, Machine Translation, Part-Of-Speech Tagging, Sentiment Analysis, Word Embeddings
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1140/
PDF https://www.aclweb.org/anthology/C18-1140
PWC https://paperswithcode.com/paper/learning-word-meta-embeddings-by-autoencoding
Repo https://github.com/CongBao/AutoencodedMetaEmbedding
Framework none
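
The autoencoding objective is easy to state concretely. A minimal sketch of one flavour (concatenate two source embeddings, encode to a shared meta space, decode back to each source; dimensions are illustrative):

```python
import torch
import torch.nn as nn

class AEME(nn.Module):
    def __init__(self, d1, d2, dm):
        super().__init__()
        self.enc = nn.Linear(d1 + d2, dm)            # to meta space
        self.dec1, self.dec2 = nn.Linear(dm, d1), nn.Linear(dm, d2)

    def forward(self, e1, e2):
        m = torch.tanh(self.enc(torch.cat([e1, e2], dim=1)))
        return m, self.dec1(m), self.dec2(m)         # meta + reconstructions

model = AEME(d1=300, d2=200, dm=256)
e1, e2 = torch.randn(8, 300), torch.randn(8, 200)    # two source embeddings
meta, r1, r2 = model(e1, e2)
loss = nn.functional.mse_loss(r1, e1) + nn.functional.mse_loss(r2, e2)
loss.backward()       # train the meta space to reconstruct both sources
```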

Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks

Title Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks
Authors Jiawei Zhang, Jinshan Pan, Jimmy Ren, Yibing Song, Linchao Bao, Rynson W.H. Lau, Ming-Hsuan Yang
Abstract Due to the spatially variant blur caused by camera shake and object motions under different scene depths, deblurring images captured from dynamic scenes is challenging. Although recent works based on deep neural networks have shown great progress on this problem, their models are usually large and computationally expensive. In this paper, we propose a novel spatially variant neural network to address the problem. The proposed network is composed of three deep convolutional neural networks (CNNs) and a recurrent neural network (RNN). The RNN acts as a deconvolution operator applied to feature maps extracted from the input image by one of the CNNs. Another CNN learns the weights for the RNN at every location, so the RNN is spatially variant and can implicitly model the deblurring process with spatially variant kernels. The third CNN reconstructs the final deblurred feature maps into the restored image. The whole network is end-to-end trainable. Our analysis shows that the proposed network has a large receptive field even with a small model size. Quantitative and qualitative evaluations on public datasets demonstrate that the proposed method performs favorably against state-of-the-art algorithms in terms of accuracy, speed, and model size.
Tasks Deblurring
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Dynamic_Scene_Deblurring_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Dynamic_Scene_Deblurring_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/dynamic-scene-deblurring-using-spatially
Repo https://github.com/zhjwustc/cvpr18_rnn_deblur_matcaffe
Framework none
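
The key mechanism is a recurrent sweep whose transition weights differ per pixel. A one-directional toy version (the paper uses four directional RNNs inside a CNN-RNN-CNN pipeline, with the per-pixel gates predicted by a separate CNN):

```python
import torch

def spatially_variant_scan(feat, gate):
    """Horizontal recurrent sweep with a per-pixel transition weight:
    the gate blends the incoming hidden state with the current feature
    at every location. feat, gate: (B, C, H, W)."""
    h = torch.zeros_like(feat[..., 0])
    out = []
    for x in range(feat.shape[-1]):               # left-to-right scan
        h = gate[..., x] * h + (1 - gate[..., x]) * feat[..., x]
        out.append(h)
    return torch.stack(out, dim=-1)

feat = torch.randn(1, 16, 8, 8)
gate = torch.sigmoid(torch.randn(1, 16, 8, 8))    # per-pixel RNN weights
print(spatially_variant_scan(feat, gate).shape)   # torch.Size([1, 16, 8, 8])
```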

End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space

Title End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space
Authors Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia
Abstract We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn "distributional similarity" in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space. To validate our hypothesis, we focus on the "image" side of image captioning, and vary the input image representation but keep the RNN text generation model of a CNN-RNN constant. Our analysis indicates that image captioning models (i) are capable of separating structure from noisy input representations; (ii) experience virtually no significant performance loss when a high dimensional representation is compressed to a lower dimensional space; (iii) cluster images with similar visual and linguistic information together. Our experiments all point to one fact: that our distributional similarity hypothesis holds. We conclude that, regardless of the image representation, image captioning systems seem to match images and generate captions in a learned joint image-text semantic subspace.
Tasks Image Captioning, Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5455/
PDF https://www.aclweb.org/anthology/W18-5455
PWC https://paperswithcode.com/paper/end-to-end-image-captioning-exploits
Repo https://github.com/sheffieldnlp/whatIC
Framework none
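
One way to probe the hypothesis (a toy sketch with random stand-in features, not the paper's experimental protocol): retrieve a test image's nearest training images in the visual feature space and compare their captions to the generated one.

```python
import numpy as np

def nearest_training_images(test_feat, train_feats, k=3):
    """Cosine-similarity retrieval: if captioning works by distributional
    similarity, the generated caption should resemble the captions of
    the test image's nearest training neighbours."""
    sims = train_feats @ test_feat / (
        np.linalg.norm(train_feats, axis=1) * np.linalg.norm(test_feat))
    return np.argsort(-sims)[:k]

rng = np.random.default_rng(1)
train = rng.standard_normal((1000, 512))   # e.g. pooled CNN features
test = rng.standard_normal(512)
print(nearest_training_images(test, train))
```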

A Novel Deterministic Framework for Non-probabilistic Recommender Systems

Title A Novel Deterministic Framework for Non-probabilistic Recommender Systems
Authors Avinash Bhat, Divya Madhav Kamath, Anitha C
Abstract Recommendation is a technique that helps a user by suggesting relevant items from a large information space. Current techniques for this purpose include non-probabilistic methods like content-based filtering and collaborative filtering (CF) and probabilistic methods like Bayesian inference and case-based reasoning. CF algorithms use similarity measures to calculate the similarity between users. In this paper, we propose a novel framework which deterministically switches between CF algorithms based on sparsity to improve the accuracy of recommendation.
Tasks Bayesian Inference, Recommendation Systems
Published 2018-09-02
URL https://link.springer.com/chapter/10.1007/978-981-13-1498-8_8
PDF https://www.researchgate.net/publication/327389858_A_Novel_Deterministic_Framework_for_Non-probabilistic_Recommender_Systems_Proceedings_of_IEMIS_2018_Volume_2
PWC https://paperswithcode.com/paper/a-novel-deterministic-framework-for-non
Repo https://github.com/avinashbhat/similarity-item-based
Framework none
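
A hedged sketch of the deterministic switch (the 0.9 sparsity threshold and cosine similarity are illustrative choices, not necessarily the paper's):

```python
import numpy as np

def cosine_sim(M):
    norm = np.linalg.norm(M, axis=1, keepdims=True) + 1e-9
    return (M / norm) @ (M / norm).T

def predict(ratings, user, threshold=0.9):
    """Switch on matrix sparsity: very sparse data falls back to
    item-based CF, denser data uses user-based CF."""
    sparsity = 1.0 - np.count_nonzero(ratings) / ratings.size
    if sparsity > threshold:                   # item-based CF
        sim = cosine_sim(ratings.T)
        return ratings[user] @ sim / (np.abs(sim).sum(0) + 1e-9)
    sim = cosine_sim(ratings)                  # user-based CF
    return sim[user] @ ratings / (np.abs(sim[user]).sum() + 1e-9)

R = np.random.default_rng(2).integers(0, 6, (20, 15)).astype(float)
print(predict(R, user=0).round(2))             # predicted score per item
```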

Detecting Decision Ambiguity from Facial Images

Title Detecting Decision Ambiguity from Facial Images
Authors Pavel Jahoda, Antonin Vobecky, Jan Cech, Jiri Matas
Abstract In situations when potentially costly decisions are being made, faces of people tend to reflect a level of certainty about the appropriateness of the chosen decision. This fact is known from the psychological literature. In the paper, we propose a method that uses facial images for automatic detection of the decision ambiguity state of a subject. To train and test the method, we collected a large-scale dataset from "Who Wants to Be a Millionaire?", a popular TV game show. The videos provide examples of various mental states of contestants, including uncertainty, doubts and hesitation. The annotation of the videos is done automatically from on-screen graphics. The problem of detecting decision ambiguity is formulated as binary classification: video clips where a contestant asks for help (audience, friend, 50:50) are considered positive samples, while clips where the contestant answers directly are negative samples. We propose a baseline method combining a deep convolutional neural network with an SVM. The method has an error rate of 24%. The error rate of human volunteers on the same dataset is 45%, close to chance.
Tasks
Published 2018-05-15
URL https://ieeexplore.ieee.org/document/8373873
PDF https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8373873
PWC https://paperswithcode.com/paper/detecting-decision-ambiguity-from-facial
Repo https://github.com/JahodaPaul/DecisionAmbiguityRecognition
Framework tf
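
A sketch of the second stage of the proposed baseline (random vectors stand in for the face CNN's pooled per-clip features):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
feats = rng.standard_normal((300, 128))   # per-clip pooled CNN features
labels = rng.integers(0, 2, 300)          # 1 = asked for help, 0 = answered

clf = SVC(kernel="rbf", C=1.0).fit(feats[:250], labels[:250])
acc = (clf.predict(feats[250:]) == labels[250:]).mean()
print(f"held-out accuracy: {acc:.2f}")    # ~chance on random features
```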

LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs

Title LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs
Authors Daniel Kondratyuk, Tomáš Gavenčiak, Milan Straka, Jan Hajič
Abstract We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using bidirectional RNNs with character-level and word-level embeddings. We demonstrate that both tasks benefit from sharing the encoding part of the network, predicting tag subcategories, and using the tagger output as an input to the lemmatizer. We evaluate our model across several languages with complex morphology, surpassing state-of-the-art accuracy in both part-of-speech tagging and lemmatization in Czech, German, and Arabic.
Tasks Lemmatization, Machine Translation, Part-Of-Speech Tagging, Semantic Role Labeling, Sentiment Analysis
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1532/
PDF https://www.aclweb.org/anthology/D18-1532
PWC https://paperswithcode.com/paper/lemmatag-jointly-tagging-and-lemmatizing-for-1
Repo https://github.com/hyperparticle/LemmaTag
Framework tf
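
A toy PyTorch sketch of the shared-encoder idea (sizes are illustrative, and the lemma head here predicts edit-script classes rather than the paper's character-level decoder):

```python
import torch
import torch.nn as nn

class TinyLemmaTag(nn.Module):
    def __init__(self, vocab, tags, lemma_ops, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.tag_head = nn.Linear(2 * dim, tags)
        # The tagger output feeds the lemmatizer, as in the paper.
        self.lem_head = nn.Linear(2 * dim + tags, lemma_ops)

    def forward(self, tokens):                  # (B, T)
        h, _ = self.enc(self.emb(tokens))       # shared encoder (B, T, 2d)
        tag_logits = self.tag_head(h)
        lem_logits = self.lem_head(torch.cat([h, tag_logits], dim=-1))
        return tag_logits, lem_logits

model = TinyLemmaTag(vocab=5000, tags=40, lemma_ops=200)
tags, lemmas = model(torch.randint(0, 5000, (2, 7)))
print(tags.shape, lemmas.shape)   # (2, 7, 40) (2, 7, 200)
```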

Clustering for Binary Featured Datasets

Title Clustering for Binary Featured Datasets
Authors Peter Taraba
Abstract Clustering is one of the most important concepts for unsupervised learning in machine learning. While there are numerous clustering algorithms already, many, including the popular k-means algorithm, require the number of clusters to be specified in advance, a significant drawback. Some studies use the silhouette coefficient to determine the optimal number of clusters. In this study, we introduce a novel algorithm called Powered Outer Probabilistic Clustering, show how it works through back-propagation (starting with many clusters and ending with an optimal number of clusters), and show that the algorithm converges to the expected (optimal) number of clusters on theoretical examples.
Tasks
Published 2018-10-25
URL https://link.springer.com/chapter/10.1007/978-981-13-2191-7_10
PDF https://books.google.com/books?id=ANF0DwAAQBAJ&pg=PA127#v=onepage&q&f=false
PWC https://paperswithcode.com/paper/clustering-for-binary-featured-datasets
Repo https://github.com/pepe78/Small-Bang
Framework none
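
For contrast, the silhouette-based selection of the cluster count that the abstract mentions (a common baseline, not the proposed algorithm) is easy to reproduce on binary-featured data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(4)
X = rng.integers(0, 2, (200, 16)).astype(float)   # binary features

best_k, best_s = None, -1.0
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    s = silhouette_score(X, labels)               # higher = better k
    if s > best_s:
        best_k, best_s = k, s
print(best_k, round(best_s, 3))
```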

Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network

Title Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network
Authors Xiangyang Zhou, Lu Li, Daxiang Dong, Yi Liu, Ying Chen, Wayne Xin Zhao, Dianhai Yu, Hua Wu
Abstract Humans generate responses relying on semantic and functional dependencies, including coreference relations, among dialogue elements and their context. In this paper, we investigate matching a response with its multi-turn context using dependency information based entirely on attention. Our solution is inspired by the recently proposed Transformer in machine translation (Vaswani et al., 2017) and we extend the attention mechanism in two ways. First, we construct representations of text segments at different granularities solely with stacked self-attention. Second, we try to extract the truly matched segment pairs with attention across the context and response. We jointly introduce those two kinds of attention in one uniform neural network. Experiments on two large-scale multi-turn response selection tasks show that our proposed model significantly outperforms the state-of-the-art models.
Tasks Chatbot, Conversational Response Selection
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1103/
PDF https://www.aclweb.org/anthology/P18-1103
PWC https://paperswithcode.com/paper/multi-turn-response-selection-for-chatbots
Repo https://github.com/baidu/Dialogue
Framework tf
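
The building block DAM stacks is standard scaled dot-product attention (Vaswani et al., 2017), used both within a text (self-attention) and across context and response (cross-attention). A minimal version:

```python
import torch
import torch.nn.functional as F

def attentive_module(query, key, value):
    """Scaled dot-product attention over batched sequences."""
    d = query.shape[-1]
    scores = torch.bmm(query, key.transpose(1, 2)) / d ** 0.5
    return torch.bmm(F.softmax(scores, dim=-1), value)

utt = torch.randn(2, 10, 64)      # utterance token representations
resp = torch.randn(2, 8, 64)      # response token representations
self_att = attentive_module(utt, utt, utt)     # within-text granularity
cross_att = attentive_module(resp, utt, utt)   # response-context matching
print(self_att.shape, cross_att.shape)
```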

Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations

Title Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations
Authors Jianmo Ni, Julian McAuley
Abstract In this paper, we focus on the problem of building assistive systems that can help users to write reviews. We cast this problem using an encoder-decoder framework that generates personalized reviews by expanding short phrases (e.g. review summaries, product titles) provided as input to the system. We incorporate aspect-level information via an aspect encoder that learns aspect-aware user and item representations. An attention fusion layer is applied to control generation by attending on the outputs of multiple encoders. Experimental results show that our model successfully learns representations capable of generating coherent and diverse reviews. In addition, the learned aspect-aware representations discover those aspects that users are more inclined to discuss and bias the generated text toward their personalized aspect preferences.
Tasks Recommendation Systems, Sentiment Analysis, Text Generation
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2112/
PDF https://www.aclweb.org/anthology/P18-2112
PWC https://paperswithcode.com/paper/personalized-review-generation-by-expanding
Repo https://github.com/nijianmo/textExpansion
Framework pytorch
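
A hedged sketch of the attention fusion layer described in the abstract (mean-fusing the per-encoder context vectors is an illustrative simplification):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """The decoder state attends over each encoder's outputs (phrase,
    aspect, user/item, ...) and the summaries are fused into one
    context vector used for generation."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, dec_state, encoder_outputs):
        ctx = []
        for enc in encoder_outputs:                # each: (B, T_i, dim)
            scores = torch.bmm(enc, self.proj(dec_state).unsqueeze(2))
            ctx.append((F.softmax(scores, dim=1) * enc).sum(1))
        return torch.stack(ctx, 0).mean(0)         # fused context (B, dim)

fuse = AttentionFusion(32)
dec = torch.randn(4, 32)
encs = [torch.randn(4, 7, 32), torch.randn(4, 5, 32)]
print(fuse(dec, encs).shape)                       # torch.Size([4, 32])
```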

Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings

Title Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings
Authors Junjie Li, Haitong Yang, Chengqing Zong
Abstract Document-level multi-aspect sentiment classification aims to predict a user's sentiment polarities for different aspects of a product in a review. Existing approaches mainly focus on text information. However, the authors (i.e. users) and overall ratings of reviews are ignored, and we show both to be significant for interpreting the sentiments of different aspects. Therefore, we propose a model called Hierarchical User Aspect Rating Network (HUARN) to consider user preference and overall ratings jointly. Specifically, HUARN adopts a hierarchical architecture to encode word, sentence, and document level information. Then, user attention and aspect attention are introduced into building sentence and document level representation. The document representation is combined with user and overall rating information to predict aspect ratings of a review. Diverse aspects are treated differently and a multi-task framework is adopted. Empirical results on two real-world datasets show that HUARN achieves state-of-the-art performance.
Tasks Multi-Task Learning, Sentiment Analysis
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1079/
PDF https://www.aclweb.org/anthology/C18-1079
PWC https://paperswithcode.com/paper/document-level-multi-aspect-sentiment
Repo https://github.com/Junjieli0704/HUARN
Framework pytorch
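
A sketch of the user- and aspect-conditioned attention pooling at the heart of HUARN (additively combining the user and aspect queries is an illustrative simplification):

```python
import torch
import torch.nn.functional as F

def user_aspect_attention(hidden, user_vec, aspect_vec):
    """Pool sentence (or word) states with weights conditioned on both
    the user and the aspect embedding, so the same review is read
    differently per aspect. hidden: (B, T, d); vectors: (B, d)."""
    query = user_vec + aspect_vec
    scores = torch.bmm(hidden, query.unsqueeze(2)).squeeze(2)   # (B, T)
    w = F.softmax(scores, dim=1)
    return torch.bmm(w.unsqueeze(1), hidden).squeeze(1)         # (B, d)

h = torch.randn(3, 9, 50)                      # sentence-level states
u, a = torch.randn(3, 50), torch.randn(3, 50)  # user, aspect embeddings
print(user_aspect_attention(h, u, a).shape)    # torch.Size([3, 50])
```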