October 16, 2019

2753 words 13 mins read

Paper Group NAWR 9


Faster Neural Networks Straight from JPEG. Textbook Question Answering Under Instructor Guidance With Memory Networks. Distractor Generation for Multiple Choice Questions Using Learning to Rank. Neural Quality Estimation of Grammatical Error Correction. ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection. Learning Word Meta-Embeddings by Autoencoding, and more.

Faster Neural Networks Straight from JPEG

Title Faster Neural Networks Straight from JPEG
Authors Lionel Gueguen, Alex Sergeev, Ben Kadlec, Rosanne Liu, Jason Yosinski
Abstract The simple, elegant approach of training convolutional neural networks (CNNs) directly from RGB pixels has enjoyed overwhelming empirical success. But can more performance be squeezed out of networks by using different input representations? In this paper we propose and explore a simple idea: train CNNs directly on the blockwise discrete cosine transform (DCT) coefficients computed and available in the middle of the JPEG codec. Intuitively, when processing JPEG images using CNNs, it seems unnecessary to decompress a blockwise frequency representation to an expanded pixel representation, shuffle it from CPU to GPU, and then process it with a CNN that will learn something similar to a transform back to frequency representation in its first layers. Why not skip both steps and feed the frequency domain into the network directly? In this paper we modify libjpeg to produce DCT coefficients directly, modify a ResNet-50 network to accommodate the differently sized and strided input, and evaluate performance on ImageNet. We find networks that are both faster and more accurate, as well as networks with about the same accuracy but 1.77x faster than ResNet-50.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7649-faster-neural-networks-straight-from-jpeg
PDF http://papers.nips.cc/paper/7649-faster-neural-networks-straight-from-jpeg.pdf
PWC https://paperswithcode.com/paper/faster-neural-networks-straight-from-jpeg
Repo https://github.com/uber-research/jpeg2dct
Framework tf
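
The released code is TensorFlow, but the core idea is framework-agnostic. Below is a minimal PyTorch sketch, not the authors' exact architecture (the paper evaluates several variants): a JPEG stores one 8x8 DCT block per position, so a 224x224 luma plane arrives as a 28x28 map with 64 frequency channels, which already matches the spatial resolution at the input of ResNet-50's layer3.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class DCTResNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        base = resnet50(num_classes=num_classes)
        # Project 64 DCT frequency channels to the 512 channels layer3
        # expects, keeping the /8 resolution the JPEG blocks already have.
        self.stem = nn.Sequential(
            nn.Conv2d(64, 512, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(512),
            nn.ReLU(inplace=True),
        )
        self.layer3, self.layer4 = base.layer3, base.layer4
        self.avgpool, self.fc = base.avgpool, base.fc

    def forward(self, dct_y):                  # (N, 64, H/8, W/8)
        x = self.stem(dct_y)
        x = self.layer4(self.layer3(x))
        return self.fc(self.avgpool(x).flatten(1))

model = DCTResNet()
blocks = torch.randn(2, 64, 28, 28)            # DCT of a 224x224 Y plane
print(model(blocks).shape)                     # torch.Size([2, 1000])
```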

Textbook Question Answering Under Instructor Guidance With Memory Networks

Title Textbook Question Answering Under Instructor Guidance With Memory Networks
Authors Juzheng Li, Hang Su, Jun Zhu, Siyu Wang, Bo Zhang
Abstract Textbook Question Answering (TQA) is the task of choosing the most appropriate answer by reading a multi-modal context of abundant essays and images. TQA serves as a favorable test bed for visual and textual reasoning. However, most current methods are incapable of reasoning over long contexts and images. To address this issue, we propose a novel approach of Instructor Guidance with Memory Networks (IGMN) which conducts the TQA task by finding contradictions between the candidate answers and their corresponding context. We build the Contradiction Entity-Relationship Graph (CERG) to extend the passage-level multi-modal contradictions to an essay level. The machine thus performs as an instructor to extract the essay-level contradictions as the Guidance. Afterwards, we exploit the memory networks to capture the information in the Guidance, and use the attention mechanisms to jointly reason over the global features of the multi-modal input. Extensive experiments demonstrate that our method outperforms the state of the art on the TQA dataset. The source code is available at https://github.com/freerailway/igmn.
Tasks Question Answering
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Textbook_Question_Answering_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Textbook_Question_Answering_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/textbook-question-answering-under-instructor
Repo https://github.com/freerailway/igmn
Framework tf
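
IGMN reads its Guidance with memory networks. The single-hop attention read at the core of any memory network (a generic building block, not the full IGMN pipeline with its contradiction graph) is a few lines of PyTorch:

```python
import torch
import torch.nn.functional as F

def memory_read(query, memory):
    """One memory-network hop: score each memory slot against the
    query, softmax the scores, and return the weighted summary."""
    # query: (B, d), memory: (B, M, d)
    scores = torch.bmm(memory, query.unsqueeze(2)).squeeze(2)   # (B, M)
    weights = F.softmax(scores, dim=1)
    return torch.bmm(weights.unsqueeze(1), memory).squeeze(1)   # (B, d)

q = torch.randn(4, 64)             # question representation
mem = torch.randn(4, 10, 64)       # encoded guidance entries
print(memory_read(q, mem).shape)   # torch.Size([4, 64])
```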

Distractor Generation for Multiple Choice Questions Using Learning to Rank

Title Distractor Generation for Multiple Choice Questions Using Learning to Rank
Authors Chen Liang, Xiao Yang, Neisarg Dave, Drew Wham, Bart Pursel, C. Lee Giles
Abstract We investigate how machine learning models, specifically ranking models, can be used to select useful distractors for multiple choice questions. Our proposed models can learn to select distractors that resemble those in actual exam questions, which is different from most existing unsupervised ontology-based and similarity-based methods. We empirically study feature-based and neural net (NN) based ranking models with experiments on the recently released SciQ dataset and our MCQL dataset. Experimental results show that feature-based ensemble learning methods (random forest and LambdaMART) outperform both the NN-based method and unsupervised baselines. These two datasets can also be used as benchmarks for distractor generation.
Tasks Learning-To-Rank
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-0533/
PDF https://www.aclweb.org/anthology/W18-0533
PWC https://paperswithcode.com/paper/distractor-generation-for-multiple-choice
Repo https://github.com/harrylclc/LTR-DG
Framework tf
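
As a toy illustration of the feature-based pointwise setup (the features and labels below are random stand-ins; the paper trains random forests and LambdaMART on real question-candidate features such as embedding similarity), a random-forest ranker in scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.random((200, 5))            # (question, candidate) features
y_train = rng.integers(0, 2, 200)         # 1 = was a real exam distractor

ranker = RandomForestClassifier(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)

candidates = rng.random((10, 5))          # 10 candidates for one question
scores = ranker.predict_proba(candidates)[:, 1]
print(np.argsort(-scores)[:3])            # top-3 distractors by score
```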

Neural Quality Estimation of Grammatical Error Correction

Title Neural Quality Estimation of Grammatical Error Correction
Authors Shamil Chollampatt, Hwee Tou Ng
Abstract Grammatical error correction (GEC) systems deployed in language learning environments are expected to accurately correct errors in learners' writing. However, in practice, they often produce spurious corrections and fail to correct many errors, thereby misleading learners. This necessitates the estimation of the quality of output sentences produced by GEC systems so that instructors can selectively intervene and re-correct the sentences which are poorly corrected by the system and ensure that learners get accurate feedback. We propose the first neural approach to automatic quality estimation of GEC output sentences that does not employ any hand-crafted features. Our system is trained in a supervised manner on learner sentences and corresponding GEC system outputs with quality score labels computed using human-annotated references. Our neural quality estimation models for GEC show significant improvements over a strong feature-based baseline. We also show that a state-of-the-art GEC system can be improved when quality scores are used as features for re-ranking the N-best candidates.
Tasks Grammatical Error Correction, Machine Translation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1274/
PDF https://www.aclweb.org/anthology/D18-1274
PWC https://paperswithcode.com/paper/neural-quality-estimation-of-grammatical
Repo https://github.com/nusnlp/neuqe
Framework pytorch
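
A hedged sketch of the supervised setup the abstract describes: encode the learner sentence and the GEC output, then regress a quality score in [0, 1]. The GRU encoders and layer sizes are illustrative choices, not the released NeuQE architecture.

```python
import torch
import torch.nn as nn

class QualityEstimator(nn.Module):
    def __init__(self, vocab, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.src_enc = nn.GRU(dim, dim, batch_first=True)  # learner text
        self.hyp_enc = nn.GRU(dim, dim, batch_first=True)  # GEC output
        self.score = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh(),
                                   nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, src, hyp):               # token-id tensors (B, T)
        _, hs = self.src_enc(self.emb(src))
        _, hh = self.hyp_enc(self.emb(hyp))
        return self.score(torch.cat([hs[-1], hh[-1]], dim=1)).squeeze(1)

model = QualityEstimator(vocab=1000)
src = torch.randint(0, 1000, (4, 12))
hyp = torch.randint(0, 1000, (4, 12))
print(model(src, hyp))                         # quality scores in [0, 1]
```

In training, the targets would be sentence-level quality scores computed against human-annotated references, as the abstract describes.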

ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection

Title ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection
Authors Devamanyu Hazarika, Soujanya Poria, Rada Mihalcea, Erik Cambria, Roger Zimmermann
Abstract Emotion recognition in conversations is crucial for building empathetic machines. Present works in this domain do not explicitly consider the inter-personal influences that thrive in the emotional dynamics of dialogues. To this end, we propose Interactive COnversational memory Network (ICON), a multimodal emotion detection framework that extracts multimodal features from conversational videos and hierarchically models the self- and inter-speaker emotional influences into global memories. Such memories generate contextual summaries which aid in predicting the emotional orientation of utterance-videos. Our model outperforms state-of-the-art networks on multiple classification and regression tasks in two benchmark datasets.
Tasks Emotion Recognition, Emotion Recognition in Context, Emotion Recognition in Conversation, Multimodal Emotion Recognition, Multimodal Sentiment Analysis, Sentiment Analysis
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1280/
PDF https://www.aclweb.org/anthology/D18-1280
PWC https://paperswithcode.com/paper/icon-interactive-conversational-memory
Repo https://github.com/SenticNet/conv-emotion
Framework pytorch
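
A simplified PyTorch sketch of ICON's central idea, speaker-specific memories plus an attentive read; the published model adds multimodal feature extraction and multi-hop memory refinement.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyICON(nn.Module):
    def __init__(self, dim=100, classes=6):
        super().__init__()
        self.self_gru = nn.GRU(dim, dim, batch_first=True)   # own history
        self.other_gru = nn.GRU(dim, dim, batch_first=True)  # partner's
        self.cls = nn.Linear(2 * dim, classes)

    def forward(self, own_hist, other_hist, utt):
        mem_a, _ = self.self_gru(own_hist)      # (B, Ta, d)
        mem_b, _ = self.other_gru(other_hist)   # (B, Tb, d)
        memory = torch.cat([mem_a, mem_b], dim=1)
        # Current utterance attends over the global memory.
        attn = F.softmax(torch.bmm(memory, utt.unsqueeze(2)), dim=1)
        summary = (attn * memory).sum(dim=1)    # contextual summary
        return self.cls(torch.cat([summary, utt], dim=1))

model = TinyICON()
own, other = torch.randn(2, 5, 100), torch.randn(2, 5, 100)
utt = torch.randn(2, 100)                       # utterance to classify
print(model(own, other, utt).shape)             # torch.Size([2, 6])
```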

Learning Word Meta-Embeddings by Autoencoding

Title Learning Word Meta-Embeddings by Autoencoding
Authors Danushka Bollegala, Cong Bao
Abstract Distributed word embeddings have shown superior performance in numerous Natural Language Processing (NLP) tasks. However, their performance varies significantly across different tasks, implying that the word embeddings learnt by those methods capture complementary aspects of lexical semantics. Therefore, we believe that it is important to combine the existing word embeddings to produce more accurate and complete meta-embeddings of words. We model the meta-embedding learning problem as an autoencoding problem, where we would like to learn a meta-embedding space that can accurately reconstruct all source embeddings simultaneously. The meta-embedding space is thereby forced to capture complementary information in different source embeddings via a coherent common embedding space. We propose three flavours of autoencoded meta-embeddings motivated by different requirements that must be satisfied by a meta-embedding. Our experimental results on a series of benchmark evaluations show that the proposed autoencoded meta-embeddings outperform the existing state-of-the-art meta-embeddings in multiple tasks.
Tasks Dependency Parsing, Machine Translation, Part-Of-Speech Tagging, Sentiment Analysis, Word Embeddings
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1140/
PDF https://www.aclweb.org/anthology/C18-1140
PWC https://paperswithcode.com/paper/learning-word-meta-embeddings-by-autoencoding
Repo https://github.com/CongBao/AutoencodedMetaEmbedding
Framework none
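
The autoencoding objective is easy to state concretely. A minimal sketch of one flavour (concatenate two source embeddings, encode to a shared meta space, decode back to each source; dimensions are illustrative):

```python
import torch
import torch.nn as nn

class AEME(nn.Module):
    def __init__(self, d1, d2, dm):
        super().__init__()
        self.enc = nn.Linear(d1 + d2, dm)            # to meta space
        self.dec1, self.dec2 = nn.Linear(dm, d1), nn.Linear(dm, d2)

    def forward(self, e1, e2):
        m = torch.tanh(self.enc(torch.cat([e1, e2], dim=1)))
        return m, self.dec1(m), self.dec2(m)         # meta + reconstructions

model = AEME(d1=300, d2=200, dm=256)
e1, e2 = torch.randn(8, 300), torch.randn(8, 200)    # two source embeddings
meta, r1, r2 = model(e1, e2)
loss = nn.functional.mse_loss(r1, e1) + nn.functional.mse_loss(r2, e2)
loss.backward()       # train the meta space to reconstruct both sources
```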

Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks

Title Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks
Authors Jiawei Zhang, Jinshan Pan, Jimmy Ren, Yibing Song, Linchao Bao, Rynson W.H. Lau, Ming-Hsuan Yang
Abstract Due to the spatially variant blur caused by camera shake and object motions under different scene depths, deblurring images captured from dynamic scenes is challenging. Although recent works based on deep neural networks have shown great progress on this problem, their models are usually large and computationally expensive. In this paper, we propose a novel spatially variant neural network to address the problem. The proposed network is composed of three deep convolutional neural networks (CNNs) and a recurrent neural network (RNN). The RNN acts as a deconvolution operator applied to feature maps extracted from the input image by one of the CNNs. Another CNN learns the weights for the RNN at every location, so the RNN is spatially variant and can implicitly model the deblurring process with spatially variant kernels. The third CNN reconstructs the final deblurred feature maps into the restored image. The whole network is end-to-end trainable. Our analysis shows that the proposed network has a large receptive field even with a small model size. Quantitative and qualitative evaluations on public datasets demonstrate that the proposed method performs favorably against state-of-the-art algorithms in terms of accuracy, speed, and model size.
Tasks Deblurring
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Dynamic_Scene_Deblurring_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Dynamic_Scene_Deblurring_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/dynamic-scene-deblurring-using-spatially
Repo https://github.com/zhjwustc/cvpr18_rnn_deblur_matcaffe
Framework none
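
The key mechanism is a recurrent sweep whose transition weights differ per pixel. A one-directional toy version (the paper uses four directional RNNs inside a CNN-RNN-CNN pipeline, with the per-pixel gates predicted by a separate CNN):

```python
import torch

def spatially_variant_scan(feat, gate):
    """Horizontal recurrent sweep with a per-pixel transition weight:
    the gate blends the incoming hidden state with the current feature
    at every location. feat, gate: (B, C, H, W)."""
    h = torch.zeros_like(feat[..., 0])
    out = []
    for x in range(feat.shape[-1]):               # left-to-right scan
        h = gate[..., x] * h + (1 - gate[..., x]) * feat[..., x]
        out.append(h)
    return torch.stack(out, dim=-1)

feat = torch.randn(1, 16, 8, 8)
gate = torch.sigmoid(torch.randn(1, 16, 8, 8))    # per-pixel RNN weights
print(spatially_variant_scan(feat, gate).shape)   # torch.Size([1, 16, 8, 8])
```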

End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space

Title End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space
Authors Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia
Abstract We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn "distributional similarity" in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space. To validate our hypothesis, we focus on the "image" side of image captioning, and vary the input image representation but keep the RNN text generation model of a CNN-RNN constant. Our analysis indicates that image captioning models (i) are capable of separating structure from noisy input representations; (ii) experience virtually no significant performance loss when a high dimensional representation is compressed to a lower dimensional space; (iii) cluster images with similar visual and linguistic information together. Our experiments all point to one fact: that our distributional similarity hypothesis holds. We conclude that, regardless of the image representation, image captioning systems seem to match images and generate captions in a learned joint image-text semantic subspace.
Tasks Image Captioning, Text Generation
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5455/
PDF https://www.aclweb.org/anthology/W18-5455
PWC https://paperswithcode.com/paper/end-to-end-image-captioning-exploits
Repo https://github.com/sheffieldnlp/whatIC
Framework none
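
One way to probe the hypothesis (a toy sketch with random stand-in features, not the paper's experimental protocol): retrieve a test image's nearest training images in the visual feature space and compare their captions to the generated one.

```python
import numpy as np

def nearest_training_images(test_feat, train_feats, k=3):
    """Cosine-similarity retrieval: if captioning works by distributional
    similarity, the generated caption should resemble the captions of
    the test image's nearest training neighbours."""
    sims = train_feats @ test_feat / (
        np.linalg.norm(train_feats, axis=1) * np.linalg.norm(test_feat))
    return np.argsort(-sims)[:k]

rng = np.random.default_rng(1)
train = rng.standard_normal((1000, 512))   # e.g. pooled CNN features
test = rng.standard_normal(512)
print(nearest_training_images(test, train))
```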

A Novel Deterministic Framework for Non-probabilistic Recommender Systems

Title A Novel Deterministic Framework for Non-probabilistic Recommender Systems
Authors Avinash Bhat, Divya Madhav Kamath, Anitha C
Abstract Recommendation is a technique that helps a user by suggesting relevant items from a large information space. Current techniques for this purpose include non-probabilistic methods like content-based filtering and collaborative filtering (CF) and probabilistic methods like Bayesian inference and case-based reasoning. CF algorithms use similarity measures to calculate the similarity between users. In this paper, we propose a novel framework which deterministically switches between CF algorithms based on sparsity to improve the accuracy of recommendation.
Tasks Bayesian Inference, Recommendation Systems
Published 2018-09-02
URL https://link.springer.com/chapter/10.1007/978-981-13-1498-8_8
PDF https://www.researchgate.net/publication/327389858_A_Novel_Deterministic_Framework_for_Non-probabilistic_Recommender_Systems_Proceedings_of_IEMIS_2018_Volume_2
PWC https://paperswithcode.com/paper/a-novel-deterministic-framework-for-non
Repo https://github.com/avinashbhat/similarity-item-based
Framework none
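
A hedged sketch of the deterministic switch (the 0.9 sparsity threshold and cosine similarity are illustrative choices, not necessarily the paper's):

```python
import numpy as np

def cosine_sim(M):
    norm = np.linalg.norm(M, axis=1, keepdims=True) + 1e-9
    return (M / norm) @ (M / norm).T

def predict(ratings, user, threshold=0.9):
    """Switch on matrix sparsity: very sparse data falls back to
    item-based CF, denser data uses user-based CF."""
    sparsity = 1.0 - np.count_nonzero(ratings) / ratings.size
    if sparsity > threshold:                   # item-based CF
        sim = cosine_sim(ratings.T)
        return ratings[user] @ sim / (np.abs(sim).sum(0) + 1e-9)
    sim = cosine_sim(ratings)                  # user-based CF
    return sim[user] @ ratings / (np.abs(sim[user]).sum() + 1e-9)

R = np.random.default_rng(2).integers(0, 6, (20, 15)).astype(float)
print(predict(R, user=0).round(2))             # predicted score per item
```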

Detecting Decision Ambiguity from Facial Images

Title Detecting Decision Ambiguity from Facial Images
Authors Pavel Jahoda, Antonin Vobecky, Jan Cech, Jiri Matas
Abstract In situations when potentially costly decisions are being made, faces of people tend to reflect a level of certainty about the appropriateness of the chosen decision. This fact is known from the psychological literature. In the paper, we propose a method that uses facial images for automatic detection of the decision ambiguity state of a subject. To train and test the method, we collected a large-scale dataset from "Who Wants to Be a Millionaire?", a popular TV game show. The videos provide examples of various mental states of contestants, including uncertainty, doubts and hesitation. The annotation of the videos is done automatically from on-screen graphics. The problem of detecting decision ambiguity is formulated as binary classification: video clips where a contestant asks for help (audience, friend, 50:50) are considered positive samples, while clips where the contestant answers directly are negative samples. We propose a baseline method combining a deep convolutional neural network with an SVM. The method has an error rate of 24%. The error rate of human volunteers on the same dataset is 45%, close to chance.
Tasks
Published 2018-05-15
URL https://ieeexplore.ieee.org/document/8373873
PDF https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8373873
PWC https://paperswithcode.com/paper/detecting-decision-ambiguity-from-facial
Repo https://github.com/JahodaPaul/DecisionAmbiguityRecognition
Framework tf
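
A sketch of the second stage of the proposed baseline (random vectors stand in for the face CNN's pooled per-clip features):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
feats = rng.standard_normal((300, 128))   # per-clip pooled CNN features
labels = rng.integers(0, 2, 300)          # 1 = asked for help, 0 = answered

clf = SVC(kernel="rbf", C=1.0).fit(feats[:250], labels[:250])
acc = (clf.predict(feats[250:]) == labels[250:]).mean()
print(f"held-out accuracy: {acc:.2f}")    # ~chance on random features
```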

LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs

Title LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs
Authors Daniel Kondratyuk, Tomáš Gavenčiak, Milan Straka, Jan Hajič
Abstract We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using bidirectional RNNs with character-level and word-level embeddings. We demonstrate that both tasks benefit from sharing the encoding part of the network, predicting tag subcategories, and using the tagger output as an input to the lemmatizer. We evaluate our model across several languages with complex morphology, surpassing state-of-the-art accuracy in both part-of-speech tagging and lemmatization in Czech, German, and Arabic.
Tasks Lemmatization, Machine Translation, Part-Of-Speech Tagging, Semantic Role Labeling, Sentiment Analysis
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1532/
PDF https://www.aclweb.org/anthology/D18-1532
PWC https://paperswithcode.com/paper/lemmatag-jointly-tagging-and-lemmatizing-for-1
Repo https://github.com/hyperparticle/LemmaTag
Framework tf
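
A toy PyTorch sketch of the shared-encoder idea (sizes are illustrative, and the lemma head here predicts edit-script classes rather than the paper's character-level decoder):

```python
import torch
import torch.nn as nn

class TinyLemmaTag(nn.Module):
    def __init__(self, vocab, tags, lemma_ops, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.tag_head = nn.Linear(2 * dim, tags)
        # The tagger output feeds the lemmatizer, as in the paper.
        self.lem_head = nn.Linear(2 * dim + tags, lemma_ops)

    def forward(self, tokens):                  # (B, T)
        h, _ = self.enc(self.emb(tokens))       # shared encoder (B, T, 2d)
        tag_logits = self.tag_head(h)
        lem_logits = self.lem_head(torch.cat([h, tag_logits], dim=-1))
        return tag_logits, lem_logits

model = TinyLemmaTag(vocab=5000, tags=40, lemma_ops=200)
tags, lemmas = model(torch.randint(0, 5000, (2, 7)))
print(tags.shape, lemmas.shape)   # (2, 7, 40) (2, 7, 200)
```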

Clustering for Binary Featured Datasets

Title Clustering for Binary Featured Datasets
Authors Peter Taraba
Abstract Clustering is one of the most important concepts for unsupervised learning in machine learning. While there are numerous clustering algorithms already, many, including the popular k-means algorithm, require the number of clusters to be specified in advance, a significant drawback. Some studies use the silhouette coefficient to determine the optimal number of clusters. In this study, we introduce a novel algorithm called Powered Outer Probabilistic Clustering, show how it works through back-propagation (starting with many clusters and ending with an optimal number of clusters), and show that the algorithm converges to the expected (optimal) number of clusters on theoretical examples.
Tasks
Published 2018-10-25
URL https://link.springer.com/chapter/10.1007/978-981-13-2191-7_10
PDF https://books.google.com/books?id=ANF0DwAAQBAJ&pg=PA127#v=onepage&q&f=false
PWC https://paperswithcode.com/paper/clustering-for-binary-featured-datasets
Repo https://github.com/pepe78/Small-Bang
Framework none
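
For contrast, the silhouette-based selection of the cluster count that the abstract mentions (a common baseline, not the proposed algorithm) is easy to reproduce on binary-featured data:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(4)
X = rng.integers(0, 2, (200, 16)).astype(float)   # binary features

best_k, best_s = None, -1.0
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    s = silhouette_score(X, labels)               # higher = better k
    if s > best_s:
        best_k, best_s = k, s
print(best_k, round(best_s, 3))
```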

Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network

Title Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network
Authors Xiangyang Zhou, Lu Li, Daxiang Dong, Yi Liu, Ying Chen, Wayne Xin Zhao, Dianhai Yu, Hua Wu
Abstract Humans generate responses relying on semantic and functional dependencies, including coreference relations, among dialogue elements and their context. In this paper, we investigate matching a response with its multi-turn context using dependency information based entirely on attention. Our solution is inspired by the recently proposed Transformer in machine translation (Vaswani et al., 2017) and we extend the attention mechanism in two ways. First, we construct representations of text segments at different granularities solely with stacked self-attention. Second, we try to extract the truly matched segment pairs with attention across the context and response. We jointly introduce those two kinds of attention in one uniform neural network. Experiments on two large-scale multi-turn response selection tasks show that our proposed model significantly outperforms the state-of-the-art models.
Tasks Chatbot, Conversational Response Selection
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1103/
PDF https://www.aclweb.org/anthology/P18-1103
PWC https://paperswithcode.com/paper/multi-turn-response-selection-for-chatbots
Repo https://github.com/baidu/Dialogue
Framework tf
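
The building block DAM stacks is standard scaled dot-product attention (Vaswani et al., 2017), used both within a text (self-attention) and across context and response (cross-attention). A minimal version:

```python
import torch
import torch.nn.functional as F

def attentive_module(query, key, value):
    """Scaled dot-product attention over batched sequences."""
    d = query.shape[-1]
    scores = torch.bmm(query, key.transpose(1, 2)) / d ** 0.5
    return torch.bmm(F.softmax(scores, dim=-1), value)

utt = torch.randn(2, 10, 64)      # utterance token representations
resp = torch.randn(2, 8, 64)      # response token representations
self_att = attentive_module(utt, utt, utt)     # within-text granularity
cross_att = attentive_module(resp, utt, utt)   # response-context matching
print(self_att.shape, cross_att.shape)
```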

Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations

Title Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations
Authors Jianmo Ni, Julian McAuley
Abstract In this paper, we focus on the problem of building assistive systems that can help users to write reviews. We cast this problem using an encoder-decoder framework that generates personalized reviews by expanding short phrases (e.g. review summaries, product titles) provided as input to the system. We incorporate aspect-level information via an aspect encoder that learns aspect-aware user and item representations. An attention fusion layer is applied to control generation by attending on the outputs of multiple encoders. Experimental results show that our model successfully learns representations capable of generating coherent and diverse reviews. In addition, the learned aspect-aware representations discover those aspects that users are more inclined to discuss and bias the generated text toward their personalized aspect preferences.
Tasks Recommendation Systems, Sentiment Analysis, Text Generation
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2112/
PDF https://www.aclweb.org/anthology/P18-2112
PWC https://paperswithcode.com/paper/personalized-review-generation-by-expanding
Repo https://github.com/nijianmo/textExpansion
Framework pytorch
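
A hedged sketch of the attention fusion layer described in the abstract (mean-fusing the per-encoder context vectors is an illustrative simplification):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """The decoder state attends over each encoder's outputs (phrase,
    aspect, user/item, ...) and the summaries are fused into one
    context vector used for generation."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, dec_state, encoder_outputs):
        ctx = []
        for enc in encoder_outputs:                # each: (B, T_i, dim)
            scores = torch.bmm(enc, self.proj(dec_state).unsqueeze(2))
            ctx.append((F.softmax(scores, dim=1) * enc).sum(1))
        return torch.stack(ctx, 0).mean(0)         # fused context (B, dim)

fuse = AttentionFusion(32)
dec = torch.randn(4, 32)
encs = [torch.randn(4, 7, 32), torch.randn(4, 5, 32)]
print(fuse(dec, encs).shape)                       # torch.Size([4, 32])
```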

Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings

Title Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings
Authors Junjie Li, Haitong Yang, Chengqing Zong
Abstract Document-level multi-aspect sentiment classification aims to predict a user's sentiment polarities for different aspects of a product in a review. Existing approaches mainly focus on text information. However, the authors (i.e. users) and overall ratings of reviews are ignored, and we show both to be significant for interpreting the sentiments of different aspects. Therefore, we propose a model called Hierarchical User Aspect Rating Network (HUARN) to consider user preference and overall ratings jointly. Specifically, HUARN adopts a hierarchical architecture to encode word, sentence, and document level information. Then, user attention and aspect attention are introduced into building sentence and document level representation. The document representation is combined with user and overall rating information to predict aspect ratings of a review. Diverse aspects are treated differently and a multi-task framework is adopted. Empirical results on two real-world datasets show that HUARN achieves state-of-the-art performance.
Tasks Multi-Task Learning, Sentiment Analysis
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1079/
PDF https://www.aclweb.org/anthology/C18-1079
PWC https://paperswithcode.com/paper/document-level-multi-aspect-sentiment
Repo https://github.com/Junjieli0704/HUARN
Framework pytorch
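
A sketch of the user- and aspect-conditioned attention pooling at the heart of HUARN (additively combining the user and aspect queries is an illustrative simplification):

```python
import torch
import torch.nn.functional as F

def user_aspect_attention(hidden, user_vec, aspect_vec):
    """Pool sentence (or word) states with weights conditioned on both
    the user and the aspect embedding, so the same review is read
    differently per aspect. hidden: (B, T, d); vectors: (B, d)."""
    query = user_vec + aspect_vec
    scores = torch.bmm(hidden, query.unsqueeze(2)).squeeze(2)   # (B, T)
    w = F.softmax(scores, dim=1)
    return torch.bmm(w.unsqueeze(1), hidden).squeeze(1)         # (B, d)

h = torch.randn(3, 9, 50)                      # sentence-level states
u, a = torch.randn(3, 50), torch.randn(3, 50)  # user, aspect embeddings
print(user_aspect_attention(h, u, a).shape)    # torch.Size([3, 50])
```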