Paper Group NAWR 9
Faster Neural Networks Straight from JPEG. Textbook Question Answering Under Instructor Guidance With Memory Networks. Distractor Generation for Multiple Choice Questions Using Learning to Rank. Neural Quality Estimation of Grammatical Error Correction. ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection. Learning Word Meta-Embeddings by Autoencoding. Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks. End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space. A Novel Deterministic Framework for Non-probabilistic Recommender Systems. Detecting Decision Ambiguity from Facial Images. LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs. Clustering for Binary Featured Datasets. Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations. Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings.
Faster Neural Networks Straight from JPEG
Title | Faster Neural Networks Straight from JPEG |
Authors | Lionel Gueguen, Alex Sergeev, Ben Kadlec, Rosanne Liu, Jason Yosinski |
Abstract | The simple, elegant approach of training convolutional neural networks (CNNs) directly from RGB pixels has enjoyed overwhelming empirical success. But can more performance be squeezed out of networks by using different input representations? In this paper we propose and explore a simple idea: train CNNs directly on the blockwise discrete cosine transform (DCT) coefficients computed and available in the middle of the JPEG codec. Intuitively, when processing JPEG images using CNNs, it seems unnecessary to decompress a blockwise frequency representation to an expanded pixel representation, shuffle it from CPU to GPU, and then process it with a CNN that will learn something similar to a transform back to frequency representation in its first layers. Why not skip both steps and feed the frequency domain into the network directly? In this paper we modify libjpeg to produce DCT coefficients directly, modify a ResNet-50 network to accommodate the differently sized and strided input, and evaluate performance on ImageNet. We find networks that are both faster and more accurate, as well as networks with about the same accuracy but 1.77x faster than ResNet-50. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7649-faster-neural-networks-straight-from-jpeg |
http://papers.nips.cc/paper/7649-faster-neural-networks-straight-from-jpeg.pdf | |
PWC | https://paperswithcode.com/paper/faster-neural-networks-straight-from-jpeg |
Repo | https://github.com/uber-research/jpeg2dct |
Framework | tf |
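The linked repo exposes the DCT coefficients straight from the codec. A minimal sketch of loading them with jpeg2dct's numpy interface, assuming the API documented in the repo README:

```python
# Read blockwise DCT coefficients directly from a JPEG file via the
# companion jpeg2dct library (https://github.com/uber-research/jpeg2dct).
# Function name and return values follow the repo README; treat the exact
# signature as an assumption.
from jpeg2dct.numpy import load

# dct_y holds one 64-dim coefficient vector per 8x8 luma block; the chroma
# planes (dct_cb, dct_cr) are smaller when the file uses chroma subsampling.
dct_y, dct_cb, dct_cr = load('image.jpg')
print(dct_y.shape, dct_cb.shape, dct_cr.shape)
```

These arrays can then feed a network whose early stages are adapted to the smaller, strided frequency-domain input, as the paper does with ResNet-50.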
Textbook Question Answering Under Instructor Guidance With Memory Networks
Title | Textbook Question Answering Under Instructor Guidance With Memory Networks |
Authors | Juzheng Li, Hang Su, Jun Zhu, Siyu Wang, Bo Zhang |
Abstract | Textbook Question Answering (TQA) is the task of choosing the most appropriate answer by reading a multi-modal context of abundant essays and images. TQA serves as a favorable test bed for visual and textual reasoning. However, most of the current methods are incapable of reasoning over the long contexts and images. To address this issue, we propose a novel approach of Instructor Guidance with Memory Networks (IGMN) which conducts the TQA task by finding contradictions between the candidate answers and their corresponding context. We build the Contradiction Entity-Relationship Graph (CERG) to extend the passage-level multi-modal contradictions to an essay level. The machine thus performs as an instructor to extract the essay-level contradictions as the Guidance. Afterwards, we exploit the memory networks to capture the information in the Guidance, and use the attention mechanisms to jointly reason over the global features of the multi-modal input. Extensive experiments demonstrate that our method outperforms state-of-the-art methods on the TQA dataset. The source code is available at https://github.com/freerailway/igmn. |
Tasks | Question Answering |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Li_Textbook_Question_Answering_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Li_Textbook_Question_Answering_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/textbook-question-answering-under-instructor |
Repo | https://github.com/freerailway/igmn |
Framework | tf |
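The linked repository contains the full IGMN. As a rough illustration of the memory-network machinery the abstract relies on, here is the generic end-to-end memory read step (a hedged sketch, not the authors' model):

```python
import numpy as np

def memnet_hop(query, memory):
    """One end-to-end memory-network hop: attend over memory slots with the
    query, then add the attention-weighted read-out back to the query.
    query: (d,) vector; memory: (n_slots, d) matrix."""
    scores = memory @ query                  # (n_slots,)
    p = np.exp(scores - scores.max())
    p /= p.sum()                             # softmax attention weights
    read = p @ memory                        # weighted sum of memory slots
    return query + read                      # updated controller state

rng = np.random.default_rng(0)
state = memnet_hop(rng.normal(size=8), rng.normal(size=(5, 8)))
```

In IGMN the memory holds the extracted Guidance, and repeated hops of this kind jointly reason over the multi-modal features.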
Distractor Generation for Multiple Choice Questions Using Learning to Rank
Title | Distractor Generation for Multiple Choice Questions Using Learning to Rank |
Authors | Chen Liang, Xiao Yang, Neisarg Dave, Drew Wham, Bart Pursel, C. Lee Giles |
Abstract | We investigate how machine learning models, specifically ranking models, can be used to select useful distractors for multiple choice questions. Our proposed models can learn to select distractors that resemble those in actual exam questions, which is different from most existing unsupervised ontology-based and similarity-based methods. We empirically study feature-based and neural net (NN) based ranking models with experiments on the recently released SciQ dataset and our MCQL dataset. Experimental results show that feature-based ensemble learning methods (random forest and LambdaMART) outperform both the NN-based method and unsupervised baselines. These two datasets can also be used as benchmarks for distractor generation. |
Tasks | Learning-To-Rank |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0533/ |
https://www.aclweb.org/anthology/W18-0533 | |
PWC | https://paperswithcode.com/paper/distractor-generation-for-multiple-choice |
Repo | https://github.com/harrylclc/LTR-DG |
Framework | tf |
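The best-performing models in the paper are feature-based ensembles such as LambdaMART. A hedged sketch of the same ranking setup using LightGBM's lambdarank objective (features and labels below are illustrative stand-ins, not the paper's feature set):

```python
import numpy as np
import lightgbm as lgb

# Toy setup: 2 questions, each with 5 candidate distractors described by
# 4 hypothetical features (e.g. embedding similarity, POS match, frequency);
# labels mark which candidates appeared as actual exam distractors.
X = np.random.rand(10, 4)
y = np.array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1])
group = [5, 5]  # number of candidates per question

ranker = lgb.LGBMRanker(objective='lambdarank', n_estimators=50,
                        min_child_samples=1)  # relaxed for this tiny example
ranker.fit(X, y, group=group)
ranked = np.argsort(-ranker.predict(X[:5]))   # candidate order for question 1
```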
Neural Quality Estimation of Grammatical Error Correction
Title | Neural Quality Estimation of Grammatical Error Correction |
Authors | Shamil Chollampatt, Hwee Tou Ng |
Abstract | Grammatical error correction (GEC) systems deployed in language learning environments are expected to accurately correct errors in learners' writing. However, in practice, they often produce spurious corrections and fail to correct many errors, thereby misleading learners. This necessitates the estimation of the quality of output sentences produced by GEC systems so that instructors can selectively intervene and re-correct the sentences which are poorly corrected by the system and ensure that learners get accurate feedback. We propose the first neural approach to automatic quality estimation of GEC output sentences that does not employ any hand-crafted features. Our system is trained in a supervised manner on learner sentences and corresponding GEC system outputs with quality score labels computed using human-annotated references. Our neural quality estimation models for GEC show significant improvements over a strong feature-based baseline. We also show that a state-of-the-art GEC system can be improved when quality scores are used as features for re-ranking the N-best candidates. |
Tasks | Grammatical Error Correction, Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1274/ |
https://www.aclweb.org/anthology/D18-1274 | |
PWC | https://paperswithcode.com/paper/neural-quality-estimation-of-grammatical |
Repo | https://github.com/nusnlp/neuqe |
Framework | pytorch |
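As a toy illustration of the re-ranking use at the end of the abstract, the sketch below adds a predicted quality score to the decoder score of each N-best hypothesis; the linear interpolation and its weight are assumptions, not the paper's exact formulation:

```python
def rerank_nbest(candidates, weight=1.0):
    """Pick the best GEC hypothesis by combining the decoder score with a
    predicted quality score. candidates: list of
    (hypothesis, decoder_score, quality_score) tuples; weight is tuned."""
    return max(candidates, key=lambda c: c[1] + weight * c[2])

best, _, _ = rerank_nbest([("He go home.", -1.2, 0.3),
                           ("He goes home.", -1.4, 0.9)])
print(best)  # the higher-quality correction wins despite a lower decoder score
```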
ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection
Title | ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection |
Authors | Devamanyu Hazarika, Soujanya Poria, Rada Mihalcea, Erik Cambria, Roger Zimmermann |
Abstract | Emotion recognition in conversations is crucial for building empathetic machines. Present works in this domain do not explicitly consider the inter-personal influences that thrive in the emotional dynamics of dialogues. To this end, we propose Interactive COnversational memory Network (ICON), a multimodal emotion detection framework that extracts multimodal features from conversational videos and hierarchically models the self- and inter-speaker emotional influences into global memories. Such memories generate contextual summaries which aid in predicting the emotional orientation of utterance-videos. Our model outperforms state-of-the-art networks on multiple classification and regression tasks in two benchmark datasets. |
Tasks | Emotion Recognition, Emotion Recognition in Context, Emotion Recognition in Conversation, Multimodal Emotion Recognition, Multimodal Sentiment Analysis, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1280/ |
https://www.aclweb.org/anthology/D18-1280 | |
PWC | https://paperswithcode.com/paper/icon-interactive-conversational-memory |
Repo | https://github.com/SenticNet/conv-emotion |
Framework | pytorch |
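A hedged sketch of the core idea, not the authors' ICON: separate recurrent readers model the self- and inter-speaker histories, their states form one memory, and the current utterance queries it with attention:

```python
import torch
import torch.nn as nn

class TwoStreamMemory(nn.Module):
    """Illustrative only: GRUs over self- and inter-speaker utterance
    features build a memory; attention from the current utterance produces
    a contextual summary for emotion prediction."""
    def __init__(self, d):
        super().__init__()
        self.self_gru = nn.GRU(d, d, batch_first=True)
        self.other_gru = nn.GRU(d, d, batch_first=True)

    def forward(self, self_hist, other_hist, utt):
        m_self, _ = self.self_gru(self_hist)      # (B, Ts, d)
        m_other, _ = self.other_gru(other_hist)   # (B, To, d)
        memory = torch.cat([m_self, m_other], 1)  # (B, Ts+To, d)
        attn = torch.softmax(memory @ utt.unsqueeze(-1), dim=1)
        return (attn * memory).sum(1)             # context summary (B, d)

m = TwoStreamMemory(16)
ctx = m(torch.randn(2, 4, 16), torch.randn(2, 3, 16), torch.randn(2, 16))
```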
Learning Word Meta-Embeddings by Autoencoding
Title | Learning Word Meta-Embeddings by Autoencoding |
Authors | Danushka Bollegala, Cong Bao |
Abstract | Distributed word embeddings have shown superior performances in numerous Natural Language Processing (NLP) tasks. However, their performances vary significantly across different tasks, implying that the word embeddings learnt by those methods capture complementary aspects of lexical semantics. Therefore, we believe that it is important to combine the existing word embeddings to produce more accurate and complete meta-embeddings of words. We model the meta-embedding learning problem as an autoencoding problem, where we would like to learn a meta-embedding space that can accurately reconstruct all source embeddings simultaneously. Thereby, the meta-embedding space is enforced to capture complementary information in different source embeddings via a coherent common embedding space. We propose three flavours of autoencoded meta-embeddings motivated by different requirements that must be satisfied by a meta-embedding. Our experimental results on a series of benchmark evaluations show that the proposed autoencoded meta-embeddings outperform the existing state-of-the-art meta-embeddings in multiple tasks. |
Tasks | Dependency Parsing, Machine Translation, Part-Of-Speech Tagging, Sentiment Analysis, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1140/ |
https://www.aclweb.org/anthology/C18-1140 | |
PWC | https://paperswithcode.com/paper/learning-word-meta-embeddings-by-autoencoding |
Repo | https://github.com/CongBao/AutoencodedMetaEmbedding |
Framework | none |
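A minimal sketch of one autoencoding flavour in the spirit of the paper: the concatenation of two source embeddings is encoded into a meta-embedding that must reconstruct both sources simultaneously (dimensions and architecture details are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaEmbedAE(nn.Module):
    """Encode two source embeddings into one meta-embedding space and decode
    back to each source; training minimizes both reconstruction errors."""
    def __init__(self, d1, d2, dm):
        super().__init__()
        self.enc = nn.Linear(d1 + d2, dm)
        self.dec1 = nn.Linear(dm, d1)
        self.dec2 = nn.Linear(dm, d2)

    def forward(self, e1, e2):
        meta = torch.tanh(self.enc(torch.cat([e1, e2], dim=-1)))
        return meta, self.dec1(meta), self.dec2(meta)

model = MetaEmbedAE(d1=300, d2=200, dm=256)
e1, e2 = torch.randn(8, 300), torch.randn(8, 200)
meta, r1, r2 = model(e1, e2)
loss = F.mse_loss(r1, e1) + F.mse_loss(r2, e2)  # joint reconstruction loss
```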
Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks
Title | Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks |
Authors | Jiawei Zhang, Jinshan Pan, Jimmy Ren, Yibing Song, Linchao Bao, Rynson W.H. Lau, Ming-Hsuan Yang |
Abstract | Due to the spatially variant blur caused by camera shake and object motions under different scene depths, deblurring images captured from dynamic scenes is challenging. Although recent works based on deep neural networks have shown great progress on this problem, their models are usually large and computationally expensive. In this paper, we propose a novel spatially variant neural network to address the problem. The proposed network is composed of three deep convolutional neural networks (CNNs) and a recurrent neural network (RNN). The RNN is used as a deconvolution operator on feature maps extracted from the input image by one of the CNNs. Another CNN is used to learn the weights for the RNN at every location. As a result, the RNN is spatially variant and can implicitly model the deblurring process with spatially variant kernels. The third CNN is used to reconstruct the final deblurred feature maps into a restored image. The whole network is end-to-end trainable. Our analysis shows that the proposed network has a large receptive field even with a small model size. Quantitative and qualitative evaluations on public datasets demonstrate that the proposed method performs favorably against state-of-the-art algorithms in terms of accuracy, speed, and model size. |
Tasks | Deblurring |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_Dynamic_Scene_Deblurring_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Dynamic_Scene_Deblurring_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-scene-deblurring-using-spatially |
Repo | https://github.com/zhjwustc/cvpr18_rnn_deblur_matcaffe |
Framework | none |
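A 1-D sketch of the central mechanism; the exact recurrence form is an assumption for illustration. A recurrent filter mixes the running state with the input using a weight that varies per position, as if predicted by the weight-learning CNN:

```python
import numpy as np

def spatially_variant_scan(x, w):
    """Run a recurrent filter h[t] = w[t] * h[t-1] + (1 - w[t]) * x[t] along
    one row. In the paper, per-position weights come from a CNN and the scan
    runs in several directions over feature maps, giving each output a large,
    spatially varying receptive field."""
    h = np.zeros_like(x)
    acc = 0.0
    for t in range(len(x)):
        acc = w[t] * acc + (1.0 - w[t]) * x[t]
        h[t] = acc
    return h

out = spatially_variant_scan(np.random.rand(16), np.random.rand(16))
```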
End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space
Title | End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space |
Authors | Pranava Swaroop Madhyastha, Josiah Wang, Lucia Specia |
Abstract | We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn 'distributional similarity' in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space. To validate our hypothesis, we focus on the 'image' side of image captioning, and vary the input image representation but keep the RNN text generation model of a CNN-RNN constant. Our analysis indicates that image captioning models (i) are capable of separating structure from noisy input representations; (ii) experience virtually no significant performance loss when a high dimensional representation is compressed to a lower dimensional space; (iii) cluster images with similar visual and linguistic information together. Our experiments all point to one fact: that our distributional similarity hypothesis holds. We conclude that, regardless of the image representation, image captioning systems seem to match images and generate captions in a learned joint image-text semantic subspace. |
Tasks | Image Captioning, Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5455/ |
https://www.aclweb.org/anthology/W18-5455 | |
PWC | https://paperswithcode.com/paper/end-to-end-image-captioning-exploits |
Repo | https://github.com/sheffieldnlp/whatIC |
Framework | none |
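The hypothesis itself is easy to state in code: a sketch of matching a test image to its nearest training images in a learned feature space, from which a caption would then be generated:

```python
import numpy as np

def nearest_training_images(test_feat, train_feats, k=5):
    """Return indices of the k training images closest to the test image
    under cosine similarity in the (multimodal) feature space."""
    a = test_feat / np.linalg.norm(test_feat)
    B = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    return np.argsort(-(B @ a))[:k]

idx = nearest_training_images(np.random.rand(512), np.random.rand(100, 512))
```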
A Novel Deterministic Framework for Non-probabilistic Recommender Systems
Title | A Novel Deterministic Framework for Non-probabilistic Recommender Systems |
Authors | Avinash Bhat, Divya Madhav Kamath, Anitha C |
Abstract | Recommendation is a technique that suggests relevant items to a user from a large information space. Current techniques for this purpose include non-probabilistic methods like content-based filtering and collaborative filtering (CF), and probabilistic methods like Bayesian inference and case-based reasoning. CF algorithms use similarity measures for calculating similarity between users. In this paper, we propose a novel framework which deterministically switches between CF algorithms based on sparsity to improve the accuracy of recommendation. |
Tasks | Bayesian Inference, Recommendation Systems |
Published | 2018-09-02 |
URL | https://link.springer.com/chapter/10.1007/978-981-13-1498-8_8 |
https://www.researchgate.net/publication/327389858_A_Novel_Deterministic_Framework_for_Non-probabilistic_Recommender_Systems_Proceedings_of_IEMIS_2018_Volume_2 | |
PWC | https://paperswithcode.com/paper/a-novel-deterministic-framework-for-non |
Repo | https://github.com/avinashbhat/similarity-item-based |
Framework | none |
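A hedged sketch of sparsity-driven switching; the threshold and the particular CF variants below are assumptions for illustration, since the abstract does not spell out the switching criterion:

```python
import numpy as np

def choose_cf(ratings, threshold=0.9):
    """Deterministically pick a CF algorithm from the sparsity of the
    user-item rating matrix (fraction of missing entries)."""
    sparsity = 1.0 - np.count_nonzero(ratings) / ratings.size
    if sparsity > threshold:
        return "item-based CF"   # very sparse: item neighbourhoods tend to be more stable
    return "user-based CF"

print(choose_cf(np.random.binomial(1, 0.05, size=(100, 50))))
```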
Detecting Decision Ambiguity from Facial Images
Title | Detecting Decision Ambiguity from Facial Images |
Authors | Pavel Jahoda, Antonin Vobecky, Jan Cech, Jiri Matas |
Abstract | In situations when potentially costly decisions are being made, faces of people tend to reflect a level of certainty about the appropriateness of the chosen decision. This fact is known from the psychological literature. In the paper, we propose a method that uses facial images for automatic detection of the decision ambiguity state of a subject. To train and test the method, we collected a large-scale dataset from “Who Wants to Be a Millionaire?” – a popular TV game show. The videos provide examples of various mental states of contestants, including uncertainty, doubts and hesitation. The annotation of the videos is done automatically from onscreen graphics. The problem of detecting decision ambiguity is formulated as binary classification. Video clips where a contestant asks for help (audience, friend, 50:50) are considered positive samples; clips where the contestant answers directly are negative. We propose a baseline method combining a deep convolutional neural network with an SVM. The method has an error rate of 24%. The error of human volunteers on the same dataset is 45%, close to chance. |
Tasks | |
Published | 2018-05-15 |
URL | https://ieeexplore.ieee.org/document/8373873 |
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8373873 | |
PWC | https://paperswithcode.com/paper/detecting-decision-ambiguity-from-facial |
Repo | https://github.com/JahodaPaul/DecisionAmbiguityRecognition |
Framework | tf |
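A sketch of the stated baseline pipeline (deep CNN features fed to an SVM), with synthetic stand-ins for the face features and labels:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Deep features of face crops would come from a pretrained CNN; random
# vectors stand in for them here. Label 1 = contestant asked for help
# (decision ambiguity), 0 = answered directly.
X = np.random.rand(200, 512)
y = np.random.randint(0, 2, 200)

clf = make_pipeline(StandardScaler(), SVC(kernel='linear'))
clf.fit(X[:150], y[:150])
print(clf.score(X[150:], y[150:]))  # held-out accuracy (chance on random data)
```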
LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs
Title | LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs |
Authors | Daniel Kondratyuk, Tomáš Gavenčiak, Milan Straka, Jan Hajič |
Abstract | We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using bidirectional RNNs with character-level and word-level embeddings. We demonstrate that both tasks benefit from sharing the encoding part of the network, predicting tag subcategories, and using the tagger output as an input to the lemmatizer. We evaluate our model across several languages with complex morphology, surpassing state-of-the-art accuracy in both part-of-speech tagging and lemmatization in Czech, German, and Arabic. |
Tasks | Lemmatization, Machine Translation, Part-Of-Speech Tagging, Semantic Role Labeling, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1532/ |
https://www.aclweb.org/anthology/D18-1532 | |
PWC | https://paperswithcode.com/paper/lemmatag-jointly-tagging-and-lemmatizing-for-1 |
Repo | https://github.com/hyperparticle/LemmaTag |
Framework | tf |
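A hedged sketch of the shared-encoder idea; the real LemmaTag generates lemmas character by character, so the lemma head below is simplified to classification over a fixed set of lemmatization operations:

```python
import torch
import torch.nn as nn

class JointTagLemma(nn.Module):
    """One BiLSTM encoder shared by both tasks; the lemma head consumes the
    encoder states together with the (softmaxed) tagger output, mirroring the
    'tagger output as lemmatizer input' idea. Sizes are illustrative."""
    def __init__(self, vocab, n_tags, n_lemma_ops, d=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.enc = nn.LSTM(d, d, bidirectional=True, batch_first=True)
        self.tag_head = nn.Linear(2 * d, n_tags)
        self.lemma_head = nn.Linear(2 * d + n_tags, n_lemma_ops)

    def forward(self, tokens):
        h, _ = self.enc(self.emb(tokens))                 # (B, T, 2d)
        tag_logits = self.tag_head(h)
        lemma_in = torch.cat([h, tag_logits.softmax(-1)], dim=-1)
        return tag_logits, self.lemma_head(lemma_in)

model = JointTagLemma(vocab=1000, n_tags=20, n_lemma_ops=50)
tag_logits, lemma_logits = model(torch.randint(0, 1000, (2, 7)))
```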
Clustering for Binary Featured Datasets
Title | Clustering for Binary Featured Datasets |
Authors | Peter Taraba |
Abstract | Clustering is one of the most important concepts for unsupervised learning in machine learning. While there are numerous clustering algorithms already, many, including the popular k-means algorithm, require the number of clusters to be specified in advance, a huge drawback. Some studies use the silhouette coefficient to determine the optimal number of clusters. In this study, we introduce a novel algorithm called Powered Outer Probabilistic Clustering, show how it works through back-propagation (starting with many clusters and ending with an optimal number of clusters), and show that the algorithm converges to the expected (optimal) number of clusters on theoretical examples. |
Tasks | |
Published | 2018-10-25 |
URL | https://link.springer.com/chapter/10.1007/978-981-13-2191-7_10 |
https://books.google.com/books?id=ANF0DwAAQBAJ&pg=PA127#v=onepage&q&f=false | |
PWC | https://paperswithcode.com/paper/clustering-for-binary-featured-datasets |
Repo | https://github.com/pepe78/Small-Bang |
Framework | none |
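For contrast with the proposed algorithm, here is the conventional silhouette-based choice of k that the abstract mentions, sketched with scikit-learn:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Pick k by maximizing the silhouette coefficient over a candidate range.
# (The paper's algorithm instead starts with many clusters and converges to
# the optimal number on its own.) The binary-featured data here is synthetic.
X = (np.random.rand(200, 8) > 0.5).astype(float)

scores = {k: silhouette_score(X, KMeans(n_clusters=k, n_init=10).fit_predict(X))
          for k in range(2, 8)}
best_k = max(scores, key=scores.get)
print(best_k)
```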
Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network
Title | Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network |
Authors | Xiangyang Zhou, Lu Li, Daxiang Dong, Yi Liu, Ying Chen, Wayne Xin Zhao, Dianhai Yu, Hua Wu |
Abstract | Humans generate responses relying on semantic and functional dependencies, including coreference relations, among dialogue elements and their context. In this paper, we investigate matching a response with its multi-turn context using dependency information based entirely on attention. Our solution is inspired by the recently proposed Transformer in machine translation (Vaswani et al., 2017), and we extend the attention mechanism in two ways. First, we construct representations of text segments at different granularities solely with stacked self-attention. Second, we try to extract the truly matched segment pairs with attention across the context and response. We jointly introduce those two kinds of attention in one uniform neural network. Experiments on two large-scale multi-turn response selection tasks show that our proposed model significantly outperforms the state-of-the-art models. |
Tasks | Chatbot, Conversational Response Selection |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1103/ |
https://www.aclweb.org/anthology/P18-1103 | |
PWC | https://paperswithcode.com/paper/multi-turn-response-selection-for-chatbots |
Repo | https://github.com/baidu/Dialogue |
Framework | tf |
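The building block being stacked is scaled dot-product self-attention (Vaswani et al., 2017); a minimal sketch:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over one utterance.
    X: (seq_len, d) token representations; returns refined representations
    in which each token is a weighted mixture of all tokens."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)    # row-wise softmax
    return w @ X

out = self_attention(np.random.rand(6, 16))
```

Stacking this operation yields the multi-granularity segment representations the paper matches across context and response.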
Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations
Title | Personalized Review Generation By Expanding Phrases and Attending on Aspect-Aware Representations |
Authors | Jianmo Ni, Julian McAuley |
Abstract | In this paper, we focus on the problem of building assistive systems that can help users to write reviews. We cast this problem using an encoder-decoder framework that generates personalized reviews by expanding short phrases (e.g. review summaries, product titles) provided as input to the system. We incorporate aspect-level information via an aspect encoder that learns aspect-aware user and item representations. An attention fusion layer is applied to control generation by attending on the outputs of multiple encoders. Experimental results show that our model successfully learns representations capable of generating coherent and diverse reviews. In addition, the learned aspect-aware representations discover those aspects that users are more inclined to discuss and bias the generated text toward their personalized aspect preferences. |
Tasks | Recommendation Systems, Sentiment Analysis, Text Generation |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2112/ |
https://www.aclweb.org/anthology/P18-2112 | |
PWC | https://paperswithcode.com/paper/personalized-review-generation-by-expanding |
Repo | https://github.com/nijianmo/textExpansion |
Framework | pytorch |
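A hedged sketch of an attention fusion layer in the spirit of the abstract: at each decoding step, the decoder state scores the context vectors coming from several encoders and takes their weighted sum (not the authors' exact layer):

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse contexts from multiple encoders (e.g. phrase, aspect, user/item)
    into one vector, weighted by relevance to the decoder state."""
    def __init__(self, d):
        super().__init__()
        self.score = nn.Linear(d, 1)

    def forward(self, dec_state, contexts):
        # dec_state: (B, d); contexts: (B, n_encoders, d)
        e = self.score(torch.tanh(contexts + dec_state.unsqueeze(1)))  # (B, n, 1)
        a = torch.softmax(e, dim=1)
        return (a * contexts).sum(1)                                   # (B, d)

fuse = AttentionFusion(32)
ctx = fuse(torch.randn(4, 32), torch.randn(4, 3, 32))
```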
Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings
Title | Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings |
Authors | Junjie Li, Haitong Yang, Chengqing Zong |
Abstract | Document-level multi-aspect sentiment classification aims to predict a user's sentiment polarities for different aspects of a product in a review. Existing approaches mainly focus on text information. However, the authors (i.e. users) and overall ratings of reviews are ignored, both of which prove significant in interpreting the sentiments of different aspects in this paper. Therefore, we propose a model called Hierarchical User Aspect Rating Network (HUARN) to consider user preference and overall ratings jointly. Specifically, HUARN adopts a hierarchical architecture to encode word-, sentence-, and document-level information. Then, user attention and aspect attention are introduced into building sentence- and document-level representations. The document representation is combined with user and overall rating information to predict aspect ratings of a review. Diverse aspects are treated differently and a multi-task framework is adopted. Empirical results on two real-world datasets show that HUARN achieves state-of-the-art performance. |
Tasks | Multi-Task Learning, Sentiment Analysis |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1079/ |
https://www.aclweb.org/anthology/C18-1079 | |
PWC | https://paperswithcode.com/paper/document-level-multi-aspect-sentiment |
Repo | https://github.com/Junjieli0704/HUARN |
Framework | pytorch |
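A hedged sketch of the user-attention ingredient: word vectors are pooled into a sentence vector with weights conditioned on the user embedding, so different users emphasize different words (not the authors' exact layer):

```python
import torch

def user_attention_pool(word_vecs, user_vec):
    """Pool word vectors into a sentence vector; attention weights come from
    the dot product with a user embedding. word_vecs: (B, T, d);
    user_vec: (B, d); returns (B, d)."""
    scores = (word_vecs @ user_vec.unsqueeze(-1)).squeeze(-1)  # (B, T)
    alpha = torch.softmax(scores, dim=-1)
    return (alpha.unsqueeze(-1) * word_vecs).sum(1)

sent = user_attention_pool(torch.randn(2, 9, 16), torch.randn(2, 16))
```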