October 16, 2019

2988 words 15 mins read

Paper Group NAWR 32

Invertibility of Convolutional Generative Networks from Partial Measurements. Bandit Learning with Implicit Feedback. Expressively vulgar: The socio-dynamics of vulgarity and its effects on sentiment analysis in social media. Deep Blending for Free-Viewpoint Image-Based Rendering. Orthogonal Weight Normalization: Solution to Optimization over Multip …

Invertibility of Convolutional Generative Networks from Partial Measurements

Title Invertibility of Convolutional Generative Networks from Partial Measurements
Authors Fangchang Ma, Ulas Ayaz, Sertac Karaman
Abstract In this work, we present new theoretical results on convolutional generative neural networks, in particular their invertibility (i.e., the recovery of the input latent code given the network output). The study of the network inversion problem is motivated by image inpainting and the mode collapse problem in GAN training. Network inversion is highly non-convex, and thus is typically computationally intractable and without optimality guarantees. However, we rigorously prove that, under some mild technical assumptions, the input of a two-layer convolutional generative network can be recovered from the network output efficiently using simple gradient descent. This new theoretical finding implies that the mapping from the low-dimensional latent space to the high-dimensional image space is bijective (i.e., one-to-one). In addition, the same conclusion holds even when the network output is only partially observed (i.e., with missing pixels). Our theorems hold for two-layer convolutional generative networks with ReLU as the activation function, but we demonstrate empirically that the same conclusion extends to multi-layer networks and networks with other activation functions, including leaky ReLU, sigmoid, and tanh.
Tasks Image Inpainting
Published 2018-12-01
URL http://papers.nips.cc/paper/8171-invertibility-of-convolutional-generative-networks-from-partial-measurements
PDF http://papers.nips.cc/paper/8171-invertibility-of-convolutional-generative-networks-from-partial-measurements.pdf
PWC https://paperswithcode.com/paper/invertibility-of-convolutional-generative
Repo https://github.com/fangchangma/invert-generative-networks
Framework none
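
The paper's central claim, that plain gradient descent recovers the latent code of a two-layer convolutional generator even from a partially observed output, is easy to exercise at toy scale. Below is a minimal sketch under assumed shapes, kernel scales, and optimizer settings (none of which come from the paper; Adam is used purely for robustness, whereas the analysis covers simple gradient descent).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy two-layer convolutional generator with ReLU activations and random
# Gaussian kernels; all shapes and scales here are illustrative assumptions.
W1 = torch.randn(4, 16, 3, 3) * 0.2   # (in_ch, out_ch, k, k) for conv_transpose2d
W2 = torch.randn(16, 1, 3, 3) * 0.2

def G(z):
    h = F.relu(F.conv_transpose2d(z, W1, stride=2, padding=1, output_padding=1))
    return F.relu(F.conv_transpose2d(h, W2, stride=2, padding=1, output_padding=1))

z_true = torch.randn(1, 4, 8, 8)
x = G(z_true)
mask = (torch.rand_like(x) < 0.5).float()   # observe only ~half of the pixels

# Recover the latent code by gradient descent on the masked reconstruction loss.
z = (0.1 * torch.randn(1, 4, 8, 8)).requires_grad_()
opt = torch.optim.Adam([z], lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = ((mask * (G(z) - x)) ** 2).mean()
    loss.backward()
    opt.step()

print("final masked loss:", loss.item())
print("latent error:", (z - z_true).norm().item())
```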

Bandit Learning with Implicit Feedback

Title Bandit Learning with Implicit Feedback
Authors Yi Qi, Qingyun Wu, Hongning Wang, Jie Tang, Maosong Sun
Abstract Implicit feedback, such as user clicks, although abundant in online information service systems, does not provide substantial evidence of users’ evaluation of a system’s output. Without proper modeling, such incomplete supervision inevitably misleads model estimation, especially in a bandit learning setting where the feedback is acquired on the fly. In this work, we perform contextual bandit learning with implicit feedback by modeling the feedback as a composition of user result examination and relevance judgment. Since users’ examination behavior is unobserved, we introduce latent variables to model it. We perform Thompson sampling on top of variational Bayesian inference for arm selection and model update. Our upper regret bound analysis of the proposed algorithm establishes the feasibility of learning from implicit feedback in a bandit setting, and extensive empirical evaluations on click logs collected from a major MOOC platform further demonstrate its learning effectiveness in practice.
Tasks Bayesian Inference
Published 2018-12-01
URL http://papers.nips.cc/paper/7958-bandit-learning-with-implicit-feedback
PDF http://papers.nips.cc/paper/7958-bandit-learning-with-implicit-feedback.pdf
PWC https://paperswithcode.com/paper/bandit-learning-with-implicit-feedback
Repo https://github.com/qy7171/ec_bandit
Framework none
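
The key modeling idea, that a click requires both examination and relevance, can be illustrated with a deliberately simplified Bernoulli bandit. The sketch below assumes a known, fixed examination probability and Beta posteriors over relevance; the actual paper handles contextual arms with variational Bayesian inference, so treat this only as a sketch of the examination/relevance decomposition under Thompson sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5                                  # number of arms
true_rel = rng.uniform(0.2, 0.8, K)    # hidden relevance of each arm
p_exam = 0.6                           # examination probability (assumed known here)

# Beta posteriors over each arm's relevance. A click requires examination AND
# relevance, so an unclicked round is only fractional negative evidence.
a, b = np.ones(K), np.ones(K)

for t in range(5000):
    theta = rng.beta(a, b)             # Thompson sampling: draw, then act greedily
    arm = int(np.argmax(theta))
    click = rng.random() < p_exam * true_rel[arm]
    if click:
        a[arm] += 1.0                  # a click implies the result was relevant
    else:
        # No click: either not examined or not relevant. Crudely discount the
        # negative signal by the examination probability (a stand-in for the
        # paper's variational update over the latent examination variable).
        b[arm] += p_exam

print("posterior means:", np.round(a / (a + b), 2))
print("true relevance: ", np.round(true_rel, 2))
```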

Expressively vulgar: The socio-dynamics of vulgarity and its effects on sentiment analysis in social media

Title Expressively vulgar: The socio-dynamics of vulgarity and its effects on sentiment analysis in social media
Authors Isabel Cachola, Eric Holgate, Daniel Preo{\c{t}}iuc-Pietro, Junyi Jessy Li
Abstract Vulgarity is a common linguistic expression used to perform several linguistic functions. Understanding its usage can aid the study of both linguistic and psychological phenomena, as well as benefit downstream natural language processing applications such as sentiment analysis. This study performs a large-scale, data-driven empirical analysis of vulgar words using social media data. We analyze the socio-cultural and pragmatic aspects of vulgarity using tweets from users with known demographics. Further, we collect sentiment ratings for vulgar tweets to study the relationship between the use of vulgar words and perceived sentiment, and show that explicitly modeling vulgar words can boost sentiment analysis performance.
Tasks Sentiment Analysis
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1248/
PDF https://www.aclweb.org/anthology/C18-1248
PWC https://paperswithcode.com/paper/expressively-vulgar-the-socio-dynamics-of
Repo https://github.com/ericholgate/vulgartwitter
Framework none
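
The finding that explicitly modeling vulgar words boosts sentiment analysis suggests a very simple baseline: append a vulgarity-lexicon feature to a standard bag-of-words classifier. The sketch below uses a toy four-sentence corpus and a one-word stand-in lexicon; the paper's experiments use annotated tweets and a curated list of vulgar words.

```python
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data and lexicon, purely illustrative.
texts = ["this movie is damn good", "what a nice day",
         "terrible damn service", "lovely film"]
labels = [1, 1, 0, 1]          # 1 = positive, 0 = negative
vulgar = {"damn"}              # stand-in for a real vulgarity lexicon

vec = TfidfVectorizer()
X_text = vec.fit_transform(texts)
# One extra column: count of vulgar tokens, so the model can weight vulgarity explicitly.
X_vulg = csr_matrix([[float(sum(tok in vulgar for tok in t.split()))] for t in texts])
X = hstack([X_text, X_vulg])

clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))
```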

Deep Blending for Free-Viewpoint Image-Based Rendering

Title Deep Blending for Free-Viewpoint Image-Based Rendering
Authors Peter Hedman, Julien Philip, True Price, Jan-Michael Frahm, George Drettakis, Gabriel Brostow
Abstract Free-viewpoint image-based rendering (IBR) is a standing challenge. IBR methods combine warped versions of input photos to synthesize a novel view. The image quality of this combination is directly affected by geometric inaccuracies of multi-view stereo (MVS) reconstruction and by view- and image-dependent effects that produce artifacts when contributions from different input views are blended. We present a new deep learning approach to blending for IBR, in which we use held-out real image data to learn blending weights to combine input photo contributions. Our Deep Blending method requires us to address several challenges to achieve our goal of interactive free-viewpoint IBR navigation. We first need to provide sufficiently accurate geometry so the Convolutional Neural Network (CNN) can succeed in finding correct blending weights. We do this by combining two different MVS reconstructions with complementary accuracy vs. completeness tradeoffs. To tightly integrate learning in an interactive IBR system, we need to adapt our rendering algorithm to produce a fixed number of input layers that can then be blended by the CNN. We generate training data with a variety of captured scenes, using each input photo as ground truth in a held-out approach. We also design the network architecture and the training loss to provide high quality novel view synthesis, while reducing temporal flickering artifacts. Our results demonstrate free-viewpoint IBR in a wide variety of scenes, clearly surpassing previous methods in visual quality, especially when moving far from the input cameras.
Tasks Novel View Synthesis
Published 2018-12-01
URL http://visual.cs.ucl.ac.uk/pubs/deepblending/
PDF http://visual.cs.ucl.ac.uk/pubs/deepblending/deepblending_siggraph_asia_2018.pdf
PWC https://paperswithcode.com/paper/deep-blending-for-free-viewpoint-image-based
Repo https://github.com/Phog/DeepBlending
Framework tf
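
The core of the method is a CNN that, given a fixed number of warped input layers, predicts per-pixel blending weights used to composite the novel view. A minimal sketch of that interface follows, with illustrative layer counts and channel sizes; the real network, its inputs, and its training losses are considerably richer.

```python
import torch
import torch.nn as nn

N = 4  # fixed number of warped input layers, as the rendering algorithm produces

class BlendNet(nn.Module):
    """Toy CNN: stack N warped layers on channels, predict N per-pixel weights."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 * N, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, N, 3, padding=1),
        )

    def forward(self, layers):                 # layers: (B, N, 3, H, W)
        B, _, _, H, W = layers.shape
        w = self.net(layers.reshape(B, 3 * N, H, W)).softmax(dim=1)  # (B, N, H, W)
        return (w.unsqueeze(2) * layers).sum(dim=1)   # blended image (B, 3, H, W)

layers = torch.rand(1, N, 3, 64, 64)           # stand-ins for warped input photos
print(BlendNet()(layers).shape)                # torch.Size([1, 3, 64, 64])
```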

Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks

Title Orthogonal Weight Normalization: Solution to Optimization over Multiple Dependent Stiefel Manifolds in Deep Neural Networks
Authors Lei Huang, Xianglong Liu, Bo Lang, Adams Wei Yu, Yongliang Wang, Bo Li
Abstract Orthogonal matrices have shown advantages in training Recurrent Neural Networks (RNNs), but such matrices are limited to being square for the hidden-to-hidden transformation in RNNs. In this paper, we generalize square orthogonal matrices to rectangular orthogonal matrices and formulate this problem in feed-forward Neural Networks (FNNs) as Optimization over Multiple Dependent Stiefel Manifolds (OMDSM). We show that rectangular orthogonal matrices can stabilize the distribution of network activations and regularize FNNs. We also propose a novel orthogonal weight normalization method to solve OMDSM. In particular, it constructs an orthogonal transformation over proxy parameters to ensure the weight matrix is orthogonal, and back-propagates gradient information through the transformation during training. To guarantee stability, we minimize the distortion between proxy parameters and canonical weights over all tractable orthogonal transformations. In addition, we design an orthogonal linear module (OLM) to learn orthogonal filter banks in practice, which can be used as an alternative to the standard linear module. Extensive experiments demonstrate that by simply substituting OLM for the standard linear module, without revising any experimental protocols, our method largely improves the performance of state-of-the-art networks, including Inception and residual networks, on the CIFAR and ImageNet datasets. In particular, we reduce the test error of a wide residual network on CIFAR-100 from 20.04% to 18.61% with this simple substitution. Our code is available online for result reproduction.
Tasks
Published 2018-02-02
URL https://arxiv.org/abs/1709.06079
PDF https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/17072/16695
PWC https://paperswithcode.com/paper/orthogonal-weight-normalization-solution-to-1
Repo https://github.com/huangleiBuaa/OthogonalWN
Framework pytorch
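
The orthogonalization step can be sketched concretely: keep unconstrained proxy parameters V, map them to orthogonal weights W = (V V^T)^{-1/2} V in the forward pass, and let gradients flow back through the map. The code below is a minimal version of this eigendecomposition-based construction (my reading of the minimum-distortion transformation the abstract alludes to); the `eps` jitter and shapes are illustrative assumptions.

```python
import torch

def orthogonalize(V, eps=1e-5):
    """Map a proxy matrix V (n x d, n <= d) to row-orthogonal weights
    W = (V V^T)^{-1/2} V via eigendecomposition of V V^T."""
    S = V @ V.t() + eps * torch.eye(V.shape[0])   # jitter for numerical safety
    lam, D = torch.linalg.eigh(S)
    return D @ torch.diag(lam.rsqrt()) @ D.t() @ V

V = torch.randn(8, 32, requires_grad=True)   # proxy parameters (trained directly)
W = orthogonalize(V)                         # orthogonal weights for the forward pass
print(torch.allclose(W @ W.t(), torch.eye(8), atol=1e-4))  # True: rows orthonormal

# Gradients flow back through the transformation to the proxy parameters.
W.sum().backward()
print(V.grad.shape)                          # torch.Size([8, 32])
```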

Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator

Title Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator
Authors Badri Narayana Patro, Vinod Kumar Kurmi, Sandeep Kumar, Vinay Namboodiri
Abstract In this paper, we propose a method for obtaining sentence-level embeddings. While the problem of obtaining word-level embeddings is very well studied, we propose a novel, simple method for obtaining sentence-level embeddings in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating paraphrases, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far apart. This is ensured by a sequential pair-wise discriminator that shares weights with the encoder and is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embedding distances from being too large, and is used in combination with a sequential encoder-decoder network. We also validated our method by evaluating the obtained embeddings on a sentiment analysis task. The proposed method yields semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis tasks on standard datasets. These results are also shown to be statistically significant.
Tasks Machine Reading Comprehension, Machine Translation, Paraphrase Generation, Reading Comprehension, Sentence Embedding, Sentence Embeddings, Sentiment Analysis
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1230/
PDF https://www.aclweb.org/anthology/C18-1230
PWC https://paperswithcode.com/paper/learning-semantic-sentence-embeddings-using
Repo https://github.com/badripatro/PQG
Framework pytorch
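
The constraint described in the abstract, true paraphrase embeddings close and unrelated candidates far apart, is naturally expressed as a margin loss over embedding distances. The sketch below shows one plausible form of such a loss; the paper's discriminator is sequential and shares weights with the encoder, which this standalone function does not capture.

```python
import torch
import torch.nn.functional as F

def pairwise_discriminator_loss(e_src, e_para, e_neg, margin=1.0):
    """Margin loss: pull true paraphrase embeddings toward the source sentence,
    push unrelated candidate embeddings away, up to a margin."""
    d_pos = F.pairwise_distance(e_src, e_para)   # source vs. true paraphrase
    d_neg = F.pairwise_distance(e_src, e_neg)    # source vs. unrelated candidate
    return F.relu(d_pos - d_neg + margin).mean()

# In the full model these would come from the encoder shared with the seq2seq
# paraphrase generator; random tensors stand in here.
e_src, e_para, e_neg = (torch.randn(8, 256) for _ in range(3))
print(pairwise_discriminator_loss(e_src, e_para, e_neg))
```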

Denoising Distantly Supervised Open-Domain Question Answering

Title Denoising Distantly Supervised Open-Domain Question Answering
Authors Yankai Lin, Haozhe Ji, Zhiyuan Liu, Maosong Sun
Abstract Distantly supervised open-domain question answering (DS-QA) aims to find answers in collections of unlabeled text. Existing DS-QA models usually retrieve related paragraphs from a large-scale corpus and apply reading comprehension techniques to extract answers from the most relevant paragraph, ignoring the rich information contained in other paragraphs. Moreover, distant supervision data is inevitably accompanied by the wrong-labeling problem, and such noisy data will substantially degrade the performance of DS-QA. To address these issues, we propose a novel DS-QA model which employs a paragraph selector to filter out noisy paragraphs and a paragraph reader to extract the correct answer from the denoised paragraphs. Experimental results on real-world datasets show that our model can capture useful information from noisy data and achieves significant improvements on DS-QA compared to all baselines.
Tasks Denoising, Information Retrieval, Open-Domain Question Answering, Question Answering, Reading Comprehension
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1161/
PDF https://www.aclweb.org/anthology/P18-1161
PWC https://paperswithcode.com/paper/denoising-distantly-supervised-open-domain
Repo https://github.com/thunlp/OpenQA
Framework pytorch
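
The selector/reader split amounts to a simple probabilistic factorization: weight each paragraph by the selector, score answer spans with the reader, and marginalize over paragraphs. A sketch with random stand-in scores for the two modules' outputs:

```python
import torch

# P(answer | q, paragraphs) = sum_i P(paragraph_i | q) * P(answer | q, paragraph_i)
n_para, n_spans = 6, 20
selector_logits = torch.randn(n_para)            # paragraph selector scores
reader_logits = torch.randn(n_para, n_spans)     # per-paragraph answer-span scores

p_para = selector_logits.softmax(dim=0)          # down-weights noisy paragraphs
p_span = reader_logits.softmax(dim=1)
p_answer = (p_para.unsqueeze(1) * p_span).sum(dim=0)  # marginalize over paragraphs
print(p_answer.argmax())                         # most probable answer span
```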

They Exist! Introducing Plural Mentions to Coreference Resolution and Entity Linking

Title They Exist! Introducing Plural Mentions to Coreference Resolution and Entity Linking
Authors Ethan Zhou, Jinho D. Choi
Abstract This paper analyzes arguably the most challenging yet under-explored aspect of resolution tasks such as coreference resolution and entity linking, that is, the resolution of plural mentions. Unlike singular mentions, each of which represents one entity, plural mentions stand for multiple entities. To tackle this aspect, we take the character identification corpus from the SemEval 2018 shared task, which consists of entity annotation for singular mentions, and expand it by adding annotation for plural mentions. We then introduce a novel coreference resolution algorithm that selectively creates clusters to handle both singular and plural mentions, as well as a deep learning-based entity linking model that jointly handles both types of mentions through multi-task learning. Adjusted evaluation metrics are proposed for these tasks to handle the uniqueness of plural mentions. Our experiments show that the new coreference resolution and entity linking models significantly outperform traditional models designed only for singular mentions. To the best of our knowledge, this is the first time that plural mentions have been thoroughly analyzed for these two resolution tasks.
Tasks Coreference Resolution, Entity Linking, Machine Translation, Multi-Task Learning, Question Answering
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1003/
PDF https://www.aclweb.org/anthology/C18-1003
PWC https://paperswithcode.com/paper/they-exist-introducing-plural-mentions-to
Repo https://github.com/emorynlp/character-identification
Framework none
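
The structural change plural mentions force on a resolution system is that one mention may belong to several entity clusters at once. A tiny data-structure sketch (entity names are illustrative, not from the corpus):

```python
clusters = {}  # entity id -> set of linked mentions

def link(mention, entities):
    """Attach a mention to one cluster (singular) or several (plural)."""
    for e in entities:
        clusters.setdefault(e, set()).add(mention)

link("Ross", ["Ross"])
link("he", ["Ross"])               # singular mention: exactly one cluster
link("Rachel", ["Rachel"])
link("they", ["Ross", "Rachel"])   # plural mention: multiple clusters at once
print(clusters)
```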

Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks

Title Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks
Authors Yao Zhao, Xiaochuan Ni, Yuanyuan Ding, Qifa Ke
Abstract Question generation, the task of automatically creating questions that can be answered by a certain span of text within a given passage, is important for question-answering and conversational systems in digital assistants such as Alexa, Cortana, Google Assistant and Siri. Recent sequence-to-sequence neural models have outperformed previous rule-based systems. Existing models have mainly focused on using one or two sentences as the input. Long text has posed challenges for sequence-to-sequence neural models in question generation: worse performance was reported when using a whole paragraph (with multiple sentences) as the input. In reality, however, generating high-quality questions often requires the whole paragraph as context. In this paper, we propose a maxout pointer mechanism with a gated self-attention encoder to address the challenges of processing long text inputs for question generation. With sentence-level inputs, our model outperforms previous approaches with either sentence-level or paragraph-level inputs. Furthermore, our model can effectively utilize paragraphs as inputs, pushing the state-of-the-art result from 13.9 to 16.3 (BLEU-4).
Tasks Machine Translation, Question Answering, Question Generation, Reading Comprehension, Text Generation, Text Summarization
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1424/
PDF https://www.aclweb.org/anthology/D18-1424
PWC https://paperswithcode.com/paper/paragraph-level-neural-question-generation
Repo https://github.com/seanie12/nqg
Framework pytorch
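
The maxout pointer itself is a small change to a standard copy mechanism: when a source token appears multiple times, take the maximum of its occurrences' attention scores instead of their sum, which curbs repetition with long inputs. A sketch (assumes PyTorch 1.12+ for scatter_reduce):

```python
import torch

src_ids = torch.tensor([5, 9, 5, 2, 9, 5])   # paragraph token ids, with repeats
attn = torch.randn(src_ids.shape[0])         # attention score per source position
vocab = 12

# Copy score per vocabulary item = MAX over that token's occurrences
# (a sum-based pointer would accumulate scores for repeated tokens instead).
copy_scores = torch.full((vocab,), float("-inf"))
copy_scores = copy_scores.scatter_reduce(0, src_ids, attn, reduce="amax")
print(copy_scores)                           # -inf for tokens absent from the source
```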

Temporal Regularization for Markov Decision Process

Title Temporal Regularization for Markov Decision Process
Authors Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Abstract Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high-dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness of value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games.
Tasks Atari Games
Published 2018-12-01
URL http://papers.nips.cc/paper/7449-temporal-regularization-for-markov-decision-process
PDF http://papers.nips.cc/paper/7449-temporal-regularization-for-markov-decision-process.pdf
PWC https://paperswithcode.com/paper/temporal-regularization-for-markov-decision
Repo https://github.com/pierthodo/temporal_regularization
Framework tf
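
One simple member of this class of methods is a temporally regularized TD(0) update, where the bootstrap target mixes the next state's value with the value of the previous state along the trajectory. A sketch on a toy random-walk chain, with illustrative hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha, gamma, beta = 10, 0.1, 0.95, 0.2   # beta = temporal regularization weight
V = np.zeros(n)

for episode in range(500):
    s_prev, s = 0, 0
    for t in range(50):
        s_next = min(max(s + rng.choice([-1, 1]), 0), n - 1)  # random walk
        r = 1.0 if s_next == n - 1 else 0.0                   # reward at right end
        # Target mixes the next state's value with the PREVIOUS state's value,
        # smoothing estimates over the trajectory (beta = 0 recovers plain TD(0)).
        target = r + gamma * ((1 - beta) * V[s_next] + beta * V[s_prev])
        V[s] += alpha * (target - V[s])
        s_prev, s = s, s_next

print(np.round(V, 2))
```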

How emotional are you? Neural Architectures for Emotion Intensity Prediction in Microblogs

Title How emotional are you? Neural Architectures for Emotion Intensity Prediction in Microblogs
Authors Devang Kulshreshtha, Pranav Goel, Anil Kumar Singh
Abstract Social media based micro-blogging sites like Twitter have become a common source of real-time information, impacting organizations and their strategies, and are used for expressing emotions and opinions. Automated analysis of such content therefore rises in importance. To this end, we explore the viability of using deep neural networks for the specific task of emotion intensity prediction in tweets. We propose a neural architecture combining convolutional and fully connected layers in a non-sequential manner - done for the first time in the context of natural language based tasks. Combined with lexicon-based features along with transfer learning, our model achieves state-of-the-art performance, outperforming the previous system by 0.044 (4.4%) Pearson correlation on the WASSA'17 EmoInt shared task dataset. We investigate the performance of deep multi-task learning models trained for all emotions at once in a unified architecture and obtain encouraging results. Experiments evaluating correlation between emotion pairs offer interesting insights into the relationship between them.
Tasks Multi-Task Learning, Transfer Learning
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1247/
PDF https://www.aclweb.org/anthology/C18-1247
PWC https://paperswithcode.com/paper/how-emotional-are-you-neural-architectures
Repo https://github.com/Pranav-Goel/Neural_Emotion_Intensity_Prediction
Framework tf
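
The architectural idea, convolutional and fully connected branches running in parallel rather than in sequence, with lexicon features concatenated before the output, can be sketched compactly. All sizes below are illustrative assumptions; the paper's model and its transfer-learning setup differ in detail.

```python
import torch
import torch.nn as nn

class EmoIntensityNet(nn.Module):
    """Toy non-sequential architecture: conv branch and dense branch in parallel,
    concatenated with lexicon features, regressing an intensity in [0, 1]."""
    def __init__(self, emb_dim=100, seq_len=30, n_lex=10):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv1d(emb_dim, 64, 3, padding=1),
                                  nn.ReLU(), nn.AdaptiveMaxPool1d(1))
        self.dense = nn.Sequential(nn.Linear(emb_dim * seq_len, 64), nn.ReLU())
        self.out = nn.Linear(64 + 64 + n_lex, 1)

    def forward(self, emb, lex):                      # emb: (B, seq_len, emb_dim)
        c = self.conv(emb.transpose(1, 2)).squeeze(-1)   # convolutional branch
        d = self.dense(emb.flatten(1))                   # parallel dense branch
        return torch.sigmoid(self.out(torch.cat([c, d, lex], dim=1)))

net = EmoIntensityNet()
print(net(torch.randn(4, 30, 100), torch.randn(4, 10)).shape)  # torch.Size([4, 1])
```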

Salient Object Detection Driven by Fixation Prediction

Title Salient Object Detection Driven by Fixation Prediction
Authors Wenguan Wang, Jianbing Shen, Xingping Dong, Ali Borji
Abstract Research in visual saliency has focused on two major types of models, namely fixation prediction and salient object detection. The relationship between the two, however, has been less explored. In this paper, we propose to employ the former model type to identify and segment salient objects in scenes. We build a novel neural network called the Attentive Saliency Network (ASNet) that learns to detect salient objects from fixation maps. The fixation map, derived at the upper network layers, captures a high-level understanding of the scene. Salient object detection is then viewed as fine-grained object-level saliency segmentation and is progressively optimized with the guidance of the fixation map in a top-down manner. ASNet is based on a hierarchy of convolutional LSTMs (convLSTMs) that offers an efficient recurrent mechanism for sequential refinement of the segmentation map. Several loss functions are introduced to boost the performance of ASNet. Extensive experimental evaluation shows that our proposed ASNet is capable of generating accurate segmentation maps with the help of the computed fixation map. Our work offers deeper insight into the mechanisms of attention and narrows the gap between salient object detection and fixation prediction.
Tasks Object Detection, Salient Object Detection
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Salient_Object_Detection_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Salient_Object_Detection_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/salient-object-detection-driven-by-fixation
Repo https://github.com/wenguanwang/ASNet
Framework none
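
A very loose sketch of the top-down refinement idea: predict a coarse fixation map at the deepest layer, then repeatedly upsample it and refine it against progressively shallower features. ASNet does this with a hierarchy of convLSTMs and several dedicated losses; plain convolutions and random features stand in here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Deep-to-shallow feature maps (stand-ins for a backbone's activations).
feats = [torch.randn(1, 16, s, s) for s in (8, 16, 32)]
fix_head = nn.Conv2d(16, 1, 1)           # predicts the coarse fixation map
refine = nn.Conv2d(17, 1, 3, padding=1)  # refines: features + upsampled map

sal = torch.sigmoid(fix_head(feats[0]))  # high-level fixation map as guidance
for f in feats[1:]:
    sal = F.interpolate(sal, scale_factor=2)                  # pass guidance up
    sal = torch.sigmoid(refine(torch.cat([f, sal], dim=1)))   # refine at finer scale
print(sal.shape)                          # torch.Size([1, 1, 32, 32])
```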

An Attribution Relations Corpus for Political News

Title An Attribution Relations Corpus for Political News
Authors Edward Newell, Drew Margolin, Derek Ruths
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1524/
PDF https://www.aclweb.org/anthology/L18-1524
PWC https://paperswithcode.com/paper/an-attribution-relations-corpus-for-political
Repo https://github.com/networkdynamics/brat-attribution-annotation
Framework none

Local Convergence Properties of SAGA/Prox-SVRG and Acceleration

Title Local Convergence Properties of SAGA/Prox-SVRG and Acceleration
Authors Clarice Poon, Jingwei Liang, Carola Schoenlieb
Abstract In this paper, we present a local convergence analysis for a class of stochastic optimisation methods: the proximal variance reduced stochastic gradient methods, mainly focusing on SAGA (Defazio et al., 2014) and Prox-SVRG (Xiao & Zhang, 2014). Under the assumption that the non-smooth component of the optimisation problem is partly smooth relative to a smooth manifold, we present a unified framework for the local convergence analysis of SAGA/Prox-SVRG: (i) the sequences generated by the methods identify the smooth manifold in a finite number of iterations; (ii) the sequence then enters a local linear convergence regime. Furthermore, we discuss various possibilities for accelerating these algorithms, including adapting to better local parameters, and applying higher-order deterministic/stochastic optimisation methods which can achieve super-linear convergence. Several concrete examples arising from machine learning are considered to demonstrate the obtained results.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1974
PDF http://proceedings.mlr.press/v80/poon18a/poon18a.pdf
PWC https://paperswithcode.com/paper/local-convergence-properties-of-sagaprox-svrg
Repo https://github.com/jliang993/Local-VRSGD
Framework none
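
The setting of the analysis is easy to reproduce: run proximal SAGA on a partly-smooth problem such as the lasso and watch the iterates identify the support (the active manifold) after finitely many steps. A sketch with an illustrative, conservative step size:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 20, 0.1
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[:3] = rng.standard_normal(3)          # 3-sparse ground truth
b = A @ x_true + 0.01 * rng.standard_normal(n)

def prox_l1(x, t):
    """Proximal map of t * lam * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)

step = 1.0 / (3 * np.max(np.sum(A**2, axis=1)))   # conservative SAGA step size
x = np.zeros(d)
table = A * (A @ x - b)[:, None]             # stored per-sample gradients
avg = table.mean(axis=0)
for it in range(20000):
    i = rng.integers(n)
    g_new = A[i] * (A[i] @ x - b[i])
    # Prox-SAGA step: variance-reduced gradient, then the proximal map.
    x = prox_l1(x - step * (g_new - table[i] + avg), step)
    avg += (g_new - table[i]) / n
    table[i] = g_new

print("support found:", np.flatnonzero(np.abs(x) > 1e-8))
```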

ML-Plan: Automated machine learning via hierarchical planning

Title ML-Plan: Automated machine learning via hierarchical planning
Authors Felix Mohr, Marcel Wever, Eyke Hüllermeier
Abstract Automated machine learning (AutoML) seeks to automatically select, compose, and parametrize machine learning algorithms, so as to achieve optimal performance on a given task (dataset). Although current approaches to AutoML have already produced impressive results, the field is still far from mature, and new techniques are still being developed. In this paper, we present ML-Plan, a new approach to AutoML based on hierarchical planning. To highlight the potential of this approach, we compare ML-Plan to the state-of-the-art frameworks Auto-WEKA, auto-sklearn, and TPOT. In an extensive series of experiments, we show that ML-Plan is highly competitive and often outperforms existing approaches.
Tasks AutoML
Published 2018-07-03
URL https://link.springer.com/article/10.1007/s10994-018-5735-z
PDF https://link.springer.com/content/pdf/10.1007%2Fs10994-018-5735-z.pdf
PWC https://paperswithcode.com/paper/ml-plan-automated-machine-learning-via
Repo https://github.com/fmohr/AILIbs
Framework none
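
The hierarchical-planning view decomposes pipeline synthesis into slots that are refined step by step, e.g. first choose a preprocessor, then a learner, then their parameters. The toy sketch below replaces ML-Plan's HTN search and best-first node evaluation with an exhaustive two-level loop over a handful of scikit-learn components, purely to illustrate the decomposition.

```python
from itertools import product
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Slot 1: preprocessor (possibly none). Slot 2: learner.
preprocessors = [None, StandardScaler(), MinMaxScaler()]
learners = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

best = (None, -1.0)
for prep, learner in product(preprocessors, learners):
    steps = ([prep] if prep is not None else []) + [learner]
    score = cross_val_score(make_pipeline(*steps), X, y, cv=3).mean()
    if score > best[1]:
        best = (steps, score)
print(best)
```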