January 30, 2020


Paper Group ANR 227


Papers in this group: Curiosity-driven Reinforcement Learning for Diverse Visual Paragraph Generation. Distributed Equivalent Substitution Training for Large-Scale Recommender Systems. Topic Augmented Generator for Abstractive Summarization. A robust method based on LOVO functions for solving least squares problems. Fast Color Constancy with Patch-wise Bright Pixels. Memory Augmented Graph Neural Networks for Sequential Recommendation. On constraint programming for a new flexible project scheduling problem with resource constraints. Dense Color Constancy with Effective Edge Augmentation. Multi-modal gated recurrent units for image description. Inferring the quantum density matrix with machine learning. Discriminative Autoencoder for Feature Extraction: Application to Character Recognition. A Bilingual Generative Transformer for Semantic Sentence Embedding. Artificial Agents Learn Flexible Visual Representations by Playing a Hiding Game. Improving Siamese Networks for One Shot Learning using Kernel Based Activation functions. Modeling neural dynamics during speech production using a state space variational autoencoder.

Curiosity-driven Reinforcement Learning for Diverse Visual Paragraph Generation

Title Curiosity-driven Reinforcement Learning for Diverse Visual Paragraph Generation
Authors Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Jingjing Li, Yang Yang
Abstract Visual paragraph generation aims to automatically describe a given image from different perspectives and organize sentences in a coherent way. In this paper, we address three critical challenges for this task in a reinforcement learning setting: mode collapse, delayed feedback, and the time-consuming warm-up of policy networks. We propose a novel Curiosity-driven Reinforcement Learning (CRL) framework to jointly enhance the diversity and accuracy of the generated paragraphs. First, by modeling paragraph captioning as a long-term decision-making process and measuring the prediction uncertainty of state transitions as intrinsic rewards, the model is incentivized to memorize precise but rarely observed descriptions of the context, rather than being biased towards frequent fragments and generic patterns. Second, since the extrinsic reward from evaluation is only available once the complete paragraph is generated, we estimate its expected value at each time step with temporal-difference learning, by considering the correlations between successive actions. The estimated extrinsic rewards are then complemented by dense intrinsic rewards produced by the derived curiosity module, in order to encourage the policy to fully explore the action space and find a global optimum. Third, discounted imitation learning is integrated for learning from human demonstrations, without separately performing the time-consuming warm-up in advance. Extensive experiments conducted on the Stanford image-paragraph dataset demonstrate the effectiveness and efficiency of the proposed method, improving performance by 38.4% compared with the state of the art.
Tasks Decision Making, Imitation Learning
Published 2019-08-01
URL https://arxiv.org/abs/1908.00169v2
PDF https://arxiv.org/pdf/1908.00169v2.pdf
PWC https://paperswithcode.com/paper/curiosity-driven-reinforcement-learning-for
Repo
Framework
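
To make the reward structure above concrete, the following minimal sketch illustrates an ICM-style curiosity signal: a forward model predicts the next decoder state, and its prediction error serves as a dense intrinsic reward added to an (estimated) extrinsic reward. This is a generic illustration under assumed shapes and an assumed mixing weight beta, not the authors' exact curiosity module.

```python
import torch
import torch.nn as nn

class ForwardDynamics(nn.Module):
    """Predicts the next decoder state from the current state and action embedding."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action_emb):
        return self.net(torch.cat([state, action_emb], dim=-1))

def curiosity_reward(fwd_model, state, action_emb, next_state):
    """Intrinsic reward = prediction error of the forward model (state-transition uncertainty)."""
    pred = fwd_model(state, action_emb)
    return 0.5 * (pred - next_state).pow(2).mean(dim=-1)

# Toy usage: mix intrinsic and (estimated) extrinsic rewards at each step.
state_dim, action_dim = 32, 16
fwd = ForwardDynamics(state_dim, action_dim)
s_t, a_t, s_next = torch.randn(4, state_dim), torch.randn(4, action_dim), torch.randn(4, state_dim)
r_ext_estimate = torch.zeros(4)          # would come from TD learning on the final paragraph score
beta = 0.1                               # assumed mixing weight
r_total = r_ext_estimate + beta * curiosity_reward(fwd, s_t, a_t, s_next)
```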

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

Title Distributed Equivalent Substitution Training for Large-Scale Recommender Systems
Authors Haidong Rong, Yangzihao Wang, Feihu Zhou, Junjie Zhai, Haiyang Wu, Rui Lan, Fan Li, Han Zhang, Yuekui Yang, Zhenyu Guo, Di Wang
Abstract We present Distributed Equivalent Substitution (DES) training, a novel distributed training framework for recommender systems with large-scale dynamic sparse features. Our framework achieves faster convergence with less communication overhead and better computing-resource utilization. The DES strategy splits a weights-rich operator into sub-operators with co-located weights and aggregates partial results with much smaller communication cost to form a computationally equivalent substitution for the original operator. We show that for the different types of models used by recommender systems, we can always find computationally equivalent substitutions and splitting strategies for their weights-rich operators, with the theoretical communication load reduced by 72.26% to 99.77%. We also present an implementation of DES that outperforms state-of-the-art recommender systems. Experiments show that our framework achieves up to 83% communication savings compared to other recommender systems, and can bring up to a 4.5x improvement in throughput for deep models.
Tasks Recommendation Systems
Published 2019-09-10
URL https://arxiv.org/abs/1909.04823v2
PDF https://arxiv.org/pdf/1909.04823v2.pdf
PWC https://paperswithcode.com/paper/distributed-equivalent-substitution-training
Repo
Framework
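
The core DES idea, replacing a weights-rich operator with sub-operators that keep their weights local and exchange only small partial results, can be simulated on a single machine. The sketch below splits a fully connected operator across four notional workers; the operator choice and sizes are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, out_dim, n_workers = 10_000, 8, 4

x = rng.normal(size=in_dim)                 # feature activations (dense here for simplicity)
W = rng.normal(size=(in_dim, out_dim))      # the "weights-rich" operator y = x @ W

# Reference computation on a single worker.
y_ref = x @ W

# DES-style split: each worker co-locates a slice of the weights with the matching
# slice of the input and sends only its (out_dim,)-sized partial result.
x_slices = np.array_split(x, n_workers)
W_slices = np.array_split(W, n_workers, axis=0)
partials = [xs @ Ws for xs, Ws in zip(x_slices, W_slices)]   # computed where the weights live
y_des = np.sum(partials, axis=0)                             # aggregation: n_workers * out_dim floats

assert np.allclose(y_ref, y_des)
# Communication: n_workers * out_dim values instead of shipping features/weights of size in_dim.
```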

Topic Augmented Generator for Abstractive Summarization

Title Topic Augmented Generator for Abstractive Summarization
Authors Melissa Ailem, Bowen Zhang, Fei Sha
Abstract Steady progress has been made in abstractive summarization with attention-based sequence-to-sequence learning models. In this paper, we propose a new decoder in which the output summary is generated by conditioning on both the input text and the latent topics of the document. The latent topics, identified by a topic model such as LDA, reveal more global semantic information that can be used to bias the decoder's word generation. In particular, they give the decoder access to additional word co-occurrence statistics captured at the document-corpus level. We empirically validate the advantage of the proposed approach on both the CNN/Daily Mail and the WikiHow datasets. Concretely, we attain strongly improved ROUGE scores when compared to state-of-the-art models.
Tasks Abstractive Text Summarization
Published 2019-08-19
URL https://arxiv.org/abs/1908.07026v1
PDF https://arxiv.org/pdf/1908.07026v1.pdf
PWC https://paperswithcode.com/paper/topic-augmented-generator-for-abstractive
Repo
Framework
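
A rough sketch of how topic information could bias next-word generation: the decoder's logits are combined with a log-prior over the vocabulary induced by the document's LDA topic mixture. The additive combination and the mixing weight lam are assumptions; the paper integrates the topics inside the decoder rather than as a post-hoc reweighting.

```python
import numpy as np

def topic_biased_distribution(decoder_logits, doc_topic, topic_word, lam=0.5):
    """Combine seq2seq decoder logits with an LDA-style topic prior over the vocabulary.

    decoder_logits : (V,) raw scores from the decoder for the next word
    doc_topic      : (K,) topic mixture of the source document
    topic_word     : (K, V) per-topic word distributions
    lam            : assumed mixing weight between decoder and topic signal
    """
    topic_prior = doc_topic @ topic_word                  # (V,) corpus-level co-occurrence signal
    scores = decoder_logits + lam * np.log(topic_prior + 1e-12)
    scores -= scores.max()
    probs = np.exp(scores)
    return probs / probs.sum()

# Toy usage
V, K = 10, 3
rng = np.random.default_rng(1)
logits = rng.normal(size=V)
doc_topic = rng.dirichlet(np.ones(K))
topic_word = rng.dirichlet(np.ones(V), size=K)
p_next = topic_biased_distribution(logits, doc_topic, topic_word)
```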

A robust method based on LOVO functions for solving least squares problems

Title A robust method based on LOVO functions for solving least squares problems
Authors E. V. Castelani, R. Lopes, W. V. I. Shirabayashi, F. N. C. Sobral
Abstract The robust adjustment of nonlinear models to data is considered in this paper. When data come from real experiments, measurement errors may give rise to discrepant values, which should be ignored when fitting models to the data. This work presents a Low Order-Value Optimization (LOVO) version of the Levenberg-Marquardt algorithm, which is well suited to dealing with outliers in fitting problems. A general algorithm is presented and convergence to stationary points is demonstrated. Numerical results show that the algorithm successfully detects and ignores outliers without requiring many problem-specific parameters. Parallel and distributed executions of the algorithm are also possible, allowing the use of larger datasets. Comparison against publicly available robust algorithms shows that the present approach finds better fits for well-known statistical models.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1911.13078v1
PDF https://arxiv.org/pdf/1911.13078v1.pdf
PWC https://paperswithcode.com/paper/a-robust-method-based-on-lovo-functions-for
Repo
Framework
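
The following sketch conveys the LOVO flavor of the method: at every Levenberg-Marquardt iteration only the p smallest squared residuals are fitted, so discrepant points are ignored. The damping update and acceptance test are simplifications relative to the paper's algorithm.

```python
import numpy as np

def lovo_levenberg_marquardt(residual, jacobian, x0, p, n_iter=50, mu=1e-3):
    """LOVO-flavored Levenberg-Marquardt sketch: each iteration fits only the p
    smallest squared residuals, so gross outliers are ignored.
    residual(x) -> (m,) residual vector; jacobian(x) -> (m, n) Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        r, J = residual(x), jacobian(x)
        keep = np.argsort(r ** 2)[:p]                  # the LOVO choice of "trusted" points
        Jk, rk = J[keep], r[keep]
        A = Jk.T @ Jk + mu * np.eye(x.size)
        step = np.linalg.solve(A, -Jk.T @ rk)
        r_new = residual(x + step)
        if np.sum(np.sort(r_new ** 2)[:p]) < np.sum(rk ** 2):
            x, mu = x + step, max(mu / 2, 1e-12)       # accept: decrease damping
        else:
            mu *= 10                                   # reject: increase damping
    return x

# Toy usage: fit y = a*exp(b*t) with two injected outliers.
t = np.linspace(0, 1, 20)
y = 2.0 * np.exp(1.5 * t)
y[3], y[15] = 40.0, -10.0                              # outliers
res = lambda x: x[0] * np.exp(x[1] * t) - y
jac = lambda x: np.stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)], axis=1)
x_hat = lovo_levenberg_marquardt(res, jac, x0=[1.0, 1.0], p=18)
```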

Fast Color Constancy with Patch-wise Bright Pixels

Title Fast Color Constancy with Patch-wise Bright Pixels
Authors Yiyao Shi, Jian Wang, Xiangyang Xue
Abstract In this paper, a learning-free color constancy algorithm called Patch-wise Bright Pixels (PBP) is proposed. In this algorithm, an input image is first downsampled and then cut into equal patches. Then, according to the modified brightness of each patch, a proper fraction of the brightest pixels in the patch is selected. Finally, Gray World (GW)-based methods are applied to the selected bright pixels to estimate the illuminant of the scene. Experiments on the NUS 8-Camera Dataset show that the PBP algorithm outperforms state-of-the-art learning-free methods as well as a broad range of learning-based ones. In particular, PBP processes a 1080p image within two milliseconds, hundreds of times faster than existing learning-free methods. Our algorithm offers a potential solution for full-screen smartphones whose screen-to-body ratio is 100%.
Tasks Color Constancy
Published 2019-11-17
URL https://arxiv.org/abs/1911.07177v1
PDF https://arxiv.org/pdf/1911.07177v1.pdf
PWC https://paperswithcode.com/paper/fast-color-constancy-with-patch-wise-bright
Repo
Framework
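
A minimal, learning-free sketch of the PBP pipeline: downsample, split into patches, keep a fraction of the brightest pixels per patch, and average them Gray-World style to estimate the illuminant. The brightness measure, patch grid, and fraction below are simplified assumptions; the paper uses a modified brightness definition.

```python
import numpy as np

def pbp_illuminant(img, grid=(4, 4), frac=0.05, stride=4):
    """Patch-wise Bright Pixels sketch: estimate the scene illuminant by averaging
    the brightest pixels of each patch (Gray-World on the selected pixels).
    `img` is an HxWx3 float array."""
    small = img[::stride, ::stride]                     # cheap downsampling
    h, w, _ = small.shape
    selected = []
    for rows in np.array_split(np.arange(h), grid[0]):
        for cols in np.array_split(np.arange(w), grid[1]):
            patch = small[np.ix_(rows, cols)].reshape(-1, 3)
            brightness = patch.mean(axis=1)             # simplified brightness measure
            k = max(1, int(frac * patch.shape[0]))
            selected.append(patch[np.argsort(brightness)[-k:]])
    bright = np.concatenate(selected, axis=0)
    illum = bright.mean(axis=0)                         # Gray-World over the bright pixels
    return illum / np.linalg.norm(illum)

# Toy usage
rng = np.random.default_rng(2)
image = rng.random((256, 256, 3)) * np.array([1.0, 0.8, 0.6])   # reddish cast
print(pbp_illuminant(image))
```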

Memory Augmented Graph Neural Networks for Sequential Recommendation

Title Memory Augmented Graph Neural Networks for Sequential Recommendation
Authors Chen Ma, Liheng Ma, Yingxue Zhang, Jianing Sun, Xue Liu, Mark Coates
Abstract The chronological order of user-item interactions can reveal time-evolving and sequential user behaviors in many recommender systems. The items that users will interact with may depend on the items accessed in the past. However, the substantial growth in the numbers of users and items means that sequential recommender systems still face non-trivial challenges: (1) the difficulty of modeling short-term user interests; (2) the difficulty of capturing long-term user interests; (3) the effective modeling of item co-occurrence patterns. To tackle these challenges, we propose a memory augmented graph neural network (MA-GNN) to capture both the long- and short-term user interests. Specifically, we apply a graph neural network to model the item contextual information within a short-term period and utilize a shared memory network to capture the long-range dependencies between items. In addition to modeling user interests, we employ a bilinear function to capture the co-occurrence patterns of related items. We extensively evaluate our model on five real-world datasets, comparing it with several state-of-the-art methods and using a variety of performance metrics. The experimental results demonstrate the effectiveness of our model for the task of Top-K sequential recommendation.
Tasks Recommendation Systems
Published 2019-12-26
URL https://arxiv.org/abs/1912.11730v1
PDF https://arxiv.org/pdf/1912.11730v1.pdf
PWC https://paperswithcode.com/paper/memory-augmented-graph-neural-networks-for
Repo
Framework
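
A sketch of how the three signals could be combined at scoring time: a general user embedding, a short/long-term interest embedding (produced in MA-GNN by the GNN over recent items plus the shared memory network), and a bilinear item co-occurrence term. The shapes and the additive combination are assumptions, not the paper's exact formulation.

```python
import torch

def magnn_style_score(p_u, p_interest, last_items, q_j, W_bilinear):
    """Score candidate item j for user u by combining:
      - the general user embedding p_u,
      - a short/long-term interest embedding p_interest,
      - a bilinear co-occurrence term between recently consumed items and the candidate."""
    co_occurrence = (last_items @ W_bilinear @ q_j).sum()    # sum_i e_i^T W q_j
    return p_u @ q_j + p_interest @ q_j + co_occurrence

# Toy usage
d, n_recent = 16, 5
torch.manual_seed(0)
p_u, p_int, q_j = torch.randn(d), torch.randn(d), torch.randn(d)
last = torch.randn(n_recent, d)
W = torch.randn(d, d)
print(magnn_style_score(p_u, p_int, last, q_j, W).item())
```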

On constraint programming for a new flexible project scheduling problem with resource constraints

Title On constraint programming for a new flexible project scheduling problem with resource constraints
Authors Viktoria A. Hauder, Andreas Beham, Sebastian Raggl, Sophie N. Parragh, Michael Affenzeller
Abstract Real-world project scheduling often requires flexibility in terms of the selection and the exact length of alternative production activities. Moreover, the simultaneous scheduling of multiple lots is mandatory in many production planning applications. To meet these requirements, a new flexible resource-constrained multi-project scheduling problem is introduced where both decisions (activity selection flexibility and time flexibility) are integrated. Besides the minimization of makespan, two alternative objectives inspired by a steel industry application case are presented: maximization of balanced length of selected activities (time balance) and maximization of balanced resource utilization (resource balance). New mixed integer and constraint programming (CP) models are proposed for the developed integrated flexible project scheduling problem. The real-world applicability of the suggested CP models is shown by solving large steel industry instances with the CP Optimizer of IBM ILOG CPLEX. Furthermore, benchmark instances on flexible resource-constrained project scheduling problems (RCPSP) are solved to optimality.
Tasks
Published 2019-02-25
URL http://arxiv.org/abs/1902.09244v1
PDF http://arxiv.org/pdf/1902.09244v1.pdf
PWC https://paperswithcode.com/paper/on-constraint-programming-for-a-new-flexible
Repo
Framework
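
For a flavor of the modeling style (optional activities for selection flexibility, variable durations for time flexibility, a shared renewable resource, and a makespan objective), here is a tiny constraint-programming sketch. Note that the paper uses mixed integer models and the CP Optimizer of IBM ILOG CPLEX; the sketch below uses Google OR-Tools CP-SAT purely as a stand-in solver, and all activities, bounds, and capacities are invented toy data.

```python
from ortools.sat.python import cp_model

# Toy flavor: one job has two alternative activities (exactly one is executed), each
# with a flexible duration, sharing a unit-capacity resource with a fixed activity.
model = cp_model.CpModel()
horizon = 20

def flexible_activity(name, dur_lo, dur_hi):
    start = model.NewIntVar(0, horizon, f"start_{name}")
    dur = model.NewIntVar(dur_lo, dur_hi, f"dur_{name}")        # time flexibility
    end = model.NewIntVar(0, horizon, f"end_{name}")
    present = model.NewBoolVar(f"present_{name}")                # activity-selection flexibility
    interval = model.NewOptionalIntervalVar(start, dur, end, present, f"iv_{name}")
    return end, present, interval

end_a, pres_a, iv_a = flexible_activity("alt_a", 3, 6)
end_b, pres_b, iv_b = flexible_activity("alt_b", 2, 4)
model.Add(pres_a + pres_b == 1)                                  # choose exactly one alternative

s_c = model.NewIntVar(0, horizon, "start_c")
e_c = model.NewIntVar(0, horizon, "end_c")
iv_c = model.NewIntervalVar(s_c, 5, e_c, "iv_c")                 # a mandatory activity of another lot

model.AddCumulative([iv_a, iv_b, iv_c], [1, 1, 1], 1)            # one shared unit-capacity resource

makespan = model.NewIntVar(0, horizon, "makespan")
model.Add(end_a <= makespan).OnlyEnforceIf(pres_a)
model.Add(end_b <= makespan).OnlyEnforceIf(pres_b)
model.Add(e_c <= makespan)
model.Minimize(makespan)

solver = cp_model.CpSolver()
if solver.Solve(model) in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print("makespan =", solver.Value(makespan))
```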

Dense Color Constancy with Effective Edge Augmentation

Title Dense Color Constancy with Effective Edge Augmentation
Authors Yilang Zhang, Zheng Wei, Jian Wang, Xin Yuan
Abstract Recently, computational color constancy via convolutional neural networks (CNNs) has received much attention. In this paper, we propose a color constancy algorithm called Dense Color Constancy (DCC), which employs a self-attention DenseNet to estimate the illuminant based on the 2D log-chrominance histograms of input images and their augmented edges. The augmented edges help to tell apart edge and non-edge pixels in the log-histogram, which contributes substantially to feature extraction and color-ambiguity elimination, thereby improving the accuracy of illuminant estimation. Experiments on benchmark datasets show that the DCC algorithm is very effective for illuminant estimation compared to state-of-the-art methods.
Tasks Color Constancy
Published 2019-11-17
URL https://arxiv.org/abs/1911.07163v1
PDF https://arxiv.org/pdf/1911.07163v1.pdf
PWC https://paperswithcode.com/paper/dense-color-constancy-with-effective-edge
Repo
Framework
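
The 2D log-chrominance histogram that serves as the network input can be computed in a few lines; the sketch below shows one common construction (u = log(g/r), v = log(g/b)) with an assumed bin count and range. The DCC model would consume this histogram, plus the histogram of the augmented edges, with a self-attention DenseNet.

```python
import numpy as np

def log_chrominance_histogram(img, bins=64, span=2.0):
    """2D log-chrominance histogram of an HxWx3 image.
    u = log(g/r), v = log(g/b); `bins` and `span` are assumed values."""
    eps = 1e-6
    r, g, b = img[..., 0].ravel() + eps, img[..., 1].ravel() + eps, img[..., 2].ravel() + eps
    u, v = np.log(g / r), np.log(g / b)
    hist, _, _ = np.histogram2d(u, v, bins=bins, range=[[-span, span], [-span, span]])
    return hist / max(hist.sum(), 1.0)

# Toy usage
rng = np.random.default_rng(3)
h = log_chrominance_histogram(rng.random((64, 64, 3)))
print(h.shape)   # (64, 64)
```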

Multi-modal gated recurrent units for image description

Title Multi-modal gated recurrent units for image description
Authors Xuelong Li, Aihong Yuan, Xiaoqiang Lu
Abstract Using a natural language sentence to describe the content of an image is a challenging but very important task. It is challenging because a description must not only capture the objects contained in the image and the relationships among them, but also be relevant and grammatically correct. In this paper, we propose a multi-modal embedding model based on gated recurrent units (GRU) that can generate variable-length descriptions for a given image. In the training step, we apply a convolutional neural network (CNN) to extract the image feature. The feature is then fed into the multi-modal GRU together with the corresponding sentence representations, and the multi-modal GRU learns the inter-modal relations between image and sentence. In the testing step, when an image is fed into our multi-modal GRU model, a sentence describing the image content is generated. The experimental results demonstrate that our multi-modal GRU model obtains state-of-the-art performance on the Flickr8K, Flickr30K and MS COCO datasets.
Tasks
Published 2019-04-20
URL http://arxiv.org/abs/1904.09421v1
PDF http://arxiv.org/pdf/1904.09421v1.pdf
PWC https://paperswithcode.com/paper/multi-modal-gated-recurrent-units-for-image
Repo
Framework
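
A minimal image-conditioned GRU decoder gives the general shape of the approach; in the paper the image feature is fused inside the multi-modal GRU at every step, whereas this sketch only uses it to initialize the hidden state. All dimensions are assumptions.

```python
import torch
import torch.nn as nn

class ImageGRUCaptioner(nn.Module):
    """Minimal image-conditioned GRU decoder for caption generation."""
    def __init__(self, vocab_size, img_dim=2048, emb_dim=256, hid_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, hid_dim)
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, img_feat, tokens):
        h0 = torch.tanh(self.img_proj(img_feat)).unsqueeze(0)   # (1, B, hid) from the CNN feature
        x = self.embed(tokens)                                   # (B, T, emb)
        y, _ = self.gru(x, h0)
        return self.out(y)                                       # (B, T, vocab) word logits

# Toy usage
model = ImageGRUCaptioner(vocab_size=1000)
logits = model(torch.randn(2, 2048), torch.randint(0, 1000, (2, 7)))
print(logits.shape)   # torch.Size([2, 7, 1000])
```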

Inferring the quantum density matrix with machine learning

Title Inferring the quantum density matrix with machine learning
Authors Kyle Cranmer, Siavash Golkar, Duccio Pappadopulo
Abstract We introduce two methods for estimating the density matrix for a quantum system: Quantum Maximum Likelihood and Quantum Variational Inference. In these methods, we construct a variational family to model the density matrix of a mixed quantum state. We also introduce quantum flows, the quantum analog of normalizing flows, which can be used to increase the expressivity of this variational family. The eigenstates and eigenvalues of interest are then derived by optimizing an appropriate loss function. The approach is qualitatively different than traditional lattice techniques that rely on the time dependence of correlation functions that summarize the lattice configurations. The resulting estimate of the density matrix can then be used to evaluate the expectation of an arbitrary operator, which opens the door to new possibilities.
Tasks
Published 2019-04-11
URL http://arxiv.org/abs/1904.05903v1
PDF http://arxiv.org/pdf/1904.05903v1.pdf
PWC https://paperswithcode.com/paper/inferring-the-quantum-density-matrix-with
Repo
Framework
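
A generic way to build a variational family over density matrices is to map an unconstrained complex matrix A to rho = A A† / tr(A A†), which is automatically Hermitian, positive semi-definite, and trace-one; the eigendecomposition of the optimized rho then yields the eigenstates and eigenvalues of interest. The sketch below shows this parameterization and an expectation value; it does not implement the paper's quantum flows or its specific loss functions.

```python
import numpy as np

def density_matrix(A):
    """Map an unconstrained complex matrix A to a valid density matrix
    rho = A A^dagger / tr(A A^dagger)."""
    rho = A @ A.conj().T
    return rho / np.trace(rho)

def expectation(rho, observable):
    """<O> = tr(rho O)."""
    return np.real(np.trace(rho @ observable))

# Toy usage for a single qubit: a variational family over 2x2 density matrices.
rng = np.random.default_rng(4)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rho = density_matrix(A)
pauli_z = np.array([[1, 0], [0, -1]], dtype=complex)
print("tr(rho) =", np.trace(rho).real, " <Z> =", expectation(rho, pauli_z))

# Eigenvalues/eigenstates of the mixed state, which an optimized variational
# family would expose after fitting a loss on measurement data.
evals, evecs = np.linalg.eigh(rho)
print("eigenvalues:", evals)
```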

Discriminative Autoencoder for Feature Extraction: Application to Character Recognition

Title Discriminative Autoencoder for Feature Extraction: Application to Character Recognition
Authors Anupriya Gogna, Angshul Majumdar
Abstract Conventionally, autoencoders are unsupervised representation learning tools. In this work, we propose a novel discriminative autoencoder. The use of supervised discriminative learning ensures that the learned representation is robust to variations commonly encountered in image datasets. Using the basic discriminative autoencoder as a unit, we build a stacked architecture aimed at extracting relevant representations from the training data. The efficiency of our feature extraction algorithm ensures high classification accuracy even with simple classification schemes such as K-nearest neighbor (KNN). We demonstrate the superiority of our model for representation learning by conducting experiments on standard datasets for character/image recognition and by comparing with existing supervised deep architectures such as the class-sparse stacked autoencoder and the discriminative deep belief network.
Tasks Representation Learning, Unsupervised Representation Learning
Published 2019-12-11
URL https://arxiv.org/abs/1912.12131v1
PDF https://arxiv.org/pdf/1912.12131v1.pdf
PWC https://paperswithcode.com/paper/discriminative-autoencoder-for-feature
Repo
Framework
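
A single discriminative-autoencoder unit can be sketched as an autoencoder whose code also feeds a supervised head, trained with a joint reconstruction-plus-classification loss; the weighting and the exact form of supervision are assumptions that differ from the paper's formulation, and stacking is omitted.

```python
import torch
import torch.nn as nn

class DiscriminativeAutoencoder(nn.Module):
    """One autoencoder unit whose latent code is shaped by a supervised loss in
    addition to reconstruction, so the code stays useful for simple classifiers (e.g. KNN)."""
    def __init__(self, in_dim=784, code_dim=64, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))
        self.classifier = nn.Linear(code_dim, n_classes)   # the discriminative head

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.classifier(z), z

# Joint objective: reconstruction + lambda * classification (lambda = 0.5 is an assumed weight).
model = DiscriminativeAutoencoder()
x, y = torch.rand(32, 784), torch.randint(0, 10, (32,))
recon, logits, code = model(x)
loss = nn.functional.mse_loss(recon, x) + 0.5 * nn.functional.cross_entropy(logits, y)
loss.backward()
```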

A Bilingual Generative Transformer for Semantic Sentence Embedding

Title A Bilingual Generative Transformer for Semantic Sentence Embedding
Authors John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick
Abstract Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Our proposed approach differs from past work on semantic sentence encoding in two ways. First, by using a variational probabilistic framework, we introduce priors that encourage source separation, and can use our model’s posterior to predict sentence embeddings for monolingual data at test time. Second, we use high-capacity transformers as both data generating distributions and inference networks – contrasting with most past work on sentence embeddings. In experiments, our approach substantially outperforms the state-of-the-art on a standard suite of unsupervised semantic similarity evaluations. Further, we demonstrate that our approach yields the largest gains on more difficult subsets of these evaluations where simple word overlap is not a good indicator of similarity.
Tasks Semantic Similarity, Semantic Textual Similarity, Sentence Embedding, Sentence Embeddings
Published 2019-11-10
URL https://arxiv.org/abs/1911.03895v1
PDF https://arxiv.org/pdf/1911.03895v1.pdf
PWC https://paperswithcode.com/paper/a-bilingual-generative-transformer-for-1
Repo
Framework
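
The source-separation structure can be sketched schematically: a shared semantic latent inferred from the sentence pair and language-specific latents inferred per sentence, trained with a VAE-style objective. The stand-in linear encoders/decoders below replace the paper's Transformers, both languages share one language encoder for brevity, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class SourceSeparationVAE(nn.Module):
    """Schematic latent structure: a shared semantic latent z_sem (inferred from the
    sentence pair) plus per-language latents explain each sentence representation."""
    def __init__(self, d_sent=256, d_sem=64, d_lang=32):
        super().__init__()
        self.q_sem = nn.Linear(2 * d_sent, 2 * d_sem)        # posterior over z_sem from both sentences
        self.q_lang = nn.Linear(d_sent, 2 * d_lang)           # posterior over a language latent
        self.decode = nn.Linear(d_sem + d_lang, d_sent)       # "reconstruct" a sentence representation

    @staticmethod
    def sample(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1)
        return z, kl

    def forward(self, x_src, x_tgt):
        z_sem, kl_sem = self.sample(self.q_sem(torch.cat([x_src, x_tgt], dim=-1)))
        z_src, kl_src = self.sample(self.q_lang(x_src))
        z_tgt, kl_tgt = self.sample(self.q_lang(x_tgt))
        rec = sum(nn.functional.mse_loss(self.decode(torch.cat([z_sem, z], -1)), x)
                  for z, x in [(z_src, x_src), (z_tgt, x_tgt)])
        return rec + (kl_sem + kl_src + kl_tgt).mean()        # schematic negative ELBO

# Toy usage with random "sentence representations"
model = SourceSeparationVAE()
loss = model(torch.randn(8, 256), torch.randn(8, 256))
loss.backward()
```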

Artificial Agents Learn Flexible Visual Representations by Playing a Hiding Game

Title Artificial Agents Learn Flexible Visual Representations by Playing a Hiding Game
Authors Luca Weihs, Aniruddha Kembhavi, Winson Han, Alvaro Herrasti, Eric Kolve, Dustin Schwenk, Roozbeh Mottaghi, Ali Farhadi
Abstract The ubiquity of embodied gameplay, observed in a wide variety of animal species including turtles and ravens, has led researchers to ask what advantages play provides to the animals engaged in it. Mounting evidence suggests that play is critical for developing the neural flexibility needed for creative problem solving and socialization, and can improve the plasticity of the medial prefrontal cortex. Comparatively little is known about the impact of gameplay on embodied artificial agents. While recent work has produced artificial agents proficient in abstract games, the environments these agents act within are far removed from the real world, so these agents provide little insight into the advantages of embodied play. Hiding games have arisen in multiple cultures and species, and provide rich ground for studying the impact of embodied gameplay on representation learning in the context of perspective taking, secret keeping, and false-belief understanding. Here we are the first to show that embodied adversarial reinforcement learning agents playing cache, a variant of hide-and-seek, in a high-fidelity, interactive environment, learn representations of their observations that encode information such as occlusion, object permanence, free space, and containment, on par with representations learnt by the most popular modern paradigm for visual representation learning, which requires large datasets independently labeled for each new task. Our representations are enhanced by intent and memory, through interaction and play, moving closer to biologically motivated learning strategies. These results serve as a model for studying how facets of vision and perspective taking develop through play, provide an experimental framework for assessing what is learned by artificial agents, and suggest that representation learning should move away from static datasets and towards experiential, interactive learning.
Tasks Representation Learning
Published 2019-12-17
URL https://arxiv.org/abs/1912.08195v2
PDF https://arxiv.org/pdf/1912.08195v2.pdf
PWC https://paperswithcode.com/paper/artificial-agents-learn-flexible-visual
Repo
Framework

Improving Siamese Networks for One Shot Learning using Kernel Based Activation functions

Title Improving Siamese Networks for One Shot Learning using Kernel Based Activation functions
Authors Shruti Jadon, Aditya Acrot Srinivasan
Abstract The lack of large amounts of training data has always been a constraining factor in solving many problems in machine learning, making One Shot Learning one of the most intriguing ideas in the field. It aims to learn information about object categories from one, or only a few, training examples. In deep learning, this is usually accomplished through a proper objective function (the loss) and embedding extraction (the architecture). In this paper, we discuss metric-based deep learning architectures for one shot learning, such as Siamese neural networks, and present a method to improve their accuracy using Kafnets (kernel-based non-parametric activation functions for neural networks) by learning proper embeddings with relatively few epochs. Using kernel activation functions, we achieve strong results that exceed those of ReLU-based deep learning models in terms of embedding structure, loss convergence, and accuracy.
Tasks One-Shot Learning
Published 2019-10-22
URL https://arxiv.org/abs/1910.09798v1
PDF https://arxiv.org/pdf/1910.09798v1.pdf
PWC https://paperswithcode.com/paper/improving-siamese-networks-for-one-shot
Repo
Framework
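
Kafnets replace fixed nonlinearities with kernel activation functions (KAFs): a learnable mixture of Gaussian kernels over a fixed dictionary. The sketch below shows a KAF layer inside a tiny shared-weight (Siamese) embedding network; the dictionary size, bandwidth heuristic, and network sizes are assumptions, and the contrastive/one-shot training loop is omitted.

```python
import torch
import torch.nn as nn

class KAF(nn.Module):
    """Kernel activation function: f(x) = sum_j alpha_j * exp(-gamma * (x - d_j)^2),
    applied elementwise with per-channel mixing coefficients alpha."""
    def __init__(self, channels, dict_size=20, span=3.0):
        super().__init__()
        d = torch.linspace(-span, span, dict_size)
        self.register_buffer("dictionary", d)
        gap = float(d[1] - d[0])
        self.gamma = 1.0 / (2 * gap ** 2)                       # common bandwidth heuristic
        self.alpha = nn.Parameter(0.1 * torch.randn(channels, dict_size))

    def forward(self, x):                                       # x: (B, channels)
        k = torch.exp(-self.gamma * (x.unsqueeze(-1) - self.dictionary) ** 2)   # (B, C, D)
        return (k * self.alpha).sum(dim=-1)

class EmbeddingNet(nn.Module):
    """Tiny Siamese branch: both inputs share these weights; a similarity head
    (e.g. L1/contrastive) over the output embeddings is omitted here."""
    def __init__(self, in_dim=784, emb_dim=64):
        super().__init__()
        self.fc1, self.act1 = nn.Linear(in_dim, 128), KAF(128)
        self.fc2 = nn.Linear(128, emb_dim)

    def forward(self, x):
        return self.fc2(self.act1(self.fc1(x)))

net = EmbeddingNet()
a, b = torch.rand(4, 784), torch.rand(4, 784)
distance = torch.norm(net(a) - net(b), dim=1)                   # shared weights = Siamese
```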

Modeling neural dynamics during speech production using a state space variational autoencoder

Title Modeling neural dynamics during speech production using a state space variational autoencoder
Authors Pengfei Sun, David A. Moses, Edward Chang
Abstract Characterizing the neural encoding of behavior remains a challenging task in many research areas due in part to complex and noisy spatiotemporal dynamics of evoked brain activity. An important aspect of modeling these neural encodings involves separation of robust, behaviorally relevant signals from background activity, which often contains signals from irrelevant brain processes and decaying information from previous behavioral events. To achieve this separation, we develop a two-branch State Space Variational AutoEncoder (SSVAE) model to individually describe the instantaneous evoked foreground signals and the context-dependent background signals. We modeled the spontaneous speech-evoked brain dynamics using smoothed Gaussian mixture models. By applying the proposed SSVAE model to track ECoG dynamics in one participant over multiple hours, we find that the model can predict speech-related dynamics more accurately than other latent factor inference algorithms. Our results demonstrate that separately modeling the instantaneous speech-evoked and slow context-dependent brain dynamics can enhance tracking performance, which has important implications for the development of advanced neural encoding and decoding models in various neuroscience sub-disciplines.
Tasks
Published 2019-01-13
URL http://arxiv.org/abs/1901.04024v1
PDF http://arxiv.org/pdf/1901.04024v1.pdf
PWC https://paperswithcode.com/paper/modeling-neural-dynamics-during-speech
Repo
Framework
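
A schematic of the two-branch idea: the observed signal is explained by a fast, stimulus-evoked latent plus a slow, context-dependent latent inferred from a longer window. The linear stand-ins below omit the paper's state-space dynamics and smoothed Gaussian mixture priors; all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchVAE(nn.Module):
    """Schematic two-branch decomposition of a neural recording: an instantaneous,
    evoked foreground latent and a slowly varying, context-dependent background latent
    jointly reconstruct the observed signal."""
    def __init__(self, obs_dim=128, z_fg=16, z_bg=16):
        super().__init__()
        self.enc_fg = nn.Linear(obs_dim, 2 * z_fg)     # fast, evoked branch
        self.enc_bg = nn.Linear(obs_dim, 2 * z_bg)     # slow, context branch
        self.dec = nn.Linear(z_fg + z_bg, obs_dim)

    @staticmethod
    def sample(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

    def forward(self, x, x_context):
        z_fg = self.sample(self.enc_fg(x))             # inferred from the current window
        z_bg = self.sample(self.enc_bg(x_context))     # inferred from a longer context window
        return self.dec(torch.cat([z_fg, z_bg], dim=-1))

# Toy usage with random stand-ins for the current and context windows
model = TwoBranchVAE()
recon = model(torch.randn(8, 128), torch.randn(8, 128))
```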