October 19, 2019

3191 words 15 mins read

Paper Group ANR 386

Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling. Pay Attention to Virality: understanding popularity of social media videos with the attention mechanism. Deep reinforcement learning for search, recommendation, and online advertising: a survey. Color Sails: Discrete-Continuous Palettes for Deep Color Explo …

Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling

Title Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling
Authors Jacob Menick, Nal Kalchbrenner
Abstract The unconditional generation of high fidelity images is a longstanding benchmark for testing the performance of image decoders. Autoregressive image models have been able to generate small images unconditionally, but the extension of these methods to large images where fidelity can be more readily assessed has remained an open problem. Among the major challenges are the capacity to encode the vast previous context and the sheer difficulty of learning a distribution that preserves both global semantic coherence and exactness of detail. To address the former challenge, we propose the Subscale Pixel Network (SPN), a conditional decoder architecture that generates an image as a sequence of sub-images of equal size. The SPN compactly captures image-wide spatial dependencies and requires a fraction of the memory and the computation required by other fully autoregressive models. To address the latter challenge, we propose to use Multidimensional Upscaling to grow an image in both size and depth via intermediate stages utilising distinct SPNs. We evaluate SPNs on the unconditional generation of CelebAHQ of size 256 and of ImageNet from size 32 to 256. We achieve state-of-the-art likelihood results in multiple settings, set up new benchmark results in previously unexplored settings and are able to generate very high fidelity large scale samples on the basis of both datasets.
Tasks Image Generation
Published 2018-12-04
URL http://arxiv.org/abs/1812.01608v1
PDF http://arxiv.org/pdf/1812.01608v1.pdf
PWC https://paperswithcode.com/paper/generating-high-fidelity-images-with-subscale
Repo
Framework
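As a concrete illustration of the subscale ordering described in the abstract above, here is a minimal NumPy sketch that splits an image into equally sized sub-images by striding over pixels and interleaves them back together. The slicing scheme and sizes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def subscale_split(image, s=4):
    """Split an (H, W, C) image into s*s equally sized sub-images by taking
    every s-th pixel: sub-image (i, j) holds rows i, i+s, ... and columns j, j+s, ..."""
    h, w, _ = image.shape
    assert h % s == 0 and w % s == 0, "image size must be divisible by s"
    return [image[i::s, j::s] for i in range(s) for j in range(s)]

def subscale_merge(sub_images, s=4):
    """Inverse operation: interleave the sub-images back into the full image."""
    h, w, c = sub_images[0].shape
    image = np.zeros((h * s, w * s, c), dtype=sub_images[0].dtype)
    for k, sub in enumerate(sub_images):
        i, j = divmod(k, s)
        image[i::s, j::s] = sub
    return image

# Round trip on a random 256x256 RGB image.
x = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
subs = subscale_split(x, s=4)            # 16 sub-images of size 64x64
assert np.array_equal(subscale_merge(subs, s=4), x)
```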

Pay Attention to Virality: understanding popularity of social media videos with the attention mechanism

Title Pay Attention to Virality: understanding popularity of social media videos with the attention mechanism
Authors Adam Bielski, Tomasz Trzcinski
Abstract Predicting the popularity of social media videos before they are published is a challenging task, mainly due to the complexity of the content distribution network as well as the number of factors that play a part in this process. As solving this task provides tremendous help for media content creators, many successful machine learning methods have been proposed to solve it. In this work, we change the viewpoint and postulate that it is not only the predicted popularity that matters, but also, perhaps even more importantly, an understanding of how individual parts influence the final popularity score. To that end, we propose to combine the Grad-CAM visualization method with a soft attention mechanism. Our preliminary results show that this approach allows for a more intuitive interpretation of the content's impact on video popularity, while achieving competitive results in terms of prediction accuracy.
Tasks
Published 2018-04-26
URL http://arxiv.org/abs/1804.09949v1
PDF http://arxiv.org/pdf/1804.09949v1.pdf
PWC https://paperswithcode.com/paper/pay-attention-to-virality-understanding
Repo
Framework
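A minimal NumPy sketch of the soft attention mechanism mentioned above, pooling per-frame features into a video-level representation; the weight shapes and dimensions are hypothetical, and the actual model pairs this with Grad-CAM visualizations.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_pool(frame_features, w_att, v_att):
    """Soft attention over T per-frame feature vectors (T, D): score each frame,
    normalize with softmax, and return the attention-weighted average plus the weights."""
    scores = np.tanh(frame_features @ w_att) @ v_att   # (T,)
    alpha = softmax(scores)                            # attention weights, sum to 1
    pooled = alpha @ frame_features                    # (D,) video-level representation
    return pooled, alpha

rng = np.random.default_rng(0)
T, D, A = 30, 128, 64                  # frames, feature dim, attention dim (hypothetical)
frames = rng.normal(size=(T, D))
pooled, alpha = attention_pool(frames, rng.normal(size=(D, A)), rng.normal(size=(A,)))
print(alpha.argmax())                  # frame the model attends to most
```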

Deep reinforcement learning for search, recommendation, and online advertising: a survey

Title Deep reinforcement learning for search, recommendation, and online advertising: a survey
Authors Xiangyu Zhao, Long Xia, Jiliang Tang, Dawei Yin
Abstract Search, recommendation, and online advertising are the three most important information-providing mechanisms on the web. These information seeking techniques, which satisfy users’ information needs by suggesting personalized objects (information or services) at the appropriate time and place, play a crucial role in mitigating the information overload problem. With recent great advances in deep reinforcement learning (DRL), there has been increasing interest in developing DRL based information seeking techniques. These DRL based techniques have two key advantages: (1) they are able to continuously update information seeking strategies according to users’ real-time feedback, and (2) they can maximize the expected cumulative long-term reward from users, where the reward is defined differently across information seeking applications, e.g., click-through rate, revenue, user satisfaction and engagement. In this paper, we give an overview of deep reinforcement learning for search, recommendation, and online advertising from methodologies to applications, review representative algorithms, and discuss some appealing research directions.
Tasks
Published 2018-12-18
URL https://arxiv.org/abs/1812.07127v5
PDF https://arxiv.org/pdf/1812.07127v5.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-for-search
Repo
Framework
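The cumulative long-term reward that these DRL techniques maximize can be illustrated with a short sketch; the per-step reward values below are hypothetical stand-ins for clicks, revenue, or engagement signals.

```python
def discounted_return(rewards, gamma=0.95):
    """Cumulative discounted reward R = sum_t gamma^t * r_t for one session."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

# Hypothetical recommendation session: a click at step 1, a purchase at step 4.
session_rewards = [0.0, 1.0, 0.0, 0.0, 3.5]
print(discounted_return(session_rewards))   # the quantity a DRL agent maximizes in expectation
```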

Color Sails: Discrete-Continuous Palettes for Deep Color Exploration

Title Color Sails: Discrete-Continuous Palettes for Deep Color Exploration
Authors Maria Shugrina, Amlan Kar, Karan Singh, Sanja Fidler
Abstract We present color sails, a discrete-continuous color gamut representation that extends the color gradient analogy to three dimensions and allows interactive control of the color blending behavior. Our representation models a wide variety of color distributions in a compact manner, and lends itself to applications such as color exploration for graphic design, illustration and similar fields. We propose a Neural Network that can fit a color sail to any image. Then, the user can adjust color sail parameters to change the base colors, their blending behavior and the number of colors, exploring a wide range of options for the original design. In addition, we propose a Deep Learning model that learns to automatically segment an image into color-compatible alpha masks, each equipped with its own color sail. This allows targeted color exploration by either editing their corresponding color sails or using standard software packages. Our model is trained on a custom diverse dataset of art and design. We provide both quantitative evaluations, and a user study, demonstrating the effectiveness of color sail interaction. Interactive demos are available at www.colorsails.com.
Tasks
Published 2018-06-07
URL http://arxiv.org/abs/1806.02918v1
PDF http://arxiv.org/pdf/1806.02918v1.pdf
PWC https://paperswithcode.com/paper/color-sails-discrete-continuous-palettes-for
Repo
Framework
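A heavily simplified sketch of the color-blending idea above: sampling a discretized triangular gamut spanned by three base colors via barycentric interpolation. The real color sail parametrization, including its interactive blending control, is richer; this is only an illustrative stand-in.

```python
import numpy as np

def sample_triangle_palette(c0, c1, c2, n=5):
    """Discretize a triangular color gamut spanned by three vertex colors into
    n levels per side using barycentric interpolation (a simplified stand-in
    for the color sail parametrization)."""
    colors = []
    for i in range(n + 1):
        for j in range(n + 1 - i):
            a, b = i / n, j / n
            colors.append(a * c0 + b * c1 + (1 - a - b) * c2)
    return np.clip(np.array(colors), 0.0, 1.0)

red = np.array([0.85, 0.2, 0.2])
teal = np.array([0.2, 0.6, 0.6])
cream = np.array([0.95, 0.9, 0.8])
palette = sample_triangle_palette(red, teal, cream, n=4)   # 15 swatches
print(palette.shape)
```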

Global Sum Pooling: A Generalization Trick for Object Counting with Small Datasets of Large Images

Title Global Sum Pooling: A Generalization Trick for Object Counting with Small Datasets of Large Images
Authors Shubhra Aich, Ian Stavness
Abstract In this paper, we explore the problem of training one-look regression models for counting objects in datasets comprising a small number of high-resolution, variable-shaped images. We illustrate that conventional global average pooling (GAP) based models are unreliable due to the patchwise cancellation of true overestimates and underestimates for patchwise inference. To overcome this limitation and reduce overfitting caused by the training on full-resolution images, we propose to employ global sum pooling (GSP) instead of GAP or fully connected (FC) layers at the backend of a convolutional network. Although computationally equivalent to GAP, we show through comprehensive experimentation that GSP allows convolutional networks to learn the counting task as a simple linear mapping problem generalized over the input shape and the number of objects present. This generalization capability allows GSP to avoid both patchwise cancellation and overfitting by training on small patches and inference on full-resolution images as a whole. We evaluate our approach on four different aerial image datasets - two car counting datasets (CARPK and COWC), one crowd counting dataset (ShanghaiTech; parts A and B) and one new challenging dataset for wheat spike counting. Our GSP models improve upon the state-of-the-art approaches on all four datasets with a simple architecture. Also, GSP architectures trained with smaller-sized image patches exhibit better localization property due to their focus on learning from smaller regions while training.
Tasks Crowd Counting, Object Counting
Published 2018-05-28
URL https://arxiv.org/abs/1805.11123v2
PDF https://arxiv.org/pdf/1805.11123v2.pdf
PWC https://paperswithcode.com/paper/object-counting-with-small-datasets-of-large
Repo
Framework
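The difference between GAP and the proposed GSP is easy to see in a short NumPy sketch; the toy feature maps below are placeholders for real network activations.

```python
import numpy as np

def global_avg_pool(feature_map):
    """GAP: average over the spatial dimensions of an (H, W, C) feature map."""
    return feature_map.mean(axis=(0, 1))

def global_sum_pool(feature_map):
    """GSP: sum over the spatial dimensions. Unlike GAP, the output scales with
    spatial extent, so a linear count regressor trained on small patches
    transfers to full-resolution images."""
    return feature_map.sum(axis=(0, 1))

# Toy check: a feature map whose per-pixel activations encode object evidence.
small = np.ones((16, 16, 8))          # training-size patch
large = np.ones((64, 64, 8))          # full-resolution image (16x the area)
print(global_avg_pool(small)[0], global_avg_pool(large)[0])   # 1.0, 1.0 (GAP is size-invariant)
print(global_sum_pool(small)[0], global_sum_pool(large)[0])   # 256.0, 4096.0 (GSP grows with area)
```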

DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization

Title DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization
Authors Yuan Cheng, Guangya Li, Hai-Bao Chen, Sheldon X. -D. Tan, Hao Yu
Abstract Because video detection and classification require a huge number of parameters when exposed to high dimensional inputs, developing a compact yet accurate video comprehension system for terminal devices is a grand challenge. Current works optimize video detection and classification separately. In this paper, we introduce a video comprehension (object detection and action recognition) system for terminal devices, namely DEEPEYE. Based on You Only Look Once (YOLO), we develop an 8-bit quantization method for training YOLO, as well as a tensorized compression method for a Recurrent Neural Network (RNN) built on features extracted from YOLO. The developed quantization and tensorization can significantly compress the original network model while maintaining accuracy. Using the challenging video datasets MOMENTS and UCF11 as benchmarks, the results show that the proposed DEEPEYE achieves a 3.994x model compression rate with only a 0.47% decrease in mAP, as well as a 15,047x parameter reduction and 2.87x speed-up with a 16.58% accuracy improvement.
Tasks Model Compression, Object Detection, Quantization, Temporal Action Localization
Published 2018-05-21
URL http://arxiv.org/abs/1805.07935v2
PDF http://arxiv.org/pdf/1805.07935v2.pdf
PWC https://paperswithcode.com/paper/deepeye-a-compact-and-accurate-video
Repo
Framework
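A generic symmetric 8-bit quantizer as a sketch of the kind of quantization mentioned above; it is not necessarily DEEPEYE's exact training-time scheme.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric 8-bit quantization: map float weights to int8 with a single
    per-tensor scale (a generic scheme, not necessarily DEEPEYE's)."""
    max_abs = np.abs(weights).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(3, 3, 64, 64).astype(np.float32)   # a hypothetical conv kernel
q, s = quantize_int8(w)
print(np.abs(w - dequantize(q, s)).max())               # worst-case rounding error, about s/2
```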

Efficient Learning of Bounded-Treewidth Bayesian Networks from Complete and Incomplete Data Sets

Title Efficient Learning of Bounded-Treewidth Bayesian Networks from Complete and Incomplete Data Sets
Authors Mauro Scanagatta, Giorgio Corani, Marco Zaffalon, Jaemin Yoo, U Kang
Abstract Learning Bayesian networks with bounded treewidth is important for reducing the complexity of inference. We present a novel anytime algorithm (k-MAX) for this task, which scales up to thousands of variables. Through extensive experiments we show that it consistently yields higher-scoring structures than its competitors on complete data sets. We then consider the problem of structure learning from incomplete data sets. This can be addressed by structural EM, which however is computationally very demanding. We thus adopt the novel k-MAX algorithm in the maximization step of structural EM, obtaining an efficient computation of the expected sufficient statistics. We test the resulting structural EM method on the task of imputing missing data, comparing it against the state-of-the-art approach based on random forests. Our approach achieves the same imputation accuracy as the competitors, but in about one tenth of the time. Furthermore we show that it has worst-case complexity linear in the input size, and that it is easily parallelizable.
Tasks Imputation
Published 2018-02-07
URL http://arxiv.org/abs/1802.02468v1
PDF http://arxiv.org/pdf/1802.02468v1.pdf
PWC https://paperswithcode.com/paper/efficient-learning-of-bounded-treewidth
Repo
Framework

Joint Stem Detection and Crop-Weed Classification for Plant-specific Treatment in Precision Farming

Title Joint Stem Detection and Crop-Weed Classification for Plant-specific Treatment in Precision Farming
Authors Philipp Lottes, Jens Behley, Nived Chebrolu, Andres Milioto, Cyrill Stachniss
Abstract Applying agrochemicals is the default procedure for conventional weed control in crop production, but has negative impacts on the environment. Robots have the potential to treat every plant in the field individually and thus can reduce the required use of such chemicals. To achieve that, robots need the ability to identify crops and weeds in the field and must additionally select effective treatments. While certain types of weed can be treated mechanically, other types need to be treated by (selective) spraying. In this paper, we present an approach that provides the necessary information for effective plant-specific treatment. It outputs the stem location for weeds, which allows for mechanical treatments, and the covered area of the weed for selective spraying. Our approach uses an end-to-end trainable fully convolutional network that simultaneously estimates stem positions as well as the covered area of crops and weeds. It jointly learns the class-wise stem detection and the pixel-wise semantic segmentation. Experimental evaluations on different real-world datasets show that our approach is able to reliably solve this problem. Compared to state-of-the-art approaches, our approach not only substantially improves the stem detection accuracy, i.e., distinguishing crop and weed stems, but also provides an improvement in the semantic segmentation performance.
Tasks Semantic Segmentation
Published 2018-06-09
URL http://arxiv.org/abs/1806.03413v1
PDF http://arxiv.org/pdf/1806.03413v1.pdf
PWC https://paperswithcode.com/paper/joint-stem-detection-and-crop-weed
Repo
Framework
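A minimal PyTorch sketch of the joint architecture described above: a shared encoder with one decoder head for pixel-wise segmentation and one for stem-keypoint heatmaps. Layer counts and channel sizes are hypothetical, not the paper's network.

```python
import torch
import torch.nn as nn

class TwoHeadFCN(nn.Module):
    """Shared encoder with two decoders: one head for crop/weed/soil
    segmentation, one for stem-keypoint heatmaps (illustrative sizes only)."""
    def __init__(self, seg_classes=3, stem_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, seg_classes, 4, stride=2, padding=1),
        )
        self.stem_head = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, stem_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.seg_head(z), self.stem_head(z)

model = TwoHeadFCN()
seg_logits, stem_logits = model(torch.randn(1, 3, 256, 256))
print(seg_logits.shape, stem_logits.shape)   # both heads output full-resolution maps
```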

Fully Convolutional Networks with Sequential Information for Robust Crop and Weed Detection in Precision Farming

Title Fully Convolutional Networks with Sequential Information for Robust Crop and Weed Detection in Precision Farming
Authors Philipp Lottes, Jens Behley, Andres Milioto, Cyrill Stachniss
Abstract Reducing the use of agrochemicals is an important component towards sustainable agriculture. Robots that can perform targeted weed control offer the potential to contribute to this goal, for example, through specialized weeding actions such as selective spraying or mechanical weed removal. A prerequisite of such systems is a reliable and robust plant classification system that is able to distinguish crop and weed in the field. A major challenge in this context is the fact that different fields show a large variability. Thus, classification systems have to robustly cope with substantial environmental changes with respect to weed pressure and weed types, growth stages of the crop, visual appearance, and soil conditions. In this paper, we propose a novel crop-weed classification system that relies on a fully convolutional network with an encoder-decoder structure and incorporates spatial information by considering image sequences. Exploiting the crop arrangement information that is observable from the image sequences enables our system to robustly estimate a pixel-wise labeling of the images into crop and weed, i.e., a semantic segmentation. We provide a thorough experimental evaluation, which shows that our system generalizes well to previously unseen fields under varying environmental conditions, a key capability for actually using such systems in precision farming. We provide comparisons to other state-of-the-art approaches and show that our system substantially improves the accuracy of crop-weed classification without requiring a retraining of the model.
Tasks Semantic Segmentation
Published 2018-06-09
URL http://arxiv.org/abs/1806.03412v1
PDF http://arxiv.org/pdf/1806.03412v1.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-networks-with-sequential
Repo
Framework

Benchmark Visual Question Answer Models by using Focus Map

Title Benchmark Visual Question Answer Models by using Focus Map
Authors Wenda Qiu, Yueyang Xianzang, Zhekai Zhang
Abstract Inferring and Executing Programs for Visual Reasoning proposes a model for visual reasoning that consists of a program generator and an execution engine, avoiding end-to-end models. To show that the model actually learns which objects to focus on to answer the questions, the authors give a visualization of the norm of the gradient of the sum of the predicted answer scores with respect to the final feature map. However, the authors do not evaluate the effectiveness of this focus map. This paper proposes a method for evaluating it. We generate several kinds of questions to test different keywords. We infer focus maps from the model by asking these questions and evaluate them by comparing them with the segmentation graph. Furthermore, this method can be applied to any model from which focus maps can be inferred. By evaluating the focus maps of different models on the CLEVR dataset, we show that the CLEVR-iep model has learned where to focus better than end-to-end models.
Tasks Visual Reasoning
Published 2018-01-13
URL http://arxiv.org/abs/1801.05302v1
PDF http://arxiv.org/pdf/1801.05302v1.pdf
PWC https://paperswithcode.com/paper/benchmark-visual-question-answer-models-by
Repo
Framework
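A minimal PyTorch sketch of the gradient-based focus map described above: the norm of the gradient of the summed answer scores with respect to the final feature map. The tiny backbone and classifier below are placeholders for the actual VQA model, not CLEVR-iep itself.

```python
import torch
import torch.nn as nn

# Placeholder model: a conv backbone producing the "final feature map" and a
# classifier producing answer scores over a hypothetical 28-answer vocabulary.
backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 28))

image = torch.randn(1, 3, 224, 224)
feature_map = backbone(image)
feature_map.retain_grad()                 # keep the gradient on this non-leaf tensor
scores = classifier(feature_map)          # predicted answer scores
scores.sum().backward()

focus_map = feature_map.grad.norm(dim=1)  # per-pixel gradient norm -> (1, 224, 224)
focus_map = focus_map / focus_map.max()   # normalize before comparing with a segmentation
print(focus_map.shape)
```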

The Key Concepts of Ethics of Artificial Intelligence - A Keyword based Systematic Mapping Study

Title The Key Concepts of Ethics of Artificial Intelligence - A Keyword based Systematic Mapping Study
Authors Ville Vakkuri, Pekka Abrahamsson
Abstract The growing influence and decision-making capacities of autonomous systems and Artificial Intelligence in our lives force us to consider the values embedded in these systems. But how should ethics be implemented in these systems? In this study, the solution is seen in philosophical conceptualization as a framework for forming a practical implementation model for the ethics of AI. To take the first steps in this conceptualization, the main concepts used in the field need to be identified. A keyword based Systematic Mapping Study (SMS) of the keywords used in AI and ethics was conducted to help identify, define and compare the main concepts used in current AI ethics discourse. Out of 1062 papers retrieved, the SMS discovered 37 re-occurring keywords in 83 academic papers. We suggest that this focus on finding keywords is the first step in guiding and providing direction for future research in the AI ethics field.
Tasks Decision Making
Published 2018-09-19
URL http://arxiv.org/abs/1809.07027v1
PDF http://arxiv.org/pdf/1809.07027v1.pdf
PWC https://paperswithcode.com/paper/the-key-concepts-of-ethics-of-artificial
Repo
Framework

Computational Social Choice Meets Databases

Title Computational Social Choice Meets Databases
Authors Benny Kimelfeld, Phokion G. Kolaitis, Julia Stoyanovich
Abstract We develop a novel framework that aims to create bridges between the computational social choice and the database management communities. This framework enriches the tasks currently supported in computational social choice with relational database context, thus making it possible to formulate sophisticated queries about voting rules, candidates, voters, issues, and positions. At the conceptual level, we give rigorous semantics to queries in this framework by introducing the notions of necessary answers and possible answers to queries. At the technical level, we embark on an investigation of the computational complexity of the necessary answers. We establish a number of results about the complexity of the necessary answers of conjunctive queries involving positional scoring rules that contrast sharply with earlier results about the complexity of the necessary winners.
Tasks
Published 2018-05-10
URL http://arxiv.org/abs/1805.04156v1
PDF http://arxiv.org/pdf/1805.04156v1.pdf
PWC https://paperswithcode.com/paper/computational-social-choice-meets-databases
Repo
Framework
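A brute-force sketch of the necessary-winner notion mentioned above, under co-winner semantics for a positional scoring rule. Partial votes are represented directly by their sets of possible completions, which is only feasible for toy instances; the paper's contribution is the complexity analysis, not this enumeration.

```python
from itertools import permutations, product

candidates = ["a", "b", "c"]
plurality = [1, 0, 0]          # positional scoring vector: the top choice gets 1 point

def score(profile, scoring):
    totals = {c: 0 for c in candidates}
    for ranking in profile:
        for pos, c in enumerate(ranking):
            totals[c] += scoring[pos]
    return totals

def necessary_winners(partial_votes, scoring):
    """A candidate is a necessary winner if it is a (co-)winner under every
    completion of every partial vote; brute force over all completions."""
    winners = set(candidates)
    for profile in product(*partial_votes):
        totals = score(profile, scoring)
        best = max(totals.values())
        winners &= {c for c, s in totals.items() if s == best}
        if not winners:
            break
    return winners

# Voter 1 is fully ranked; voter 2 has only committed to placing "a" above "c".
vote1 = [("a", "b", "c")]
vote2 = [r for r in permutations(candidates) if r.index("a") < r.index("c")]
print(necessary_winners([vote1, vote2], plurality))   # {'a'}
```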

The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks

Title The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks
Authors Arjun Sondhi, Ali Shojaie
Abstract We consider the task of estimating a high-dimensional directed acyclic graph, given observations from a linear structural equation model with arbitrary noise distribution. By exploiting properties of common random graphs, we develop a new algorithm that requires conditioning only on small sets of variables. The proposed algorithm, which is essentially a modified version of the PC-Algorithm, offers significant gains in both computational complexity and estimation accuracy. In particular, it results in more efficient and accurate estimation in large networks containing hub nodes, which are common in biological systems. We prove the consistency of the proposed algorithm, and show that it also requires a less stringent faithfulness assumption than the PC-Algorithm. Simulations in low and high-dimensional settings are used to illustrate these findings. An application to gene expression data suggests that the proposed algorithm can identify a greater number of clinically relevant genes than current methods.
Tasks
Published 2018-06-16
URL https://arxiv.org/abs/1806.06209v2
PDF https://arxiv.org/pdf/1806.06209v2.pdf
PWC https://paperswithcode.com/paper/the-reduced-pc-algorithm-improved-causal
Repo
Framework
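A sketch of the key computational idea above, conditioning only on small sets: a partial-correlation independence test and a PC-style skeleton search restricted to conditioning sets of bounded size. The reduced PC-algorithm additionally exploits random-graph properties to limit these sets, which this sketch omits.

```python
import numpy as np
from itertools import combinations
from scipy import stats

def partial_corr(data, i, j, cond):
    """Partial correlation of variables i and j given the (small) set cond,
    computed from residuals of least-squares regressions."""
    if not cond:
        return np.corrcoef(data[:, i], data[:, j])[0, 1]
    Z = np.column_stack([data[:, list(cond)], np.ones(len(data))])
    ri = data[:, i] - Z @ np.linalg.lstsq(Z, data[:, i], rcond=None)[0]
    rj = data[:, j] - Z @ np.linalg.lstsq(Z, data[:, j], rcond=None)[0]
    return np.corrcoef(ri, rj)[0, 1]

def independent(data, i, j, cond, alpha=0.01):
    """Fisher z-test for zero partial correlation."""
    n = len(data)
    r = np.clip(partial_corr(data, i, j, cond), -0.9999, 0.9999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(cond) - 3)
    return 2 * (1 - stats.norm.cdf(abs(z))) > alpha

def skeleton(data, max_cond=1):
    """PC-style skeleton search that only conditions on sets of size <= max_cond."""
    p = data.shape[1]
    adj = {(i, j) for i in range(p) for j in range(p) if i < j}
    for i, j in list(adj):
        others = [k for k in range(p) if k not in (i, j)]
        for size in range(max_cond + 1):
            if any(independent(data, i, j, c) for c in combinations(others, size)):
                adj.discard((i, j))
                break
    return adj

# Toy chain x0 -> x1 -> x2: the edge (x0, x2) should vanish given {x1}.
rng = np.random.default_rng(1)
x0 = rng.normal(size=2000)
x1 = 0.8 * x0 + rng.normal(size=2000)
x2 = 0.8 * x1 + rng.normal(size=2000)
print(skeleton(np.column_stack([x0, x1, x2]), max_cond=1))   # {(0, 1), (1, 2)}
```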

Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge

Title Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge
Authors Steven Derby, Paul Miller, Brian Murphy, Barry Devereux
Abstract Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic knowledge. Moreover, embeddings are often built from a single source of information (typically text data), even though neurocognitive research suggests that semantics is deeply linked to both language and perception. In this paper, we combine multimodal information from both text and image-based representations derived from state-of-the-art distributional models to produce sparse, interpretable vectors using Joint Non-Negative Sparse Embedding. Through in-depth analyses comparing these sparse models to human-derived behavioural and neuroimaging data, we demonstrate their ability to predict interpretable linguistic descriptions of human ground-truth semantic knowledge.
Tasks
Published 2018-09-07
URL http://arxiv.org/abs/1809.02534v3
PDF http://arxiv.org/pdf/1809.02534v3.pdf
PWC https://paperswithcode.com/paper/using-sparse-semantic-embeddings-learned-from
Repo
Framework
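A simplified stand-in for the Joint Non-Negative Sparse Embedding used above: factorize concatenated non-negative text and image feature matrices with a shared code via multiplicative updates. The explicit sparsity penalty of the actual method is omitted, and all feature matrices are hypothetical.

```python
import numpy as np

def joint_nmf(X_text, X_image, k=10, iters=200, eps=1e-9):
    """Learn one shared non-negative code A per word and separate dictionaries
    for the text and image views by factorizing the concatenated matrix with
    multiplicative updates (inputs must be non-negative; no sparsity penalty here)."""
    X = np.hstack([X_text, X_image])               # words x (text dims + image dims)
    rng = np.random.default_rng(0)
    A = rng.random((X.shape[0], k))                # shared word codes
    D = rng.random((k, X.shape[1]))                # stacked dictionaries
    for _ in range(iters):
        A *= (X @ D.T) / (A @ D @ D.T + eps)
        D *= (A.T @ X) / (A.T @ A @ D + eps)
    return A, D[:, :X_text.shape[1]], D[:, X_text.shape[1]:]

# Hypothetical non-negative features for 500 words: 300-d text, 128-d image.
rng = np.random.default_rng(1)
A, D_text, D_image = joint_nmf(rng.random((500, 300)), rng.random((500, 128)), k=20)
print(A.shape, D_text.shape, D_image.shape)        # (500, 20) (20, 300) (20, 128)
```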

Expansional Retrofitting for Word Vector Enrichment

Title Expansional Retrofitting for Word Vector Enrichment
Authors Hwiyeol Jo
Abstract Retrofitting techniques, which inject external resources into word representations, have compensated for the weakness of distributed representations in capturing semantic and relational knowledge between words. Implicitly retrofitting word vectors with an expansional technique outperforms retrofitting on word similarity tasks while generalizing the word vectors. In this paper, we propose unsupervised extrofitting: expansional retrofitting (extrofitting) without external semantic lexicons. We also propose deep extrofitting: in-depth stacking of extrofitting and further combinations of extrofitting with retrofitting. In experiments with GloVe, we show that our methods outperform the previous methods on most word similarity tasks while requiring only synonyms as an external resource. Lastly, we show the effect of word vector enrichment on a downstream text classification task.
Tasks Text Classification
Published 2018-08-22
URL http://arxiv.org/abs/1808.07337v3
PDF http://arxiv.org/pdf/1808.07337v3.pdf
PWC https://paperswithcode.com/paper/expansional-retrofitting-for-word-vector
Repo
Framework
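For context, a sketch of the classic retrofitting update that extrofitting builds on: each word vector is iteratively pulled toward the average of its synonyms while staying close to its original embedding. The extrofitting expansion-and-projection step itself is not reproduced here, and the lexicon and vectors are hypothetical.

```python
import numpy as np

def retrofit(vectors, synonyms, iters=10, alpha=1.0, beta=1.0):
    """Classic retrofitting: blend each word's original vector with the mean of
    its synonyms' current vectors; words without synonyms stay unchanged."""
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iters):
        for w, neighbors in synonyms.items():
            nbrs = [new[n] for n in neighbors if n in new]
            if not nbrs:
                continue
            new[w] = (alpha * vectors[w] + beta * np.mean(nbrs, axis=0)) / (alpha + beta)
    return new

# Hypothetical 4-dimensional vectors and a tiny synonym lexicon.
vecs = {w: np.random.randn(4) for w in ["happy", "glad", "joyful", "table"]}
lex = {"happy": ["glad", "joyful"], "glad": ["happy"], "joyful": ["happy"]}
out = retrofit(vecs, lex)
print(round(np.linalg.norm(vecs["happy"] - vecs["glad"]), 3),
      round(np.linalg.norm(out["happy"] - out["glad"]), 3))   # synonyms typically move closer
```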