Paper Group ANR 386
Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling
Title | Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling |
Authors | Jacob Menick, Nal Kalchbrenner |
Abstract | The unconditional generation of high fidelity images is a longstanding benchmark for testing the performance of image decoders. Autoregressive image models have been able to generate small images unconditionally, but the extension of these methods to large images where fidelity can be more readily assessed has remained an open problem. Among the major challenges are the capacity to encode the vast previous context and the sheer difficulty of learning a distribution that preserves both global semantic coherence and exactness of detail. To address the former challenge, we propose the Subscale Pixel Network (SPN), a conditional decoder architecture that generates an image as a sequence of sub-images of equal size. The SPN compactly captures image-wide spatial dependencies and requires a fraction of the memory and the computation required by other fully autoregressive models. To address the latter challenge, we propose to use Multidimensional Upscaling to grow an image in both size and depth via intermediate stages utilising distinct SPNs. We evaluate SPNs on the unconditional generation of CelebAHQ of size 256 and of ImageNet from size 32 to 256. We achieve state-of-the-art likelihood results in multiple settings, set up new benchmark results in previously unexplored settings and are able to generate very high fidelity large scale samples on the basis of both datasets. |
Tasks | Image Generation |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01608v1 |
http://arxiv.org/pdf/1812.01608v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-high-fidelity-images-with-subscale |
Repo | |
Framework | |
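The abstract above describes an SPN as generating an image as a sequence of sub-images of equal size. The sketch below shows one way such a decomposition could look, assuming the sub-images are interleaved slices taken with a fixed stride. The abstract does not spell out the slicing scheme, so the stride `s` and the helper names `subscale_slices` and `merge_slices` are illustrative assumptions, not the authors' code.

```python
import numpy as np

def subscale_slices(image, s):
    """Split an (H, W, C) image into s*s interleaved sub-images of equal size.

    Assumption: sub-image (r, c) keeps every s-th pixel starting at row r, column c.
    """
    h, w = image.shape[:2]
    if h % s or w % s:
        raise ValueError("height and width must be divisible by s")
    return [image[r::s, c::s] for r in range(s) for c in range(s)]

def merge_slices(slices, s):
    """Inverse of subscale_slices: interleave the sub-images back into one image."""
    sub_h, sub_w = slices[0].shape[:2]
    out = np.empty((sub_h * s, sub_w * s) + slices[0].shape[2:], dtype=slices[0].dtype)
    for i, sub in enumerate(slices):
        r, c = divmod(i, s)
        out[r::s, c::s] = sub
    return out

# Toy 8x8 RGB image split into 2*2 = 4 sub-images of size 4x4.
image = np.arange(8 * 8 * 3).reshape(8, 8, 3)
slices = subscale_slices(image, 2)
restored = merge_slices(slices, 2)
```

Under this assumption every sub-image covers the whole spatial extent at lower resolution, so a decoder that generates them one after another can condition each one on the sub-images already produced.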
Pay Attention to Virality: understanding popularity of social media videos with the attention mechanism
Title | Pay Attention to Virality: understanding popularity of social media videos with the attention mechanism |
Authors | Adam Bielski, Tomasz Trzcinski |
Abstract | Predicting the popularity of social media videos before they are published is a challenging task, mainly due to the complexity of the content distribution network as well as the number of factors that play a part in this process. As solving this task provides tremendous help for media content creators, many successful methods have been proposed to solve this problem with machine learning. In this work, we change the viewpoint and postulate that it is not only the predicted popularity that matters, but also, maybe even more importantly, an understanding of how individual parts influence the final popularity score. To that end, we propose to combine the Grad-CAM visualization method with a soft attention mechanism. Our preliminary results show that this approach allows for a more intuitive interpretation of the content's impact on video popularity, while achieving competitive results in terms of prediction accuracy. |
Tasks | |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.09949v1 |
http://arxiv.org/pdf/1804.09949v1.pdf | |
PWC | https://paperswithcode.com/paper/pay-attention-to-virality-understanding |
Repo | |
Framework | |
Deep reinforcement learning for search, recommendation, and online advertising: a survey
Title | Deep reinforcement learning for search, recommendation, and online advertising: a survey |
Authors | Xiangyu Zhao, Long Xia, Jiliang Tang, Dawei Yin |
Abstract | Search, recommendation, and online advertising are the three most important information-providing mechanisms on the web. These information seeking techniques, which satisfy users’ information needs by suggesting personalized objects (information or services) to users at the appropriate time and place, play a crucial role in mitigating the information overload problem. With recent great advances in deep reinforcement learning (DRL), there has been increasing interest in developing DRL based information seeking techniques. These DRL based techniques have two key advantages – (1) they are able to continuously update information seeking strategies according to users’ real-time feedback, and (2) they can maximize the expected cumulative long-term reward from users, where the reward is defined differently depending on the information seeking application, for example as click-through rate, revenue, user satisfaction or engagement. In this paper, we give an overview of deep reinforcement learning for search, recommendation, and online advertising from methodologies to applications, review representative algorithms, and discuss some appealing research directions. |
Tasks | |
Published | 2018-12-18 |
URL | https://arxiv.org/abs/1812.07127v5 |
https://arxiv.org/pdf/1812.07127v5.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-for-search |
Repo | |
Framework | |
Color Sails: Discrete-Continuous Palettes for Deep Color Exploration
Title | Color Sails: Discrete-Continuous Palettes for Deep Color Exploration |
Authors | Maria Shugrina, Amlan Kar, Karan Singh, Sanja Fidler |
Abstract | We present color sails, a discrete-continuous color gamut representation that extends the color gradient analogy to three dimensions and allows interactive control of the color blending behavior. Our representation models a wide variety of color distributions in a compact manner, and lends itself to applications such as color exploration for graphic design, illustration and similar fields. We propose a Neural Network that can fit a color sail to any image. Then, the user can adjust color sail parameters to change the base colors, their blending behavior and the number of colors, exploring a wide range of options for the original design. In addition, we propose a Deep Learning model that learns to automatically segment an image into color-compatible alpha masks, each equipped with its own color sail. This allows targeted color exploration by either editing their corresponding color sails or using standard software packages. Our model is trained on a custom diverse dataset of art and design. We provide both quantitative evaluations, and a user study, demonstrating the effectiveness of color sail interaction. Interactive demos are available at www.colorsails.com. |
Tasks | |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02918v1 |
http://arxiv.org/pdf/1806.02918v1.pdf | |
PWC | https://paperswithcode.com/paper/color-sails-discrete-continuous-palettes-for |
Repo | |
Framework | |
Global Sum Pooling: A Generalization Trick for Object Counting with Small Datasets of Large Images
Title | Global Sum Pooling: A Generalization Trick for Object Counting with Small Datasets of Large Images |
Authors | Shubhra Aich, Ian Stavness |
Abstract | In this paper, we explore the problem of training one-look regression models for counting objects in datasets comprising a small number of high-resolution, variable-shaped images. We illustrate that conventional global average pooling (GAP) based models are unreliable due to the patchwise cancellation of true overestimates and underestimates for patchwise inference. To overcome this limitation and reduce overfitting caused by training on full-resolution images, we propose to employ global sum pooling (GSP) instead of GAP or fully connected (FC) layers at the backend of a convolutional network. Although computationally equivalent to GAP, we show through comprehensive experimentation that GSP allows convolutional networks to learn the counting task as a simple linear mapping problem generalized over the input shape and the number of objects present. This generalization capability allows GSP to avoid both patchwise cancellation and overfitting by training on small patches and running inference on full-resolution images as a whole. We evaluate our approach on four different aerial image datasets - two car counting datasets (CARPK and COWC), one crowd counting dataset (ShanghaiTech; parts A and B) and one new challenging dataset for wheat spike counting. Our GSP models improve upon the state-of-the-art approaches on all four datasets with a simple architecture. Also, GSP architectures trained with smaller-sized image patches exhibit better localization properties due to their focus on learning from smaller regions while training. |
Tasks | Crowd Counting, Object Counting |
Published | 2018-05-28 |
URL | https://arxiv.org/abs/1805.11123v2 |
https://arxiv.org/pdf/1805.11123v2.pdf | |
PWC | https://paperswithcode.com/paper/object-counting-with-small-datasets-of-large |
Repo | |
Framework | |
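The global sum pooling (GSP) abstract above contrasts summing a feature map over its spatial dimensions with averaging it (GAP). The NumPy sketch below illustrates only the pooling step, on a random array standing in for a convolutional feature map. The function names and the toy shapes are assumptions for illustration, and the sketch ignores border effects that a real convolutional backbone would introduce between patches.

```python
import numpy as np

def global_sum_pool(fmap):
    """Sum an (H, W, C) feature map over its spatial dimensions -> (C,)."""
    return fmap.sum(axis=(0, 1))

def global_avg_pool(fmap):
    """Average an (H, W, C) feature map over its spatial dimensions -> (C,)."""
    return fmap.mean(axis=(0, 1))

rng = np.random.default_rng(0)
fmap = rng.random((8, 12, 4))  # stand-in for a feature map of a full image

# Cut the map into six non-overlapping 4x4 patches.
patches = [fmap[r:r + 4, c:c + 4] for r in range(0, 8, 4) for c in range(0, 12, 4)]

full_sum = global_sum_pool(fmap)
patch_sum = sum(global_sum_pool(p) for p in patches)  # additive over patches
full_avg = global_avg_pool(fmap)                      # equals full_sum / (H * W)
```

The sum-pooled vector of the full map equals the sum of the per-patch vectors, so its magnitude grows with the input area. The average-pooled vector is the same quantity divided by `H * W`, which removes that dependence on input size. This is one way to read the abstract's remark that GSP lets a model trained on small patches be applied to full-resolution images.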
DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization
Title | DEEPEYE: A Compact and Accurate Video Comprehension at Terminal Devices Compressed with Quantization and Tensorization |
Authors | Yuan Cheng, Guangya Li, Hai-Bao Chen, Sheldon X. -D. Tan, Hao Yu |
Abstract | Because video detection and classification models need a huge number of parameters to handle high-dimensional inputs, developing a compact yet accurate video comprehension system for terminal devices is a grand challenge. Current work optimizes video detection and classification separately. In this paper, we introduce a video comprehension (object detection and action recognition) system for terminal devices, namely DEEPEYE. Based on You Only Look Once (YOLO), we have developed an 8-bit quantization method for training YOLO, as well as a tensorized-compression method for a Recurrent Neural Network (RNN) built on features extracted from YOLO. The developed quantization and tensorization can significantly compress the original network model while maintaining accuracy. Using the challenging video datasets MOMENTS and UCF11 as benchmarks, the results show that the proposed DEEPEYE achieves a 3.994x model compression rate with only a 0.47% decrease in mAP, and a 15,047x parameter reduction and 2.87x speed-up with a 16.58% accuracy improvement. |
Tasks | Model Compression, Object Detection, Quantization, Temporal Action Localization |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07935v2 |
http://arxiv.org/pdf/1805.07935v2.pdf | |
PWC | https://paperswithcode.com/paper/deepeye-a-compact-and-accurate-video |
Repo | |
Framework | |
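The DEEPEYE abstract above mentions 8-bit quantization but does not describe the scheme. As a generic illustration only, the sketch below applies symmetric per-tensor quantization to an already trained weight array. This is an assumed, simplified stand-in and not the training-time method of the paper; `quantize_int8` and `dequantize` are hypothetical helper names.

```python
import numpy as np

def quantize_int8(weights):
    """Map float weights to int8 with a single symmetric scale (assumed scheme)."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 codes."""
    return q.astype(np.float32) * scale

weights = np.linspace(-1.0, 1.0, 11).astype(np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
max_error = float(np.max(np.abs(recovered - weights)))
```

Storing `q` instead of `weights` takes one byte per value rather than four, and the rounding error per weight is bounded by half the scale.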
Efficient Learning of Bounded-Treewidth Bayesian Networks from Complete and Incomplete Data Sets
Title | Efficient Learning of Bounded-Treewidth Bayesian Networks from Complete and Incomplete Data Sets |
Authors | Mauro Scanagatta, Giorgio Corani, Marco Zaffalon, Jaemin Yoo, U Kang |
Abstract | Learning Bayesian networks with bounded treewidth is important for reducing the complexity of inference. We present a novel anytime algorithm (k-MAX) for this task, which scales up to thousands of variables. Through extensive experiments we show that it consistently yields higher-scoring structures than its competitors on complete data sets. We then consider the problem of structure learning from incomplete data sets. This can be addressed by structural EM, which however is computationally very demanding. We thus adopt the novel k-MAX algorithm in the maximization step of structural EM, obtaining an efficient computation of the expected sufficient statistics. We test the resulting structural EM method on the task of imputing missing data, comparing it against the state-of-the-art approach based on random forests. Our approach achieves the same imputation accuracy as the competitors, but in about one tenth of the time. Furthermore, we show that it has worst-case complexity linear in the input size, and that it is easily parallelizable. |
Tasks | Imputation |
Published | 2018-02-07 |
URL | http://arxiv.org/abs/1802.02468v1 |
http://arxiv.org/pdf/1802.02468v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-learning-of-bounded-treewidth |
Repo | |
Framework | |
Joint Stem Detection and Crop-Weed Classification for Plant-specific Treatment in Precision Farming
Title | Joint Stem Detection and Crop-Weed Classification for Plant-specific Treatment in Precision Farming |
Authors | Philipp Lottes, Jens Behley, Nived Chebrolu, Andres Milioto, Cyrill Stachniss |
Abstract | Applying agrochemicals is the default procedure for conventional weed control in crop production, but has negative impacts on the environment. Robots have the potential to treat every plant in the field individually and thus can reduce the required use of such chemicals. To achieve that, robots need the ability to identify crops and weeds in the field and must additionally select effective treatments. While certain types of weed can be treated mechanically, other types need to be treated by (selective) spraying. In this paper, we present an approach that provides the necessary information for effective plant-specific treatment. It outputs the stem location for weeds, which allows for mechanical treatments, and the covered area of the weed for selective spraying. Our approach uses an end-to-end trainable fully convolutional network that simultaneously estimates stem positions as well as the covered area of crops and weeds. It jointly learns the class-wise stem detection and the pixel-wise semantic segmentation. Experimental evaluations on different real-world datasets show that our approach is able to reliably solve this problem. Compared to state-of-the-art approaches, our approach not only substantially improves the stem detection accuracy, i.e., distinguishing crop and weed stems, but also provides an improvement in the semantic segmentation performance. |
Tasks | Semantic Segmentation |
Published | 2018-06-09 |
URL | http://arxiv.org/abs/1806.03413v1 |
http://arxiv.org/pdf/1806.03413v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-stem-detection-and-crop-weed |
Repo | |
Framework | |
Fully Convolutional Networks with Sequential Information for Robust Crop and Weed Detection in Precision Farming
Title | Fully Convolutional Networks with Sequential Information for Robust Crop and Weed Detection in Precision Farming |
Authors | Philipp Lottes, Jens Behley, Andres Milioto, Cyrill Stachniss |
Abstract | Reducing the use of agrochemicals is an important component of sustainable agriculture. Robots that can perform targeted weed control offer the potential to contribute to this goal, for example, through specialized weeding actions such as selective spraying or mechanical weed removal. A prerequisite of such systems is a reliable and robust plant classification system that is able to distinguish crop and weed in the field. A major challenge in this context is the fact that different fields show a large variability. Thus, classification systems have to robustly cope with substantial environmental changes with respect to weed pressure and weed types, growth stages of the crop, visual appearance, and soil conditions. In this paper, we propose a novel crop-weed classification system that relies on a fully convolutional network with an encoder-decoder structure and incorporates spatial information by considering image sequences. Exploiting the crop arrangement information that is observable from the image sequences enables our system to robustly estimate a pixel-wise labeling of the images into crop and weed, i.e., a semantic segmentation. We provide a thorough experimental evaluation, which shows that our system generalizes well to previously unseen fields under varying environmental conditions, a key capability for actually using such systems in precision farming. We provide comparisons to other state-of-the-art approaches and show that our system substantially improves the accuracy of crop-weed classification without requiring a retraining of the model. |
Tasks | Semantic Segmentation |
Published | 2018-06-09 |
URL | http://arxiv.org/abs/1806.03412v1 |
http://arxiv.org/pdf/1806.03412v1.pdf | |
PWC | https://paperswithcode.com/paper/fully-convolutional-networks-with-sequential |
Repo | |
Framework | |
Benchmark Visual Question Answer Models by using Focus Map
Title | Benchmark Visual Question Answer Models by using Focus Map |
Authors | Wenda Qiu, Yueyang Xianzang, Zhekai Zhang |
Abstract | "Inferring and Executing Programs for Visual Reasoning" proposes a model for visual reasoning that consists of a program generator and an execution engine to avoid end-to-end models. To show that the model actually learns which objects to focus on to answer the questions, the authors give a visualization of the norm of the gradient of the sum of the predicted answer scores with respect to the final feature map. However, the authors do not evaluate the efficiency of the focus map. This paper proposes a method for evaluating it. We generate several kinds of questions to test different keywords. We infer focus maps from the model by asking these questions and evaluate them by comparing them with the segmentation graph. Furthermore, this method can be applied to any model from which focus maps can be inferred. By evaluating the focus maps of different models on the CLEVR dataset, we show that the CLEVR-iep model has learned where to focus better than end-to-end models. |
Tasks | Visual Reasoning |
Published | 2018-01-13 |
URL | http://arxiv.org/abs/1801.05302v1 |
http://arxiv.org/pdf/1801.05302v1.pdf | |
PWC | https://paperswithcode.com/paper/benchmark-visual-question-answer-models-by |
Repo | |
Framework | |
The Key Concepts of Ethics of Artificial Intelligence - A Keyword based Systematic Mapping Study
Title | The Key Concepts of Ethics of Artificial Intelligence - A Keyword based Systematic Mapping Study |
Authors | Ville Vakkuri, Pekka Abrahamsson |
Abstract | The growing influence and decision-making capacities of Autonomous systems and Artificial Intelligence in our lives force us to consider the values embedded in these systems. But how should ethics be implemented in these systems? In this study, the solution is seen in philosophical conceptualization as a framework for forming a practical implementation model for the ethics of AI. To take the first steps toward conceptualization, the main concepts used in the field need to be identified. A keyword based Systematic Mapping Study (SMS) on the keywords used in AI and ethics was conducted to help in identifying, defining and comparing the main concepts used in current AI ethics discourse. Out of 1062 papers retrieved, the SMS discovered 37 recurring keywords in 83 academic papers. We suggest that the focus on finding keywords is the first step in guiding and providing direction for future research in the AI ethics field. |
Tasks | Decision Making |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07027v1 |
http://arxiv.org/pdf/1809.07027v1.pdf | |
PWC | https://paperswithcode.com/paper/the-key-concepts-of-ethics-of-artificial |
Repo | |
Framework | |
Computational Social Choice Meets Databases
Title | Computational Social Choice Meets Databases |
Authors | Benny Kimelfeld, Phokion G. Kolaitis, Julia Stoyanovich |
Abstract | We develop a novel framework that aims to create bridges between the computational social choice and the database management communities. This framework enriches the tasks currently supported in computational social choice with relational database context, thus making it possible to formulate sophisticated queries about voting rules, candidates, voters, issues, and positions. At the conceptual level, we give rigorous semantics to queries in this framework by introducing the notions of necessary answers and possible answers to queries. At the technical level, we embark on an investigation of the computational complexity of the necessary answers. We establish a number of results about the complexity of the necessary answers of conjunctive queries involving positional scoring rules that contrast sharply with earlier results about the complexity of the necessary winners. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.04156v1 |
http://arxiv.org/pdf/1805.04156v1.pdf | |
PWC | https://paperswithcode.com/paper/computational-social-choice-meets-databases |
Repo | |
Framework | |
The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks
Title | The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks |
Authors | Arjun Sondhi, Ali Shojaie |
Abstract | We consider the task of estimating a high-dimensional directed acyclic graph, given observations from a linear structural equation model with arbitrary noise distribution. By exploiting properties of common random graphs, we develop a new algorithm that requires conditioning only on small sets of variables. The proposed algorithm, which is essentially a modified version of the PC-Algorithm, offers significant gains in both computational complexity and estimation accuracy. In particular, it results in more efficient and accurate estimation in large networks containing hub nodes, which are common in biological systems. We prove the consistency of the proposed algorithm, and show that it also requires a less stringent faithfulness assumption than the PC-Algorithm. Simulations in low and high-dimensional settings are used to illustrate these findings. An application to gene expression data suggests that the proposed algorithm can identify a greater number of clinically relevant genes than current methods. |
Tasks | |
Published | 2018-06-16 |
URL | https://arxiv.org/abs/1806.06209v2 |
https://arxiv.org/pdf/1806.06209v2.pdf | |
PWC | https://paperswithcode.com/paper/the-reduced-pc-algorithm-improved-causal |
Repo | |
Framework | |
Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge
Title | Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge |
Authors | Steven Derby, Paul Miller, Brian Murphy, Barry Devereux |
Abstract | Distributional models provide a convenient way to model semantics using dense embedding spaces derived from unsupervised learning algorithms. However, the dimensions of dense embedding spaces are not designed to resemble human semantic knowledge. Moreover, embeddings are often built from a single source of information (typically text data), even though neurocognitive research suggests that semantics is deeply linked to both language and perception. In this paper, we combine multimodal information from both text and image-based representations derived from state-of-the-art distributional models to produce sparse, interpretable vectors using Joint Non-Negative Sparse Embedding. Through in-depth analyses comparing these sparse models to human-derived behavioural and neuroimaging data, we demonstrate their ability to predict interpretable linguistic descriptions of human ground-truth semantic knowledge. |
Tasks | |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02534v3 |
http://arxiv.org/pdf/1809.02534v3.pdf | |
PWC | https://paperswithcode.com/paper/using-sparse-semantic-embeddings-learned-from |
Repo | |
Framework | |
Expansional Retrofitting for Word Vector Enrichment
Title | Expansional Retrofitting for Word Vector Enrichment |
Authors | Hwiyeol Jo |
Abstract | Retrofitting techniques, which inject external resources into word representations, have compensated for the weakness of distributed representations in semantic and relational knowledge between words. Implicitly retrofitting word vectors by an expansional technique outperforms retrofitting in word similarity tasks with word vector generalization. In this paper, we propose unsupervised extrofitting: expansional retrofitting (extrofitting) without external semantic lexicons. We also propose deep extrofitting: in-depth stacking of extrofitting and further combinations of extrofitting with retrofitting. When experimenting with GloVe, we show that our methods outperform the previous methods on most word similarity tasks while requiring only synonyms as an external resource. Lastly, we show the effect of word vector enrichment on text classification as a downstream task. |
Tasks | Text Classification |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07337v3 |
http://arxiv.org/pdf/1808.07337v3.pdf | |
PWC | https://paperswithcode.com/paper/expansional-retrofitting-for-word-vector |
Repo | |
Framework | |