Paper Group ANR 86
Learning to Importance Sample in Primary Sample Space. Optimize Deep Convolutional Neural Network with Ternarized Weights and High Accuracy. Mapping Informal Settlements in Developing Countries with Multi-resolution, Multi-spectral Data. Recall Traces: Backtracking Models for Efficient Reinforcement Learning. Replicating Active Appearance Model by …
Learning to Importance Sample in Primary Sample Space
Title | Learning to Importance Sample in Primary Sample Space |
Authors | Quan Zheng, Matthias Zwicker |
Abstract | Importance sampling is one of the most widely used variance reduction strategies in Monte Carlo rendering. In this paper, we propose a novel importance sampling technique that uses a neural network to learn how to sample from a desired density represented by a set of samples. Our approach considers an existing Monte Carlo rendering algorithm as a black box. During a scene-dependent training phase, we learn to generate samples with a desired density in the primary sample space of the rendering algorithm using maximum likelihood estimation. We leverage a recent neural network architecture that was designed to represent real-valued non-volume preserving (‘Real NVP’) transformations in high dimensional spaces. We use Real NVP to non-linearly warp primary sample space and obtain desired densities. In addition, Real NVP efficiently computes the determinant of the Jacobian of the warp, which is required to implement the change of integration variables implied by the warp. A main advantage of our approach is that it is agnostic of underlying light transport effects, and can be combined with many existing rendering techniques by treating them as a black box. We show that our approach leads to effective variance reduction in several practical scenarios. |
Tasks | |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07840v1 |
PDF | http://arxiv.org/pdf/1808.07840v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-importance-sample-in-primary |
Repo | |
Framework | |
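To make the warping idea above concrete, here is a minimal sketch of a single Real NVP affine coupling layer acting on a low-dimensional primary sample space, together with the log-determinant of its Jacobian. The scale/translation "networks" are fixed random linear maps (placeholders, not networks trained by maximum likelihood as in the paper), and the mapping back onto the unit hypercube is omitted.

```python
import numpy as np

# Minimal sketch of one Real NVP affine coupling layer warping a point of
# primary sample space, with the log-determinant of its Jacobian.
# ASSUMPTION: the scale/translation "networks" are fixed random linear maps,
# not networks trained by maximum likelihood as in the paper.

rng = np.random.default_rng(0)
D = 4                                     # dimensionality of primary sample space
W_s = 0.1 * rng.standard_normal((D // 2, D // 2))
W_t = 0.1 * rng.standard_normal((D // 2, D // 2))

def coupling_forward(u):
    """Warp u; return the warped sample and log|det J| of the transform."""
    u1, u2 = u[: D // 2], u[D // 2:]
    s = np.tanh(W_s @ u1)                 # log-scales predicted from the fixed half
    t = W_t @ u1                          # translations
    y2 = u2 * np.exp(s) + t               # affine transform of the other half
    log_det = np.sum(s)                   # triangular Jacobian -> sum of log-scales
    return np.concatenate([u1, y2]), log_det

u = rng.uniform(size=D)                   # uniform primary sample
y, log_det = coupling_forward(u)
# Change of variables: the warped density is p_U(u) * exp(-log_det), which is
# the quantity a renderer would use as the importance-sampling density.
print(y, log_det)
```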
Optimize Deep Convolutional Neural Network with Ternarized Weights and High Accuracy
Title | Optimize Deep Convolutional Neural Network with Ternarized Weights and High Accuracy |
Authors | Zhezhi He, Boqing Gong, Deliang Fan |
Abstract | Deep convolutional neural networks have achieved great success in many artificial intelligence applications. However, their enormous model size and massive computation cost have become the main obstacles to deploying such powerful algorithms in low-power, resource-limited embedded systems. As a countermeasure, in this work we propose statistical weight scaling and residual expansion methods to reduce the bit-width of the whole network’s weight parameters to ternary values (i.e., -1, 0, +1), with the objective of greatly reducing model size, computation cost, and the accuracy degradation caused by model compression. With about a 16x model compression rate, our ternarized ResNet-32/44/56 outperforms its full-precision counterparts by 0.12%, 0.24% and 0.18% on the CIFAR-10 dataset. We also test our ternarization method with AlexNet and ResNet-18 on the ImageNet dataset, both of which achieve the best top-1 accuracy among recent similar works at the same 16x compression rate. When our residual expansion method is further incorporated, our ternarized ResNet-18 even improves top-5 accuracy by 0.61% and degrades top-1 accuracy by only 0.42% relative to the full-precision counterpart on the ImageNet dataset, with an 8x model compression rate. It outperforms the recent ABC-Net by 1.03% in top-1 accuracy and 1.78% in top-5 accuracy, with an around 1.25x higher compression rate and more than 6x computation reduction due to the weight sparsity. |
Tasks | Model Compression |
Published | 2018-07-20 |
URL | http://arxiv.org/abs/1807.07948v1 |
PDF | http://arxiv.org/pdf/1807.07948v1.pdf |
PWC | https://paperswithcode.com/paper/optimize-deep-convolutional-neural-network |
Repo | |
Framework | |
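A minimal sketch of weight ternarization with a layer-wise scaling factor, assuming a mean-absolute-value threshold heuristic; the paper's exact statistical scaling and its residual expansion method are not reproduced here.

```python
import numpy as np

def ternarize(w, delta_ratio=0.7):
    """Map a float weight tensor to {-alpha, 0, +alpha} with a layer-wise scale.

    The threshold heuristic (a fraction of the mean absolute weight) is a
    common choice; the paper's exact statistical scaling may differ.
    """
    delta = delta_ratio * np.mean(np.abs(w))
    t = np.zeros_like(w)
    t[w > delta] = 1.0
    t[w < -delta] = -1.0
    mask = t != 0
    alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return alpha * t, alpha

w = np.random.randn(64, 64).astype(np.float32)
w_ternary, alpha = ternarize(w)
print(np.unique(w_ternary / alpha))       # -> [-1., 0., 1.]
```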
Mapping Informal Settlements in Developing Countries with Multi-resolution, Multi-spectral Data
Title | Mapping Informal Settlements in Developing Countries with Multi-resolution, Multi-spectral Data |
Authors | Patrick Helber, Bradley Gram-Hansen, Indhu Varatharajan, Faiza Azam, Alejandro Coca-Castro, Veronika Kopackova, Piotr Bilinski |
Abstract | Detecting and mapping informal settlements encompasses several of the United Nations sustainable development goals. This is because informal settlements are home to the most socially and economically vulnerable people on the planet. Thus, understanding where these settlements are is of paramount importance to both government and non-government organizations (NGOs), such as the United Nations Children’s Fund (UNICEF), who can use this information to deliver effective social and economic aid. We propose two effective methods for detecting and mapping the locations of informal settlements. One uses only low-resolution (LR), freely available, Sentinel-2 multispectral satellite imagery with noisy annotations, whilst the other is a deep learning approach that uses only costly very-high-resolution (VHR) satellite imagery. To our knowledge, we are the first to map informal settlements successfully with low-resolution satellite imagery. We extensively evaluate and compare the proposed methods. Please find additional material at https://frontierdevelopmentlab.github.io/informal-settlements/. |
Tasks | |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1812.00812v1 |
PDF | http://arxiv.org/pdf/1812.00812v1.pdf |
PWC | https://paperswithcode.com/paper/mapping-informal-settlements-in-developing-1 |
Repo | |
Framework | |
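As a rough illustration of per-pixel classification on multispectral bands (a generic stand-in, not the authors' pipeline or data), the sketch below trains a random forest on synthetic "reflectances".

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Illustrative per-pixel classification on multispectral bands. This is a
# generic stand-in, not the authors' pipeline: the band values and labels
# below are synthetic placeholders (Sentinel-2 provides 13 spectral bands).
rng = np.random.default_rng(0)
n_pixels, n_bands = 5000, 13
X = rng.uniform(0.0, 1.0, size=(n_pixels, n_bands))    # fake reflectances
y = (X[:, 3] - X[:, 7] > 0).astype(int)                # fake "settlement" labels

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
prob_map = clf.predict_proba(X)[:, 1]                  # per-pixel probability
print(prob_map[:5])
```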
Recall Traces: Backtracking Models for Efficient Reinforcement Learning
Title | Recall Traces: Backtracking Models for Efficient Reinforcement Learning |
Authors | Anirudh Goyal, Philemon Brakel, William Fedus, Soumye Singhal, Timothy Lillicrap, Sergey Levine, Hugo Larochelle, Yoshua Bengio |
Abstract | In many environments only a tiny subset of all states yield high reward. In these cases, few of the interactions with the environment provide a relevant learning signal. Hence, we may want to preferentially train on those high-reward states and the probable trajectories leading to them. To this end, we advocate for the use of a backtracking model that predicts the preceding states that terminate at a given high-reward state. We can train a model which, starting from a high-value state (or one that is estimated to have high value), predicts and samples the (state, action) tuples that may have led to that high-value state. These traces of (state, action) pairs, which we refer to as Recall Traces, sampled from this backtracking model starting from a high-value state, are informative as they terminate in good states, and hence we can use these traces to improve a policy. We provide a variational interpretation for this idea and a practical algorithm in which the backtracking model samples from an approximate posterior distribution over trajectories which lead to large rewards. Our method improves the sample efficiency of both on- and off-policy RL algorithms across several environments and tasks. |
Tasks | |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00379v2 |
PDF | http://arxiv.org/pdf/1804.00379v2.pdf |
PWC | https://paperswithcode.com/paper/recall-traces-backtracking-models-for |
Repo | |
Framework | |
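A minimal sketch of generating a recall trace by repeatedly sampling predecessors from a backtracking model; the backward model here is a random placeholder rather than a learned network, and the policy-improvement step is omitted.

```python
import numpy as np

# Sketch of generating a "recall trace": starting from a high-value state,
# repeatedly sample a predecessor (state, action) pair from a backtracking
# model. The backward model here is a random placeholder; in the paper it is
# learned, and the resulting traces are used to improve the policy.
rng = np.random.default_rng(0)

def backward_model(state):
    """Placeholder for p(s_prev, a_prev | s): returns one sampled pair."""
    prev_action = int(rng.integers(0, 4))
    prev_state = state + rng.normal(scale=0.1, size=state.shape)
    return prev_state, prev_action

def recall_trace(high_value_state, length=5):
    trace, s = [], np.asarray(high_value_state, dtype=float)
    for _ in range(length):
        s_prev, a_prev = backward_model(s)
        trace.append((s_prev, a_prev))    # pair that (supposedly) led toward s
        s = s_prev
    return list(reversed(trace))          # ordered forward in time

for state, action in recall_trace(np.array([1.0, 2.0])):
    print(action, state)
```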
Replicating Active Appearance Model by Generator Network
Title | Replicating Active Appearance Model by Generator Network |
Authors | Tian Han, Jiawen Wu, Ying Nian Wu |
Abstract | A recent Cell paper [Chang and Tsao, 2017] reports an interesting discovery. For the face stimuli generated by a pre-trained active appearance model (AAM), the responses of neurons in the areas of the primate brain that are responsible for face recognition exhibit a strong linear relationship with the shape variables and appearance variables of the AAM that generates the face stimuli. In this paper, we show that this behavior can be replicated by a deep generative model called the generator network, which assumes that the observed signals are generated by latent random variables via a top-down convolutional neural network. Specifically, we learn the generator network from the face images generated by a pre-trained AAM model using a variational auto-encoder, and we show that the inferred latent variables of the learned generator network have a strong linear relationship with the shape and appearance variables of the AAM model that generates the face images. Unlike the AAM model, which has an explicit shape model in which the shape variables generate the control points or landmarks, the generator network has no such shape model or shape variables. Yet the generator network can learn the shape knowledge in the sense that some of the latent variables of the learned generator network capture the shape variations in the face images generated by the AAM. |
Tasks | Face Recognition |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.08704v1 |
PDF | http://arxiv.org/pdf/1805.08704v1.pdf |
PWC | https://paperswithcode.com/paper/replicating-active-appearance-model-by |
Repo | |
Framework | |
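A minimal sketch of the linearity check described in the abstract: regress (placeholder) AAM shape/appearance variables on (placeholder) inferred latent codes and report the R² of the fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Sketch of the linearity check: regress AAM shape/appearance variables on the
# latent codes inferred by the learned generator (VAE). Both matrices below
# are synthetic placeholders standing in for real inferred latents and AAM
# variables; an R^2 close to 1 indicates a strong linear relationship.
rng = np.random.default_rng(0)
n_images, n_latent, n_aam = 1000, 100, 50
Z = rng.standard_normal((n_images, n_latent))              # inferred latents
A = Z[:, :n_aam] @ rng.standard_normal((n_aam, n_aam))     # fake AAM variables

reg = LinearRegression().fit(Z, A)
print(f"R^2 of linear fit: {reg.score(Z, A):.3f}")
```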
An Empirical Exploration of Curriculum Learning for Neural Machine Translation
Title | An Empirical Exploration of Curriculum Learning for Neural Machine Translation |
Authors | Xuan Zhang, Gaurav Kumar, Huda Khayrallah, Kenton Murray, Jeremy Gwinnup, Marianna J Martindale, Paul McNamee, Kevin Duh, Marine Carpuat |
Abstract | Machine translation systems based on deep neural networks are expensive to train. Curriculum learning aims to address this issue by choosing the order in which samples are presented during training to help train better models faster. We adopt a probabilistic view of curriculum learning, which lets us flexibly evaluate the impact of curricula design, and perform an extensive exploration on a German-English translation task. Results show that it is possible to improve convergence time at no loss in translation quality. However, results are highly sensitive to the choice of sample difficulty criteria, curriculum schedule and other hyperparameters. |
Tasks | Machine Translation |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00739v1 |
PDF | http://arxiv.org/pdf/1811.00739v1.pdf |
PWC | https://paperswithcode.com/paper/an-empirical-exploration-of-curriculum |
Repo | |
Framework | |
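A minimal sketch of a probabilistic curriculum: sampling probabilities favor easy examples early in training and flatten as training progresses. The difficulty score and the schedule below are illustrative choices, not the paper's criteria.

```python
import numpy as np

# Sketch of a probabilistic curriculum: easy sentences are sampled with higher
# probability early in training, and the sampling distribution flattens as
# training progresses. The difficulty scores and schedule are illustrative
# choices, not the paper's criteria.
rng = np.random.default_rng(0)
difficulty = rng.uniform(0, 1, size=10_000)   # e.g. normalized sentence length

def sampling_probs(difficulty, progress):
    """progress in [0, 1]: 0 = start of training, 1 = uniform sampling."""
    logits = -(1.0 - progress) * 5.0 * difficulty    # penalize hard examples early
    p = np.exp(logits - logits.max())
    return p / p.sum()

for progress in (0.0, 0.5, 1.0):
    p = sampling_probs(difficulty, progress)
    batch = rng.choice(len(difficulty), size=64, p=p)
    print(progress, round(difficulty[batch].mean(), 3))   # mean difficulty rises
```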
A Similarity Measure for Weaving Patterns in Textiles
Title | A Similarity Measure for Weaving Patterns in Textiles |
Authors | Sven Helmer, Vuong M. Ngo |
Abstract | We propose a novel approach for measuring the similarity between weaving patterns that can provide similarity-based search functionality for textile archives. We represent textile structures using hypergraphs and extract multisets of k-neighborhoods from these graphs. The resulting multisets are then compared using Jaccard coefficients, Hamming distances, and cosine measures. We evaluate the different variants of our similarity measure experimentally, showing that it can be implemented efficiently and illustrating its quality by using it to cluster and query a data set containing more than a thousand textile samples. |
Tasks | |
Published | 2018-10-10 |
URL | http://arxiv.org/abs/1810.04604v1 |
PDF | http://arxiv.org/pdf/1810.04604v1.pdf |
PWC | https://paperswithcode.com/paper/a-similarity-measure-for-weaving-patterns-in |
Repo | |
Framework | |
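A minimal sketch of the similarity idea, with the k-neighborhoods simplified to (2k+1)×(2k+1) windows of a binary weave matrix (the paper extracts neighborhoods from a hypergraph representation instead), compared with a Jaccard coefficient over multisets.

```python
from collections import Counter

# Sketch: extract multisets of local neighborhoods from two weave matrices and
# compare them with a multiset Jaccard coefficient. The window-based
# neighborhoods are a simplification of the paper's hypergraph k-neighborhoods.
def neighborhoods(weave, k=1):
    n, m = len(weave), len(weave[0])
    feats = Counter()
    for i in range(k, n - k):
        for j in range(k, m - k):
            window = tuple(tuple(weave[a][j - k:j + k + 1])
                           for a in range(i - k, i + k + 1))
            feats[window] += 1
    return feats

def multiset_jaccard(a, b):
    inter = sum((a & b).values())       # multiset intersection size
    union = sum((a | b).values())       # multiset union size
    return inter / union if union else 1.0

plain = [[(i + j) % 2 for j in range(8)] for i in range(8)]
twill = [[1 if (j - i) % 4 < 2 else 0 for j in range(8)] for i in range(8)]
print(multiset_jaccard(neighborhoods(plain), neighborhoods(plain)))  # 1.0
print(multiset_jaccard(neighborhoods(plain), neighborhoods(twill)))  # dissimilar
```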
Making BREAD: Biomimetic strategies for Artificial Intelligence Now and in the Future
Title | Making BREAD: Biomimetic strategies for Artificial Intelligence Now and in the Future |
Authors | Jeffrey L. Krichmar, William Severa, Salar M. Khan, James L. Olds |
Abstract | The Artificial Intelligence (AI) revolution foretold during the 1960s is well underway in the second decade of the 21st century. Its period of phenomenal growth likely lies ahead. Still, we believe there are crucial lessons that biology can offer that will enable a prosperous future for AI. For machines in general, and for AIs especially, operating over extended periods or in extreme environments will require energy usage orders of magnitude more efficient than exists today. In many operational environments, energy sources will be constrained. Any plans for AI devices operating in a challenging environment must begin with the questions of how they are powered, where fuel is located, how energy is stored and made available to the machine, and how long the machine can operate on specific energy units. Hence, the materials and technologies that provide the needed energy represent a critical challenge for future use-scenarios of AI and should be integrated into their design. Here we make four recommendations for stakeholders, and especially decision makers, to facilitate a successful trajectory for this technology. First, that scientific societies and governments coordinate Biomimetic Research for Energy-efficient AI Designs (BREAD): a multinational initiative and a funding strategy for investments in the future integrated design of energetics into AI. Second, that biomimetic energetic solutions be central to design considerations for future AI. Third, that a pre-competitive space be organized between stakeholder partners; and fourth, that a trainee pipeline be established to ensure the human capital required for success in this area. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01184v1 |
PDF | http://arxiv.org/pdf/1812.01184v1.pdf |
PWC | https://paperswithcode.com/paper/making-bread-biomimetic-strategies-for |
Repo | |
Framework | |
iDriveSense: Dynamic Route Planning Involving Roads Quality Information
Title | iDriveSense: Dynamic Route Planning Involving Roads Quality Information |
Authors | Amr S. El-Wakeel, Aboelmagd Noureldin, Hossam S. Hassanein, Nizar Zorba |
Abstract | Owing to the rapid growth in information and communication technologies, smart cities have raised expectations of efficient functioning and management. One key aspect of residents’ daily comfort is assured through reliable traffic management and route planning. Most present trip planning applications and service providers base their recommendations on shortest paths and/or fastest routes. However, such suggestions may ignore drivers’ preferences for safe and less disturbing trips. Road anomalies such as cracks, potholes, and manholes induce risky driving scenarios and can lead to vehicle damage and costly repairs. Accordingly, in this paper we propose a crowdsensing-based dynamic route planning system. Leveraging both the vehicle motion sensors and the inertial sensors within smart devices, road surface types and anomalies are detected and categorized. In addition, the monitored events are geo-referenced using GPS receivers on both vehicles and smart devices. Road segment assessments are then conducted using fuzzy system models based on aspects such as the number of anomalies and their severity levels in each road segment. Afterward, another fuzzy model is adopted to recommend the best trip routes based on the road segment quality along each potential route. Extensive road experiments are conducted to build and demonstrate the potential of the proposed system. |
Tasks | |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02855v1 |
PDF | http://arxiv.org/pdf/1809.02855v1.pdf |
PWC | https://paperswithcode.com/paper/idrivesense-dynamic-route-planning-involving |
Repo | |
Framework | |
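A toy fuzzy assessment of a road segment from its anomaly count and average severity; the membership functions, rules, and defuzzification below are illustrative stand-ins for the paper's fuzzy models, not their actual parameters.

```python
# Toy fuzzy assessment of a road segment from its anomaly count and average
# severity. Membership functions, rules, and defuzzification are illustrative
# stand-ins for the paper's fuzzy models, not their actual parameters.

def tri(x, a, b, c):
    """Triangular membership function with support (a, c) and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def segment_quality(anomaly_count, avg_severity):
    # Fuzzify the inputs.
    few, many = tri(anomaly_count, -1, 0, 10), tri(anomaly_count, 5, 15, 100)
    mild, severe = tri(avg_severity, -1, 0, 5), tri(avg_severity, 3, 10, 11)
    # Rules (min = AND), each firing toward a crisp quality prototype.
    rules = [
        (min(few, mild), 1.0),     # few mild anomalies    -> good
        (min(many, mild), 0.5),    # many mild anomalies   -> fair
        (min(few, severe), 0.4),   # few severe anomalies  -> fair/poor
        (min(many, severe), 0.0),  # many severe anomalies -> poor
    ]
    num = sum(w * q for w, q in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.5   # weighted-average defuzzification

print(segment_quality(anomaly_count=2, avg_severity=1.0))   # close to 1 (good)
print(segment_quality(anomaly_count=20, avg_severity=8.0))  # close to 0 (poor)
```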
Solving the Steiner Tree Problem in graphs with Variable Neighborhood Descent
Title | Solving the Steiner Tree Problem in graphs with Variable Neighborhood Descent |
Authors | Matthieu De Laere, San Tu Pham, Patrick De Causmaecker |
Abstract | The Steiner Tree Problem (STP) in graphs is an important problem with various applications in many areas such as the design of integrated circuits, evolutionary theory, and networking. In this paper, we propose an algorithm to solve the STP. The algorithm includes a reducer and a solver using Variable Neighborhood Descent (VND), which interact with each other during the search. New constructive heuristics and a vertex score system for intensification purposes are proposed. The algorithm is tested on a set of benchmarks, with encouraging results. |
Tasks | |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.06685v1 |
PDF | http://arxiv.org/pdf/1806.06685v1.pdf |
PWC | https://paperswithcode.com/paper/solving-the-steiner-tree-problem-in-graphs |
Repo | |
Framework | |
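A generic Variable Neighborhood Descent skeleton of the kind the solver uses; the STP-specific neighborhood moves and cost function (e.g. vertex insertion/removal followed by recomputing a spanning tree) are left as callables, and the usage example at the end is a toy integer minimization.

```python
# Generic Variable Neighborhood Descent skeleton. Problem-specific moves and
# the cost function are passed in as callables; the example at the end
# minimizes x^2 over the integers with three nested neighborhoods.
def vnd(initial_solution, cost, neighborhoods):
    """neighborhoods: list of functions, each mapping a solution to an
    iterable of candidate solutions, ordered from smallest to largest."""
    best, best_cost = initial_solution, cost(initial_solution)
    k = 0
    while k < len(neighborhoods):
        improved = False
        for candidate in neighborhoods[k](best):
            c = cost(candidate)
            if c < best_cost:
                best, best_cost, improved = candidate, c, True
                break                   # first-improvement strategy
        k = 0 if improved else k + 1    # restart from the first neighborhood
    return best, best_cost

neigh = [lambda x, d=d: (x - d, x + d) for d in (1, 2, 5)]
print(vnd(17, cost=lambda x: x * x, neighborhoods=neigh))   # -> (0, 0)
```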
Analysis of Fleet Modularity in an Artificial Intelligence-Based Attacker-Defender Game
Title | Analysis of Fleet Modularity in an Artificial Intelligence-Based Attacker-Defender Game |
Authors | Xingyu Li, Bogdan I. Epureanu |
Abstract | Because combat environments change over time and technology upgrades are widespread for ground vehicles, a large number of vehicles and much equipment quickly become obsolete. A possible solution for the U.S. Army is to develop fleets of modular military vehicles, which are built from interchangeable substantial components, also known as modules. One typical characteristic of modules is their ease of assembly and disassembly through simple means such as plug-in/pull-out actions, which allows for real-time fleet reconfiguration to meet dynamic demands. Moreover, military demands are time-varying and highly stochastic because commanders keep reacting to the enemy’s actions. To capture these characteristics, we formulate an intelligent agent-based model to imitate the decision-making process during fleet operation, which combines real-time optimization with artificial intelligence. The agents are capable of inferring the enemy’s future moves based on historical data and optimizing dispatch/operation decisions accordingly. We implement our model to simulate an attacker-defender game between two adversarial and intelligent players, representing the commanders of a modularized fleet and a conventional fleet, respectively. Given the same level of combat resources and intelligence, we highlight the tactical advantages of fleet modularity in terms of win rate, unpredictability, and damage suffered. |
Tasks | Decision Making |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03742v2 |
PDF | http://arxiv.org/pdf/1811.03742v2.pdf |
PWC | https://paperswithcode.com/paper/analysis-of-fleet-modularity-in-an-artificial |
Repo | |
Framework | |
Generalization Properties of hyper-RKHS and its Application to Out-of-Sample Extensions
Title | Generalization Properties of hyper-RKHS and its Application to Out-of-Sample Extensions |
Authors | Fanghui Liu, Lei Shi, Xiaolin Huang, Jie Yang, Johan A. K. Suykens |
Abstract | Hyper-kernels endowed by a hyper-Reproducing Kernel Hilbert Space (hyper-RKHS) formulate the kernel learning task as learning on the space of kernels itself, which provides significant model flexibility for kernel learning and has shown outstanding performance in real-world applications. However, the convergence behavior of these learning algorithms in hyper-RKHS has not been investigated in learning theory. In this paper, we conduct an approximation analysis of kernel ridge regression (KRR) and support vector regression (SVR) in this space. To the best of our knowledge, this is the first work to study the approximation performance of regression in hyper-RKHS. For applications, we propose a general kernel learning framework based on the two introduced regression models to deal with the out-of-sample extension problem, i.e., to learn an underlying general kernel from a pre-given kernel/similarity matrix in hyper-RKHS. Experimental results on several benchmark datasets suggest that our methods are able to learn a general kernel function from an arbitrary given kernel matrix. |
Tasks | |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.09910v1 |
PDF | http://arxiv.org/pdf/1809.09910v1.pdf |
PWC | https://paperswithcode.com/paper/generalization-properties-of-hyper-rkhs-and |
Repo | |
Framework | |
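A simplified sketch of the out-of-sample extension idea: given a precomputed kernel matrix on training points, learn a function of pairs that reproduces it and can be evaluated on unseen pairs. The RBF kernel on concatenated pairs below is a stand-in for the paper's hyper-kernel construction, and all data are synthetic.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

# Simplified sketch of the out-of-sample extension idea: given a precomputed
# kernel/similarity matrix K on training points X, learn a function of pairs
# (x, x') that reproduces K and can be evaluated on unseen pairs. The RBF
# kernel on concatenated pairs is a stand-in for the paper's hyper-kernel.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 3))
K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))   # given matrix

pairs = np.concatenate(
    [np.repeat(X, len(X), axis=0), np.tile(X, (len(X), 1))], axis=1
)                                                 # all (x_i, x_j) concatenations
model = KernelRidge(kernel="rbf", alpha=1e-3, gamma=0.5).fit(pairs, K.ravel())

x_new = rng.standard_normal(3)
new_pairs = np.concatenate([np.tile(x_new, (len(X), 1)), X], axis=1)
print(model.predict(new_pairs)[:5])               # learned K(x_new, x_i) values
```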
Optimal Sketching Bounds for Exp-concave Stochastic Minimization
Title | Optimal Sketching Bounds for Exp-concave Stochastic Minimization |
Authors | Naman Agarwal, Alon Gonen |
Abstract | We derive optimal statistical and computational complexity bounds for exp-concave stochastic minimization in terms of the effective dimension. For common eigendecay patterns of the population covariance matrix, this quantity is significantly smaller than the ambient dimension. Our results reveal interesting connections to sketching results in numerical linear algebra. In particular, our statistical analysis highlights a novel and natural relationship between the algorithmic stability of empirical risk minimization and ridge leverage scores, which play a significant role in sketching-based methods. Our main computational result is a fast implementation of a sketch-to-precondition approach in the context of exp-concave empirical risk minimization. |
Tasks | |
Published | 2018-05-21 |
URL | https://arxiv.org/abs/1805.08268v7 |
PDF | https://arxiv.org/pdf/1805.08268v7.pdf |
PWC | https://paperswithcode.com/paper/effective-dimension-of-exp-concave |
Repo | |
Framework | |
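A minimal sketch of the sketch-to-precondition approach on a ridge-regularized problem: sketch the data matrix, QR-factorize the sketched regularized matrix, and use the R factor as a right preconditioner, which drives the condition number of the Hessian toward 1. The sizes and the Gaussian sketch are illustrative choices, not the paper's construction.

```python
import numpy as np

# Sketch-to-precondition for a ridge-regularized problem (simplified): sketch
# the data matrix, QR-factorize the sketched regularized matrix, and use the R
# factor as a right preconditioner. The preconditioned Hessian has a much
# smaller condition number, so an iterative solver converges in few steps.
# Sizes and the Gaussian sketch are illustrative choices.
rng = np.random.default_rng(0)
n, d, m, lam = 5000, 50, 400, 1e-2                # m = sketch size (> d)
A = rng.standard_normal((n, d)) @ np.diag(np.logspace(0, -3, d))   # ill-conditioned

S = rng.standard_normal((m, n)) / np.sqrt(m)      # Gaussian sketch
B = np.vstack([S @ A, np.sqrt(lam) * np.eye(d)])  # sketched, regularized matrix
_, R = np.linalg.qr(B)

H = A.T @ A + lam * np.eye(d)                     # true regularized Hessian
R_inv = np.linalg.inv(R)
H_pre = R_inv.T @ H @ R_inv                       # preconditioned Hessian

print(np.linalg.cond(H), np.linalg.cond(H_pre))   # the second is near 1
```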
A Survey on Sentiment and Emotion Analysis for Computational Literary Studies
Title | A Survey on Sentiment and Emotion Analysis for Computational Literary Studies |
Authors | Evgeny Kim, Roman Klinger |
Abstract | Emotions are a crucial part of compelling narratives: literature tells us about people with goals, desires, passions, and intentions. In the past, the affective dimension of literature was mainly studied in the context of literary hermeneutics. However, with the emergence of the research field known as Digital Humanities (DH), some studies of emotions in a literary context have taken a computational turn. Given that DH is still being formed as a field, this direction of research can be considered relatively new. In this survey, we offer an overview of the existing body of research on sentiment and emotion analysis as applied to literature. The research under review deals with a variety of topics, including tracking dramatic changes in plot development, network analysis of literary texts, and understanding the emotionality of texts. |
Tasks | Emotion Recognition, Sentiment Analysis |
Published | 2018-08-09 |
URL | https://arxiv.org/abs/1808.03137v2 |
PDF | https://arxiv.org/pdf/1808.03137v2.pdf |
PWC | https://paperswithcode.com/paper/a-survey-on-sentiment-and-emotion-analysis |
Repo | |
Framework | |
Deep Reinforcement Learning via L-BFGS Optimization
Title | Deep Reinforcement Learning via L-BFGS Optimization |
Authors | Jacob Rafati, Roummel F. Marcia |
Abstract | Reinforcement Learning (RL) algorithms allow artificial agents to improve their action selections so as to increase rewarding experiences in their environments. Deep Reinforcement Learning algorithms require solving a nonconvex and nonlinear unconstrained optimization problem. Methods for solving the optimization problems in deep RL are restricted to the class of first-order algorithms, such as stochastic gradient descent (SGD). The major drawback of SGD methods is that they may fail to escape saddle points and their performance can be seriously hindered by ill-conditioning. Furthermore, SGD methods require exhaustive trial and error to fine-tune many learning parameters. Using second-derivative information can result in improved convergence properties, but computing the Hessian matrix for large-scale problems is not practical. Quasi-Newton methods require only first-order gradient information, like SGD, but they can construct a low-rank approximation of the Hessian matrix and achieve superlinear convergence. The limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) approach is one of the most popular quasi-Newton methods that construct positive-definite Hessian approximations. In this paper, we introduce an efficient optimization method based on the limited-memory BFGS quasi-Newton method with a line search strategy, as an alternative to SGD methods. Our method bridges the gap between first-order and second-order methods by continuing to use gradient information to compute low-rank Hessian approximations. We provide a formal convergence analysis as well as empirical results on a subset of the classic ATARI 2600 games. Our results show robust convergence with preferred generalization characteristics, as well as fast training time and no need for an experience replay mechanism. |
Tasks | Atari Games, Q-Learning |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02693v2 |
PDF | http://arxiv.org/pdf/1811.02693v2.pdf |
PWC | https://paperswithcode.com/paper/quasi-newton-optimization-in-deep-q-learning |
Repo | |
Framework | |
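A minimal sketch of the L-BFGS two-loop recursion with an Armijo backtracking line search, demonstrated on a toy quadratic; the paper's RL objective, batch sampling, and curvature-pair handling details are omitted.

```python
import numpy as np

# Minimal L-BFGS sketch: the two-loop recursion for the search direction plus
# an Armijo backtracking line search, applied to a toy ill-conditioned
# quadratic. This illustrates the quasi-Newton machinery, not the paper's
# full RL training procedure.
def lbfgs_direction(grad, s_list, y_list):
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / (y @ s)
        alpha = rho * (s @ q)
        q -= alpha * y
        alphas.append((rho, alpha, s, y))
    if s_list:                                     # initial Hessian scaling
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for rho, alpha, s, y in reversed(alphas):
        beta = rho * (y @ q)
        q += (alpha - beta) * s
    return -q                                      # descent direction

def armijo(f, x, g, p, c=1e-4):
    t, fx, slope = 1.0, f(x), g @ p
    while f(x + t * p) > fx + c * t * slope and t > 1e-12:
        t *= 0.5
    return t

D = np.diag([1.0, 10.0, 100.0])                    # toy ill-conditioned quadratic
f = lambda x: 0.5 * x @ D @ x
x, s_list, y_list = np.array([1.0, 1.0, 1.0]), [], []
for _ in range(15):
    g = D @ x
    p = lbfgs_direction(g, s_list[-10:], y_list[-10:])
    t = armijo(f, x, g, p)
    x_new = x + t * p
    s, y = x_new - x, D @ x_new - g
    if y @ s > 1e-12:                              # keep only valid curvature pairs
        s_list.append(s); y_list.append(y)
    x = x_new
print(np.linalg.norm(D @ x))                       # gradient norm has shrunk
```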