Paper Group ANR 428
Multi-Path Policy Optimization. Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets. MAME : Model-Agnostic Meta-Exploration. Gumbel-Matrix Routing for Flexible Multi-task Learning. Thompson Sampling for Combinatorial Network Optimization in Unknown Environments. A Lexical, Syntactic, and Semantic Perspective …
Multi-Path Policy Optimization
Title | Multi-Path Policy Optimization |
Authors | Ling Pan, Qingpeng Cai, Longbo Huang |
Abstract | Recent years have witnessed a tremendous improvement of deep reinforcement learning. However, a challenging problem is that an agent may suffer from inefficient exploration, particularly for on-policy methods. Previous exploration methods either rely on complex structure to estimate the novelty of states, or incur sensitive hyper-parameters causing instability. We propose an efficient exploration method, Multi-Path Policy Optimization (MPPO), which does not incur high computation cost and ensures stability. MPPO maintains an efficient mechanism that effectively utilizes a population of diverse policies to enable better exploration, especially in sparse environments. We also give a theoretical guarantee of the stable performance. We build our scheme upon two widely-adopted on-policy methods, the Trust-Region Policy Optimization algorithm and Proximal Policy Optimization algorithm. We conduct extensive experiments on several MuJoCo tasks and their sparsified variants to fairly evaluate the proposed method. Results show that MPPO significantly outperforms state-of-the-art exploration methods in terms of both sample efficiency and final performance. |
Tasks | Efficient Exploration |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04207v3 |
https://arxiv.org/pdf/1911.04207v3.pdf | |
PWC | https://paperswithcode.com/paper/multi-path-policy-optimization |
Repo | |
Framework | |
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets
Title | Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets |
Authors | Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang |
Abstract | Adaptive gradient algorithms perform gradient-based updates using the history of gradients and are ubiquitous in training deep neural networks. While adaptive gradient methods theory is well understood for minimization problems, the underlying factors driving their empirical success in min-max problems such as GANs remain unclear. In this paper, we aim at bridging this gap from both theoretical and empirical perspectives. First, we analyze a variant of Optimistic Stochastic Gradient (OSG) proposed in~\citep{daskalakis2017training} for solving a class of non-convex non-concave min-max problem and establish $O(\epsilon^{-4})$ complexity for finding $\epsilon$-first-order stationary point, in which the algorithm only requires invoking one stochastic first-order oracle while enjoying state-of-the-art iteration complexity achieved by stochastic extragradient method by~\citep{iusem2017extragradient}. Then we propose an adaptive variant of OSG named Optimistic Adagrad (OAdagrad) and reveal an \emph{improved} adaptive complexity $\widetilde{O}\left(\epsilon^{-\frac{2}{1-\alpha}}\right)$~\footnote{Here $\widetilde{O}(\cdot)$ compresses a logarithmic factor of $\epsilon$.}, where $\alpha$ characterizes the growth rate of the cumulative stochastic gradient and $0\leq \alpha\leq 1/2$. To the best of our knowledge, this is the first work for establishing adaptive complexity in non-convex non-concave min-max optimization. Empirically, our experiments show that indeed adaptive gradient algorithms outperform their non-adaptive counterparts in GAN training. Moreover, this observation can be explained by the slow growth rate of the cumulative stochastic gradient, as observed empirically. |
Tasks | |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11940v1 |
https://arxiv.org/pdf/1912.11940v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-better-understanding-of-adaptive-1 |
Repo | |
Framework | |
MAME : Model-Agnostic Meta-Exploration
Title | MAME : Model-Agnostic Meta-Exploration |
Authors | Swaminathan Gurumurthy, Sumit Kumar, Katia Sycara |
Abstract | Meta-Reinforcement learning approaches aim to develop learning procedures that can adapt quickly to a distribution of tasks with the help of a few examples. Developing efficient exploration strategies capable of finding the most useful samples becomes critical in such settings. Existing approaches towards finding efficient exploration strategies add auxiliary objectives to promote exploration by the pre-update policy, however, this makes the adaptation using a few gradient steps difficult as the pre-update (exploration) and post-update (exploitation) policies are often quite different. Instead, we propose to explicitly model a separate exploration policy for the task distribution. Having two different policies gives more flexibility in training the exploration policy and also makes adaptation to any specific task easier. We show that using self-supervised or supervised learning objectives for adaptation allows for more efficient inner-loop updates and also demonstrate the superior performance of our model compared to prior works in this domain. |
Tasks | Efficient Exploration |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04024v1 |
https://arxiv.org/pdf/1911.04024v1.pdf | |
PWC | https://paperswithcode.com/paper/mame-model-agnostic-meta-exploration |
Repo | |
Framework | |
Gumbel-Matrix Routing for Flexible Multi-task Learning
Title | Gumbel-Matrix Routing for Flexible Multi-task Learning |
Authors | Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent |
Abstract | This paper proposes a novel per-task routing method for multi-task applications. Multi-task neural networks can learn to transfer knowledge across different tasks by using parameter sharing. However, sharing parameters between unrelated tasks can hurt performance. To address this issue, we advocate the use of routing networks to learn flexible parameter sharing, where each group of parameters is shared with a different subset of tasks in order to better leverage tasks relatedness. At the same time, it is known that routing networks are notoriously hard to train. We propose the Gumbel-Matrix routing: a novel multi-task routing method, designed to learn fine-grained patterns of parameter sharing. The routing is learned jointly with the model parameters by standard back-propagation thanks to the Gumbel-Softmax trick. When applied to the Omniglot benchmark, the proposed method reduces the state-of-the-art error rate by 17%. |
Tasks | Multi-Task Learning, Omniglot |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04915v1 |
https://arxiv.org/pdf/1910.04915v1.pdf | |
PWC | https://paperswithcode.com/paper/gumbel-matrix-routing-for-flexible-multi-task-1 |
Repo | |
Framework | |
Thompson Sampling for Combinatorial Network Optimization in Unknown Environments
Title | Thompson Sampling for Combinatorial Network Optimization in Unknown Environments |
Authors | Alihan Hüyük, Cem Tekin |
Abstract | Influence maximization, item recommendation, adaptive routing and dynamic spectrum allocation all require choosing the right action from a large set of alternatives. Thanks to the advances in combinatorial optimization, these and many similar problems can be efficiently solved given that the stochasticity of the environment is perfectly known. In this paper, we take this one step further and focus on combinatorial optimization in unknown environments. All of these settings fit into the general combinatorial learning framework called combinatorial multi-armed bandit with probabilistically triggered arms. We consider a very powerful Bayesian algorithm, Combinatorial Thompson Sampling (CTS), and analyze its regret under the semi-bandit feedback model. Assuming that the learner does not know the expected base arm outcomes beforehand but has access to an exact oracle, we show that when the expected reward is Lipschitz continuous in the expected base arm outcomes CTS achieves $O(\sum_{i =1}^m \log T / (p_i \Delta_i))$ regret, where $m$ denotes the number of base arms, $p_i$ denotes the minimum non-zero triggering probability of base arm $i$, $\Delta_i$ denotes the minimum suboptimality gap of base arm $i$ and $T$ denotes the time horizon. In addition, we prove that when triggering probabilities are at least $p^*>0$ for all base arms, CTS achieves $O(1/p^*\log(1/p^*))$ regret independent of the time horizon. We also numerically compare CTS with algorithms that use the principle of optimism in the face of uncertainty in several combinatorial networking problems, and show that CTS outperforms these algorithms by at least an order of magnitude in the majority of the cases. |
Tasks | Combinatorial Optimization |
Published | 2019-07-07 |
URL | https://arxiv.org/abs/1907.04201v2 |
https://arxiv.org/pdf/1907.04201v2.pdf | |
PWC | https://paperswithcode.com/paper/thompson-sampling-for-combinatorial-network |
Repo | |
Framework | |
A Lexical, Syntactic, and Semantic Perspective for Understanding Style in Text
Title | A Lexical, Syntactic, and Semantic Perspective for Understanding Style in Text |
Authors | Gaurav Verma, Balaji Vasan Srinivasan |
Abstract | With a growing interest in modeling inherent subjectivity in natural language, we present a linguistically-motivated process to understand and analyze the writing style of individuals from three perspectives: lexical, syntactic, and semantic. We discuss the stylistically expressive elements within each of these levels and use existing methods to quantify the linguistic intuitions related to some of these elements. We show that such a multi-level analysis is useful for developing a well-knit understanding of style - which is independent of the natural language task at hand, and also demonstrate its value in solving three downstream tasks: authors’ style analysis, authorship attribution, and emotion prediction. We conduct experiments on a variety of datasets, comprising texts from social networking sites, user reviews, legal documents, literary books, and newswire. The results on the aforementioned tasks and datasets illustrate that such a multi-level understanding of style, which has been largely ignored in recent works, models style-related subjectivity in text and can be leveraged to improve performance on multiple downstream tasks both qualitatively and quantitatively. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08349v1 |
https://arxiv.org/pdf/1909.08349v1.pdf | |
PWC | https://paperswithcode.com/paper/a-lexical-syntactic-and-semantic-perspective |
Repo | |
Framework | |
Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference
Title | Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference |
Authors | Jonathan N. Lee, Aldo Pacchiano, Michael I. Jordan |
Abstract | Maximum a posteriori (MAP) inference is a fundamental computational paradigm for statistical inference. In the setting of graphical models, MAP inference entails solving a combinatorial optimization problem to find the most likely configuration of the discrete-valued model. Linear programming (LP) relaxations in the Sherali-Adams hierarchy are widely used to attempt to solve this problem, and smooth message passing algorithms have been proposed to solve regularized versions of these LPs with great success. This paper leverages recent work in entropy-regularized LPs to analyze convergence rates of a class of edge-based smooth message passing algorithms to $\epsilon$-optimality in the relaxation. With an appropriately chosen regularization constant, we present a theoretical guarantee on the number of iterations sufficient to recover the true integral MAP solution when the LP is tight and the solution is unique. |
Tasks | Combinatorial Optimization |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01127v2 |
https://arxiv.org/pdf/1907.01127v2.pdf | |
PWC | https://paperswithcode.com/paper/approximate-sherali-adams-relaxations-for-map |
Repo | |
Framework | |
Neural Network Design for Energy-Autonomous AI Applications using Temporal Encoding
Title | Neural Network Design for Energy-Autonomous AI Applications using Temporal Encoding |
Authors | Sergey Mileiko, Thanasin Bunnam, Fei Xia, Rishad Shafik, Alex Yakovlev, Shidhartha Das |
Abstract | Neural Networks (NNs) are steering a new generation of artificial intelligence (AI) applications at the micro-edge. Examples include wireless sensors, wearables and cybernetic systems that collect data and process them to support real-world decisions and controls. For energy autonomy, these applications are typically powered by energy harvesters. As harvesters and other power sources which provide energy autonomy inevitably have power variations, the circuits need to robustly operate over a dynamic power envelope. In other words, the NN hardware needs to be able to function correctly under unpredictable and variable supply voltages. In this paper, we propose a novel NN design approach using the principle of pulse width modulation (PWM). PWM signals represent information with their duty cycle values which may be made independent of the voltages and frequencies of the carrier signals. We design a PWM-based perceptron which can serve as the fundamental building block for NNs, by using an entirely new method of realising arithmetic in the PWM domain. We analyse the proposed approach building from a 3x3 perceptron circuit to a complex multi-layer NN. Using handwritten character recognition as an exemplar of AI applications, we demonstrate the power elasticity, resilience and efficiency of the proposed NN design in the presence of functional and parametric variations including large voltage variations in the power supply. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.07492v1 |
https://arxiv.org/pdf/1910.07492v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-design-for-energy-autonomous |
Repo | |
Framework | |
Emergent Structures and Lifetime Structure Evolution in Artificial Neural Networks
Title | Emergent Structures and Lifetime Structure Evolution in Artificial Neural Networks |
Authors | Siavash Golkar |
Abstract | Motivated by the flexibility of biological neural networks whose connectivity structure changes significantly during their lifetime, we introduce the Unstructured Recursive Network (URN) and demonstrate that it can exhibit similar flexibility during training via gradient descent. We show empirically that many of the different neural network structures commonly used in practice today (including fully connected, locally connected and residual networks of different depths and widths) can emerge dynamically from the same URN. These different structures can be derived using gradient descent on a single general loss function where the structure of the data and the relative strengths of various regulator terms determine the structure of the emergent network. We show that this loss function and the regulators arise naturally when considering the symmetries of the network as well as the geometric properties of the input data. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11691v1 |
https://arxiv.org/pdf/1911.11691v1.pdf | |
PWC | https://paperswithcode.com/paper/emergent-structures-and-lifetime-structure |
Repo | |
Framework | |
Which Factors Impact Engagement on News Articles on Facebook?
Title | Which Factors Impact Engagement on News Articles on Facebook? |
Authors | Marc Faddoul |
Abstract | Social media is increasingly being used as a news-platform. To reach their intended audience, newspapers need for their articles to be well ranked by Facebook’s news-feed algorithm. The number of likes, shares and other reactions determine the lead scoring criteria. This paper will try to assess how the reaction volume is impacted by the following criteria: (1) Delay between event and post release; (2) Time of the day the post is published; and (3) Post format: video, photo or text. To isolate the effect of the publication time and post format on a post, we need to control for the news-event and the publishing newspaper. For that end, a news-aggregator is designed and implemented, to group together posts that relate to the same news-event. This tool gave some spin-off results, allowing the ability to map newspapers by similarity and to detect some topic omissions. |
Tasks | |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14651v1 |
https://arxiv.org/pdf/1910.14651v1.pdf | |
PWC | https://paperswithcode.com/paper/which-factors-impact-engagement-on-news |
Repo | |
Framework | |
Toward Joint Image Generation and Compression using Generative Adversarial Networks
Title | Toward Joint Image Generation and Compression using Generative Adversarial Networks |
Authors | Byeongkeun Kang, Subarna Tripathi, Truong Q. Nguyen |
Abstract | In this paper, we present a generative adversarial network framework that generates compressed images instead of synthesizing raw RGB images and compressing them separately. In the real world, most images and videos are stored and transferred in a compressed format to save storage capacity and data transfer bandwidth. However, since typical generative adversarial networks generate raw RGB images, those generated images need to be compressed by a post-processing stage to reduce the data size. Among image compression methods, JPEG has been one of the most commonly used lossy compression methods for still images. Hence, we propose a novel framework that generates JPEG compressed images using generative adversarial networks. The novel generator consists of the proposed locally connected layers, chroma subsampling layers, quantization layers, residual blocks, and convolution layers. The locally connected layer is proposed to enable block-based operations. We also discuss training strategies for the proposed architecture including the loss function and the transformation between its generator and its discriminator. The proposed method is evaluated using the publicly available CIFAR-10 dataset and LSUN bedroom dataset. The results demonstrate that the proposed method is able to generate compressed data with competitive qualities. The proposed method is a promising baseline method for joint image generation and compression using generative adversarial networks. |
Tasks | Image Compression, Image Generation, Quantization |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.07838v1 |
http://arxiv.org/pdf/1901.07838v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-joint-image-generation-and-compression |
Repo | |
Framework | |
Learning Transferable Graph Exploration
Title | Learning Transferable Graph Exploration |
Authors | Hanjun Dai, Yujia Li, Chenglong Wang, Rishabh Singh, Po-Sen Huang, Pushmeet Kohli |
Abstract | This paper considers the problem of efficient exploration of unseen environments, a key challenge in AI. We propose a learning to explore' framework where we learn a policy from a distribution of environments. At test time, presented with an unseen environment from the same distribution, the policy aims to generalize the exploration strategy to visit the maximum number of unique states in a limited number of steps. We particularly focus on environments with graph-structured state-spaces that are encountered in many important real-world applications like software testing and map building. We formulate this task as a reinforcement learning problem where the exploration’ agent is rewarded for transitioning to previously unseen environment states and employ a graph-structured memory to encode the agent’s past trajectory. Experimental results demonstrate that our approach is extremely effective for exploration of spatial maps; and when applied on the challenging problems of coverage-guided software-testing of domain-specific programs and real-world mobile applications, it outperforms methods that have been hand-engineered by human experts. |
Tasks | Efficient Exploration |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12980v1 |
https://arxiv.org/pdf/1910.12980v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-transferable-graph-exploration |
Repo | |
Framework | |
Learning Index Selection with Structured Action Spaces
Title | Learning Index Selection with Structured Action Spaces |
Authors | Jeremy Welborn, Michael Schaarschmidt, Eiko Yoneki |
Abstract | Configuration spaces for computer systems can be challenging for traditional and automatic tuning strategies. Injecting task-specific knowledge into the tuner for a task may allow for more efficient exploration of candidate configurations. We apply this idea to the task of index set selection to accelerate database workloads. Index set selection has been amenable to recent applications of vanilla deep RL, but real deployments remain out of reach. In this paper, we explore how learning index selection can be enhanced with task-specific inductive biases, specifically by encoding these inductive biases in better action structures. Index selection-specific action representations arise when the problem is reformulated in terms of permutation learning and we rely on recent work for learning RL policies on permutations. Through this approach, we build an indexing agent that is able to achieve improved indexing and validate its behavior with task-specific statistics. Early experiments reveal that our agent can find configurations that are up to 40% smaller for the same levels of latency as compared with other approaches and indicate more intuitive indexing behavior. |
Tasks | Efficient Exploration |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07440v1 |
https://arxiv.org/pdf/1909.07440v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-index-selection-with-structured |
Repo | |
Framework | |
Comparison Network for One-Shot Conditional Object Detection
Title | Comparison Network for One-Shot Conditional Object Detection |
Authors | Tengfei Zhang, Yue Zhang, Xian Sun, Hao Sun, Menglong Yan, Xue Yang, Kun Fu |
Abstract | The current advances in object detection depend on large-scale datasets to get good performance. However, there may not always be sufficient samples in many scenarios, which leads to the research on few-shot detection as well as its extreme variation one-shot detection. In this paper, the one-shot detection has been formulated as a conditional probability problem. With this insight, a novel one-shot conditional object detection (OSCD) framework, referred as Comparison Network (ComparisonNet), has been proposed. Specifically, query and target image features are extracted through a Siamese network as mapped metrics of marginal probabilities. A two-stage detector for OSCD is introduced to compare the extracted query and target features with the learnable metric to approach the optimized non-linear conditional probability. Once trained, ComparisonNet can detect objects of both seen and unseen classes without further training, which also has the advantages including class-agnostic, training-free for unseen classes, and without catastrophic forgetting. Experiments show that the proposed approach achieves state-of-the-art performance on the proposed datasets of Fashion-MNIST and PASCAL VOC. |
Tasks | Object Detection |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1904.02317v2 |
https://arxiv.org/pdf/1904.02317v2.pdf | |
PWC | https://paperswithcode.com/paper/comparison-network-for-one-shot-conditional |
Repo | |
Framework | |
Fused Detection of Retinal Biomarkers in OCT Volumes
Title | Fused Detection of Retinal Biomarkers in OCT Volumes |
Authors | Thomas Kurmann, Pablo Márquez-Neila, Siqing Yu, Marion Munk, Sebastian Wolf, Raphael Sznitman |
Abstract | Optical Coherence Tomography (OCT) is the primary imaging modality for detecting pathological biomarkers associated to retinal diseases such as Age-Related Macular Degeneration. In practice, clinical diagnosis and treatment strategies are closely linked to biomarkers visible in OCT volumes and the ability to identify these plays an important role in the development of ophthalmic pharmaceutical products. In this context, we present a method that automatically predicts the presence of biomarkers in OCT cross-sections by incorporating information from the entire volume. We do so by adding a bidirectional LSTM to fuse the outputs of a Convolutional Neural Network that predicts individual biomarkers. We thus avoid the need to use pixel-wise annotations to train our method, and instead provide fine-grained biomarker information regardless. On a dataset of 416 volumes, we show that our approach imposes coherence between biomarker predictions across volume slices and our predictions are superior to several existing approaches. |
Tasks | |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06955v1 |
https://arxiv.org/pdf/1907.06955v1.pdf | |
PWC | https://paperswithcode.com/paper/fused-detection-of-retinal-biomarkers-in-oct |
Repo | |
Framework | |