January 26, 2020

Paper Group ANR 1345


Federated User Representation Learning

Title Federated User Representation Learning
Authors Duc Bui, Kshitiz Malik, Jack Goetz, Honglei Liu, Seungwhan Moon, Anuj Kumar, Kang G. Shin
Abstract Collaborative personalization, such as through learned user representations (embeddings), can improve the prediction accuracy of neural-network-based models significantly. We propose Federated User Representation Learning (FURL), a simple, scalable, privacy-preserving and resource-efficient way to utilize existing neural personalization techniques in the Federated Learning (FL) setting. FURL divides model parameters into federated and private parameters. Private parameters, such as private user embeddings, are trained locally, but unlike federated parameters, they are not transferred to or averaged on the server. We show theoretically that this parameter split does not affect training for most model personalization approaches. Storing user embeddings locally not only preserves user privacy, but also improves memory locality of personalization compared to on-server training. We evaluate FURL on two datasets, demonstrating a significant improvement in model quality with 8% and 51% performance increases, and approximately the same level of performance as centralized training with only 0% and 4% reductions. Furthermore, we show that user embeddings learned in FL and the centralized setting have a very similar structure, indicating that FURL can learn collaboratively through the shared parameters while preserving user privacy.
Tasks Representation Learning
Published 2019-09-27
URL https://arxiv.org/abs/1909.12535v1
PDF https://arxiv.org/pdf/1909.12535v1.pdf
PWC https://paperswithcode.com/paper/federated-user-representation-learning
Repo
Framework
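
The federated/private parameter split described in the FURL abstract above can be illustrated with a toy round of federated averaging. This is only a sketch under stated assumptions: the function names `local_step` and `average_federated`, the plain-list parameters, and the made-up gradients are all hypothetical and are not taken from the paper.

```python
def local_step(federated, private, grad_fed, grad_priv, lr=0.1):
    """One local SGD step on a client; both parameter groups are updated."""
    new_fed = [w - lr * g for w, g in zip(federated, grad_fed)]
    new_priv = [w - lr * g for w, g in zip(private, grad_priv)]
    return new_fed, new_priv


def average_federated(client_federated):
    """Server-side averaging; the server only ever sees federated parameters."""
    n = len(client_federated)
    return [sum(ws) / n for ws in zip(*client_federated)]


# Two hypothetical clients with made-up gradients (assumptions for illustration).
server_fed = [1.0, 1.0]
clients = [
    {"private": [0.0], "grad_fed": [1.0, 0.0], "grad_priv": [1.0]},
    {"private": [5.0], "grad_fed": [0.0, 1.0], "grad_priv": [-1.0]},
]

uploads = []
for c in clients:
    fed, c["private"] = local_step(
        server_fed, c["private"], c["grad_fed"], c["grad_priv"]
    )
    uploads.append(fed)  # the private embedding stays on the device

server_fed = average_federated(uploads)
```

Each client keeps its own `private` list across rounds, while `server_fed` is the only state shared between clients.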

Learning Novel Policies For Tasks

Title Learning Novel Policies For Tasks
Authors Yunbo Zhang, Wenhao Yu, Greg Turk
Abstract In this work, we present a reinforcement learning algorithm that can find a variety of policies (novel policies) for a task that is given by a task reward function. Our method does this by creating a second reward function that recognizes previously seen state sequences and rewards those by novelty, which is measured using autoencoders that have been trained on state sequences from previously discovered policies. We present a two-objective update technique for policy gradient algorithms in which each update of the policy is a compromise between improving the task reward and improving the novelty reward. Using this method, we end up with a collection of policies that solves a given task as well as carrying out action sequences that are distinct from one another. We demonstrate this method on maze navigation tasks, a reaching task for a simulated robot arm, and a locomotion task for a hopper. We also demonstrate the effectiveness of our approach on deceptive tasks in which policy gradient methods often get stuck.
Tasks Policy Gradient Methods
Published 2019-05-13
URL https://arxiv.org/abs/1905.05252v2
PDF https://arxiv.org/pdf/1905.05252v2.pdf
PWC https://paperswithcode.com/paper/learning-novel-policies-for-tasks
Repo
Framework
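
The abstract above describes each policy update as a compromise between the task-reward gradient and the novelty-reward gradient. One simple way to form such a compromise is to step along the bisector of the two normalized gradients. The sketch below shows only that idea; `bisector_update` is a hypothetical name, and the paper's actual update rule may differ, particularly when the two gradients conflict.

```python
import math


def bisector_update(task_grad, novelty_grad):
    """Return the average of the two unit gradients (their angular bisector)."""

    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else list(v)

    t, n = unit(task_grad), unit(novelty_grad)
    return [(a + b) / 2 for a, b in zip(t, n)]


# Orthogonal gradients of different magnitudes contribute equally.
direction = bisector_update([3.0, 0.0], [0.0, 0.5])
```

Normalizing first means neither objective dominates just because its gradient is larger. If the gradients point in exactly opposite directions the result is the zero vector, so a practical method needs a separate rule for that case.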

Bayesian Networks based Hybrid Quantum-Classical Machine Learning Approach to Elucidate Gene Regulatory Pathways

Title Bayesian Networks based Hybrid Quantum-Classical Machine Learning Approach to Elucidate Gene Regulatory Pathways
Authors Radhakrishnan Balu, Ajinkya Borle
Abstract We report a scalable hybrid quantum-classical machine learning framework to build Bayesian networks (BN) that capture the conditional dependence and causal relationships of random variables. The generation of a BN consists of finding a directed acyclic graph (DAG) and the associated joint probability distribution of the nodes consistent with a given dataset. This is a combinatorial problem of structural learning of the underlying graph, starting from a single node and building one arc at a time, that fits a given ensemble using maximum likelihood estimators (MLE). It is cast as an optimization problem that consists of a scoring step performed on a classical computer, penalties for acyclicity and for the allowed number of parents, and a search step implemented using a quantum annealer. We have assumed uniform priors in deriving the Bayesian network; this assumption can be relaxed by formulating the problem as an estimation of Dirichlet parameters. We demonstrate the utility of the framework by applying it to the problem of elucidating the gene regulatory network for the MAPK/Raf pathway in human T-cells using proteomics data, where the concentrations of proteins, the nodes of the BN, are interpreted as probabilities.
Tasks
Published 2019-01-23
URL http://arxiv.org/abs/1901.10557v1
PDF http://arxiv.org/pdf/1901.10557v1.pdf
PWC https://paperswithcode.com/paper/bayesian-networks-based-hybrid-quantum
Repo
Framework

TiM-DNN: Ternary in-Memory accelerator for Deep Neural Networks

Title TiM-DNN: Ternary in-Memory accelerator for Deep Neural Networks
Authors Shubham Jain, Sumeet Kumar Gupta, Anand Raghunathan
Abstract The use of lower precision has emerged as a popular technique to optimize the compute and storage requirements of complex Deep Neural Networks (DNNs). In the quest for lower precision, recent studies have shown that ternary DNNs, which represent weights and activations by signed ternary values, represent a promising sweet spot, and achieve accuracy close to full-precision networks on complex tasks such as language modeling and image classification. We propose TiM-DNN, a programmable, in-memory accelerator that is specifically designed to execute ternary DNNs. TiM-DNN supports various ternary representations including unweighted (-1,0,1), symmetric weighted (-a,0,a), and asymmetric weighted (-a,0,b) ternary systems. TiM-DNN is designed using TiM tiles – specialized memory arrays that perform massively parallel signed vector-matrix multiplications on ternary values with a single access. TiM tiles are in turn composed of Ternary Processing Cells (TPCs), new bit-cells that function as both ternary storage units and signed scalar multiplication units. We evaluate an implementation of TiM-DNN in 32 nm technology using an architectural simulator calibrated with SPICE simulations and RTL synthesis. TiM-DNN achieves a peak performance of 114 TOPs/s, consumes 0.9 W power, and occupies 1.96 mm² chip area, representing a 300X and 388X improvement in TOPS/W and TOPS/mm², respectively, compared to a state-of-the-art NVIDIA Tesla V100 GPU. In comparison to popular DNN accelerators, TiM-DNN achieves 55.2X-240X and 160X-291X improvement in TOPS/W and TOPS/mm², respectively. We compare TiM-DNN with a well-optimized near-memory accelerator for ternary DNNs across a suite of state-of-the-art DNN benchmarks including both deep convolutional and recurrent neural networks, demonstrating 3.9x-4.7x improvement in system-level energy and 3.2x-4.2x speedup.
Tasks Image Classification, Language Modelling
Published 2019-09-15
URL https://arxiv.org/abs/1909.06892v2
PDF https://arxiv.org/pdf/1909.06892v2.pdf
PWC https://paperswithcode.com/paper/tim-dnn-ternary-in-memory-accelerator-for
Repo
Framework
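
To make the ternary representations listed in the TiM-DNN abstract concrete, the following software sketch ternarizes a weight vector and computes a signed dot product with asymmetric weighting (-a, 0, b). It models only the arithmetic, not the in-memory hardware. The threshold-based `ternarize` rule and all function names are assumptions made for illustration.

```python
def ternarize(weights, threshold):
    """Map real weights to codes in {-1, 0, +1} using a magnitude threshold (assumed rule)."""
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def decode(code, neg_scale, pos_scale):
    """Turn a ternary code into its weighted value: -neg_scale, 0, or +pos_scale."""
    if code > 0:
        return pos_scale
    if code < 0:
        return -neg_scale
    return 0.0


def ternary_dot(w_codes, x_codes, neg_scale=1.0, pos_scale=1.0):
    """Signed dot product of weighted ternary weights with unweighted ternary activations."""
    return sum(decode(w, neg_scale, pos_scale) * x for w, x in zip(w_codes, x_codes))


codes = ternarize([0.7, -0.05, -0.9], threshold=0.1)
unweighted = ternary_dot(codes, [1, 1, 1])            # system (-1, 0, 1)
asymmetric = ternary_dot(codes, [1, -1, -1], 2.0, 3.0)  # system (-2, 0, 3)
```

Setting `neg_scale == pos_scale` gives the symmetric weighted system (-a, 0, a), and leaving both at 1.0 gives the unweighted one.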

Fractional-order Backpropagation Neural Networks: Modified Fractional-order Steepest Descent Method for Family of Backpropagation Neural Networks

Title Fractional-order Backpropagation Neural Networks: Modified Fractional-order Steepest Descent Method for Family of Backpropagation Neural Networks
Authors Yi-Fei PU, Jian Wang
Abstract This paper offers a novel mathematical approach, the modified Fractional-order Steepest Descent Method (FSDM), for training BackPropagation Neural Networks (BPNNs); this differs from the majority of previous approaches. A promising mathematical method, fractional calculus, has the potential to assume a prominent role in the applications of neural networks and cybernetics because of its inherent strengths such as long-term memory, nonlocality, and weak singularity. Therefore, to improve the optimization performance of classic first-order BPNNs, in this paper we study whether it is possible to modify FSDM and generalize classic first-order BPNNs to modified FSDM based Fractional-order Backpropagation Neural Networks (FBPNNs). Motivated by this inspiration, this paper proposes a state-of-the-art application of fractional calculus to implement a modified FSDM based FBPNN whose reverse incremental search is in the negative directions of the approximate fractional-order partial derivatives of the square error. At first, the theoretical concept of a modified FSDM based FBPNN is described mathematically. Then, the mathematical proof of the fractional-order global optimal convergence, an assumption of the structure, and the fractional-order multi-scale global optimization of a modified FSDM based FBPNN are analysed in detail. Finally, we perform comparative experiments and compare a modified FSDM based FBPNN with a classic first-order BPNN, i.e., an example function approximation, fractional-order multi-scale global optimization, and two comparative performances with real data. The major advantage of a modified FSDM based FBPNN over a classic first-order BPNN is the more efficient search capability of its fractional-order multi-scale global optimization in determining the global optimal solution.
Tasks
Published 2019-06-23
URL https://arxiv.org/abs/1906.09524v2
PDF https://arxiv.org/pdf/1906.09524v2.pdf
PWC https://paperswithcode.com/paper/fractional-order-backpropagation-neural
Repo
Framework

Responses to a Critique of Artificial Moral Agents

Title Responses to a Critique of Artificial Moral Agents
Authors Adam Poulsen, Michael Anderson, Susan L. Anderson, Ben Byford, Fabio Fossa, Erica L. Neely, Alejandro Rosas, Alan Winfield
Abstract The field of machine ethics is concerned with the question of how to embed ethical behaviors, or a means to determine ethical behaviors, into artificial intelligence (AI) systems. The goal is to produce artificial moral agents (AMAs) that are either implicitly ethical (designed to avoid unethical consequences) or explicitly ethical (designed to behave ethically). Van Wynsberghe and Robbins’ (2018) paper Critiquing the Reasons for Making Artificial Moral Agents critically addresses the reasons offered by machine ethicists for pursuing AMA research; this paper, co-authored by machine ethicists and commentators, aims to contribute to the machine ethics conversation by responding to that critique. The reasons for developing AMAs discussed in van Wynsberghe and Robbins (2018) are: it is inevitable that they will be developed; the prevention of harm; the necessity for public trust; the prevention of immoral use; such machines are better moral reasoners than humans, and building these machines would lead to a better understanding of human morality. In this paper, each co-author addresses those reasons in turn. In so doing, this paper demonstrates that the reasons critiqued are not shared by all co-authors; each machine ethicist has their own reasons for researching AMAs. But while we express a diverse range of views on each of the six reasons in van Wynsberghe and Robbins’ critique, we nevertheless share the opinion that the scientific study of AMAs has considerable value.
Tasks
Published 2019-03-17
URL http://arxiv.org/abs/1903.07021v1
PDF http://arxiv.org/pdf/1903.07021v1.pdf
PWC https://paperswithcode.com/paper/responses-to-a-critique-of-artificial-moral
Repo
Framework

Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control

Title Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control
Authors Marta Sarrico, Kai Arulkumaran, Andrea Agostinelli, Pierre Richemond, Anil Anthony Bharath
Abstract Deep networks have enabled reinforcement learning to scale to more complex and challenging domains, but these methods typically require large quantities of training data. An alternative is to use sample-efficient episodic control methods: neuro-inspired algorithms which use non-/semi-parametric models that predict values based on storing and retrieving previously experienced transitions. One way to further improve the sample efficiency of these approaches is to use more principled exploration strategies. In this work, we therefore propose maximum entropy mellowmax episodic control (MEMEC), which samples actions according to a Boltzmann policy with a state-dependent temperature. We demonstrate that MEMEC outperforms other uncertainty- and softmax-based exploration methods on classic reinforcement learning environments and Atari games, achieving both more rapid learning and higher final rewards.
Tasks Atari Games
Published 2019-11-21
URL https://arxiv.org/abs/1911.09615v1
PDF https://arxiv.org/pdf/1911.09615v1.pdf
PWC https://paperswithcode.com/paper/sample-efficient-reinforcement-learning-with-2
Repo
Framework
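
The MEMEC abstract above mentions a Boltzmann policy with a state-dependent temperature. As background, the mellowmax operator and the maximum-entropy Boltzmann policy that matches it can be sketched as follows. This follows the standard mellowmax construction as I understand it: the inverse temperature is chosen so that the policy's expected value equals the mellowmax of the action values. The function names, the bisection solver, and the example values are assumptions, and MEMEC's episodic-control machinery is not shown.

```python
import math


def mellowmax(values, omega):
    """Mellowmax: log-mean-exp of the values scaled by omega > 0, computed stably."""
    m = max(values)
    s = sum(math.exp(omega * (v - m)) for v in values)
    return m + math.log(s / len(values)) / omega


def boltzmann_probs(values, beta):
    """Softmax over action values with inverse temperature beta >= 0."""
    m = max(values)
    w = [math.exp(beta * (v - m)) for v in values]
    z = sum(w)
    return [x / z for x in w]


def expected_value(values, beta):
    return sum(p * v for p, v in zip(boltzmann_probs(values, beta), values))


def solve_beta(values, omega, iters=60):
    """Bisection for the beta whose Boltzmann policy has expected value mellowmax(values)."""
    if max(values) - min(values) < 1e-12:
        return 0.0  # all actions are equally good; any beta gives the same policy
    target = mellowmax(values, omega)
    lo, hi = 0.0, 1.0
    while expected_value(values, hi) < target and hi < 1e6:
        hi *= 2.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if expected_value(values, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


q_values = [1.0, 2.0, 3.0]  # hypothetical action values for one state
beta = solve_beta(q_values, omega=5.0)
policy = boltzmann_probs(q_values, beta)
```

Because `beta` is recomputed from the action values of each state, the temperature varies from state to state, which is the sense in which it is state-dependent here.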

Word Embedding for Response-To-Text Assessment of Evidence

Title Word Embedding for Response-To-Text Assessment of Evidence
Authors Haoran Zhang, Diane Litman
Abstract Manually grading the Response to Text Assessment (RTA) is labor intensive. Therefore, an automatic method is being developed for scoring analytical writing when the RTA is administered in large numbers of classrooms. Our long-term goal is to also use this scoring method to provide formative feedback to students and teachers about students’ writing quality. As a first step towards this goal, interpretable features for automatically scoring the evidence rubric of the RTA have been developed. In this paper, we present a simple but promising method for improving evidence scoring by employing the word embedding model. We evaluate our method on corpora of responses written by upper elementary students.
Tasks
Published 2019-08-06
URL https://arxiv.org/abs/1908.01969v1
PDF https://arxiv.org/pdf/1908.01969v1.pdf
PWC https://paperswithcode.com/paper/word-embedding-for-response-to-text-1
Repo
Framework

Comparing Direct and Indirect Representations for Environment-Specific Robot Component Design

Title Comparing Direct and Indirect Representations for Environment-Specific Robot Component Design
Authors Jack Collins, Ben Cottier, David Howard
Abstract We compare two representations used to define the morphology of legs for a hexapod robot, which are subsequently 3D printed. A leg morphology occupies a set of voxels in a voxel grid. One method, a direct representation, uses a collection of Bezier splines. The second, an indirect method, utilises CPPN-NEAT. In our first experiment, we investigate two strategies to post-process the CPPN output and ensure leg length constraints are met. The first uses an adaptive threshold on the output neuron, the second, previously reported in the literature, scales the largest generated artefact to our desired length. In our second experiment, we build on our past work that evolves the tibia of a hexapod to provide environment-specific performance benefits. We compare the performance of our direct and indirect legs across three distinct environments, represented in a high-fidelity simulator. Results are significant and support our hypothesis that the indirect representation allows for further exploration of the design space leading to improved fitness.
Tasks
Published 2019-01-21
URL http://arxiv.org/abs/1901.06775v1
PDF http://arxiv.org/pdf/1901.06775v1.pdf
PWC https://paperswithcode.com/paper/comparing-direct-and-indirect-representations
Repo
Framework

Panoptic-DeepLab

Title Panoptic-DeepLab
Authors Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen
Abstract We present Panoptic-DeepLab, a bottom-up and single-shot approach for panoptic segmentation. Our Panoptic-DeepLab is conceptually simple and delivers state-of-the-art results. In particular, we adopt the dual-ASPP and dual-decoder structures specific to semantic and instance segmentation, respectively. The semantic segmentation branch is the same as the typical design of any semantic segmentation model (e.g., DeepLab), while the instance segmentation branch is class-agnostic, involving a simple instance center regression. Our single Panoptic-DeepLab sets the new state of the art on all three Cityscapes benchmarks, reaching 84.2% mIoU, 39.0% AP, and 65.5% PQ on the test set, and advances results on the other challenging Mapillary Vistas.
Tasks Instance Segmentation, Panoptic Segmentation, Semantic Segmentation
Published 2019-10-10
URL https://arxiv.org/abs/1910.04751v3
PDF https://arxiv.org/pdf/1910.04751v3.pdf
PWC https://paperswithcode.com/paper/panoptic-deeplab
Repo
Framework
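
The class-agnostic instance branch in the abstract above relies on instance center regression. A common way to turn such predictions into instances is to let every pixel vote for a center via a predicted offset and then assign it to the nearest detected center. The sketch below shows that grouping step on hand-written data; the function name, the dictionary layout, and the example offsets are assumptions, not the paper's code.

```python
def group_pixels(offsets, centers):
    """Assign each pixel to the center closest to (pixel + predicted offset).

    offsets: {(y, x): (dy, dx)} predicted offset from each pixel to its center.
    centers: list of (y, x) detected instance centers.
    Returns {(y, x): index into centers}.
    """
    assignment = {}
    for (y, x), (dy, dx) in offsets.items():
        py, px = y + dy, x + dx
        assignment[(y, x)] = min(
            range(len(centers)),
            key=lambda k: (centers[k][0] - py) ** 2 + (centers[k][1] - px) ** 2,
        )
    return assignment


centers = [(0, 0), (10, 10)]
offsets = {(1, 1): (-1, -1), (9, 8): (1, 2), (4, 4): (-3, -3)}
instance_ids = group_pixels(offsets, centers)
```

Combining these instance ids with the per-pixel class from the semantic branch would give a panoptic label for each pixel.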

Augmenting Neural Nets with Symbolic Synthesis: Applications to Few-Shot Learning

Title Augmenting Neural Nets with Symbolic Synthesis: Applications to Few-Shot Learning
Authors Adithya Murali, P. Madhusudan
Abstract We propose symbolic learning as extensions to standard inductive learning models such as neural nets as a means to solve few-shot learning problems. We devise a class of visual discrimination puzzles that calls for recognizing objects and object relationships as well as learning higher-level concepts from very few images. We propose a two-phase learning framework that combines models learned from large data sets using neural nets and symbolic first-order logic formulas learned from a few-shot learning instance. We develop first-order logic synthesis techniques for discriminating images by using symbolic search and logic constraint solvers. By augmenting neural nets with them, we develop and evaluate a tool that can solve few-shot visual discrimination puzzles with interpretable concepts.
Tasks Few-Shot Learning
Published 2019-07-12
URL https://arxiv.org/abs/1907.05878v1
PDF https://arxiv.org/pdf/1907.05878v1.pdf
PWC https://paperswithcode.com/paper/augmenting-neural-nets-with-symbolic
Repo
Framework

Combine PPO with NES to Improve Exploration

Title Combine PPO with NES to Improve Exploration
Authors Lianjiang Li, Yunrong Yang, Bingna Li
Abstract We introduce two approaches for combining neural evolution strategy (NES) and proximal policy optimization (PPO): parameter transfer and parameter space noise. Parameter transfer is a PPO agent with parameters transferred from a NES agent. Parameter space noise is to directly add noise to the PPO agent's parameters. We demonstrate that PPO could benefit from both methods through experimental comparison on discrete action environments as well as continuous control tasks.
Tasks Continuous Control
Published 2019-05-23
URL https://arxiv.org/abs/1905.09492v2
PDF https://arxiv.org/pdf/1905.09492v2.pdf
PWC https://paperswithcode.com/paper/combine-ppo-with-nes-to-improve-exploration
Repo
Framework
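
The two combination strategies named in the abstract above, parameter transfer and parameter space noise, can be sketched on a flat parameter vector. This is a minimal illustration with hypothetical helper names; how the noise scale is chosen and when the transfer happens are details of the paper that are not reproduced here.

```python
import random


def transfer_parameters(nes_params):
    """Parameter transfer: initialise a PPO agent from a copy of the NES agent's parameters."""
    return list(nes_params)


def add_parameter_noise(params, sigma, rng):
    """Parameter space noise: perturb every parameter with Gaussian noise of std sigma."""
    return [p + rng.gauss(0.0, sigma) for p in params]


rng = random.Random(0)              # fixed seed so the sketch is repeatable
nes_params = [0.2, -1.0, 0.5]       # hypothetical NES solution
ppo_params = transfer_parameters(nes_params)
noisy_params = add_parameter_noise(ppo_params, sigma=0.1, rng=rng)
```

The perturbed copy would be used to collect rollouts, while `ppo_params` remains the vector that the PPO update modifies.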

Evolving Rewards to Automate Reinforcement Learning

Title Evolving Rewards to Automate Reinforcement Learning
Authors Aleksandra Faust, Anthony Francis, Dar Mehta
Abstract Many continuous control tasks have easily formulated objectives, yet using them directly as a reward in reinforcement learning (RL) leads to suboptimal policies. Therefore, many classical control tasks guide RL training using complex rewards, which require tedious hand-tuning. We automate the reward search with AutoRL, an evolutionary layer over standard RL that treats reward tuning as hyperparameter optimization and trains a population of RL agents to find a reward that maximizes the task objective. AutoRL, evaluated on four MuJoCo continuous control tasks over two RL algorithms, shows improvements over baselines, with the biggest uplift for more complex tasks. The video can be found at: https://youtu.be/svdaOFfQyC8.
Tasks Continuous Control, Hyperparameter Optimization
Published 2019-05-18
URL https://arxiv.org/abs/1905.07628v1
PDF https://arxiv.org/pdf/1905.07628v1.pdf
PWC https://paperswithcode.com/paper/evolving-rewards-to-automate-reinforcement
Repo
Framework
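
Treating reward tuning as hyperparameter optimization, as the abstract above describes, amounts to an outer search loop over reward weights in which each candidate is scored by the task objective reached after training. The sketch below uses a simple mutate-and-keep-the-best loop and a toy scoring function in place of RL training. The search strategy, the names, and the toy objective are all assumptions and not AutoRL's actual procedure.

```python
import random


def evolve_reward_weights(evaluate, dim, population=8, generations=20, sigma=0.3, seed=0):
    """Search for reward weights that maximise evaluate(weights).

    evaluate stands in for "train an RL agent with this reward, return the task objective".
    """
    rng = random.Random(seed)
    best = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    best_score = evaluate(best)
    for _ in range(generations):
        for _ in range(population):
            candidate = [w + rng.gauss(0.0, sigma) for w in best]
            score = evaluate(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


def toy_objective(weights):
    """Toy stand-in for the task objective, peaked at weights (1.0, -0.5)."""
    target = [1.0, -0.5]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


weights, score = evolve_reward_weights(toy_objective, dim=2)
```

In a real setting each call to `evaluate` is a full training run, so the population would normally be evaluated in parallel.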

Parallel Scheduled Sampling

Title Parallel Scheduled Sampling
Authors Daniel Duckworth, Arvind Neelakantan, Ben Goodrich, Lukasz Kaiser, Samy Bengio
Abstract Auto-regressive models are widely used in sequence generation problems. The output sequence is typically generated in a predetermined order, one discrete unit (pixel or word or character) at a time. The models are trained by teacher-forcing where ground-truth history is fed to the model as input, which at test time is replaced by the model prediction. Scheduled Sampling aims to mitigate this discrepancy between train and test time by randomly replacing some discrete units in the history with the model’s prediction. While teacher-forced training works well with ML accelerators as the computation can be parallelized across time, Scheduled Sampling involves undesirable sequential processing. In this paper, we introduce a simple technique to parallelize Scheduled Sampling across time. Experimentally, we find the proposed technique leads to equivalent or better performance on image generation, summarization, dialog generation, and translation compared to teacher-forced training. In the dialog response generation task, Parallel Scheduled Sampling achieves a 1.6 BLEU score (11.5%) improvement over teacher-forcing, while in image generation it achieves 20% and 13.8% improvements in Frechet Inception Distance (FID) and Inception Score (IS), respectively. Further, we discuss the effects of different hyper-parameters associated with Scheduled Sampling on the model performance.
Tasks Image Generation
Published 2019-06-11
URL https://arxiv.org/abs/1906.04331v2
PDF https://arxiv.org/pdf/1906.04331v2.pdf
PWC https://paperswithcode.com/paper/parallel-scheduled-sampling
Repo
Framework
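
One way to read the parallelization idea in the abstract above is as a small number of full-sequence passes: the first pass conditions on the gold history, and each later pass conditions on a random mix of gold tokens and the previous pass's predictions. The sketch below builds that mixed history with a toy model. The pass structure, the names `toy_model` and `mixed_history`, and the fixed mixing probability are assumptions for illustration rather than the authors' exact procedure.

```python
import random


def toy_model(history, vocab=10):
    """Toy stand-in for one parallel forward pass.

    The prediction at position t depends only on history[t - 1]; position 0 predicts 0.
    """
    return [0] + [(tok + 1) % vocab for tok in history[:-1]]


def mixed_history(gold, model, num_passes, mix_prob, rng):
    """Build the conditioning sequence after num_passes rounds of parallel mixing."""
    history = list(gold)  # pass 1 conditions on the gold tokens (teacher forcing)
    for _ in range(num_passes):
        preds = model(history)  # all positions are predicted in one pass
        history = [
            p if rng.random() < mix_prob else g for g, p in zip(gold, preds)
        ]
    return history


gold = [0, 5, 2]
rng = random.Random(0)
teacher_forced = mixed_history(gold, toy_model, num_passes=1, mix_prob=0.0, rng=rng)
all_sampled = mixed_history(gold, toy_model, num_passes=1, mix_prob=1.0, rng=rng)
```

With `mix_prob=0.0` the history stays equal to the gold sequence, which is plain teacher forcing. With `mix_prob=1.0` every position is replaced by a model prediction, yet each pass still processes all positions at once.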

Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning

Title Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning
Authors Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
Abstract Recent success in deep reinforcement learning for continuous control has been dominated by model-free approaches which, unlike model-based approaches, do not suffer from representational limitations in making assumptions about the world dynamics and model errors inevitable in complex domains. However, they require a lot of experiences compared to model-based approaches that are typically more sample-efficient. We propose to combine the benefits of the two approaches by presenting an integrated approach called Curious Meta-Controller. Our approach alternates adaptively between model-based and model-free control using a curiosity feedback based on the learning progress of a neural model of the dynamics in a learned latent space. We demonstrate that our approach can significantly improve the sample efficiency and achieve near-optimal performance on learning robotic reaching and grasping tasks from raw-pixel input in both dense and sparse reward settings.
Tasks Continuous Control
Published 2019-05-05
URL https://arxiv.org/abs/1905.01718v1
PDF https://arxiv.org/pdf/1905.01718v1.pdf
PWC https://paperswithcode.com/paper/curious-meta-controller-adaptive-alternation
Repo
Framework