Paper Group ANR 1345
Federated User Representation Learning. Learning Novel Policies For Tasks. Bayesian Networks based Hybrid Quantum-Classical Machine Learning Approach to Elucidate Gene Regulatory Pathways. TiM-DNN: Ternary in-Memory accelerator for Deep Neural Networks. Fractional-order Backpropagation Neural Networks: Modified Fractional-order Steepest Descent Method for Family of Backpropagation Neural Networks. Responses to a Critique of Artificial Moral Agents. Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control. Word Embedding for Response-To-Text Assessment of Evidence. Comparing Direct and Indirect Representations for Environment-Specific Robot Component Design. Panoptic-DeepLab. Augmenting Neural Nets with Symbolic Synthesis: Applications to Few-Shot Learning. Combine PPO with NES to Improve Exploration. Evolving Rewards to Automate Reinforcement Learning. Parallel Scheduled Sampling. Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning.
Federated User Representation Learning
Title | Federated User Representation Learning |
Authors | Duc Bui, Kshitiz Malik, Jack Goetz, Honglei Liu, Seungwhan Moon, Anuj Kumar, Kang G. Shin |
Abstract | Collaborative personalization, such as through learned user representations (embeddings), can improve the prediction accuracy of neural-network-based models significantly. We propose Federated User Representation Learning (FURL), a simple, scalable, privacy-preserving and resource-efficient way to utilize existing neural personalization techniques in the Federated Learning (FL) setting. FURL divides model parameters into federated and private parameters. Private parameters, such as private user embeddings, are trained locally, but unlike federated parameters, they are not transferred to or averaged on the server. We show theoretically that this parameter split does not affect training for most model personalization approaches. Storing user embeddings locally not only preserves user privacy, but also improves memory locality of personalization compared to on-server training. We evaluate FURL on two datasets, demonstrating a significant improvement in model quality with 8% and 51% performance increases, and approximately the same level of performance as centralized training with only 0% and 4% reductions. Furthermore, we show that user embeddings learned in FL and the centralized setting have a very similar structure, indicating that FURL can learn collaboratively through the shared parameters while preserving user privacy. |
Tasks | Representation Learning |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12535v1 |
https://arxiv.org/pdf/1909.12535v1.pdf | |
PWC | https://paperswithcode.com/paper/federated-user-representation-learning |
Repo | |
Framework | |
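As a rough illustration of FURL's federated/private parameter split (not the paper's implementation; the parameter names, shapes, and toy "gradients" below are invented), shared parameters are averaged on the server while per-user embeddings never leave the client:

```python
# Minimal sketch of the FURL parameter split. Local training touches both
# parameter groups, but FedAvg runs over the shared ("federated") group only.
import numpy as np

rng = np.random.default_rng(0)

def client_update(federated, private, lr=0.1):
    """One toy local step; random gradients stand in for real SGD."""
    fed_grad = {k: rng.normal(size=v.shape) for k, v in federated.items()}
    priv_grad = rng.normal(size=private.shape)
    new_fed = {k: v - lr * fed_grad[k] for k, v in federated.items()}
    new_priv = private - lr * priv_grad   # private embedding: never uploaded
    return new_fed, new_priv

def federated_round(server_params, clients):
    updates = []
    for c in clients:
        fed, c["embedding"] = client_update(server_params, c["embedding"])
        updates.append(fed)
    # FedAvg over shared parameters only; private embeddings are excluded.
    return {k: np.mean([u[k] for u in updates], axis=0) for k in server_params}

server = {"dense/w": np.zeros((4, 2))}
clients = [{"embedding": rng.normal(size=8)} for _ in range(3)]
for _ in range(5):
    server = federated_round(server, clients)
```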
Learning Novel Policies For Tasks
Title | Learning Novel Policies For Tasks |
Authors | Yunbo Zhang, Wenhao Yu, Greg Turk |
Abstract | In this work, we present a reinforcement learning algorithm that can find a variety of policies (novel policies) for a task that is given by a task reward function. Our method does this by creating a second reward function that recognizes previously seen state sequences and scores them by novelty, which is measured using autoencoders trained on state sequences from previously discovered policies. We present a two-objective update technique for policy gradient algorithms in which each update of the policy is a compromise between improving the task reward and improving the novelty reward. Using this method, we end up with a collection of policies that each solve the given task while carrying out action sequences that are distinct from one another. We demonstrate this method on maze navigation tasks, a reaching task for a simulated robot arm, and a locomotion task for a hopper. We also demonstrate the effectiveness of our approach on deceptive tasks in which policy gradient methods often get stuck. |
Tasks | Policy Gradient Methods |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.05252v2 |
https://arxiv.org/pdf/1905.05252v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-novel-policies-for-tasks |
Repo | |
Framework | |
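A minimal sketch of the two ingredients described above, with a toy linear autoencoder as the novelty model and a simple convex mix of gradient directions as the "compromise" update (both are illustrative stand-ins, not the paper's architecture or its exact two-objective rule):

```python
# Novelty = reconstruction error of an autoencoder fit to previously seen
# state sequences; each policy update mixes task and novelty directions.
import numpy as np

rng = np.random.default_rng(1)

class LinearAutoencoder:
    """Toy autoencoder over flattened state sequences."""
    def __init__(self, dim, code=2, lr=1e-3):
        self.W = rng.normal(scale=0.1, size=(code, dim))
        self.lr = lr

    def recon_error(self, x):
        return float(np.sum((self.W.T @ (self.W @ x) - x) ** 2))

    def fit_step(self, x):
        y = self.W @ x
        r = self.W.T @ y - x                       # reconstruction residual
        grad = 2 * (np.outer(y, r) + np.outer(self.W @ r, x))
        self.W -= self.lr * grad

def mixed_update(theta, g_task, g_novelty, alpha=0.7):
    """Compromise step: alpha trades the task reward off against novelty."""
    return theta + alpha * g_task + (1 - alpha) * g_novelty

ae = LinearAutoencoder(dim=6)
for _ in range(500):                               # "previously seen" sequences
    ae.fit_step(rng.normal(size=6))
novelty_reward = ae.recon_error(rng.normal(size=6))  # high error => unfamiliar
theta = mixed_update(np.zeros(3), rng.normal(size=3), rng.normal(size=3))
```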
Bayesian Networks based Hybrid Quantum-Classical Machine Learning Approach to Elucidate Gene Regulatory Pathways
Title | Bayesian Networks based Hybrid Quantum-Classical Machine Learning Approach to Elucidate Gene Regulatory Pathways |
Authors | Radhakrishnan Balu, Ajinkya Borle |
Abstract | We report a scalable hybrid quantum-classical machine learning framework to build Bayesian networks (BNs) that capture the conditional dependence and causal relationships of random variables. The generation of a BN consists of finding a directed acyclic graph (DAG) and the associated joint probability distribution of the nodes consistent with a given dataset. This is a combinatorial problem of structural learning of the underlying graph, starting from a single node and building one arc at a time, that fits a given ensemble using maximum likelihood estimators (MLE). It is cast as an optimization problem consisting of a scoring step performed on a classical computer, penalties for the acyclicity and maximum-number-of-parents constraints, and a search step implemented on a quantum annealer. We assume uniform priors in deriving the Bayesian network, an assumption that can be relaxed by formulating the problem as an estimation of Dirichlet parameters. We demonstrate the utility of the framework by applying it to the problem of elucidating the gene regulatory network for the MAPK/Raf pathway in human T-cells using proteomics data, where the concentrations of proteins, the nodes of the BN, are interpreted as probabilities. |
Tasks | |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.10557v1 |
http://arxiv.org/pdf/1901.10557v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-networks-based-hybrid-quantum |
Repo | |
Framework | |
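A toy sketch of how arc selection can be cast as a quadratic binary optimization of the kind an annealer accepts; the arc scores are invented stand-ins for the paper's MLE scoring, only length-2 cycles are penalized here (the paper's full construction also rules out longer cycles), max-parents is fixed at 1 so the penalty stays quadratic, and brute force stands in for the annealer's search step:

```python
# Binary variable x[(i, j)] = 1 means arc i -> j is in the graph.
import itertools

nodes = [0, 1, 2]
arcs = [(i, j) for i in nodes for j in nodes if i != j]
# Stand-in arc scores (in the paper these come from MLE scoring on the data).
score = dict(zip(arcs, [2.0, 0.1, 1.5, -0.3, 0.8, 1.1]))
P = 10.0   # penalty weight

def energy(x):
    e = -sum(score[a] * x[a] for a in arcs)              # favor well-scoring arcs
    for i, j in arcs:
        if i < j:
            e += P * x[(i, j)] * x[(j, i)]               # forbid 2-cycles
    for j in nodes:                                       # at most one parent each
        ps = [x[(i, j)] for i in nodes if i != j]
        e += P * sum(ps[a] * ps[b] for a in range(len(ps))
                     for b in range(a + 1, len(ps)))
    return e

# Brute force stands in for the annealer over all binary assignments.
best = min((dict(zip(arcs, bits))
            for bits in itertools.product((0, 1), repeat=len(arcs))),
           key=energy)
print(sorted(a for a, v in best.items() if v))
```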
TiM-DNN: Ternary in-Memory accelerator for Deep Neural Networks
Title | TiM-DNN: Ternary in-Memory accelerator for Deep Neural Networks |
Authors | Shubham Jain, Sumeet Kumar Gupta, Anand Raghunathan |
Abstract | The use of lower precision has emerged as a popular technique to optimize the compute and storage requirements of complex Deep Neural Networks (DNNs). In the quest for lower precision, recent studies have shown that ternary DNNs, which represent weights and activations by signed ternary values, represent a promising sweet spot, and achieve accuracy close to full-precision networks on complex tasks such as language modeling and image classification. We propose TiM-DNN, a programmable, in-memory accelerator that is specifically designed to execute ternary DNNs. TiM-DNN supports various ternary representations including unweighted (-1,0,1), symmetric weighted (-a,0,a), and asymmetric weighted (-a,0,b) ternary systems. TiM-DNN is designed using TiM tiles – specialized memory arrays that perform massively parallel signed vector-matrix multiplications on ternary values with a single access. TiM tiles are in turn composed of Ternary Processing Cells (TPCs), new bit-cells that function as both ternary storage units and signed scalar multiplication units. We evaluate an implementation of TiM-DNN in 32nm technology using an architectural simulator calibrated with SPICE simulations and RTL synthesis. TiM-DNN achieves a peak performance of 114 TOPS/s, consumes 0.9 W of power, and occupies 1.96 mm² of chip area, representing a 300x and 388x improvement in TOPS/W and TOPS/mm², respectively, compared to a state-of-the-art NVIDIA Tesla V100 GPU. In comparison to popular DNN accelerators, TiM-DNN achieves 55.2x-240x and 160x-291x improvements in TOPS/W and TOPS/mm², respectively. We compare TiM-DNN with a well-optimized near-memory accelerator for ternary DNNs across a suite of state-of-the-art DNN benchmarks including both deep convolutional and recurrent neural networks, demonstrating a 3.9x-4.7x improvement in system-level energy and a 3.2x-4.2x speedup. |
Tasks | Image Classification, Language Modelling |
Published | 2019-09-15 |
URL | https://arxiv.org/abs/1909.06892v2 |
https://arxiv.org/pdf/1909.06892v2.pdf | |
PWC | https://paperswithcode.com/paper/tim-dnn-ternary-in-memory-accelerator-for |
Repo | |
Framework | |
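A purely functional (software) model of the ternary dot products a TiM tile computes in one array access; the hardware itself (TPC bit-cells, sensing, etc.) is of course not modeled, and the scale values below are illustrative:

```python
# Signed ternary vector-matrix multiply with the asymmetric weighted
# (-a, 0, +b) encoding listed in the abstract.
import numpy as np

def ternary_decode(t, a, b):
    """Map ternary symbols {-1, 0, +1} to weighted values {-a, 0, +b}."""
    return np.where(t < 0, -a, np.where(t > 0, b, 0.0))

rng = np.random.default_rng(2)
W_t = rng.integers(-1, 2, size=(4, 8))        # ternary weights
x_t = rng.integers(-1, 2, size=8)             # ternary activations
W = ternary_decode(W_t, a=0.7, b=1.3)         # asymmetric weighted weights
x = ternary_decode(x_t, a=1.0, b=1.0)         # unweighted (-1, 0, +1) activations
y = W @ x                                     # one "single access" tile operation
print(y)
```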
Fractional-order Backpropagation Neural Networks: Modified Fractional-order Steepest Descent Method for Family of Backpropagation Neural Networks
Title | Fractional-order Backpropagation Neural Networks: Modified Fractional-order Steepest Descent Method for Family of Backpropagation Neural Networks |
Authors | Yi-Fei PU, Jian Wang |
Abstract | This paper offers a novel mathematical approach, the modified Fractional-order Steepest Descent Method (FSDM), for training BackPropagation Neural Networks (BPNNs), which differs from the majority of previous approaches. Fractional calculus is a promising mathematical method that has the potential to assume a prominent role in the applications of neural networks and cybernetics because of its inherent strengths such as long-term memory, nonlocality, and weak singularity. Therefore, to improve the optimization performance of classic first-order BPNNs, in this paper we study whether it is possible to use a modified FSDM to generalize classic first-order BPNNs to modified-FSDM-based Fractional-order Backpropagation Neural Networks (FBPNNs). Motivated by this, the paper proposes an application of fractional calculus to implement a modified-FSDM-based FBPNN whose reverse incremental search proceeds in the negative directions of the approximate fractional-order partial derivatives of the square error. First, the theoretical concept of a modified-FSDM-based FBPNN is described mathematically. Then, the mathematical proof of fractional-order global optimal convergence, an assumption on the structure, and the fractional-order multi-scale global optimization of a modified-FSDM-based FBPNN are analysed in detail. Finally, we perform comparative experiments between a modified-FSDM-based FBPNN and a classic first-order BPNN: an example function approximation, fractional-order multi-scale global optimization, and two performance comparisons on real data. The more efficient search capability of fractional-order multi-scale global optimization in finding the global optimal solution is the major advantage of a modified-FSDM-based FBPNN over a classic first-order BPNN. |
Tasks | |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.09524v2 |
https://arxiv.org/pdf/1906.09524v2.pdf | |
PWC | https://paperswithcode.com/paper/fractional-order-backpropagation-neural |
Repo | |
Framework | |
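To give the flavor of fractional-order descent, here is a 1-D sketch using a common Caputo-type first-order approximation, D^alpha f(w) ≈ f'(w) |w - c|^(1-alpha) / Gamma(2 - alpha) with c the previous iterate; this is a generic illustration from the fractional gradient descent literature, not the paper's exact modified-FSDM update for full BPNNs:

```python
# Fractional-order steepest descent on a toy quadratic objective.
from math import gamma

def f(w):                      # toy objective with minimum at w = 3
    return (w - 3.0) ** 2

def fprime(w):
    return 2.0 * (w - 3.0)

def frac_step(w, w_prev, alpha=0.9, lr=0.1, eps=1e-8):
    # Approximate fractional-order derivative of order alpha at w.
    d_alpha = (fprime(w) * (abs(w - w_prev) + eps) ** (1.0 - alpha)
               / gamma(2.0 - alpha))
    return w - lr * d_alpha

w_prev, w = 0.0, 0.5
for _ in range(100):
    w_prev, w = w, frac_step(w, w_prev)
print(round(w, 3))             # approaches the minimum at w = 3
```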
Responses to a Critique of Artificial Moral Agents
Title | Responses to a Critique of Artificial Moral Agents |
Authors | Adam Poulsen, Michael Anderson, Susan L. Anderson, Ben Byford, Fabio Fossa, Erica L. Neely, Alejandro Rosas, Alan Winfield |
Abstract | The field of machine ethics is concerned with the question of how to embed ethical behaviors, or a means to determine ethical behaviors, into artificial intelligence (AI) systems. The goal is to produce artificial moral agents (AMAs) that are either implicitly ethical (designed to avoid unethical consequences) or explicitly ethical (designed to behave ethically). Van Wynsberghe and Robbins’ (2018) paper Critiquing the Reasons for Making Artificial Moral Agents critically addresses the reasons offered by machine ethicists for pursuing AMA research; this paper, co-authored by machine ethicists and commentators, aims to contribute to the machine ethics conversation by responding to that critique. The reasons for developing AMAs discussed in van Wynsberghe and Robbins (2018) are: it is inevitable that they will be developed; the prevention of harm; the necessity for public trust; the prevention of immoral use; such machines are better moral reasoners than humans; and building these machines would lead to a better understanding of human morality. In this paper, each co-author addresses those reasons in turn. In so doing, this paper demonstrates that the reasons critiqued are not shared by all co-authors; each machine ethicist has their own reasons for researching AMAs. But while we express a diverse range of views on each of the six reasons in van Wynsberghe and Robbins’ critique, we nevertheless share the opinion that the scientific study of AMAs has considerable value. |
Tasks | |
Published | 2019-03-17 |
URL | http://arxiv.org/abs/1903.07021v1 |
http://arxiv.org/pdf/1903.07021v1.pdf | |
PWC | https://paperswithcode.com/paper/responses-to-a-critique-of-artificial-moral |
Repo | |
Framework | |
Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control
Title | Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control |
Authors | Marta Sarrico, Kai Arulkumaran, Andrea Agostinelli, Pierre Richemond, Anil Anthony Bharath |
Abstract | Deep networks have enabled reinforcement learning to scale to more complex and challenging domains, but these methods typically require large quantities of training data. An alternative is to use sample-efficient episodic control methods: neuro-inspired algorithms which use non-/semi-parametric models that predict values based on storing and retrieving previously experienced transitions. One way to further improve the sample efficiency of these approaches is to use more principled exploration strategies. In this work, we therefore propose maximum entropy mellowmax episodic control (MEMEC), which samples actions according to a Boltzmann policy with a state-dependent temperature. We demonstrate that MEMEC outperforms other uncertainty- and softmax-based exploration methods on classic reinforcement learning environments and Atari games, achieving both more rapid learning and higher final rewards. |
Tasks | Atari Games |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09615v1 |
https://arxiv.org/pdf/1911.09615v1.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-reinforcement-learning-with-2 |
Repo | |
Framework | |
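The abstract's "Boltzmann policy with a state-dependent temperature" is the maximum-entropy mellowmax construction of Asadi and Littman: compute mm_omega(q) = log(mean(exp(omega * q))) / omega, then pick the inverse temperature beta so the expected advantage under the policy is zero. A minimal sketch of that policy (the episodic value store that produces the q-values is not modeled, and omega is an illustrative choice):

```python
# Maximum-entropy mellowmax action selection for one state's q-values.
import numpy as np
from scipy.optimize import brentq

def mellowmax(q, omega=5.0):
    return np.log(np.mean(np.exp(omega * q))) / omega

def memec_policy(q, omega=5.0):
    mm = mellowmax(q, omega)
    adv = q - mm
    if np.allclose(adv, 0.0):                   # degenerate: uniform policy
        return np.full_like(q, 1.0 / len(q))
    # State-dependent inverse temperature: root of sum_i exp(b*adv_i)*adv_i.
    beta = brentq(lambda b: np.sum(np.exp(b * adv) * adv), -100.0, 100.0)
    p = np.exp(beta * adv)
    return p / p.sum()

q_values = np.array([1.0, 1.2, 0.3])
probs = memec_policy(q_values)
action = np.random.default_rng(3).choice(len(q_values), p=probs)
```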
Word Embedding for Response-To-Text Assessment of Evidence
Title | Word Embedding for Response-To-Text Assessment of Evidence |
Authors | Haoran Zhang, Diane Litman |
Abstract | Manually grading the Response to Text Assessment (RTA) is labor intensive. Therefore, an automatic method is being developed for scoring analytical writing when the RTA is administered in large numbers of classrooms. Our long-term goal is to also use this scoring method to provide formative feedback to students and teachers about students’ writing quality. As a first step towards this goal, interpretable features for automatically scoring the evidence rubric of the RTA have been developed. In this paper, we present a simple but promising method for improving evidence scoring by employing the word embedding model. We evaluate our method on corpora of responses written by upper elementary students. |
Tasks | |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.01969v1 |
https://arxiv.org/pdf/1908.01969v1.pdf | |
PWC | https://paperswithcode.com/paper/word-embedding-for-response-to-text-1 |
Repo | |
Framework | |
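A toy illustration of embedding-based evidence matching in the spirit described above: count the content words in a student response that lie close (by cosine similarity) to words from the article's evidence list. The tiny hand-made vectors and the threshold are placeholders for a real word-embedding model, and the actual RTA rubric features are richer than this:

```python
# Count response words whose embeddings match any evidence word.
import numpy as np

emb = {  # toy 3-d "embeddings"
    "hunger": np.array([1.0, 0.1, 0.0]), "starvation": np.array([0.9, 0.2, 0.1]),
    "malaria": np.array([0.0, 1.0, 0.1]), "disease": np.array([0.1, 0.9, 0.2]),
    "school": np.array([0.0, 0.1, 1.0]),
}

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def evidence_matches(response_words, evidence_words, thresh=0.8):
    hits = 0
    for w in response_words:
        if w in emb and any(t in emb and cos(emb[w], emb[t]) >= thresh
                            for t in evidence_words):
            hits += 1
    return hits

print(evidence_matches(["starvation", "school"], ["hunger", "malaria"]))  # -> 1
```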
Comparing Direct and Indirect Representations for Environment-Specific Robot Component Design
Title | Comparing Direct and Indirect Representations for Environment-Specific Robot Component Design |
Authors | Jack Collins, Ben Cottier, David Howard |
Abstract | We compare two representations used to define the morphology of legs for a hexapod robot, which are subsequently 3D printed. A leg morphology occupies a set of voxels in a voxel grid. One method, a direct representation, uses a collection of Bezier splines. The second, an indirect method, utilises CPPN-NEAT. In our first experiment, we investigate two strategies to post-process the CPPN output and ensure leg length constraints are met. The first uses an adaptive threshold on the output neuron, the second, previously reported in the literature, scales the largest generated artefact to our desired length. In our second experiment, we build on our past work that evolves the tibia of a hexapod to provide environment-specific performance benefits. We compare the performance of our direct and indirect legs across three distinct environments, represented in a high-fidelity simulator. Results are significant and support our hypothesis that the indirect representation allows for further exploration of the design space leading to improved fitness. |
Tasks | |
Published | 2019-01-21 |
URL | http://arxiv.org/abs/1901.06775v1 |
http://arxiv.org/pdf/1901.06775v1.pdf | |
PWC | https://paperswithcode.com/paper/comparing-direct-and-indirect-representations |
Repo | |
Framework | |
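The first post-processing strategy above, an adaptive threshold on the CPPN output neuron, can be sketched as a binary search on the occupancy threshold until the generated leg meets the length constraint (the random "CPPN output" grid and the length measure, counting occupied voxel slices, are illustrative stand-ins):

```python
# Raise the occupancy threshold until the leg fits the length budget.
import numpy as np

rng = np.random.default_rng(4)
cppn_output = rng.random((40, 8, 8))     # activation per voxel (length x w x h)
MAX_LENGTH = 25                           # allowed leg length in voxels

def leg_length(voxels):
    rows = np.any(voxels, axis=(1, 2))    # which length-slices are occupied
    return int(rows.sum())

def adaptive_threshold(act, max_len, lo=0.0, hi=1.0, iters=30):
    for _ in range(iters):                # binary search on the threshold
        mid = (lo + hi) / 2
        if leg_length(act > mid) > max_len:
            lo = mid                      # too long: demand higher activation
        else:
            hi = mid
    return act > hi

voxels = adaptive_threshold(cppn_output, MAX_LENGTH)
print(leg_length(voxels), "<=", MAX_LENGTH)
```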
Panoptic-DeepLab
Title | Panoptic-DeepLab |
Authors | Bowen Cheng, Maxwell D. Collins, Yukun Zhu, Ting Liu, Thomas S. Huang, Hartwig Adam, Liang-Chieh Chen |
Abstract | We present Panoptic-DeepLab, a bottom-up and single-shot approach for panoptic segmentation. Our Panoptic-DeepLab is conceptually simple and delivers state-of-the-art results. In particular, we adopt dual-ASPP and dual-decoder structures specific to semantic segmentation and instance segmentation, respectively. The semantic segmentation branch follows the typical design of a semantic segmentation model (e.g., DeepLab), while the instance segmentation branch is class-agnostic, involving simple instance center regression. Our single Panoptic-DeepLab sets the new state of the art on all three Cityscapes benchmarks, reaching 84.2% mIoU, 39.0% AP, and 65.5% PQ on the test set, and also advances results on the challenging Mapillary Vistas dataset. |
Tasks | Instance Segmentation, Panoptic Segmentation, Semantic Segmentation |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04751v3 |
https://arxiv.org/pdf/1910.04751v3.pdf | |
PWC | https://paperswithcode.com/paper/panoptic-deeplab |
Repo | |
Framework | |
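The class-agnostic instance branch boils down to center regression: every "thing" pixel predicts an offset to its instance center, and pixels are grouped by the nearest predicted center. A minimal sketch of that grouping step (center extraction via keypoint heatmap and NMS is replaced by given center coordinates, and the offset field is faked as if the network regressed perfectly):

```python
# Group pixels into instances by the center each pixel's offset points to.
import numpy as np

H, W = 6, 8
centers = np.array([[1.0, 2.0], [4.0, 6.0]])          # (y, x) instance centers
yy, xx = np.mgrid[0:H, 0:W].astype(float)
# Toy offset field: pretend the network regresses exactly toward the closer center.
d = np.stack([np.hypot(yy - cy, xx - cx) for cy, cx in centers])
true_id = d.argmin(axis=0)
offsets = centers[true_id] - np.stack([yy, xx], axis=-1)  # (H, W, 2) offsets

def group_instances(offsets, centers):
    pred_center = np.stack([yy, xx], axis=-1) + offsets   # where each pixel points
    dist = np.linalg.norm(pred_center[..., None, :] - centers, axis=-1)
    return dist.argmin(axis=-1)                           # instance id per pixel

instance_map = group_instances(offsets, centers)
print(instance_map)
```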
Augmenting Neural Nets with Symbolic Synthesis: Applications to Few-Shot Learning
Title | Augmenting Neural Nets with Symbolic Synthesis: Applications to Few-Shot Learning |
Authors | Adithya Murali, P. Madhusudan |
Abstract | We propose symbolic learning as an extension to standard inductive learning models such as neural nets, as a means to solve few-shot learning problems. We devise a class of visual discrimination puzzles that calls for recognizing objects and object relationships as well as learning higher-level concepts from very few images. We propose a two-phase learning framework that combines models learned from large data sets using neural nets with symbolic first-order logic formulas learned from a few-shot learning instance. We develop first-order logic synthesis techniques for discriminating images using symbolic search and logic constraint solvers. By augmenting neural nets with them, we develop and evaluate a tool that can solve few-shot visual discrimination puzzles with interpretable concepts. |
Tasks | Few-Shot Learning |
Published | 2019-07-12 |
URL | https://arxiv.org/abs/1907.05878v1 |
https://arxiv.org/pdf/1907.05878v1.pdf | |
PWC | https://paperswithcode.com/paper/augmenting-neural-nets-with-symbolic |
Repo | |
Framework | |
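A toy sketch of the symbolic phase: enumerate small logical concepts (here just conjunctions of unary predicates produced by a hypothetical neural phase) and keep one consistent with the few labeled images. Real first-order synthesis with relations and constraint solvers is far richer than this:

```python
# Smallest conjunction of predicates that covers all positives, no negatives.
from itertools import combinations

# Stand-in neural outputs: predicates detected per image.
positives = [{"red", "circle"}, {"red", "circle", "large"}]
negatives = [{"red", "square"}, {"blue", "circle"}]
vocab = sorted(set().union(*positives, *negatives))

def synthesize(max_size=2):
    for k in range(1, max_size + 1):
        for concept in combinations(vocab, k):
            c = set(concept)
            if all(c <= p for p in positives) and not any(c <= n for n in negatives):
                return c
    return None

print(synthesize())  # -> {'circle', 'red'}
```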
Combine PPO with NES to Improve Exploration
Title | Combine PPO with NES to Improve Exploration |
Authors | Lianjiang Li, Yunrong Yang, Bingna Li |
Abstract | We introduce two approaches for combining neural evolution strategy (NES) and proximal policy optimization (PPO): parameter transfer and parameter space noise. Parameter transfer is a PPO agent with parameters transferred from a NES agent. Parameter space noise directly adds noise to the PPO agent's parameters. We demonstrate that PPO can benefit from both methods through experimental comparisons on discrete action environments as well as continuous control tasks. |
Tasks | Continuous Control |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09492v2 |
https://arxiv.org/pdf/1905.09492v2.pdf | |
PWC | https://paperswithcode.com/paper/combine-ppo-with-nes-to-improve-exploration |
Repo | |
Framework | |
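The two combination schemes above reduce to very small operations on a parameter vector; here is a minimal sketch over a flat vector (the NES and PPO training loops themselves are omitted, and the noise scale is an invented value):

```python
# (1) Parameter transfer: initialize PPO from a trained NES agent.
# (2) Parameter space noise: perturb PPO's parameters, not its actions.
import numpy as np

rng = np.random.default_rng(5)

def parameter_transfer(nes_params):
    """Start PPO from the NES solution instead of a random init."""
    return nes_params.copy()

def parameter_space_noise(ppo_params, sigma=0.05):
    """Add Gaussian noise directly to the policy parameters for exploration."""
    return ppo_params + sigma * rng.normal(size=ppo_params.shape)

nes_solution = rng.normal(size=16)        # pretend this came from NES training
ppo_params = parameter_transfer(nes_solution)
for _ in range(3):                        # e.g., once per PPO iteration
    ppo_params = parameter_space_noise(ppo_params)
```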
Evolving Rewards to Automate Reinforcement Learning
Title | Evolving Rewards to Automate Reinforcement Learning |
Authors | Aleksandra Faust, Anthony Francis, Dar Mehta |
Abstract | Many continuous control tasks have easily formulated objectives, yet using them directly as a reward in reinforcement learning (RL) leads to suboptimal policies. Therefore, many classical control tasks guide RL training using complex rewards, which require tedious hand-tuning. We automate the reward search with AutoRL, an evolutionary layer over standard RL that treats reward tuning as hyperparameter optimization and trains a population of RL agents to find a reward that maximizes the task objective. AutoRL, evaluated on four Mujoco continuous control tasks over two RL algorithms, shows improvements over baselines, with the biggest uplift for the more complex tasks. The video can be found at https://youtu.be/svdaOFfQyC8. |
Tasks | Continuous Control, Hyperparameter Optimization |
Published | 2019-05-18 |
URL | https://arxiv.org/abs/1905.07628v1 |
https://arxiv.org/pdf/1905.07628v1.pdf | |
PWC | https://paperswithcode.com/paper/evolving-rewards-to-automate-reinforcement |
Repo | |
Framework | |
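A minimal sketch of AutoRL's outer loop: treat the reward's weights as hyperparameters, train an agent per candidate, and evolve toward the candidate whose trained agent best maximizes the task objective. The stub evaluation function below stands in for full Mujoco RL training, and the evolution scheme (truncation selection plus Gaussian mutation) is an illustrative choice:

```python
# Evolve reward weights by the task objective of the agents they produce.
import numpy as np

rng = np.random.default_rng(6)
TARGET = np.array([1.0, -0.5, 2.0])   # pretend these reward weights train best

def train_and_evaluate(reward_weights):
    """Stand-in for: train an RL agent with this reward, return task objective."""
    return -float(np.sum((reward_weights - TARGET) ** 2))

def evolve_reward(pop_size=16, generations=20, sigma=0.3, n_elite=4):
    pop = [rng.normal(size=3) for _ in range(pop_size)]
    for _ in range(generations):
        elites = sorted(pop, key=train_and_evaluate, reverse=True)[:n_elite]
        pop = elites + [elites[rng.integers(n_elite)] + sigma * rng.normal(size=3)
                        for _ in range(pop_size - n_elite)]
    return max(pop, key=train_and_evaluate)

print(np.round(evolve_reward(), 2))   # drifts toward TARGET
```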
Parallel Scheduled Sampling
Title | Parallel Scheduled Sampling |
Authors | Daniel Duckworth, Arvind Neelakantan, Ben Goodrich, Lukasz Kaiser, Samy Bengio |
Abstract | Auto-regressive models are widely used in sequence generation problems. The output sequence is typically generated in a predetermined order, one discrete unit (pixel or word or character) at a time. The models are trained by teacher-forcing where ground-truth history is fed to the model as input, which at test time is replaced by the model prediction. Scheduled Sampling aims to mitigate this discrepancy between train and test time by randomly replacing some discrete units in the history with the model’s prediction. While teacher-forced training works well with ML accelerators as the computation can be parallelized across time, Scheduled Sampling involves undesirable sequential processing. In this paper, we introduce a simple technique to parallelize Scheduled Sampling across time. Experimentally, we find the proposed technique leads to equivalent or better performance on image generation, summarization, dialog generation, and translation compared to teacher-forced training. On the dialog response generation task, Parallel Scheduled Sampling achieves a 1.6 BLEU (11.5%) improvement over teacher-forcing, while in image generation it achieves 20% and 13.8% improvements in Frechet Inception Distance (FID) and Inception Score (IS), respectively. Further, we discuss the effects of different hyper-parameters associated with Scheduled Sampling on the model performance. |
Tasks | Image Generation |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04331v2 |
https://arxiv.org/pdf/1906.04331v2.pdf | |
PWC | https://paperswithcode.com/paper/parallel-scheduled-sampling |
Repo | |
Framework | |
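The parallelization trick can be sketched in a few lines: instead of sampling one position at a time, run a full teacher-forced pass over all positions in parallel, then randomly splice the model's own predictions into the gold history, optionally repeating for a few passes. The stub "model" below just shifts tokens; a real decoder would condition each position on the mixed history from the previous pass:

```python
# Parallel Scheduled Sampling: mix predictions into gold history per pass.
import numpy as np

rng = np.random.default_rng(7)

def model_predict(history):
    """Stand-in for one parallel decoder pass: a prediction per position."""
    return (history + 1) % 50          # fake vocabulary of 50 tokens

def parallel_scheduled_sampling(gold, mix_prob=0.25, passes=2):
    history = gold.copy()
    for _ in range(passes):
        preds = model_predict(history)             # all positions at once
        mask = rng.random(gold.shape) < mix_prob   # positions to replace
        history = np.where(mask, preds, gold)      # splice predictions into gold
    return history                                  # conditioning input for training

gold_tokens = rng.integers(0, 50, size=12)
mixed = parallel_scheduled_sampling(gold_tokens)
```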
Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning
Title | Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning |
Authors | Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter |
Abstract | Recent success in deep reinforcement learning for continuous control has been dominated by model-free approaches which, unlike model-based approaches, do not suffer from the representational limitations of assumptions about world dynamics, or from the model errors inevitable in complex domains. However, they require a large amount of experience compared to model-based approaches, which are typically more sample-efficient. We propose to combine the benefits of the two approaches in an integrated approach called the Curious Meta-Controller. Our approach alternates adaptively between model-based and model-free control using curiosity feedback based on the learning progress of a neural model of the dynamics in a learned latent space. We demonstrate that our approach can significantly improve sample efficiency and achieve near-optimal performance on learning robotic reaching and grasping tasks from raw-pixel input in both dense and sparse reward settings. |
Tasks | Continuous Control |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01718v1 |
https://arxiv.org/pdf/1905.01718v1.pdf | |
PWC | https://paperswithcode.com/paper/curious-meta-controller-adaptive-alternation |
Repo | |
Framework | |
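A minimal sketch of the curiosity-driven switch: track the dynamics model's rolling prediction error, define learning progress as its recent decrease, and prefer model-based control while the model is still improving. The window size and decision rule are illustrative choices, not the paper's exact formulation:

```python
# Alternate between model-based and model-free control by learning progress.
from collections import deque

class CuriousMetaController:
    def __init__(self, window=20):
        self.errors = deque(maxlen=2 * window)
        self.window = window

    def record(self, prediction_error):
        self.errors.append(prediction_error)

    def learning_progress(self):
        if len(self.errors) < 2 * self.window:
            return 0.0
        recent = sum(list(self.errors)[self.window:]) / self.window
        older = sum(list(self.errors)[:self.window]) / self.window
        return older - recent          # positive while the model improves

    def choose(self):
        return "model_based" if self.learning_progress() > 0 else "model_free"

ctrl = CuriousMetaController()
for t in range(60):
    ctrl.record(1.0 / (t + 1))         # toy: error shrinks as the model learns
print(ctrl.choose())                   # -> "model_based"
```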