July 28, 2019

Paper Group ANR 227

An Analog Neural Network Computing Engine using CMOS-Compatible Charge-Trap-Transistor (CTT). Inverse Reward Design. A Saak Transform Approach to Efficient, Scalable and Robust Handwritten Digits Recognition. Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages. Extracting Formal Models from Nor …

An Analog Neural Network Computing Engine using CMOS-Compatible Charge-Trap-Transistor (CTT)

Title An Analog Neural Network Computing Engine using CMOS-Compatible Charge-Trap-Transistor (CTT)
Authors Yuan Du, Li Du, Xuefeng Gu, Jieqiong Du, X. Shawn Wang, Boyu Hu, Mingzhe Jiang, Xiaoliang Chen, Junjie Su, Subramanian S. Iyer, Mau-Chung Frank Chang
Abstract An analog neural network computing engine based on the CMOS-compatible charge-trap transistor (CTT) is proposed in this paper. CTT devices are used as analog multipliers. Compared to digital multipliers, the CTT-based analog multiplier shows significant area and power reduction. The proposed computing engine is composed of a scalable CTT multiplier array and energy-efficient analog-digital interfaces. By implementing the sequential analog fabric (SAF), the engine's mixed-signal interfaces are simplified and hardware overhead remains constant regardless of the size of the array. A proof-of-concept 784 by 784 CTT computing engine is implemented using TSMC 28 nm CMOS technology and occupies 0.68 mm². The simulated performance achieves 76.8 TOPS (8-bit) with a 500 MHz clock frequency and consumes 14.8 mW. As an example, we use this computing engine to address a classic pattern recognition problem, classifying handwritten digits from the MNIST database, and obtain performance comparable to state-of-the-art fully connected neural networks using 8-bit fixed-point resolution.
Tasks
Published 2017-09-19
URL http://arxiv.org/abs/1709.06614v4
PDF http://arxiv.org/pdf/1709.06614v4.pdf
PWC https://paperswithcode.com/paper/an-analog-neural-network-computing-engine
Repo
Framework
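
The throughput figure in the abstract can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check: it assumes each multiply-accumulate counts as two operations and that an 8-bit input occupies the array for eight clock cycles. Neither assumption is stated in the abstract, although together they happen to reproduce the quoted number.

```python
# Figures quoted in the abstract.
ROWS, COLS = 784, 784          # size of the CTT multiplier array
CLOCK_HZ = 500e6               # 500 MHz clock
POWER_W = 14.8e-3              # 14.8 mW

# Assumptions made for this check (not stated in the abstract).
OPS_PER_MAC = 2                # one multiply plus one add
CYCLES_PER_8BIT_INPUT = 8      # one bit of the input processed per cycle


def throughput_tops(rows, cols, clock_hz, ops_per_mac, cycles_per_input):
    """Tera-operations per second for a fully used rows x cols array."""
    macs_per_pass = rows * cols
    ops_per_second = macs_per_pass * ops_per_mac * clock_hz / cycles_per_input
    return ops_per_second / 1e12


tops = throughput_tops(ROWS, COLS, CLOCK_HZ, OPS_PER_MAC, CYCLES_PER_8BIT_INPUT)
tops_per_watt = tops / POWER_W  # implied energy efficiency under the same assumptions

print(f"{tops:.1f} TOPS")  # prints 76.8 TOPS
```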

Inverse Reward Design

Title Inverse Reward Design
Authors Dylan Hadfield-Menell, Smitha Milli, Pieter Abbeel, Stuart Russell, Anca Dragan
Abstract Autonomous agents optimize the reward function we give them. What they don’t know is how hard it is for us to design a reward function that actually captures what we want. When designing the reward, we might think of some specific training scenarios, and make sure that the reward will lead to the right behavior in those scenarios. Inevitably, agents encounter new scenarios (e.g., new types of terrain) where optimizing that same reward may lead to undesired behavior. Our insight is that reward functions are merely observations about what the designer actually wants, and that they should be interpreted in the context in which they were designed. We introduce inverse reward design (IRD) as the problem of inferring the true objective based on the designed reward and the training MDP. We introduce approximate methods for solving IRD problems, and use their solution to plan risk-averse behavior in test MDPs. Empirical results suggest that this approach can help alleviate negative side effects of misspecified reward functions and mitigate reward hacking.
Tasks
Published 2017-11-08
URL http://arxiv.org/abs/1711.02827v1
PDF http://arxiv.org/pdf/1711.02827v1.pdf
PWC https://paperswithcode.com/paper/inverse-reward-design
Repo
Framework
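
The inference step described in the abstract can be illustrated on a toy problem. The sketch below is a simplified, finite-candidate approximation and not the authors' implementation: it assumes linear rewards over trajectory features, a uniform prior, a designer who picks proxy rewards that induce good training behavior under the true reward, and a worst-case planner over plausible rewards. The function names, the `beta` and `min_mass` parameters, and the (dirt, grass, lava) features are all hypothetical.

```python
import math


def dot(w, phi):
    return sum(wi * pi for wi, pi in zip(w, phi))


def best_trajectory(w, trajectories):
    """Feature vector of the trajectory that maximizes reward w."""
    return max(trajectories, key=lambda phi: dot(w, phi))


def ird_posterior(proxy, candidates, train_trajs, beta=5.0):
    """Posterior over candidate true rewards given the designed proxy reward."""
    # Behavior that each possible proxy would induce in the training environment.
    induced = [best_trajectory(w, train_trajs) for w in candidates]
    proxy_phi = best_trajectory(proxy, train_trajs)
    scores = []
    for w_true in candidates:
        # Normalize over all proxies the designer could have chosen.
        z = sum(math.exp(beta * dot(w_true, phi)) for phi in induced)
        scores.append(math.exp(beta * dot(w_true, proxy_phi)) / z)
    total = sum(scores)
    return [s / total for s in scores]


def risk_averse_choice(posterior, candidates, test_trajs, min_mass=0.05):
    """Pick the test trajectory with the best worst-case plausible reward."""
    plausible = [w for w, p in zip(candidates, posterior) if p >= min_mass]
    return max(test_trajs, key=lambda phi: min(dot(w, phi) for w in plausible))


# Features: (time on dirt, time on grass, time on lava). Lava never appears in training.
train_trajs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.0)]
test_trajs = [(0.5, 0.0, 0.5), (0.6, 0.4, 0.0)]
proxy = (1.0, -1.0, 0.0)
candidates = [(1.0, -1.0, 0.0), (1.0, -1.0, 1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, 0.0)]

posterior = ird_posterior(proxy, candidates, train_trajs)
naive = best_trajectory(proxy, test_trajs)
cautious = risk_averse_choice(posterior, candidates, test_trajs)
```

In this toy setup the proxy reward says nothing about lava, so the first three candidates remain equally plausible. Planning directly with the proxy takes the lava route, while the worst-case planner avoids it.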

A Saak Transform Approach to Efficient, Scalable and Robust Handwritten Digits Recognition

Title A Saak Transform Approach to Efficient, Scalable and Robust Handwritten Digits Recognition
Authors Yueru Chen, Zhuwei Xu, Shanshan Cai, Yujian Lang, C. -C. Jay Kuo
Abstract An efficient, scalable and robust approach to the handwritten digits recognition problem based on the Saak transform is proposed in this work. First, multi-stage Saak transforms are used to extract a family of joint spatial-spectral representations of input images. Then, the Saak coefficients are used as features and fed into the SVM classifier for the classification task. In order to control the size of Saak coefficients, we adopt a lossy Saak transform that uses the principal component analysis (PCA) to select a smaller set of transform kernels. The handwritten digits recognition problem is well solved by the convolutional neural network (CNN) such as the LeNet-5. We conduct a comparative study on the performance of the LeNet-5 and the Saak-transform-based solutions in terms of scalability and robustness as well as the efficiency of lossless and lossy Saak transforms under a comparable accuracy level.
Tasks
Published 2017-10-29
URL http://arxiv.org/abs/1710.10714v1
PDF http://arxiv.org/pdf/1710.10714v1.pdf
PWC https://paperswithcode.com/paper/a-saak-transform-approach-to-efficient
Repo
Framework

Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages

Title Improving a Multi-Source Neural Machine Translation Model with Corpus Extension for Low-Resource Languages
Authors Gyu-Hyeon Choi, Jong-Hun Shin, Young-Kil Kim
Abstract In machine translation, we often try to collect resources to improve performance. However, most of the language pairs, such as Korean-Arabic and Korean-Vietnamese, do not have enough resources to train machine translation systems. In this paper, we propose the use of synthetic methods for extending a low-resource corpus and apply it to a multi-source neural machine translation model. We showed the improvement of machine translation performance through corpus extension using the synthetic method. We specifically focused on how to create source sentences that can make better target sentences, including the use of synthetic methods. We found that the corpus extension could also improve the performance of multi-source neural machine translation. We showed the corpus extension and multi-source model to be efficient methods for a low-resource language pair. Furthermore, when both methods were used together, we found better machine translation performance.
Tasks Machine Translation
Published 2017-09-26
URL http://arxiv.org/abs/1709.08898v2
PDF http://arxiv.org/pdf/1709.08898v2.pdf
PWC https://paperswithcode.com/paper/improving-a-multi-source-neural-machine-1
Repo
Framework

Extracting Formal Models from Normative Texts

Title Extracting Formal Models from Normative Texts
Authors John J. Camilleri, Normunds Grūzītis, Gerardo Schneider
Abstract We are concerned with the analysis of normative texts - documents based on the deontic notions of obligation, permission, and prohibition. Our goal is to make queries about these notions and verify that a text satisfies certain properties concerning causality of actions and timing constraints. This requires taking the original text and building a representation (model) of it in a formal language, in our case the C-O Diagram formalism. We present an experimental, semi-automatic aid that helps to bridge the gap between a normative text in natural language and its C-O Diagram representation. Our approach consists of using dependency structures obtained from the state-of-the-art Stanford Parser, and applying our own rules and heuristics in order to extract the relevant components. The result is a tabular data structure where each sentence is split into suitable fields, which can then be converted into a C-O Diagram. The process is not fully automatic however, and some post-editing is generally required of the user. We apply our tool and perform experiments on documents from different domains, and report an initial evaluation of the accuracy and feasibility of our approach.
Tasks
Published 2017-06-15
URL http://arxiv.org/abs/1706.04997v1
PDF http://arxiv.org/pdf/1706.04997v1.pdf
PWC https://paperswithcode.com/paper/extracting-formal-models-from-normative-texts-1
Repo
Framework

Distributed Training Large-Scale Deep Architectures

Title Distributed Training Large-Scale Deep Architectures
Authors Shang-Xuan Zou, Chun-Yen Chen, Jui-Lin Wu, Chun-Nan Chou, Chia-Chin Tsao, Kuan-Chieh Tung, Ting-Wei Lin, Cheng-Lung Sung, Edward Y. Chang
Abstract Scale of data and scale of computation infrastructures together enable the current deep learning renaissance. However, training large-scale deep architectures demands both algorithmic improvement and careful system configuration. In this paper, we focus on employing the system approach to speed up large-scale training. Via lessons learned from our routine benchmarking effort, we first identify bottlenecks and overheads that hinder data parallelism. We then devise guidelines that help practitioners to configure an effective system and fine-tune parameters to achieve desired speedup. Specifically, we develop a procedure for setting minibatch size and choosing computation algorithms. We also derive lemmas for determining the quantity of key components such as the number of GPUs and parameter servers. Experiments and examples show that these guidelines help effectively speed up large-scale deep learning training.
Tasks
Published 2017-08-10
URL http://arxiv.org/abs/1709.06622v1
PDF http://arxiv.org/pdf/1709.06622v1.pdf
PWC https://paperswithcode.com/paper/distributed-training-large-scale-deep
Repo
Framework

Tangent: Automatic Differentiation Using Source Code Transformation in Python

Title Tangent: Automatic Differentiation Using Source Code Transformation in Python
Authors Bart van Merriënboer, Alexander B. Wiltschko, Dan Moldovan
Abstract Automatic differentiation (AD) is an essential primitive for machine learning programming systems. Tangent is a new library that performs AD using source code transformation (SCT) in Python. It takes numeric functions written in a syntactic subset of Python and NumPy as input, and generates new Python functions which calculate a derivative. This approach to automatic differentiation is different from existing packages popular in machine learning, such as TensorFlow and Autograd. Advantages are that Tangent generates gradient code in Python which is readable by the user, easy to understand and debug, and has no runtime overhead. Tangent also introduces abstractions for easily injecting logic into the generated gradient code, further improving usability.
Tasks
Published 2017-11-07
URL http://arxiv.org/abs/1711.02712v1
PDF http://arxiv.org/pdf/1711.02712v1.pdf
PWC https://paperswithcode.com/paper/tangent-automatic-differentiation-using-1
Repo
Framework
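
To make the idea of source code transformation concrete, the sketch below pairs a small numeric function with a gradient function written in the style such a tool could plausibly emit: ordinary Python with a forward pass followed by a backward pass. The gradient here is written by hand for illustration and is not actual Tangent output; the names `f` and `df` are hypothetical.

```python
import math


def f(x):
    a = x * x
    b = math.sin(a)
    return b


def df(x, db=1.0):
    """Hand-written reverse-mode derivative of f, one statement per original line."""
    # Forward pass: recompute the intermediate values.
    a = x * x
    # Backward pass: visit the original statements in reverse order.
    da = db * math.cos(a)      # from b = sin(a)
    dx = da * x + da * x       # from a = x * x (x is used twice)
    return dx


def finite_difference(func, x, h=1e-6):
    """Central difference, used only to check the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)


gradient = df(1.5)
check = finite_difference(f, 1.5)
```

Because the derivative is plain source code, it can be read, stepped through in a debugger, and edited like any other function, which is the usability argument the abstract makes.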

An Anthropic Argument against the Future Existence of Superintelligent Artificial Intelligence

Title An Anthropic Argument against the Future Existence of Superintelligent Artificial Intelligence
Authors Toby Pereira
Abstract This paper uses anthropic reasoning to argue for a reduced likelihood that superintelligent AI will come into existence in the future. To make this argument, a new principle is introduced: the Super-Strong Self-Sampling Assumption (SSSSA), building on the Self-Sampling Assumption (SSA) and the Strong Self-Sampling Assumption (SSSA). SSA uses as its sample the relevant observers, whereas SSSA goes further by using observer-moments. SSSSA goes further still and weights each sample proportionally, according to the size of a mind in cognitive terms. SSSSA is required for human observer-samples to be typical, given by how much non-human animals outnumber humans. Given SSSSA, the assumption that humans experience typical observer-samples relies on a future where superintelligent AI does not dominate, which in turn reduces the likelihood of it being created at all.
Tasks
Published 2017-05-08
URL http://arxiv.org/abs/1705.03078v1
PDF http://arxiv.org/pdf/1705.03078v1.pdf
PWC https://paperswithcode.com/paper/an-anthropic-argument-against-the-future
Repo
Framework

Gradual Tuning: a better way of Fine Tuning the parameters of a Deep Neural Network

Title Gradual Tuning: a better way of Fine Tuning the parameters of a Deep Neural Network
Authors Guglielmo Montone, J. Kevin O’Regan, Alexander V. Terekhov
Abstract In this paper we present an alternative strategy for fine-tuning the parameters of a network. We named the technique Gradual Tuning. Once trained on a first task, the network is fine-tuned on a second task by modifying a progressively larger set of the network’s parameters. We test Gradual Tuning on different transfer learning tasks, using networks of different sizes trained with different regularization techniques. The result shows that compared to the usual fine tuning, our approach significantly reduces catastrophic forgetting of the initial task, while still retaining comparable if not better performance on the new task.
Tasks Transfer Learning
Published 2017-11-28
URL http://arxiv.org/abs/1711.10177v1
PDF http://arxiv.org/pdf/1711.10177v1.pdf
PWC https://paperswithcode.com/paper/gradual-tuning-a-better-way-of-fine-tuning
Repo
Framework
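
A progressively larger trainable set can be expressed as a simple schedule. The sketch below is one possible reading only: it assumes layers are unfrozen from the output side backwards and in equal steps, neither of which the abstract specifies. The function name and layer names are hypothetical.

```python
import math


def gradual_tuning_schedule(layer_names, num_stages):
    """Return, for each stage, the layers that are trainable during that stage.

    Assumption: layers are listed from input to output and are unfrozen
    starting from the output side.
    """
    n = len(layer_names)
    schedule = []
    for stage in range(1, num_stages + 1):
        k = max(1, math.ceil(n * stage / num_stages))  # number of trainable layers
        schedule.append(layer_names[n - k:])
    return schedule


layers = ["conv1", "conv2", "fc1", "fc2"]
schedule = gradual_tuning_schedule(layers, num_stages=4)
for stage, trainable in enumerate(schedule, start=1):
    print(stage, trainable)
```

With the usual fine-tuning, every stage would contain all four layers; here the earlier layers, which hold most of what was learned on the first task, are only modified late in the process.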

Information, Privacy and Stability in Adaptive Data Analysis

Title Information, Privacy and Stability in Adaptive Data Analysis
Authors Adam Smith
Abstract Traditional statistical theory assumes that the analysis to be performed on a given data set is selected independently of the data themselves. This assumption breaks down when data are re-used across analyses and the analysis to be performed at a given stage depends on the results of earlier stages. Such dependency can arise when the same data are used by several scientific studies, or when a single analysis consists of multiple stages. How can we draw statistically valid conclusions when data are re-used? This is the focus of a recent and active line of work. At a high level, these results show that limiting the information revealed by earlier stages of analysis controls the bias introduced in later stages by adaptivity. Here we review some known results in this area and highlight the role of information-theoretic concepts, notably several one-shot notions of mutual information.
Tasks
Published 2017-06-02
URL http://arxiv.org/abs/1706.00820v1
PDF http://arxiv.org/pdf/1706.00820v1.pdf
PWC https://paperswithcode.com/paper/information-privacy-and-stability-in-adaptive
Repo
Framework

A Nonconvex Splitting Method for Symmetric Nonnegative Matrix Factorization: Convergence Analysis and Optimality

Title A Nonconvex Splitting Method for Symmetric Nonnegative Matrix Factorization: Convergence Analysis and Optimality
Authors Songtao Lu, Mingyi Hong, Zhengdao Wang
Abstract Symmetric nonnegative matrix factorization (SymNMF) has important applications in data analytics problems such as document clustering, community detection and image segmentation. In this paper, we propose a novel nonconvex variable splitting method for solving SymNMF. The proposed algorithm is guaranteed to converge to the set of Karush-Kuhn-Tucker (KKT) points of the nonconvex SymNMF problem. Furthermore, it achieves a global sublinear convergence rate. We also show that the algorithm can be efficiently implemented in parallel. Further, sufficient conditions are provided which guarantee the global and local optimality of the obtained solutions. Extensive numerical results performed on both synthetic and real data sets suggest that the proposed algorithm converges quickly to a local minimum solution.
Tasks Community Detection, Semantic Segmentation
Published 2017-03-24
URL http://arxiv.org/abs/1703.08267v1
PDF http://arxiv.org/pdf/1703.08267v1.pdf
PWC https://paperswithcode.com/paper/a-nonconvex-splitting-method-for-symmetric
Repo
Framework
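
For orientation, the standard SymNMF problem and one common way to split its variables can be written as follows. The abstract does not give the exact formulation, so the second problem is an assumption about what "variable splitting" means here; $A$ is the symmetric $n \times n$ data matrix and $k$ is the chosen rank.

```latex
% Standard SymNMF: a single nonnegative factor X appears on both sides.
\min_{X \in \mathbb{R}^{n \times k},\; X \ge 0} \; \tfrac{1}{2}\,\bigl\| A - X X^{\top} \bigr\|_F^2

% Assumed splitting: introduce a copy Y of X and tie the two with an equality
% constraint, so that the objective is quadratic in each block separately.
\min_{X,\; Y \ge 0} \; \tfrac{1}{2}\,\bigl\| A - X Y^{\top} \bigr\|_F^2
\quad \text{subject to} \quad X = Y
```

The first problem is quartic in $X$, whereas the split problem is a least-squares problem in $X$ for fixed $Y$ and in $Y$ for fixed $X$, which is what makes alternating or parallel updates attractive.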

An Encoder-Decoder Model for ICD-10 Coding of Death Certificates

Title An Encoder-Decoder Model for ICD-10 Coding of Death Certificates
Authors Elena Tutubalina, Zulfat Miftahutdinov
Abstract Information extraction from textual documents such as hospital records and health-related user discussions has become a topic of intense interest. The task of medical concept coding is to map a variable-length text to medical concepts and corresponding classification codes in some external system or ontology. In this work, we utilize recurrent neural networks to automatically assign ICD-10 codes to fragments of death certificates written in English. We develop end-to-end neural architectures directly tailored to the task, including a basic encoder-decoder architecture for statistical translation. In order to incorporate prior knowledge, we concatenate a vector of cosine similarities between the text and dictionary entries to the encoded state. Applied to a standard benchmark from the CLEF eHealth 2017 challenge, our model achieved an F-measure of 85.01% on the full test set, a significant improvement over the average score of 62.2% for all official participants' approaches.
Tasks
Published 2017-12-04
URL http://arxiv.org/abs/1712.01213v1
PDF http://arxiv.org/pdf/1712.01213v1.pdf
PWC https://paperswithcode.com/paper/an-encoder-decoder-model-for-icd-10-coding-of
Repo
Framework
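
The prior-knowledge step, appending similarities between the input text and dictionary entries to the encoder output, can be sketched as follows. This is an illustration under stated assumptions: it uses raw token counts for the cosine similarity (the abstract does not say how texts are vectorized), a plain list stands in for the encoded state, and the two dictionary entries, `cosine`, and `augment_state` are hypothetical.

```python
import math
from collections import Counter


def cosine(text_a, text_b):
    """Cosine similarity between two texts using raw token counts."""
    ca, cb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(ca[token] * cb[token] for token in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def augment_state(encoded_state, text, dictionary):
    """Concatenate one similarity score per dictionary entry to the encoded state."""
    similarities = [cosine(text, description) for _, description in dictionary]
    return list(encoded_state) + similarities


# Toy dictionary of (code, description) pairs, for illustration only.
dictionary = [
    ("I21", "acute myocardial infarction"),
    ("J18", "pneumonia organism unspecified"),
]
encoded_state = [0.1, -0.3, 0.7]  # stand-in for the encoder's final hidden state
augmented = augment_state(encoded_state, "myocardial infarction", dictionary)
```

The decoder then sees both the learned representation and an explicit signal about which dictionary entries the fragment resembles.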

Mixture of Counting CNNs: Adaptive Integration of CNNs Specialized to Specific Appearance for Crowd Counting

Title Mixture of Counting CNNs: Adaptive Integration of CNNs Specialized to Specific Appearance for Crowd Counting
Authors Shohei Kumagai, Kazuhiro Hotta, Takio Kurita
Abstract This paper proposes a crowd counting method. Crowd counting is difficult because of large appearance changes of a target, which are caused by density and scale changes. Conventional crowd counting methods generally utilize one predictor (e.g., a regression or multi-class classifier). However, a single predictor cannot count targets with large appearance changes well. In this paper, we propose to predict the number of targets using multiple CNNs, each specialized to a specific appearance, and those CNNs are adaptively selected according to the appearance of a test image. By integrating the selected CNNs, the proposed method is robust to large appearance changes. In experiments, we confirm that the proposed method can count crowds with lower counting error than a single CNN and than an integration of CNNs with fixed weights. Moreover, we confirm that each predictor is automatically specialized to a specific appearance.
Tasks Crowd Counting
Published 2017-03-28
URL http://arxiv.org/abs/1703.09393v1
PDF http://arxiv.org/pdf/1703.09393v1.pdf
PWC https://paperswithcode.com/paper/mixture-of-counting-cnns-adaptive-integration
Repo
Framework
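
Adaptive integration of several specialized predictors is commonly implemented as a gated weighted sum. The sketch below assumes a softmax gate over per-image scores, which the abstract does not confirm; the expert counts and gate scores are made-up numbers standing in for CNN outputs, and `mixture_count` is a hypothetical name.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_count(expert_counts, gate_scores):
    """Combine per-expert count estimates using image-dependent gate weights."""
    weights = softmax(gate_scores)
    return sum(w * c for w, c in zip(weights, expert_counts))


# Three experts, e.g. specialized to sparse, medium, and dense crowds.
expert_counts = [12.0, 40.0, 95.0]

# A fixed-weight integration treats every image the same way.
fixed = mixture_count(expert_counts, gate_scores=[0.0, 0.0, 0.0])

# An adaptive gate that recognizes a dense scene trusts the third expert.
adaptive = mixture_count(expert_counts, gate_scores=[-4.0, 0.0, 6.0])
```

With equal gate scores the result is simply the mean of the experts, which corresponds to the fixed-weight baseline mentioned in the abstract.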

Intrinsically Motivated Acquisition of Modular Slow Features for Humanoids in Continuous and Non-Stationary Environments

Title Intrinsically Motivated Acquisition of Modular Slow Features for Humanoids in Continuous and Non-Stationary Environments
Authors Varun Raj Kompella, Laurenz Wiskott
Abstract A compact information-rich representation of the environment, also called a feature abstraction, can simplify a robot’s task of mapping its raw sensory inputs to useful action sequences. However, in environments that are non-stationary and only partially observable, a single abstraction is probably not sufficient to encode most variations. Therefore, learning multiple sets of spatially or temporally local, modular abstractions of the inputs would be beneficial. How can a robot learn these local abstractions without a teacher? More specifically, how can it decide from where and when to start learning a new abstraction? A recently proposed algorithm called Curious Dr. MISFA addresses this problem. The algorithm is based on two underlying learning principles called artificial curiosity and slowness. The former is used to make the robot self-motivated to explore by rewarding itself whenever it makes progress learning an abstraction; the latter is used to update the abstraction by extracting slowly varying components from raw sensory inputs. Curious Dr. MISFA’s application is, however, limited to discrete domains constrained by a pre-defined state space and has design limitations that make it unstable in certain situations. This paper presents a significant improvement that is applicable to continuous environments, is computationally less expensive, simpler to use with fewer hyperparameters, and stable in certain non-stationary environments. We demonstrate the efficacy and stability of our method in a vision-based robot simulator.
Tasks
Published 2017-01-17
URL http://arxiv.org/abs/1701.04663v1
PDF http://arxiv.org/pdf/1701.04663v1.pdf
PWC https://paperswithcode.com/paper/intrinsically-motivated-acquisition-of
Repo
Framework

Spiking neurons with short-term synaptic plasticity form superior generative networks

Title Spiking neurons with short-term synaptic plasticity form superior generative networks
Authors Luziwei Leng, Roman Martel, Oliver Breitwieser, Ilja Bytschok, Walter Senn, Johannes Schemmel, Karlheinz Meier, Mihai A. Petrovici
Abstract Spiking networks that perform probabilistic inference have been proposed both as models of cortical computation and as candidates for solving problems in machine learning. However, the evidence for spike-based computation being in any way superior to non-spiking alternatives remains scarce. We propose that short-term plasticity can provide spiking networks with distinct computational advantages compared to their classical counterparts. In this work, we use networks of leaky integrate-and-fire neurons that are trained to perform both discriminative and generative tasks in their forward and backward information processing paths, respectively. During training, the energy landscape associated with their dynamics becomes highly diverse, with deep attractor basins separated by high barriers. Classical algorithms solve this problem by employing various tempering techniques, which are both computationally demanding and require global state updates. We demonstrate how similar results can be achieved in spiking networks endowed with local short-term synaptic plasticity. Additionally, we discuss how these networks can even outperform tempering-based approaches when the training data is imbalanced. We thereby show how biologically inspired, local, spike-triggered synaptic dynamics based simply on a limited pool of synaptic resources can allow spiking networks to outperform their non-spiking relatives.
Tasks
Published 2017-09-24
URL http://arxiv.org/abs/1709.08166v3
PDF http://arxiv.org/pdf/1709.08166v3.pdf
PWC https://paperswithcode.com/paper/spiking-neurons-with-short-term-synaptic
Repo
Framework