April 2, 2020

3430 words 17 mins read

Paper Group ANR 136

Geocoding of trees from street addresses and street-level images. Variational Dropout Sparsification for Particle Identification speed-up. Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning. Pruning Neural Belief Propagation Decoders. Is my Neural Network Neuromorphic? Taxonomy, Recent Trends and Future Directi …

Geocoding of trees from street addresses and street-level images

Title Geocoding of trees from street addresses and street-level images
Authors Daniel Laumer, Nico Lang, Natalie van Doorn, Oisin Mac Aodha, Pietro Perona, Jan Dirk Wegner
Abstract We introduce an approach for updating older tree inventories with geographic coordinates using street-level panorama images and a global optimization framework for tree instance matching. Geolocations of trees in inventories until the early 2000s were recorded using street addresses, whereas newer inventories use GPS. Our method retrofits older inventories with geographic coordinates so that they can be connected with newer inventories, facilitating long-term studies on tree mortality and related questions. What makes this problem challenging is the varying number of trees per street address, the heterogeneous appearance of different tree instances in the images, ambiguous tree positions when viewed from multiple images, and occlusions. To solve this assignment problem, we (i) detect trees in Google Street View panoramas using deep learning, (ii) combine multi-view detections per tree into a single representation, and (iii) match detected trees with the given trees per street address using a global optimization approach. Experiments on more than 50,000 trees in 5 cities in California, USA, show that we are able to assign geographic coordinates to 38% of the street trees, which is a good starting point for long-term studies on the ecosystem services value of street trees at large scale.
Published 2020-02-05
URL https://arxiv.org/abs/2002.01708v1
PDF https://arxiv.org/pdf/2002.01708v1.pdf
PWC https://paperswithcode.com/paper/geocoding-of-trees-from-street-addresses-and
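The assignment step in (iii) can be illustrated with a minimal sketch: given detected tree positions and the tree slots recorded per street address, a globally optimal one-to-one matching minimizes total distance. The coordinates below are invented, and the paper's actual optimization also handles unequal counts and multi-view ambiguity:

```python
import itertools
import math

detections = [(0.0, 1.0), (5.0, 5.2), (9.8, 0.1)]   # positions from panoramas
slots = [(0.2, 1.1), (10.0, 0.0), (5.1, 5.0)]       # positions per address

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Exhaustive search over one-to-one assignments (fine for the handful of
# trees that share a street address).
best = min(itertools.permutations(range(len(slots))),
           key=lambda p: sum(dist(detections[i], slots[j])
                             for i, j in enumerate(p)))
```

Here `best` maps each detection index to its nearest consistent address slot; at city scale a proper assignment solver replaces the brute-force search.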

Variational Dropout Sparsification for Particle Identification speed-up

Title Variational Dropout Sparsification for Particle Identification speed-up
Authors Artem Ryzhikov, Denis Derkach, Mikhail Hushchyn
Abstract Accurate particle identification (PID) is one of the most important aspects of the LHCb experiment. Modern machine learning techniques such as neural networks (NNs) are efficiently applied to this problem and are integrated into the LHCb software. In this research, we discuss novel applications of neural network speed-up techniques to achieve faster PID in LHC upgrade conditions. We show that the best results are obtained using variational dropout sparsification, which provides a prediction (feedforward pass) speed increase of up to a factor of sixteen even when compared to a model with shallow networks.
Published 2020-01-21
URL https://arxiv.org/abs/2001.07493v1
PDF https://arxiv.org/pdf/2001.07493v1.pdf
PWC https://paperswithcode.com/paper/variational-dropout-sparsification-for
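The sparsification idea can be sketched as follows: variational dropout learns a per-weight noise level, and weights whose noise dominates their magnitude (large log-alpha) are pruned. The weights, noise levels, and threshold below are illustrative stand-ins, not values from the paper:

```python
import numpy as np

# alpha = sigma^2 / w^2 measures how noise-dominated a weight is;
# weights with large log-alpha are dropped after training.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 4))
log_sigma2 = rng.normal(loc=-2.0, size=(4, 4))  # stand-in for learned noise

log_alpha = log_sigma2 - np.log(weights ** 2 + 1e-8)

THRESHOLD = 3.0                   # a commonly used cut-off
mask = log_alpha < THRESHOLD      # keep weights with low noise-to-signal
sparse_weights = weights * mask
```

The resulting sparse weight matrix is what yields the feedforward speed-up: pruned rows and columns can be skipped entirely at inference time.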

Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning

Title Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning
Authors Wei Ye, Rui Xie, Jinglei Zhang, Tianxiang Hu, Xiaoyin Wang, Shikun Zhang
Abstract Code summarization generates a brief natural language description for a given source code snippet, while code retrieval fetches relevant source code given a natural language query. Since both tasks aim to model the association between natural language and programming language, recent studies have combined them to improve their performance. However, researchers have not yet been able to effectively leverage the intrinsic connection between the two tasks, as they train them separately or in a pipeline manner, which means their performance cannot be well balanced. In this paper, we propose a novel end-to-end model for the two tasks by introducing an additional code generation task. More specifically, we explicitly exploit the probabilistic correlation between code summarization and code generation with dual learning, and use the two encoders for code summarization and code generation to train the code retrieval task via multi-task learning. We have carried out extensive experiments on an existing dataset of SQL and Python, and the results show that our model significantly improves code retrieval over state-of-the-art models, as well as achieving competitive performance in terms of BLEU score for code summarization.
Tasks Code Generation, Code Summarization, Multi-Task Learning
Published 2020-02-24
URL https://arxiv.org/abs/2002.10198v2
PDF https://arxiv.org/pdf/2002.10198v2.pdf
PWC https://paperswithcode.com/paper/leveraging-code-generation-to-improve-code
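The probabilistic correlation exploited by dual learning can be sketched as a regularizer: the joint log-probability of a (code, summary) pair should be the same whichever way it is factored, log P(c) + log P(s|c) = log P(s) + log P(c|s). The log-probabilities below are made-up numbers, not model outputs:

```python
# Penalize disagreement between the two factorizations of the joint
# probability of a (code, summary) pair.
def duality_loss(log_p_code, log_p_sum_given_code,
                 log_p_sum, log_p_code_given_sum):
    gap = (log_p_code + log_p_sum_given_code) - (log_p_sum + log_p_code_given_sum)
    return gap ** 2

loss = duality_loss(-5.0, -2.0, -4.0, -3.1)
```

In training, a term like this couples the summarization and generation models so that neither direction drifts away from the shared joint distribution.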

Pruning Neural Belief Propagation Decoders

Title Pruning Neural Belief Propagation Decoders
Authors Andreas Buchberger, Christian Häger, Henry D. Pfister, Laurent Schmalen, Alexandre Graell i Amat
Abstract We consider near maximum-likelihood (ML) decoding of short linear block codes based on neural belief propagation (BP) decoding, recently introduced by Nachmani et al. While this method significantly outperforms conventional BP decoding, the underlying parity-check matrix may still limit the overall performance. In this paper, we introduce a method to tailor an overcomplete parity-check matrix to (neural) BP decoding using machine learning. We interpret the weights in the Tanner graph as an indication of how important the connected check nodes (CNs) are to decoding, and use them to prune unimportant CNs. Because the pruning is not tied across iterations, the final decoder uses a different parity-check matrix in each iteration. For Reed-Muller and short low-density parity-check codes, we achieve performance within 0.27 dB and 1.5 dB of the ML performance while reducing the complexity of the decoder.
Published 2020-01-21
URL https://arxiv.org/abs/2001.07464v1
PDF https://arxiv.org/pdf/2001.07464v1.pdf
PWC https://paperswithcode.com/paper/pruning-neural-belief-propagation-decoders

Is my Neural Network Neuromorphic? Taxonomy, Recent Trends and Future Directions in Neuromorphic Engineering

Title Is my Neural Network Neuromorphic? Taxonomy, Recent Trends and Future Directions in Neuromorphic Engineering
Authors Sumon Kumar Bose, Jyotibdha Acharya, Arindam Basu
Abstract In this paper, we review recent work published over the last 3 years under the umbrella of neuromorphic engineering to analyze which features such systems have in common. We see that there is no clear consensus, but each system has one or more of the following features: (1) analog computing; (2) a non-von Neumann architecture and low-precision digital processing; (3) spiking neural networks (SNN) with components closely related to biology. We compare recent machine learning accelerator chips to show that analog processing and reduced bit-precision architectures indeed have the best throughput, energy, and area efficiencies. However, purely digital architectures can also achieve quite high efficiencies simply by adopting a non-von Neumann architecture. Given the mature design automation tools for digital hardware, this raises questions about the likelihood of analog processing being adopted in industrial designs in the near future. Next, we argue for the importance of defining standards and choosing proper benchmarks for progress in neuromorphic system design, and propose some desired characteristics of such benchmarks. Finally, we present brain-machine interfaces as a potential task that fulfils all the criteria of such benchmarks.
Published 2020-02-27
URL https://arxiv.org/abs/2002.11945v1
PDF https://arxiv.org/pdf/2002.11945v1.pdf
PWC https://paperswithcode.com/paper/is-my-neural-network-neuromorphic-taxonomy

A Robot that Learns Connect Four Using Game Theory and Demonstrations

Title A Robot that Learns Connect Four Using Game Theory and Demonstrations
Authors Ali Ayub, Alan Wagner
Abstract Teaching robots new skills with minimal time and effort has long been a goal of artificial intelligence. This paper investigates the use of game-theoretic representations to represent and learn how to play interactive games such as Connect Four. We combine aspects of learning by demonstration, active learning, and game theory, allowing a robot to learn by presenting its understanding of the structure of the game and conducting a question-and-answer session with a person. The paper demonstrates how a robot can be taught the win conditions of Connect Four and its variants using a single demonstration and a few trial examples, with a question-and-answer session led by the robot. Our results show that the robot can learn arbitrary win conditions for Connect Four without any prior knowledge of them and then play the game with a human using the learned win conditions. Our experiments also show that some questions are more important than others for learning the game’s win conditions.
Tasks Active Learning
Published 2020-01-03
URL https://arxiv.org/abs/2001.01004v1
PDF https://arxiv.org/pdf/2001.01004v1.pdf
PWC https://paperswithcode.com/paper/a-robot-that-learns-connect-four-using-game
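The win conditions the robot learns can be made concrete with a small check over a board state; the board encoding and standard 7x6 dimensions below are a common convention, not taken from the paper:

```python
ROWS, COLS, N = 6, 7, 4

def has_win(board, player):
    """board[r][c] in {0, 1, 2}; True if `player` has N pieces in a row."""
    dirs = [(0, 1), (1, 0), (1, 1), (1, -1)]  # horizontal, vertical, diagonals
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in dirs:
                cells = [(r + i * dr, c + i * dc) for i in range(N)]
                if all(0 <= rr < ROWS and 0 <= cc < COLS
                       and board[rr][cc] == player for rr, cc in cells):
                    return True
    return False

board = [[0] * COLS for _ in range(ROWS)]
for c in range(4):
    board[5][c] = 1  # player 1 has a horizontal four on the bottom row
```

Learning "any arbitrary win conditions" amounts to the robot inducing the `dirs`/`N` structure above from a demonstration plus targeted questions, rather than having it hard-coded.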

A logic-based relational learning approach to relation extraction: The OntoILPER system

Title A logic-based relational learning approach to relation extraction: The OntoILPER system
Authors Rinaldo Lima, Bernard Espinasse, Fred Freitas
Abstract Relation Extraction (RE), the task of detecting and characterizing semantic relations between entities in text, has gained much importance over the last two decades, mainly in the biomedical domain. Many papers have been published on Relation Extraction using supervised machine learning techniques. Most of these techniques rely on statistical methods, such as feature-based and tree-kernel-based methods. Such statistical learning techniques are usually based on a propositional hypothesis space for representing examples, i.e., they employ an attribute-value representation of features. This kind of representation has some drawbacks, particularly for the extraction of complex relations that demand more contextual information about the instances involved: it cannot effectively capture structural information from parse trees without loss of information. In this work, we present OntoILPER, a logic-based relational learning approach to Relation Extraction that uses Inductive Logic Programming to generate extraction models in the form of symbolic extraction rules. OntoILPER takes advantage of a rich relational representation of examples, which can alleviate the aforementioned drawbacks. We argue that the proposed relational approach is more suitable for Relation Extraction than statistical ones for several reasons. Moreover, OntoILPER uses a domain ontology that guides the background knowledge generation process and stores the extracted relation instances. The induced extraction rules were evaluated on three protein-protein interaction datasets from the biomedical domain, and the performance of OntoILPER extraction models was compared with other state-of-the-art RE systems. The encouraging results seem to demonstrate the effectiveness of the proposed solution.
Tasks Relational Reasoning, Relation Extraction
Published 2020-01-13
URL https://arxiv.org/abs/2001.04192v1
PDF https://arxiv.org/pdf/2001.04192v1.pdf
PWC https://paperswithcode.com/paper/a-logic-based-relational-learning-approach-to
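The symbolic extraction rules that ILP induces have roughly this shape: a relation holds when a conjunction of background predicates holds over the entities. The predicate and entity names below are invented for illustration:

```python
# A fact base of background knowledge, as ground atoms.
facts = {
    ("protein", "BRCA1"), ("protein", "RAD51"),
    ("dep_path", ("BRCA1", "interacts", "RAD51")),
}

def rule_interacts(e1, e2, facts):
    """interacts(E1, E2) :- protein(E1), protein(E2),
                            dep_path(E1, interacts, E2)."""
    return (("protein", e1) in facts and ("protein", e2) in facts
            and ("dep_path", (e1, "interacts", e2)) in facts)
```

Unlike an attribute-value feature vector, a rule like this ranges over the relational structure (here, a dependency path) directly, which is the representational advantage the abstract argues for.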

Assessing Graph-based Deep Learning Models for Predicting Flash Point

Title Assessing Graph-based Deep Learning Models for Predicting Flash Point
Authors Xiaoyu Sun, Nathaniel J. Krakauer, Alexander Politowicz, Wei-Ting Chen, Qiying Li, Zuoyi Li, Xianjia Shao, Alfred Sunaryo, Mingren Shen, James Wang, Dane Morgan
Abstract Flash points of organic molecules play an important role in preventing flammability hazards, and large databases of measured values exist, although millions of compounds remain unmeasured. To rapidly extend existing data to new compounds, many researchers have used quantitative structure-property relationship (QSPR) analysis to predict flash points. In recent years, graph-based deep learning (GBDL) has emerged as a powerful alternative to traditional QSPR. In this paper, GBDL models are applied to flash point prediction for the first time. We assessed the performance of two GBDL models, the message-passing neural network (MPNN) and the graph convolutional neural network (GCNN), by comparing against 12 previous QSPR studies that use more traditional methods. Our results show that the MPNN both outperforms the GCNN and yields slightly worse but comparable performance relative to previous QSPR studies: the average R2 and mean absolute error (MAE) of the MPNN are, respectively, 2.3% lower and 2.0 K higher than in comparable previous studies. To further explore GBDL models, we collected the largest flash point dataset to date, containing 10,575 unique molecules. The optimized MPNN gives a test R2 of 0.803 and an MAE of 17.8 K on the complete dataset. We also extracted 5 subsets from our integrated dataset based on molecular type (acids, organometallics, organogermaniums, organosilicons, and organotins) and explored the quality of the model on these classes.
Published 2020-02-26
URL https://arxiv.org/abs/2002.11315v1
PDF https://arxiv.org/pdf/2002.11315v1.pdf
PWC https://paperswithcode.com/paper/assessing-graph-based-deep-learning-models
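The core MPNN idea can be sketched in one step: each atom aggregates its neighbours' features, updates its own, and a graph-level readout feeds the property regressor. The toy graph, update rule, and numbers below are illustrative; real MPNNs learn the message and update functions:

```python
import numpy as np

adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)  # a 3-atom chain
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])               # per-atom feature vectors

messages = adjacency @ features          # sum neighbour features
updated = features + 0.5 * messages      # toy update rule

readout = updated.sum(axis=0)            # graph-level vector for regression
```

Stacking several such message-passing steps lets structural information (e.g. functional groups several bonds away) reach each atom before the readout, which is what distinguishes GBDL from fixed-descriptor QSPR.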

Turbulence Enrichment using Physics-informed Generative Adversarial Networks

Title Turbulence Enrichment using Physics-informed Generative Adversarial Networks
Authors Akshay Subramaniam, Man Long Wong, Raunak D Borker, Sravya Nimmagadda, Sanjiva K Lele
Abstract Generative Adversarial Networks (GANs) have been widely used for generating photo-realistic images. A variant of GANs called super-resolution GAN (SRGAN) has already been used successfully for image super-resolution, where low-resolution images can be upsampled to a $4\times$ larger image that is perceptually more realistic. However, when such generative models are used for data describing physical processes, there are additional known constraints that the models must satisfy, including governing equations and boundary conditions. In general, these constraints may not be obeyed by the generated data. In this work, we develop physics-based methods for generative enrichment of turbulence. We incorporate a physics-informed learning approach by modifying the loss function to minimize the residuals of the governing equations for the generated data. We analyze two trained physics-informed models: a supervised model based on convolutional neural networks (CNN) and a generative model based on SRGAN, the Turbulence Enrichment GAN (TEGAN), and show that both outperform simple bicubic interpolation in turbulence enrichment. We also show that physics-informed learning can significantly improve the model’s ability to generate data that satisfies the physical governing equations. Finally, we analyze the enriched data from TEGAN to show that it is able to recover statistical metrics of the flow field, including energy metrics as well as inter-scale energy dynamics and flow morphology.
Tasks Image Super-Resolution, Super-Resolution
Published 2020-03-04
URL https://arxiv.org/abs/2003.01907v2
PDF https://arxiv.org/pdf/2003.01907v2.pdf
PWC https://paperswithcode.com/paper/turbulence-enrichment-using-generative
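A physics-informed loss term of the kind described can be sketched as the residual of a governing equation evaluated on the generated field; here the incompressibility constraint div(u) = 0 via finite differences on a periodic grid. The grid size and test field are made up for illustration:

```python
import numpy as np

def divergence_residual(u, v, dx=1.0):
    """Mean squared residual of du/dx + dv/dy on a periodic 2D grid."""
    du_dx = (np.roll(u, -1, axis=1) - np.roll(u, 1, axis=1)) / (2 * dx)
    dv_dy = (np.roll(v, -1, axis=0) - np.roll(v, 1, axis=0)) / (2 * dx)
    return np.mean((du_dx + dv_dy) ** 2)

# A divergence-free test field: u depends only on y, v only on x.
x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
X, Y = np.meshgrid(x, x)
loss_physics = divergence_residual(np.sin(Y), np.cos(X))
```

During training, a term like `loss_physics` (plus residuals of the momentum equations) is added to the adversarial/content loss, steering the generator toward physically admissible fields.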

Scout: Rapid Exploration of Interface Layout Alternatives through High-Level Design Constraints

Title Scout: Rapid Exploration of Interface Layout Alternatives through High-Level Design Constraints
Authors Amanda Swearngin, Chenglong Wang, Alannah Oleson, James Fogarty, Amy J. Ko
Abstract Although exploring alternatives is fundamental to creating better interface designs, current processes for creating alternatives are generally manual, limiting the alternatives a designer can explore. We present Scout, a system that helps designers rapidly explore alternatives through mixed-initiative interaction with high-level constraints and design feedback. Prior constraint-based layout systems use low-level spatial constraints and generally produce a single design. To support designer exploration of alternatives, Scout introduces high-level constraints based on design concepts (e.g., semantic structure, emphasis, order) and formalizes them into low-level spatial constraints that a solver uses to generate potential layouts. In an evaluation with 18 interface designers, we found that Scout: (1) helps designers create more spatially diverse layouts with similar quality to those created with a baseline tool and (2) can help designers avoid a linear design process and quickly ideate layouts they do not believe they would have thought of on their own.
Published 2020-01-15
URL https://arxiv.org/abs/2001.05424v1
PDF https://arxiv.org/pdf/2001.05424v1.pdf
PWC https://paperswithcode.com/paper/scout-rapid-exploration-of-interface-layout
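The hand-off from a high-level constraint to solver-enumerable low-level choices can be sketched in miniature: a design concept ("emphasis: the header is the largest element") filters the space of concrete size assignments. The element names and size levels below are invented; Scout's real solver handles full spatial layout:

```python
import itertools

elements = ["header", "image", "button"]
sizes = [1, 2, 3]  # relative size levels

def satisfies_emphasis(assignment):
    # High-level constraint: the header is strictly the largest element.
    return all(assignment["header"] > assignment[e]
               for e in elements if e != "header")

layouts = [dict(zip(elements, combo))
           for combo in itertools.product(sizes, repeat=len(elements))
           if satisfies_emphasis(dict(zip(elements, combo)))]
```

Enumerating all satisfying assignments rather than returning a single solution is what lets the tool present multiple diverse alternatives to the designer.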

Fine-grain atlases of functional modes for fMRI analysis

Title Fine-grain atlases of functional modes for fMRI analysis
Authors Kamalaker Dadi, Gaël Varoquaux, Antonia Machlouzarides-Shalit, Krzysztof J. Gorgolewski, Demian Wassermann, Bertrand Thirion, Arthur Mensch
Abstract Population imaging markedly increased the size of functional-imaging datasets, shedding new light on the neural basis of inter-individual differences. Analyzing these large data entails new scalability challenges, computational and statistical. For this reason, brain images are typically summarized in a few signals, for instance reducing voxel-level measures with brain atlases or functional modes. A good choice of the corresponding brain networks is important, as most data analyses start from these reduced signals. We contribute finely-resolved atlases of functional modes, comprising from 64 to 1024 networks. These dictionaries of functional modes (DiFuMo) are trained on millions of fMRI functional brain volumes, 2.4 TB in total, spanning 27 studies and many research groups. We demonstrate the benefits of extracting reduced signals on our fine-grain atlases for many classic functional data analysis pipelines: stimuli decoding from 12,334 brain responses, standard GLM analysis of fMRI across sessions and individuals, extraction of resting-state functional-connectome biomarkers for 2,500 individuals, and data compression and meta-analysis over more than 15,000 statistical maps. In each of these analysis scenarios, we compare the performance of our functional atlases with that of other popular references, and with a simple voxel-level analysis. Results highlight the importance of using high-dimensional “soft” functional atlases to represent and analyse brain activity while capturing its functional gradients. Analyses on high-dimensional modes achieve statistical performance similar to voxel-level analyses, but with much reduced computational cost and higher interpretability. In addition to making the atlases available, we provide meaningful names for these modes, based on their anatomical location, which will facilitate the reporting of results.
Published 2020-03-05
URL https://arxiv.org/abs/2003.05405v1
PDF https://arxiv.org/pdf/2003.05405v1.pdf
PWC https://paperswithcode.com/paper/fine-grain-atlases-of-functional-modes-for
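The signal-reduction step the abstract describes amounts to projecting voxel time series onto the atlas modes. A minimal sketch with random stand-ins for the DiFuMo dictionary (shapes and data are synthetic; real pipelines use the published modes and tools such as nilearn):

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_voxels, n_modes = 10, 50, 4

D = rng.normal(size=(n_modes, n_voxels))   # stand-in for atlas modes
codes = rng.normal(size=(n_time, n_modes)) # ground-truth reduced signals
X = codes @ D                              # synthetic voxel-level signals

# Least-squares projection of voxel signals onto the mode dictionary.
reduced, *_ = np.linalg.lstsq(D.T, X.T, rcond=None)
reduced = reduced.T                        # time x modes
```

Because the synthetic data lie exactly in the span of the modes, the projection recovers the reduced signals; on real fMRI data the same projection gives the compressed representation that downstream decoding and GLM analyses start from.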

Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems

Title Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems
Authors Molei Tao, Tomoki Ohsawa
Abstract The article considers smooth optimization of functions on Lie groups. By generalizing the NAG variational principle in vector spaces (Wibisono et al., 2016) to Lie groups, continuous Lie-NAG dynamics that are guaranteed to converge to a local optimum are obtained. They correspond to momentum versions of gradient flow on Lie groups. The particular case of $\mathsf{SO}(n)$ is then studied in detail, with objective functions corresponding to leading generalized eigenvalue problems: the Lie-NAG dynamics are first made explicit in coordinates and then discretized in a structure-preserving fashion, resulting in optimization algorithms with faithful energy behavior (due to conformal symplecticity) whose iterates remain exactly on the Lie group. Stochastic gradient versions are also investigated. Numerical experiments on both synthetic data and a practical problem (LDA for MNIST) demonstrate the effectiveness of the proposed methods as optimization algorithms (not as a classification method).
Published 2020-01-27
URL https://arxiv.org/abs/2001.10006v1
PDF https://arxiv.org/pdf/2001.10006v1.pdf
PWC https://paperswithcode.com/paper/variational-optimization-on-lie-groups-with
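The flavour of momentum optimization that stays exactly on the group can be sketched with a Cayley retraction on SO(2): the momentum lives in the Lie algebra (skew-symmetric matrices), and each step multiplies the iterate by an exactly orthogonal factor. The objective, step size, and momentum coefficient below are made up, and the paper's conformal-symplectic discretization is more careful than this:

```python
import numpy as np

theta = 1.0
target = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])

def loss(Q):
    return np.sum((Q - target) ** 2)

Q = np.eye(2)
V = np.zeros((2, 2))   # momentum in the Lie algebra (skew-symmetric)
h, mu = 0.1, 0.9

for _ in range(200):
    G = 2 * (Q - target)              # Euclidean gradient of the loss
    A = 0.5 * (Q.T @ G - G.T @ Q)     # project to the Lie algebra
    V = mu * V - h * A                # momentum update
    I = np.eye(2)
    # Cayley retraction: exactly orthogonal for skew-symmetric V.
    Q = Q @ np.linalg.solve(I - 0.5 * V, I + 0.5 * V)
```

Because the Cayley factor is orthogonal whenever `V` is skew-symmetric, `Q` never leaves SO(2) (up to floating-point round-off), mirroring the "exactly remaining on the Lie group" property.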

AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs

Title AutoDNNchip: An Automated DNN Chip Predictor and Builder for Both FPGAs and ASICs
Authors Pengfei Xu, Xiaofan Zhang, Cong Hao, Yang Zhao, Yongan Zhang, Yue Wang, Chaojian Li, Zetong Guan, Deming Chen, Yingyan Lin
Abstract Recent breakthroughs in Deep Neural Networks (DNNs) have fueled a growing demand for DNN chips. However, designing DNN chips is non-trivial because: (1) mainstream DNNs have millions of parameters and operations; (2) the large design space due to the numerous design choices of dataflows, processing elements, memory hierarchy, etc.; and (3) an algorithm/hardware co-design is needed to allow the same DNN functionality to have a different decomposition, which would require different hardware IPs to meet the application specifications. Therefore, DNN chips take a long time to design and require cross-disciplinary experts. To enable fast and effective DNN chip design, we propose AutoDNNchip - a DNN chip generator that can automatically generate both FPGA- and ASIC-based DNN chip implementation given DNNs from machine learning frameworks (e.g., PyTorch) for a designated application and dataset. Specifically, AutoDNNchip consists of two integrated enablers: (1) a Chip Predictor, built on top of a graph-based accelerator representation, which can accurately and efficiently predict a DNN accelerator’s energy, throughput, and area based on the DNN model parameters, hardware configuration, technology-based IPs, and platform constraints; and (2) a Chip Builder, which can automatically explore the design space of DNN chips (including IP selection, block configuration, resource balancing, etc.), optimize chip design via the Chip Predictor, and then generate optimized synthesizable RTL to achieve the target design metrics. Experimental results show that our Chip Predictor’s predicted performance differs from real-measured ones by < 10% when validated using 15 DNN models and 4 platforms (edge-FPGA/TPU/GPU and ASIC). Furthermore, accelerators generated by our AutoDNNchip can achieve better (up to 3.86X improvement) performance than that of expert-crafted state-of-the-art accelerators.
Published 2020-01-06
URL https://arxiv.org/abs/2001.03535v3
PDF https://arxiv.org/pdf/2001.03535v3.pdf
PWC https://paperswithcode.com/paper/autodnnchip-an-automated-dnn-chip-predictor
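The Chip Predictor's role can be illustrated with the coarsest possible performance model, a roofline-style estimate: latency is bounded by whichever of compute and memory traffic is slower. All numbers below are illustrative and not from the paper, whose graph-based predictor models dataflows, memory hierarchy, and IPs in far more detail:

```python
def predict_latency_ms(macs, bytes_moved, peak_macs_per_s, bandwidth_bytes_per_s):
    """Roofline estimate: bound by the slower of compute and memory."""
    compute_time = macs / peak_macs_per_s
    memory_time = bytes_moved / bandwidth_bytes_per_s
    return 1000 * max(compute_time, memory_time)

latency = predict_latency_ms(
    macs=500e6,                    # e.g. a small CNN's multiply-accumulates
    bytes_moved=50e6,              # weights + activations traffic
    peak_macs_per_s=1e12,          # hypothetical accelerator compute peak
    bandwidth_bytes_per_s=25e9,    # hypothetical memory bandwidth
)
```

A design-space explorer like the Chip Builder then varies the hardware parameters (and the mapping, which changes `bytes_moved`) and keeps the configurations the predictor scores best.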

Truncated Hilbert Transform: Uniqueness and a Chebyshev series Expansion Approach

Title Truncated Hilbert Transform: Uniqueness and a Chebyshev series Expansion Approach
Authors Jason You
Abstract We derive a stronger uniqueness result for the case where a function with compact support and its truncated Hilbert transform are known on the same interval, using the Sokhotski-Plemelj formulas. To recover a function from its truncated Hilbert transform, we expand both in Chebyshev polynomial series and suggest two methods for numerically estimating the coefficients. We present computer simulation results showing that the extrapolative procedure works well numerically.
Published 2020-02-06
URL https://arxiv.org/abs/2002.02073v2
PDF https://arxiv.org/pdf/2002.02073v2.pdf
PWC https://paperswithcode.com/paper/truncated-hilbert-transform-uniqueness-and-a
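The coefficient-estimation step can be sketched with NumPy's Chebyshev tools: sample a function on the interval, fit a truncated Chebyshev series, and evaluate it. Here `exp(x)` is just a stand-in for the data; the paper expands the function and its truncated Hilbert transform:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

n = 64
x = np.cos(np.pi * (np.arange(n) + 0.5) / n)  # Chebyshev nodes in (-1, 1)
f = np.exp(x)

coeffs = C.chebfit(x, f, deg=15)  # least-squares fit in the Chebyshev basis
approx = C.chebval(0.3, coeffs)
```

For smooth functions the coefficients decay rapidly, which is what makes a low-degree truncated series (and extrapolation from estimated coefficients) numerically viable.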

Meaning updating of density matrices

Title Meaning updating of density matrices
Authors Bob Coecke, Konstantinos Meichanetzidis
Abstract The DisCoCat model of natural language meaning assigns meaning to a sentence given (i) the meanings of its words and (ii) its grammatical structure. The recently introduced DisCoCirc model extends this to text consisting of multiple sentences. While in DisCoCat all meanings are fixed, in DisCoCirc each sentence updates the meanings of words. In this paper we explore different update mechanisms for DisCoCirc in the case where meaning is encoded in density matrices, which come with several advantages over vectors. Our starting point is two non-commutative update mechanisms, one of them borrowed from the quantum foundations work of Leifer and Spekkens. Unfortunately, neither of these satisfies the desirable algebraic properties, nor is either internal to the meaning category. By passing to double density matrices we do obtain an elegant internal diagrammatic update mechanism. We also show that (commutative) spiders can be cast as an instance of the Leifer-Spekkens update mechanism. This result is of interest to quantum foundations, as it bridges the work in Categorical Quantum Mechanics (CQM) with that on conditional quantum states. Our work also underpins the implementation of text-level natural language processing on quantum hardware (a.k.a. QNLP), for which exponential space gains and quadratic speed-ups have previously been identified.
Published 2020-01-03
URL https://arxiv.org/abs/2001.00862v1
PDF https://arxiv.org/pdf/2001.00862v1.pdf
PWC https://paperswithcode.com/paper/meaning-updating-of-density-matrices
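One simple non-commutative update on density matrices, in the spirit of those the paper compares, conjugates a word's meaning by the square root of an update matrix and renormalizes. Whether this matches the paper's exact mechanisms is not guaranteed; the states below are toy examples:

```python
import numpy as np

rho = np.array([[0.5, 0.5],
                [0.5, 0.5]])      # pure state |+><+| as a word meaning
sigma = np.diag([0.9, 0.1])       # update: context favouring basis state 0

# rho -> sqrt(sigma) rho sqrt(sigma), renormalized to unit trace.
sqrt_sigma = np.diag(np.sqrt(np.diag(sigma)))
updated = sqrt_sigma @ rho @ sqrt_sigma
updated = updated / np.trace(updated)
```

The update shifts the word's meaning toward the context's preferred basis state while keeping it a valid (positive, unit-trace) density matrix; the order of `rho` and `sigma` matters, which is the non-commutativity the abstract refers to.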