January 31, 2020

3584 words 17 mins read

Paper Group ANR 108

Tracking the circulation routes of fresh coins in Bitcoin: A way of identifying coin miners with transaction network structural properties. HiCoRe: Visual Hierarchical Context-Reasoning. Convergence Analysis of Machine Learning Algorithms for the Numerical Solution of Mean Field Control and Games: I – The Ergodic Case. Constrained Reinforcement Le …

Tracking the circulation routes of fresh coins in Bitcoin: A way of identifying coin miners with transaction network structural properties

Title Tracking the circulation routes of fresh coins in Bitcoin: A way of identifying coin miners with transaction network structural properties
Authors Zeng-Xian Lin, Xiao Fan Liu
Abstract Bitcoin draws the highest degree of attention among cryptocurrencies, and coin mining is one of the most important ways of profiting in the Bitcoin ecosystem. This paper constructs fresh coin circulation networks by tracking the transfer routes of fresh coins through transaction referencing in the Bitcoin blockchain. This paper proposes a heuristic algorithm to identify coin miners by comparing coin circulation networks from different mining pools, and thereby infers the common profit distribution schemes of Bitcoin mining pools. Furthermore, this paper characterizes the increasing trend in the number of Bitcoin miners over recent years.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1911.06400v1
PDF https://arxiv.org/pdf/1911.06400v1.pdf
PWC https://paperswithcode.com/paper/tracking-the-circulation-routes-of-fresh
Repo
Framework
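
The core object in the abstract above is a network built by following newly minted (coinbase) coins forward through the transaction graph. The following is a minimal sketch of that idea, assuming a toy transaction format of input/output address lists; the paper's actual heuristics for identifying miners and inferring profit-distribution schemes are not reproduced here.

```python
from collections import deque

# Toy transaction model: each tx spends from input addresses and pays output addresses.
# A coinbase tx has no inputs (it mints fresh coins).
transactions = [
    {"txid": "cb1", "inputs": [], "outputs": ["poolA"]},            # coinbase (fresh coins)
    {"txid": "t1",  "inputs": ["poolA"], "outputs": ["m1", "m2"]},  # pool pays miners
    {"txid": "t2",  "inputs": ["m1"], "outputs": ["exch"]},         # a miner spends
]

def fresh_coin_network(txs, max_hops=2):
    """Follow fresh (coinbase) coins forward for a few hops and return
    the induced address-to-address circulation edges."""
    by_input = {}
    for tx in txs:
        for addr in tx["inputs"]:
            by_input.setdefault(addr, []).append(tx)

    edges, frontier = set(), deque()
    for tx in txs:
        if not tx["inputs"]:                      # coinbase: fresh coins appear here
            for out in tx["outputs"]:
                frontier.append((out, 0))
    while frontier:
        addr, hops = frontier.popleft()
        if hops >= max_hops:
            continue
        for tx in by_input.get(addr, []):         # every tx that spends from this address
            for out in tx["outputs"]:
                edges.add((addr, out))
                frontier.append((out, hops + 1))
    return edges

print(fresh_coin_network(transactions))
# e.g. {('poolA', 'm1'), ('poolA', 'm2'), ('m1', 'exch')}
```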

HiCoRe: Visual Hierarchical Context-Reasoning

Title HiCoRe: Visual Hierarchical Context-Reasoning
Authors Pedro H. Bugatti, Priscila T. M. Saito, Larry S. Davis
Abstract Reasoning about images/objects and their hierarchical interactions is a key concept for the next generation of computer vision approaches. Here we present a new framework to deal with it through visual hierarchical context-based reasoning. Current reasoning methods use the fine-grained labels of images’ objects and their interactions to predict labels for new objects. Our framework modifies this information flow. It goes beyond, and is independent of, the fine-grained object labels when defining the image context. It takes into account the hierarchical interactions between different abstraction levels (i.e. taxonomy) of information in the images and their bounding boxes. Besides these connections, it considers their intrinsic characteristics. To do so, we build graphs and apply them to graph convolution networks combined with convolutional neural networks. We show strong effectiveness over widely used convolutional neural networks, reaching a gain 3 times greater on well-known image datasets. We evaluate the capability and behavior of our framework under different scenarios, considering distinct (superclass, subclass and hierarchical) granularity levels. We also explore attention mechanisms through graph attention networks and pre-processing methods considering dimensionality expansion and/or reduction of the features’ representations. Further analyses compare supervised and semi-supervised approaches.
Tasks
Published 2019-09-02
URL https://arxiv.org/abs/1909.00848v1
PDF https://arxiv.org/pdf/1909.00848v1.pdf
PWC https://paperswithcode.com/paper/hicore-visual-hierarchical-context-reasoning
Repo
Framework
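
The framework builds graphs over image regions and taxonomy levels and feeds them to graph convolution networks alongside CNN features. The snippet below is a rough illustration of that building block: a single numpy GCN layer applied to a small hand-made taxonomy graph. The node features, edges, and layer sizes are assumptions for illustration; the paper's actual architecture, attention variants, and CNN backbone are not reproduced.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: symmetrically normalize A + I,
    then apply a linear transform and ReLU (Kipf & Welling style)."""
    a_hat = adj + np.eye(adj.shape[0])                     # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt
    return np.maximum(a_norm @ feats @ weight, 0.0)

# Toy taxonomy graph: nodes = {0: "animal", 1: "dog", 2: "cat", 3: "dog box", 4: "cat box"}.
# Edges connect superclass <-> subclass and subclass <-> its bounding box.
edges = [(0, 1), (0, 2), (1, 3), (2, 4)]
adj = np.zeros((5, 5))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))    # stand-in for CNN box features / label embeddings
weight = rng.normal(size=(8, 4))
print(gcn_layer(adj, feats, weight).shape)   # (5, 4)
```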

Convergence Analysis of Machine Learning Algorithms for the Numerical Solution of Mean Field Control and Games: I – The Ergodic Case

Title Convergence Analysis of Machine Learning Algorithms for the Numerical Solution of Mean Field Control and Games: I – The Ergodic Case
Authors René Carmona, Mathieu Laurière
Abstract We propose two algorithms for the solution of the optimal control of ergodic McKean-Vlasov dynamics. Both algorithms are based on the approximation of the theoretical solutions by neural networks, the latter being characterized by their architecture and a set of parameters. This allows the use of modern machine learning tools, and efficient implementations of stochastic gradient descent. The first algorithm is based on the idiosyncrasies of the ergodic optimal control problem. We provide a mathematical proof of the convergence of the algorithm, and we analyze rigorously the approximation by controlling the different sources of error. The second method is an adaptation of the deep Galerkin method to the system of partial differential equations arising from the optimality condition. We demonstrate the efficiency of these algorithms on several numerical examples, some of them being chosen to show that our algorithms succeed where existing ones failed. We also argue that both methods can easily be applied to problems in dimensions larger than what can be found in the existing literature. Finally, we illustrate the fact that, although the first algorithm is specifically designed for mean field control problems, the second one is more general and can also be applied to the partial differential equation systems arising in the theory of mean field games.
Tasks
Published 2019-07-13
URL https://arxiv.org/abs/1907.05980v1
PDF https://arxiv.org/pdf/1907.05980v1.pdf
PWC https://paperswithcode.com/paper/convergence-analysis-of-machine-learning
Repo
Framework
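
The first algorithm described above replaces the theoretical control by a neural network and runs stochastic gradient descent on a Monte Carlo estimate of the ergodic cost along simulated McKean-Vlasov dynamics. The sketch below is a heavily simplified one-dimensional illustration of that loop: the dynamics, the mean-field cost, the tiny network, and the derivative-free update are all assumptions standing in for the paper's setup and for a proper SGD step.

```python
import numpy as np

rng = np.random.default_rng(1)

def control(params, x):
    """Small feedforward network for the feedback control alpha_theta(x); x has shape (N, 1)."""
    w1, b1, w2 = params
    return np.tanh(x @ w1 + b1) @ w2

def ergodic_cost(params, n_particles=256, n_steps=200, dt=0.05):
    """Monte Carlo estimate of the long-run average cost of the controlled
    McKean-Vlasov dynamics dX = alpha dt + dW, with the toy mean-field cost
    f(x, mu, a) = x^2 + mean(x)^2 + a^2 (an assumption, not the paper's model)."""
    x = rng.normal(size=(n_particles, 1))
    total = 0.0
    for _ in range(n_steps):
        a = control(params, x)
        total += np.mean(x ** 2 + np.mean(x) ** 2 + a ** 2) * dt
        x = x + a * dt + np.sqrt(dt) * rng.normal(size=x.shape)
    return total / (n_steps * dt)

# Crude derivative-free stand-in for the stochastic gradient step: perturb the
# network parameters and keep the perturbation whenever the estimated cost drops.
params = [0.3 * rng.normal(size=(1, 16)), np.zeros(16), 0.3 * rng.normal(size=(16, 1))]
best = ergodic_cost(params)
for _ in range(30):
    trial = [p + 0.05 * rng.normal(size=p.shape) for p in params]
    cost = ergodic_cost(trial)
    if cost < best:
        params, best = trial, cost
print("estimated ergodic cost:", round(best, 3))
```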

Constrained Reinforcement Learning Has Zero Duality Gap

Title Constrained Reinforcement Learning Has Zero Duality Gap
Authors Santiago Paternain, Luiz F. O. Chamon, Miguel Calvo-Fullana, Alejandro Ribeiro
Abstract Autonomous agents must often deal with conflicting requirements, such as completing tasks using the least amount of time/energy, learning multiple tasks, or dealing with multiple opponents. In the context of reinforcement learning (RL), these problems are addressed by (i) designing a reward function that simultaneously describes all requirements or (ii) combining modular value functions that encode them individually. Though effective, these methods have critical downsides. Designing good reward functions that balance different objectives is challenging, especially as the number of objectives grows. Moreover, implicit interference between goals may lead to performance plateaus as they compete for resources, particularly when training on-policy. Similarly, selecting parameters to combine value functions is at least as hard as designing an all-encompassing reward, given that the effect of their values on the overall policy is not straightforward. The latter issue is generally addressed by formulating the conflicting requirements as a constrained RL problem and solving it using primal-dual methods. These algorithms are in general not guaranteed to converge to the optimal solution since the problem is not convex. This work provides theoretical support for these approaches by establishing that, despite its non-convexity, this problem has zero duality gap, i.e., it can be solved exactly in the dual domain, where it becomes convex. Finally, we show that this result essentially holds if the policy is described by a good parametrization (e.g., neural networks); we connect this result with primal-dual algorithms present in the literature, and we establish convergence to the optimal solution.
Tasks
Published 2019-10-29
URL https://arxiv.org/abs/1910.13393v1
PDF https://arxiv.org/pdf/1910.13393v1.pdf
PWC https://paperswithcode.com/paper/191013393
Repo
Framework
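
The result gives theoretical backing to the primal-dual approach mentioned in the abstract: alternate a policy update on the Lagrangian with dual ascent on the constraint multipliers. Here is a minimal sketch of that bookkeeping on a made-up scalar "policy"; the environment, the constraint limit, and the update rules are illustrative assumptions, not the paper's algorithm.

```python
def primal_dual_step(policy_update, estimate_returns, lam, constraint_limit, dual_lr=0.05):
    """One iteration of a generic primal-dual constrained RL loop:
    (primal) improve the policy on the Lagrangian r - lam * c,
    (dual)   ascend lam on the estimated constraint violation."""
    policy_update(lam)                                  # maximize E[r] - lam * E[c]
    reward_return, cost_return = estimate_returns()     # Monte Carlo estimates in practice
    lam = max(0.0, lam + dual_lr * (cost_return - constraint_limit))
    return lam, reward_return, cost_return

# Tiny stand-in problem: the "policy" is a scalar effort level; higher effort
# gives more reward but also more constrained cost. Purely illustrative.
effort = 0.0

def policy_update(lam, lr=0.1):
    global effort
    # gradient of the Lagrangian (2*effort - effort^2) - lam * effort w.r.t. effort
    effort += lr * ((2 - 2 * effort) - lam)

def estimate_returns():
    return 2 * effort - effort ** 2, effort   # (reward return, cost return)

lam = 0.0
for _ in range(200):
    lam, r, c = primal_dual_step(policy_update, estimate_returns, lam, constraint_limit=0.5)
print(f"effort={effort:.2f}, reward={r:.2f}, cost={c:.2f}, lambda={lam:.2f}")
```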

MorphIC: A 65-nm 738k-Synapse/mm$^2$ Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning

Title MorphIC: A 65-nm 738k-Synapse/mm$^2$ Quad-Core Binary-Weight Digital Neuromorphic Processor with Stochastic Spike-Driven Online Learning
Authors Charlotte Frenkel, Jean-Didier Legat, David Bol
Abstract Recent trends in the field of neural network accelerators investigate weight quantization as a means to increase the resource- and power-efficiency of hardware devices. As full on-chip weight storage is necessary to avoid the high energy cost of off-chip memory accesses, memory reduction requirements for weight storage pushed toward the use of binary weights, which were demonstrated to have a limited accuracy reduction on many applications when quantization-aware training techniques are used. In parallel, spiking neural network (SNN) architectures are explored to further reduce power when processing sparse event-based data streams, while on-chip spike-based online learning appears as a key feature for applications constrained in power and resources during the training phase. However, designing power- and area-efficient spiking neural networks still requires the development of specific techniques in order to leverage on-chip online learning on binary weights without compromising the synapse density. In this work, we demonstrate MorphIC, a quad-core binary-weight digital neuromorphic processor embedding a stochastic version of the spike-driven synaptic plasticity (S-SDSP) learning rule and a hierarchical routing fabric for large-scale chip interconnection. The MorphIC SNN processor embeds a total of 2k leaky integrate-and-fire (LIF) neurons and more than two million plastic synapses for an active silicon area of 2.86mm$^2$ in 65nm CMOS, achieving a high density of 738k synapses/mm$^2$. MorphIC demonstrates an order-of-magnitude improvement in the area-accuracy tradeoff on the MNIST classification task compared to previously-proposed SNNs, while having no penalty in the energy-accuracy tradeoff.
Tasks Quantization
Published 2019-04-17
URL https://arxiv.org/abs/1904.08513v2
PDF https://arxiv.org/pdf/1904.08513v2.pdf
PWC https://paperswithcode.com/paper/morphic-a-65-nm-738k-synapsemm2-quad-core
Repo
Framework
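
MorphIC's learning rule is a stochastic version of spike-driven synaptic plasticity (S-SDSP): on a presynaptic spike, a binary weight is probabilistically potentiated or depressed depending on the postsynaptic membrane potential, gated by a calcium-like activity trace. The snippet below is a rough software model of that idea; the thresholds, gating window, and update probability are assumptions, not the chip's actual parameters.

```python
import random

def s_sdsp_update(weight, v_mem, calcium, p_update=0.25,
                  v_theta=0.5, ca_low=1.0, ca_high=3.0):
    """Stochastic spike-driven synaptic plasticity for one binary weight.

    Called when the presynaptic neuron spikes. The update direction is chosen
    by the postsynaptic membrane potential v_mem relative to v_theta, it is
    gated by a calcium-like activity trace, and it is applied only with
    probability p_update -- the stochasticity that lets binary weights encode
    gradual statistics. Illustrative constants only.
    """
    if not (ca_low <= calcium <= ca_high):
        return weight                       # calcium outside the learning window: no change
    if random.random() > p_update:
        return weight                       # stochastic gating: most spikes change nothing
    return 1 if v_mem > v_theta else 0      # potentiate or depress the binary weight

# Example: repeated presynaptic spikes while the postsynaptic neuron is depolarized.
random.seed(0)
w = 0
for _ in range(20):
    w = s_sdsp_update(w, v_mem=0.8, calcium=2.0)
print("final binary weight:", w)   # likely 1 after a few stochastic updates
```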

Generalizing Natural Language Analysis through Span-relation Representations

Title Generalizing Natural Language Analysis through Span-relation Representations
Authors Zhengbao Jiang, Wei Xu, Jun Araki, Graham Neubig
Abstract A large number of natural language processing tasks exist to analyze the syntax, semantics, and information content of human language. These seemingly very different tasks are usually solved by specially designed architectures. In this paper, we provide the simple insight that a great variety of tasks can be represented in a single unified format consisting of labeling spans and relations between spans, so that a single task-independent model can be used across different tasks. We perform extensive experiments to test this insight on 10 disparate tasks spanning dependency parsing (syntax), semantic role labeling (semantics), relation extraction (information content), aspect-based sentiment analysis (sentiment), and many others, achieving comparable performance to state-of-the-art specialized models. We further demonstrate benefits in multi-task learning. We convert these datasets into a unified format to build a benchmark, which provides a holistic testbed for evaluating future models for generalized natural language analysis.
Tasks Aspect-Based Sentiment Analysis, Dependency Parsing, Multi-Task Learning, Relation Extraction, Semantic Role Labeling, Sentiment Analysis
Published 2019-11-10
URL https://arxiv.org/abs/1911.03822v1
PDF https://arxiv.org/pdf/1911.03822v1.pdf
PWC https://paperswithcode.com/paper/generalizing-natural-language-analysis-1
Repo
Framework
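
The unifying observation is that many analysis tasks reduce to labeling spans and labeling relations between pairs of spans. A small sketch of such a shared representation follows; the dataclasses and the two toy examples are illustrative, not the paper's actual benchmark format.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Span:
    start: int          # token index, inclusive
    end: int            # token index, exclusive
    label: str

@dataclass
class Relation:
    head: int           # index into the spans list
    tail: int
    label: str

@dataclass
class Example:
    tokens: List[str]
    spans: List[Span]
    relations: List[Relation]

# Relation extraction and aspect-based sentiment analysis in the same format.
re_example = Example(
    tokens=["Alice", "works", "for", "Acme"],
    spans=[Span(0, 1, "PER"), Span(3, 4, "ORG")],
    relations=[Relation(0, 1, "employee_of")],
)
absa_example = Example(
    tokens=["great", "battery", "but", "poor", "screen"],
    spans=[Span(1, 2, "aspect"), Span(0, 1, "positive"),
           Span(4, 5, "aspect"), Span(3, 4, "negative")],
    relations=[Relation(1, 0, "describes"), Relation(3, 2, "describes")],
)
print(len(re_example.spans), len(absa_example.relations))
```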

Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning

Title Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning
Authors Shuai Zheng, James T. Kwok
Abstract Stochastic methods with coordinate-wise adaptive stepsizes (such as RMSprop and Adam) have been widely used in training deep neural networks. Despite their fast convergence, they can generalize worse than stochastic gradient descent. In this paper, by revisiting the design of Adagrad, we propose to split the network parameters into blocks and use a blockwise adaptive stepsize. Intuitively, blockwise adaptivity is less aggressive than adaptivity to individual coordinates and can strike a better balance between adaptivity and generalization. We show theoretically that the proposed blockwise adaptive gradient descent has a convergence rate comparable to its counterpart with coordinate-wise adaptive stepsize, but is faster by up to a constant factor. We also study its uniform stability and show that blockwise adaptivity can lead to lower generalization error than coordinate-wise adaptivity. Experimental results show that blockwise adaptive gradient descent converges faster and improves generalization performance over Nesterov’s accelerated gradient and Adam.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09899v1
PDF https://arxiv.org/pdf/1905.09899v1.pdf
PWC https://paperswithcode.com/paper/blockwise-adaptivity-faster-training-and
Repo
Framework
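
The idea is to replace Adagrad's per-coordinate accumulator with one accumulator per parameter block (e.g., per layer). Below is a minimal numpy sketch of a blockwise Adagrad-style update under that reading; the exact update rule and block partitioning in the paper may differ.

```python
import numpy as np

def blockwise_adagrad_step(params, grads, state, lr=0.1, eps=1e-8):
    """One blockwise adaptive step: each block (parameter tensor) keeps a single
    scalar accumulator of its squared gradient norm, instead of one per coordinate."""
    for name, g in grads.items():
        state[name] = state.get(name, 0.0) + float(np.sum(g * g))   # scalar per block
        params[name] -= lr * g / (np.sqrt(state[name] / g.size) + eps)
    return params, state

# Toy quadratic objective with two "blocks" of very different gradient scales.
rng = np.random.default_rng(0)
params = {"w1": rng.normal(size=4), "w2": rng.normal(size=4)}
state = {}
for _ in range(100):
    grads = {"w1": 2.0 * params["w1"], "w2": 200.0 * params["w2"]}   # grad of |w1|^2 + 100|w2|^2
    params, state = blockwise_adagrad_step(params, grads, state)
print({k: np.round(v, 3) for k, v in params.items()})   # both blocks shrink toward zero
```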

Feedback-Based Self-Learning in Large-Scale Conversational AI Agents

Title Feedback-Based Self-Learning in Large-Scale Conversational AI Agents
Authors Pragaash Ponnusamy, Alireza Roshan Ghias, Chenlei Guo, Ruhi Sarikaya
Abstract Today, most large-scale conversational AI agents (e.g. Alexa, Siri, or Google Assistant) are built using manually annotated data to train the different components of the system. Typically, the accuracy of the ML models in these components is improved by manually transcribing and annotating data. As the scope of these systems increases to cover more scenarios and domains, manual annotation to improve the accuracy of these components becomes prohibitively costly and time consuming. In this paper, we propose a system that leverages user-system interaction feedback signals to automate learning without any manual annotation. Users tend to modify a previous query in hopes of fixing an error in the previous turn and getting the right results. These reformulations are often preceded by defective experiences caused by errors in ASR, NLU, ER, or the application. In some cases, users may not properly formulate their requests (e.g. providing a partial title of a song), but gleaning across a wider pool of users and sessions reveals the underlying recurrent patterns. Our proposed self-learning system automatically detects errors, generates reformulations, and deploys fixes to the runtime system to correct different types of errors occurring in different components of the system. In particular, we propose leveraging an absorbing Markov chain model as a collaborative filtering mechanism in a novel attempt to mine these patterns. We show that our approach is highly scalable and able to learn reformulations that reduce Alexa-user errors by pooling anonymized data across millions of customers. The proposed self-learning system achieves a win/loss ratio of 11.8 and effectively reduces the defect rate by more than 30% on utterance-level reformulations in our production A/B tests. To the best of our knowledge, this is the first self-learning large-scale conversational AI system in production.
Tasks
Published 2019-11-06
URL https://arxiv.org/abs/1911.02557v1
PDF https://arxiv.org/pdf/1911.02557v1.pdf
PWC https://paperswithcode.com/paper/feedback-based-self-learning-in-large-scale
Repo
Framework
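
The collaborative-filtering component models rephrase sessions as an absorbing Markov chain over utterances: defective utterances are transient states, successful ones are absorbing, and absorption probabilities suggest which rewrite to deploy. The sketch below computes absorption probabilities with the standard fundamental-matrix formula B = (I - Q)^{-1} R; the utterances and transition probabilities are made up, and the production mining and deployment pipeline is not reproduced.

```python
import numpy as np

# Transient states (defective utterances) and absorbing states (successful rewrites).
transient = ["play maj and dragons", "play imagine dragon"]
absorbing = ["play imagine dragons", "play thunder by imagine dragons"]

# Row-stochastic transition matrix over [transient + absorbing], estimated from
# session logs in practice; hand-made numbers here.
P = np.array([
    # t0    t1    a0    a1
    [0.10, 0.50, 0.30, 0.10],   # from "play maj and dragons"
    [0.00, 0.05, 0.80, 0.15],   # from "play imagine dragon"
    [0.00, 0.00, 1.00, 0.00],   # absorbing
    [0.00, 0.00, 0.00, 1.00],   # absorbing
])
k = len(transient)
Q, R = P[:k, :k], P[:k, k:]
B = np.linalg.inv(np.eye(k) - Q) @ R     # B[i, j] = P(absorbed in j | start at i)

for i, utt in enumerate(transient):
    j = int(np.argmax(B[i]))
    print(f"{utt!r} -> rewrite to {absorbing[j]!r} (p={B[i, j]:.2f})")
```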

Credible Sample Elicitation

Title Credible Sample Elicitation
Authors Yang Liu, Zuyue Fu, Zhuoran Yang, Zhaoran Wang
Abstract It is important to collect credible training samples $(x,y)$ for building data-intensive learning systems (e.g., a deep learning system). In the literature, there is a line of studies on eliciting distributional information from self-interested agents who hold relevant information. Asking people to report a complex distribution $p(x)$, though theoretically viable, is challenging in practice, primarily because of the heavy cognitive load required for human agents to reason about and report this high-dimensional information. This paper introduces a deep learning aided method to incentivize credible sample contributions from selfish and rational agents. The challenge is to design an incentive-compatible score function that scores each reported sample so as to induce truthful reports, instead of arbitrary or even adversarial ones. We show that with accurate estimation of a certain $f$-divergence function we are able to achieve approximate incentive compatibility in eliciting truthful samples. We then present an efficient estimator with theoretical guarantees by studying the variational forms of the $f$-divergence function. Our work complements the literature on information elicitation by introducing sample elicitation. We also show a connection between this sample elicitation problem and $f$-GAN, and how this connection can help reconstruct an estimator of the distribution based on collected samples. Thorough numerical experiments are conducted to validate our designed mechanisms.
Tasks
Published 2019-10-08
URL https://arxiv.org/abs/1910.03155v2
PDF https://arxiv.org/pdf/1910.03155v2.pdf
PWC https://paperswithcode.com/paper/credible-sample-elicitation-by-deep-learning
Repo
Framework
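
The mechanism's scores rest on estimating an $f$-divergence between reported samples and a reference distribution via its variational form. The snippet below sketches only that estimation step, using the Donsker-Varadhan variational form of KL divergence with a trivial linear critic maximized by grid search; the paper's actual scoring mechanism, critic, and divergence choice are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
reported = rng.normal(loc=1.0, scale=1.0, size=5000)    # samples an agent reports
reference = rng.normal(loc=0.0, scale=1.0, size=5000)   # verifier's reference samples

def dv_kl_estimate(p_samples, q_samples, slopes=np.linspace(-3, 3, 121)):
    """Donsker-Varadhan lower bound on KL(P || Q) with a simple linear critic
    T(x) = a * x, maximized over a by grid search:
    KL >= max_a  E_P[T(x)] - log E_Q[exp(T(x))]."""
    best = -np.inf
    for a in slopes:
        bound = np.mean(a * p_samples) - np.log(np.mean(np.exp(a * q_samples)))
        best = max(best, bound)
    return best

# True KL(N(1,1) || N(0,1)) = 0.5; the estimate should land close to that value.
print(round(dv_kl_estimate(reported, reference), 3))
```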

Discriminative Consistent Domain Generation for Semi-supervised Learning

Title Discriminative Consistent Domain Generation for Semi-supervised Learning
Authors Jun Chen, Heye Zhang, Yanping Zhang, Shu Zhao, Raad Mohiaddin, Tom Wong, David Firmin, Guang Yang, Jennifer Keegan
Abstract Deep learning based task systems normally rely on a large amount of manually labeled training data, which is expensive to obtain and subject to operator variations. Moreover, it does not always hold that the manually labeled data and the unlabeled data come from the same distribution. In this paper, we alleviate these problems by proposing a discriminative consistent domain generation (DCDG) approach to achieve semi-supervised learning. The discriminative consistent domain is achieved by a double-sided domain adaptation, which aims to fuse the feature spaces of labeled data and unlabeled data. In this way, we can accommodate the distributional differences between labeled data and unlabeled data. In order to keep the discriminativeness of the generated consistent domain for the task learning, we apply an indirect learning scheme for the double-sided domain adaptation. Based on the generated discriminative consistent domain, we can use the unlabeled data to learn the task model along with the labeled data via consistent image generation. We demonstrate the performance of our proposed DCDG on late gadolinium enhancement cardiac MRI (LGE-CMRI) images acquired from patients with atrial fibrillation in two clinical centers for the segmentation of the left atrium anatomy (LA) and proximal pulmonary veins (PVs). The experiments show that our semi-supervised approach achieves compelling segmentation results, which demonstrates the robustness of DCDG for semi-supervised learning using unlabeled data along with labeled data acquired from a single center or multicenter studies.
Tasks Domain Adaptation, Image Generation
Published 2019-07-24
URL https://arxiv.org/abs/1907.10267v1
PDF https://arxiv.org/pdf/1907.10267v1.pdf
PWC https://paperswithcode.com/paper/discriminative-consistent-domain-generation
Repo
Framework

Improving Randomized Learning of Feedforward Neural Networks by Appropriate Generation of Random Parameters

Title Improving Randomized Learning of Feedforward Neural Networks by Appropriate Generation of Random Parameters
Authors Grzegorz Dudek
Abstract In this work, a method of random parameter generation for randomized learning of a single-hidden-layer feedforward neural network is proposed. The method first randomly selects the slope angles of the hidden neurons’ activation functions from an interval adjusted to the target function, then randomly rotates the activation functions, and finally distributes them across the input space. For complex target functions the proposed method gives better results than the approach commonly used in practice, where the random parameters are selected from a fixed interval. This is because it introduces the steepest fragments of the activation functions into the input hypercube, avoiding their saturation fragments.
Tasks
Published 2019-08-15
URL https://arxiv.org/abs/1908.05542v1
PDF https://arxiv.org/pdf/1908.05542v1.pdf
PWC https://paperswithcode.com/paper/improving-randomized-learning-of-feedforward
Repo
Framework
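
The key step is drawing each hidden sigmoid's slope angle from an interval matched to the target function and placing its steepest fragment inside the input domain, rather than drawing weights uniformly from a fixed box. A one-dimensional numpy sketch of that construction follows, with the usual randomized-learning step of fitting only the linear output layer by least squares; the angle interval and the target function are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hidden_layer(n_hidden, angle_range=(np.pi / 3, 0.98 * np.pi / 2), x_range=(0.0, 1.0)):
    """Draw hidden sigmoids by (i) sampling slope angles from an interval
    (steep angles for a complex target) and (ii) placing each sigmoid's steepest
    point at a random location inside the input range -- instead of sampling
    weights and biases uniformly from a fixed box. 1-D illustration only."""
    angles = rng.uniform(*angle_range, size=n_hidden)
    weights = np.tan(angles) * rng.choice([-1, 1], size=n_hidden)   # slope from angle, random sign
    centers = rng.uniform(*x_range, size=n_hidden)
    biases = -weights * centers          # sigmoid(w*x + b) is steepest at x = center
    return weights, biases

def fit_output_layer(x, y, weights, biases):
    """Randomized learning: hidden parameters stay fixed; only the linear
    output layer is fit, here by least squares."""
    H = 1.0 / (1.0 + np.exp(-(np.outer(x, weights) + biases)))
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return H, beta

x = np.linspace(0, 1, 200)
y = np.sin(20 * x) * np.exp(-x)                       # a fairly complex 1-D target
w, b = random_hidden_layer(n_hidden=60)
H, beta = fit_output_layer(x, y, w, b)
print("train RMSE:", round(float(np.sqrt(np.mean((H @ beta - y) ** 2))), 4))
```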

Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning

Title Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning
Authors Han-Jia Ye, Hexiang Hu, De-Chuan Zhan, Fei Sha
Abstract Object recognition in the real world requires handling long-tailed or even open-ended data. An ideal visual system needs to reliably recognize the populated visual concepts while efficiently learning about emerging new categories from a few training instances. Class-balanced many-shot learning and few-shot learning each tackle one side of this problem, via either learning strong classifiers for populated categories or learning to learn few-shot classifiers for the tail classes. In this paper, we investigate the problem of generalized few-shot learning (GFSL) – a model during deployment is required to not only learn about “tail” categories with few shots but simultaneously classify the “head” and “tail” categories. We propose ClAssifier SynThesis LEarning (CASTLE), a learning framework that learns how to synthesize calibrated few-shot classifiers in addition to the multi-class classifiers of “head” classes with a shared neural dictionary, shedding light upon inductive GFSL. Furthermore, we propose an adaptive version of CASTLE (ACASTLE) that adapts the “head” classifiers conditioned on the incoming “tail” training examples, yielding a framework that allows effective backward knowledge transfer. As a consequence, ACASTLE can effectively handle generalized few-shot learning with classes from heterogeneous domains. CASTLE and ACASTLE demonstrate superior performance to existing GFSL algorithms and strong baselines on the MiniImageNet and TieredImageNet datasets. More interestingly, they outperform previous state-of-the-art methods when evaluated on standard few-shot learning.
Tasks Few-Shot Learning, Object Recognition, Transfer Learning
Published 2019-06-07
URL https://arxiv.org/abs/1906.02944v3
PDF https://arxiv.org/pdf/1906.02944v3.pdf
PWC https://paperswithcode.com/paper/learning-classifier-synthesis-for-generalized
Repo
Framework
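
As described above, CASTLE synthesizes a classifier for a "tail" class from a shared neural dictionary. One plausible reading, sketched below, is that the class prototype attends over dictionary keys and the attention-weighted values compose the classifier weight; the dimensions, the softmax-attention composition, and the cosine scoring are illustrative assumptions rather than the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_atoms = 64, 32
dict_keys = rng.normal(size=(n_atoms, d))      # shared neural dictionary (learned in practice)
dict_values = rng.normal(size=(n_atoms, d))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def synthesize_classifier(support_embeddings):
    """Synthesize a linear classifier for a 'tail' class from a few support
    embeddings: average them into a prototype, attend over the dictionary keys,
    and combine the prototype with the attended dictionary values."""
    prototype = support_embeddings.mean(axis=0)
    attn = softmax(dict_keys @ prototype / np.sqrt(d))
    w = prototype + attn @ dict_values             # synthesized classifier weight
    return w / np.linalg.norm(w)

# 5-shot support set for a new class; score a query by cosine similarity.
support = rng.normal(loc=0.5, size=(5, d))
w_new = synthesize_classifier(support)
query = rng.normal(loc=0.5, size=d)
print("score for new class:", round(float(w_new @ query / np.linalg.norm(query)), 3))
```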

GREASE: A Generative Model for Relevance Search over Knowledge Graphs

Title GREASE: A Generative Model for Relevance Search over Knowledge Graphs
Authors Tianshuo Zhou, Ziyang Li, Gong Cheng, Jun Wang, Yu’Ang Wei
Abstract Relevance search is to find top-ranked entities in a knowledge graph (KG) that are relevant to a query entity. Relevance is ambiguous, particularly over a schema-rich KG like DBpedia which supports a wide range of different semantics of relevance based on numerous types of relations and attributes. As users may lack the expertise to formalize the desired semantics, supervised methods have emerged to learn the hidden user-defined relevance from user-provided examples. Along this line, in this paper we propose a novel generative model over KGs for relevance search, named GREASE. The model applies to meta-path based relevance where a meta-path characterizes a particular type of semantics of relating the query entity to answer entities. It is also extended to support properties that constrain answer entities. Extensive experiments on two large-scale KGs demonstrate that GREASE has advanced the state of the art in effectiveness, expressiveness, and efficiency.
Tasks Knowledge Graphs
Published 2019-10-11
URL https://arxiv.org/abs/1910.04927v1
PDF https://arxiv.org/pdf/1910.04927v1.pdf
PWC https://paperswithcode.com/paper/grease-a-generative-model-for-relevance
Repo
Framework
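
GREASE's notion of relevance is meta-path based: a meta-path (a sequence of relation types) captures one sense of relevance, and answer entities are those reachable from the query entity along instances of that meta-path. The snippet below sketches meta-path matching in a toy KG; the generative model and ranking in GREASE are not reproduced, and counting path instances as a relevance score is an assumption made for illustration.

```python
from collections import Counter

# Toy KG as (head, relation, tail) triples.
triples = [
    ("TimBL", "employer", "W3C"),
    ("TimBL", "field", "ComputerScience"),
    ("Cerf", "field", "ComputerScience"),
    ("Knuth", "field", "ComputerScience"),
    ("Cerf", "award", "TuringAward"),
    ("Knuth", "award", "TuringAward"),
]
out_edges = {}
for h, r, t in triples:
    out_edges.setdefault((h, r), []).append(t)
    out_edges.setdefault((t, "~" + r), []).append(h)   # "~r" marks the inverse relation

def meta_path_matches(query_entity, meta_path):
    """Return entities reachable from the query entity along the meta-path,
    counted by the number of distinct path instances (a crude relevance score)."""
    frontier = Counter({query_entity: 1})
    for rel in meta_path:
        nxt = Counter()
        for ent, cnt in frontier.items():
            for t in out_edges.get((ent, rel), []):
                nxt[t] += cnt
        frontier = nxt
    frontier.pop(query_entity, None)   # drop the trivial answer
    return frontier

# "Entities sharing TimBL's research field": follow field, then its inverse.
print(meta_path_matches("TimBL", ["field", "~field"]))
# Counter({'Cerf': 1, 'Knuth': 1})
```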

FCNHSMRA_HRS: Improve the performance of the movie hybrid recommender system using resource allocation approach

Title FCNHSMRA_HRS: Improve the performance of the movie hybrid recommender system using resource allocation approach
Authors Mostafa Khalaji, Nilufar Mohammadnejad
Abstract Recommender systems are systems that are capable of offering the most suitable services and products to users. Through specific methods and techniques, recommender systems try to identify the most appropriate items, such as types of information and goods, and propose those closest to the user’s tastes. Collaborative filtering, which offers the active user suggestions based on the ratings of a set of users, is one of the simplest, most comprehensible, and most successful models for finding like-minded people in recommender systems. In this model, as the number of users and movies increases, the system suffers from scalability problems. On the other hand, it is important to improve the performance of the system when little rating information is available. In this paper, a movie hybrid recommender system based on the FNHSM_HRS structure using a resource allocation approach, called FCNHSMRA_HRS, is presented. The FNHSM_HRS structure is based on a heuristic similarity measure (NHSM) along with fuzzy clustering. Using the fuzzy clustering method in the proposed system alleviates the scalability problem and increases the accuracy of system suggestions. The proposed system is based on collaborative filtering and, by using the heuristic similarity measure and applying the resource allocation approach, improves the performance, accuracy, and precision of the system. Experimental results using MAE, Accuracy, Precision, and Recall metrics on the MovieLens dataset show that the performance of the system is improved and that the accuracy of recommendations is higher than that of FNHSM_HRS and of collaborative filtering methods that use other similarity measures.
Tasks Recommendation Systems
Published 2019-08-13
URL https://arxiv.org/abs/1908.05608v1
PDF https://arxiv.org/pdf/1908.05608v1.pdf
PWC https://paperswithcode.com/paper/fcnhsmra_hrs-improve-the-performance-of-the
Repo
Framework
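
The abstract combines a heuristic similarity measure with a resource allocation approach inside memory-based collaborative filtering. One illustrative reading, sketched below, weights each co-rated item by a resource-allocation term (the inverse of the item's popularity) scaled by rating agreement; this shows the ingredients only and is not the paper's NHSM measure, fuzzy clustering, or exact combination.

```python
import numpy as np

# Toy user-item rating matrix (0 = unrated, ratings on a 1-5 scale).
R = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 0, 0, 1],
    [1, 0, 5, 4, 0],
    [0, 1, 4, 5, 5],
], dtype=float)

item_popularity = (R > 0).sum(axis=0)        # how many users rated each item

def ra_weighted_similarity(u, v):
    """User-user similarity where each co-rated item contributes a
    resource-allocation weight 1/popularity(item), scaled by how closely
    the two ratings agree. Illustrative combination only."""
    co = (R[u] > 0) & (R[v] > 0)
    if not co.any():
        return 0.0
    agreement = 1.0 - np.abs(R[u, co] - R[v, co]) / 4.0
    return float(np.sum(agreement / item_popularity[co]))

sims = [[ra_weighted_similarity(u, v) if u != v else 0.0 for v in range(4)] for u in range(4)]
print(np.round(sims, 2))    # pairs with overlapping tastes (users 0-1 and 2-3) score highest
```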

Towards Zero-resource Cross-lingual Entity Linking

Title Towards Zero-resource Cross-lingual Entity Linking
Authors Shuyan Zhou, Shruti Rijhwani, Graham Neubig
Abstract Cross-lingual entity linking (XEL) grounds named entities in a source language to an English Knowledge Base (KB), such as Wikipedia. XEL is challenging for most languages because of the limited availability of requisite resources. However, much previous work on XEL has been on simulated settings that actually use significant resources (e.g. source language Wikipedia, bilingual entity maps, multilingual embeddings) that are unavailable in truly low-resource languages. In this work, we first examine the effect of these resource assumptions and quantify how much the availability of these resources affects the overall quality of existing XEL systems. Next, we propose three improvements to both entity candidate generation and disambiguation that make better use of the limited data we do have in resource-scarce scenarios. With experiments on four extremely low-resource languages, we show that our model results in gains of 6-23% in end-to-end linking accuracy.
Tasks Cross-Lingual Entity Linking, Entity Linking
Published 2019-09-29
URL https://arxiv.org/abs/1909.13180v2
PDF https://arxiv.org/pdf/1909.13180v2.pdf
PWC https://paperswithcode.com/paper/towards-zero-resource-cross-lingual-entity
Repo
Framework
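
A zero-resource starting point for the entity candidate generation step discussed above is fuzzy string matching of the mention against English KB entity names and aliases. The tiny sketch below illustrates that baseline with character-level sequence matching; the KB entries are made up, and this is not the paper's proposed candidate generation or disambiguation model.

```python
from difflib import SequenceMatcher

# English KB entries with a few aliases (millions of Wikipedia titles in practice).
kb = {
    "Barack Obama": ["Obama", "Barack Hussein Obama"],
    "Boracay": ["Boracay Island"],
    "Ohio": ["OH"],
}

def candidates(mention, k=2):
    """Rank KB entities by the best fuzzy-string match between the mention
    and any of the entity's names or aliases (a zero-resource baseline)."""
    scored = []
    for entity, aliases in kb.items():
        score = max(SequenceMatcher(None, mention.lower(), a.lower()).ratio()
                    for a in [entity] + aliases)
        scored.append((score, entity))
    return sorted(scored, reverse=True)[:k]

# A romanized low-resource-language mention of "Obama".
print(candidates("Obaama"))
```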