Paper Group ANR 305
Moment Matching for Multi-Source Domain Adaptation
Title | Moment Matching for Multi-Source Domain Adaptation |
Authors | Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, Bo Wang |
Abstract | Conventional unsupervised domain adaptation (UDA) assumes that training data are sampled from a single domain. This neglects the more practical scenario where training data are collected from multiple sources, requiring multi-source domain adaptation. We make three major contributions towards addressing this problem. First, we collect and annotate by far the largest UDA dataset, called DomainNet, which contains six domains and about 0.6 million images distributed among 345 categories, addressing the gap in data availability for multi-source UDA research. Second, we propose a new deep learning approach, Moment Matching for Multi-Source Domain Adaptation (M3SDA), which aims to transfer knowledge learned from multiple labeled source domains to an unlabeled target domain by dynamically aligning moments of their feature distributions. Third, we provide new theoretical insights specifically for moment matching approaches in both single and multiple source domain adaptation. Extensive experiments are conducted to demonstrate the power of our new dataset in benchmarking state-of-the-art multi-source domain adaptation methods, as well as the advantage of our proposed model. Dataset and code are available at http://ai.bu.edu/M3SDA/. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-12-04 |
URL | https://arxiv.org/abs/1812.01754v4 |
https://arxiv.org/pdf/1812.01754v4.pdf | |
PWC | https://paperswithcode.com/paper/moment-matching-for-multi-source-domain |
Repo | |
Framework | |
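The abstract above describes aligning moments of feature distributions across several source domains and one target domain. The following is a minimal sketch of one way such a moment distance could be computed, assuming first- and second-order raw moments and Euclidean gaps averaged over source-target and source-source pairs. The function names `moment_gap` and `moment_distance` are illustrative only, and this is not presented as the authors' exact loss or training procedure.

```python
import numpy as np
from itertools import combinations


def moment_gap(a, b, order):
    """Euclidean distance between the order-th raw moments of two feature batches.

    a, b: arrays of shape (num_samples, feature_dim).
    """
    return float(np.linalg.norm((a ** order).mean(axis=0) - (b ** order).mean(axis=0)))


def moment_distance(sources, target, max_order=2):
    """Average source-target gap plus average source-source gap, summed over moment orders.

    Assumption: moments up to max_order=2 are matched; other choices are possible.
    """
    pairs = list(combinations(range(len(sources)), 2))
    total = 0.0
    for k in range(1, max_order + 1):
        # Pull every source domain towards the target domain.
        total += sum(moment_gap(s, target, k) for s in sources) / len(sources)
        # Pull the source domains towards each other.
        if pairs:
            total += sum(moment_gap(sources[i], sources[j], k) for i, j in pairs) / len(pairs)
    return total


# Toy usage with synthetic "features" standing in for a feature extractor's output.
rng = np.random.default_rng(0)
source_feats = [rng.normal(loc=m, size=(128, 16)) for m in (0.0, 0.5, 1.0)]
target_feats = rng.normal(loc=0.2, size=(128, 16))
print(moment_distance(source_feats, target_feats))
```

In a training loop, a term of this kind would typically be added to the supervised loss on the labeled source domains, so that the feature extractor is encouraged to produce features whose low-order statistics agree across domains.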
On Oracle-Efficient PAC RL with Rich Observations
Title | On Oracle-Efficient PAC RL with Rich Observations |
Authors | Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire |
Abstract | We study the computational tractability of PAC reinforcement learning with rich observations. We present new provably sample-efficient algorithms for environments with deterministic hidden state dynamics and stochastic rich observations. These methods operate in an oracle model of computation – accessing policy and value function classes exclusively through standard optimization primitives – and therefore represent computationally efficient alternatives to prior algorithms that require enumeration. With stochastic hidden state dynamics, we prove that the only known sample-efficient algorithm, OLIVE, cannot be implemented in the oracle model. We also present several examples that illustrate fundamental challenges of tractable PAC reinforcement learning in such general settings. |
Tasks | |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00606v4 |
http://arxiv.org/pdf/1803.00606v4.pdf | |
PWC | https://paperswithcode.com/paper/on-oracle-efficient-pac-rl-with-rich |
Repo | |
Framework | |
Deep Neural Maps
Title | Deep Neural Maps |
Authors | Mehran Pesteie, Purang Abolmaesumi, Robert Rohling |
Abstract | We introduce a new unsupervised representation learning and visualization method using deep convolutional networks and self-organizing maps, called Deep Neural Maps (DNM). DNM jointly learns an embedding of the input data and a mapping from the embedding space to a two-dimensional lattice. We compare visualizations of DNM with those of t-SNE and LLE on the MNIST and COIL-20 data sets. Our experiments show that the DNM can learn efficient representations of the input data, which reflect characteristics of each class. This is shown via back-projecting the neurons of the map on the data space. |
Tasks | Representation Learning, Unsupervised Representation Learning |
Published | 2018-10-16 |
URL | http://arxiv.org/abs/1810.07291v1 |
http://arxiv.org/pdf/1810.07291v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-maps |
Repo | |
Framework | |
Memory-Augmented Neural Networks for Predictive Process Analytics
Title | Memory-Augmented Neural Networks for Predictive Process Analytics |
Authors | Asjad Khan, Hung Le, Kien Do, Truyen Tran, Aditya Ghose, Hoa Dam, Renuka Sindhgatta |
Abstract | Process analytics involves a sophisticated layer of data analytics built over the traditional notion of process mining. The flexible execution of business process instances involves multiple critical decisions, including what task to perform next and what resources to allocate to a task. In this paper, we explore the application of deep learning techniques for solving various process analytics related problems. Based on recent advances in the field, we specifically look at memory-augmented neural networks (MANNs) and adapt the latest model to date, namely the Differentiable Neural Computer. We introduce two modifications to account for a variety of tasks in predictive process analytics: (i) separating the encoding phase and decoding phase, resulting in dual controllers, one for each phase; (ii) implementing a write-protected policy for the memory during the decoding phase. We demonstrate the feasibility and usefulness of our approach by solving a number of common process analytics tasks such as next activity prediction, time to completion and suffix prediction. We also introduce the notion of MANN-based process analytics recommendation machinery that, once deployed, can serve as an effective business process recommendation engine enabling organizations to answer various prescriptive process analytics related questions. Using real-world datasets, we benchmark our results against those obtained from state-of-the-art methods. We show that MANN-based process analytics methods can achieve state-of-the-art performance and have a lot of value to offer for enterprise-specific process analytics applications. |
Tasks | Activity Prediction |
Published | 2018-02-03 |
URL | http://arxiv.org/abs/1802.00938v1 |
http://arxiv.org/pdf/1802.00938v1.pdf | |
PWC | https://paperswithcode.com/paper/memory-augmented-neural-networks-for |
Repo | |
Framework | |
Field-Programmable Deep Neural Network (DNN) Learning and Inference accelerator: a concept
Title | Field-Programmable Deep Neural Network (DNN) Learning and Inference accelerator: a concept |
Authors | Luiz M Franca-Neto |
Abstract | An accelerator is a specialized integrated circuit designed to perform specific computations faster than if those were performed by a CPU or GPU. A Field-Programmable DNN learning and inference accelerator (FProg-DNN) using hybrid systolic and non-systolic techniques, distributed information-control and a deep pipelined structure is proposed, and its microarchitecture and operation are presented here. Reconfigurability accommodates diverse DNN designs and allows different numbers of workers to be assigned to different layers as a function of the relative difference in computational load among layers. The computational delay per layer is made roughly the same along the pipelined accelerator structure. VGG-16 and the recently proposed Inception Modules are used to show the flexibility of the FProg-DNN reconfigurability. Special structures were also added for a combination of convolution layer, map coincidence and feedback for state-of-the-art learning with a small set of examples, which is the focus of a companion paper by the author (Franca-Neto, 2018). The accelerator described is able to reconfigure from (1) allocating all of a DNN's computations to a single worker, at one extreme of sub-optimal performance, to (2) optimally allocating workers per layer according to the computational load in each DNN layer to be realized. Due to the pipelined architecture, more than 50x speedup is achieved relative to GPUs or TPUs. This speed-up is a consequence of hiding the delay in transporting activation outputs from one layer to the next in a DNN behind the computations in the receiving layer. This FProg-DNN concept has been simulated and validated at the behavioral-functional level. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04899v4 |
http://arxiv.org/pdf/1802.04899v4.pdf | |
PWC | https://paperswithcode.com/paper/field-programmable-deep-neural-network-dnn |
Repo | |
Framework | |
Multilayered Model of Speech
Title | Multilayered Model of Speech |
Authors | Andrey Chistyakov |
Abstract | Human speech is the most important part of General Artificial Intelligence and the subject of much research. The hypothesis proposed in this article provides an explanation of the difficulties that modern science faces in the field of human brain simulation. The hypothesis is based on the author’s conviction that the brain of any given person has a different ability to process and store information. Therefore, the approaches that are currently used to create General Artificial Intelligence have to be altered. |
Tasks | |
Published | 2018-01-08 |
URL | https://arxiv.org/abs/1801.04170v2 |
https://arxiv.org/pdf/1801.04170v2.pdf | |
PWC | https://paperswithcode.com/paper/multilayered-model-of-speech |
Repo | |
Framework | |
Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy
Title | Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy |
Authors | Majid Jahani, Xi He, Chenxin Ma, Aryan Mokhtari, Dheevatsa Mudigere, Alejandro Ribeiro, Martin Takáč |
Abstract | In this paper, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE) method in which the sample size is gradually increased to quickly obtain a solution whose empirical loss is under satisfactory statistical accuracy. Our proposed method is multistage: the solution of one stage serves as a warm start for the next stage, which contains more samples (including the samples in the previous stage). The proposed multistage algorithm reduces the number of passes over data to achieve the statistical accuracy of the full training set. Moreover, our algorithm is by nature easy to distribute and shares the strong scaling property, indicating that acceleration is always expected when using more computing nodes. Various iteration complexity results regarding descent direction computation, communication efficiency and stopping criteria are analyzed under the convex setting. Our numerical results illustrate that the proposed method outperforms other comparable methods for solving learning problems including neural networks. |
Tasks | |
Published | 2018-10-26 |
URL | https://arxiv.org/abs/1810.11507v2 |
https://arxiv.org/pdf/1810.11507v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-distributed-hessian-free-algorithm |
Repo | |
Framework | |
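The multistage idea in the abstract above (grow the sample, warm-start each stage from the previous solution, and compute the descent direction without forming the Hessian) can be illustrated on a small problem. The sketch below is a single-machine toy under stated assumptions: ridge-regularized least squares as the loss, sample size doubling per stage, and one truncated Newton-CG step per stage. The names `accumulating_newton_cg`, `n0` and `cg_iters` are hypothetical, and the paper's distributed implementation and stopping criteria are not reproduced here.

```python
import numpy as np


def grad(w, X, y, lam):
    """Gradient of 0.5 * mean((Xw - y)^2) + 0.5 * lam * ||w||^2."""
    return X.T @ (X @ w - y) / len(y) + lam * w


def hess_vec(v, X, lam):
    """Hessian-vector product computed without materializing the Hessian."""
    return X.T @ (X @ v) / X.shape[0] + lam * v


def cg_solve(hvp, b, iters):
    """Approximately solve H x = b with conjugate gradient, using only products H v."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        if rs < 1e-20:  # residual is already negligible
            break
        Hp = hvp(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def accumulating_newton_cg(X, y, lam=1e-3, n0=32, cg_iters=5):
    """Run one Newton-CG step per stage on a nested, doubling prefix of the data."""
    n, d = X.shape
    w = np.zeros(d)
    m = min(n0, n)
    while True:
        Xm, ym = X[:m], y[:m]  # current stage contains all earlier samples
        g = grad(w, Xm, ym, lam)
        w = w + cg_solve(lambda v: hess_vec(v, Xm, lam), -g, cg_iters)  # warm start from w
        if m == n:
            return w
        m = min(2 * m, n)


# Toy usage on noise-free synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(512, 5))
y = X @ np.arange(1.0, 6.0)
print(accumulating_newton_cg(X, y))
```

Because early stages touch only a small prefix of the data, most of the work on the full data set is confined to the final stage, which starts from an already good iterate.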
Understanding Meanings in Multilingual Customer Feedback
Title | Understanding Meanings in Multilingual Customer Feedback |
Authors | Chao-Hong Liu, Declan Groves, Akira Hayakawa, Alberto Poncelas, Qun Liu |
Abstract | Understanding and being able to react to customer feedback is the most fundamental task in providing good customer service. However, there are two major obstacles for international companies to automatically detect the meaning of customer feedback in a global multilingual environment. Firstly, there is no widely acknowledged categorisation (classes) of meaning for customer feedback. Secondly, the applicability of one meaning categorisation, if it exists, to customer feedback in multiple languages is questionable. In this paper, we extracted representative real-world samples of customer feedback from Microsoft Office customers in multiple languages (English, Spanish and Japanese), and concluded a five-class categorisation (comment, request, bug, complaint and meaningless) for meaning classification that could be used across languages in the realm of customer feedback analysis. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01694v1 |
http://arxiv.org/pdf/1806.01694v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-meanings-in-multilingual |
Repo | |
Framework | |
An enhanced computational feature selection method for medical synonym identification via bilingualism and multi-corpus training
Title | An enhanced computational feature selection method for medical synonym identification via bilingualism and multi-corpus training |
Authors | K. Lei, S. Si, D. Wen, Y. Shen |
Abstract | Medical synonym identification has been an important part of medical natural language processing (NLP). However, in the field of Chinese medical synonym identification, there are problems like low precision and low recall rate. To solve the problem, in this paper, we propose a method for identifying Chinese medical synonyms. We first selected 13 features including Chinese and English features. Then we studied the synonym identification results of each feature alone and different combinations of the features. Through the comparison among identification results, we present an optimal combination of features for Chinese medical synonym identification. Experiments show that our selected features have achieved 97.37% precision rate, 96.00% recall rate and 97.33% F1 score. |
Tasks | Feature Selection |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01879v1 |
http://arxiv.org/pdf/1812.01879v1.pdf | |
PWC | https://paperswithcode.com/paper/an-enhanced-computational-feature-selection |
Repo | |
Framework | |
Language-Independent Representor for Neural Machine Translation
Title | Language-Independent Representor for Neural Machine Translation |
Authors | Long Zhou, Yuchen Liu, Jiajun Zhang, Chengqing Zong, Guoping Huang |
Abstract | Current Neural Machine Translation (NMT) employs a language-specific encoder to represent the source sentence and adopts a language-specific decoder to generate the target translation. This language-dependent design leads to large-scale network parameters and leaves the duality of the parallel data underutilized. To address the problem, we propose in this paper a language-independent representor to replace the encoder and decoder by using weight sharing. This shared representor can not only reduce a large portion of the network parameters, but also allow us to fully explore the language duality by jointly training source-to-target, target-to-source, left-to-right and right-to-left translations within a multi-task learning framework. Experiments show that our proposed framework can obtain significant improvements over conventional NMT models on resource-rich and low-resource translation tasks with only a quarter of the parameters. |
Tasks | Machine Translation, Multi-Task Learning |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00258v1 |
http://arxiv.org/pdf/1811.00258v1.pdf | |
PWC | https://paperswithcode.com/paper/language-independent-representor-for-neural |
Repo | |
Framework | |
Variational Composite Autoencoders
Title | Variational Composite Autoencoders |
Authors | Jiangchao Yao, Ivor Tsang, Ya Zhang |
Abstract | Learning in a latent variable model is challenging in the presence of complex data structure or intractable latent variables. Previous variational autoencoders can be less effective due to their straightforward encoder-decoder structure. In this paper, we propose a variational composite autoencoder to sidestep this issue by amortizing on top of the hierarchical latent variable model. The experimental results confirm the advantages of our model. |
Tasks | |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04435v1 |
http://arxiv.org/pdf/1804.04435v1.pdf | |
PWC | https://paperswithcode.com/paper/variational-composite-autoencoders |
Repo | |
Framework | |
Approximation by filter functions
Title | Approximation by filter functions |
Authors | Ivo Düntsch, Günther Gediga, Hui Wang |
Abstract | In this exploratory article, we draw attention to the common formal ground among various estimators such as the belief functions of evidence theory and their relatives, approximation quality of rough set theory, and contextual probability. The unifying concept will be a general filter function composed of a basic probability and a weighting which varies according to the problem at hand. To compare the various filter functions we conclude with a simulation study with an example from the area of item response theory. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07685v1 |
http://arxiv.org/pdf/1806.07685v1.pdf | |
PWC | https://paperswithcode.com/paper/approximation-by-filter-functions |
Repo | |
Framework | |
Explaining and Generalizing Back-Translation through Wake-Sleep
Title | Explaining and Generalizing Back-Translation through Wake-Sleep |
Authors | Ryan Cotterell, Julia Kreutzer |
Abstract | Back-translation has become a commonly employed heuristic for semi-supervised neural machine translation. The technique is both straightforward to apply and has led to state-of-the-art results. In this work, we offer a principled interpretation of back-translation as approximate inference in a generative model of bitext and show how the standard implementation of back-translation corresponds to a single iteration of the wake-sleep algorithm in our proposed model. Moreover, this interpretation suggests a natural iterative generalization, which we demonstrate leads to further improvement of up to 1.6 BLEU. |
Tasks | Machine Translation |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04402v1 |
http://arxiv.org/pdf/1806.04402v1.pdf | |
PWC | https://paperswithcode.com/paper/explaining-and-generalizing-back-translation |
Repo | |
Framework | |
Measuring Conflict in a Multi-Source Environment as a Normal Measure
Title | Measuring Conflict in a Multi-Source Environment as a Normal Measure |
Authors | Pan Wei, John E. Ball, Derek T. Anderson, Archit Harsh, Christopher Archibald |
Abstract | In a multi-source environment, each source has its own credibility. If there is no external knowledge about credibility then we can use the information provided by the sources to assess their credibility. In this paper, we propose a way to measure conflict in a multi-source environment as a normal measure. We examine our algorithm using three simulated examples of increasing conflict and one experimental example. The results demonstrate that the proposed measure can represent conflict in a meaningful way similar to what a human might expect and from it we can identify conflict within our sources. |
Tasks | |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04556v1 |
http://arxiv.org/pdf/1803.04556v1.pdf | |
PWC | https://paperswithcode.com/paper/measuring-conflict-in-a-multi-source |
Repo | |
Framework | |
Effective Exploration for Deep Reinforcement Learning via Bootstrapped Q-Ensembles under Tsallis Entropy Regularization
Title | Effective Exploration for Deep Reinforcement Learning via Bootstrapped Q-Ensembles under Tsallis Entropy Regularization |
Authors | Gang Chen, Yiming Peng, Mengjie Zhang |
Abstract | Recently, deep reinforcement learning (DRL) has achieved outstanding success in solving many difficult and large-scale RL problems. However, the high sample cost required for effective learning often makes DRL unaffordable in resource-limited applications. With the aim of improving sample efficiency and learning performance, we will develop a new DRL algorithm in this paper that seamlessly integrates entropy-induced and bootstrap-induced techniques for efficient and deep exploration of the learning environment. Specifically, a general form of Tsallis entropy regularizer will be utilized to drive entropy-induced exploration based on efficient approximation of optimal action-selection policies. Different from many existing works that rely on action dithering strategies for exploration, our algorithm is efficient in exploring actions with clear exploration value. Meanwhile, by employing an ensemble of Q-networks under varied Tsallis entropy regularization, the diversity of the ensemble can be further enhanced to enable effective bootstrap-induced exploration. Experiments on Atari game playing tasks clearly demonstrate that our new algorithm can achieve more efficient and effective exploration for DRL, in comparison to recently proposed exploration methods including Bootstrapped Deep Q-Network and UCB Q-Ensemble. |
Tasks | |
Published | 2018-09-02 |
URL | http://arxiv.org/abs/1809.00403v2 |
http://arxiv.org/pdf/1809.00403v2.pdf | |
PWC | https://paperswithcode.com/paper/effective-exploration-for-deep-reinforcement |
Repo | |
Framework | |
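The abstract above relies on a Tsallis entropy regularizer whose strength of exploration depends on an entropic index. As a rough illustration, the sketch below evaluates the standard Tsallis entropy of a discrete action distribution, assuming the common form S_q(p) = (1 - sum_i p_i^q) / (q - 1) with q > 0, which reduces to Shannon entropy as q approaches 1. The paper may use a different scaling or parameterization, and the function name `tsallis_entropy` is illustrative.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy of a discrete distribution for entropic index q > 0.

    Assumed form: (1 - sum(p_i ** q)) / (q - 1); the q -> 1 limit is Shannon entropy.
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


# Compare a uniform and a peaked policy over four actions under several indices,
# loosely mirroring the idea of ensemble members with varied regularization.
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.85, 0.05, 0.05, 0.05]
for q in (0.5, 1.0, 2.0):
    print(q, tsallis_entropy(uniform, q), tsallis_entropy(peaked, q))
```

For each index, the uniform policy scores higher than the peaked one, so adding such a term to the objective rewards policies that keep probability mass spread over several actions.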