Paper Group AWR 15
Practical Federated Gradient Boosting Decision Trees
Title | Practical Federated Gradient Boosting Decision Trees |
Authors | Qinbin Li, Zeyi Wen, Bingsheng He |
Abstract | Gradient Boosting Decision Trees (GBDTs) have become very successful in recent years, with many awards in machine learning and data mining competitions. There have been several recent studies on how to train GBDTs in the federated learning setting. In this paper, we focus on horizontal federated learning, where data samples with the same features are distributed among multiple parties. However, existing studies are not efficient or effective enough for practical use. They suffer either from the inefficiency due to the usage of costly data transformations such as secret sharing and homomorphic encryption, or from the low model accuracy due to differential privacy designs. In this paper, we study a practical federated environment with relaxed privacy constraints. In this environment, a dishonest party might obtain some information about the other parties’ data, but it is still impossible for the dishonest party to derive the actual raw data of other parties. Specifically, each party boosts a number of trees by exploiting similarity information based on locality-sensitive hashing. We prove that our framework is secure without exposing the original record to other parties, while the computation overhead in the training process is kept low. Our experimental studies show that, compared with normal training with the local data of each party, our approach can significantly improve the predictive accuracy, and achieve comparable accuracy to the original GBDT with the data from all parties. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04206v2, https://arxiv.org/pdf/1911.04206v2.pdf |
PWC | https://paperswithcode.com/paper/practical-federated-gradient-boosting |
Repo | https://github.com/Xtra-Computing/PrivML |
Framework | none |
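The abstract above says each party exploits similarity information based on locality-sensitive hashing, but it does not say which hash family is used. The following is a minimal sketch of one common family (sign random projections), chosen here only for brevity; the function names, the shared seed, and the 16-bit signature length are all assumptions for illustration, not the authors' implementation.

```python
import random
from typing import List, Sequence, Tuple


def make_hyperplanes(dim: int, n_bits: int, seed: int) -> List[List[float]]:
    """Draw n_bits random Gaussian hyperplanes (assumed to be shared by all parties via the seed)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(x: Sequence[float], hyperplanes: List[List[float]]) -> Tuple[int, ...]:
    """One bit per hyperplane: 1 if x lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(w * xi for w, xi in zip(plane, x)) >= 0.0 else 0
        for plane in hyperplanes
    )


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of differing bits; a small distance suggests similar records."""
    return sum(u != v for u, v in zip(a, b))


# Hypothetical record with four features.
planes = make_hyperplanes(dim=4, n_bits=16, seed=0)
record = [0.5, -1.2, 3.0, 0.7]
signature = lsh_signature(record, planes)
```

In a sketch like this, parties would exchange signatures rather than raw feature vectors, and records whose signatures are close in Hamming distance would be treated as similar.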
OpenTapioca: Lightweight Entity Linking for Wikidata
Title | OpenTapioca: Lightweight Entity Linking for Wikidata |
Authors | Antonin Delpeuch |
Abstract | We propose a simple Named Entity Linking system that can be trained from Wikidata only. This demonstrates the strengths and weaknesses of this data source for this task and provides an easily reproducible baseline to compare other systems against. Our model is lightweight to train, to run and to keep synchronous with Wikidata in real time. |
Tasks | Entity Linking |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09131v1, http://arxiv.org/pdf/1904.09131v1.pdf |
PWC | https://paperswithcode.com/paper/opentapioca-lightweight-entity-linking-for |
Repo | https://github.com/wetneb/opentapioca |
Framework | none |
Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use
Title | Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use |
Authors | Janarthanan Rajendran, Jatin Ganhotra, Lazaros Polymenakos |
Abstract | Neural end-to-end goal-oriented dialog systems showed promise to reduce the workload of human agents for customer service, as well as reduce wait time for users. However, their inability to handle new user behavior at deployment has limited their usage in real world. In this work, we propose an end-to-end trainable method for neural goal-oriented dialog systems which handles new user behaviors at deployment by transferring the dialog to a human agent intelligently. The proposed method has three goals: 1) maximize user’s task success by transferring to human agents, 2) minimize the load on the human agents by transferring to them only when it is essential and 3) learn online from the human agent’s responses to reduce human agents load further. We evaluate our proposed method on a modified-bAbI dialog task that simulates the scenario of new user behaviors occurring at test time. Experimental results show that our proposed method is effective in achieving the desired goals. |
Tasks | Goal-Oriented Dialog |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07638v1, https://arxiv.org/pdf/1907.07638v1.pdf |
PWC | https://paperswithcode.com/paper/learning-end-to-end-goal-oriented-dialog-with-2 |
Repo | https://github.com/IBM/modified-bAbI-dialog-tasks |
Framework | none |
Skin Lesion Synthesis with Generative Adversarial Networks
Title | Skin Lesion Synthesis with Generative Adversarial Networks |
Authors | Alceu Bissoto, Fábio Perez, Eduardo Valle, Sandra Avila |
Abstract | Skin cancer is by far the most common type of cancer. Early detection is the key to increase the chances for successful treatment significantly. Currently, Deep Neural Networks are the state-of-the-art results on automated skin cancer classification. To push the results further, we need to address the lack of annotated data, which is expensive and require much effort from specialists. To bypass this problem, we propose using Generative Adversarial Networks for generating realistic synthetic skin lesion images. To the best of our knowledge, our results are the first to show visually-appealing synthetic images that comprise clinically-meaningful information. |
Tasks | Medical Image Generation, Skin Cancer Classification |
Published | 2019-02-08 |
URL | http://arxiv.org/abs/1902.03253v1, http://arxiv.org/pdf/1902.03253v1.pdf |
PWC | https://paperswithcode.com/paper/skin-lesion-synthesis-with-generative |
Repo | https://github.com/alceubissoto/gan-skin-lesion |
Framework | tf |
Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations
Title | Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations |
Authors | Xin Lv, Yuxian Gu, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu |
Abstract | Multi-hop knowledge graph (KG) reasoning is an effective and explainable method for predicting the target entity via reasoning paths in query answering (QA) task. Most previous methods assume that every relation in KGs has enough training triples, regardless of those few-shot relations which cannot provide sufficient triples for training robust reasoning models. In fact, the performance of existing multi-hop reasoning methods drops significantly on few-shot relations. In this paper, we propose a meta-based multi-hop reasoning method (Meta-KGR), which adopts meta-learning to learn effective meta parameters from high-frequency relations that could quickly adapt to few-shot relations. We evaluate Meta-KGR on two public datasets sampled from Freebase and NELL, and the experimental results show that Meta-KGR outperforms the current state-of-the-art methods in few-shot scenarios. Our code and datasets can be obtained from https://github.com/THU-KEG/MetaKGR. |
Tasks | Meta-Learning |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11513v1, https://arxiv.org/pdf/1908.11513v1.pdf |
PWC | https://paperswithcode.com/paper/adapting-meta-knowledge-graph-information-for |
Repo | https://github.com/THU-KEG/MetaKGR |
Framework | pytorch |
Better Word Embeddings by Disentangling Contextual n-Gram Information
Title | Better Word Embeddings by Disentangling Contextual n-Gram Information |
Authors | Prakhar Gupta, Matteo Pagliardini, Martin Jaggi |
Abstract | Pre-trained word vectors are ubiquitous in Natural Language Processing applications. In this paper, we show how training word embeddings jointly with bigram and even trigram embeddings, results in improved unigram embeddings. We claim that training word embeddings along with higher n-gram embeddings helps in the removal of the contextual information from the unigrams, resulting in better stand-alone word embeddings. We empirically show the validity of our hypothesis by outperforming other competing word representation models by a significant margin on a wide variety of tasks. We make our models publicly available. |
Tasks | Word Embeddings |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05033v1, http://arxiv.org/pdf/1904.05033v1.pdf |
PWC | https://paperswithcode.com/paper/better-word-embeddings-by-disentangling |
Repo | https://github.com/epfml/sent2vec |
Framework | none |
Self-Balanced Dropout
Title | Self-Balanced Dropout |
Authors | Shen Li, Chenhao Su, Renfen Hu, Zhengdong Lu |
Abstract | Dropout is known as an effective way to reduce overfitting via preventing co-adaptations of units. In this paper, we theoretically prove that the co-adaptation problem still exists after using dropout due to the correlations among the inputs. Based on the proof, we further propose Self-Balanced Dropout, a novel dropout method which uses a trainable variable to balance the influence of the input correlation on parameter update. We evaluate Self-Balanced Dropout on a range of tasks with both simple and complex models. The experimental results show that the mechanism can effectively solve the co-adaption problem to some extent and significantly improve the performance on all tasks. |
Tasks | |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.01968v1, https://arxiv.org/pdf/1908.01968v1.pdf |
PWC | https://paperswithcode.com/paper/self-balanced-dropout |
Repo | https://github.com/shenshen-hungry/Self-Balanced-Dropout |
Framework | tf |
The Replica Dataset: A Digital Replica of Indoor Spaces
Title | The Replica Dataset: A Digital Replica of Indoor Spaces |
Authors | Julian Straub, Thomas Whelan, Lingni Ma, Yufan Chen, Erik Wijmans, Simon Green, Jakob J. Engel, Raul Mur-Artal, Carl Ren, Shobhit Verma, Anton Clarkson, Mingfei Yan, Brian Budge, Yajie Yan, Xiaqing Pan, June Yon, Yuyang Zou, Kimberly Leon, Nigel Carter, Jesus Briales, Tyler Gillingham, Elias Mueggler, Luis Pesqueira, Manolis Savva, Dhruv Batra, Hauke M. Strasdat, Renzo De Nardi, Michael Goesele, Steven Lovegrove, Richard Newcombe |
Abstract | We introduce Replica, a dataset of 18 highly photo-realistic 3D indoor scene reconstructions at room and building scale. Each scene consists of a dense mesh, high-resolution high-dynamic-range (HDR) textures, per-primitive semantic class and instance information, and planar mirror and glass reflectors. The goal of Replica is to enable machine learning (ML) research that relies on visually, geometrically, and semantically realistic generative models of the world - for instance, egocentric computer vision, semantic segmentation in 2D and 3D, geometric inference, and the development of embodied agents (virtual robots) performing navigation, instruction following, and question answering. Due to the high level of realism of the renderings from Replica, there is hope that ML systems trained on Replica may transfer directly to real world image and video data. Together with the data, we are releasing a minimal C++ SDK as a starting point for working with the Replica dataset. In addition, Replica is 'Habitat-compatible', i.e. can be natively used with AI Habitat for training and testing embodied agents. |
Tasks | 3D Scene Reconstruction, Question Answering, Semantic Segmentation |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05797v1, https://arxiv.org/pdf/1906.05797v1.pdf |
PWC | https://paperswithcode.com/paper/the-replica-dataset-a-digital-replica-of |
Repo | https://github.com/matthewbegun/awesome-stars |
Framework | tf |
Towards Knowledge-Based Recommender Dialog System
Title | Towards Knowledge-Based Recommender Dialog System |
Authors | Qibin Chen, Junyang Lin, Yichang Zhang, Ming Ding, Yukuo Cen, Hongxia Yang, Jie Tang |
Abstract | In this paper, we propose a novel end-to-end framework called KBRD, which stands for Knowledge-Based Recommender Dialog System. It integrates the recommender system and the dialog generation system. The dialog system can enhance the performance of the recommendation system by introducing knowledge-grounded information about users’ preferences, and the recommender system can improve that of the dialog generation system by providing recommendation-aware vocabulary bias. Experimental results demonstrate that our proposed model has significant advantages over the baselines in both the evaluation of dialog generation and recommendation. A series of analyses show that the two systems can bring mutual benefits to each other, and the introduced knowledge contributes to both their performances. |
Tasks | Recommendation Systems |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05391v2, https://arxiv.org/pdf/1908.05391v2.pdf |
PWC | https://paperswithcode.com/paper/towards-knowledge-based-recommender-dialog |
Repo | https://github.com/THUDM/KBRD |
Framework | pytorch |
An Image Clustering Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids and MMD Distance
Title | An Image Clustering Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids and MMD Distance |
Authors | Qiuyu Zhu, Zhengyong Wang |
Abstract | In this paper, we propose a novel, effective and simpler end-to-end image clustering auto-encoder algorithm: ICAE. The algorithm uses PEDCC (Predefined Evenly-Distributed Class Centroids) as the clustering centers, which ensures the inter-class distance of latent features is maximal, and adds data distribution constraint, data augmentation constraint, auto-encoder reconstruction constraint and Sobel smooth constraint to improve the clustering performance. Specifically, we perform one-to-one data augmentation to learn the more effective features. The data and the augmented data are simultaneously input into the autoencoder to obtain latent features and the augmented latent features whose similarity are constrained by an augmentation loss. Then, making use of the maximum mean discrepancy distance (MMD), we combine the latent features and augmented latent features to make their distribution close to the PEDCC distribution (uniform distribution between classes, Dirac distribution within the class) to further learn clustering-oriented features. At the same time, the MSE of the original input image and reconstructed image is used as reconstruction constraint, and the Sobel smooth loss to build generalization constraint to improve the generalization ability. Finally, extensive experiments on three common datasets MNIST, Fashion-MNIST, COIL20 are conducted. The experimental results show that the algorithm has achieved the best clustering results so far. In addition, we can use the predefined PEDCC class centers, and the decoder to clearly generate the samples of each class. The code can be downloaded at https://github.com/zyWang-Power/Clustering! |
Tasks | Data Augmentation, Image Clustering |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.03905v2, https://arxiv.org/pdf/1906.03905v2.pdf |
PWC | https://paperswithcode.com/paper/an-image-clustering-auto-encoder-based-on |
Repo | https://github.com/Zhengyong-Wang/Clustering |
Framework | pytorch |
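The ICAE abstract above relies on the maximum mean discrepancy (MMD) to pull the latent feature distribution toward a target distribution. As a rough illustration of the quantity itself, here is a minimal sketch of the biased squared-MMD estimate between two small samples. The Gaussian kernel, the bandwidth of 1.0, and all function names are assumptions made for this example; the abstract does not state the kernel or estimator the authors use.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def gaussian_kernel(a: Vector, b: Vector, bandwidth: float = 1.0) -> float:
    """k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector], bandwidth: float) -> float:
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs: Sequence[Vector], ys: Sequence[Vector], bandwidth: float = 1.0) -> float:
    """Biased estimate: E[k(x, x')] + E[k(y, y')] - 2 * E[k(x, y)]."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Hypothetical 2-D samples: `far` is `near` shifted by (5, 5).
near = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
far = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
```

The estimate is zero when both samples are identical and grows as the two samples move apart, which is why it can serve as a distribution-matching loss.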
Targeted Example Generation for Compilation Errors
Title | Targeted Example Generation for Compilation Errors |
Authors | Umair Z. Ahmed, Renuka Sindhgatta, Nisheeth Srivastava, Amey Karkare |
Abstract | We present TEGCER, an automated feedback tool for novice programmers. TEGCER uses supervised classification to match compilation errors in new code submissions with relevant pre-existing errors, submitted by other students before. The dense neural network used to perform this classification task is trained on 15000+ error-repair code examples. The proposed model yields a test set classification Pred@3 accuracy of 97.7% across 212 error category labels. Using this model as its base, TEGCER presents students with the closest relevant examples of solutions for their specific error on demand. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00769v2, https://arxiv.org/pdf/1909.00769v2.pdf |
PWC | https://paperswithcode.com/paper/targeted-example-generation-for-compilation |
Repo | https://github.com/umairzahmed/tegcer |
Framework | none |
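The TEGCER abstract above reports a Pred@3 accuracy, which is commonly read as top-3 accuracy: a prediction counts as correct when the true label is among the three highest-scoring classes. Assuming that reading, the metric can be sketched as follows; the function name and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def pred_at_k(scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 3) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Toy example with four classes and three samples.
scores = [
    [0.10, 0.50, 0.25, 0.15],  # true class 1 is ranked first
    [0.40, 0.30, 0.20, 0.10],  # true class 3 is ranked last
    [0.25, 0.05, 0.30, 0.40],  # true class 0 is ranked third
]
labels = [1, 3, 0]
top3 = pred_at_k(scores, labels, k=3)
top1 = pred_at_k(scores, labels, k=1)
```

On this toy data two of the three samples count as hits for k=3, while only the first one does for k=1.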
Efficient Large-Scale Multi-Drone Delivery Using Transit Networks
Title | Efficient Large-Scale Multi-Drone Delivery Using Transit Networks |
Authors | Shushman Choudhury, Kiril Solovey, Mykel J. Kochenderfer, Marco Pavone |
Abstract | We consider the problem of controlling a large fleet of drones to deliver packages simultaneously across broad urban areas. To conserve energy, drones hop between public transit vehicles (e.g., buses and trams). We design a comprehensive algorithmic framework that strives to minimize the maximum time to complete any delivery. We address the multifaceted complexity of the problem through a two-layer approach. First, the upper layer assigns drones to package delivery sequences with a near-optimal polynomial-time task allocation algorithm. Then, the lower layer executes the allocation by periodically routing the fleet over the transit network while employing efficient bounded-suboptimal multi-agent pathfinding techniques tailored to our setting. Experiments demonstrate the efficiency of our approach on settings with up to 200 drones, 5000 packages, and transit networks with up to 8000 stops in San Francisco and Washington DC. Our results show that the framework computes solutions within a few seconds (up to 2 minutes at most) on commodity hardware, and that drones travel up to 450% of their flight range with public transit. |
Tasks | |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.11840v2, https://arxiv.org/pdf/1909.11840v2.pdf |
PWC | https://paperswithcode.com/paper/efficient-large-scale-multi-drone-delivery |
Repo | https://github.com/sisl/MultiAgentAllocationTransit.jl |
Framework | none |
LUTNet: Rethinking Inference in FPGA Soft Logic
Title | LUTNet: Rethinking Inference in FPGA Soft Logic |
Authors | Erwei Wang, James J. Davis, Peter Y. K. Cheung, George A. Constantinides |
Abstract | Research has shown that deep neural networks contain significant redundancy, and that high classification accuracies can be achieved even when weights and activations are quantised down to binary values. Network binarisation on FPGAs greatly increases area efficiency by replacing resource-hungry multipliers with lightweight XNOR gates. However, an FPGA’s fundamental building block, the K-LUT, is capable of implementing far more than an XNOR: it can perform any K-input Boolean operation. Inspired by this observation, we propose LUTNet, an end-to-end hardware-software framework for the construction of area-efficient FPGA-based neural network accelerators using the native LUTs as inference operators. We demonstrate that the exploitation of LUT flexibility allows for far heavier pruning than possible in prior works, resulting in significant area savings while achieving comparable accuracy. Against the state-of-the-art binarised neural network implementation, we achieve twice the area efficiency for several standard network models when inferencing popular datasets. We also demonstrate that even greater energy efficiency improvements are obtainable. |
Tasks | |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00938v1, http://arxiv.org/pdf/1904.00938v1.pdf |
PWC | https://paperswithcode.com/paper/lutnet-rethinking-inference-in-fpga-soft |
Repo | https://github.com/awai54st/LUTNet |
Framework | tf |
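The LUTNet abstract above notes that binarisation replaces multipliers with XNOR gates. The standard identity behind this is that, for vectors with entries in {-1, +1} packed as bits, the dot product equals twice the number of agreeing positions minus the vector length. Below is a minimal software sketch of that identity only; the bit encoding (1 for +1, 0 for -1) and the function names are assumptions for illustration, and the code does not model LUTNet's K-LUT operators.

```python
from typing import Sequence


def encode(values: Sequence[int]) -> int:
    """Pack a {-1, +1} vector into an integer: bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("entries must be -1 or +1")
    return word


def xnor_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed {-1, +1} vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the two bits match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


# Hypothetical binarised weights and activations.
weights = [1, -1, 1, 1, -1, -1, 1, -1]
activations = [1, 1, -1, 1, -1, 1, 1, -1]
result = xnor_dot(encode(weights), encode(activations), len(weights))
reference = sum(w * a for w, a in zip(weights, activations))
```

Here `result` and `reference` agree, which is the property that lets hardware swap multiply-accumulate units for XNOR gates followed by a population count.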
Magnetoresistive RAM for error resilient XNOR-Nets
Title | Magnetoresistive RAM for error resilient XNOR-Nets |
Authors | Michail Tzoufras, Marcin Gajek, Andrew Walker |
Abstract | We trained three Binarized Convolutional Neural Network architectures (LeNet-4, Network-In-Network, AlexNet) on a variety of datasets (MNIST, CIFAR-10, CIFAR-100, extended SVHN, ImageNet) using error-prone activations and tested them without errors to study the resilience of the training process. With the exception of the AlexNet when trained on the ImageNet dataset, we found that Bit Error Rates of a few percent during training do not degrade the test accuracy. Furthermore, by training the AlexNet on progressively smaller subsets of ImageNet classes, we observed increasing tolerance to activation errors. The ability to operate with high BERs is critical for reducing power consumption in existing hardware and for facilitating emerging memory technologies. We discuss how operating at moderate BER can enable Magnetoresistive RAM with higher endurance, speed and density. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10927v1, https://arxiv.org/pdf/1905.10927v1.pdf |
PWC | https://paperswithcode.com/paper/magnetoresistive-ram-for-error-resilient-xnor |
Repo | https://github.com/michail-tzoufras/ConvNets_with_Activation_Errors |
Framework | pytorch |
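The abstract above describes training with error-prone binarized activations at a given Bit Error Rate (BER). One simple way to picture this is to flip each binary activation independently with probability equal to the BER. The sketch below assumes activations in {-1, +1} and an independent-flip error model; both are assumptions for illustration, and the function name is hypothetical rather than taken from the authors' repository.

```python
import random
from typing import List, Sequence


def inject_bit_errors(activations: Sequence[int], ber: float, rng: random.Random) -> List[int]:
    """Flip each {-1, +1} activation independently with probability `ber`."""
    if not 0.0 <= ber <= 1.0:
        raise ValueError("ber must lie in [0, 1]")
    return [-a if rng.random() < ber else a for a in activations]


# Hypothetical layer output of 100000 binarized activations, all +1 for easy counting.
rng = random.Random(0)
clean = [1] * 100_000
noisy = inject_bit_errors(clean, ber=0.05, rng=rng)
observed_ber = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
```

With this many activations, `observed_ber` lands close to the requested 5%. In a training loop such a function would be applied to the activations on the forward pass only during training, matching the abstract's setup of testing without errors.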
CTCModel: a Keras Model for Connectionist Temporal Classification
Title | CTCModel: a Keras Model for Connectionist Temporal Classification |
Authors | Yann Soullard, Cyprien Ruffino, Thierry Paquet |
Abstract | We report an extension of a Keras Model, called CTCModel, to perform the Connectionist Temporal Classification (CTC) in a transparent way. Combined with Recurrent Neural Networks, the Connectionist Temporal Classification is the reference method for dealing with unsegmented input sequences, i.e. with data that are a couple of observation and label sequences where each label is related to a subset of observation frames. CTCModel makes use of the CTC implementation in the Tensorflow backend for training and predicting sequences of labels using Keras. It consists of three branches made of Keras models: one for training, computing the CTC loss function; one for predicting, providing sequences of labels; and one for evaluating that returns standard metrics for analyzing sequences of predictions. |
Tasks | |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.07957v1, http://arxiv.org/pdf/1901.07957v1.pdf |
PWC | https://paperswithcode.com/paper/ctcmodel-a-keras-model-for-connectionist |
Repo | https://github.com/ysoullard/CTCModel |
Framework | tf |
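The CTCModel abstract above mentions a prediction branch that outputs label sequences from unsegmented inputs. A common way to turn per-frame CTC scores into a label sequence is best-path (greedy) decoding: take the top class per frame, merge consecutive repeats, then drop the blank symbol. The sketch below illustrates that general procedure in plain Python; it is not CTCModel's API, and the choice of index 0 for the blank is an assumption made for this example.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], blank: int = 0) -> List[int]:
    """Best-path CTC decoding: argmax per frame, merge repeats, remove blanks."""
    best_path = [max(range(len(f)), key=lambda c: f[c]) for f in frame_scores]
    collapsed = [label for label, _ in groupby(best_path)]
    return [label for label in collapsed if label != blank]


# Hypothetical per-frame scores over three classes (class 0 is the blank).
frames = [
    [0.10, 0.80, 0.10],  # 1
    [0.20, 0.70, 0.10],  # 1 (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank, separates the two occurrences of 1
    [0.10, 0.60, 0.30],  # 1
    [0.10, 0.20, 0.70],  # 2
    [0.20, 0.10, 0.70],  # 2 (repeat, merged)
    [0.80, 0.10, 0.10],  # blank
]
decoded = ctc_greedy_decode(frames)
```

For these frames the best path is 1, 1, blank, 1, 2, 2, blank, which collapses to the label sequence [1, 1, 2]; the blank between the first two groups is what allows the label 1 to appear twice in a row.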