Paper Group ANR 822
Event Extraction with Generative Adversarial Imitation Learning
Title | Event Extraction with Generative Adversarial Imitation Learning |
Authors | Tongtao Zhang, Heng Ji |
Abstract | We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically inverse reinforcement learning (IRL) via a generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (or ground truth) and the agent among complicated states in the environment. The EE task benefits from these dynamic rewards because instances and labels vary in difficulty and the gains are expected to be diverse (e.g., an ambiguous but correctly detected trigger or argument should receive high gains), while traditional RL models usually neglect such differences and pay equal attention to all instances. Moreover, our experiments demonstrate that the proposed framework outperforms state-of-the-art methods, without explicit feature engineering. |
Tasks | Feature Engineering, Imitation Learning |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07881v1 |
http://arxiv.org/pdf/1804.07881v1.pdf | |
PWC | https://paperswithcode.com/paper/event-extraction-with-generative-adversarial |
Repo | |
Framework | |
Graph Diffusion-Embedding Networks
Title | Graph Diffusion-Embedding Networks |
Authors | Bo Jiang, Doudou Lin, Jin Tang |
Abstract | We present novel graph diffusion-embedding networks (GDEN) for graph-structured data. GDEN is motivated by our closed-form formulation of regularized feature diffusion on a graph. GDEN integrates regularized feature diffusion and low-dimensional embedding simultaneously in a unified network model. Moreover, based on GDEN, we can naturally deal with structured data with multiple graph structures. Experiments on semi-supervised learning tasks on several benchmark datasets demonstrate the better performance of the proposed GDEN compared with traditional GCN models. |
Tasks | |
Published | 2018-10-01 |
URL | http://arxiv.org/abs/1810.00797v1 |
http://arxiv.org/pdf/1810.00797v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-diffusion-embedding-networks |
Repo | |
Framework | |
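The abstract mentions a closed-form regularized feature diffusion but does not state it. As a minimal sketch, assume the common Laplacian-regularized objective ||Z - X||^2 + lam * tr(Z^T L Z) with L = D - A, whose minimizer satisfies (I + lam * L) Z = X. This form, the function name `diffuse_features`, and the Jacobi solver are illustrative assumptions, not the GDEN formulation itself.

```python
def diffuse_features(adj, feats, lam=1.0, iters=200):
    """Approximately solve (I + lam * L) Z = X, with L = D - A, by Jacobi iteration.

    adj   -- symmetric adjacency matrix as a list of lists (assumed, hypothetical input)
    feats -- node feature matrix X as a list of rows
    """
    n = len(adj)
    dim = len(feats[0])
    degree = [sum(row) for row in adj]
    z = [row[:] for row in feats]
    for _ in range(iters):
        new_z = []
        for i in range(n):
            row = []
            for k in range(dim):
                # Row i of (I + lam * L) Z = X, solved for z[i][k].
                neighbor_sum = sum(adj[i][j] * z[j][k] for j in range(n))
                row.append((feats[i][k] + lam * neighbor_sum) / (1.0 + lam * degree[i]))
            new_z.append(row)
        z = new_z
    return z


# Path graph 0 - 1 - 2 with one scalar feature per node.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0], [0.0], [0.0]]
smoothed = diffuse_features(adjacency, features, lam=1.0)
```

The feature that starts on node 0 spreads along the path while its total is preserved, which is the smoothing effect a diffusion step is meant to provide before any embedding layer.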
Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners
Title | Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners |
Authors | Yao Zhou, Arun Reddy Nelakurthi, Jingrui He |
Abstract | With the increasing demand for large amounts of labeled data, crowdsourcing has been used in many large-scale data mining applications. However, most existing works in crowdsourcing mainly focus on label inference and incentive design. In this paper, we address a different problem of adaptive crowd teaching, which is a sub-area of machine teaching in the context of crowdsourcing. Compared with machines, human beings are extremely good at learning a specific target concept (e.g., classifying the images into given categories) and they can also easily transfer the learned concepts into similar learning tasks. Therefore, a more effective way of utilizing crowdsourcing is by supervising the crowd to label in the form of teaching. In order to perform the teaching and expertise estimation simultaneously, we propose an adaptive teaching framework named JEDI to construct the personalized optimal teaching set for the crowdsourcing workers. In JEDI teaching, the teacher assumes that each learner has an exponentially decayed memory. Furthermore, it ensures comprehensiveness in the learning process by carefully balancing teaching diversity and learner’s accurate learning in terms of teaching usefulness. Finally, we validate the effectiveness and efficacy of JEDI teaching in comparison with the state-of-the-art techniques on multiple data sets with both synthetic learners and real crowdsourcing workers. |
Tasks | |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06481v2 |
http://arxiv.org/pdf/1804.06481v2.pdf | |
PWC | https://paperswithcode.com/paper/unlearn-what-you-have-learned-adaptive-crowd |
Repo | |
Framework | |
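The abstract states only that the teacher assumes an exponentially decayed memory. A minimal sketch of what such a memory could look like, assuming each past teaching example is discounted by a constant factor per step; the function names and the decay-weighted average are hypothetical and not taken from JEDI.

```python
def memory_weights(num_examples, decay=0.8):
    """Weight of each past teaching example, oldest first; the latest has weight 1."""
    return [decay ** (num_examples - 1 - s) for s in range(num_examples)]


def recalled_label_score(labels, decay=0.8):
    """Decay-weighted average of the +1 / -1 labels the learner has been shown."""
    weights = memory_weights(len(labels), decay)
    return sum(w * y for w, y in zip(weights, labels)) / sum(weights)


weights = memory_weights(4, decay=0.5)
score = recalled_label_score([+1, +1, -1, -1], decay=0.5)
```

With a decay of 0.5 the two most recent (negative) examples dominate, so the score is negative even though the labels are balanced; a teacher modelling the learner this way would have a reason to revisit older examples.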
Automated Detection of Acute Leukemia using K-mean Clustering Algorithm
Title | Automated Detection of Acute Leukemia using K-mean Clustering Algorithm |
Authors | Sachin Kumar, Sumita Mishra, Pallavi Asthana, Pragya |
Abstract | Leukemia is a hematologic cancer which develops in blood tissue and triggers rapid production of immature and abnormally shaped white blood cells. Statistics show that leukemia is one of the leading causes of death in men and women alike. Microscopic examination of a blood sample or bone marrow smear is the most effective technique for diagnosis of leukemia. Pathologists analyze microscopic samples to make diagnostic assessments on the basis of characteristic cell features. Recently, computerized methods for cancer detection have been explored towards minimizing human intervention and providing accurate clinical information. This paper presents an algorithm for automated image-based acute leukemia detection systems. The implemented method uses basic enhancement, morphology, filtering and segmentation techniques to extract the region of interest using the k-means clustering algorithm. The proposed algorithm achieved an accuracy of 92.8% and was tested with k-Nearest Neighbor (KNN) and Naive Bayes classifiers on a dataset of 60 samples. |
Tasks | |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.08544v1 |
http://arxiv.org/pdf/1803.08544v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-detection-of-acute-leukemia-using-k |
Repo | |
Framework | |
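The segmentation step relies on k-means clustering. A minimal sketch of k-means on scalar pixel intensities, assuming a deterministic initialization spread across the intensity range; the function name `kmeans_1d` and the sample values are illustrative and are not the paper's pipeline or data.

```python
def kmeans_1d(values, k=2, iters=20):
    """Cluster scalar intensities into k groups with Lloyd's algorithm."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly across the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, labels


# Hypothetical grayscale samples: dark background pixels and bright cell pixels.
pixels = [12, 15, 10, 200, 210, 190, 14, 205]
centers, labels = kmeans_1d(pixels, k=2)
```

Pixels labelled with the brighter cluster would form the candidate region of interest; a real image would be flattened to a list of intensities (or color vectors) first.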
Communication Compression for Decentralized Training
Title | Communication Compression for Decentralized Training |
Authors | Hanlin Tang, Shaoduo Gan, Ce Zhang, Tong Zhang, Ji Liu |
Abstract | Optimizing distributed learning systems is an art of balancing between computation and communication. There have been two lines of research that try to deal with slower networks: communication compression for low bandwidth networks, and decentralization for high latency networks. In this paper, we explore a natural question: can the combination of both techniques lead to a system that is robust to both bandwidth and latency? Although the system implication of such a combination is trivial, the underlying theoretical principle and algorithm design is challenging: unlike centralized algorithms, simply compressing exchanged information, even in an unbiased stochastic way, within the decentralized network would accumulate the error and fail to converge. In this paper, we develop a framework of compressed, decentralized training and propose two different strategies, which we call extrapolation compression and difference compression. We analyze both algorithms and prove both converge at the rate of $O(1/\sqrt{nT})$ where $n$ is the number of workers and $T$ is the number of iterations, matching the convergence rate for full precision, centralized training. We validate our algorithms and find that our proposed algorithm significantly outperforms the best of merely decentralized and merely quantized algorithms for networks with both high latency and low bandwidth. |
Tasks | |
Published | 2018-03-17 |
URL | http://arxiv.org/abs/1803.06443v5 |
http://arxiv.org/pdf/1803.06443v5.pdf | |
PWC | https://paperswithcode.com/paper/communication-compression-for-decentralized |
Repo | |
Framework | |
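The abstract refers to compressing exchanged information "in an unbiased stochastic way". A minimal sketch of one such compressor, stochastic rounding to a fixed grid, is shown below. It illustrates only the unbiasedness property E[Q(x)] = x; it does not implement the paper's extrapolation or difference compression, and the function name and grid step are assumptions.

```python
import math
import random


def stochastic_quantize(x, step, rng):
    """Round x to a multiple of `step`, up or down at random, so that E[Q(x)] = x."""
    lower = math.floor(x / step) * step
    prob_up = (x - lower) / step  # closer to the upper grid point -> more likely to round up
    return lower + step if rng.random() < prob_up else lower


rng = random.Random(0)
samples = [stochastic_quantize(0.3, 1.0, rng) for _ in range(20000)]
mean_estimate = sum(samples) / len(samples)
```

Each sample is either 0.0 or 1.0, yet their average approaches 0.3. Every individual message still carries error, which is the error the abstract says can accumulate when such a compressor is applied naively in a decentralized network.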
Finding Answers from the Word of God: Domain Adaptation for Neural Networks in Biblical Question Answering
Title | Finding Answers from the Word of God: Domain Adaptation for Neural Networks in Biblical Question Answering |
Authors | Helen Jiahe Zhao, Jiamou Liu |
Abstract | Question answering (QA) has significantly benefitted from deep learning techniques in recent years. However, domain-specific QA remains a challenge due to the significant amount of data required to train a neural network. This paper studies the answer sentence selection task in the Bible domain and answers questions by selecting relevant verses from the Bible. For this purpose, we create a new dataset, BibleQA, based on Bible trivia questions and propose three neural network models for our task. We pre-train our models on a large-scale QA dataset, SQuAD, and investigate the effect of transferring weights on model accuracy. Furthermore, we also measure the model accuracies with different answer context lengths and different Bible translations. We affirm that transfer learning yields a noticeable improvement in model accuracy. We achieve relatively good results with shorter context lengths, whereas longer context lengths decrease model accuracy. We also find that using a more modern Bible translation in the dataset has a positive effect on the task. |
Tasks | Domain Adaptation, Question Answering, Transfer Learning |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.12118v1 |
http://arxiv.org/pdf/1810.12118v1.pdf | |
PWC | https://paperswithcode.com/paper/finding-answers-from-the-word-of-god-domain |
Repo | |
Framework | |
Towards Explanation of DNN-based Prediction with Guided Feature Inversion
Title | Towards Explanation of DNN-based Prediction with Guided Feature Inversion |
Authors | Mengnan Du, Ninghao Liu, Qingquan Song, Xia Hu |
Abstract | While deep neural networks (DNN) have become an effective computational tool, the prediction results are often criticized for the lack of interpretability, which is essential in many real-world applications such as health informatics. Existing attempts based on local interpretations aim to identify relevant features contributing the most to the prediction of DNN by monitoring the neighborhood of a given input. They usually simply ignore the intermediate layers of the DNN that might contain rich information for interpretation. To bridge the gap, in this paper, we propose to investigate a guided feature inversion framework for taking advantage of the deep architectures towards effective interpretation. The proposed framework not only determines the contribution of each feature in the input but also provides insights into the decision-making process of DNN models. By further interacting with the neuron of the target category at the output layer of the DNN, we enforce the interpretation result to be class-discriminative. We apply the proposed interpretation model to different CNN architectures to provide explanations for image data and conduct extensive experiments on ImageNet and PASCAL VOC07 datasets. The interpretation results demonstrate the effectiveness of our proposed framework in providing class-discriminative interpretation for DNN-based prediction. |
Tasks | Decision Making |
Published | 2018-03-19 |
URL | http://arxiv.org/abs/1804.00506v2 |
http://arxiv.org/pdf/1804.00506v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-explanation-of-dnn-based-prediction |
Repo | |
Framework | |
Recommending Outfits from Personal Closet
Title | Recommending Outfits from Personal Closet |
Authors | Pongsate Tangseng, Kota Yamaguchi, Takayuki Okatani |
Abstract | We consider grading a fashion outfit for recommendation, where we assume that users have a closet of items and we aim at producing a score for an arbitrary combination of items in the closet. The challenge in outfit grading is that the input to the system is a bag of item pictures that are unordered and vary in size. We build a deep neural network-based system that can take variable-length items and predict a score. We collect a large number of outfits from a popular fashion sharing website, Polyvore, and evaluate the performance of our grading system. We compare our model with a random-choice baseline, both on the traditional classification evaluation and on people’s judgment using a crowdsourcing platform. With over 84% in classification accuracy and 91% matching ratio to human annotators, our model can reliably grade the quality of an outfit. We also build an outfit recommender on top of our grader to demonstrate the practical application of our model for a personal closet assistant. |
Tasks | |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.09979v1 |
http://arxiv.org/pdf/1804.09979v1.pdf | |
PWC | https://paperswithcode.com/paper/recommending-outfits-from-personal-closet |
Repo | |
Framework | |
Deep Learning for Forecasting Stock Returns in the Cross-Section
Title | Deep Learning for Forecasting Stock Returns in the Cross-Section |
Authors | Masaya Abe, Hideki Nakayama |
Abstract | Many studies have been undertaken by using machine learning techniques, including neural networks, to predict stock returns. Recently, a method known as deep learning, which achieves high performance mainly in image recognition and speech recognition, has attracted attention in the machine learning field. This paper implements deep learning to predict one-month-ahead stock returns in the cross-section in the Japanese stock market and investigates the performance of the method. Our results show that deep neural networks generally outperform shallow neural networks, and the best networks also outperform representative machine learning models. These results indicate that deep learning shows promise as a skillful machine learning method to predict stock returns in the cross-section. |
Tasks | Speech Recognition |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.01777v4 |
http://arxiv.org/pdf/1801.01777v4.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-forecasting-stock-returns |
Repo | |
Framework | |
CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs
Title | CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs |
Authors | Liangzhen Lai, Naveen Suda, Vikas Chandra |
Abstract | Deep Neural Networks are becoming increasingly popular in always-on IoT edge devices performing data analytics right at the source, reducing latency as well as energy consumption for data communication. This paper presents CMSIS-NN, efficient kernels developed to maximize the performance and minimize the memory footprint of neural network (NN) applications on Arm Cortex-M processors targeted for intelligent IoT edge devices. Neural network inference based on CMSIS-NN kernels achieves 4.6X improvement in runtime/throughput and 4.9X improvement in energy efficiency. |
Tasks | |
Published | 2018-01-19 |
URL | http://arxiv.org/abs/1801.06601v1 |
http://arxiv.org/pdf/1801.06601v1.pdf | |
PWC | https://paperswithcode.com/paper/cmsis-nn-efficient-neural-network-kernels-for |
Repo | |
Framework | |
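The abstract does not describe the arithmetic used by the kernels. As a rough illustration of why low-precision kernels reduce memory footprint, the sketch below assumes 8-bit fixed-point weights and activations with a wide accumulator, a right shift for rescaling, and saturation to the 8-bit range. The function names and the shift value are hypothetical and this is not the CMSIS-NN API.

```python
def saturate_int8(value):
    """Clamp an integer to the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, value))


def fixed_point_dot(weights, activations, bias=0, out_shift=7):
    """Dot product of 8-bit integers with a wide accumulator, then rescale and saturate."""
    acc = bias
    for w, a in zip(weights, activations):
        acc += w * a  # products are accumulated at full precision
    return saturate_int8(acc >> out_shift)  # arithmetic shift rescales the result


half = 64  # represents 0.5 when 7 fractional bits are assumed
result = fixed_point_dot([half, half], [half, half])
```

Here 0.5 * 0.5 + 0.5 * 0.5 = 0.5 comes back as 64 in the same 7-fractional-bit format, and larger sums saturate at 127 instead of wrapping around.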
End-to-End Learning for the Deep Multivariate Probit Model
Title | End-to-End Learning for the Deep Multivariate Probit Model |
Authors | Di Chen, Yexiang Xue, Carla P. Gomes |
Abstract | The multivariate probit model (MVP) is a popular classic model for studying binary responses of multiple entities. Nevertheless, the computational challenge of learning the MVP model, given that its likelihood involves integrating over a multidimensional constrained space of latent variables, significantly limits its application in practice. We propose a flexible deep generalization of the classic MVP, the Deep Multivariate Probit Model (DMVP), which is an end-to-end learning scheme that uses an efficient parallel sampling process of the multivariate probit model to exploit GPU-boosted deep neural networks. We present both theoretical and empirical analysis of the convergence behavior of DMVP’s sampling process with respect to the resolution of the correlation structure. We provide convergence guarantees for DMVP and our empirical analysis demonstrates the advantages of DMVP’s sampling compared with standard MCMC-based methods. We also show that when applied to multi-entity modelling problems, which are natural DMVP applications, DMVP trains faster than classical MVP, by at least an order of magnitude, captures rich correlations among entities, and further improves the joint likelihood of entities compared with several competitive models. |
Tasks | |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08591v4 |
http://arxiv.org/pdf/1803.08591v4.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-learning-for-the-deep-multivariate |
Repo | |
Framework | |
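The MVP likelihood is the probability that a latent Gaussian vector falls in the orthant matching the observed binary responses. A minimal sketch of a naive Monte Carlo estimate of that integral follows; it is a plain sequential sampler for illustration, not the parallel sampling scheme of DMVP, and all function names are assumptions.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def mvp_likelihood(y, mean, cov, num_samples=20000, seed=0):
    """Estimate P(y) where y[i] = 1 exactly when latent z[i] > 0 and z ~ N(mean, cov)."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    n = len(y)
    hits = 0
    for _ in range(num_samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlate the standard normal draws through the Cholesky factor.
        z = [mean[i] + sum(lower[i][k] * eps[k] for k in range(i + 1)) for i in range(n)]
        if all((z[i] > 0) == bool(y[i]) for i in range(n)):
            hits += 1
    return hits / num_samples


independent = mvp_likelihood([1, 0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
correlated = mvp_likelihood([1, 1], [0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]])
```

For two independent zero-mean entities each response pattern has probability 1/4, while a correlation of 0.8 raises the probability of a joint positive response to 1/4 + arcsin(0.8) / (2 * pi), roughly 0.40. The estimates should land near those values, and the cost of drawing enough samples in higher dimensions is the computational challenge the abstract refers to.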
Design by adaptive sampling
Title | Design by adaptive sampling |
Authors | David H. Brookes, Jennifer Listgarten |
Abstract | We present a probabilistic modeling framework and adaptive sampling algorithm wherein unsupervised generative models are combined with black box predictive models to tackle the problem of input design. In input design, one is given one or more stochastic “oracle” predictive functions, each of which maps from the input design space (e.g. DNA sequences or images) to a distribution over a property of interest (e.g. protein fluorescence or image content). Given such stochastic oracles, the problem is to find an input that is expected to maximize one or more properties, or to achieve a specified value of one or more properties, or any combination thereof. We demonstrate experimentally that our approach substantially outperforms other recently presented methods for tackling a specific version of this problem, namely, maximization when the oracle is assumed to be deterministic and unbiased. We also demonstrate that our method can tackle more general versions of the problem. |
Tasks | |
Published | 2018-10-08 |
URL | https://arxiv.org/abs/1810.03714v4 |
https://arxiv.org/pdf/1810.03714v4.pdf | |
PWC | https://paperswithcode.com/paper/design-by-adaptive-sampling |
Repo | |
Framework | |
Policy Gradient With Value Function Approximation For Collective Multiagent Planning
Title | Policy Gradient With Value Function Approximation For Collective Multiagent Planning |
Authors | Duc Thien Nguyen, Akshat Kumar, Hoong Chuin Lau |
Abstract | Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address such a subclass called CDEC-POMDP where the collective behavior of a population of agents affects the joint-reward and environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDEC-POMDP policies. Vanilla AC has slow convergence for larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and also derive a new way to train the critic based on local reward signals. Comparisons on a synthetic benchmark and a real-world taxi fleet optimization problem show that our new AC approach provides better quality solutions than previous best approaches. |
Tasks | Decision Making |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.02884v1 |
http://arxiv.org/pdf/1804.02884v1.pdf | |
PWC | https://paperswithcode.com/paper/policy-gradient-with-value-function |
Repo | |
Framework | |
Not Just Privacy: Improving Performance of Private Deep Learning in Mobile Cloud
Title | Not Just Privacy: Improving Performance of Private Deep Learning in Mobile Cloud |
Authors | Ji Wang, Jianguo Zhang, Weidong Bao, Xiaomin Zhu, Bokai Cao, Philip S. Yu |
Abstract | The increasing demand for on-device deep learning services calls for a highly efficient manner to deploy deep neural networks (DNNs) on mobile devices with limited capacity. The cloud-based solution is a promising approach to enabling deep learning applications on mobile devices where large portions of a DNN are offloaded to the cloud. However, revealing data to the cloud leads to potential privacy risk. To benefit from the cloud data center without the privacy risk, we design, evaluate, and implement a cloud-based framework, ARDEN, which partitions the DNN across mobile devices and cloud data centers. A simple data transformation is performed on the mobile device, while the resource-hungry training and the complex inference rely on the cloud data center. To protect the sensitive information, a lightweight privacy-preserving mechanism consisting of arbitrary data nullification and random noise addition is introduced, which provides a strong privacy guarantee. A rigorous privacy budget analysis is given. Nonetheless, the private perturbation to the original data inevitably has a negative impact on the performance of further inference on the cloud side. To mitigate this influence, we propose a noisy training method to enhance the cloud-side network's robustness to perturbed data. Through the sophisticated design, ARDEN can not only preserve privacy but also improve the inference performance. To validate the proposed ARDEN, a series of experiments based on three image datasets and a real mobile application are conducted. The experimental results demonstrate the effectiveness of ARDEN. Finally, we implement ARDEN on a demo system to verify its practicality. |
Tasks | |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03428v3 |
http://arxiv.org/pdf/1809.03428v3.pdf | |
PWC | https://paperswithcode.com/paper/not-just-privacy-improving-performance-of |
Repo | |
Framework | |
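The privacy mechanism is described as data nullification followed by random noise addition. A minimal sketch of that two-step perturbation, assuming independent per-entry nullification and Laplace-distributed noise; the noise family, the parameter names, and the function `perturb` are assumptions, since the abstract does not specify them.

```python
import random


def perturb(features, nullify_rate, noise_scale, rng):
    """Zero a random subset of entries, then add Laplace noise to every entry."""
    out = []
    for value in features:
        kept = 0.0 if rng.random() < nullify_rate else value
        noise = 0.0
        if noise_scale > 0:
            # The difference of two exponentials with mean `noise_scale`
            # is Laplace-distributed with location 0 and that scale.
            noise = rng.expovariate(1.0 / noise_scale) - rng.expovariate(1.0 / noise_scale)
        out.append(kept + noise)
    return out


rng = random.Random(0)
activations = [0.2, 1.5, -0.7, 3.1]  # hypothetical output of the on-device layers
private = perturb(activations, nullify_rate=0.25, noise_scale=0.1, rng=rng)
```

The cloud side would only ever see `private`. Training the cloud-side network on inputs perturbed in the same way is the idea behind the noisy training step mentioned in the abstract.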
Topologically Controlled Lossy Compression
Title | Topologically Controlled Lossy Compression |
Authors | Maxime Soler, Melanie Plainchault, Bruno Conche, Julien Tierny |
Abstract | This paper presents a new algorithm for the lossy compression of scalar data defined on 2D or 3D regular grids, with topological control. Certain techniques allow users to control the pointwise error induced by the compression. However, in many scenarios it is desirable to control in a similar way the preservation of higher-level notions, such as topological features, in order to provide guarantees on the outcome of post-hoc data analyses. This paper presents the first compression technique for scalar data which supports a strictly controlled loss of topological features. It provides users with specific guarantees both on the preservation of the important features and on the size of the smaller features destroyed during compression. In particular, we present a simple compression strategy based on a topologically adaptive quantization of the range. Our algorithm provides strong guarantees on the bottleneck distance between persistence diagrams of the input and decompressed data, specifically those associated with extrema. A simple extension of our strategy additionally enables a control on the pointwise error. We also show how to combine our approach with state-of-the-art compressors, to further improve the geometrical reconstruction. Extensive experiments, for comparable compression rates, demonstrate the superiority of our algorithm in terms of the preservation of topological features. We show the utility of our approach by illustrating the compatibility between the output of post-hoc topological data analysis pipelines, executed on the input and decompressed data, for simulated or acquired data sets. We also provide a lightweight VTK-based C++ implementation of our approach for reproduction purposes. |
Tasks | Quantization, Topological Data Analysis |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02731v1 |
http://arxiv.org/pdf/1802.02731v1.pdf | |
PWC | https://paperswithcode.com/paper/topologically-controlled-lossy-compression |
Repo | |
Framework | |
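The abstract mentions quantization of the range and control of the pointwise error. A minimal sketch of the simplest version of that idea, uniform range quantization with a guaranteed error bound, is given below. It does not implement the topologically adaptive quantization of the paper; the function names and sample values are assumptions.

```python
def quantize(values, max_error):
    """Map each scalar to an integer bin index using bins of width 2 * max_error."""
    step = 2.0 * max_error
    return [round(v / step) for v in values]


def dequantize(codes, max_error):
    """Reconstruct each scalar as the center of its bin."""
    step = 2.0 * max_error
    return [c * step for c in codes]


field = [0.03, 0.49, 1.27, -0.8, 2.0]  # hypothetical scalar samples on a grid
codes = quantize(field, max_error=0.05)
restored = dequantize(codes, max_error=0.05)
worst = max(abs(a - b) for a, b in zip(field, restored))
```

Because every value is replaced by the center of a bin of width 2 * max_error, the reconstruction error never exceeds max_error (up to floating-point rounding), and the small integer codes are what a lossless back end would then compress.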