Paper Group ANR 849
The Tools Challenge: Rapid Trial-and-Error Learning in Physical Problem Solving. Emergence of Numeric Concepts in Multi-Agent Autonomous Communication. A Method for Identifying Origin of Digital Images Using a Convolution Neural Network. A frame semantic overview of NLP-based information extraction for cancer-related EHR notes. Remaining Useful Lif …
The Tools Challenge: Rapid Trial-and-Error Learning in Physical Problem Solving
Title | The Tools Challenge: Rapid Trial-and-Error Learning in Physical Problem Solving |
Authors | Kelsey R. Allen, Kevin A. Smith, Joshua B. Tenenbaum |
Abstract | Many animals, and an increasing number of artificial agents, display sophisticated capabilities to perceive and manipulate objects. But human beings remain distinctive in their capacity for flexible, creative tool use – using objects in new ways to act on the world, achieve a goal, or solve a problem. Here we introduce the “Tools” game, a simple but challenging domain for studying this behavior in human and artificial agents. Players place objects in a dynamic scene to accomplish a goal that can only be achieved if those objects interact with other scene elements in appropriate ways: for instance, launching, blocking, supporting or tipping them. Only a few attempts are permitted, requiring rapid trial-and-error learning if a solution is not found at first. We propose a “Sample, Simulate, Update” (SSUP) framework for modeling how people solve these challenges, based on exploiting rich world knowledge to sample actions that would lead to successful outcomes, simulate candidate actions before trying them out, and update beliefs about which tools and actions are best in a rapid learning loop. SSUP captures human performance well across 20 levels of the Tools game, and fits significantly better than alternate accounts based on deep reinforcement learning or learning the simulator parameters online. We discuss how the Tools challenge might guide the development of better physical reasoning agents in AI, as well as better accounts of human physical reasoning and tool use. |
Tasks | |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09620v2 |
https://arxiv.org/pdf/1907.09620v2.pdf | |
PWC | https://paperswithcode.com/paper/the-tools-challenge-rapid-trial-and-error |
Repo | |
Framework | |
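The abstract's "Sample, Simulate, Update" loop can be sketched in a few lines. This is a toy illustration under assumptions of mine (a one-dimensional Gaussian belief over actions and a hand-written update rule), not the authors' model or their physics simulator:

```python
import random

def ssup_solve(simulate, reward_threshold=0.95, max_attempts=10, n_samples=20):
    """Toy Sample-Simulate-Update (SSUP) loop. The belief is a 1-D Gaussian
    over actions; shifting it toward the best simulated candidate and
    shrinking the search width are illustrative assumptions."""
    mu, sigma = 0.0, 1.0              # belief over where the best action lies
    best_x, best_r = 0.0, -1.0
    for _ in range(max_attempts):
        # Sample candidate actions from the current belief ...
        candidates = [random.gauss(mu, sigma) for _ in range(n_samples)]
        # ... simulate each one before acting in the world ...
        r, x = max((simulate(c), c) for c in candidates)
        if r > best_r:
            best_r, best_x = r, x
        if best_r >= reward_threshold:
            break                     # good enough: stop trial-and-error
        # ... and update the belief toward the most promising candidate.
        mu, sigma = 0.5 * (mu + x), max(0.2, 0.8 * sigma)
    return best_x, best_r

# Toy "simulator": reward decays with distance from the unknown optimum 2.0.
random.seed(0)
x, r = ssup_solve(lambda a: 1.0 / (1.0 + abs(a - 2.0)))
```

The few-attempts constraint of the Tools game is mirrored by `max_attempts`; the real model simulates with a noisy physics engine rather than a scalar reward function.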
Emergence of Numeric Concepts in Multi-Agent Autonomous Communication
Title | Emergence of Numeric Concepts in Multi-Agent Autonomous Communication |
Authors | Shangmin Guo |
Abstract | With the rapid development of deep learning, most current state-of-the-art techniques in natural language processing are based on deep learning models trained on large-scale static textual corpora. However, we human beings learn and understand in a different way. Thus, grounded language learning argues that models need to learn and understand language through the experience and perceptions obtained by interacting with environments, as humans do. With the help of deep reinforcement learning techniques, many works have focused on facilitating the emergence of communication protocols with compositionality like natural languages among populations of computational agents. Unlike these works, we focus on the numeric concepts which correspond to abstractions in cognition and function words in natural language. Based on a specifically designed language game, we verify that computational agents are capable of transmitting numeric concepts during autonomous communication, and that the emergent communication protocols can reflect the underlying structure of the meaning space. Although their encoding method is not compositional like natural languages from a human perspective, the emergent languages can be generalised to unseen inputs and, more importantly, are easier for models to learn. Besides, iterated learning can help further improve the compositionality of the emergent languages, as measured by topological similarity. Furthermore, we experiment with another representation method, i.e. directly encoding numerals as concatenations of one-hot vectors, and find that the emergent languages then become compositional like human natural languages. Thus, we argue that there are two important factors for the emergence of compositional languages. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01098v1 |
https://arxiv.org/pdf/1911.01098v1.pdf | |
PWC | https://paperswithcode.com/paper/emergence-of-numeric-concepts-in-multi-agent |
Repo | |
Framework | |
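The topological similarity measure the abstract uses can be sketched as a correlation between pairwise distances in meaning space and message space. The literature typically uses Spearman correlation; plain Pearson is used here as a simplifying assumption to keep the sketch dependency-free:

```python
def topological_similarity(meanings, messages, dist_m, dist_s):
    """Topological (topographic) similarity: correlation between pairwise
    distances among meanings and among the messages that encode them.
    High correlation indicates a compositional mapping."""
    pairs = [(i, j) for i in range(len(meanings))
             for j in range(i + 1, len(meanings))]
    a = [dist_m(meanings[i], meanings[j]) for i, j in pairs]
    b = [dist_s(messages[i], messages[j]) for i, j in pairs]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / ((sum((x - ma) ** 2 for x in a) ** 0.5)
                  * (sum((y - mb) ** 2 for y in b) ** 0.5))

# Message length tracks numeric magnitude exactly -> perfect correlation.
ts = topological_similarity([1, 2, 4], ["a", "ab", "abcd"],
                            lambda x, y: abs(x - y),
                            lambda x, y: abs(len(x) - len(y)))
```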
A Method for Identifying Origin of Digital Images Using a Convolution Neural Network
Title | A Method for Identifying Origin of Digital Images Using a Convolution Neural Network |
Authors | Rong Huang, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen |
Abstract | The rapid development of deep learning techniques has created new challenges in identifying the origin of digital images because generative adversarial networks and variational autoencoders can create plausible digital images whose contents are not present in natural scenes. In this paper, we consider origins broken down into three categories: natural photographic image (NPI), computer-generated graphic (CGG), and deep network generated image (DGI). A method is presented for effectively identifying the origin of digital images that is based on a convolutional neural network (CNN) and uses a local-to-global framework to reduce training complexity. Fed with labeled data, the CNN is trained to predict the origin of local patches cropped from an image. The origin of the full-size image is then determined by majority voting. Unlike previous forensic methods, the CNN takes raw pixels as input without the aid of a “residual map”. Experimental results revealed that not only the high-frequency components but also the middle-frequency ones contribute to origin identification. The proposed method achieved up to 95.21% identification accuracy and behaved robustly against several common post-processing operations, including JPEG compression, scaling, geometric transformation, and contrast stretching. The quantitative results demonstrate that the proposed method is more effective than handcrafted feature-based methods. |
Tasks | |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00655v1 |
https://arxiv.org/pdf/1911.00655v1.pdf | |
PWC | https://paperswithcode.com/paper/a-method-for-identifying-origin-of-digital |
Repo | |
Framework | |
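The local-to-global aggregation step is simple enough to sketch. The patch classifier itself (the CNN) is not reproduced; this only shows the majority vote over its per-patch outputs:

```python
from collections import Counter

LABELS = ("NPI", "CGG", "DGI")  # the three origin classes from the paper

def image_origin(patch_predictions):
    """Aggregate per-patch origin predictions into a full-image decision
    by majority vote, as in the paper's local-to-global framework.
    `patch_predictions` is a list of labels from a patch-level classifier."""
    votes = Counter(patch_predictions)
    label, _ = votes.most_common(1)[0]
    return label

# e.g. 5 of 8 patches look deep-network-generated
origin = image_origin(["DGI"] * 5 + ["NPI"] * 2 + ["CGG"])
```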
A frame semantic overview of NLP-based information extraction for cancer-related EHR notes
Title | A frame semantic overview of NLP-based information extraction for cancer-related EHR notes |
Authors | Surabhi Datta, Elmer V Bernstam, Kirk Roberts |
Abstract | Objective: Electronic Health Record (EHR) notes contain a wealth of cancer-related information that can be useful for biomedical research, provided natural language processing (NLP) methods are available to extract and structure this information. In this paper, we present a scoping review of the existing clinical NLP literature for cancer. Methods: We identified studies describing an NLP method to extract specific cancer-related information from EHR sources from PubMed, Google Scholar, ACL Anthology, and existing reviews. Two exclusion criteria were used: we excluded articles where the extraction techniques were too broad to be represented as frames, as well as articles using very low-level extraction methods. 79 articles were included in the final review. We organized this information according to frame semantic principles to help identify common areas of overlap and potential gaps. Results: Frames were created from the reviewed articles for cancer information such as cancer diagnosis, tumor description, cancer procedure, breast cancer diagnosis, prostate cancer diagnosis, and pain in prostate cancer patients. These frames include both a definition and specific frame elements (i.e., extractable attributes). We found that cancer diagnosis was the most common frame among the reviewed papers (36 out of 79), with recent work focusing on extracting information related to treatment and breast cancer diagnosis. Conclusion: The list of common frames described in this paper identifies important cancer-related information extracted by existing NLP techniques and serves as a useful resource for future researchers requiring cancer information extracted from EHR notes. Given the heavy duplication among cancer NLP systems, we also argue that a general-purpose resource of annotated cancer frames and corresponding NLP tools would be valuable. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01655v1 |
http://arxiv.org/pdf/1904.01655v1.pdf | |
PWC | https://paperswithcode.com/paper/a-frame-semantic-overview-of-nlp-based |
Repo | |
Framework | |
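A frame in the review's sense is just a named concept plus its extractable attributes (frame elements). A minimal rendering, with illustrative element names that are not the review's exact inventory:

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A cancer information frame: a named concept plus its frame elements
    (extractable attributes). Element names here are hypothetical examples."""
    name: str
    elements: dict = field(default_factory=dict)

diagnosis = Frame("CancerDiagnosis",
                  {"cancer_type": "breast", "stage": "II", "date": "2015-03"})
```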
Remaining Useful Life Estimation Using Functional Data Analysis
Title | Remaining Useful Life Estimation Using Functional Data Analysis |
Authors | Qiyao Wang, Shuai Zheng, Ahmed Farahat, Susumu Serita, Chetan Gupta |
Abstract | The Remaining Useful Life (RUL) of a piece of equipment or one of its components is defined as the time left until the equipment or component reaches the end of its useful life. Accurate RUL estimation is exceptionally beneficial to Predictive Maintenance and Prognostics and Health Management (PHM). Data-driven approaches, which leverage the power of algorithms for RUL estimation using sensor and operational time series data, are gaining popularity. Existing algorithms, such as linear regression, Convolutional Neural Networks (CNNs), Hidden Markov Models (HMMs), and Long Short-Term Memory (LSTM), have their own limitations for the RUL estimation task. In this work, we propose a novel Functional Data Analysis (FDA) method called functional Multilayer Perceptron (functional MLP) for RUL estimation. Functional MLP treats time series data from multiple equipment as a sample of random continuous processes over time. FDA explicitly incorporates both the correlations within the same equipment and the random variations across different equipment’s sensor time series into the model. FDA also has the benefit of allowing the relationship between RUL and sensor variables to vary over time. We implement functional MLP on the benchmark NASA C-MAPSS data and evaluate the performance using two widely used metrics. Results show the superiority of our algorithm over all the other state-of-the-art methods. |
Tasks | Time Series |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06442v1 |
http://arxiv.org/pdf/1904.06442v1.pdf | |
PWC | https://paperswithcode.com/paper/remaining-useful-life-estimation-using |
Repo | |
Framework | |
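The core idea of a functional MLP is that a neuron consumes a whole curve rather than a fixed-length vector: the dot product becomes an integral of a weight function against the input function. A toy, integration-by-trapezoid sketch of that idea (not the authors' implementation):

```python
import math

def functional_neuron(x_curve, weight_curve, times):
    """Toy functional neuron: compute the integral of weight(t) * x(t) over
    the observation window (trapezoidal rule) and squash it. A time-varying
    weight_curve is what lets the RUL/sensor relationship change over time."""
    integral = 0.0
    for t0, t1 in zip(times, times[1:]):
        f0 = weight_curve(t0) * x_curve(t0)
        f1 = weight_curve(t1) * x_curve(t1)
        integral += 0.5 * (f0 + f1) * (t1 - t0)
    return math.tanh(integral)

# Constant weight on a linear sensor trace: integral of t over [0, 1] is 0.5.
times = [i / 10 for i in range(11)]
out = functional_neuron(lambda t: t, lambda t: 1.0, times)
```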
Deep Learning with Anatomical Priors: Imitating Enhanced Autoencoders in Latent Space for Improved Pelvic Bone Segmentation in MRI
Title | Deep Learning with Anatomical Priors: Imitating Enhanced Autoencoders in Latent Space for Improved Pelvic Bone Segmentation in MRI |
Authors | Duc Duy Pham, Gurbandurdy Dovletov, Sebastian Warwas, Stefan Landgraeber, Marcus Jäger, Josef Pauli |
Abstract | We propose a 2D Encoder-Decoder based deep learning architecture for semantic segmentation that incorporates anatomical priors by imitating the encoder component of an autoencoder in latent space. The autoencoder is additionally enhanced by means of hierarchical features extracted by a U-Net module. Our suggested architecture is trained in an end-to-end manner and is evaluated on the example of pelvic bone segmentation in MRI. A comparison to the standard U-Net architecture shows promising improvements. |
Tasks | Semantic Segmentation |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09263v1 |
http://arxiv.org/pdf/1903.09263v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-with-anatomical-priors |
Repo | |
Framework | |
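One way to read "imitating the encoder in latent space" is a segmentation loss plus a latent imitation term. The L2 distance and the weight `alpha` below are my assumptions; the abstract does not specify the combined objective:

```python
def combined_loss(seg_loss, z_seg, z_ae, alpha=0.1):
    """Hypothetical combined objective: the segmentation encoder's latent
    code z_seg is pulled toward the autoencoder's code z_ae, injecting the
    anatomical prior. Mean squared distance and alpha are illustrative."""
    imitation = sum((a - b) ** 2 for a, b in zip(z_seg, z_ae)) / len(z_seg)
    return seg_loss + alpha * imitation

loss = combined_loss(0.5, [1.0, 2.0], [1.0, 0.0])  # imitation term = 2.0
```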
Security of Facial Forensics Models Against Adversarial Attacks
Title | Security of Facial Forensics Models Against Adversarial Attacks |
Authors | Rong Huang, Fuming Fang, Huy H. Nguyen, Junichi Yamagishi, Isao Echizen |
Abstract | Deep neural networks (DNNs) have been used in forensics to identify fake facial images. We investigated several DNN-based forgery forensics models (FFMs) to determine whether they are secure against adversarial attacks. We experimentally demonstrated the existence of individual adversarial perturbations (IAPs) and universal adversarial perturbations (UAPs) that can lead a well-performing FFM to misbehave. Using an iterative procedure, gradient information is used to generate two kinds of IAPs that can be used to fabricate classification and segmentation outputs. In contrast, UAPs are generated on the basis of over-firing. We designed a new objective function that encourages neurons to over-fire, which makes UAP generation feasible even without using training data. Experiments demonstrated the transferability of UAPs across unseen datasets and unseen FFMs. Moreover, we are the first to conduct a subjective assessment of the imperceptibility of the adversarial perturbations, revealing that the crafted UAPs are visually negligible. These findings provide a baseline for evaluating the adversarial security of FFMs. |
Tasks | |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00660v1 |
https://arxiv.org/pdf/1911.00660v1.pdf | |
PWC | https://paperswithcode.com/paper/security-of-facial-forensics-models-against |
Repo | |
Framework | |
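The iterative, gradient-based IAP generation the abstract mentions typically builds on gradient-sign steps. A single such step, with the target FFM and its gradients not reproduced (`grad` is assumed to be d(loss)/dx at `x`):

```python
def iap_step(x, grad, eps=0.05):
    """One gradient-sign perturbation step (FGSM-style), the kind of
    iterative procedure used to craft individual adversarial perturbations.
    eps is an illustrative step size."""
    sign = lambda g: (g > 0) - (g < 0)   # -1, 0, or +1 per component
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

x_adv = iap_step([0.2, 0.5], [0.3, -0.1])
```

Iterating this step (and clipping to the valid pixel range) yields the usual attack loop; the paper's UAP variant instead optimizes a data-free over-firing objective.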
RIS-GAN: Explore Residual and Illumination with Generative Adversarial Networks for Shadow Removal
Title | RIS-GAN: Explore Residual and Illumination with Generative Adversarial Networks for Shadow Removal |
Authors | Ling Zhang, Chengjiang Long, Xiaolong Zhang, Chunxia Xiao |
Abstract | Residual images and illumination estimation have proved very helpful in image enhancement. In this paper, we propose RIS-GAN, a general and novel framework that explores residual and illumination with generative adversarial networks for shadow removal. Combined with the coarse shadow-removal image, the estimated negative residual images and inverse illumination maps can be used to generate indirect shadow-removal images that refine the coarse shadow-removal result into the fine shadow-free image in a coarse-to-fine fashion. Three discriminators are designed to jointly distinguish whether the predicted negative residual images, shadow-removal images, and inverse illumination maps are real or fake compared with the corresponding ground-truth information. To the best of our knowledge, we are the first to explore residual and illumination for shadow removal. We evaluate our proposed method on two benchmark datasets, i.e., SRD and ISTD, and extensive experiments demonstrate that our proposed method achieves performance superior to the state of the art, although we have no particular shadow-aware components designed in our generators. |
Tasks | Image Enhancement |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.09178v2 |
https://arxiv.org/pdf/1911.09178v2.pdf | |
PWC | https://paperswithcode.com/paper/ris-gan-explore-residual-and-illumination |
Repo | |
Framework | |
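The two indirect shadow-removal estimates can be written per pixel. The arithmetic below is my reading of the standard formulation (residual = shadow minus shadow-free, so adding the negative residual recovers the shadow-free value; the inverse illumination map rescales the shadowed pixel), not taken verbatim from the paper:

```python
def indirect_removals(shadow_px, neg_residual_px, inv_illum_px):
    """Two indirect shadow-removal estimates for one pixel:
    via the estimated negative residual, and via the inverse illumination
    map. In RIS-GAN these refine the coarse shadow-removal result."""
    via_residual = shadow_px + neg_residual_px
    via_illumination = shadow_px * inv_illum_px
    return via_residual, via_illumination

# A dark shadowed pixel brightened to the same value by both routes.
r_est, i_est = indirect_removals(0.3, 0.45, 2.5)
```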
Pre-training of Deep Contextualized Embeddings of Words and Entities for Named Entity Disambiguation
Title | Pre-training of Deep Contextualized Embeddings of Words and Entities for Named Entity Disambiguation |
Authors | Ikuya Yamada, Hiroyuki Shindo |
Abstract | Deep contextualized embeddings trained using unsupervised language modeling (e.g., ELMo and BERT) are successful in a wide range of NLP tasks. In this paper, we propose a new contextualized embedding model of words and entities for named entity disambiguation (NED). Our model is based on the bidirectional transformer encoder and produces contextualized embeddings for words and entities in the input text. The embeddings are trained using a new masked entity prediction task that aims to train the model by predicting randomly masked entities in entity-annotated texts. We trained the model using entity-annotated texts obtained from Wikipedia. We evaluated our model by addressing NED using a simple NED model based on the trained contextualized embeddings. As a result, we achieved state-of-the-art or competitive results on several standard NED datasets. |
Tasks | Entity Disambiguation, Language Modelling |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00426v1 |
https://arxiv.org/pdf/1909.00426v1.pdf | |
PWC | https://paperswithcode.com/paper/pre-training-of-deep-contextualized |
Repo | |
Framework | |
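The masked entity prediction task can be sketched as a data-preparation step: randomly hide entity mentions and record the targets the model must predict. Token-level masking and the masking rate below are illustrative assumptions, not the paper's exact procedure:

```python
import random

def mask_entities(tokens, entity_flags, mask_rate=0.3, rng=None):
    """Toy rendering of masked entity prediction: replace a random subset of
    entity tokens with [MASK]; return the masked sequence and the targets
    (position -> original entity) the model is trained to recover."""
    rng = rng or random.Random(0)
    masked, targets = [], {}
    for i, (tok, is_ent) in enumerate(zip(tokens, entity_flags)):
        if is_ent and rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets

masked, targets = mask_entities(["Paris", "is", "in", "France"], [1, 0, 0, 1],
                                rng=random.Random(1))
```

Predicting the hidden entity from its masked context is what forces the embeddings to encode entity-disambiguating information.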
Unsupervised Neural Generative Semantic Hashing
Title | Unsupervised Neural Generative Semantic Hashing |
Authors | Casper Hansen, Christian Hansen, Jakob Grue Simonsen, Stephen Alstrup, Christina Lioma |
Abstract | Fast similarity search is a key component in large-scale information retrieval, where semantic hashing has become a popular strategy for representing documents as binary hash codes. Recent advances in this area have been obtained through neural network based models: generative models trained by learning to reconstruct the original documents. We present a novel unsupervised generative semantic hashing approach, Ranking-based Semantic Hashing (RBSH), which consists of both a variational and a ranking-based component. Similarly to variational autoencoders, the variational component is trained to reconstruct the original document conditioned on its generated hash code, and as in prior work, it only considers documents individually. The ranking component addresses this limitation by incorporating inter-document similarity into the hash code generation, modelling document ranking through a hinge loss. To circumvent the need for labelled data to compute the hinge loss, we use a weak labeller and thus keep the approach fully unsupervised. Extensive experimental evaluation on four publicly available datasets against traditional baselines and recent state-of-the-art methods for semantic hashing shows that RBSH significantly outperforms all other methods across all evaluated hash code lengths. In fact, RBSH hash codes are able to perform similarly to state-of-the-art hash codes while using 2-4x fewer bits. |
Tasks | Code Generation, Document Ranking, Information Retrieval |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00671v1 |
https://arxiv.org/pdf/1906.00671v1.pdf | |
PWC | https://paperswithcode.com/paper/190600671 |
Repo | |
Framework | |
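The ranking component's hinge loss has a standard form worth spelling out: a document's hash code should be more similar to a (weakly labelled) relevant document than to an irrelevant one by at least a margin. The margin value here is an illustrative choice:

```python
def ranking_hinge_loss(sim_pos, sim_neg, margin=1.0):
    """Hinge loss over hash-code similarities: zero once the relevant
    document beats the irrelevant one by `margin`, linear penalty otherwise.
    The weak labeller supplies which pair counts as relevant/irrelevant."""
    return max(0.0, margin - (sim_pos - sim_neg))

assert ranking_hinge_loss(3.0, 1.0) == 0.0   # ranking satisfied by the margin
assert ranking_hinge_loss(1.0, 0.5) == 0.5   # margin violated -> penalty
```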
Weighted Distributed Differential Privacy ERM: Convex and Non-convex
Title | Weighted Distributed Differential Privacy ERM: Convex and Non-convex |
Authors | Yilin Kang, Yong Liu, Weiping Wang |
Abstract | Distributed machine learning allows different parties to learn a single model over all data sets without disclosing their own data. In this paper, we propose a weighted distributed differential privacy (WD-DP) empirical risk minimization (ERM) method to train a model in the distributed setting, considering the different weights of different clients. We guarantee differential privacy by gradient perturbation, adding Gaussian noise, and advance the state-of-the-art gradient perturbation method in the distributed setting. Through detailed theoretical analysis, we show that in the distributed setting, the noise bound and the excess empirical risk bound can be improved by considering the different weights held by multiple parties. Moreover, since the convexity constraint on the loss function in ERM is not easy to satisfy in some situations, we generalize our method to non-convex loss functions that satisfy the Polyak-Lojasiewicz condition. Experiments on real data sets show that our method is more reliable and improves the performance of distributed differential privacy ERM, especially when the data scale on different clients is uneven. |
Tasks | |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10308v2 |
https://arxiv.org/pdf/1910.10308v2.pdf | |
PWC | https://paperswithcode.com/paper/weighted-distributed-differential-privacy-erm |
Repo | |
Framework | |
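The gradient perturbation mechanism can be sketched in a weighted form. In the paper `noise_std` would be calibrated by the (epsilon, delta) privacy analysis, which is not reproduced here; this only shows the aggregate-then-add-Gaussian-noise structure:

```python
import random

def dp_weighted_gradient(client_grads, weights, noise_std, rng=None):
    """Weighted gradient perturbation sketch: combine per-client gradients
    with their weights, then add Gaussian noise for differential privacy."""
    rng = rng or random.Random(0)
    dim = len(client_grads[0])
    agg = [sum(w * g[i] for g, w in zip(client_grads, weights))
           for i in range(dim)]
    return [a + rng.gauss(0.0, noise_std) for a in agg]

# With zero noise this reduces to the exact weighted aggregate.
g = dp_weighted_gradient([[1.0, 2.0], [3.0, 4.0]], [0.25, 0.75], 0.0)
```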
Estimation of perceptual scales using ordinal embedding
Title | Estimation of perceptual scales using ordinal embedding |
Authors | Siavash Haghiri, Felix Wichmann, Ulrike von Luxburg |
Abstract | In this paper, we address the problem of measuring and analysing sensation, the subjective magnitude of one’s experience. We do this in the context of the method of triads: the sensation of a stimulus is evaluated via relative judgments of the form: “Is stimulus S_i more similar to stimulus S_j or to stimulus S_k?”. We propose to use ordinal embedding methods from machine learning to estimate the scaling function from the relative judgments. We review two relevant and well-known methods in psychophysics which are partially applicable in our setting: non-metric multi-dimensional scaling (NMDS) and maximum likelihood difference scaling (MLDS). We perform an extensive set of simulations, considering various scaling functions, to demonstrate the performance of the ordinal embedding methods. We show that, in contrast to existing approaches, our ordinal embedding approach allows, first, obtaining a reasonable scaling function from comparatively few relative judgments; second, estimating non-monotonic scaling functions; and, third, recovering multi-dimensional perceptual scales. In addition to the simulations, we analyse data from two real psychophysics experiments using ordinal embedding methods. Our results show that for a one-dimensional, monotonically increasing perceptual scale our ordinal embedding approach works as well as MLDS, while in higher dimensions only our ordinal embedding methods can produce a desirable scaling function. To make our methods widely accessible, we provide an R implementation and general rules of thumb on how to use ordinal embedding in the context of psychophysics. |
Tasks | |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07962v1 |
https://arxiv.org/pdf/1908.07962v1.pdf | |
PWC | https://paperswithcode.com/paper/190807962 |
Repo | |
Framework | |
DuTongChuan: Context-aware Translation Model for Simultaneous Interpreting
Title | DuTongChuan: Context-aware Translation Model for Simultaneous Interpreting |
Authors | Hao Xiong, Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang |
Abstract | In this paper, we present DuTongChuan, a novel context-aware translation model for simultaneous interpreting. The model constantly reads streaming text from an Automatic Speech Recognition (ASR) model and simultaneously determines the boundaries of Information Units (IUs) one after another. Each detected IU is then translated into a fluent translation with two simple yet effective decoding strategies: partial decoding and context-aware decoding. In practice, by controlling the granularity of IUs and the size of the context, we can easily obtain a good trade-off between latency and translation quality. Elaborate evaluation by human translators reveals that our system achieves promising translation quality (85.71% for Chinese-English and 86.36% for English-Chinese), especially in the sense of surprisingly good discourse coherence. In an end-to-end (speech-to-speech simultaneous interpreting) evaluation, the model shows impressive performance in reducing latency (to less than 3 seconds most of the time). Furthermore, we have successfully deployed this model in a variety of Baidu’s products which have hundreds of millions of users, and we release it as a service on our AI platform. |
Tasks | Speech Recognition |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12984v2 |
https://arxiv.org/pdf/1907.12984v2.pdf | |
PWC | https://paperswithcode.com/paper/dutongchuan-context-aware-translation-model |
Repo | |
Framework | |
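The streaming control flow the abstract describes (buffer ASR tokens, cut an IU at each detected boundary, translate with accumulated context) can be sketched with stand-in callables. In the paper both the boundary detector and the decoder are learned models; here they are hypothetical functions supplied by the caller:

```python
def stream_translate(tokens, is_boundary, translate_iu):
    """Toy simultaneous-interpreting loop: buffer streaming tokens, cut an
    Information Unit whenever the boundary detector fires, and translate
    each IU given the previously committed context."""
    buffer, context, output = [], [], []
    for tok in tokens:
        buffer.append(tok)
        if is_boundary(buffer):
            output.append(translate_iu(buffer, context))
            context.extend(buffer)      # context grows as IUs are committed
            buffer = []
    if buffer:                          # flush any trailing partial unit
        output.append(translate_iu(buffer, context))
    return output

# Stand-ins: cut at "." and "translate" by upper-casing the IU.
out = stream_translate(["a", "b", ".", "c"],
                       lambda buf: buf[-1] == ".",
                       lambda iu, ctx: " ".join(iu).upper())
```

The latency/quality trade-off corresponds to how eagerly `is_boundary` fires: smaller IUs mean lower latency but less context per translation.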
Interpretable Models of Human Interaction in Immersive Simulation Settings
Title | Interpretable Models of Human Interaction in Immersive Simulation Settings |
Authors | Nicholas Hoernle, Kobi Gal, Barbara Grosz, Leilah Lyons, Ada Ren, Andee Rubin |
Abstract | Immersive simulations are increasingly used for teaching and training in many societally important arenas including healthcare, disaster response and science education. The interactions of participants in such settings lead to a complex array of emergent outcomes that present challenges for analysis. This paper studies a central element of such an analysis, namely the interpretability of models for inferring structure in time series data. This problem is explored in the context of modeling student interactions in an immersive ecological-system simulation. Unsupervised machine learning is applied to data on system dynamics with the aim of helping teachers determine the effects of students’ actions on these dynamics. We address the question of choosing the optimal machine learning model, considering both statistical information criteria and interpretability. Our approach adapts two interpretability tests from the literature that measure the agreement between the model output and human judgment. The results of a user study show that the models that are best understood by people are not those that optimize information theoretic criteria. In addition, a model using a fully Bayesian approach performed well on both statistical measures and human-subject tests of interpretability, making it a good candidate for automated model selection that does not require human-in-the-loop evaluation. The results from this paper are already being used in the classroom and can inform the design of interpretable models for a broad range of socially relevant domains. |
Tasks | Model Selection, Time Series |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.11025v1 |
https://arxiv.org/pdf/1909.11025v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-models-of-human-interaction-in |
Repo | |
Framework | |
Neural Network Based Undersampling Techniques
Title | Neural Network Based Undersampling Techniques |
Authors | Md. Adnan Arefeen, Sumaiya Tabassum Nimi, M Sohel Rahman |
Abstract | The class imbalance problem is commonly encountered while developing machine learning models for real-life problems. Because of this problem, the fitted model tends to be biased towards the majority class, which leads to lower precision, recall, AUC, F1, and G-mean scores. Several studies have tackled this problem, most of which employed resampling, i.e. oversampling and undersampling techniques, to bring the required balance to the data. In this paper, we propose neural network based algorithms for undersampling. We then resampled several class-imbalanced data sets using our algorithms as well as other popular resampling techniques, and classified the undersampled data using common classifiers. We found that our resampling approaches outperform most other resampling techniques in terms of AUC, F1, and G-mean scores. |
Tasks | |
Published | 2019-08-18 |
URL | https://arxiv.org/abs/1908.06487v1 |
https://arxiv.org/pdf/1908.06487v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-based-undersampling-techniques |
Repo | |
Framework | |
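For contrast with the paper's neural selection of majority samples, the random-undersampling baseline it competes against is a few lines:

```python
import random

def undersample(X, y, rng=None):
    """Random undersampling baseline: keep all minority-class samples and an
    equal-sized random subset of the majority class. The paper's methods
    instead pick the majority samples with a neural network."""
    rng = rng or random.Random(0)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(v) for v in by_class.values())
    Xb, yb = [], []
    for label, samples in by_class.items():
        for xi in rng.sample(samples, n_min):
            Xb.append(xi)
            yb.append(label)
    return Xb, yb

# 8 majority vs 2 minority samples -> balanced 2 + 2.
Xb, yb = undersample(list(range(10)), [0] * 8 + [1] * 2)
```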